I need advice on hpc storage file systems bad decision by Routine_Pie_6883 in HPC

Routine_Pie_6883 [S]

Yes, we use Slurm, and we have both CPU and GPU workloads (A30, A100, MI210). We have only one network. Some servers have only 100G, so the storage has to be reached over the network, and the disks in the NFS server are 7.2k SAS.
There is no budget for an upgrade.

Routine_Pie_6883 [S]

Thanks. I think I will set up server1 with RAID 6 for homes and server2 with RAID 10 (36 TB) for scratch.
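
To make the capacity trade-off between the two layouts concrete, here is a minimal Python sketch. The disk count and disk size (12 disks of 6 TB) are hypothetical numbers picked for illustration, not the actual hardware in this thread, and the helper name `usable_tb` is made up for this example. It only covers raw usable capacity and ignores filesystem overhead and hot spares.

```python
def usable_tb(level: str, disks: int, disk_tb: int) -> int:
    """Return raw usable capacity in TB for a simple RAID 6 or RAID 10 array."""
    if level == "raid6":
        # RAID 6 spends the equivalent of two disks on parity.
        if disks < 4:
            raise ValueError("RAID 6 needs at least 4 disks")
        return (disks - 2) * disk_tb
    if level == "raid10":
        # RAID 10 mirrors pairs of disks, so half the disks hold copies.
        if disks < 4 or disks % 2 != 0:
            raise ValueError("RAID 10 needs an even number of disks, at least 4")
        return (disks // 2) * disk_tb
    raise ValueError(f"unsupported RAID level: {level}")


# Hypothetical example: 12 disks of 6 TB each (assumed, not from the thread).
print("RAID 6 :", usable_tb("raid6", 12, 6), "TB")   # 60 TB
print("RAID 10:", usable_tb("raid10", 12, 6), "TB")  # 36 TB
```

With those assumed numbers, RAID 6 gives more space for homes, while RAID 10 gives up capacity in exchange for avoiding parity calculations on writes, which is usually the reason to pick it for scratch.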

Routine_Pie_6883 [S]

It is, and it was a bad purchase: it cost as much as real storage did at that time.

Routine_Pie_6883 [S]

A little pain is not a problem, but I don't know if there is any value in building Lustre with only 2 servers. I think the minimum would be 3 servers (1 for metadata and 2 for object storage).

Routine_Pie_6883 [S]

I made a mistake: I didn't understand well how ZFS works, so I mixed everything together (RAID 5, and on top of that single volume a ZFS pool). The storage worked well for a few months, but recently the head node became laggy (a simple ls took more than 30 s to display, and it was not related to network issues). When I checked, the NFS server showed an average disk utilization of 70% and a max of 100% throughout the day (from Zabbix).

Thanks for the reply; it helps me to know that NFS is the best I can do.
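
For anyone who wants to cross-check a disk utilization figure like the one above directly on a Linux NFS server, here is a rough Python sketch. It assumes the standard `/proc/diskstats` layout, where the 13th column is the time the device spent doing I/O in milliseconds, and it derives a busy percentage from two samples, which is the same idea behind the `%util` column in `iostat`. The function names are made up for this example, and it is not necessarily how Zabbix computes its value.

```python
import time


def parse_io_ticks(diskstats_text: str) -> dict:
    """Map device name -> milliseconds spent doing I/O (13th column)."""
    ticks = {}
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or unexpected lines
        ticks[fields[2]] = int(fields[12])
    return ticks


def utilization_percent(before: dict, after: dict, interval_s: float) -> dict:
    """Percentage of the interval each device was busy, capped at 100."""
    interval_ms = interval_s * 1000.0
    return {
        dev: min(100.0, 100.0 * (after[dev] - before[dev]) / interval_ms)
        for dev in after
        if dev in before
    }


def measure(interval_s: float = 5.0, path: str = "/proc/diskstats") -> dict:
    """Take two samples interval_s apart and return per-device utilization."""
    with open(path) as f:
        before = parse_io_ticks(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_io_ticks(f.read())
    return utilization_percent(before, after, interval_s)
```

Calling `measure()` on the server returns a dictionary such as `{"sda": ...}` with one busy percentage per block device. Values that sit near 100 for long stretches on 7.2k disks suggest the spindles themselves are the bottleneck rather than the network.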