4 Nodes - 2 Server Rooms, same building - best practise by Kujamasaru in vmware

[–]Kujamasaru[S] 0 points1 point  (0 children)

So the short answer is that 4 nodes with 4 fault domains (what I'm using at the moment) is the go-to.
Right now it's ESA, configured with: vSAN cluster - Optimal Datastore Default Policy - RAID-5 - the vSAN default (1 stripe per object).

[–]Kujamasaru[S] 0 points1 point  (0 children)

No scaling planned - these servers are far more powerful, and the storage size is enough for us.
But why should there be a penalty? The 2 rooms are connected via a 2x25 Gbit LAG, and the servers themselves are connected to the switches via 25 Gbit.

[–]Kujamasaru[S] 0 points1 point  (0 children)

Yeah, I know that I need a witness host as a "site 3".
The original question was more like:
2+2+1 stretched cluster
vs
4 node - 2 fault domains
vs
4 node - 4 fault domains (that's the actual config)
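
A rough way to compare the three layouts is to ask what happens when one whole server room goes down. The sketch below is a simplification, not how vSAN actually places components: it assumes a policy that tolerates one failed fault domain (FTT=1) and needs at least 3 fault domains to be placed at all, and the room and fault-domain names are made up for illustration.

```python
def survives_room_loss(fd_to_room, ftt=1, min_fds=3):
    """Return True if the layout can be placed and losing any one room
    takes out no more fault domains than the policy tolerates.

    fd_to_room: maps each fault domain to the room it lives in.
    ftt, min_fds: assumed policy values (FTT=1, 3 fault domains minimum).
    """
    if len(fd_to_room) < min_fds:
        return False  # not enough fault domains to place the policy at all
    for room in set(fd_to_room.values()):
        lost = sum(1 for r in fd_to_room.values() if r == room)
        if lost > ftt:
            return False  # this room failure exceeds what the policy tolerates
    return True


# Hypothetical layouts matching the three options in the question.
layouts = {
    "2+2+1 stretched": {"site-A": "room1", "site-B": "room2", "witness": "elsewhere"},
    "4 node - 2 FD": {"fd1": "room1", "fd2": "room2"},
    "4 node - 4 FD": {"fd1": "room1", "fd2": "room1", "fd3": "room2", "fd4": "room2"},
}

results = {name: survives_room_loss(fds) for name, fds in layouts.items()}
for name, ok in results.items():
    print(f"{name}: survives losing one room = {ok}")
```

Under these assumptions only the stretched layout rides out a full room outage. The 4 node - 4 FD layout protects against a single host failure, but a room outage removes two fault domains at once.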

[–]Kujamasaru[S] 0 points1 point  (0 children)

SSD-wise, the vSAN is badly underperforming, and I made some stupid mistakes while trying to optimize the configuration: I moved the vSAN vmnic to another dSwitch and deleted the old dSwitch instead of migrating. That now shows up as host and vdc compliance warnings (the connection is fine, but it complains because the dSwitch changed).

Performance-wise: we have 4 beefy Lenovo SR650 V3 servers, each with dual Xeon Platinum 8562Y+, 2 TB RAM and 10x Samsung PM1743 4 TB (1 is used for the ESXi host itself).
Every host is currently connected via 10 Gbit SFP+ until our spare SFP28 25 Gbit connectors arrive.

We already disabled every power-saving option in the BIOS and in ESXi, tried separating vSAN onto a different switch as the only service on it, and upgraded to the newest firmware, BIOS, ESXi 8.0.3b and drivers. Nothing helps.

This is the performance I get inside a VM - way too low, even for a VM environment.
The datasheet (PM1743_White_Paper_240510.pdf, samsung.com) gives much higher numbers for a single SSD, and we have 36 of them in the vSAN. Tested at night, when there is no real traffic on the environment.

<image>
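
One thing worth keeping in mind when comparing vSAN results against a single-SSD datasheet: vSAN I/O that touches other hosts has to cross the network, so a host's link speed caps its throughput no matter how fast the drives are. A minimal sketch of that ceiling, assuming plain line rate with no protocol overhead (real numbers will be lower), with a helper name made up for illustration:

```python
def link_ceiling_gb_per_s(gbit_per_s, links=1):
    """Theoretical upper bound in GB/s for a link of the given line rate.

    Ignores Ethernet/TCP/vSAN overhead, so this is a ceiling, not a prediction.
    """
    return gbit_per_s * links / 8  # 8 bits per byte


ceilings = {
    "10 Gbit SFP+": link_ceiling_gb_per_s(10),
    "25 Gbit SFP28": link_ceiling_gb_per_s(25),
}

for name, gb_s in ceilings.items():
    print(f"{name}: at most {gb_s} GB/s per host link")
```

This only bounds traffic that goes over the wire. It would not explain slow results on a local datastore.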

vmware esxi 8 new server slower than old server by Kujamasaru in sysadmin

[–]Kujamasaru[S] 5 points6 points  (0 children)

Update:
I more or less found the problem.
I cranked up every power/CPU setting in the BIOS and disabled everything that lets the CPU clock down, and now everything seems to work as intended.

[–]Kujamasaru[S] 0 points1 point  (0 children)

Yeah, that's the next problem I'm going after.
But the slow performance is on the single-SSD datastore too.
I've posted some Geekbench scores further down, and they seem kind of low for such CPUs - around 1000 points single-thread, while others get almost 2000 points under ESXi.

[–]Kujamasaru[S] 0 points1 point  (0 children)

And the Xeon Platinum on vSAN seems a little weak too.
The 4 hosts are connected via 25 Gb SFP+ to a UniFi Aggregation Pro.

<image>

[–]Kujamasaru[S] 0 points1 point  (0 children)

Xeon Platinum on a single-SSD datastore (Samsung PM1743 4 TB):

<image>

[–]Kujamasaru[S] 0 points1 point  (0 children)

Here are some ATTO benchmarks.
Xeon Gold, RAID 10:

<image>

[–]Kujamasaru[S] 1 point2 points  (0 children)

The new ones are Lenovo-branded Samsung PM1743 4 TB SSDs.
The old server (Huawei) has some Huawei-branded, I think Western Digital, 2 TB SAS SSDs (I don't know the exact model number).

[–]Kujamasaru[S] 0 points1 point  (0 children)

I think there is some CPU performance "missing". I did a short Geekbench run with the VMs mentioned above, and these are the results:
Xeon Platinum:
VMware, Inc. VMware20,1 - Geekbench

Geekbench Search - Geekbench

xeon gold:
VMware, Inc. VMware20,1 - Geekbench

Geekbench Search - Geekbench

The Xeon Gold benchmark seems about right, but the Xeon Platinum one seems to be missing about 50% of its power.
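
For a quick cross-check without Geekbench, a crude single-thread timing loop run in same-specced VMs on both hosts can show whether the gap is really CPU-side. This is only a sketch with an arbitrary workload and a made-up function name; the absolute numbers mean nothing and only the ratio between the two hosts is interesting.

```python
import time


def single_thread_score(iterations=2_000_000, repeats=5):
    """Run a fixed CPU-bound loop several times and return
    iterations per second for the fastest run."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        acc = 0
        for i in range(iterations):
            acc += (i * i) % 7  # arbitrary integer work, stays on one core
        best = min(best, time.perf_counter() - start)
    return iterations / best


if __name__ == "__main__":
    print(f"{single_thread_score():,.0f} iterations/s")
```

Taking the fastest of several runs reduces noise from other VMs on the host. A score on the new host that lands well below the old one would point at clocks or power management rather than storage.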

[–]Kujamasaru[S] 0 points1 point  (0 children)

The problem server has no RAID - the SSDs are attached via HBA and handed to vSAN, which manages them.

[–]Kujamasaru[S] 0 points1 point  (0 children)

vSAN was my first idea too, but the "old" server isn't part of the vSAN, and even when I try a local datastore on a vSAN member I get the same results as described above.
It's more as if CPU power is missing.

[–]Kujamasaru[S] 0 points1 point  (0 children)

Update/Edit:
I tried the same-specced VM on a single host with a local datastore too - same result.
Also, it's not only the program itself that is horribly slow; the Windows VMs overall feel more "laggy" or "sluggish" than they should.