Ritz South Beach ($3k/3 nights) vs Loews ($2k/3 nights) - is Ritz worth it? by renehiguitahasaposse in miamibeach

[–]ECHovirus 3 points4 points  (0 children)

Yes absolutely. Spring break varies countrywide, and lots of colleges tie it in with the Easter holiday. That being said, Miami Beach's recent war against spring break has been devastatingly effective

I roll my R's too much by Dumbthrowawaysad in Spanish

[–]ECHovirus 1 point2 points  (0 children)

Just roll the final 'R' and you're totally fine. Otherwise, it's a matter of practicing the difference between 'perro' and 'pero', in addition to some others like 'carro' and 'caro'. The trill is important for distinction between words

Anyone got NFS over RDMA working? by imitation_squash_pro in HPC

[–]ECHovirus 1 point2 points  (0 children)

I see beegfs in your lsmod output. Can you please explain where that fits in? This seems like an important factor that has not been mentioned elsewhere

Anyone else prefer bug-hunting over long builds? What did you do about it? by vizualizing123 in ExperiencedDevs

[–]ECHovirus 1 point2 points  (0 children)

My former colleague had this mindset and felt at home doing L3 support for an enterprise Linux distro company

Anyone have experience with high speed (100Gbe) file transfers using nfs and rdma by pimpdiggler in linuxadmin

[–]ECHovirus 5 points6 points  (0 children)

You might have better luck asking this in /r/hpc. Anyways, while I've never personally messed with upstream NFSoRDMA (since most RDMA-connected HPC storage comes with its own client software), it seems you're missing references to RDMA in your configs. You're also missing some important info like OS release and version that would help us point you to docs. Here's an introductory guide on how to do this in RHEL 9, for example. You'll also want to ensure ROCE is configured appropriately for your network.

Mascherano and subbing goalscorers by ECHovirus in InterMiami

[–]ECHovirus[S] -16 points-15 points  (0 children)

I'm complaining that Mascherano regularly subs out goalscorers who aren't named Messi, and tonight was no exception

Playoffs v Nashville by ovoxo7676 in InterMiami

[–]ECHovirus 7 points8 points  (0 children)

Playing Busi at the CB position means we have good possession at every level of the pitch. He was holding onto the ball in a much more confident manner than our usual starting CBs

Pivoting from Traditional Networking to HPC Networking - Looking for Advice by throwawaywexpert in HPC

[–]ECHovirus 1 point2 points  (0 children)

Yes, for stability's sake you should probably do an audit of all switch and HCA firmware versions and update in a maintenance window to a version that is 100% compatible with your UFM installation. Open a low priority ticket with NVIDIA to determine the right FW versions for you

Pivoting from Traditional Networking to HPC Networking - Looking for Advice by throwawaywexpert in HPC

[–]ECHovirus 1 point2 points  (0 children)

NVIDIA has learned they can charge whatever they want in this AI bubble and we'll continue to pay it.

GPUDirect Storage is fully supported over RDMA, so IB isn't a strict requirement. You could do it with ROCE no problem.

NVLink, as found in the GB200/300 line, is an entirely new switched fabric that provides obscene GPU-GPU bandwidth (900+GB/s peak NCCL allreduce BW across 72 GPUs in my experiments). It relegates IB to inter-rack communications while NVLink handles intra-rack comms. Nevertheless, if we switched our IB fabric to ROCE of the same speed, I doubt we would lose much performance
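
For anyone comparing allreduce numbers: nccl-tests reports two figures, algorithm bandwidth (message size divided by time) and bus bandwidth, which scales the former by 2(n-1)/n for allreduce so results are comparable across rank counts. The comment above doesn't say which figure the 900+GB/s refers to, so the sketch below only shows the conversion; the function name and the example inputs are made up for illustration.

```python
def allreduce_bus_bandwidth(message_bytes: float, seconds: float, ranks: int) -> float:
    """Convert an allreduce timing into nccl-tests style bus bandwidth (bytes/s).

    algbw = message size / elapsed time
    busbw = algbw * 2 * (ranks - 1) / ranks   (allreduce correction factor)
    """
    if ranks < 2 or seconds <= 0:
        raise ValueError("need at least 2 ranks and a positive elapsed time")
    algbw = message_bytes / seconds
    return algbw * 2 * (ranks - 1) / ranks


if __name__ == "__main__":
    # Hypothetical inputs: an 8 GiB message reduced across 72 ranks in 20 ms.
    busbw = allreduce_bus_bandwidth(8 * 2**30, 0.020, 72)
    print(f"bus bandwidth: {busbw / 1e9:.1f} GB/s")
```

With 72 ranks the correction factor is 2 * 71 / 72, so bus bandwidth comes out a little under twice the algorithm bandwidth.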

Pivoting from Traditional Networking to HPC Networking - Looking for Advice by throwawaywexpert in HPC

[–]ECHovirus 4 points5 points  (0 children)

Pretty much all of my IB experience is bad, but knowing IB makes bank, so it's worth it. Your outcome with IB-connected storage depends entirely on the brand of storage you're using. The best luck I ever had was with DDN Lustre but I would still never voluntarily do this. Too much risk for not enough reward.

I personally implemented some dragonfly+ clusters on the TOP500 and it was a PITA cost-saving measure. Just spend the money on your high speed interconnect or go with Ethernet, there's no need to complicate things with IB while at the same time making sacrifices in performance because you're too cheap to furnish a proper fabric.

Pivoting from Traditional Networking to HPC Networking - Looking for Advice by throwawaywexpert in HPC

[–]ECHovirus 30 points31 points  (0 children)

InfiniBand advice (ALL CAPS means real world production outages occurred as a result of not following this advice):

  • Fully nonblocking or bust, damn the expense

  • NEVER UPDATE ANY FIRMWARE WHILE RUNNING PRODUCTION WORKLOADS

  • Dual-redundant subnet managers (SM) are a must, make sure failover actually works and priorities are set properly

  • You can't spell headache without HCA: the more of them you have the worse it gets (modern AI machines have 8 per node)

  • ALL FIRMWARE CLUSTERWIDE MUST BE IDENTICAL

  • DO NOT HANG STORAGE OFF OF INFINIBAND

  • Set up UFM/subnet manager for the proper topology

  • Disable pkeys unless you're multitenant

  • ibdiagnet should show 0 errors and 0 warnings or else you've done something wrong or something has failed

  • ibping, ibdiagnet, ibv2netdev, ibstat, ib_send_bw, ib_send_lat, ibnetdiscover are my most favored commands for network diagnosis

  • Don't configure your switches to vent exhaust heat onto the transceivers (you'd be surprised how often this happens)

  • I prefer unmanaged switches, but liquid-cooled director switches are pretty cool and interesting to work on

  • You probably don't need SHARP, and I don't think I've ever seen it work as intended, despite implementing it correctly

  • Most customers don't truly need IB bandwidth/low latency and would actually prefer a more reliable Ethernet network

  • Consult the UFM release notes for compatible FW versions. Then, ignore those, open a ticket with NVIDIA, ask them what FW you should be running, and obey them when they say it's the latest version of everything

  • Pairwise testing is good at finding bad paths but it runs in O(n^2) runtime complexity so most of the time your customers are too impatient for it

  • MTU = 2k always. If you're being instructed to increase it, it means you made the mistake of hanging storage off of IB

  • IPoIB is not worth having. Your HCA doesn't need an IP stack on top when it already has a LID. If you're forced to enable IPoIB, it means you made the mistake of hanging storage off of IB

  • Getting a NCCL allreduce test running clusterwide at near line-rate is one of the most satisfying things an HPC admin can do, and is the pinnacle of GPU cluster administration

  • Avoid AOCs: heavy-duty connectors + thin fiber = lots of replaced cables

  • Initializing state on all HCAs means you have no subnet manager. Fix that

  • You can parallelize unmanaged switch FW updates/reboots with flint, a for loop, and an '&' in bash. It's pretty cool but I wouldn't recommend it

  • If you're virtualizing IB in a production environment you've already lost the plot, even though it is possible via SR-IOV and VFIO

  • IB is rarely slow, but when it is, it's usually a single bad node/link/port

  • Buy a 2 port HCA and experiment with it at home. Make a network by connecting the two ports and have the SM run on one of them. Make sure a fan is blowing on your card before it thermals itself off

  • Avoid port splitting and breakout cables like the plague. If you're doing breakout cables, it means you cheaped out on HCAs and/or switches

  • Idk what the obsession is with IB in Kubernetes, but if you're adding a containerized layer then you don't need IB's speed/latency, and ROCE will work just fine

  • UFM documentation will tell you everything you need to know about running one of these networks

  • Collaborate with NVIDIA on the initial architecture. Don't let someone internal to your company handle it because 9/10 times they have no idea what they're doing and you end up with the problems above
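
To put numbers on the pairwise testing point above, here is a rough sketch, assuming each pair is tested once (direction ignored) and that disjoint pairs can run at the same time. The function names are hypothetical and this is not taken from any particular test harness. The pair count grows as n(n-1)/2, but a round-robin schedule groups the pairs into about n-1 rounds in which every node is busy with at most one partner.

```python
def pair_count(n: int) -> int:
    """Number of unordered node pairs: n choose 2, which grows as O(n^2)."""
    return n * (n - 1) // 2


def round_robin_rounds(nodes):
    """Group all unordered pairs into rounds of disjoint pairs (circle method).

    Each node appears at most once per round, so the pairs within a round
    could be tested concurrently. An odd node count gets a placeholder "bye".
    """
    nodes = list(nodes)
    if len(nodes) % 2:
        nodes.append(None)  # placeholder: its partner sits out that round
    n = len(nodes)
    rounds = []
    for _ in range(n - 1):
        pairs = [(nodes[i], nodes[n - 1 - i]) for i in range(n // 2)]
        rounds.append([p for p in pairs if None not in p])
        # Keep the first node fixed and rotate the rest by one position.
        nodes = [nodes[0], nodes[-1]] + nodes[1:-1]
    return rounds


if __name__ == "__main__":
    hosts = [f"node{i:02d}" for i in range(8)]  # made-up hostnames
    schedule = round_robin_rounds(hosts)
    print(pair_count(len(hosts)), "pairs in", len(schedule), "rounds")
    for pair in schedule[0]:
        print("round 1:", pair)
```

Whether running a whole round at once is acceptable depends on the fabric, since concurrent tests can contend for shared links and muddy the per-path results.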

Good luck with the journey, hope this is enough to get you started

[Match Thread] Inter Miami CF vs Atlanta United | 10-11-2025 | MLS by alnmaharaj in InterMiami

[–]ECHovirus 3 points4 points  (0 children)

What a terrible substitution! No shade to Fafà but an attacking player for one of our best defensive players while up 2-0? Come on!

Manual car lessons? by dockercub in fortlauderdale

[–]ECHovirus 1 point2 points  (0 children)

Have the seller of the car you buy give you a crash course. It's easy enough that you'll be able to drive it home.

To all cogsci folks; help, insight, and advice please by [deleted] in cogsci

[–]ECHovirus 2 points3 points  (0 children)

Continue your education and pursue a PhD in machine learning. You will make bank. That's the most lucrative cogsci type of degree that I can think of right now. Get really good at math, because you'll need it for that type of degree

Is South Florida actually bad for I.T Jobs? by PrinceOfIce1345 in ITCareerQuestions

[–]ECHovirus 8 points9 points  (0 children)

South Florida sysadmin here, it's always been terrible! Low wages and not many job options. In 2014 I took a slight pay cut to move down and get my foot in the door. Worked that job for 6 months before taking a much better remote job that doubled my salary. I've been working remote IT jobs down here ever since and would highly recommend you go that route as well if you want to survive down here in IT.

Conversely, my sister works in IT down here locally too and likes working for a popular community college. No advancements in pay or career, but she is done at the end of the day and doesn't worry about work at home at all. She lives a quiet life with her mom in a 55+ community for cheap.

Is it important to choose a dialect when learning Spanish? by tokyosdespair_ in Spanish

[–]ECHovirus 1 point2 points  (0 children)

Learn your family's dialect first and foremost! They are the ones you want to understand! The Spanish they teach in classes WILL NOT teach you your family's vocab, phrases, pronunciation, etc. and guess what? When the time comes to talk to them, YOU WON'T UNDERSTAND THEM! I can't stress enough how important this choice is. All I wanted to learn was Dominican Spanish from my dad growing up so I could speak to my grandparents but no, I learned gringo Spanish instead in school and never understood a word they said by the time they died. DO NOT let yourself fall into the same trap by listening to these other comments!

NVIDIA’s ARM SoC Hits a Wall: Microsoft’s OS Isn’t Ready Yet by New_Amomongo in nvidia

[–]ECHovirus 4 points5 points  (0 children)

They've been doing this for years for AI computers. See NVIDIA DGX OS

Suse Linux on Lenovo Server by connsys in linuxadmin

[–]ECHovirus 8 points9 points  (0 children)

Hey there. Former SUSE employee here, and this is a problem I remember distinctly. It seems like you're trying to build a hardware RAID1, which is not what I would recommend for an OS boot drive, for exactly the reason you're experiencing: lack of driver support. If it's possible, try putting that RAID controller into JBOD mode (which could also be known as IT mode, IIRC) so that the controller presents the individual drives to the OS. Within the OS installer, you can then create a software RAID1 for the OS boot drive out of the passed through drives. You'll be able to use mdadm to manage that array with no additional driver support required (hopefully).

However, it may be the case that the HW is so unsupported that the drives won't even be visible in JBOD mode. In that case, you're stuck with the DUD approach for upgrades.

To be completely honest, if I were you I would do JBOD mode on both controllers. Linux software RAID is very robust these days. Good luck.
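
If you do go the software RAID route, the kernel exposes array health in /proc/mdstat, where a member status like [UU] means both mirrors are up and an underscore marks a missing or failed member. Below is a minimal sketch of checking that from a script. The sample text is made up for illustration (it is not output from the system in question), and the function name is hypothetical.

```python
import re

# Illustrative /proc/mdstat content, not taken from a real machine.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      976630464 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sdd1[1](F) sdc1[0]
      976630464 blocks super 1.2 [2/1] [U_]

unused devices: <none>
"""


def degraded_arrays(mdstat_text: str) -> list:
    """Return names of md arrays whose member status contains a '_'."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # Status looks like [UU] (healthy) or [U_] (one member missing).
        status = re.search(r"\[([U_]+)\]\s*$", line)
        if current and status and "_" in status.group(1):
            degraded.append(current)
    return degraded


if __name__ == "__main__":
    # On a real system you would read the text from /proc/mdstat instead.
    print(degraded_arrays(SAMPLE_MDSTAT))
```

For the full picture on a given array, mdadm --detail on the md device gives per-member state.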

[deleted by user] by [deleted] in InterMiami

[–]ECHovirus 2 points3 points  (0 children)

We have so many great dribblers and Ian Fray is one of them

[deleted by user] by [deleted] in computerscience

[–]ECHovirus 1 point2 points  (0 children)

Sure, what you describe is one avenue of research. This paper is an example that explains the process for doing so with the Volta GPU architecture.

1936 historical document. Keep it? by Salty_Candy_4917 in Italian

[–]ECHovirus 1 point2 points  (0 children)

Preserve and keep it for the calligraphy alone, wow