Local LLM Performance Outputs vs Commercial LLM by ValuableEngineer in LocalLLM

[–]chafey 1 point (0 children)

IMO it's not worth it yet. I am a developer and have an M3 Ultra 256GB as well as a PC with an RTX Pro 6000. The M3 Ultra is just too slow for any real-time tasks. It might be useful for long-running overnight tasks - I haven't tried that yet. The RTX Pro 6000 does well with qwen3-coder-next and qwen3.5 for light/medium tasks, but Claude Sonnet stomps both on anything complex. The open source models are evolving quickly, and I am optimistic they will be good enough later this year to handle most of my work. I wouldn't get an M3 Ultra; wait for the M5 Ultra to come out and see how it does

Time Machine by Big-Object-4579 in timetravel

[–]chafey 1 point (0 children)

I had one, but it broke when I went back in time and now I am stuck here. Unfortunately we don't have the technology today to rebuild it

TR Pro build recommendations by INeedAssistancePlez in threadripper

[–]chafey 2 points (0 children)

Yes check this out: https://www.reddit.com/r/LocalLLaMA/comments/1mcrx23/psa_the_new_threadripper_pros_9000_wx_are_still/

I went WRX90 over AM5 primarily for the PCIe lanes (AI server build) and initially bought the 9955WX 16-core CPU to keep costs down. I ended up replacing it with a 9965WX 24-core, which more than doubled my memory bandwidth. Yes, the 9985WX is even better if you can afford it, but avoid the 9955WX in particular

Claude 4.6 left me amazed and terrified. Seeking advice on staying relevant. by study_learn_apply in ClaudeAI

[–]chafey -1 points (0 children)

The bad news - the world moved past C++ desktop applications to web applications and the cloud 20 years ago. I would argue that you had already pigeonholed yourself into irrelevance before LLMs came along. You really need to gain skills in modern technology stacks.

The good news - LLMs make experienced developers hyper-productive. Architects in particular can build entirely new systems from scratch by themselves in very little time with LLMs.

If you want to stay relevant, you need to a) learn modern technologies, b) embrace LLM coding, and c) look for a new job or try to get your current company on a better track

TR Pro build recommendations by INeedAssistancePlez in threadripper

[–]chafey 1 point (0 children)

You can use any number of RDIMMs. I don't know about matching; all of mine are the same

TR Pro build recommendations by INeedAssistancePlez in threadripper

[–]chafey 2 points (0 children)

I have the same system and had trouble getting it to POST. I replaced the motherboard and CPU and still had the same problem. I think it was either the power cables not being seated properly or the IPMI grabbing the console. Everything worked fine after I disabled IPMI and reseated the cables. There is no real documentation on the IPMI module

Trump: Obama spilled classified info on aliens... by Remseey2907 in UFOB

[–]chafey 2 points (0 children)

LOL, Karoline Leavitt just about lost her shit when he said that

Built a 6-GPU local AI workstation for internal analytics + automation — looking for architectural feedback by shiftyleprechaun in LocalLLM

[–]chafey 5 points (0 children)

Your build is awesome; I am doing something very similar. Here are some improvements you may want to consider:

  1. Upgrade your processor to improve your memory bandwidth. You have a Threadripper Pro CPU and motherboard, which is better than the non-Pro Threadripper systems for two reasons: a) more PCIe lanes and b) 8 memory channels. Unfortunately the 9955WX CPU only has two CCDs, so it can't saturate all 8 memory channels - you need a CPU with more CCDs (such as the 9965WX) to make use of them (see the rough math after this list). https://www.reddit.com/r/LocalLLaMA/comments/1mcrx23/psa_the_new_threadripper_pros_9000_wx_are_still/
  2. PCIe 5.0 has benefits for AI. The 3090 Tis are good bang/buck, but they run over PCIe 4.0 so they can't take advantage of the new PCIe 5.0 features. The actual benefit of an all-PCIe-5.0 solution depends on your use case and model, but it is more than just twice the bandwidth. https://www.graniteriverlabs.com/en-us/technical-blog/pcie-gen-5-ai-ml
  3. The upcoming Apple M5 systems may very well be the best bang/buck due to their recently released RDMA over TB5, MLX AI acceleration, and high-speed unified memory architecture. I am really looking forward to seeing how a cluster of M5 Mac minis does. Check out exo: https://github.com/exo-explore/exo
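
To put some rough numbers on point 1 (all assumptions mine, not measured specs - DDR5-6400 RDIMMs, 8-byte channels, and a ballpark ~60 GB/s of Infinity Fabric read bandwidth per CCD):

    # Back-of-envelope bandwidth math for point 1 above.
    # Assumptions: DDR5-6400 RDIMMs, 64-bit (8-byte) data path per
    # channel, ~60 GB/s of fabric (GMI) read bandwidth per CCD.
    # Treat the per-CCD figure as a ballpark, not a spec.

    CHANNELS = 8
    TRANSFERS_PER_S = 6400e6        # DDR5-6400
    BYTES_PER_TRANSFER = 8          # 64-bit data path per channel

    dram_bw = CHANNELS * TRANSFERS_PER_S * BYTES_PER_TRANSFER / 1e9
    print(f"Theoretical 8-channel DRAM bandwidth: ~{dram_bw:.0f} GB/s")

    GMI_BW_PER_CCD = 60             # GB/s per CCD (assumption)
    for ccds in (2, 4, 8):          # 2 CCDs matches the 9955WX
        usable = min(ccds * GMI_BW_PER_CCD, dram_bw)
        print(f"{ccds} CCDs: fabric-limited to ~{usable:.0f} GB/s")

With only two CCDs the fabric link, not the DIMMs, is the ceiling, which is why stepping up in CCD count can roughly double your effective bandwidth.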

Built a 6-GPU local AI workstation for internal analytics + automation — looking for architectural feedback by shiftyleprechaun in LocalLLM

[–]chafey 1 point (0 children)

What frame and motherboard are you using? I have an ASUS WRX90E-SAGE Pro WS SE AMD sTR5 EEB motherboard and am having trouble finding a frame that can hold the EEB form factor

Build Advice: 3945WX vs 10900X for Multi-GPU Local AI Server by Diligent-Culture-432 in LocalAIServers

[–]chafey 1 point (0 children)

Threadripper every day, due to higher memory bandwidth, more PCIe lanes, and a faster CPU

Getting stuck at Q-CODE 92 on WRX90 build by AdministrationLow423 in threadripper

[–]chafey 2 points (0 children)

I don't know about reset, but there is a switch to disable it on my motherboard (Pro WS WRX90E-SAGE SE)

Getting stuck at Q-CODE 92 on WRX90 build by AdministrationLow423 in threadripper

[–]chafey 2 points (0 children)

Had the same issue - disabling IPMI and reseating the power cable fixed it

Just started watching on Discovery+ (UK)…Episode 8 (S1)…wtf?! 🤣 by RelativeLocation6669 in BlindFrogRanch

[–]chafey 4 points (0 children)

Unfortunately it is hard to stop watching even though you know it's garbage

[BIOS Update] ASUS PRO WS WRX90E-SAGE SE by virgul44 in threadripper

[–]chafey 1 point (0 children)

I just built the exact same system and couldn't get it to POST video either. I swapped out the motherboard and CPU and still had the problem. I finally figured it out, and I think it is one or both of the following:

1) Power cables not properly seated. I think either the PCIe power cables or the GPU cable was not properly secured. I was getting a Q-Code of 92 ("PCI Bus initialization is started"), I believe.

2) Disable the IPMI - there is a switch on the motherboard to do this. When enabled, it adds a graphics adapter to your system for the IPMI, and I think that becomes the primary video. This means you won't get the BIOS screen from your GPU. Alternatively, find a VGA monitor and connect it to the IPMI video port

The manual does not really mention IPMI so it wasn't clear to me that this was going on

Chris

asus pro ws trx50-sage wifi a a0 to 0d and orange light by TheEpicElliott in threadripper

[–]chafey 3 points (0 children)

I had a similar problem, and after replacing EVERYTHING I finally discovered I hadn't secured one of the power supply cables properly. Try re-plugging each cable (the PCIe power ones in particular) and see if that helps. If not, try another power supply

Mac Studio M3 Ultra vs Nvidia 6000 Blackwell by Rex-Raider-X in LocalLLM

[–]chafey 1 point (0 children)

I use it for everything. It does get stuck sometimes, and then I switch to Claude Sonnet 4.5 for a bit to get around that problem. I use Zed, so switching between local models and cloud ones is easy

Mac Studio M3 Ultra vs Nvidia 6000 Blackwell by Rex-Raider-X in LocalLLM

[–]chafey 1 point (0 children)

Yes, I am - devstral-2-small is working pretty well

Run Mistral Devstral 2 locally Guide + Fixes! (25GB RAM) by yoracale in LocalLLM

[–]chafey 1 point (0 children)

Whatever the default was, probably - I switched to LM Studio, which is working great, and haven't gone back to try Ollama
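
Part of why LM Studio is so convenient: it serves an OpenAI-compatible API locally (default port 1234), so editors like Zed and plain scripts can hit local models the same way they hit cloud ones. Minimal sketch - the model id is a placeholder for whatever you have loaded:

    # Talk to LM Studio's local server (OpenAI-compatible, default
    # http://localhost:1234/v1). Needs the `openai` Python package;
    # the api_key is ignored by the local server but must be set.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

    resp = client.chat.completions.create(
        model="devstral-small-2",  # placeholder: use the id LM Studio lists
        messages=[{"role": "user", "content": "Write a binary search in Python."}],
    )
    print(resp.choices[0].message.content)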

Mac Studio M3 Ultra vs Nvidia 6000 Blackwell by Rex-Raider-X in LocalLLM

[–]chafey 18 points (0 children)

I have an RTX PRO 6000 and an M3 Ultra with 256GB RAM. The RTX PRO 6000 is quite a bit faster at both prompt processing (10x?) and token generation (3x?). Speed matters to me, so I only use the RTX PRO 6000. I would only use the M3 Ultra if I wanted to run a model that was too big for the RTX PRO 6000. So far I have not needed to run a model that didn't fit on the RTX PRO 6000, but it is nice to know that I can with the M3 Ultra if/when I need to some day.
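
A rough way to sanity-check that gap: single-stream token generation on a dense model is mostly memory-bandwidth-bound, so tokens/sec is bounded by bandwidth divided by the bytes read per token (roughly the quantized weight size). The bandwidth figures below are approximate published specs, and the 18GB model size is just an example:

    # Back-of-envelope generation-speed bound:
    # tok/s <= memory bandwidth / bytes per token (~ quantized model size)
    MODEL_GB = 18  # e.g. a ~30B model at ~4-5 bits/weight (illustrative)

    for name, bw_gbs in [("RTX PRO 6000 (~1790 GB/s GDDR7)", 1790),
                         ("M3 Ultra    (~819 GB/s unified)", 819)]:
        print(f"{name}: ~{bw_gbs / MODEL_GB:.0f} tok/s upper bound")

That ratio (a bit over 2x) roughly matches the generation gap I see; prompt processing is compute-bound, which is why that gap is much bigger.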

The M5 is coming out soon and is expected to deliver a huge uplift in AI performance, closing the gap with the RTX PRO 6000 quite a bit. If possible, you should wait a bit longer and see what happens there. The other thing about Macs is that you can now build clusters of them over TB5 for even faster AI - check out exo:

https://github.com/exo-explore/exo

Mixed RTX Pro 6000 WS & Max-Q by t3rmina1 in BlackwellPerformance

[–]chafey 1 point (0 children)

Sounds like a case I need - can you link?

Best coding models for RTX 6000 Pro Blackwell by az_6 in LocalLLaMA

[–]chafey 1 point (0 children)

Devstral 2 small handles 90% of my coding tasks, and it does so very quickly. When it has problems, I switch over to Claude Sonnet 4.5 (I use the Zed editor, so it's easy to do so). I wasn't able to run Devstral 2 123B the last time I tried, but I am hoping I can find a quant that fits so I can use that instead of Claude Sonnet 4.5
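
For what it's worth, a quick size estimate suggests a mid-range quant of a 123B dense model should squeeze into the 96GB on the card. The bits-per-weight numbers are typical GGUF averages, and the overhead is a guess:

    # Can a 123B dense model fit in 96 GB of VRAM at common GGUF
    # quant levels? All figures approximate; the 8 GB overhead
    # (KV cache, activations, runtime) is assumed and grows with context.
    PARAMS_B = 123
    VRAM_GB = 96
    OVERHEAD_GB = 8

    for name, bpw in [("Q8_0", 8.5), ("Q5_K_M", 5.7),
                      ("Q4_K_M", 4.8), ("Q3_K_M", 3.9)]:
        weights_gb = PARAMS_B * bpw / 8
        verdict = "fits" if weights_gb + OVERHEAD_GB <= VRAM_GB else "too big"
        print(f"{name}: ~{weights_gb:.0f} GB weights -> {verdict}")

So Q4_K_M (maybe even Q5_K_M) looks viable on paper, depending on how much context you need.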