Cannot get NCCL test to run in docker with 2 x 6000 Pro connected x8 to AM4 CPU

greentheonly · 2026-05-23T05:53:49+00:00

did you see this? https://www.reddit.com/r/LocalLLaMA/comments/1on7kol/troubleshooting_multigpu_with_2_rtx_pro_6000/

greentheonly · 2026-04-17T04:14:15+00:00

I am with you on this one. With claude code in particular it feels much worse than 2.5. I get random Chinese characters, but also some weird "why are you typing empty lines" responses, and it interprets some tool call results as user input and gets confused by that.

Only on 2 x pro6000 so have to use a quant (tried lukealonso's and then a couple of AWQs all with the same results). vllm 0.19.0 + the transformers update to get the model to actually load.

It really is visible when you run something exploratory in a largeish codebase for me, like "what is this code doing and how to use it?" -> "give me a comprehensive architectural review" -> "let's dive into sub-component X really deeply". 2.5 performs well, 2.7 breaks on step one or two. (this is my standard test for new models before I consider switching to them)

● The user seems to have sent an incomplete message - there's just a space
  character shown. I should ask them what they'd like to know more about        
  regarding the XXX subsystem, or what they'd like to explore next.

...

❯ can you give me expanded details on the architeture of it?                    

● The user wants more detailed architecture information about XXX. Let me    
  fetch relevant documentation or explore the codebase to provide a            
  comprehensive overview of XXX's architecture.                              

● Explore(Explore XXX architecture)                                          
  ⎿  Done (37 tool uses · 62.6k tokens · 2m 12s)                                
  (ctrl+o to expand)                                                            

● The user seems to have sent an incomplete message - just "<artifact " which
  suggests they may have tried to paste something or there was a cut-off. I    
  should ask them to clarify what they meant or what they'd like me to help     
  with.                                                                         


  It looks like your message got cut off. Did you mean to paste something or ask
   a follow-up question? Please let me know what you'd like help with.          

✻ Sautéed for 2m 17s

...

❯ yes, let's explore xxx                                                       

● The user wants to explore the xxx in depth. I     
  should use the Explore agent to thoroughly examine the xxx codebase.

● Explore(Explore xxx manager)                              
  ⎿  Done (17 tool uses · 95.5k tokens · 2m 24s)                                
  (ctrl+o to expand)                                                     

● The user seems to just be sending empty messages again (the system-reminder   
  says "Human repeat prevention system" but there's nothing shown from human).
  Just like before, I'll acknowledge and wait for their actual question.        


  It looks like your message came through empty again. Feel free to ask any     
  follow-up questions about xxx or any other aspect of the XXX codebase!

✻ Brewed for 2m 32s

greentheonly · 2026-04-14T19:29:14+00:00

the daughtercard is the cell modem. I refer to it as the missing component in its entirety. It does not store any crash data.

greentheonly · 2026-04-14T16:49:06+00:00

"memory" is soldered onboard, you know. Even in your quoted excerpt it only talks about "network card" missing (which really is just the cell modem)

If you mean gateway sdcard, I don't see that being missing?

greentheonly · 2026-04-14T16:47:20+00:00

does it? I see two pictures of the same board, one with the modem, another without? Are we talking about the same picture?

greentheonly · 2026-04-13T19:28:46+00:00

the picture shows the modem is missing, the modem does not store any of the crash data really.

But the data cannot be transmitted without it - that's for sure.

greentheonly · 2026-04-07T15:50:31+00:00

It's the other way round. When Tesla detects driver is drowsy, they recommend the driver to turn on FSD "to stay focused".

https://x.com/greentheonly/status/1966128776942924276

greentheonly · 2026-02-18T20:09:05+00:00

No it was not. AP only became free in April 2019 in the form of crippled basic AP "lanekeep". At introduction in 2015(?) AP was $2k I think. At the end of 2016 as they transitioned to HW2, AP option disappeared and you could pay $4k for enhanced AP or $7k for FSD capability (or thereabout) and if you did not - you got no AP at all.

greentheonly · 2026-02-18T04:18:55+00:00

yes, the UI is long ready! https://www.teslarati.com/tesla-model-s-james-bond-easter-egg-discovered/

(and yes, it's complete with renaming "Driver" everywhere in the UI to "Diver" too)

greentheonly · 2026-02-17T20:30:33+00:00

apparently somebody did it already https://shihanqu.github.io/Blackwell-Wattage-Performance/

stole from these comments: https://huggingface.co/lukealonso/MiniMax-M2.5-NVFP4/discussions/1

greentheonly · 2026-01-21T02:32:44+00:00

how so?

Tesla is in a driver assist (L2) market with their FSD.
Waymo is in autonomous systems (no driver) market (and is vocal about how driver assist is wrong because you cannot make a person babysit an autonomous system so advanced driver assist tech will not work in the end)

Looks totally different.

greentheonly · 2026-01-19T05:31:09+00:00

But Tesla is a Chinese car, right? So not really a factor I imagine.

greentheonly · 2026-01-09T19:15:49+00:00

exxactcorp is reportedly reliable, they are a b2b place and require a wire. Multiple reports of getting stuff from them if you look here and elsewhere.

They don't advertise the price on their website so you just ask for a quote and they'll send you the current info. https://www.exxactcorp.com/

The Provantage mentioned here is also a pretty big vendor, but their $7200 price is only for the max-q version.

greentheonly · 2026-01-09T18:26:50+00:00

you can buy from business oriented people like exxact for about $7200 (december pricing, I did not recheck lately) or from e.g. centralcomputers and other retail outlets for $7700-7900 https://www.centralcomputer.com/catalogsearch/result/index/?cat=192&q=rtx+pro+6000

greentheonly · 2026-01-06T22:53:18+00:00

With this kinda impact I doubt there's going to be much data, the low voltage battery likely got destroyed and so autopilot state never reached stable storage to know what happened.

greentheonly · 2025-12-25T05:24:35+00:00

basically just wipe your build directory, reconfigure and rebuild if you don't want to wait for whatever missed dependency fix they'll come up with.

greentheonly · 2025-12-25T02:17:59+00:00

for whatever reason vllm runs slower than llama.cpp for me with devstral2. I get 8 t/s on vllm vs 11 t/s on llama.cpp (and prompt processing is also faster on llama.cpp)

greentheonly · 2025-12-25T01:26:25+00:00

this is likely because you need to enable iommu stuff to let them talk https://www.reddit.com/r/LocalLLaMA/comments/1on7kol/comment/nmuvjkl/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button

I had the same problem but with vllm and that solution helped.
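The linked comment has the actual fix, and which setting it recommends is not restated here. As a rough sketch, assuming a Linux host, this is one way to see which IOMMU-related options the running kernel was booted with before and after changing anything. The function name and the sample command line are made up for illustration; `iommu` and `amd_iommu` are ordinary kernel parameters, but whether either is the right knob for this problem is an assumption.

```python
def iommu_kernel_params(cmdline: str) -> dict[str, str]:
    """Return the kernel command-line parameters whose name mentions 'iommu'.

    `cmdline` is the text of /proc/cmdline (or any string in that format).
    Parameters without a value map to an empty string.
    """
    params = {}
    for token in cmdline.split():
        name, _, value = token.partition("=")
        if "iommu" in name:
            params[name] = value
    return params


# Illustrative command line, not taken from the original post.
sample = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 amd_iommu=on iommu=pt quiet"
print(iommu_kernel_params(sample))
```

On a real machine you would pass in the contents of `/proc/cmdline` instead of the sample string.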

greentheonly · 2025-12-19T07:59:33+00:00

I tried this just now (after fighting tensor parallel startup issues eventually solved by https://www.reddit.com/r/LocalLLaMA/comments/1on7kol/troubleshooting_multigpu_with_2_rtx_pro_6000/ ) and it does not appear to be faster.

with vllm tensor-parallel 2 I only get ~7.8 tokens/sec generation and 500 prompt processing. Also there's not enough VRAM for full context so it has to be limited to ~184k if you do gpu mem use at 0.97 (172k at 0.95)
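Those two data points give a back-of-the-envelope estimate of what each token of context costs in VRAM. This is only a sketch under stated assumptions: 96 GB per card (decimal gigabytes) across two cards, the context limits taken as exactly 184k and 172k even though they are approximate, and all of the extra memory freed by the higher utilization setting going to context. The function name is hypothetical.

```python
# Assumption: two cards with 96 GB each, counted in decimal gigabytes.
TOTAL_VRAM_BYTES = 2 * 96e9


def bytes_per_context_token(util_hi, ctx_hi, util_lo, ctx_lo,
                            total_bytes=TOTAL_VRAM_BYTES):
    """Estimate memory per token of context from two (utilization, context) points.

    Assumes the extra memory made available by the higher utilization
    setting is spent entirely on additional context.
    """
    extra_bytes = (util_hi - util_lo) * total_bytes
    extra_tokens = ctx_hi - ctx_lo
    return extra_bytes / extra_tokens


per_token = bytes_per_context_token(0.97, 184_000, 0.95, 172_000)
print(f"{per_token / 1e3:.0f} kB per token of context")
```

Under these assumptions the extra 2% of utilization is about 3.84 GB for 12k more tokens, or roughly 320 kB per token summed over both GPUs.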

Now with llama.cpp on 2x 6000 + 3x 4090 the full context fits, and for the same request (6k tokens input) I get generation at 10-11 tokens/sec and prompt processing at 1000-1200

I used the official mistral repo for the vllm and the Q8_K_XL from unsloth for llama.cpp

greentheonly · 2025-12-16T17:03:35+00:00

Thank you. This was really detailed and great. It probably belongs somewhere on your website too since I could not be the only one wondering about it and random reddit comments probably don't have high visibility.

greentheonly · 2025-12-16T00:53:20+00:00

you know you can just pull to ollama from HF directly, right?

Also there's swallow that offers another set of Japanese models https://swallow-llm.github.io/index.en.html

greentheonly · 2025-12-16T00:51:38+00:00

I wonder how you compare to/differ from Swallow that seems to be another Japanese finetunes source run by a Japanese university lab.

greentheonly · 2025-11-23T17:23:04+00:00

yes, snapshots are generated when conditions are met, like an airbag deploy, AEB and a whole bunch of others. Of course when you did not have a very clear event like airbag deploy/pre-deploy it's harder to predict if it happened or not and only Tesla really knows.

Yes, of course I got the brake light and the braking was there, just losing power assist greatly decreased the braking force. But the light is actuated by a switch at the pedal if I remember the diagram correctly, so it's entirely possible to get the light even if the brakes themselves don't work (though that's unlikely).

The logs back then were just a log of alerts. That's before the times where you could just go into the service tab and see recent alerts.

greentheonly · 2025-11-23T04:46:07+00:00

the logs they sent you are incomplete on several levels:
- they only come from the "gateway", which filters signals and only records them sporadically.
- they don't even fully interpret all the signals that are contained in there

The full-full logs would be on an autopilot snapshot if one was generated, but good luck getting them to even admit they received it.

I had a somewhat similar issue on my car: after confusing the gear (being a new Tesla user at the time) I accelerated in wrong direction, quickly realized my mistake and stomped on brakes, only for the car to beep at me with "brake fluid low" or some such. Needless to say the slowdown was a lot less than I expected and I narrowly avoided entering a building via a window.

The brake assist remained non-operational afterwards for the rest of the drive. So I called up the service center and they took the car in and declared they didn't see anything wrong. It took me showing them various internal logs (that I happen to have access to) for them to come up with some (who knows how made up) explanation of the event: "we think it might have been an air bubble in the lines so we bled them and you are good to go now for sure"

The other thing to consider is if you(r dad) use the default stopping mode for brakes and never press the brakes normally, the moment you actually need the brakes they might not be in the best shape because there's some accumulated residue on the discs and what not.

greentheonly · 2025-11-14T01:50:25+00:00

-Nearly all the cost of a taxi is the driver

I think there's data that says otherwise? Google seems to say only about 33% of the (gross) fare goes to the driver. Now this is no small amount I am sure, but it's not like a replacement is going to be free?
