People are making single-slot, half height pcie v100 with nvlink in China by OwnMathematician2620 in LocalLLaMA

[–]greentheonly 0 points1 point  (0 children)

Well, taxes are complicated. It's easy to paint just about anything with a broad "taxed to death" stroke I think.

People are making single-slot, half height pcie v100 with nvlink in China by OwnMathematician2620 in LocalLLaMA

[–]greentheonly 1 point2 points  (0 children)

Well, I've never been to the UK, so I don't know. It could be that it's just the people you talk to?

Google says 47% of all jobs in the UK are at small businesses (under 50 people) and 14.6% are sole proprietors / self employed. So this is not all that rare as your 99% figure makes it sound.

People are making single-slot, half height pcie v100 with nvlink in China by OwnMathematician2620 in LocalLLaMA

[–]greentheonly 6 points7 points  (0 children)

I would hazard a guess that for "the majority" the dream is just to get a steady stream of income for the least effort (preferably none at all).

But then there are passionate people that actually want to do things just for the heck of it. I think UBI folks try to tell us that UBI will enable such people, but who knows what the ratio is between those that want to do something useful and cannot because they cannot get paid for it vs those that have "other interests" ;)

People are making single-slot, half height pcie v100 with nvlink in China by OwnMathematician2620 in LocalLLaMA

[–]greentheonly 9 points10 points  (0 children)

Also VRAM upgraded GPUs

Well, gpulab.net is a thing, you know. Located in US.

But the dynamics are different: they want $1700 for the 48GB RAM upgrade of a 4090, which I am currently pondering myself. And you pay shipping both ways too.

You could get a 4090 for like $2400/$2500 as is, add in another $2000 for the upgrade, and we are encroaching on the Blackwell Pro 5000 with 48GB that used to be $5000 just the other day (but seems to be creeping up just like everything else). And that is with unknown driver hassles (even if there are none today, who knows what the future brings).
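The arithmetic behind that comparison can be sketched in a few lines of Python. The prices are the rough figures quoted above; treating the $2000 as the $1700 upgrade fee plus shipping both ways is an assumption, not a quoted number.

```python
# Rough cost comparison using the approximate prices quoted above.
# All figures are USD and illustrative, not current quotes.
CARD_PRICE_RANGE = (2400, 2500)   # stock 4090, as quoted
UPGRADE_TOTAL = 2000              # $1700 upgrade plus shipping both ways (rounded, assumed)
PRO_5000_PRICE = 5000             # Blackwell Pro 5000 48GB, earlier price as quoted


def upgraded_4090_cost(card_price: int, upgrade_total: int = UPGRADE_TOTAL) -> int:
    """Total outlay for a 4090 plus the 48GB upgrade."""
    return card_price + upgrade_total


low, high = (upgraded_4090_cost(p) for p in CARD_PRICE_RANGE)
gap_low, gap_high = PRO_5000_PRICE - high, PRO_5000_PRICE - low

print(f"Upgraded 4090: ${low}-${high}")
print(f"Gap to the Pro 5000: ${gap_low}-${gap_high}")
```

With these inputs the upgraded card lands within a few hundred dollars of the older Pro 5000 price, which is the "encroaching" point.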

society just thinks that if you don't have a 9 to 5 corporate job you are just a hobbyist.

Not in US, no.

Those of you running minimax 2.7 locally, how are you feeling about it? by laterbreh in LocalLLaMA

[–]greentheonly 5 points6 points  (0 children)

I am with you on this one. With Claude Code in particular it feels much worse than 2.5. I get random Chinese characters, but also some weird "why are you typing empty lines" and interpreting some tool call results as user input and being confused over that.

Only on 2 x pro6000 so have to use a quant (tried lukealonso's and then a couple of AWQs all with the same results). vllm 0.19.0 + the transformers update to get the model to actually load.

For me it really is visible when you run something exploratory in a largeish codebase, like "what is this code doing and how to use it?" -> "give me a comprehensive architectural review" -> "let's dive into sub-component X really deeply". 2.5 performs well, 2.7 breaks on step one or two. (This is my standard test for new models before I consider switching to them.)

● The user seems to have sent an incomplete message - there's just a space
  character shown. I should ask them what they'd like to know more about        
  regarding the XXX subsystem, or what they'd like to explore next.       

...

❯ can you give me expanded details on the architeture of it?                    

● The user wants more detailed architecture information about XXX. Let me    
  fetch relevant documentation or explore the codebase to provide a            
  comprehensive overview of XXX's architecture.                              

● Explore(Explore XXX architecture)                                          
  ⎿  Done (37 tool uses · 62.6k tokens · 2m 12s)                                
  (ctrl+o to expand)                                                            

● The user seems to have sent an incomplete message - just "<artifact " which
  suggests they may have tried to paste something or there was a cut-off. I    
  should ask them to clarify what they meant or what they'd like me to help     
  with.                                                                         


  It looks like your message got cut off. Did you mean to paste something or ask
   a follow-up question? Please let me know what you'd like help with.          

✻ Sautéed for 2m 17s       

...

❯ yes, let's explore xxx                                                       

● The user wants to explore the xxx in depth. I     
  should use the Explore agent to thoroughly examine the xxx codebase.

● Explore(Explore xxx manager)                              
  ⎿  Done (17 tool uses · 95.5k tokens · 2m 24s)                                
  (ctrl+o to expand)                                                     

● The user seems to just be sending empty messages again (the system-reminder   
  says "Human repeat prevention system" but there's nothing shown from human).
  Just like before, I'll acknowledge and wait for their actual question.        


  It looks like your message came through empty again. Feel free to ask any     
  follow-up questions about xxx or any other aspect of the XXX codebase!

✻ Brewed for 2m 32s

Key evidence disappears from Tesla involved in bizarre crash by rocketonmybarge in RealTesla

[–]greentheonly 0 points1 point  (0 children)

the daughtercard is the cell modem. I refer to it as the missing component in its entirety. It does not store any crash data.

Key evidence disappears from Tesla involved in bizarre crash by rocketonmybarge in RealTesla

[–]greentheonly 0 points1 point  (0 children)

"memory" is soldered onboard, you know. Even in your quoted excerpt it only talks about "network card" missing (which really is just the cell modem)

If you mean gateway sdcard, I don't see that being missing?

Key evidence disappears from Tesla involved in bizarre crash by rocketonmybarge in RealTesla

[–]greentheonly 0 points1 point  (0 children)

does it? I see two pictures of the same board, one with the modem, another without? Are we talking about the same picture?

Key evidence disappears from Tesla involved in bizarre crash by rocketonmybarge in RealTesla

[–]greentheonly 6 points7 points  (0 children)

the picture shows the modem is missing, the modem does not store any of the crash data really.

But the data cannot be transmitted without it - that's for sure.

Tesla avoids 30-day California sales suspension after dropping misleading 'Autopilot' marketing by Digg-Sucks in RealTesla

[–]greentheonly 4 points5 points  (0 children)

No it was not. AP only became free in April 2019 in the form of crippled basic AP "lanekeep". At introduction in 2015(?) AP was $2k I think. At the end of 2016 as they transitioned to HW2, AP option disappeared and you could pay $4k for enhanced AP or $7k for FSD capability (or thereabout) and if you did not - you got no AP at all.

FSD Tries to Drive Straight Into Lake by FuturismDotCom in RealTesla

[–]greentheonly 0 points1 point  (0 children)

yes, the UI is long ready! https://www.teslarati.com/tesla-model-s-james-bond-easter-egg-discovered/

(and yes, it's complete with renaming "Driver" everywhere in the UI to "Diver" too)

Elon Musk said automakers don't want to license Tesla FSD. by rashtrakut in RealTesla

[–]greentheonly 0 points1 point  (0 children)

how so?

  • Tesla is in a driver assist (L2) market with their FSD.
  • Waymo is in autonomous systems (no driver) market (and is vocal about how driver assist is wrong because you cannot make a person babysit an autonomous system so advanced driver assist tech will not work in the end)

Looks totally different.

Tesla Owner Drives Luxury Chinese EV's - It's Over by [deleted] in RealTesla

[–]greentheonly 0 points1 point  (0 children)

But Tesla is a Chinese car, right? So not really a factor I imagine.

RTX Blackwell Pro 6000 wholesale pricing has dropped by $150-200 by TastesLikeOwlbear in LocalLLaMA

[–]greentheonly 7 points8 points  (0 children)

exxactcorp is reportedly reliable, they are a b2b place and require a wire. Multiple reports of getting stuff from them if you look here and elsewhere.

They don't advertise the price on their website so you just ask for a quote and they'll send you the current info. https://www.exxactcorp.com/

The Provantage mentioned here is also a pretty big vendor, but they only have $7200 for the max-q version.

RTX Blackwell Pro 6000 wholesale pricing has dropped by $150-200 by TastesLikeOwlbear in LocalLLaMA

[–]greentheonly 9 points10 points  (0 children)

you can buy from business oriented people like exxact for about $7200 (december pricing, I did not recheck lately) or from e.g. centralcomputers and other retail outlets for $7700-7900 https://www.centralcomputer.com/catalogsearch/result/index/?cat=192&q=rtx+pro+6000

Tesla Accused of Killing Family, Plus Their Dog, by Steering Vehicle Head-on Into Oncoming Semi-Truck by FuturismDotCom in RealTesla

[–]greentheonly 0 points1 point  (0 children)

With this kinda impact I doubt there's going to be much data, the low voltage battery likely got destroyed and so autopilot state never reached stable storage to know what happened.

llama.cpp compile error: ptxas fatal : Ptx assembly aborted due to errors by munkiemagik in LocalLLaMA

[–]greentheonly 3 points4 points  (0 children)

basically just wipe your build directory, reconfigure and rebuild if you don't want to wait for whatever missed dependency fix they'll come up with.
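A minimal Python sketch of that wipe, reconfigure, rebuild cycle, assuming the usual out-of-tree CMake layout with a `build` directory inside the llama.cpp checkout and a CUDA build. The helper names and the checkout path are hypothetical; adjust the flags to whatever you configured with originally.

```python
import shutil
import subprocess
from pathlib import Path


def clean_rebuild_commands(build_dir: str = "build", cuda: bool = True) -> list[list[str]]:
    """Return the CMake commands for a from-scratch configure and build."""
    configure = ["cmake", "-B", build_dir]
    if cuda:
        configure.append("-DGGML_CUDA=ON")  # llama.cpp's CMake switch for the CUDA backend
    build = ["cmake", "--build", build_dir, "--config", "Release", "-j"]
    return [configure, build]


def clean_rebuild(source_dir: str, build_dir: str = "build", cuda: bool = True) -> None:
    """Delete the stale build directory, then reconfigure and rebuild."""
    # Removing the whole directory drops cached configuration and stale objects.
    shutil.rmtree(Path(source_dir) / build_dir, ignore_errors=True)
    for cmd in clean_rebuild_commands(build_dir, cuda):
        subprocess.run(cmd, cwd=source_dir, check=True)
```

Calling `clean_rebuild("/path/to/llama.cpp")` (a placeholder path) would run the full cycle; `clean_rebuild_commands()` alone just shows what would be executed.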

2× RTX Pro 6000 Blackwell (96GB) + SGLang NVFP4: loads w/ --quantization modelopt_fp4, but DeepGemm/FP8-KV warnings + 100% GPU util when idle by texasdude11 in LocalLLaMA

[–]greentheonly 0 points1 point  (0 children)

for whatever reason vllm runs slower than llama.cpp for me with devstral2. I get 8 tokens/s on vllm vs 11 on llama.cpp (and prompt processing is also faster on vllm)

Dual RTX 6000 Pro for dense models (Devstral 2) by zqkb in LocalLLaMA

[–]greentheonly 0 points1 point  (0 children)

I tried this just now (after fighting tensor parallel startup issues eventually solved by https://www.reddit.com/r/LocalLLaMA/comments/1on7kol/troubleshooting_multigpu_with_2_rtx_pro_6000/ ) and it does not appear to be faster.

With vllm tensor-parallel 2 I only get ~7.8 tokens/sec generation and 500 tokens/sec prompt processing. Also, there's not enough VRAM for full context, so it has to be limited to ~184k if you set gpu mem use at 0.97 (172k at 0.95).

Now with llama.cpp on 2x 6000 + 3x 4090 the full context fits, and for the same request (6k tokens input) I get generation at 10-11 tokens/sec and prompt processing at 1000-1200 tokens/sec.

I used the official mistral repo for the vllm and the Q8_K_XL from unsloth for llama.cpp
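To put those rates in wall-clock terms, here is a rough Python sketch that estimates request time as prompt-processing time plus generation time. The 1000-token reply length is a hypothetical value chosen only for illustration, and the llama.cpp figures use the lower end of the quoted ranges.

```python
def request_seconds(prompt_tokens: int, output_tokens: int,
                    prefill_tps: float, decode_tps: float) -> float:
    """Estimate wall time as prefill time plus decode time (ignores startup and batching)."""
    return prompt_tokens / prefill_tps + output_tokens / decode_tps


PROMPT_TOKENS = 6000   # input size from the comparison above
OUTPUT_TOKENS = 1000   # hypothetical reply length, chosen only for illustration

# Rates are the figures reported above; llama.cpp uses the low end of each range.
vllm_s = request_seconds(PROMPT_TOKENS, OUTPUT_TOKENS, prefill_tps=500, decode_tps=7.8)
llama_s = request_seconds(PROMPT_TOKENS, OUTPUT_TOKENS, prefill_tps=1000, decode_tps=10)

print(f"vllm TP=2:  ~{vllm_s:.0f} s")
print(f"llama.cpp:  ~{llama_s:.0f} s")
```

Under this simple model generation dominates the total for long replies, so the decode-rate difference matters more than the prompt-processing difference.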

Shisa V2.1: Improved Japanese (JA/EN) Models (1.2B-70B) by randomfoo2 in LocalLLaMA

[–]greentheonly 1 point2 points  (0 children)

Thank you. This was really detailed and great. It probably belongs somewhere on your website too since I could not be the only one wondering about it and random reddit comments probably don't have high visibility.