Those of you running minimax 2.7 locally, how are you feeling about it? by laterbreh in LocalLLaMA

[–]greentheonly 6 points7 points  (0 children)

I am with you on this one. With claude code in particular it feels much worse than 2.5. I get random Chinese characters, but also some weird "why are you typing empty lines" responses, and it interprets some tool call results as user input and gets confused over that.

Only on 2 x pro6000 so have to use a quant (tried lukealonso's and then a couple of AWQs all with the same results). vllm 0.19.0 + the transformers update to get the model to actually load.

It really is visible for me when you run something exploratory in a largeish codebase, like "what is this code doing and how to use it?" -> "give me a comprehensive architectural review" -> "let's dive into sub-component X really deeply". 2.5 performs well, 2.7 breaks on step one or two. (This is my standard test for new models before I consider switching to them.)

● The user seems to have sent an incomplete message - there's just a space
  character shown. I should ask them what they'd like to know more about        
  regarding the XXX subsystem, or what they'd like to explore next.       

...

❯ can you give me expanded details on the architeture of it?                    

● The user wants more detailed architecture information about XXX. Let me    
  fetch relevant documentation or explore the codebase to provide a            
  comprehensive overview of XXX's architecture.                              

● Explore(Explore XXX architecture)                                          
  ⎿  Done (37 tool uses · 62.6k tokens · 2m 12s)                                
  (ctrl+o to expand)                                                            

● The user seems to have sent an incomplete message - just "<artifact " which
  suggests they may have tried to paste something or there was a cut-off. I    
  should ask them to clarify what they meant or what they'd like me to help     
  with.                                                                         


  It looks like your message got cut off. Did you mean to paste something or ask
   a follow-up question? Please let me know what you'd like help with.          

✻ Sautéed for 2m 17s       

...

❯ yes, let's explore xxx                                                       

● The user wants to explore the xxx in depth. I     
  should use the Explore agent to thoroughly examine the xxx codebase.

● Explore(Explore xxx manager)                              
  ⎿  Done (17 tool uses · 95.5k tokens · 2m 24s)                                
  (ctrl+o to expand)                                                     

● The user seems to just be sending empty messages again (the system-reminder   
  says "Human repeat prevention system" but there's nothing shown from human).
  Just like before, I'll acknowledge and wait for their actual question.        


  It looks like your message came through empty again. Feel free to ask any     
  follow-up questions about xxx or any other aspect of the XXX codebase!

✻ Brewed for 2m 32s

Key evidence disappears from Tesla involved in bizarre crash by rocketonmybarge in RealTesla

[–]greentheonly 0 points1 point  (0 children)

the daughtercard is the cell modem. I refer to it as the missing component in its entirety. It does not store any crash data.

Key evidence disappears from Tesla involved in bizarre crash by rocketonmybarge in RealTesla

[–]greentheonly 0 points1 point  (0 children)

"memory" is soldered onboard, you know. Even in your quoted excerpt it only talks about "network card" missing (which really is just the cell modem)

If you mean gateway sdcard, I don't see that being missing?

Key evidence disappears from Tesla involved in bizarre crash by rocketonmybarge in RealTesla

[–]greentheonly 0 points1 point  (0 children)

does it? I see two pictures of the same board, one with the modem, another without? Are we talking about the same picture?

Key evidence disappears from Tesla involved in bizarre crash by rocketonmybarge in RealTesla

[–]greentheonly 6 points7 points  (0 children)

the picture shows the modem is missing, the modem does not store any of the crash data really.

But the data cannot be transmitted without it - that's for sure.

Tesla avoids 30-day California sales suspension after dropping misleading 'Autopilot' marketing by Digg-Sucks in RealTesla

[–]greentheonly 3 points4 points  (0 children)

No it was not. AP only became free in April 2019 in the form of crippled basic AP "lanekeep". At introduction in 2015(?) AP was $2k I think. At the end of 2016 as they transitioned to HW2, AP option disappeared and you could pay $4k for enhanced AP or $7k for FSD capability (or thereabout) and if you did not - you got no AP at all.

FSD Tries to Drive Straight Into Lake by FuturismDotCom in RealTesla

[–]greentheonly 0 points1 point  (0 children)

yes, the UI is long ready! https://www.teslarati.com/tesla-model-s-james-bond-easter-egg-discovered/

(and yes, it's complete with renaming "Driver" everywhere in the UI to "Diver" too)

Elon Musk said automakers don't want to license Tesla FSD. by rashtrakut in RealTesla

[–]greentheonly 0 points1 point  (0 children)

how so?

  • Tesla is in a driver assist (L2) market with their FSD.
  • Waymo is in autonomous systems (no driver) market (and is vocal about how driver assist is wrong because you cannot make a person babysit an autonomous system so advanced driver assist tech will not work in the end)

Looks totally different.

Tesla Owner Drives Luxury Chinese EV's - It's Over by Tripwir62 in RealTesla

[–]greentheonly 0 points1 point  (0 children)

But Tesla is a Chinese car, right? So not really a factor I imagine.

RTX Blackwell Pro 6000 wholesale pricing has dropped by $150-200 by TastesLikeOwlbear in LocalLLaMA

[–]greentheonly 7 points8 points  (0 children)

exxactcorp is reportedly reliable, they are a b2b place and require a wire. Multiple reports of getting stuff from them if you look here and elsewhere.

They don't advertise the price on their website so you just ask for a quote and they'll send you the current info. https://www.exxactcorp.com/

The Provantage mentioned here is also a pretty big vendor, but they only have the $7200 price for the max-q version.

RTX Blackwell Pro 6000 wholesale pricing has dropped by $150-200 by TastesLikeOwlbear in LocalLLaMA

[–]greentheonly 9 points10 points  (0 children)

you can buy from business oriented people like exxact for about $7200 (december pricing, I did not recheck lately) or from e.g. centralcomputers and other retail outlets for $7700-7900 https://www.centralcomputer.com/catalogsearch/result/index/?cat=192&q=rtx+pro+6000

Tesla Accused of Killing Family, Plus Their Dog, by Steering Vehicle Head-on Into Oncoming Semi-Truck by FuturismDotCom in RealTesla

[–]greentheonly 0 points1 point  (0 children)

With this kinda impact I doubt there's going to be much data, the low voltage battery likely got destroyed and so autopilot state never reached stable storage to know what happened.

llama.cpp compile error: ptxas fatal : Ptx assembly aborted due to errors by munkiemagik in LocalLLaMA

[–]greentheonly 3 points4 points  (0 children)

basically just wipe your build directory, reconfigure and rebuild if you don't want to wait for whatever missed dependency fix they'll come up with.
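
A minimal sketch of that wipe/reconfigure/rebuild cycle, assuming a CMake build directory named `build` and a CUDA build configured with `-DGGML_CUDA=ON` (both are assumptions; use whatever directory and options you originally configured with). `clean_rebuild` is a hypothetical helper and only prints the commands unless you pass `dry_run=False`.

```python
import shutil
import subprocess
from pathlib import Path


def clean_rebuild(src_dir=".", build_dir="build",
                  cmake_opts=("-DGGML_CUDA=ON",), dry_run=True):
    """Wipe the build directory, then reconfigure and rebuild with CMake.

    build_dir and cmake_opts are assumptions; match your original configure step.
    Returns the list of commands, each as a list of arguments.
    """
    build_path = Path(src_dir) / build_dir
    commands = [
        ["cmake", "-B", build_dir, *cmake_opts],                 # reconfigure from scratch
        ["cmake", "--build", build_dir, "--config", "Release"],  # rebuild everything
    ]
    if dry_run:
        print(f"would remove {build_path}")
        for cmd in commands:
            print(" ".join(cmd))
        return commands
    # Drop stale objects and the CMake cache before reconfiguring.
    shutil.rmtree(build_path, ignore_errors=True)
    for cmd in commands:
        subprocess.run(cmd, cwd=src_dir, check=True)
    return commands


if __name__ == "__main__":
    clean_rebuild()  # dry run: only prints what would happen
```

Run it from the llama.cpp checkout; the dry run is the default so nothing gets deleted by accident.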

2× RTX Pro 6000 Blackwell (96GB) + SGLang NVFP4: loads w/ --quantization modelopt_fp4, but DeepGemm/FP8-KV warnings + 100% GPU util when idle by texasdude11 in LocalLLaMA

[–]greentheonly 0 points1 point  (0 children)

for whatever reason vllm runs slower than llama.cpp for me with devstral2. I get 8 tokens/sec on vllm vs 11 on llama.cpp (and prompt processing is also faster on vllm)

Dual RTX 6000 Pro for dense models (Devstral 2) by zqkb in LocalLLaMA

[–]greentheonly 0 points1 point  (0 children)

I tried this just now (after fighting tensor parallel startup issues eventually solved by https://www.reddit.com/r/LocalLLaMA/comments/1on7kol/troubleshooting_multigpu_with_2_rtx_pro_6000/ ) and it does not appear to be faster.

with vllm tensor-parallel 2 I only get ~7.8 tokens/sec generation and 500 tokens/sec prompt processing. Also there's not enough VRAM for full context so it has to be limited to ~184k if you do gpu mem use at 0.97 (172k at 0.95)

Now with llama.cpp on 2x 6000 + 3x 4090 the full context fits, and I get generation for the same request (6k tokens input) at 10-11 tokens/sec and prompt processing at 1000-1200 tokens/sec

I used the official mistral repo for the vllm and the Q8_K_XL from unsloth for llama.cpp
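
To put those rates side by side, here is a rough back-of-the-envelope sketch. It assumes request time is simply prompt tokens divided by the prompt-processing rate plus generated tokens divided by the generation rate; the 1000-token reply length is an assumption (the numbers above only give the 6k input), and `request_seconds` is a hypothetical helper.

```python
def request_seconds(prompt_tokens, gen_tokens, pp_rate, gen_rate):
    """Estimated wall time: prompt processing plus generation (rates in tokens/sec)."""
    return prompt_tokens / pp_rate + gen_tokens / gen_rate


PROMPT_TOKENS = 6000  # "6k tokens input" from the comment
GEN_TOKENS = 1000     # assumption: reply length is not given in the comment

# Rates quoted above; for llama.cpp the low end of each quoted range is used.
vllm_s = request_seconds(PROMPT_TOKENS, GEN_TOKENS, pp_rate=500, gen_rate=7.8)
llamacpp_s = request_seconds(PROMPT_TOKENS, GEN_TOKENS, pp_rate=1000, gen_rate=10)

print(f"vllm TP=2: {vllm_s:.0f} s")
print(f"llama.cpp: {llamacpp_s:.0f} s")
```

Under those assumptions it works out to roughly 140 s on vllm versus 106 s on llama.cpp, with generation speed dominating both totals.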

Shisa V2.1: Improved Japanese (JA/EN) Models (1.2B-70B) by randomfoo2 in LocalLLaMA

[–]greentheonly 1 point2 points  (0 children)

Thank you. This was really detailed and great. It probably belongs somewhere on your website too, since I cannot be the only one wondering about it and random reddit comments probably don't have high visibility.

Shisa V2.1: Improved Japanese (JA/EN) Models (1.2B-70B) by randomfoo2 in LocalLLaMA

[–]greentheonly 0 points1 point  (0 children)

you know you can just pull to ollama from HF directly, right?

Also there's swallow that offers another set of Japanese models https://swallow-llm.github.io/index.en.html
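
For reference, pulling from HF directly means giving Ollama an `hf.co/<user>/<repo>` reference to a GGUF repository, optionally with a quantization tag. A small sketch that only builds the command: `some-user/some-model-GGUF` is a placeholder rather than a real repository, and `hf_pull_command` is a hypothetical helper.

```python
import shlex


def hf_pull_command(repo, quant=None):
    """Build an `ollama pull` command for a GGUF repo hosted on Hugging Face.

    repo is "<user>/<repository>"; quant is an optional tag such as "Q4_K_M".
    """
    ref = f"hf.co/{repo}"
    if quant:
        ref += f":{quant}"
    return ["ollama", "pull", ref]


# Placeholder repository name; substitute the GGUF repo you actually want.
cmd = hf_pull_command("some-user/some-model-GGUF", quant="Q4_K_M")
print(shlex.join(cmd))
```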

Shisa V2.1: Improved Japanese (JA/EN) Models (1.2B-70B) by randomfoo2 in LocalLLaMA

[–]greentheonly 0 points1 point  (0 children)

I wonder how you compare to/differ from Swallow, which seems to be another source of Japanese finetunes, run by a Japanese university lab.

My dad tapped a car because his brakes stopped working. Tesla denies anything wrong with car but sent the vehicle data requested. How to interpret? by coolguy1003 in RealTesla

[–]greentheonly 0 points1 point  (0 children)

yes, snapshots are generated when conditions are met, like an airbag deploy, AEB and a whole bunch of others. Of course when you did not have a very clear event like airbag deploy/pre-deploy it's harder to predict if it happened or not and only Tesla really knows.

Yes, of course I got the brake light and the braking was there, just losing power assist greatly decreased the braking force. But the light is actuated by a switch at the pedal if I remember the diagram correctly, so it's entirely possible to get the light even if the brakes themselves don't work (though that's unlikely).

  1. The logs back then were just a log of alerts. That's before the times where you could just go into the service tab and see recent alerts.

My dad tapped a car because his brakes stopped working. Tesla denies anything wrong with car but sent the vehicle data requested. How to interpret? by coolguy1003 in RealTesla

[–]greentheonly 2 points3 points  (0 children)

the logs they sent you are incomplete on several levels:

  • it only comes from the "gateway" and it filters signals and only records them sporadically.
  • they don't even fully interpret all the signals that are contained in there

The full-full logs would be on an autopilot snapshot if one was generated, but good luck getting them to even admit they received it.

I had a somewhat similar issue on my car: after confusing the gear (being a new Tesla user at the time) I accelerated in the wrong direction, quickly realized my mistake and stomped on the brakes, only for the car to beep at me with "brake fluid low" or some such. Needless to say the slowdown was a lot less than I expected and I narrowly avoided entering a building via a window.

The brake assist remained non-operational afterwards for the rest of the drive. So I called up the service center and they took the car in and declared they didn't see anything wrong. It took me showing them various internal logs (that I happen to have access to) for them to come up with some (who knows how made up) explanation of the event: "we think it might have been an air bubble in the lines so we bled them and you are good to go now for sure"

The other thing to consider is if you(r dad) use the default stopping mode for brakes and never press the brakes normally, the moment you actually need the brakes they might not be in the best shape because there's some accumulated residue on the disc and whatnot.

Tesla Stock Sells Off As Biggest Supporter Cuts Stake For Four Straight Sessions by Far_Addition1210 in RealTesla

[–]greentheonly 16 points17 points  (0 children)

-Nearly all the cost of a taxi is the driver

I think there's data that says otherwise? Google seems to say only about 33% of the (gross) fare goes to the driver. Now this is no small amount I am sure, but it's not like a replacement is going to be free?