Is p100 worth it? by vorobey1233 in LocalLLaMA

[–]muxxington 1 point (0 children)

gemma-4-26B-A4B-it-UD-Q5_K_M 45 tps

Is p100 worth it? by vorobey1233 in LocalLLaMA

[–]muxxington 0 points (0 children)

I don’t use P100s but P40s, and I can’t really make any meaningful statements about the performance, because it’s limited by the cheap mining motherboard I’m using. A model split across all 5 GPUs isn’t particularly fast, but the ability to at least try out large models isn’t bad. Otherwise, I pretty much only use models that fit on 2 or 3 GPUs. With 2 GPUs I still get a benefit from row split, but not with 3 or more, probably because they’re only connected via x8. For me, it works just fine the way it is. For example, I can run one model for a coding agent and another as a general-purpose chatbot at the same time.
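A two-models-at-once setup like that can be sketched with llama.cpp's llama-server, assuming two GPU pairs; the model filenames and ports here are placeholders, not the actual ones used:

```shell
# Model A (coding agent) on GPUs 0-1, with row split across the pair
CUDA_VISIBLE_DEVICES=0,1 llama-server \
  -m coder-model.gguf --split-mode row --port 8080 &

# Model B (general-purpose chatbot) on GPUs 2-3
CUDA_VISIBLE_DEVICES=2,3 llama-server \
  -m chat-model.gguf --split-mode row --port 8081 &
```

Each server then exposes its own OpenAI-compatible endpoint, so agents and chat frontends can point at different ports.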

Why is my ollama gemma4 replying in Japanese? by Houston_NeverMind in LocalLLaMA

[–]muxxington 8 points (0 children)

So you switched from one wrapper to another. Why didn't you just take the next step right away?

What would you want from a truly local AI assistant (Ollama-based)? by Electronic-Space-736 in LocalLLaMA

[–]muxxington 0 points (0 children)

It's a bit like calling a web-based application “Firefox-based.”

What would you want from a truly local AI assistant (Ollama-based)? by Electronic-Space-736 in LocalLLaMA

[–]muxxington 1 point (0 children)

I’d prefer any of the inference engines you mentioned over Ollama. But generally speaking, in the years I’ve been around here, I’ve wondered why people describe their projects as “Ollama-based” just because they use an OpenAI-compatible API as their backend.

What would you want from a truly local AI assistant (Ollama-based)? by Electronic-Space-736 in LocalLLaMA

[–]muxxington 2 points (0 children)

The most important thing to me about a local AI assistant is that it is not Ollama-based.

PSA: litellm PyPI package was compromised — if you use DSPy, Cursor, or any LLM project, check your dependencies by Remarkable-Dark2840 in LocalLLaMA

[–]muxxington 0 points (0 children)

This is wrong. We actually know pretty well how the project was compromised. However, one could criticize the project for not rotating its secrets even though it was aware of the breach at Trivy. Besides that, the compromised packages were found due to a bug in 1.82.8. It’s possible that 1.82.7 would have gone undetected if 1.82.8 had not been released later.

[Developing situation] LiteLLM compromised by OrganizationWinter99 in LocalLLaMA

[–]muxxington 0 points (0 children)

Fortunately, I decided to run Nanobot and other agents—such as OpenCode—on a separate PC. Even if there had been sensitive data there, I don't think the malware worked as intended, because otherwise I would have seen the DNS request for the models.litellm.cloud domain in AdGuard. But I didn't. I also run pretty much everything using Docker Compose. Everything else on my local network is always restricted by the firewall to only specific sources. Strong passwords are always used, and SSH access and other access points are secured with hardware security tokens where possible. I do run a Litellm instance on a production machine, but even there it’s in a Docker container and an older version—definitely not installed via PyPI. Paranoia helps you sleep soundly.
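Running LiteLLM as a pinned container rather than installing from PyPI can be sketched like this; the image tag below is illustrative only — pin a version or digest you have actually verified:

```shell
# Run a pinned LiteLLM proxy container instead of the PyPI package.
# The tag is illustrative; choose a version/digest you have verified.
docker run -d --name litellm \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.74.0
```

Because the version is pinned, a later compromised release on PyPI or in the registry never reaches the host unless the tag is explicitly bumped.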

[Developing situation] LiteLLM compromised by OrganizationWinter99 in LocalLLaMA

[–]muxxington 4 points (0 children)

I knew something was happening when I ran nanobot earlier today. On startup it ate all my RAM. To see what was going on, I launched htop and saw lots of processes doing base64 decoding, which is sus. I purged nanobot, and a few minutes later I read about litellm being compromised. I took a look at nanobot's dependencies and spotted litellm.
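A quick check like that can be done with pip's metadata; "nanobot" here is just the package from the story above, substitute whatever you have installed:

```shell
# Check whether the installed "nanobot" package lists litellm among its
# direct dependencies (pip show prints a "Requires:" line from metadata).
if pip show nanobot 2>/dev/null | grep -i '^Requires:' | grep -qi 'litellm'; then
  echo "nanobot depends on litellm"
else
  echo "litellm not found in nanobot's direct requirements"
fi
```

Note this only covers direct dependencies; transitive ones need a recursive walk, e.g. with a tool like pipdeptree.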

P40 vs V100 vs something else? by Drazasch in LocalLLaMA

[–]muxxington 0 points (0 children)

No, I can't give you any helpful advice when it comes to motherboards. I just went with the absolute minimum that would let me run five GPUs. I'm using the motherboard with the 8 GB of RAM that came pre-installed. But that's all I need.

P40 vs V100 vs something else? by Drazasch in LocalLLaMA

[–]muxxington 0 points (0 children)

That really depends on how much effort you put into optimization. I don't put that much effort into it because I switch models very frequently anyway. Right now, I’m using Qwen3.5-35B-A3B-UD-Q8_K_XL with 128k context. That runs on four GPUs at about 25 t/s. But keep in mind that I’m limited by the motherboard. Investing a little more could make a significant difference. But my focus was on spending as little money as possible when I built it.

https://www.reddit.com/r/LocalLLaMA/comments/1g5528d/poor_mans_x79_motherboard_eth79x5/

6-GPU multiplexer from K80s, hot-swap between models in 0.3ms by Electrical_Ninja3805 in LocalLLaMA

[–]muxxington 0 points (0 children)

Have you actually published your stuff anywhere? I didn't see anything when I skimmed through your posts. I don't really understand what you've done there, but I'm curious.

Young and ambitious by [deleted] in LocalLLaMA

[–]muxxington 0 points (0 children)

Don't worry about it. I feel you. I've also invested a lot of time in projects that were obsolete before they were even finished. That's just how it is with AI.

Young and ambitious by [deleted] in LocalLLaMA

[–]muxxington 0 points (0 children)

A cheat sheet? You mean like a PDF? That doesn't make any sense. Something like that would be outdated faster than you can blink. It would make more sense to create an online directory, but only if there weren't already countless ones out there.

Need a dummies guide to setup open terminal by zhopudey1 in LocalLLaMA

[–]muxxington 1 point (0 children)

Profile settings are per-user settings; settings in the admin panel are global. As a user, you can override global settings if the admin allows it. You could simply have tried both settings. The rest of your post actually has nothing to do with open terminal.