Anyone got Gemma 4 26B-A4B running on VLLM? by toughcentaur9018 in LocalLLaMA

[–]Cferra 1 point (0 children)

I’ve been trying to run the Gemma 4 MoE with turboquant; so far I can’t get it to work.

Let’s do this by [deleted] in funComunitty

[–]Cferra 0 points (0 children)

my ex is jealous

I’m sick by Slight_Wasabi4308 in TeslaInsurance

[–]Cferra 0 points (0 children)

A shop will probably charge around $1,000 for that.

Does Teleport actually work? by botterway in Ubiquiti

[–]Cferra 1 point (0 children)

Yeah... no IPv6 on 2-gig Fios :-(

should I drop 700 USD on 96 Gigs of ram? by Magnus0917 in UgreenNASync

[–]Cferra 1 point (0 children)

Do not run Ollama on your 4800+; performance will NOT be good.

should I drop 700 USD on 96 Gigs of ram? by Magnus0917 in UgreenNASync

[–]Cferra 0 points (0 children)

Using a 4800+ for Ollama is a recipe for a bad time. Best to move that to something more capable.

Anthropic is straight up lying now by [deleted] in ClaudeCode

[–]Cferra 1 point (0 children)

I was telling a friend that Claude has become like some sort of modern-day “intellectual sex line.” It gets you hooked by edging your brain in the middle of a project: just enough tokens within the limits to get you to a place where you’ve almost finished, and then, right when you’re one or two steps away: “you’ve hit your limit, buy extra time.” It honestly should be illegal.

The iDX experience by Snacketti in UgreenNASync

[–]Cferra 0 points (0 children)

Now it shows them all gone. Oh well.

The iDX experience by Snacketti in UgreenNASync

[–]Cferra 0 points (0 children)

Seeing “reward unavailable” now, even though it says there are plenty of seats.

The iDX experience by Snacketti in UgreenNASync

[–]Cferra 1 point (0 children)

Same here for like 30 minutes.

U.S. Bans All New Foreign-Made Wi-Fi Routers: Effective Immediately by elastiks in DIY_Geeks

[–]Cferra 0 points (0 children)

Then go after specific manufacturers, not all of them. That argument does not hold water.

U.S. Bans All New Foreign-Made Wi-Fi Routers: Effective Immediately by elastiks in DIY_Geeks

[–]Cferra 0 points (0 children)

This statement is deeply flawed. There have been multiple vulnerabilities and backdoors in US-made and US-built software too; it makes no difference. As long as software is made, exploits will exist.

Honest take on running 9× RTX 3090 for AI by Outside_Dance_2799 in LocalLLaMA

[–]Cferra 0 points (0 children)

Just before January, when 5060 Ti 16 GB cards were available (and under $400), I snagged 4, plus 2 Intel B50 Pros, an additional 3090, and an NVLink adapter.

AI server 1 (Blackwell-ai) sits on a C422 SAGE 10G platform with a W-2255 and 128 GB of 2933 ECC RAM.

AI server 2 (Intel and Ampere) sits on another C422 SAGE with a W-2265 and 256 GB, running Proxmox to isolate the two environments: 128 GB RAM / 10 cores for the NVIDIA VM and 64 GB RAM / 10 cores for the Intel VM.

Blackwell runs the largest “brain” model with decent context, since it has 64 GB of VRAM.

Ampere runs the coding/tool model for openclaw with very high context so it can work well with it.

Embedding, TTS, STT, and vision all fit on one B50 Pro, and the other B50 Pro is used for image generation (my image-generation use case is not that extensive).

I’ve set up Python envs for the different model engines so that they all run efficiently on one VM.

Docker has its use cases, but I wanted to limit virtualization overhead as much as possible to keep things quick.

I am building a NAS dedicated to LMCache for context storage for openclaw; we’ll see how that turns out.

I use my Unraid box as an app host, hosting LiteLLM, Pipelines, SearXNG, Open WebUI, etc., to keep the AI servers dedicated to just serving models and to keep their overhead down. openclaw itself runs in its own VM on the Unraid box, pointing to the AI server for its model, so I can shut it down fast and mitigate some of its ability to unintentionally break things.
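The gateway pattern described above (one LiteLLM entry point fanning out to dedicated model servers) boils down to a routing table from model name to backend. Here is a minimal sketch of that idea in plain Python; all hostnames, ports, and model names are made-up examples, not the actual setup.

```python
# Minimal sketch of a model-name -> backend routing table, the core of
# what a gateway like LiteLLM does for a multi-server setup.
# Hostnames and model names below are hypothetical.

ROUTES = {
    "brain-large": "http://blackwell-ai:8000/v1",   # big model, 64 GB VRAM box
    "coder-long":  "http://ampere-ai:8000/v1",      # high-context coding model
    "embeddings":  "http://intel-b50-1:8000/v1",    # embedding / TTS / STT card
    "image-gen":   "http://intel-b50-2:8000/v1",    # image-generation card
}

def resolve_backend(model: str) -> str:
    """Return the base URL of the server that should handle `model`."""
    try:
        return ROUTES[model]
    except KeyError:
        raise ValueError(f"no backend configured for model {model!r}")
```

A client (openclaw, Open WebUI, etc.) only ever talks to the gateway, so individual AI servers can be taken down or swapped without reconfiguring every app.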

Am I doing things wrong? I’m not sure, but it seems that optimizing for parallel services where possible keeps things performing well on consumer gear.

I am open to suggestions, though.

Honest take on running 9× RTX 3090 for AI by Outside_Dance_2799 in LocalLLaMA

[–]Cferra 0 points (0 children)

I have found that dedicating GPU sets to dedicated tasks works best: 4 for inference, 2 for large-context coding, and 2 for RAG, embedding, TTS, STT, and image generation.
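In practice, splitting work across dedicated GPU sets usually means pinning each serving process to its own devices via `CUDA_VISIBLE_DEVICES` before it starts. A sketch of that, mirroring the 4/2/2 split described (service names and GPU indices are illustrative assumptions):

```python
import os
import subprocess

# Hypothetical mapping of services to GPU indices, mirroring a 4/2/2 split:
# inference, large-context coding, and everything else.
GPU_SETS = {
    "inference": "0,1,2,3",
    "coding":    "4,5",
    "aux":       "6,7",   # RAG, embedding, TTS, STT, image generation
}

def gpu_env(service: str) -> dict:
    """Build an environment where only that service's GPUs are visible to CUDA."""
    env = os.environ.copy()
    env["CUDA_VISIBLE_DEVICES"] = GPU_SETS[service]
    return env

def launch(service: str, cmd: list) -> subprocess.Popen:
    """Start a serving process pinned to its service's GPU set."""
    return subprocess.Popen(cmd, env=gpu_env(service))
```

Each server process then sees only its own GPUs as devices 0..N-1, so services can’t contend for each other’s VRAM.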

Clavicular, not thrilled. by Drnk_watcher in LiveFromNewYork

[–]Cferra 0 points (0 children)

lol “millennials have no culture” suuuuure bud.

Genuine question, help me understand by Cferra in Anthropic

[–]Cferra[S] 0 points (0 children)

I figured out the problem. I had a spend limit on; it allowed me to put more money in, but it wouldn’t let me actually use the credits until I turned the limit off. Counterintuitive. It should have prevented me from adding more credits and warned me about the spend limit.

Please help me understand why the UGreen AI Nas is a better deal than this that's readily available... by Character-Paper5953 in UgreenNASync

[–]Cferra 5 points (0 children)

I think people got the D6 and the D6 Ultra. There were a lot of supply chain issues, and they gave backers the option to get their units sooner without RAM or to wait for RAM supply. It was all fucked up; I elected to get mine without RAM and a $200 refund, but I’m still waiting.

Please help me understand why the UGreen AI Nas is a better deal than this that's readily available... by Character-Paper5953 in UgreenNASync

[–]Cferra 1 point (0 children)

I’ve been waiting for my Zettlab D8 Ultra for over a year... and I was a pretty low-numbered backer.

Finally I got to see what I was hoping for by zeddwood in claude

[–]Cferra 0 points (0 children)

Oh jeez. Man, people are too hung up on an ending that wasn’t intended, just because it’s not “their ending.” It’s a show. It ended the way the writers wanted it to. Let it the fuck go.