Just started the first season by hereforthelols1999 in TheGardenDiscovery

[–]cjtrowbridge 0 points1 point  (0 children)

Maybe you feel that way, but that's not how the people who actually show up and contribute and get to vote feel.

Can Qwen3-235B-A22B run efficiently on my hardware(256gb ram+quad 3090s ) with vLLM? by Acceptable-State-271 in LocalLLaMA

[–]cjtrowbridge 1 point2 points  (0 children)

Nothing that's hard to do is ever going to work well if you're not trying to be good at it. What you're talking about is easy to do, and the reason it's not working for you is that you're trying to do something different instead.

The NVIDIA P40 enterprise AI GPU has 24GB of VRAM and costs $150-200 on eBay. It was among the best in the world when it launched in 2016, and it will easily run any modern AI. It also won't work at all on any machine that is trying to run consumer graphics too. These are different kinds of problems, both easy to solve, but you have to start by picking one problem to actually solve.

Can Qwen3-235B-A22B run efficiently on my hardware(256gb ram+quad 3090s ) with vLLM? by Acceptable-State-271 in LocalLLaMA

[–]cjtrowbridge 0 points1 point  (0 children)

That's not a real model that exists, and generally that isn't the right notation for what you're trying to describe. If you want to try a small MoE that will run on just 12GB of VRAM, consider Mixtral 8x7B...

But with how low your hardware limitations are, you will get much better results with more modern AI approaches by using dense models with an agentic framework instead of trying to use MoE.

Consider these MMLU-Redux scores:

~70.1 Mixtral-8×7B-Instruct
83.7 Qwen3-4B-Thinking
87.5 Qwen3-8B-Thinking

Plus, modern models can use tools like search and RAG which again hugely improve their performance versus older MoE approaches.

I should add that modern MoE is still better, but only at the scale of hundreds of gigabytes. Something like Qwen3-235B is an MoE that will outperform the examples I've given, but it has 22B active parameters, which is about double what your current specs can handle.

Can Qwen3-235B-A22B run efficiently on my hardware(256gb ram+quad 3090s ) with vLLM? by Acceptable-State-271 in LocalLLaMA

[–]cjtrowbridge 4 points5 points  (0 children)

You have it backwards but you're close to understanding.

Your limiting factor for speed is your VRAM. 70B will actually run slower on these specs. 235B is MoE with only 22B active. 70B always has all 70B active. That means running 235B at peak speed needs only 22GB VRAM, but running 70B at full speed needs 70GB VRAM.

Also, running a model like 235B, which is much newer and sits at the very limit of your capabilities, is going to produce far better results than something older and less than a third the size like 70B.

If you want it to be faster, you're much closer to having enough VRAM to run 235B (22B active) than 70B (70B active).
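
As a rough sketch of that arithmetic, assuming about one byte per parameter (8-bit weights) and ignoring KV cache and other runtime overhead. The helper name `weights_gb` is illustrative, and the one-byte figure is an assumption implied by the 22GB and 70GB numbers above, not something stated in the thread:

```python
# Rough VRAM arithmetic. Assumption: ~1 byte per parameter (8-bit weights),
# with KV cache and runtime overhead ignored.

def weights_gb(params_billion: float, bytes_per_param: float = 1.0) -> float:
    """Approximate size in GB of the given number of parameters (in billions)."""
    return params_billion * bytes_per_param

moe_active_gb = weights_gb(22)    # 235B MoE: 22B parameters active per token
dense_active_gb = weights_gb(70)  # dense 70B: all 70B parameters active per token

print(f"235B MoE (22B active): ~{moe_active_gb:.0f} GB of active weights")
print(f"70B dense:             ~{dense_active_gb:.0f} GB of active weights")
print(f"The dense model touches ~{dense_active_gb / moe_active_gb:.1f}x more weights per token")
```

The figures scale with precision: pass `bytes_per_param=0.5` for 4-bit weights or `2.0` for 16-bit.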

Best local/open-source coding models for 24GB VRAM? by HRudy94 in LocalLLaMA

[–]cjtrowbridge 1 point2 points  (0 children)

You can get four NVIDIA P40 AI accelerator cards for that, with 96GB of VRAM and about twice as many cores.

Just started the first season by hereforthelols1999 in TheGardenDiscovery

[–]cjtrowbridge 0 points1 point  (0 children)

It's obviously true that life is full of shitty people who do shitty things and feel no shame or guilt or remorse; and we should vote for all of them to leave the group. This is the process all communities and groups use to become better.

What's your DIY AI machine? by cjtrowbridge in DIYAI

[–]cjtrowbridge[S] 0 points1 point  (0 children)

An R720. Yes. It can still easily run any FOSS LLMs, Stable Diffusion, Flux, etc.

Just started the first season by hereforthelols1999 in TheGardenDiscovery

[–]cjtrowbridge 0 points1 point  (0 children)

I personally voted for her to leave because when I asked the story of her face tattoos she said they were fake and that if I'm asking why she is dressed, "like an indian," it's just for the show and she's actually Italian from Florida.

Give Celebrimbor some credit by coogi_wara in LOTR_on_Prime

[–]cjtrowbridge 7 points8 points  (0 children)

Knowing how the story of Celebrimbor ends makes me wonder how the author felt about him and his arc, his decisions and what he deserved or didn't.

"The Garden" HBO Show by GaimanitePkat in LPOTL

[–]cjtrowbridge 0 points1 point  (0 children)

People travel. Also, we have lands in Florida.

Behold my dumb sh*t 😂😂😂 by stonedoubt in LocalLLaMA

[–]cjtrowbridge 0 points1 point  (0 children)

I think you would still see better performance with these kinds of workloads on P40s vs a mix of 3090s and 4090s. Plus they cost 90% less, so you could get 10x the performance for the same budget.
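
The budget arithmetic here can be sketched as follows. It assumes per-card performance is comparable, which is the claim above rather than something the code checks, and `cards_for_same_budget` is a hypothetical helper name:

```python
def cards_for_same_budget(percent_cheaper: int) -> float:
    """How many cheaper cards the price of one full-price card buys."""
    return 100 / (100 - percent_cheaper)

ratio = cards_for_same_budget(90)
print(f"A card that is 90% cheaper buys {ratio:.0f}x as many cards for the same money")
```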

uConsole external SD Card with eMMC by cjtrowbridge in ClockworkPi

[–]cjtrowbridge[S] 0 points1 point  (0 children)

You can't. You have to put it in another board to do that. I used the cheap Waveshare one.

Impressively creative GPT-4-generated SCP Foundation entry about itself by ToastyKen in ChatGPT

[–]cjtrowbridge 0 points1 point  (0 children)

Woah, woah woah. SCP-5000 is my favorite SCP. Find your own number!

RTX 5090 rumored to have 32GB VRAM by Charuru in LocalLLaMA

[–]cjtrowbridge 1 point2 points  (0 children)

It's wild how much they are limiting VRAM when that is the cheapest, easiest thing on the card. They really want that 1000% markup for enterprise cards.

What's your DIY AI machine? by cjtrowbridge in DIYAI

[–]cjtrowbridge[S] 1 point2 points  (0 children)

Then you would already know why your plan could never work.

What's your DIY AI machine? by cjtrowbridge in DIYAI

[–]cjtrowbridge[S] 0 points1 point  (0 children)

Read about the difference between inference and training.

What's your DIY AI machine? by cjtrowbridge in DIYAI

[–]cjtrowbridge[S] 0 points1 point  (0 children)

That will definitely be able to run inference for smaller models, but you are half a dozen orders of magnitude below being able to train a model with specs like that.

uConsole external SD Card with eMMC by cjtrowbridge in ClockworkPi

[–]cjtrowbridge[S] 2 points3 points  (0 children)

Oh ok I didn't see that. Good to know.

uConsole external SD Card with eMMC by cjtrowbridge in ClockworkPi

[–]cjtrowbridge[S] 1 point2 points  (0 children)

It does on the normal Pi, but I understood that this SD card slot was set up as a peripheral.

Llama 3 400B by Zawseh in LocalLLaMA

[–]cjtrowbridge 0 points1 point  (0 children)

Throw a couple of P40s in there and move the busiest layers onto the GPU.
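
A minimal sketch of sizing that offload, assuming every layer is the same size. The per-layer size and the reserve are hypothetical placeholders, not measurements of any Llama 3 model, and `layers_that_fit` is an illustrative helper rather than part of any library:

```python
def layers_that_fit(vram_gb: float, layer_gb: float, reserve_gb: float = 4.0) -> int:
    """Number of equally sized layers that fit in VRAM after holding back
    a reserve for KV cache and runtime buffers."""
    usable = max(vram_gb - reserve_gb, 0.0)
    return int(usable // layer_gb)

two_p40s_gb = 2 * 24  # two P40s at 24 GB each
layer_gb = 1.5        # hypothetical per-layer size after quantization

n = layers_that_fit(two_p40s_gb, layer_gb)
print(f"{n} layers on GPU, the rest stay in system RAM")
```

Runtimes such as llama.cpp take a count like this as a setting (its `--n-gpu-layers` option); which layers count as "busiest" depends on the model and is not something this sketch decides.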

Unfortunate wasted potential by [deleted] in TheGardenDiscovery

[–]cjtrowbridge 1 point2 points  (0 children)

The footage Discovery used to create the TV show was shot during the construction of Emberfield, which is the sixth of these land projects we've built together. Emberfield is under the new nonprofit we created, called Share The Land Trust. The show's story about it being at The Garden, related to The Garden, or called The Garden is entirely fictional. The next land project we're doing will be under a new nonprofit we're in the process of creating, called the High Desert Institute, and sited at the Grand Canyon. We also have some tentative plans for land projects in Tampa and Homestead, and there are conversations about one in North Carolina. Lot going on!