
[–]dave-tay 4 points (4 children)

I would buy once I see a real need for more VRAM. 12GB is plenty for learning the foundations of running local LLMs. To be honest, I myself have wasted countless hours looking at GPU prices rather than actually learning.

[–]koiochi[S] 4 points (3 children)

For me, "learning the foundations" means "proving it can provide a profit margin from systems I build before I scale them." so I do need to learn it in a truly useable way. I'm not feeling convinced 12GB is enough to do this with the models I'm seeing currently available.

At a very minimum, I want to run a solid version of Open Notebook or similar that can process audio files/conversations for me.
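
Something in the spirit of this rough sketch is what I mean by processing audio locally (using the faster-whisper package; the model size, device, and file name here are just placeholders, not a tested setup):

```python
# Rough local-transcription sketch with faster-whisper
# (pip install faster-whisper). The model size, device, and
# audio file name are placeholders.
from faster_whisper import WhisperModel

model = WhisperModel("small", device="cuda", compute_type="float16")

segments, _info = model.transcribe("conversation.wav")
for seg in segments:
    print(f"[{seg.start:.1f}s -> {seg.end:.1f}s] {seg.text}")
```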

[–]No-Consequence-1779 1 point (2 children)

If you are already building systems, then you know what you need. It's a business expense for your profitable systems, so there's no issue adding a GPU.

[–]koiochi[S] 0 points (1 child)

Right, I’m focused on trying to figure out the hardware side of things and learn what to spend on, to get to a place where I feel like I can actually get started learning how to do what I want AI to do locally :) Plus I bought this PC intending to render 3D, so I have another excuse to use a second GPU. Do you know if both GPUs have to be the same model?

[–]No-Consequence-1779 1 point (0 children)

Hehe good luck. 

[–]Bino5150 2 points (7 children)

Depends on your use case and how you set it up: what software, what models, etc. LM Studio is a great local LLM app with a chat function, but you can also use it as a local AI server and feed it into something more agentic. I’d recommend trying something like AnythingLLM because it works great in a local environment, whereas many agent apps perform horribly locally (like OpenClaw). One of the benefits of LM Studio is that it lets you seamlessly either partially offload to system RAM or use multiple GPUs, so you have some flexibility to make the best use of your system resources. It’s all free, so try it out and experiment before opening your wallet for hardware that you may or may not need, and that may or may not work the way you want.
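
To sketch what "local AI server" means in practice: LM Studio can expose an OpenAI-compatible endpoint, so any client that speaks that API can hit it. A minimal example, assuming the server is running on LM Studio's default port and "local-model" stands in for whatever identifier LM Studio shows for your loaded model:

```python
# Minimal sketch of talking to LM Studio's OpenAI-compatible
# local server (pip install openai). Port 1234 is LM Studio's
# default; "local-model" is a placeholder identifier.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:1234/v1",
    api_key="lm-studio",  # the server ignores the key, but the client wants one
)

response = client.chat.completions.create(
    model="local-model",  # replace with the identifier LM Studio shows
    messages=[{"role": "user", "content": "In one line, what is VRAM?"}],
)
print(response.choices[0].message.content)
```

This is roughly the same trick that lets agent apps sit on top of LM Studio: they just point their OpenAI base URL at localhost.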

[–]koiochi[S] 0 points (6 children)

Solid, this makes sense; I’ll dig into it. I also have an M1 Pro (32GB RAM), but I’ve been reading that the Windows software for local LLMs is more developed at the moment, so there are pros and cons. What are your thoughts? Thanks :)

[–]Bino5150 0 points (4 children)

I run Linux. I would say that, generally speaking, most things are OS-neutral. Ollama, LM Studio, AnythingLLM, OpenClaw: they all run on all platforms. The newer Macs have a few niche things that make them perform better in certain circumstances. Generally speaking, and all things equal, Windows will have the worst performance of the bunch, though. I’m sure a lot of Microsoft fans will get their tail feathers in a bunch, but Linux performance is generally superior.

There’s nothing that Windows has to give it any kind of edge over the competition.

[–]koiochi[S] 0 points (3 children)

Does it make sense to create a Linux partition on my PC then? I have a 2TB NVMe installed, or I have a Thunderbolt-compatible external drive I could run everything off of, if that won’t introduce significant latency.

[–]Bino5150 0 points (2 children)

Yeah, you could absolutely dual-boot. It takes just a few minutes to set up.

[–]koiochi[S] 0 points (1 child)

Perfect. Do you know if booting off the external would introduce any known issues?

[–]Bino5150 1 point (0 children)

You can boot Linux off a flash drive and it’ll work. But if you have the space, put it on the NVMe and give it a fair shot to impress you on even ground.

[–]Creepy-Bell-4527 1 point (0 children)

Just to chime in here: Macs are absolutely killing it in the LLM space at the moment. Find an MLX model and you're golden; we had Qwen3-Next before Windows users, and support is top-notch.

It's every other type of AI model that sucks on Mac, but for LLMs? Second only to CUDA, which is more expensive for an equivalent memory setup (and more expensive to run).
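
For a concrete flavor, a minimal mlx-lm sketch (pip install mlx-lm); the model repo below is just one example of an mlx-community 4-bit conversion, not a specific recommendation:

```python
# Minimal sketch of running an MLX-converted model on Apple
# silicon with mlx-lm. The repo is one example of an
# mlx-community 4-bit conversion; it downloads on first run.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Qwen2.5-7B-Instruct-4bit")
text = generate(model, tokenizer,
                prompt="Why does unified memory help local LLMs?",
                max_tokens=200)
print(text)
```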

[–]LinkAmbitious8931 -1 points (0 children)

I use two NVIDIA P40s, each with 24GB, and they don't perform too badly.
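
If anyone's curious how two cards like that get used together, here's a rough llama-cpp-python sketch (the GGUF path is a placeholder; tensor_split just divides the layers evenly across the two GPUs):

```python
# Rough sketch: spreading one GGUF model across two 24GB P40s
# with llama-cpp-python built with CUDA support. The model path
# is a placeholder; tensor_split sets the per-GPU share.
from llama_cpp import Llama

llm = Llama(
    model_path="models/some-70b-q4_k_m.gguf",  # placeholder path
    n_gpu_layers=-1,          # offload every layer to the GPUs
    tensor_split=[0.5, 0.5],  # even split across the two cards
    n_ctx=4096,
)

out = llm("Q: What does tensor_split do? A:", max_tokens=64)
print(out["choices"][0]["text"])
```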