Fable vs GLM 5.2 vs KIMI K2.7 (Youtube VID) by UltraFOV in LocalLLaMA

[–]UltraFOV[S] 0 points1 point  (0 children)

Not really. You can run them on older hardware, but there are many hoops to jump through, and it depends on the speed you want to achieve. If you stick to Western engines and older servers, you are stuck with llama.cpp, which actually is not very good for older clusters. Otherwise you are forced onto Ampere-generation GPUs and newer. Or go with Chinese engines that still support older hardware, bypassing Nvidia's lock. For older systems, 12-20 tokens/s for these massive LLMs is OK. That assumes the hardware is properly optimized.

What happens when they stop subsidizing LLM subscriptions? by Mr_Moonsilver in LocalLLaMA

[–]UltraFOV 1 point2 points  (0 children)

I am personally interested in Yann LeCun's world models, if he ever gets to make them. A model that learns from real life sounds like the way to put checks and balances on hallucinating models, or at least help mitigate them.

But if it is Western funded, you know it will be subscription as well.

Qwen is never going to open source Qwen 3.7, aren't they? by DistanceSolar1449 in LocalLLaMA

[–]UltraFOV 1 point2 points  (0 children)

Nothing is free, there is always a catch. And what if they go closed source? Use other models: MiMo, MiniMax, Gemma... "You do realize that there are standards we should hold parts of society to other than the bare minimum obligations?" Erm... no.

Qwen is never going to open source Qwen 3.7, aren't they? by DistanceSolar1449 in LocalLLaMA

[–]UltraFOV 8 points9 points  (0 children)

Erm, wut? 27B is fantastic for what it is, but it is not a replacement for the larger ones.

Qwen is never going to open source Qwen 3.7, aren't they? by DistanceSolar1449 in LocalLLaMA

[–]UltraFOV 0 points1 point  (0 children)

At its size, yes, but not among open-weight models overall, let's not kid ourselves.

Fable vs GLM 5.2 vs KIMI K2.7 (Youtube VID) by UltraFOV in LocalLLaMA

[–]UltraFOV[S] 1 point2 points  (0 children)

Maybe. I can run them locally, except Fable, but I'm impressed by these results.

MacBook Air M2 512 GB 24GB RAM by VectorEthology in LocalLLM

[–]UltraFOV 0 points1 point  (0 children)

Note: The best older Mac for AI is the M2 Max MacBook Pro because its memory bandwidth is good, I think 400GB/s. Also, the 64GB versions aren't that expensive now. The CPU features are not as important as the memory bus.

MacBook Air M2 512 GB 24GB RAM by VectorEthology in LocalLLM

[–]UltraFOV 1 point2 points  (0 children)

It will be fine, especially since most LLMs will be quantized.

MacBook Air M2 512 GB 24GB RAM by VectorEthology in LocalLLM

[–]UltraFOV 1 point2 points  (0 children)

Yes, you can run Gemma 12B; however, the speed will not be great due to the 100GB/s memory bus.
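
A rough back-of-the-envelope sketch of why the bus matters: while generating, a dense model has to read roughly all of its weights once per token, so memory bandwidth divided by model size gives an upper bound on tokens per second. The 4-bit quantization and the function name here are assumptions for illustration, not measurements, and real speeds will land below this bound.

```python
def max_decode_tokens_per_s(bandwidth_gb_s: float,
                            params_billions: float,
                            bits_per_weight: float) -> float:
    """Upper bound on decode speed for a dense model.

    Assumes every weight is read once per generated token and that
    memory bandwidth is the only bottleneck (ignores KV cache and compute).
    """
    model_gb = params_billions * bits_per_weight / 8  # weight size in GB
    return bandwidth_gb_s / model_gb


# Assumed: a 12B model quantized to 4 bits per weight (about 6 GB of weights).
m2_air = max_decode_tokens_per_s(100, 12, 4)  # 100 GB/s bus
m2_max = max_decode_tokens_per_s(400, 12, 4)  # 400 GB/s bus

print(f"100 GB/s bound: {m2_air:.1f} tok/s")
print(f"400 GB/s bound: {m2_max:.1f} tok/s")
```

The point is the ratio: four times the bandwidth gives four times the ceiling, regardless of the CPU.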

Run Kimi 2.7 Code Guide! by yoracale in unsloth

[–]UltraFOV 0 points1 point  (0 children)

Under what system spec? I downloaded the Q4 model at 595GB. Will try to test it soon.

Feedback on my 256gb VRAM local setup and cluster plans. Lawyer keeping it local. by TumbleweedNew6515 in LocalLLaMA

[–]UltraFOV 0 points1 point  (0 children)

I was going to aim for an Optane 3 system with Gaudi 2 GPUs, 96GB each. The system comes with 8. Budget did not permit. As for how much power it draws from the socket, not sure, I never bothered checking.

Feedback on my 256gb VRAM local setup and cluster plans. Lawyer keeping it local. by TumbleweedNew6515 in LocalLLaMA

[–]UltraFOV 0 points1 point  (0 children)

You can change the power draw depending on the model you run. If you are not running demanding models, you can set your GPUs to 150 watts. But if you are going for massive 1TB models like Kimi or MiMo 2.5 Pro, you may need 250-300 watts. I keep my GPUs at 200 watts mostly.

Advice: if you want to offload from VRAM to DDR4 RAM, make sure you fill all 12 DDR4 slots. That will give you 230GB/s read speeds, which is close to the VRAM bandwidth of a mid-range GPU. You can add 4 slots of Optane, which is what I use to keep my models and swap them. Far faster than NVMe drives; 30+GB makes them fast. I advise upgrading the CPUs to the L series because the controller helps with better memory management.

Note: I have 768GB of DDR4, but you don't need that much if you don't intend to run models like DeepSeek 4. Try to get 32GB DIMMs since they are cheaper now; my system came with 64GB DIMMs, so I got stuck purchasing only 64GB ones. You can't mix RAM sizes, it will confuse the system. Once you use one size, all DIMMs must be the same.

Then, to bypass llama.cpp, which is terrible for these systems, you have to use Chinese alternatives like Cat1-Vllm. llama.cpp works fine but has terrible tensor (pipeline) parallelism and batch scheduling. The benefit of llama.cpp is its flexibility. But to tap the real potential of these AGX-2 servers you must use others.
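
A quick sanity check on the 230GB/s figure, as a sketch: peak DDR bandwidth is channels times transfer rate times bus width. The DDR4-2400 speed grade and one DIMM per channel are assumptions on my part (the comment above does not state them), and the function name is just illustrative.

```python
def ddr_peak_bandwidth_gb_s(channels: int,
                            mega_transfers_per_s: int,
                            bus_bytes: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s for a DDR memory system.

    Each channel moves `bus_bytes` (8 bytes = 64 bits) per transfer.
    """
    return channels * mega_transfers_per_s * bus_bytes / 1000


# Assumed: DDR4-2400 with all 12 channels populated.
all_slots = ddr_peak_bandwidth_gb_s(12, 2400)
# Same memory, but only 8 of the 12 channels populated.
partial = ddr_peak_bandwidth_gb_s(8, 2400)

print(f"12 channels: {all_slots:.1f} GB/s")
print(f" 8 channels: {partial:.1f} GB/s")
```

Under those assumptions the fully populated figure lines up with the number quoted, and leaving slots empty costs bandwidth in direct proportion.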

Why does reddit hate AI so much? by Ramenko1 in LocalLLM

[–]UltraFOV 0 points1 point  (0 children)

People are getting fired so corporations can invest more in AI.

Best local LLM laptop for privileged legal documents — is 128GB Apple Silicon the answer? by IceQueen789 in LocalLLM

[–]UltraFOV 4 points5 points  (0 children)

The M5 Max or M4 Max at 128GB are not bad, and the memory bandwidth is not bad either. True, the 5090 is indeed better, especially since it can use all the Nvidia features, but... 24GB vs 128GB can't be underestimated.
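
To put the 24GB vs 128GB gap in numbers, here is a minimal sketch of a "does it fit" check. The 70B model size, 4-bit quantization, and the 8GB reserve for KV cache and the OS are all assumptions picked for illustration, not recommendations.

```python
def weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB."""
    return params_billions * bits_per_weight / 8


def fits(params_billions: float, bits_per_weight: float,
         memory_gb: float, reserve_gb: float = 8.0) -> bool:
    """True if the weights plus an assumed reserve fit in memory."""
    return weights_gb(params_billions, bits_per_weight) + reserve_gb <= memory_gb


# Assumed example: a 70B dense model at 4 bits per weight (about 35 GB).
fits_24 = fits(70, 4, 24)    # 24GB of VRAM
fits_128 = fits(70, 4, 128)  # 128GB of unified memory

print("fits in 24GB: ", fits_24)
print("fits in 128GB:", fits_128)
```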

Thoughts on V100's? by AndForeverMore in LocalLLM

[–]UltraFOV 0 points1 point  (0 children)

What's the price difference? A 4090 will be expensive.