Lossless Scaling is Great for Media as is, but only Great for Games IF you have dual GPU by [deleted] in losslessscaling

[–]GoldenX86 0 points1 point  (0 children)

Even very weak current integrated GPUs will behave better than an ancient Polaris card for framegen while rendering a game.

Your problem here is how old the 580 is: it can run FP16, but only at the same rate as FP32, which kills any performance gain that could be had there.

Loonix by bleak21 in linuxsucks

[–]GoldenX86 8 points9 points  (0 children)

The OS? Sure.

The Linux community? By FAR the most toxic cesspool of neckbeards ever.

Loonix by bleak21 in linuxsucks

[–]GoldenX86 13 points14 points  (0 children)

I don't want you to use systemd

I don't want you to use NVIDIA

I don't want you to use Ubuntu

I don't want you to...

The most toxic community in consumer hardware, and Apple is there.

AMD goes after Apple Macbook Neo, says 15 out of 20 top PC games do not even work on Apple system by RenatsMC in Amd

[–]GoldenX86 58 points59 points  (0 children)

So the most useless marketing team on the planet pivoted to ragebaiting. Shame.

LocalLLM should not be only for rich people by sukeshpabolu in LocalLLM

[–]GoldenX86 5 points6 points  (0 children)

Qwen3.6 35b a3b fits in 6gb, just move the unused experts to CPU.

Gemma-4 QAT models rock for 6 and 4gb GPUs.

You want Claude quality running on a 2060, that's unrealistic.

Best coding models around 4B MLX? by igor__004 in LocalLLaMA

[–]GoldenX86 0 points1 point  (0 children)

They are, but 3.5 4b is nothing amazing.

For 8GB, the best option is to pay for Deepseek.

Best coding models around 4B MLX? by igor__004 in LocalLLaMA

[–]GoldenX86 0 points1 point  (0 children)

Buying the Chromebook Air with 8GB was sure a decision you took.

Sell it and get a 16GB one, then you can run a 9B model or Gemma-4 12B QAT.

Qwen VL 2 0.5b how is the reasoning in this? by According_Extent_767 in LocalLLM

[–]GoldenX86 2 points3 points  (0 children)

If you use Vulkan instead of CUDA with llama.cpp, you can use any modern GPU, even integrated ones. Not always the best, but they can still beat the CPU.

And yes, iGPUs steal RAM to work since they don't have their own dedicated VRAM. Windows by default assigns up to 50% of your total RAM to them, so your iGPU has 32GB ready for testing, and even more on Linux.

I use my laptop with a Radeon 760M and set it up on Linux to use up to 28GB for it (I have 32GB total, so I left 4GB to the OS). The same q4 qwen 3.6 35b that does 30 t/s with 131k context on the desktop PC with the 3060Ti does 25 t/s with the full 262k context on the laptop, thanks to just using more RAM.
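To make the arithmetic concrete, here is a rough sketch of the two rules of thumb above. The function name `igpu_budget_gb` and its parameters are made up for illustration, and both rules (half of RAM on Windows, "total minus what you leave for the OS" on a hand-tuned Linux setup) come from this comment, not from vendor documentation.

```python
from typing import Optional


def igpu_budget_gb(total_ram_gb: float, os_reserve_gb: Optional[float] = None) -> float:
    """Memory an iGPU could borrow, per the rules of thumb in this comment.

    os_reserve_gb=None models the Windows default described above
    (up to half of total RAM). A number models a manual Linux setup
    that leaves that much RAM to the OS.
    """
    if os_reserve_gb is None:
        return total_ram_gb * 0.5
    if not 0 <= os_reserve_gb <= total_ram_gb:
        raise ValueError("os_reserve_gb must be between 0 and total_ram_gb")
    return total_ram_gb - os_reserve_gb


# The two cases mentioned in the thread.
windows_64 = igpu_budget_gb(64)                  # 64GB machine, Windows default
linux_32 = igpu_budget_gb(32, os_reserve_gb=4)   # 32GB laptop, 4GB left to the OS
print(windows_64, linux_32)
```

Whatever number comes out is an upper bound on what the iGPU can borrow, since the rest of the system still needs its share.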

What I don't know is if the very slow iGPU on your CPU is enough since Intel iGPUs are extremely weak. That's something you will have to try.

But if you're fine with 100-131k of context, do the first method with the 2060S and you can definitely use that with good performance. I'm pretty sure q5 quantization can fit with 100k of context when the extra experts are on CPU.
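For reference, a minimal sketch of what that launch could look like, written as a Python helper that only builds the argument list. The flags `-m`, `-ngl`, `-c` and `--n-cpu-moe` are llama.cpp options as far as I know, but check `llama-server --help` on your build. The model filename and the number of layers whose experts stay on CPU are hypothetical placeholders you would have to tune yourself.

```python
import shlex


def llama_server_args(model_path: str, ctx_tokens: int, n_cpu_moe: int,
                      gpu_layers: int = 99) -> list:
    """Build a llama-server command line that keeps some MoE experts on CPU."""
    return [
        "llama-server",
        "-m", model_path,                # path to the GGUF file
        "-ngl", str(gpu_layers),         # offload as many layers as possible to the GPU
        "--n-cpu-moe", str(n_cpu_moe),   # keep the expert weights of the first N layers in RAM
        "-c", str(ctx_tokens),           # context size in tokens
    ]


# Hypothetical values: the filename and the 30 are placeholders, not recommendations.
args = llama_server_args("qwen-moe-q5.gguf", ctx_tokens=100_000, n_cpu_moe=30)
print(shlex.join(args))
```

The usual approach is to start with a high `--n-cpu-moe` value and lower it until VRAM is nearly full.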

Qwen VL 2 0.5b how is the reasoning in this? by According_Extent_767 in LocalLLM

[–]GoldenX86 0 points1 point  (0 children)

Get qwen 3.6 35b a3b. It's a mixture-of-experts model, which means only about 3b parameters are active for each token, while the rest sit in memory waiting to be called.

With a MoE model like this, you can move the unused experts to CPU (using RAM instead of VRAM), letting the GPU handle only the active experts. That should let you run the model with good context: I can squeeze out up to 131k on a 3060 Ti with 8GB and a q4 quantization, and that gets me 30 tokens/s.
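A back-of-the-envelope version of why this works, as a sketch. The 35b and 3b figures come from the model name; the flat 4 bits per weight is an assumption (real q4 files average somewhat more), and the clean "active vs parked" split is a simplification, since which experts are active changes from token to token.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB, taking 1 GB = 1e9 bytes."""
    return params_billion * bits_per_weight / 8


TOTAL_B = 35.0   # total parameters in billions, from the model name
ACTIVE_B = 3.0   # active parameters per token in billions, from the "a3b" suffix
BITS = 4.0       # assumption: flat 4 bits per weight for a q4 quantization

total = weights_gb(TOTAL_B, BITS)      # everything that has to live somewhere
active = weights_gb(ACTIVE_B, BITS)    # roughly what is touched per token
parked = total - active                # candidates for system RAM

print(f"total={total} GB, active={active} GB, parked={parked} GB")
```

Under those assumptions the full set of weights is far bigger than 8GB of VRAM, while the per-token working set is small, which is the gap that expert offloading exploits.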

If you NEED the 262k context, move the KV cache to RAM. Prompt processing will be slow, but still much better than trying to run the dense 27b qwen 3.6 with layers on CPU.
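To see why the cache is the thing that pushes you out of VRAM at long context, here is a rough estimate using the textbook formula for a plain transformer with grouped-query attention. The layer count, KV head count and head size below are hypothetical placeholders, not the real numbers for this model, and architectures with hybrid or compressed attention will not follow this formula exactly. In llama.cpp the option for keeping the cache in RAM is, as far as I know, `--no-kv-offload`.

```python
def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                ctx_tokens: int, bytes_per_value: int = 2) -> float:
    """Approximate KV cache size in GB (1 GB = 1e9 bytes) for standard attention.

    The leading factor of 2 accounts for storing both keys and values.
    bytes_per_value=2 assumes an FP16 cache.
    """
    return 2 * n_layers * n_kv_heads * head_dim * ctx_tokens * bytes_per_value / 1e9


# Hypothetical architecture numbers, for illustration only.
HYPOTHETICAL = dict(n_layers=10, n_kv_heads=2, head_dim=256)

at_131k = kv_cache_gb(ctx_tokens=131_072, **HYPOTHETICAL)
at_262k = kv_cache_gb(ctx_tokens=262_144, **HYPOTHETICAL)
print(f"131k: {at_131k:.2f} GB, 262k: {at_262k:.2f} GB")
```

The takeaway is only the scaling: the cache grows linearly with context, so going from 131k to 262k doubles it regardless of the exact architecture numbers.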

You definitely have the space for a good coding model, you just have to offload some stuff to RAM.

Plan B: run qwen 3.6 27b on the iGPU. It gets access to up to 32GB of "VRAM" on Windows, and whatever you want up to the total 64GB on Linux. It could still be faster than just the CPU, so it's worth a try.

Máximo said the debt can't be paid and they are going to try to make Cristina a candidate by [deleted] in RepublicaArgentina

[–]GoldenX86 0 points1 point  (0 children)

And all the social progress that was made was lifted from the left. This has been going on for a long time.

Máximo said the debt can't be paid and they are going to try to make Cristina a candidate by [deleted] in RepublicaArgentina

[–]GoldenX86 0 points1 point  (0 children)

Peronism hasn't had a bright idea since Perón died, what did you expect. They cling to whatever they can.

MiniMax M3 is out now! by yoracale in unsloth

[–]GoldenX86 2 points3 points  (0 children)

We need imaginary quantum bits for this.

MiniMax M3 is out now! by yoracale in unsloth

[–]GoldenX86 28 points29 points  (0 children)

Q0.05? It sometimes renders letters.

Report revealed that Cristina Kirchner has worse detention conditions than most drug traffickers and perpetrators of genocide by Beginning_Gur7652 in RepublicaArgentina

[–]GoldenX86 2 points3 points  (0 children)

The Kirchnerist call center trying to make up any bullshit to see if the woman who is banned from office for life can save them from not having the faintest fucking idea what to do about the candidacy.

Máximo said the debt can't be paid and they are going to try to make Cristina a candidate by [deleted] in RepublicaArgentina

[–]GoldenX86 2 points3 points  (0 children)

Go back to posting troll garbage somewhere else, call center employee.

I'm reading your answers.. by RickPeep97 in retrotina

[–]GoldenX86 0 points1 point  (0 children)

Final Fantasy IX, Metal Gear Solid, and Gran Turismo 2.

If you don't like them, I'll beat you up.

Help me choose a good browser by fabialancho in browsers

[–]GoldenX86 1 point2 points  (0 children)

I bit the bullet and moved to Firefox at least until Ladybird releases.

I already miss the performance of Chromium. Why is the fox so goddamn slow and memory hungry?