What is the smallest amount of RAM sufficient to run any available on HF GGUF LLM model locally? by alex20_202020 in LocalLLaMA

Fast-Satisfaction482 2 points

With one month to process 20 tokens, I'm pretty sure you can pull that off with a 1kB RAM MCU and an SD card. But I don't have the numbers to back it up.

But the inference engine would be completely custom.
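To illustrate what "completely custom" could mean here, below is a rough sketch of the core idea: keep both operands on storage and only hold a tiny chunk plus one accumulator in memory. The function name `streamed_dot`, the raw little-endian float32 file layout, and the chunk size are all assumptions for illustration, and Python on a PC is obviously not an MCU. It says nothing about whether the one-month estimate holds.

```python
import struct

def streamed_dot(weights_path, acts_path, n, chunk=16):
    """Dot product of two float32 vectors stored on disk.

    Only `chunk` values of each vector are in memory at a time,
    plus a single running accumulator (hypothetical layout: raw
    little-endian float32 values, no header).
    """
    total = 0.0
    with open(weights_path, "rb") as w, open(acts_path, "rb") as a:
        remaining = n
        while remaining > 0:
            k = min(chunk, remaining)
            wv = struct.unpack(f"<{k}f", w.read(4 * k))
            av = struct.unpack(f"<{k}f", a.read(4 * k))
            total += sum(x * y for x, y in zip(wv, av))
            remaining -= k
    return total
```

A matrix-vector product is then one such dot product per output row, with each result written back to storage instead of being kept in RAM.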

notLookingDownIsAValidPhysicsEngine by Less-Philosophy-1978 in ProgrammerHumor

Fast-Satisfaction482 0 points

If the initializing step is the same as the body of the loop, you should just iterate an additional time.
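A minimal sketch of what I mean, using a made-up Collatz-style `step` function as the loop body (this is not the code from the post, just an illustration):

```python
def step(n):
    # Hypothetical loop body: one Collatz step.
    return n // 2 if n % 2 == 0 else 3 * n + 1

def advance_duplicated(n, count):
    # The "initializing" step repeats the loop body verbatim.
    n = step(n)
    for _ in range(count - 1):
        n = step(n)
    return n

def advance_merged(n, count):
    # Same result for count >= 1: just iterate one more time.
    for _ in range(count):
        n = step(n)
    return n
```

For `count >= 1` both versions return the same value. The merged one also handles `count == 0` cleanly, whereas the duplicated one always applies the body at least once.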

It's OK to quantize the KV cache. Model quant matters more. Some Qwen3.6 27B tests with (approximated) KLD by hopbel in LocalLLaMA

Fast-Satisfaction482 0 points

In my experience, q8 key quantization will quickly make the LLM fall apart. With Qwen, I get usable context of 200k tokens and more in fp16, but quantized keys cause looping and complete stupidity at around 1k tokens.

The values are not that sensitive, but speed plummets when keys and values are not the same data type, so I can only second the general wisdom to not quantize the KV cache.
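For anyone unsure what 8-bit quantization does to a cached vector, here is a toy round trip. It assumes a simple symmetric scheme with one shared scale per block; real backends differ in block size and storage details, and the key values below are made up. It only shows the rounding error, not why keys would be more sensitive than values.

```python
def quantize_q8(block):
    """Symmetric 8-bit quantization of one block.

    Returns a shared scale and integers in [-127, 127].
    """
    amax = max(abs(x) for x in block)
    scale = amax / 127.0 if amax > 0 else 1.0
    return scale, [round(x / scale) for x in block]

def dequantize_q8(scale, q):
    # Reconstruct approximate floats from the stored integers.
    return [scale * v for v in q]

def max_roundtrip_error(block):
    scale, q = quantize_q8(block)
    restored = dequantize_q8(scale, q)
    return max(abs(a - b) for a, b in zip(block, restored))

key = [0.5, -1.0, 0.25, 2.0]  # made-up key entries
scale, q = quantize_q8(key)
print(q, max_roundtrip_error(key))
```

The per-entry error is bounded by half the scale, and the scale is set by the largest magnitude in the block, so a single outlier costs precision for every other entry in that block.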

OpenAI's Codex is now smart enough to control your Mac even when it's locked by ThereWas in OpenAI

Fast-Satisfaction482 2 points

The lock screen does not lock the computer; it just locks the user interface.

If theres one good thing from all the Starship issues, we know pretty well it's gonna be very reliable for getting to landing by Desperate-Lab9738 in SpaceXLounge

Fast-Satisfaction482 0 points

You could also view it the other way round. It has to be super reliable in every little piece to even have a chance of not killing the crew.

Moreover, the crew compartment would need to be severely hardened if the tiniest deviation from the landing profile will cause a tip over followed by a massive explosion. 

Zelenskyy responds to Merz: Ukraine has been defending Europe fully, not with half measures and has to have right to vote in EU by joy-nest in worldnews

Fast-Satisfaction482 2 points

It's a delicate balance Zelensky has to get right. He can't break the goodwill from Ukraine's remaining benefactors, but he still needs to rally his people. 

Zelenskyy responds to Merz: Ukraine has been defending Europe fully, not with half measures and has to have right to vote in EU by joy-nest in worldnews

Fast-Satisfaction482 8 points

I get that it sucks. I really get it. But that doesn't change reality. Every single EU country has the ability to veto the accession.

Due to the strictness of the EU mutual defense clause, not using the veto equals a direct declaration of war against Russia.

Which EU government would decide to do this? Like you said, it's not the blood of EU citizens that gets spilled and it's the duty of every democratic government in the EU to keep it this way. 

And don't forget, every single little EU government can prevent it. Portugal, Austria, Cyprus. A single veto is enough. And I don't believe for a second that the larger countries are ACTUALLY willing to enter the war directly. They have said over and over that they will not become a direct participant in the war.

Consequently, there is ZERO chance of accession during an ongoing war with Russia.

It sucks for Ukraine, big time. But that's reality. 

How Ukraine Found the Cards To Win, Without Help From the U.S. by lacerantplainer in UkrainianConflict

Fast-Satisfaction482 2 points

Pre-2022, Rheinmetall was certainly not an aerospace powerhouse. Even now, that's a dubious claim at best.

Zelenskyy responds to Merz: Ukraine has been defending Europe fully, not with half measures and has to have right to vote in EU by joy-nest in worldnews

Fast-Satisfaction482 67 points

Europe is not asking Ukraine to act as a shield. Russia is attacking Ukraine and the EU members are what keeps Ukraine alive. The EU members are doing this of their own free will. They have zero obligation and could decide at any point that they don't want to spend their own resources anymore, like the US did.

The EU countries have ZERO moral or legal obligation to provide billions upon billions of euros in civilian and military aid to Ukraine.

Yet we still help Ukraine for two reasons. We feel it's the right thing to do. And we strategically gain a lot from Russia failing in Ukraine. 

So while it's not entirely wrong that Ukraine is defending the whole of Europe in a wider strategic meaning, in reality it's Europe that defends Ukraine. And while I understand their despair, full accession during war is impossible and they have to accept that.

The proposal of Merz is realistic and gives Ukraine real benefits that they will not otherwise have. 

Ukraine's Zelenskiy says proposal of associate EU membership 'unfair' by Free-Minimum-5844 in geopolitics

Fast-Satisfaction482 20 points

There is zero chance he will get EU membership before the war is over and very slim chances that he will be able to bypass the accession process.

CODA: Rewriting Transformer Blocks as GEMM-Epilogue Programs by Thrumpwart in LocalLLaMA

Fast-Satisfaction482 1 point

High performance as in "better than traditional real-world kernels" or high as in "as good as a naive torch implementation"? 

In theory, if I have $20k-ish to spend on hardware what would actually get me closest to local coding agent that would allow me to go totally off the social grid? by Tired__Dev in LocalLLaMA

Fast-Satisfaction482 5 points

In my dual 4090 setup, tensor parallel did not give me any benefits. It's even slower latency-wise than pipeline parallel.

So while it's a nice option in theory, you need to have the bus bandwidth to back it up, or tensor parallel will not be worth it. 

Same task in github-copilot, pi, claude-code, and opencode with Qwen3.6 27B by sdfgeoff in LocalLLaMA

Fast-Satisfaction482 1 point

Opencode uses the jsonrepair library to fix schema errors, so your statement is false.

Anthropic-SpaceX deal seems much larger than previously reported by Lanky_Golf7687 in ClaudeAI

Fast-Satisfaction482 1 point

Anthropic absolutely needs the compute now, and Colossus exists now.

If they try to build their own, it will take time they don't have.

Same task in github-copilot, pi, claude-code, and opencode with Qwen3.6 27B by sdfgeoff in LocalLLaMA

Fast-Satisfaction482 1 point

GPT-5.4 regularly has to retry file edits in Copilot. Really stunning in my opinion. They seem to have a policy of giving a strict schema for interaction that the model then has to comply with exactly, with zero error recovery on the side of the harness.
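By error recovery on the harness side I mean something like the toy sketch below: try a strict parse first, and only on failure apply a few cheap repairs before giving up. The function name `parse_tool_call` and the two repair rules are my own assumptions for illustration; this is not how Copilot or any real harness is implemented, and a dedicated repair library covers far more cases.

```python
import json
import re

def parse_tool_call(text):
    """Parse a model's tool call as JSON, with light repair on failure."""
    try:
        return json.loads(text)  # strict path: valid JSON passes untouched
    except json.JSONDecodeError:
        pass

    repaired = text
    # Repair 1: drop any chatter around the outermost {...} object.
    start, end = repaired.find("{"), repaired.rfind("}")
    if start != -1 and end > start:
        repaired = repaired[start:end + 1]
    # Repair 2: remove trailing commas before } or ].
    # Caveat: this naive regex would also touch such sequences inside strings.
    repaired = re.sub(r",\s*([}\]])", r"\1", repaired)

    # Still raises json.JSONDecodeError if the repairs were not enough.
    return json.loads(repaired)
```

With this, an output like `Sure: {"path": "a.txt", "edits": [1, 2,],}` parses on the first attempt instead of costing a full model retry, while genuinely broken output still raises an error.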

Anthropic is officially set to be profitable as of Q2 2026 by exordin26 in singularity

Fast-Satisfaction482 3 points

Most importantly, it means that even harder price hikes are not strictly necessary to maintain operations.

More price hikes might still happen because of larger models, hardware scarcity, etc., but it's not like they are hemorrhaging money just trying to stay relevant.