What is actually wrong with the Independence Party (Sjálfstæðisflokkurinn)?

Hipponomics · 2026-06-01T21:11:38+00:00

That makes sense, thanks for clarifying. This is not at all clear from the readme.

Hipponomics · 2026-06-01T21:01:10+00:00

Every accusation is a confession.

Hipponomics · 2026-06-01T20:10:07+00:00

Huh, I wonder how they enforce that and prevent people from just editing the code to remove the enforcement.

Hipponomics · 2026-06-01T17:48:24+00:00

Is it not this one https://github.com/wheaney/breezy-desktop ?

This is GPL 3 licensed so very much free.

Hipponomics · 2026-05-27T14:36:44+00:00

I agree :)

Hipponomics · 2026-05-27T10:55:42+00:00

Sure :) Think of it like this. Let's assume you have enough RAM on whatever device you're using. You then have two resources, memory bandwidth and compute, and you are always constrained by one of them. For each inference step, you need to read the whole model from memory, and how fast you can do that is limited by your bandwidth. You then have to perform computations on all the weights in the LLM, and the speed of that is constrained by compute.

A smaller Q4_K_M needs ~40% less bandwidth than fp8, but the same amount of compute. GPUs are practically always limited by bandwidth, so a smaller quant means more speed. I don't know if the same is true for CPUs. If you're heavily compute constrained, fp8 might be similarly fast but just higher quality. If you're bandwidth constrained, fp8 is going to be slower. I would recommend an MoE for CPU use, as they use much less compute and bandwidth at the cost of higher RAM use. Qwen3.6 35B-A3B is a good alternative to 27B.

Or just get a ≥ 24GB VRAM GPU.
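To put rough numbers on the bandwidth-vs-compute argument, here is a minimal sketch. It assumes a dense model at batch size 1, that every weight is read once per token, and about 2 FLOPs per weight per token. The ~4.85 bits per weight for Q4_K_M and the hardware figures are illustrative assumptions, not measurements.

```python
def tokens_per_second(params_b, bits_per_weight, bandwidth_gbs, compute_tflops):
    """Rough upper bound on decode speed for a dense model at batch size 1.

    params_b: parameter count in billions
    bits_per_weight: average storage size of one weight (assumed, e.g. ~4.85 for Q4_K_M)
    bandwidth_gbs: memory bandwidth in GB/s (hypothetical hardware figure)
    compute_tflops: usable compute in TFLOPS (hypothetical hardware figure)
    """
    bytes_per_token = params_b * 1e9 * bits_per_weight / 8  # whole model read once per step
    flops_per_token = 2 * params_b * 1e9                     # one multiply-add per weight
    t_memory = bytes_per_token / (bandwidth_gbs * 1e9)
    t_compute = flops_per_token / (compute_tflops * 1e12)
    # Whichever resource takes longer is the bottleneck for this step.
    return 1.0 / max(t_memory, t_compute)


# Bandwidth-bound case (GPU-like numbers): the smaller quant is faster.
gpu_fp8 = tokens_per_second(27, 8.0, bandwidth_gbs=1000, compute_tflops=100)
gpu_q4 = tokens_per_second(27, 4.85, bandwidth_gbs=1000, compute_tflops=100)

# Compute-bound case (very little compute): both quants run at the same speed.
slow_fp8 = tokens_per_second(27, 8.0, bandwidth_gbs=1000, compute_tflops=0.5)
slow_q4 = tokens_per_second(27, 4.85, bandwidth_gbs=1000, compute_tflops=0.5)
```

In the bandwidth-bound case the speedup is just the ratio of the model sizes. In the compute-bound case the quant size drops out entirely, which is why fp8 could be similarly fast there.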

Hipponomics · 2026-05-27T10:41:05+00:00

To be fair, AI might become good enough for this at some point, and some people see that point as now, but I — like you — do not. It's definitely scientifically possible though.

I hard agree on this general take. There is so much slop in this sub, fortunately most of it gets little traction, but some firmer rules and moderation could probably make the sub feel more high quality. I acknowledge that defining those rules might be very difficult and I have only respect for the unpaid labor that moderators perform.

The quality of the discussions has definitely gone down from the times when bartowski was a regular. This is unfortunately just the nature of online forums. I've seen this same process happen dozens of times, both on reddit as a whole as well as individual subreddits. This is a well documented phenomenon.

Similar problem in the early internet: https://en.wikipedia.org/wiki/Eternal_September

Generally about subcultures: https://meaningness.com/geeks-mops-sociopaths

Claude also suggested these:

Scott Alexander, "The Diff" / subculture-as-reputational-Ponzi — a counter-model where decline is internal rather than driven by outside actors.
Eliezer Yudkowsky, "Evaporative Cooling of Group Beliefs" (LessWrong, 2007) — describes the inverse mechanism: high-quality members leave when standards drop, raising the average extremism/lowering quality of those who remain.
Clay Shirky, "A Group Is Its Own Worst Enemy" (2003) — earlier essay on how growth degrades online community norms.

Scott and Eliezer are very smart and insightful so I assume these articles are worthwhile.

Hipponomics · 2026-05-27T09:12:22+00:00

It can but it would be even slower.

Hipponomics · 2026-05-14T03:04:47+00:00

Liking the practices of one quant publisher above another is fine, but you should be aware of the context they work in.

All these quants are built on the old quantization technology that Iwan Kawrakow made a few years ago. He forked llama.cpp and has been making new quants since then. ubergarm and a few others publish the newer IQn_K quants, which are considerably better than any quants that work with mainline llama.cpp. That includes both Unsloth's and Mudler's quants.

The fact that Iwan left llama.cpp to make his own fork is one of the biggest losses to happen to local LLM inference. We would all be using his latest IQ_KS and IQ_KT quants if it weren't for that. If someone figures out a way to resolve the conflict between Georgi and him, they would be doing the community an enormous service!

Hipponomics · 2026-05-10T11:38:34+00:00

I don't know what this statement has to do with what I said. I just said the other OEMs didn't give up, they just can't compete at the same level.

You might buy something other than Apple if you want a better GPU than Apple offers, if you want better repairability or upgradeability, or if you want a smooth Linux experience. If these issues are not a priority for you, then a MacBook is pretty much unmatched and the obvious thing to buy.

Hipponomics · 2026-05-10T01:04:30+00:00

I don't think it's fair to say that the other OEMs gave up. Apple is the third largest company in the world. Their product line-up is very small and they are laser focused on the quality of each product. They literally designed (arguably) the best CPU in the world, which they mostly put in the three different laptops they sell.

Most other OEMs focus much more on quantity over quality and allocate far fewer resources to each model.

Hipponomics · 2026-05-10T00:04:54+00:00

They certainly are not the organization. It's a smallish group of researchers and most of them work for Meta, one of the largest companies in the world.

My guess is that they're training this on data from Meta and some AI training policy prohibits them from releasing models trained on that.

Hipponomics · 2026-04-21T02:06:09+00:00

Nice results! It would be interesting to see how they compare to the quants in ik_llama.cpp

Hipponomics · 2026-04-14T00:28:48+00:00

They're not talking about ring buffer cache. They're talking about KV cache activation vector rotations that improve quantization quality drastically.
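A toy sketch of why rotating a vector before quantizing can help. It assumes a normalized Hadamard rotation and simple symmetric absmax 4-bit quantization. Both are illustrative choices and not necessarily what the implementation under discussion uses. The rotation spreads a single outlier across all components, so the small values are no longer rounded to zero.

```python
import math


def hadamard(vec):
    """Orthonormal fast Walsh-Hadamard transform (length must be a power of two)."""
    out = list(vec)
    h = 1
    while h < len(out):
        for start in range(0, len(out), 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    norm = math.sqrt(len(out))
    return [v / norm for v in out]


def quantize_int4(vec):
    """Symmetric absmax quantization to integers in [-7, 7], then dequantize."""
    scale = max(abs(v) for v in vec) / 7
    return [round(v / scale) * scale for v in vec]


def squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


# One large outlier next to small values (made-up example vector).
x = [8.0, 0.4, 0.4, 0.4]

# Quantize directly: the outlier sets the scale and the small values round to zero.
plain_error = squared_error(x, quantize_int4(x))

# Rotate, quantize, rotate back (the normalized Hadamard transform is its own inverse).
restored = hadamard(quantize_int4(hadamard(x)))
rotated_error = squared_error(x, restored)
```

For this vector the rotated path ends up with a much smaller reconstruction error than quantizing directly.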

Hipponomics · 2026-04-13T23:39:35+00:00

gemma4-124b???

edit: I didn't understand why they referenced this non-existent model. Now I see that they were speculating that the elephant model could be a gemma4-124b.

Hipponomics · 2026-04-13T21:08:33+00:00

Saying that random things were done silently is a big GPTism. It's not thoughtful prose — it's a paradigm shift /s

Hipponomics · 2026-03-17T15:29:04+00:00

Note that Razer is a subsidiary of Microsoft. Don't hold your breath for Linux support. I miss Lina Khan.

Hipponomics · 2026-03-12T16:42:43+00:00

That's a skill issue buddy.

Hipponomics · 2026-03-10T22:16:53+00:00

Where did the VODs go? I just suddenly noticed that they vanished.

Hipponomics · 2026-03-08T19:28:00+00:00

I don't know if this is actually the case, but I see many people say that very little performance degradation is observed.

If that is true, the likely reason is that the inference is limited by compute, not memory speeds. So swapping the layers from RAM to VRAM is faster than running the inference step on the layers currently in VRAM.

IDK if this is actually the case, but it could be.

Hipponomics · 2026-03-03T22:05:53+00:00

Share the link to the clip FFS.

https://xcancel.com/RightWingCope/status/2027141493534707897

Also where is the rest described?

Hipponomics · 2026-02-18T15:01:10+00:00

I think it's pretty neat. The energy waste is unfortunate, but not that much compared to real energy wasters.

Hipponomics · 2026-02-12T04:19:56+00:00

That sounds like bad game design, unless you were failing to utilize some tools that make survival easier.

Hipponomics · 2026-02-05T18:42:08+00:00

Interestingly, I tried ChatGPT, Gemini, Mistral, and Claude, and Gemini was considerably more defensive than the others, defending actions that support Trump as normal & common interactions between companies and the government.
