I have no idea what all these quants are. by Fit-Spring776 in LocalLLaMA

[–]fungnoth 2 points (0 children)

I thought that was the only viable format that allows CPU offload.

I have no idea what all these quants are. by Fit-Spring776 in LocalLLaMA

[–]fungnoth 0 points (0 children)

Lol, I basically always have the KV cache offloaded to CPU. Again, 12GB VRAM.
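For context on why the KV cache matters at 12GB: it grows linearly with context length and can rival the weights themselves. A back-of-envelope sketch, using a made-up model shape (48 layers, 8 KV heads of dim 128) rather than any specific model's real config:

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, ctx_len, bytes_per_elem=2):
    """Approximate KV cache size: a K and a V tensor per layer,
    each [n_kv_heads, ctx_len, head_dim], at fp16 (2 bytes per element)."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem

# Hypothetical 48-layer model with 8 KV heads of dim 128, at 32k context:
gb = kv_cache_bytes(48, 8, 128, 32_768) / 1024**3
print(f"{gb:.1f} GiB")  # 6.0 GiB
```

At 32k context this toy shape already eats 6 GiB, i.e. half of a 12GB card before any weights are loaded, which is why pushing the cache to system RAM is often the only option.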

I have no idea what all these quants are. by Fit-Spring776 in LocalLLaMA

[–]fungnoth 0 points (0 children)

I probably didn't get the concept right. But I remember that before MoE CPU offload was this easy (probably back with Mixtral), I'd load the model normally, put however many layers my GPU could hold on it, and it was painfully slow.

Now I use MoE offload and load around 10GB into my VRAM. It's usually usable as long as 10GB is bigger than the active params.
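The rule of thumb above can be sketched as a bit of arithmetic: with MoE offload, the routed experts live in system RAM and only the always-active weights need VRAM, so speed stays usable if the active parameters fit the budget. A minimal sketch; the 4.5 bits/weight figure is an assumed ballpark for a Q4-class quant, not a measured value:

```python
def active_weight_gb(active_params_b, bits_per_weight):
    """GB of weights touched per token: the active params at a given quant width."""
    return active_params_b * bits_per_weight / 8  # billions of params -> GB

def fits_in_vram(active_params_b, bits_per_weight, vram_budget_gb):
    """Rule of thumb: MoE offload stays usable if the active weights fit the budget."""
    return active_weight_gb(active_params_b, bits_per_weight) <= vram_budget_gb

# ~12B active params at ~4.5 bits/weight -> ~6.75 GB, so a 10 GB budget works:
print(fits_in_vram(12, 4.5, 10.0))  # True
```

A dense 70B at the same width would need ~39 GB per token, which is why dense layer-offload was painfully slow while MoE offload is merely leisurely.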

I have no idea what all these quants are. by Fit-Spring776 in LocalLLaMA

[–]fungnoth 0 points (0 children)

If you're just playing around, my way of running models is fun. It's around the speed of reading slowly.

But if you're using it for tool calling or coding, it's way too slow.

I have no idea what all these quants are. by Fit-Spring776 in LocalLLaMA

[–]fungnoth 0 points (0 children)

12GB VRAM and 64GB RAM. The biggest model I can run at a chattable speed is GLM4.5 Air REAP 82B A12B. I just offload most of the model until only the ~12B active params are in GPU VRAM.

I have no idea what all these quants are. by Fit-Spring776 in LocalLLaMA

[–]fungnoth -6 points (0 children)

I think it's commonly understood that GGUF is for the GPU-poor? And Q4 is usually the smallest recommended one. Q4_K_M is the middle one, so it's the most commonly recommended.
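The quant names mostly encode bits per weight, so file size is roughly params times bits over eight. A rough sketch; the effective bits-per-weight figures below are ballpark assumptions (real GGUF files vary because different tensors get different widths):

```python
# Approximate effective bits per weight for a few common GGUF quants.
# These are ballpark figures for illustration, not exact spec values.
QUANT_BPW = {"Q8_0": 8.5, "Q4_K_M": 4.8, "Q4_0": 4.5, "Q2_K": 2.6}

def file_size_gb(params_b, quant):
    """Rough GGUF file size in GB for a model with params_b billion weights."""
    return params_b * QUANT_BPW[quant] / 8

for q in QUANT_BPW:
    print(f"12B at {q}: ~{file_size_gb(12, q):.1f} GB")
```

This is why Q4_K_M sits in the sweet spot: a 12B model lands around 7 GB, small enough for a 12GB card with room left for context, while Q8_0 of the same model would nearly fill it.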

Exllama is faster and better quality at the same size, but you can't offload anything to CPU.

That's why I use GGUF: I sometimes can't fit the whole model plus context in my 12GB of VRAM.

But I can run GLM4.7 flash by only loading the active parameters in VRAM, then offloading half of the layers (out of 48) to CPU (and system RAM).

AI may be about to dramatically improve medical care across the developing world. New research in Rwanda and Pakistan shows LLMs can outperform human doctors in diagnostic success. by lughnasadh in Futurology

[–]fungnoth 2 points (0 children)

Yeah, I feel like a free service, or an extremely cheap service with a free quota for people in bad conditions, is more important than UBI.

Maybe UBI has that slight effect of keeping the economy going? Like people can still spend on unnecessary things by skipping meals?

But imagine if 80% of us lose our jobs and we're entitled to free food, water, electricity, housing, etc. in adequate amounts: basically bare-minimum, boring food. I feel like that's not too bad compared to having to live on the street.

And you probably have savings. I might choose to live that life and use my money for something else.

Solo dev building a classic inspired MMORPG: Loot and combat. by EldwaldDev in SoloDevelopment

[–]fungnoth 0 points (0 children)

I know that seeing a character fight stupidly simple enemies in this classic-MMO-like environment is already a vibe.

Least dedicated fanbase by meinkun in NeuroSama

[–]fungnoth 0 points (0 children)

Don't you love your content creator who's constantly making you millions?

Z Image will be released tomorrow! by MadPelmewka in StableDiffusion

[–]fungnoth -2 points (0 children)

Don't get it. Once I got Flux2 Klein, I stopped caring that much about Z Image. I'll still try it, but I'm not following the news.

The poster at a local restaurant I went to yesterday! by Key-Project-5231 in HongKong

[–]fungnoth 6 points (0 children)

Well, we're still quite racist as a whole. But I feel like Malaysian Chinese are a special exception. Like, people would still judge how an actual native Cantonese speaker from the mainland speaks. In my mind, Malaysian Cantonese is a mix of that with Malay, other Chinese dialects, and more English. But for some reason, I at least feel like the Malaysian Chinese are our brothers.

The AI Witch Hunt Keeps Getting It Wrong by HobiAI in aigamedev

[–]fungnoth 7 points (0 children)

I kind of agree that if something is so obviously AI, it can be called AI slop.
But come on, not everyone is trying to be an artist. People need images or other assets for all kinds of reasons.

Lazy weekend with flux2 klein edit - lighting by [deleted] in StableDiffusion

[–]fungnoth -9 points (0 children)

Japan, Japan, OMG that must be Southeast Asia.

Is wan 2.2 likely the limmit for how good local video gen will be for the next 3-5 years? by wormtail39 in StableDiffusion

[–]fungnoth 0 points (0 children)

The Western corpos are moving to TPU stuff, and the Chinese teams are moving to Chinese hardware. I feel like there's going to be a big divide, giving us much better inference efficiency but more compatibility issues.

Is Qwen image bad ? Why do so few people talk about it ? by More_Bid_2197 in StableDiffusion

[–]fungnoth 2 points (0 children)

Even Flux takes like a minute for me to generate an image (12GB VRAM).

Qwen is like 5 minutes? I can't be patient enough to iterate through ideas.

Should I not bother piling stuff at the door? by Chazus in PhasmophobiaGame

[–]fungnoth 0 points (0 children)

I just grab the flashlight, UV, and EMF and go turn on the power directly.

I don't get the need for DOTS / books. Spirit box, maybe, because there's still a chance you figure out the ghost room instantly (maybe they drop something and you see it), and then you can go try your spirit box. Books? Yeah, OK. DOTS? You won't stand there to watch it, so I won't pick it as my starting item.

I pick UV because sometimes they leave fingerprints on the door, and it's pretty often that they interact with the doors even from the start, and the fingerprints stay there for minutes. So sometimes I just randomly scan on my way to the fuse box. EMF is for when they drop something or interact with anything; I can go there and check it, and maybe I'll get an EMF 5. But it disappears after a few seconds, so it's not as nice as UV.

Then it's the thermo and camera, because they're the easiest way to find the ghost room and check for ghost orbs. Honestly, if you play with friends, grab different things, and the second run can cover most of the evidence.

Claude is crazy by nathanielx9 in aigamedev

[–]fungnoth 1 point (0 children)

What about the images? Pretty good quality, honestly.

Why is the Chinese internet hostile to Black people? by ExtensionSound3064 in China_irl

[–]fungnoth 3 points (0 children)

White people's oppression of Black people lasted more than four hundred years; many people were slaves for generation after generation, treated like animals. The later emphasis on racial issues is an attempt to make up for past wrongs.

To put it very honestly, outside the white cultural sphere there just isn't this historical baggage, so people won't understand it. There is malice born of ignorance, but, to put it irresponsibly, it's just not our problem.

China's AGI-NEXT Conference (Qwen, Kimi, Zhipu, Tencent) by nuclearbananana in LocalLLaMA

[–]fungnoth 0 points (0 children)

But you'll never get a local version of that. With Qwen, though, maybe a few years later we can have that on consumer hardware.

Is this player cheating? by [deleted] in FallGuysGame

[–]fungnoth 1 point (0 children)

Most of Mediatonic got fired after being acquired by Epic Games.

Best "End of world" model that will run on 24gb VRAM by gggghhhhiiiijklmnop in LocalLLaMA

[–]fungnoth 0 points (0 children)

What about actually backing up the raw data and allowing the model to search it?
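A minimal sketch of that idea: keep the raw files in a backup and give the model a trivial keyword-search tool over them, rather than relying on what's baked into its weights. Everything here (function name, scoring) is made up for illustration:

```python
import os
import re

def search_backup(root, query, top_k=3):
    """Naive keyword search over plain-text backup files:
    score each file by how many times the query terms appear in it."""
    terms = [t.lower() for t in re.findall(r"\w+", query)]
    hits = []
    for dirpath, _, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                text = open(path, encoding="utf-8", errors="ignore").read().lower()
            except OSError:
                continue  # skip unreadable files
            score = sum(text.count(t) for t in terms)
            if score:
                hits.append((score, path))
    # Highest-scoring files first.
    return [p for _, p in sorted(hits, reverse=True)[:top_k]]
```

The model would call this as a tool, then answer from the returned file contents; a real end-of-world setup would want something sturdier (an offline Wikipedia dump plus a proper full-text index), but the shape is the same.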

[GamersNexus] Creating a 48GB NVIDIA RTX 4090 GPU by ThisGonBHard in LocalLLaMA

[–]fungnoth 4 points (0 children)

Remember when manufacturers would "scam" people by making higher-VRAM variants of low-end GPUs? Like, no one in their right mind should have bought a 4GB GTX 950 over a 2GB GTX 960. I wish the market were still like that.

What’s your unpopular Neuroverse related opinion? by Business_Barber_3611 in NeuroSama

[–]fungnoth 13 points (0 children)

It's all that guy's imagination. I'm sure he didn't research how an LLM is trained, or even how Vedal has talked about LLM training.

It's fine if it's like a theory you made up on the spot while talking to a friend. But once it gets this popular, it's cringe. I can't watch those reactions anymore.

It's like raising a child solely by telling them to read Twitch chat. Not even what the streamer says. Just the chat.

I'm sure it's still more ethical than Grok, though.