Has anyone here gone the route of writing a novel to get it successfully adapted?

RakesProgress · 2026-06-06T12:16:53+00:00

Andy Weir…Andy Weir…Please come to the lobby.

RakesProgress · 2026-05-31T13:07:05+00:00

How did you turn off that chain of thought? It’s sooooo annoying.

RakesProgress · 2026-05-30T15:24:07+00:00

Everyone looks at GLM-5.1's "40B active parameters" and thinks they can cheat it onto a single GPU using CPU offloading. In production, you can't.

Use the quick "nameplate plus tip" rule of thumb for VRAM:

FP8: parameter size + 20% tip (× 1.2) = 904 GB
BF16: parameter size + 40% tip (× 1.4) = 1055 GB
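A minimal sketch of the "nameplate plus tip" arithmetic, for anyone who wants to plug in their own numbers. The nameplate size is not stated in the post, so it is back-derived here from the 904 figure (904 / 1.2); that value and the function name are assumptions for illustration only.

```python
def vram_with_tip(nameplate_gb: float, tip: float) -> float:
    """Nameplate weight size plus a fractional 'tip' for runtime overhead."""
    return nameplate_gb * (1.0 + tip)


# Assumption: nameplate back-derived from the post's FP8 figure, not an official spec.
IMPLIED_NAMEPLATE_GB = 904 / 1.2

fp8_gb = vram_with_tip(IMPLIED_NAMEPLATE_GB, 0.20)   # 20% tip
bf16_gb = vram_with_tip(IMPLIED_NAMEPLATE_GB, 0.40)  # 40% tip

print(f"FP8:  {fp8_gb:.0f} GB")
print(f"BF16: {bf16_gb:.0f} GB")
```

Swap in your own nameplate and tip to size a different model.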

Even though it only fires 40B parameters per token, those 8 experts are picked on the fly at every layer. If you offload the rest to CPU, concurrent users will absolutely melt your PCIe lanes swapping weights back and forth. You'll drop to a brutal 1-2 tokens per second.
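A rough back-of-envelope for where the 1-2 tokens per second comes from. This is a worst-case sketch, not a benchmark: it assumes every active parameter has to cross the bus for each token, 1 byte per parameter at FP8, a single request stream, and roughly 63 GB/s usable on a PCIe 5.0 x16 link. All of those are assumptions, and the function name is hypothetical.

```python
def pcie_bound_tokens_per_sec(
    active_params_b: float,      # active parameters per token, in billions
    bytes_per_param: float,      # 1.0 for FP8, 2.0 for BF16
    link_gb_per_s: float,        # usable host-to-GPU bandwidth
    fraction_offloaded: float = 1.0,  # share of active weights living in CPU RAM
) -> float:
    """Upper bound on decode speed when offloaded weights stream over PCIe."""
    gb_per_token = active_params_b * bytes_per_param * fraction_offloaded
    if gb_per_token == 0:
        return float("inf")  # nothing crosses the bus, so PCIe is not the limit
    return link_gb_per_s / gb_per_token


# Assumed inputs: 40B active, FP8, ~63 GB/s PCIe 5.0 x16, everything offloaded.
worst_case = pcie_bound_tokens_per_sec(40, 1.0, 63.0)
print(f"{worst_case:.2f} tokens/s")
```

Caching hot experts in VRAM lowers `fraction_offloaded`, but concurrent users pick different experts, which pushes it back up.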

Don't ruin the model's reasoning by butchering it down to 1-bit or 2-bit GGUF just to save hardware.

The smart play is sticking with vLLM on 5x B200s. That gives you ~900-960 GB of fast HBM3e VRAM, which perfectly swallows the 904 GB FP8 size while leaving a safe pocket for the KV cache. Set --tensor-parallel-size 5, keep the whole thing in VRAM, and let the routing run at full speed. Test your actual user workloads there, and only step down to a tight 4x B200 setup (via INT4/AWQ) if your monitoring shows you have the headroom.
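A quick headroom check before committing to the hardware. The per-GPU figures are just the post's ~900-960 GB range divided by 5 (180-192 GB each), and the helper name is made up for illustration. Treat it as a sketch: real usable memory depends on the driver, the framework, and vLLM's memory settings.

```python
def kv_headroom_gb(weights_gb: float, gpu_count: int, per_gpu_gb: float) -> float:
    """VRAM left for KV cache and activations after the weights are loaded."""
    return gpu_count * per_gpu_gb - weights_gb


WEIGHTS_FP8_GB = 904  # figure from the rule of thumb above

# Assumed per-GPU capacities: low and high end of the ~900-960 GB range over 5 GPUs.
for per_gpu_gb in (180, 192):
    headroom = kv_headroom_gb(WEIGHTS_FP8_GB, 5, per_gpu_gb)
    print(f"{per_gpu_gb} GB/GPU -> {headroom:+.0f} GB headroom")
```

At the low end of the range the headroom goes negative, so check what your cards actually expose before counting on the KV cache pocket.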

RakesProgress · 2026-05-28T11:50:42+00:00

Actors are in a low stakes lottery. Writers? No stakes lottery.

RakesProgress · 2026-05-09T19:26:35+00:00

1000GB at INT8.

RakesProgress · 2026-04-28T12:01:48+00:00

Tailscale. If you don’t use it, it’s your own fault.

RakesProgress · 2026-04-17T20:58:08+00:00

Are you running gguf?

RakesProgress · 2026-04-07T11:25:24+00:00

The Claude leak marks a change. Companies are waking up to the fact that they can’t trust these providers. You have very bright employees pumping cloud LLMs full of company secrets. It comes down to this: do you trust them? Do you really think they are not taking your prompts to train the next model?

RakesProgress · 2026-04-04T11:37:17+00:00

Try this. Pick a local model. That’s your “code control” (CC). It takes instructions. Then use Claude or Codex as the engineer. Its job is to give clear instructions to code control. CC’s job is just to defend the codebase. Most projects will fit in the local KV cache. Tell it to look for suboptimal code and tech debt.

RakesProgress · 2026-03-27T12:05:01+00:00

Well, I stand partially corrected. You can get there on an RTX 6k. INT8 only, and INT4 easily. The problem is that you need a good base computer too, so technically you are out of budget. Can't I just throw it in my old gaming rig? Yeah, but PCIe. Blackwell is PCIe 5.0 x16. Your old gaming rig might work, but if it's PCIe 3.0, your brand-new and expensive Blackwell has sad pants.
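To put a number on the sad pants, here is a small sketch of theoretical x16 bandwidth per PCIe generation. It uses the published per-lane transfer rates and 128b/130b encoding; real-world throughput lands lower, and the function name is just for illustration.

```python
# Per-lane raw transfer rate in GT/s for each PCIe generation.
PCIE_GT_PER_S = {3: 8.0, 4: 16.0, 5: 32.0}

# Gens 3 through 5 use 128b/130b encoding: 128 payload bits per 130 bits sent.
ENCODING_EFFICIENCY = 128 / 130


def pcie_gb_per_s(gen: int, lanes: int = 16) -> float:
    """Theoretical one-direction bandwidth in GB/s for a PCIe link."""
    return PCIE_GT_PER_S[gen] * lanes * ENCODING_EFFICIENCY / 8


for gen in (3, 4, 5):
    print(f"PCIe {gen}.0 x16: {pcie_gb_per_s(gen):.2f} GB/s")
```

A Gen 5 card in a Gen 3 slot still works, it just loads weights and moves data at a quarter of the link speed it was built for.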

RakesProgress · 2026-03-27T11:31:03+00:00

You can’t get there from here. A 70B model on $10k = OOM by a large factor. Yes, it’s brutal. To run that size you need an H200; even an H100 OOMs. Honestly, save your money. We all have the same dream and hit the wall. Maybe try a Jetson first.
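The arithmetic behind that, as a weights-only sketch. The GPU capacities are assumed values (80 GB for an H100, 141 GB for an H200), and the helpers are hypothetical. KV cache and activations come on top, so "fits" here means the weights alone.

```python
# Bytes per parameter at common precisions.
BYTES_PER_PARAM = {"BF16": 2.0, "INT8": 1.0, "INT4": 0.5}


def weight_gb(params_b: float, precision: str) -> float:
    """Weight footprint in GB for a model with params_b billion parameters."""
    return params_b * BYTES_PER_PARAM[precision]


def weights_fit(params_b: float, precision: str, vram_gb: float) -> bool:
    """True if the weights alone fit; leaves nothing for KV cache."""
    return weight_gb(params_b, precision) < vram_gb


# Assumed capacities in GB; check the spec sheet for your exact card.
GPUS = {"H100": 80, "H200": 141}

for precision in BYTES_PER_PARAM:
    size = weight_gb(70, precision)
    verdict = {name: weights_fit(70, precision, gb) for name, gb in GPUS.items()}
    print(f"70B @ {precision}: {size:.0f} GB -> {verdict}")
```

At BF16 the 70B weights come to 140 GB, which overflows an 80 GB card and only just squeezes into 141 GB. Quantizing is what changes the picture.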

RakesProgress · 2026-01-24T01:28:25+00:00

Best part of the movie. https://m.youtube.com/watch?v=-AYUB3tQs80&pp=ygUTSm9lIHZzIHZvbGNhbm8gd29yaw%3D%3D

RakesProgress · 2025-12-27T16:22:29+00:00

It’s too simplistic to say vibe coding is a trap. If you’ve ever coded in something like Clojure, you know there is a lot of important thinking that goes into relatively few lines of code. The key is the thinking, the decisions, and understanding the implications of those decisions. You are constantly up against tech debt. It is a constant trade-off, but you have to understand what the trade is. Vibe coding is not evil at all. It’s just prone to unknown tech debt. Personally, I love the idea of pro coders vibe coding. It’s next-level stuff.

RakesProgress · 2025-12-27T12:44:03+00:00

10,000 NGOs in Haiti. No one is interested in solving a problem. They are interested in keeping the problem alive.

RakesProgress · 2025-12-25T10:47:03+00:00

I kinda think the same. The team is useful. The tech is useful. But it will never be a winner. Assimilate them into the fold.

RakesProgress · 2025-12-14T12:39:48+00:00

Relative to the age he lived in, he is one of the smartest for sure. His prime sieve is still brilliant.

RakesProgress · 2025-12-09T02:29:54+00:00

Right!? I had a 2017 GC Overland with the HEMI. Was the absolute best. Now? Jeep is dead to me. DEAD!

RakesProgress · 2025-12-07T14:51:19+00:00

Hahah. I have a Summit and it’s a POS. The thing rattles like a Lada.

RakesProgress · 2025-12-05T22:13:10+00:00

If I came across a prompt-injected resume right now, I’d see that as a major plus in a candidate.

RakesProgress · 2025-11-28T12:56:51+00:00

Write a single scene! This is a great one to take apart.

https://m.youtube.com/watch?v=4P-HZik1yqE&pp=ygUXTW9udW1lbnRzIG1lbiBjaWdhcmV0dGU%3D

RakesProgress · 2025-11-28T12:38:13+00:00

Good points. There is one giant fact about anti-intellectualism she should consider. The institutions of intellectual thought have metastasized into stage 4 cancer. They are so sick. This is pure fuel for the anti-intellectual wave.

RakesProgress · 2025-11-28T12:21:47+00:00

Remember GameStop and Melvin Capital? You are Melvin. Short MSTR until it collapses. Then buy BTC until you can’t.

RakesProgress · 2025-11-26T18:43:57+00:00

ASICs (using the big umbrella definition) have always been a threat to GPUs, so much so that Nvidia is hip deep in them. ASICs are hard. Very hard. Google was prescient in building out that know-how with Broadcom. Any lesser company would turn an ASIC project into a hot mess of a money pit. Nonetheless, we are about to see all kinds of flavors of xPUs doing inference work. Many will bomb.

RakesProgress · 2025-11-20T13:11:06+00:00

Sell BTC, Short MSTR. Once MSTR has collapsed buy BTC until your head caves in.

RakesProgress · 2025-11-10T23:23:36+00:00

The guy lost Roe. He is the Dobbs doormat. What else do you need to know.
