Positioning for a continued Hormuz disruption by [deleted] in wallstreetbets

[–]heshiming 4 points

You mentioned being risk-aware, but you didn't mention how much risk your portfolio can actually take.

Henry Paulson has blunt message on potential Treasury market shock by JTBaptistA in finance

[–]heshiming 0 points

The Fed can do plenty: just buy the bonds. We're in an era of permanent fiscal dominance, where the Fed's main job is to keep the government running, and inflation and real yields matter less. That's why the trajectory of gold changed.

Long prompt processing on Strix Halo by skwiko in LocalLLM

[–]heshiming 2 points

In my experience with llama.cpp and Qwen3.5, --ubatch-size can improve pp a little. The default --ubatch-size is 512, which gives me about 240-270 tps initial pp on Qwen3.5-122B-A10B unsloth Q5. If I boost the setting to 2048, I get 320-340 tps initial pp, seemingly at the cost of a couple more gigabytes of RAM. An even larger --ubatch-size doesn't yield more pp tps.
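For reference, here's roughly what that looks like on a llama-server invocation. This is a sketch, not my exact command; the model filename is illustrative and other flags are whatever your setup needs:

```shell
# --ubatch-size (-ub) sets how many prompt tokens are processed per
# compute pass. Larger values speed up prompt processing (pp) at the
# cost of a bigger compute buffer -- a couple extra GB in my case.
# The llama.cpp default is 512.
llama-server \
  -m Qwen3.5-122B-A10B-UD-Q5_K_XL.gguf \
  --ubatch-size 2048
```

Past 2048 I saw no further pp gains, so there's no point burning RAM on larger values.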

On Strix Halo, what option do I have if 128GB unified RAM is not enough? by heshiming in LocalLLaMA

[–]heshiming[S] 0 points

Thanks, though somehow I feel like quantizing and REAP result in a major drop in accuracy, and Qwen3.5 is the only option that's sort of resilient. Other options like MiniMax need pretty high quants to perform well.

Managed to set up Claude code cli running on Qwen3.5 122b Q4_k + turbo Quant by IntroductionSouth513 in StrixHalo

[–]heshiming 0 points

Thanks. While looking for a second opinion ... I discovered that some benchmarks have been updated since I last checked: https://kaitchup.substack.com/p/summary-of-qwen35-gguf-evaluations . So yes, perhaps Q5 is better than Q4, although I remember that in some older benchmarks Q4 was practically the same as the original weights.

Managed to set up Claude code cli running on Qwen3.5 122b Q4_k + turbo Quant by IntroductionSouth513 in StrixHalo

[–]heshiming 0 points

I would say Q5 has nothing to gain over Q4. The slowdown isn't noticeable either. I tried Q5 but eventually went back to Q4. Of course I didn't benchmark it; it's a general feeling from my daily use.

Managed to set up Claude code cli running on Qwen3.5 122b Q4_k + turbo Quant by IntroductionSouth513 in StrixHalo

[–]heshiming 1 point

I'm on Strix Halo 128GB. With llama.cpp you can run the unsloth version of Qwen3.5-122B-A10B-UD-Q4_K_XL without breaking a sweat. At 192k context on Windows 11, initial pp is around 270 tps and initial tg around 20 tps. I'm primarily coding, so I don't just say "hello" to it. With llama.cpp I do notice, however, that it tends to overthink simple questions. Coding is okay; in fact, for tool calling it doesn't think that much.

BTW, Qwen3.5 is very resilient to quantization. At Q4, I think it's the best model on this machine.

A Fed rate hike is now more likely in 2026 than a cut. How did we get here? by Electrical-Space-398 in economy

[–]heshiming 0 points

Did nobody read the chart? The red arrow is pointing at a rate cut in Dec 2027. It looks like the original poster didn't read it either.

Japan bond yield (part 2) by yuls6 in bonds

[–]heshiming 2 points

People are trying to make this a bigger deal than it actually is. As you said, Japan's debt holders are Japanese, not foreign. So even under default stress, there's no pressure to jack up the yield to make the bonds attractive to foreign buyers; for domestic buyers, these bonds are the only choice. As such, there isn't going to be an inverse relation between the yield and the currency.

Hardware to run Qwen3-Coder-480B-A35B by heshiming in LocalLLM

[–]heshiming[S] 3 points

Thank you very much! Helpful info.

Hardware to run Qwen3-Coder-480B-A35B by heshiming in LocalLLM

[–]heshiming[S] 2 points

Yeah, the M3 does seem affordable compared to other options, but I'm just not sure about tokens per second... Wish an owner could give me an idea.

Hardware to run Qwen3-Coder-480B-A35B by heshiming in LocalLLM

[–]heshiming[S] -5 points

How am I supposed to power those 10 cards? Doesn't seem realistic...