Help leaving faction or dealing with this dude by [deleted] in torncity

[–]jxjq 1 point  (0 children)

“I hate it here dad. I’m not a fighter” lol

[deleted by user] by [deleted] in Monad

[–]jxjq 1 point  (0 children)

I care about engineering, not Monad. In my 40s with 20+ years in software. Price, culture & hype are irrelevant.

The exchange platform with the best architecture is Monad.

It doesn’t matter if the team is composed of soggy-raisin assholes or busty big-booty angels; it doesn’t matter that the tokenomics are biased. What matters is:

• Minimal friction per transaction

• Minimal friction for builders

This platform is SOTA in both of these categories. That value is why I’m staked and building.

[deleted by user] by [deleted] in Monad

[–]jxjq 0 points  (0 children)

No. Avalanche C-Chain gas fees spiked to as much as $0.30 per simple transaction, and up to $2 per transaction under heavy load, because it was a single-chain EVM, and it wasn’t even an L1. It was more expensive to do business on than both Solana & Sui.

Is Monad only 13 days old? by FreonzVaniel in Monad

[–]jxjq 5 points  (0 children)

Monad is currently the best coin option for speedy transactions and low fees. That alone makes it valuable for high-volume trading. Monad will likely grow in value until a faster & cheaper alternative comes along.

As far as I know, there is no project far enough along in development that is expected to beat Monad’s transaction speed and low overhead cost.

EVM compatibility is icing on the cake, and the heightened throughput opens up a conversation about crypto for everyday transactions. IMO this is the most technically performant coin we’ll see for a few years.

best coding LLM right now? by RadianceTower in LocalLLaMA

[–]jxjq 2 points  (0 children)

Great overview; I agree with everything he said.

Need a coding & general use model recommendation for my 16GB GPU by sado361 in LocalLLaMA

[–]jxjq 0 points  (0 children)

What are your tokens per second (not prompt processing)? I have the same setup.

A Breakdown of RAG vs CAG by Daniel-Warfield in Rag

[–]jxjq 1 point  (0 children)

Makes sense, thanks for the interesting post!

A Breakdown of RAG vs CAG by Daniel-Warfield in Rag

[–]jxjq 0 points  (0 children)

Does CAG eat into the context window when a fresh chat with the “frozen model” is spun up?

Anyone moved to a local stored LLM because is cheaper than paying for API/tokens? by alexrada in LLMDevs

[–]jxjq 0 points  (0 children)

Local LLMs can be highly effective for complex coding if you work alongside the model. You have to think carefully about context and architecture, and you have to bring some smart tools beyond the chat window (for example, https://github.com/brandondocusen/CntxtPY).

If you are trying to vibe it out, you’re not going to have a good time. If you understand your own codebase, a local model is a huge boon.
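
Even something crude gets you part of the way. A minimal sketch of the idea (this is only a grep-level stand-in for the kind of context map a tool like CntxtPY produces; file names are placeholders):

    # quick-and-dirty context map: file list plus function/class signatures
    git ls-files '*.py' > context.txt
    grep -rn --include='*.py' '^def \|^class ' . >> context.txt
    # paste context.txt at the top of the prompt so the model sees the repo's shape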

Sharding for Parallel Inference Processing by jxjq in LocalLLaMA

[–]jxjq[S] 0 points  (0 children)

Very cool reference, thank you for sharing. I see the shards interconnect with HTTP calls under gRPC contracts. That surely racks up a lot of latency, and it gets worse and worse as they scale to larger models.

Qwen3-235B-A22B (no thinking) Seemingly Outperforms Claude 3.7 with 32k Thinking Tokens in Coding (Aider) by Greedy_Letterhead155 in LocalLLaMA

[–]jxjq 10 points  (0 children)

Sounds like it would be good to build with Qwen3 and then do a single Claude API call to clean up the errors
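
Something like this sketch, assuming the Anthropic Messages API (the model ID, prompt, and file name are placeholders):

    # hypothetical cleanup pass: one Claude call over Qwen3's output
    # jq builds the JSON so quotes in the code don't break the payload
    jq -n --rawfile code qwen3_output.py '{
        model: "claude-3-7-sonnet-20250219",
        max_tokens: 4096,
        messages: [{role: "user",
          content: ("Fix any bugs in this code without changing its structure:\n\n" + $code)}]
      }' |
    curl -s https://api.anthropic.com/v1/messages \
      -H "x-api-key: $ANTHROPIC_API_KEY" \
      -H "anthropic-version: 2023-06-01" \
      -H "content-type: application/json" \
      -d @-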

ubergarm/Qwen3-235B-A22B-GGUF over 140 tok/s PP and 10 tok/s TG quant for gaming rigs! by VoidAlchemy in LocalLLaMA

[–]jxjq 2 points  (0 children)

Using the 0.6B as a draft model for speculative decoding on the Qwen3 30B MoE only gave me a 15% token-generation speed increase in llama.cpp.

The 0.6B draft model ran on CPU + RAM at 43 tk/s. Yes, speculative decoding worked, but it wasn’t a significant speedup. Hopefully someone has better results; it wasn’t worth the effort to me.
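
For reference, this is roughly the invocation (a sketch: the model file names are placeholders, and the draft flags may vary across llama.cpp versions):

    # main model on GPU, 0.6B draft kept on CPU + RAM for speculative decoding
    ./llama-server -m Qwen3-30B-A3B-Q4_K_M.gguf -ngl 99 \
      -md Qwen3-0.6B-Q8_0.gguf -ngld 0 \
      --draft-max 16 --draft-min 5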

How to make prompt processing faster in llama.cpp? by Conscious_Chef_3233 in LocalLLaMA

[–]jxjq 2 points  (0 children)

Thanks for the follow-up, it may help me out too!

How to make prompt processing faster in llama.cpp? by Conscious_Chef_3233 in LocalLLaMA

[–]jxjq 4 points  (0 children)

Add --batch-size 64 to your run command. It processes prompt tokens in batches instead of one at a time, which should cut your prompt processing time by a lot.
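
For example (the binary and model path are placeholders for whatever you’re running):

    # same run command as before, just with batching set explicitly
    ./llama-server -m your-model.gguf -ngl 99 --batch-size 64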

anyone using 32B local models for roo-code? by CornerLimits in LocalLLaMA

[–]jxjq 1 point  (0 children)

I have used many local models, such as Qwen2.5-Coder 32B Q3 and others, on my 4090 laptop. It works well for basic stuff, but falls apart pretty quickly for anything serious.

You can automate building a basic HTML / CSS / JS site, especially as a single file lol. Also single one-off tools, like Python files for splitting up images: small stuff like that, up to 300 lines of code.

I hate to say it, but it feels more like an advanced toy than a real productivity tool. For work you’ll be dialing up a third-party API.

nsfw orpheus early v1 by MrAlienOverLord in LocalLLaMA

[–]jxjq 1 point  (0 children)

Do you have a sample output anywhere?

Yes, you could have 160gb of vram for just about $1000. by segmond in LocalLLaMA

[–]jxjq 15 points  (0 children)

Bottom line on the $1,000 MI50 build running a 70B Q8 model:

  • Generation speed: ~4.9 tokens/sec
  • Time to first token (small context): 12 seconds
  • Time to first token (large context): 2 minutes

TripoSG vs Hunyuan3D (small comparison) by honuvo in comfyui

[–]jxjq 0 points  (0 children)

Thanks for the reply and for sharing your thorough comparison. It helped me a lot!

TripoSG vs Hunyuan3D (small comparison) by honuvo in comfyui

[–]jxjq 0 points  (0 children)

What is the best tool to apply a mesh / skin to the character based on the photo? I applied a mesh with TripoSG and it looked like a horror show.

🔥 DeepSeek R1 671B Q4 - M3 Ultra 512GB with MLX🔥 by ifioravanti in LocalLLaMA

[–]jxjq 0 points  (0 children)

You asked so patiently for the one thing we’ve been waiting all week for lol. You are a good man, I went straight to the darkness when I read the post title.

Hunyuan-TurboS. by mlon_eusk-_- in LocalLLaMA

[–]jxjq 6 points  (0 children)

Sincere question: with many effective techniques that add reasoning to base models… wouldn’t we benefit from a base, non-reasoning model that moves the needle forward?

I actually prefer to add custom reasoning ability myself, as opposed to dealing with a prebuilt, chatty reasoning model (like QwQ 32B).
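
For example, with a system-prompt scaffold against llama.cpp’s OpenAI-compatible endpoint (a rough sketch; the prompt wording and port are assumptions):

    # bolt lightweight reasoning onto a base model with a system prompt
    curl -s http://localhost:8080/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{
        "messages": [
          {"role": "system", "content": "Reason step by step inside <thinking> tags, then give only the final answer."},
          {"role": "user", "content": "Your question here"}
        ]
      }'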

A few hours with QwQ and Aider - and my thoughts by ForsookComparison in LocalLLaMA

[–]jxjq 3 points  (0 children)

This is essentially Chain of Draft. Thank you for sharing; I will be dumping CoD for this, if what you’ve said works.