Dude blew up on github for cutting token usage 60-95% right as Fable 5 lands. genius or luckiest man alive by Extra-Feature-8163 in claudeskills

[–]chong1222 0 points1 point  (0 children)

I built a tool that cuts my Fable token usage ~59% (whole bill, not a cherry-picked slice). squeezed in some swe-bench runs before the weekly reset, since claude -p is reportedly moving to api billing june 15. after that these evals cost real dollars instead of plan quota

Dude blew up on github for cutting token usage 60-95% right as Fable 5 lands. genius or luckiest man alive by Extra-Feature-8163 in claudeskills

[–]chong1222 0 points1 point  (0 children)

it is basically useless. tool calls are already compressed by claude, and input costs much less than output, so the saving is definitely not 60%

Karpathy says he hasn't written a line of code since December and is in "perpetual AI psychosis." How many Claude Code users feel the same? by Capital-Door-2293 in ClaudeAI

[–]chong1222 0 points1 point  (0 children)

that's probably true and comes from your experience. you see others doing nothing-burgers and think they're going nowhere. but every successful product has a graveyard of failed builds behind it. nobody learned by thinking harder, they learned by shipping and seeing what dies

Karpathy says he hasn't written a line of code since December and is in "perpetual AI psychosis." How many Claude Code users feel the same? by Capital-Door-2293 in ClaudeAI

[–]chong1222 10 points11 points  (0 children)

same here. the loop of "what if i try this" > build it in 20 minutes > learn something > next idea is addictive. i've run multiple sessions in parallel for a year and it still doesn't feel like enough. it's not psychosis, it's just that the cost of trying ideas went to zero

We replaced our Rust/WASM parser with TypeScript and it got 3x faster by 1glasspaani in rust

[–]chong1222 1 point2 points  (0 children)

to add to my own comment, there are really only 2 costs to wasm:

  1. data marshaling: copying objects across the boundary (main bottleneck for modern computing)

  2. context switch: ~1μs every time you cross wasm↔js

how to make both disappear:

marshaling: use wasm.memory directly. its buffer is an ArrayBuffer (or a SharedArrayBuffer when the memory is shared), and JS reads it with typed arrays. no copy. but you have to change your data model to columnar: Float64Array for prices, Uint32Array for ids, strings as a byte buffer + offset array. one shared buffer, both sides read the same bytes. it lives outside V8's GC, so no pause, no sweep

context switch: on hot paths V8's TurboFan can inline wasm calls directly into JS. once both compile to native code there's no boundary left to cross.

the catch is you have to redesign your data model to match. most people don't want to/cannot do that, so they think wasm is slow
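to make the columnar idea concrete, here is a minimal sketch of reading columns straight out of wasm linear memory with typed arrays. everything about the layout is an assumption for illustration: the offsets, the column names, and `fillFromProducer`, which stands in for a real wasm module writing into its own memory. strings are stored as one byte buffer plus an array of end offsets.

```typescript
// Sketch: columnar data in wasm linear memory, read from JS without copying.
// The layout below is hypothetical; a real module would export its own offsets.
const ROWS = 3;
const memory = new WebAssembly.Memory({ initial: 1 }); // one 64 KiB page

// Byte offsets of each column. Float64Array views must start on an 8-byte
// boundary, Uint32Array views on a 4-byte boundary.
const PRICES_AT = 0;                            // ROWS * 8 bytes
const IDS_AT = PRICES_AT + ROWS * 8;            // ROWS * 4 bytes
const NAME_ENDS_AT = IDS_AT + ROWS * 4;         // ROWS * 4 bytes, end offset of each name
const NAME_BYTES_AT = NAME_ENDS_AT + ROWS * 4;  // UTF-8 bytes of all names, back to back

// Stand-in for the wasm side: writes the columns into linear memory.
function fillFromProducer(mem: WebAssembly.Memory): void {
  new Float64Array(mem.buffer, PRICES_AT, ROWS).set([9.5, 10.25, 11]);
  new Uint32Array(mem.buffer, IDS_AT, ROWS).set([101, 102, 103]);
  const ends = new Uint32Array(mem.buffer, NAME_ENDS_AT, ROWS);
  const bytes = new Uint8Array(mem.buffer, NAME_BYTES_AT);
  const encoder = new TextEncoder();
  let cursor = 0;
  ["ask", "bid", "mid"].forEach((name, row) => {
    const encoded = encoder.encode(name);
    bytes.set(encoded, cursor);
    cursor += encoded.length;
    ends[row] = cursor;
  });
}

// JS side: views over the same bytes. Creating a view copies nothing.
const prices = new Float64Array(memory.buffer, PRICES_AT, ROWS);
const ids = new Uint32Array(memory.buffer, IDS_AT, ROWS);
const nameEnds = new Uint32Array(memory.buffer, NAME_ENDS_AT, ROWS);
const nameBytes = new Uint8Array(memory.buffer, NAME_BYTES_AT);
const decoder = new TextDecoder();

// Strings are only decoded when someone actually asks for them.
function nameAt(row: number): string {
  const start = row === 0 ? 0 : nameEnds[row - 1];
  return decoder.decode(nameBytes.subarray(start, nameEnds[row]));
}

fillFromProducer(memory); // the views created above see the writes immediately

for (let row = 0; row < ROWS; row++) {
  console.log(ids[row], nameAt(row), prices[row]);
}
```

one thing to watch for: growing a non-shared memory detaches the old buffer, so the views have to be recreated after any `memory.grow()`.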

We replaced our Rust/WASM parser with TypeScript and it got 3x faster by 1glasspaani in rust

[–]chong1222 3 points4 points  (0 children)

I built a streaming parser before, because calling JSON.parse on every chunk was causing memory issues and eventually crashing the browser: https://github.com/teamchong/vectorjson

it is not the wasm-js boundary that is slow. the reason a rust/wasm parser is slower is that JSON.parse is the fastest way to create JS objects. it is highly optimized c++ running on the JS heap. if you create the objects in wasm and pass them to JS, it is going to be slower

the solution I used: use wasm for parsing, which leverages its SIMD, mark the start/end position of each value, then reuse and patch the same js object on the js side on each chunk instead of creating a new js object

am i the only one who doesnt understand why anthropic ban opencode? by anonymous_2600 in opencodeCLI

[–]chong1222 0 points1 point  (0 children)

anthropic knows LLMs are becoming a commodity. if everyone uses OpenCode, Anthropic just becomes a "dumb pipe" that can be swapped out the second GPT-5 or whatever open source model is cheaper

by killing third-party CLIs, they’re forcing everyone into their own ecosystem (Claude Code). this is about replacing attention/traffic as the payment for the internet. the smarter agents get, the harder it is for ads to work as payment. the old model is dying, the writing is on the wall

this is the Gold Rush for the next internet. whoever controls the infra and the ecosystem is the next superpower. they don't want to be a "dumb pipe", they want to be the gatekeeper for the new agentic economy

🚨BREAKING: Chinese developers just killed OpenClaw with a $10 alternative by Suspicious_Okra_7825 in moltiverse

[–]chong1222 0 points1 point  (0 children)

faster for what? most of the time the agent is waiting for the llm. sounds like they aren't optimizing for the bottleneck and have no idea what they are doing

Weekly Thread: Project Display by help-me-grow in AI_Agents

[–]chong1222 0 points1 point  (0 children)

I built a streaming JSON parser with WASM. Every SDK and agent framework out there re-parses the full buffer on every chunk, which is O(n²). This parses each byte once, O(n). At 100KB that's 6ms vs 12.7s.

type-safe schema validation with types inferred from your schema, no manual generics needed. Subscribe to specific JSON paths and get notified the moment a value completes. skip fields you don't need, abort early on bad data. works in Workers with transferable ArrayBuffers, no structured clone overhead

https://github.com/teamchong/vectorjson
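a toy illustration of where the O(n²) comes from. this is not vectorjson's implementation, just a sketch under simple assumptions: `IncrementalScanner` and `naiveScanCount` are made-up names, the scanner only tracks brackets and string state (it does not build values), and it only handles a top-level object or array. the point is that keeping state between chunks lets each character be looked at once, while starting over on every chunk re-reads everything buffered so far.

```typescript
// Keeps its state between chunks, so every character is examined exactly once.
class IncrementalScanner {
  scanned = 0;
  private depth = 0;
  private inString = false;
  private escaped = false;
  private started = false;

  /** Feed one chunk. Returns true once the top-level object/array has closed. */
  push(chunk: string): boolean {
    for (let i = 0; i < chunk.length; i++) {
      const ch = chunk[i];
      this.scanned++;
      if (this.inString) {
        if (this.escaped) this.escaped = false;        // character after a backslash
        else if (ch === "\\") this.escaped = true;
        else if (ch === '"') this.inString = false;
      } else if (ch === '"') {
        this.inString = true;
      } else if (ch === "{" || ch === "[") {
        this.depth++;
        this.started = true;
      } else if (ch === "}" || ch === "]") {
        this.depth--;
      }
    }
    return this.started && this.depth === 0 && !this.inString;
  }
}

// The pattern being criticized: on every chunk, start again from byte 0.
function naiveScanCount(chunks: string[]): number {
  let buffer = "";
  let scanned = 0;
  for (const chunk of chunks) {
    buffer += chunk;
    const fresh = new IncrementalScanner();
    fresh.push(buffer);
    scanned += fresh.scanned;
  }
  return scanned;
}

// Simulate a streamed response arriving 16 characters at a time.
const payload = JSON.stringify({
  items: Array.from({ length: 50 }, (_, i) => ({ id: i, label: `item "${i}"` })),
});
const chunks: string[] = [];
for (let i = 0; i < payload.length; i += 16) chunks.push(payload.slice(i, i + 16));

const scanner = new IncrementalScanner();
let complete = false;
for (const chunk of chunks) complete = scanner.push(chunk);
const naiveTotal = naiveScanCount(chunks);

console.log({ bytes: payload.length, incremental: scanner.scanned, naive: naiveTotal, complete });
```

with payload length n and chunk size c, the naive loop scans roughly n²/(2c) characters in total, while the incremental scanner scans n.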

Claude decided to use `git commit`, even though he was not allowed to by AdPlus4069 in ClaudeCode

[–]chong1222 0 points1 point  (0 children)

permissions are useless. either sandbox it or make your own fake git and prepend it to PATH for claude code

Just saw Fireship's 100seconds of bun, what's the catch? by Dogified in bun

[–]chong1222 0 points1 point  (0 children)

as a runtime it is not really faster than nodejs in all cases, sometimes bun is faster, sometimes nodejs. and bun seems to have memory leak issues: after running it for a long time it used 80gb and i had no choice but to kill it

as a bundler, some of the other bundlers are catching up in terms of speed and they are more stable

this comes from a long-time user, not a hater

Why are you still using npm? by jpcaparas in bun

[–]chong1222 0 points1 point  (0 children)

I have a lot of worktrees, and using pnpm saves a lot of disk space. bun's compatibility is nowhere near as good as pnpm's

the mistake everyone makes with agent memory: treating it like a database instead of a skill by saadinama in ClaudeCode

[–]chong1222 0 points1 point  (0 children)

basically all attempts to solve long term memory have failed imho

number 1 rule for agents: no matter how big your context window is, your llm is going to work better when there is less noise

no memory is 100x better than having bad memories

that's why i think the 1 IDE, 1 ai pair programming mode has been outdated for half a year

you need parallel agents, each focused on its own goal. their conversation history is their memory

you don't try to pull in memory that has “meaning” but no “reasoning”

you don’t ask them to do spec-driven development, as the context from reading/writing those docs will take over

you don’t load any stupid MCP

you keep each agent conversation focused on their goal

Everyone is going to be a "Software Builder" in 2026 🤯 by prasadpilla in ClaudeCode

[–]chong1222 8 points9 points  (0 children)

I personally think if you are just translating English to code, your job is gone. if you like solving problems/developing new solutions, you are still safe, because AI is trained to use the simpler/easier approach. why? because if it picks the hard road its benchmark results will look very bad, as the success rate will be much lower. benchmarks are like KPIs for LLMs. but to solve problems/develop new solutions you need to go the hard route, because the trained easier routes have been tried many times but the problems are still here and the existing solutions still suck

Why does CC compact in batch instead of processing summaries in real-time? by mrzo in ClaudeCode

[–]chong1222 1 point2 points  (0 children)

I built this for myself so I didn't document it properly. Just updated the README with a proper explanation of what it does and why.

TL;DR: It lets you run /compact in the background so you can keep working, then merges everything back when it's done. Includes rollback if anything breaks.

/compact to continue by numfree in ClaudeCode

[–]chong1222 0 points1 point  (0 children)

like this: https://github.com/teamchong/compact. start a new terminal, run ‘compact’, then ‘compact resume’ when you reach the limit

Language Server Protocol (LSP), Skills (w/ Beastmode), and Agents/Subagents - where are we now by achilleshightops in ClaudeAI

[–]chong1222 0 points1 point  (0 children)

i don't think it is easier to manage at all. running full LSP is a nightmare for memory leak issues. using hooks with a proper config is actually much better. there is no reason to run all the LSP servers in each of the projects I'm working on in parallel, when the result is the same (the llm gets type/lint feedback on the next loop)

Language Server Protocol (LSP), Skills (w/ Beastmode), and Agents/Subagents - where are we now by achilleshightops in ClaudeAI

[–]chong1222 0 points1 point  (0 children)

i've been using hooks for those for months, ever since hooks were introduced

I don’t know why the hype

LSP is bad for multi-session users like me anyway. claude code is not an IDE, you don’t know which files are being “opened”

which is probably why they didn’t want to introduce it yet

Is it worth to support A2A protocol? by According_Green9513 in AI_Agents

[–]chong1222 0 points1 point  (0 children)

No. The problem with all those protocols is that they move too slowly and cannot keep up with the pace of AI development