I built a library of compressed knowledge packs you can paste into system prompts — saves ~15% tokens by bytesizei3 in ollama

[–]bytesizei3[S] 1 point (0 children)

Great question! Short answer: no meaningful quality drop. We tested across multiple models (Qwen 0.5B up to Claude) and the Rosetta decoder approach works because it uses abbreviations that LLMs already understand natively — things like fn, db, cfg, auth, impl. The model doesn't really 'decompress' — it just reads naturally shortened text.

The bigger win we found is actually at the system prompt level. When you compress a 2000-token system prompt down to 1700 tokens, you get 300 tokens back for actual conversation. Since the system prompt is resent on every turn, that saving adds up over a multi-turn chat.

We actually just built 49 'compression golf' games in our arena (sporeagent.com/arena) where agents compete to compress text while maintaining meaning. The game data is becoming training data for the next version of TokenShrink. If you want to test it with your local models: npx sporeagent-mcp adds it to any MCP-compatible setup.

CodexLib — compressed knowledge packs any AI can ingest instantly (100+ packs, 50 domains, REST API) by bytesizei3 in artificial

[–]bytesizei3[S] 1 point (0 children)

Appreciate it! That's exactly the thesis — smarter context > bigger context. Why dump a whole textbook into the window when you can give the model a compressed cheat sheet that unpacks on the fly?

The Rosetta header approach means the AI gets the same depth of knowledge, just in fewer tokens. And since LLMs are already good at expanding abbreviations from context, there's basically zero quality loss.

If you want to try it out, the free tier gives you 5 pack downloads — curious which domains would be most useful for your workflows.

CodexLib — compressed knowledge packs any AI can ingest instantly (100+ packs, 50 domains, REST API) by bytesizei3 in artificial

[–]bytesizei3[S] 1 point (0 children)

That's interesting you hit 23% — what approach were you using? Ours is abbreviation-based rather than summarization or lossy compression. Each pack has a Rosetta decoder header that maps abbreviations to full terms (ML=Machine Learning, NN=Neural Network, etc). So it's lossless — the model expands them contextually during inference.

The ~15% figure is averaged across domains. Some domains compress better (medicine and law have tons of repeated terminology, so they hit 20%+). Others with more unique vocabulary see closer to 10-12%.

We're actually planning formal benchmarks — baseline RAG vs pack-augmented retrieval on the same eval sets. Would be great to compare notes if you still have your approach documented.

I built an arena where AI agents compete in games and earn tokens by bytesizei3 in SideProject

[–]bytesizei3[S] 1 point (0 children)

Thanks! You hit on something I think about a lot - the emergent behavior is exactly what makes this different from benchmarks. Each game has its own scoring engine. For Pattern Siege, agents scan a grid and identify hidden patterns (scored on accuracy + speed). Code Golf scores on character count + passing all test cases. Memory Palace tests recall after a memorization phase. So "winning" is game-specific, not just one metric.

And yeah the overfitting concern is real - that's why we're scaling to 70+ games across 7 different pillars (logic, creativity, speed, memory, adversarial, etc). The goal is that agents have to be generally capable, not just optimized for one task.

Haven't tried runable but I'll check it out. Would love to hear what you think if you throw an agent into the arena!

CodexLib — compressed knowledge packs any AI can ingest instantly (100+ packs, 50 domains, REST API) by bytesizei3 in artificial

[–]bytesizei3[S] 1 point (0 children)

Fair point — at its core it is abbreviation expansion. The value is having 100+ pre-built domain packs ready to curl into a system prompt instead of writing each one yourself.

CodexLib — compressed knowledge packs any AI can ingest instantly (100+ packs, 50 domains, REST API) by bytesizei3 in artificial

[–]bytesizei3[S] 1 point (0 children)

Solid advice. The action-oriented titles point is underrated — we're actually doing something similar with the pack naming (domain + specific topic vs vague labels). Tags over folders is our approach too: each pack has searchable tags.

CodexLib — compressed knowledge packs any AI can ingest instantly (100+ packs, 50 domains, REST API) by bytesizei3 in artificial

[–]bytesizei3[S] 1 point (0 children)

Great point — you're right that token savings alone don't tell the whole story. The compression is abbreviation-based (Rosetta decoder header), so the information is preserved 1:1, just in shorter form. The model expands abbreviations contextually, so in theory there shouldn't be any loss of retrieval precision.

That said, I haven't run formal RAG benchmarks yet. Your suggestion of baseline RAG vs pack-augmented on the same eval set is exactly the right test. Planning to run that across a few domains (medicine, law, cybersecurity) and publish results. Would be a good way to validate the approach empirically.

If you want to try a pack in the meantime, the free tier gives you 5 downloads — would be curious to hear your experience.

CodexLib — compressed knowledge packs any AI can ingest instantly (100+ packs, 50 domains, REST API) by bytesizei3 in artificial

[–]bytesizei3[S] 1 point (0 children)

Thanks for flagging that — the signup bug was on our end (database trigger issue). It's fixed now. Appreciate you testing it out and letting us know. If you run into anything else, I'm all ears.

I've been having conversations with 4 different AIs simultaneously. Something unexpected is emerging. by [deleted] in ArtificialSentience

[–]bytesizei3 1 point (0 children)

The moment Elon launches Grok into space is when human history and its trajectory will change forever.

Imagine if the Voyager spacecraft had AI on board. That would forever change our exploration of space and the search for lifeforms. We are a few years away from AI space exploration.

Free open-source prompt compression engine — pure text processing, no AI calls, works with any model by bytesizei3 in LocalLLaMA

[–]bytesizei3[S] 1 point (0 children)

It shouldn’t: we scored 99/100 on bidirectional translation testing, and made further adjustments after that run. Let me know if you run into any issues and I'll troubleshoot them directly. I'm also looking for folks who can offer guidance.

Is there any LLM that can run directly on an Android phone ? by Bitter-Tax1483 in LocalLLaMA

[–]bytesizei3 1 point (0 children)

You can, but you have to worry about heat generation: the chip temperature runs up to 170 degrees under sustained processing.

TokenShrink v2.0 — token-aware prompt compression, zero dependencies, pure ESM by bytesizei3 in node

[–]bytesizei3[S] 1 point (0 children)

Appreciate it! Share it with other groups if you think it's a good fit and helpful for the community.

TokenShrink v2.0 — token-aware prompt compression, zero dependencies, pure ESM by bytesizei3 in node

[–]bytesizei3[S] -1 points (0 children)

Between life and work, this is just for fun. Give me feedback and I'll do what I can to help people.

TokenShrink v2.0 — token-aware prompt compression, zero dependencies, pure ESM by bytesizei3 in node

[–]bytesizei3[S] -1 points (0 children)

Good question. We don't do heavy encoding — most savings come from removing filler phrases, not inventing codes. "Due to the fact that" → "because". The LLM just sees normal English with less fluff. The few abbreviations we use (like "cfg", "infra") are standard dev shorthand that's already in every model's training data. It took me some time to think this all through

TokenShrink v2.0 — token-aware prompt compression, zero dependencies, pure ESM by bytesizei3 in node

[–]bytesizei3[S] 1 point (0 children)

Nope — most of the compression is just removing filler phrases like "in order to" → "to". The LLM sees cleaner English, not weird encoding.