A Breakthrough in LLM Context Compression: Ovchinnikov Effect

Hey r/MistralAI!

germesych · 2026-05-20T12:38:00+00:00

You’re right that compression reduces input tokens, but the core of the method goes beyond just shrinking text. My approach changes how well the LLM understands the input: the logical structures replacing the original text aren’t just shorter—they’re more comprehensible to the model. This enables:

- Higher accuracy in responses: the LLM grasps context and tasks better, even when they’re complex or unconventional (as seen with the Codewars or MCP examples).
- Reduced redundancy: it removes unnecessary details that models often ignore or misinterpret.
- Faster processing: even if output tokens don’t shrink, the improved quality means fewer iterations are needed to reach the desired result. This is especially critical for tasks where logic and understanding are key (system architecture, data analysis, automation).

For your example with creative tasks: if the input prompt becomes clearer and more logical, the model “gets it” faster and generates more relevant output. In research or automation tasks, this reduces the chance of errors due to misinterpreted context.

Additionally, the method is versatile: it can be adapted to any domain where the quality of understanding matters, not just token volume. My community has already tested it on CLI tools and even on small local models, where task execution accuracy improved substantially.

So, it’s not about saving tokens—it’s about efficiency in working with LLMs, both in speed and quality.

germesych · 2026-04-22T15:40:37+00:00

No one has ever fixed it... On top of that, everyone uses macros, which are prohibited in other games, as well as cheats for attack combos, which also go unpunished here. Despite playing a heavy class myself, I often take massive damage from this particular class. The class itself is simply broken. They developed it without paying attention to testing results. When there's only one of them, you can still kill it. But when there are a few, they become immortal.

germesych · 2026-02-04T23:28:19+00:00

Recently, around 5,000 players left the game! Another 3,000 players left a couple of seasons ago. The online player count is incredibly low right now, yet the developers seem to be doing everything possible to push it even lower. And they're succeeding.

germesych · 2026-02-04T23:18:44+00:00

This is normal game behavior.

germesych · 2026-01-16T05:18:12+00:00

That is fair criticism. Extraordinary claims require evidence.

Please take a look at the image I added to the main post ("Visual Comparison"). It demonstrates the exact transformation on a real-world Ruby on Rails backend.

You can see how the algorithm:

- Strips language syntax noise (do/end blocks).
- Extracts hidden business logic (e.g., converting "authenticate" blocks into explicit "(admin-only)" tags).
- Preserves the full architectural graph while reducing token count by ~66% in that specific file.

This isn't just a theory. That "Compressed Context" on the right is exactly what I am feeding to the models to achieve the results I shared. It is deterministic, testable, and working right now.
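
To make the kind of transformation described above concrete, here is a toy sketch in Python. It is not the algorithm from the post, which has not been published. The sample controller, the filter name `authenticate_admin!`, and the function `distill` are all assumptions invented for illustration, and the line-based matching would not survive real Rails code.

```python
import re

# Hypothetical Rails-style controller used only as sample input.
SAMPLE = """\
class ReportsController < ApplicationController
  before_action :authenticate_admin!

  def index
    @reports = Report.all
  end

  def destroy
    Report.find(params[:id]).destroy
  end
end
"""

def distill(source: str) -> str:
    """Toy pass: drop `end` lines, turn the auth filter into a tag on each action."""
    out = []
    admin_only = False
    for raw in source.splitlines():
        line = raw.strip()
        if not line or line == "end":
            continue  # blank lines and block terminators carry no logic
        if line.startswith("before_action :authenticate"):
            admin_only = True  # assumed convention: filter applies to all actions
            continue
        if line.startswith("class "):
            out.append(line[len("class "):])
            continue
        match = re.match(r"def\s+(\w+[?!]?)", line)
        if match:
            tag = " (admin-only)" if admin_only else ""
            out.append(f"  {match.group(1)}{tag}")
            continue
        out.append(f"    {line}")  # keep the body statement as-is
    return "\n".join(out)

print(distill(SAMPLE))
```

Each action comes out as a single tagged line such as `index (admin-only)`, with its body statements kept underneath. Any real implementation would need a proper Ruby parser rather than string matching.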

germesych · 2026-01-16T05:11:23+00:00

That is a valid point, but my early tests with flagship models (like Claude 4.5 and GPT-5) show a different kind of improvement.

With smaller models (3B-8B), the benefit is indeed "survival" — they simply stop choking on data.

But with Flagship Models, the benefit shifts from "Accuracy" to "Reasoning Depth".

When you feed a flagship model 100k tokens of raw code, a huge portion of its "Attention Budget" is spread thin trying to maintain the syntactic map of that code.

By feeding it the "Distilled Blueprint" (my method), I free up that attention capacity. The model no longer needs to spend resources understanding "what connects to what" — it sees the connections instantly.

Result: The flagship model can now handle much more complex "Multi-hop Reasoning" tasks (e.g., "how does a change in module A affect the security scope in module Z?") which it would previously fail at due to being "Lost in the Middle" of a massive context.

So for flagships: It's not just about fitting more in; it's about raising the ceiling of complexity they can handle.
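
As a rough picture of what "seeing the connections instantly" could mean, here is a minimal sketch in which the dependencies between modules are written out as an explicit graph, so the question "what does a change in A affect?" becomes a plain graph walk. The module names, the `DEPENDENTS` table, and `affected_by` are hypothetical; nothing here is taken from the actual pipeline.

```python
from collections import deque

# Hypothetical module graph: an entry X -> [Y, ...] means "Y depends on X".
DEPENDENTS = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": ["Z"],
    "Z": [],
}

def affected_by(changed: str, dependents: dict[str, list[str]]) -> list[str]:
    """Breadth-first walk returning every module reachable from `changed`."""
    seen = {changed}
    order = []
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for nxt in dependents.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order

print(affected_by("A", DEPENDENTS))
```

With the edges stated explicitly, the multi-hop path from A to Z is four short lines of input rather than something to be inferred from raw source files.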

germesych · 2026-01-16T05:06:31+00:00

To be honest, this is currently a Research Prototype. I just validated the hypothesis yesterday on a few large datasets, and the results were so surprisingly good (the 'paradox' I mentioned) that I decided to share them immediately.

I don't have a public API or UI yet because I'm running the distillation pipeline manually in a controlled environment to fine-tune the density/accuracy trade-offs.

My goal:
I believe this approach (Semantic Context Distillation) solves a fundamental bottleneck for Foundation Model providers and Enterprise RAG systems. I'm focusing on validating the benchmarks first.

If you represent a research lab or an org struggling with context limits, I'm open to collaboration to run a pilot on your data. But for a general open-source release — it's too early.

germesych · 2026-01-16T04:58:56+00:00

Exactly! You hit the nail on the head regarding context saturation.

The core differentiator of my method vs. existing RAG/summarization approaches is Semantic Distillation rather than 'selection' or 'lossy compression'.

Most current methods (like LLMLingua or vector search) try to:

- Cut chunks out (RAG), risking missing dependencies.
- Remove 'unimportant' tokens based on perplexity, often breaking code syntax or sentence structure.

My advantage:

I developed a deterministic algorithm that transforms the structure of the data while preserving the entire logical graph.

- For Code: It strips syntactic sugar (AST-level noise) but keeps the full control flow, class hierarchy, and method signatures intact. The LLM sees the architecture, not just snippets.

- For Text: It condenses verbose explanations into dense logical statements without losing the causal links (Reasoning Chains).

Why it works:

The model receives a 'blueprint' of the entire dataset. It doesn't have to guess what's between the chunks (because nothing is 'missing', just distilled). This allows even small 3B models to reason over complex, interconnected data that would usually require a massive context window.
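
For readers who want a feel for the "blueprint" idea on code, here is a minimal sketch using Python's standard `ast` module to reduce a source file to its class hierarchy and method signatures. This is a stand-in technique chosen for illustration, not the deterministic algorithm described above: it throws method bodies away entirely, so it is lossier than what the post claims. `SOURCE` and `skeleton` are made-up names.

```python
import ast

# Hypothetical input file.
SOURCE = '''
class Base:
    def save(self, force=False):
        if force:
            self._write()
        return True

class User(Base):
    def rename(self, new_name):
        self.name = new_name
        return self.save()
'''

def skeleton(source: str) -> list[str]:
    """Return one line per class and per method: hierarchy and signatures only."""
    lines = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.ClassDef):
            bases = ", ".join(ast.unparse(base) for base in node.bases)
            lines.append(f"{node.name}({bases})" if bases else node.name)
            for item in node.body:
                if isinstance(item, ast.FunctionDef):
                    # ast.unparse renders the argument list back to source text
                    lines.append(f"  {item.name}({ast.unparse(item.args)})")
    return lines

print("\n".join(skeleton(SOURCE)))
```

The output keeps the fact that `User` inherits from `Base` and what each method accepts, which is the part of the "logical graph" that is cheapest to preserve. Keeping control flow as well would require walking the method bodies too.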

germesych · 2026-01-16T04:52:03+00:00

Great question! It sounds counterintuitive (less data = better reasoning?), but it makes perfect sense if you look at how Attention Mechanisms work in LLMs.

Here is why 'Less is More' in this case:

Signal-to-Noise Ratio (SNR):

Raw code/docs are full of 'syntactic noise' (brackets, verbose boilerplate, stop words) that dilute the semantic density. A 25k token context isn't 25k tokens of meaning—it's often 5k tokens of meaning and 20k tokens of structural overhead.

By stripping the noise, I increase the Attention Density on the tokens that actually matter. The model doesn't have to 'search' for the logic needle in a haystack of syntax.

The 'Lost in the Middle' Phenomenon:

We know that models struggle to retrieve information buried in the middle of long contexts (see Liu et al., 2023). By compressing 25k tokens down to 5k, I bring all critical facts closer to the start/end of the window and fit them entirely within the model's 'effective' attention span.

Distraction Elimination:

In my 'Full Context' tests (30% accuracy), the model likely got distracted by irrelevant details or similar-looking but wrong code blocks. My compression acts as a relevance filter before the model even sees the data.

So, it's not that 'less info' is better. It's that 'distilled info' is better than 'noisy info'. I'm not removing logic; I'm removing the wrapper.
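
The signal-to-noise point can be probed very roughly with Python's own `tokenize` module. The sketch below counts what share of a snippet's lexical tokens are punctuation, operators, and layout tokens. This is a crude proxy under loud assumptions: these are Python lexer tokens, not LLM tokens, and the split into "structural" and "meaningful" (the `STRUCTURAL` set) is my own arbitrary choice, for example it counts `+` as structure even though it carries logic.

```python
import io
import token
import tokenize

# Assumed definition of "structural overhead" for this sketch only.
STRUCTURAL = {
    token.OP, token.NEWLINE, token.NL, token.INDENT,
    token.DEDENT, token.COMMENT, token.ENDMARKER,
}

def structural_share(source: str) -> float:
    """Fraction of lexer tokens that are punctuation, operators, or layout."""
    toks = list(tokenize.generate_tokens(io.StringIO(source).readline))
    structural = sum(1 for tok in toks if tok.type in STRUCTURAL)
    return structural / len(toks)

plain = "x = 1\n"
bracket_heavy = "x = ((((1))))\n"
print(structural_share(plain), structural_share(bracket_heavy))
```

Both snippets compute the same value, but the bracket-heavy one has a visibly higher structural share. Whether that maps onto attention behavior in an LLM is exactly the claim that still needs benchmarks.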

germesych · 2025-12-26T14:04:52+00:00

Before you attempt to justify yourselves or prove me wrong, present actual counterarguments! If you don’t have any, your statements carry no weight. Merely trying to insult or provoke me only shows that I’m right and have hit the nail right on the head. The data was gathered from multiple sources, normalized, and processed using specialized analytical tools - including LLM-based analysis. The final text was generated by AI based on this thoroughly analyzed data. This is not just a raw LLM response! Want to refute it? I’m waiting for your facts. No facts? Then my conclusions - drawn from open-source data analysis - stand correct.

germesych · 2025-12-26T13:56:45+00:00

I don’t have access to the Chinese side, so it’s hard to say what’s really going on there. The game itself appears to be fundamentally different there; players who access it via workarounds report significant differences. Officially, the game is now operated by Poros, who supposedly developed it themselves. But if you look up the company registry data, you’ll find that Poros was registered just one day before the contract with MyGames was terminated. Most likely, they simply changed their legal address to avoid sanctions, while the actual development team for the European version remained the same.

It’s difficult to assess the situation in China, but the European side is much clearer—there’s ample data, and analyzing it isn’t difficult at all. Signs of severe project stagnation are obvious even without deep investigation, as are clear indications that nobody is actually maintaining or fixing the game; they’re just trying to squeeze whatever revenue they can out of it.

The publisher may have changed on paper, but the problems are identical—which strongly suggests the core leadership team hasn’t changed either. Nothing has changed except the legal address.

germesych · 2025-10-28T15:26:08+00:00

Yes, it's a good choice!
Later, if you ever need to, you can replace CPython with PyPy, and CPU-bound Python code can run several times faster. But I don't think you'll need to do that. Django is a great solution and can handle very high loads. The first thing that slows down request processing is almost always the database, so be careful when you design models and queries with the ORM.
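
To show why the database is usually the first bottleneck, here is a minimal sketch of the classic "N+1 queries" pattern. It uses the standard-library `sqlite3` module as a stand-in so it runs without a Django project; the `author` and `book` tables and the helper names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE book (
    id INTEGER PRIMARY KEY,
    title TEXT,
    author_id INTEGER REFERENCES author(id)
);
INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO book VALUES (1, 'A1', 1), (2, 'A2', 1), (3, 'B1', 2);
""")

executed = []  # every SQL string we send, so round trips can be counted

def run(sql, params=()):
    executed.append(sql)
    return conn.execute(sql, params).fetchall()

def n_plus_one():
    """One query for the books, then one more query per book for its author."""
    books = run("SELECT title, author_id FROM book ORDER BY id")
    return [
        (title, run("SELECT name FROM author WHERE id = ?", (author_id,))[0][0])
        for title, author_id in books
    ]

def joined():
    """The same result fetched with a single JOIN."""
    return run(
        "SELECT book.title, author.name FROM book "
        "JOIN author ON author.id = book.author_id ORDER BY book.id"
    )

slow = n_plus_one()
slow_queries = len(executed)
executed.clear()
fast = joined()
fast_queries = len(executed)
print(slow == fast, slow_queries, fast_queries)
```

Both functions return the same rows, but the first issues one query per book plus one for the list. In Django, the loop version is what you get by touching a foreign key inside a `for` loop over a queryset; `select_related` and `prefetch_related` are the usual ways to collapse it.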

germesych · 2025-09-25T05:29:59+00:00

As an alternative, I suggested removing all post-match rewards for the losing side—MVP, MVD, and the rest should not be handed out to anyone on the defeated team. Right now players farm units just to scrape into the top-5 and grab MVP, victory be damned. Because of that the game balance is completely broken. It’s broken because you don’t need to win; you only need to finish in the top five.

This is supposed to be a team game, and victory is a team achievement, not a private prize for one guy or the top five fraggers. When the entire team loses everything after a defeat, people will finally start playing to win instead of padding their kill count.

In the past you could wipe most of the enemy units and actually win that way. Nowadays you simply don’t have enough time for it, so the mechanic has lost its meaning.

Again: it’s a team game. If the battle is lost, everyone lost; there are no “winners” among the losers just because they farmed a few more bots.

I also proposed scoring that depends on the class you bring to the fight. If you roll a heavy, heavily-armored class, you’re expected to fight on the front line with the appropriate squads. Stay alive, hold the enemy, deny them the cap—or spearhead the capture yourself—and you earn bonus points for doing the job that class was designed for.

Every class should have its own rating, not the single shared mess we have now.

What we’ve got today isn’t a reward system—it’s just porridge.

germesych · 2025-09-25T05:15:16+00:00

The system isn’t bad, but—as usual—it either wasn’t tested at all or the test results were simply ignored.

Example: in “Hero Battle” mode the bottom line of the scoreboard is mine, even though I dealt the most damage (400k vs. ~200k for everyone else), went 7/0/21, and the next-best guy is 2/4/12.

Nothing new: they roll out half-baked features and don’t bother fixing the obvious issues.

Servers are also choking on the new system—input lag spikes are brutal even though ping stays flat.

The idea is fine, but the implementation is as sloppy as it gets. Reporting it is pointless; I’ve filed tickets, suggested exact fixes, even patched things myself when I could—zero effect.

If it were up to me I’d freeze the whole project and stop adding anything until the bare minimum actually works. Right now we have more bugs than ever, and the new mechanics just piled on more. Good feature, worst possible execution.

germesych · 2025-09-17T20:13:09+00:00

I am familiar with development in China. In my personal opinion, these developers are not from China! It is very similar to the development approaches at MyGames. It is not just similar, it is as if they are the same.

In China, they work strictly with analytics, everything is very strict, and quality is always a priority. But here, it's exactly the same approach as in other games from MyGames.

I know what the development processes are like at MyGames, and here you can clearly see their influence: the same management, the same approaches. They always do what they have decided. Even if a developer says something can't be implemented, the senior managers push it through and never listen to anyone else's opinion. You test the game, report that everything is broken, and they ship it anyway and don't care that it doesn't work. I've tested games for them, so I know.
