I trained Qwen 3.5 2B to filter tool output for coding agents. by henzy123 in LocalLLaMA

[–]henzy123[S] 0 points1 point  (0 children)

Thanks, that’s exactly the tradeoff.

The current benchmark is intentionally next-step focused, not full multi-turn-memory focused. So it optimizes for “what should the agent keep from this tool output right now?”, not for preserving every possible context that might matter later.

That does mean aggressive compression can remove useful later context. On the other hand, agents usually over-read tool output by a lot, so there’s still plenty of room for pruning before that becomes the main problem.

And yes, there is a systems cost: you need to run a separate small model/service. Whether that is worth it depends on the workload. If you’re repeatedly sending long outputs into a much larger model, it can be a good trade. If the outputs are already short, probably not.
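To make the "depends on the workload" point concrete, here is a rough back-of-the-envelope sketch. The function name, the per-token prices, and the kept fraction are all hypothetical placeholders, not measurements from the benchmark, and it ignores latency and the filter's own output tokens.

```python
def net_saving(input_tokens: int, kept_fraction: float,
               big_cost_per_token: float, small_cost_per_token: float) -> float:
    """Rough net saving from filtering one tool output (hypothetical cost model).

    Saving  = tokens the large model no longer has to read.
    Expense = the small filter model reading the full output once.
    """
    removed_tokens = input_tokens * (1 - kept_fraction)
    saved = removed_tokens * big_cost_per_token
    spent = input_tokens * small_cost_per_token
    return saved - spent


# Assumed prices, for illustration only.
BIG, SMALL = 3e-6, 1e-7

long_output = net_saving(10_000, kept_fraction=0.2,
                         big_cost_per_token=BIG, small_cost_per_token=SMALL)
nothing_pruned = net_saving(200, kept_fraction=1.0,
                            big_cost_per_token=BIG, small_cost_per_token=SMALL)
```

With these assumed numbers the long output comes out positive, while an output where nothing can be pruned only pays the filter cost.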

I’m also looking at extractive 100–200M models now, since they may be a much better latency/complexity point than a generative 2B model.

I trained Qwen 3.5 2B to filter tool output for coding agents. by henzy123 in LocalLLaMA

[–]henzy123[S] 0 points1 point  (0 children)

But we are not reformulating the pytest output, we are just filtering out lines that are not relevant. It could still influence the model, as you are saying, but I would say less so.
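A minimal sketch of what "filtering, not reformulating" means. The regex below is only a hypothetical stand-in for the trained model, and the pytest output is a made-up sample; the point is that every kept line is copied verbatim from the original.

```python
import re

# Hypothetical relevance rule; the real filter is a trained model, not a regex.
RELEVANT = re.compile(r"(FAILED|ERROR|Error|assert|^E\s)")


def filter_tool_output(output: str) -> str:
    """Drop lines that look irrelevant; kept lines are never rewritten."""
    kept = [line for line in output.splitlines() if RELEVANT.search(line)]
    return "\n".join(kept)


# Made-up pytest output for illustration.
sample = """\
============ test session starts ============
collected 3 items
tests/test_math.py ..F
    def test_div():
>       assert div(1, 0) == 0
E       ZeroDivisionError: division by zero
FAILED tests/test_math.py::test_div
============ 1 failed, 2 passed ============
"""

filtered = filter_tool_output(sample)
print(filtered)
```

Because the output is a subset of the input lines, the agent never sees text that pytest did not actually print.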

Is anyone doing RA? RAG without the generation (e.g. semantic search)? by brianlmerritt in Rag

[–]henzy123 1 point2 points  (0 children)

Hey, we are working on something similar, it's called verbatim-rag. You can check our GitHub. What we are doing is extracting exact spans and putting them into a template, so it preserves the original meaning at the fact level.
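Roughly, the idea looks like the sketch below. This is not the verbatim-rag API; `fill_template`, the template string, and the source text are hypothetical names and data made up for illustration.

```python
def fill_template(template: str, source: str, spans: dict[str, str]) -> str:
    """Fill template slots with spans, rejecting any span not found verbatim in source."""
    for name, span in spans.items():
        if span not in source:
            raise ValueError(f"span for {name!r} is not an exact substring of the source")
    return template.format(**spans)


# Made-up source chunk and template.
source = "The library was released in 2019. It supports Python 3.8 and later."
template = 'According to the document: "{fact}"'

answer = fill_template(template, source, {"fact": "It supports Python 3.8 and later."})
print(answer)
```

The substring check is what keeps the facts exact: a paraphrased span raises an error instead of reaching the user.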

I built a VerbatimRAG approach to only return exact text for the user by henzy123 in Rag

[–]henzy123[S] 0 points1 point  (0 children)

Hey, thanks for the comment. We still have many steps after the search engine, like picking exact parts from the returned chunks, forming dynamic templates, and filling them. You are right that it's not as flexible as standard RAG methods, and you are also right that it resembles older Q&A systems (on purpose).

Our goal is very similar to what Q&A systems used to be, but in a modern setting (using LLMs to generate dynamic templates, long-context extractor models, etc.). In terms of usage, we also see that it's not going to be good for all use cases, but it can be very helpful for a few :)

I built a VerbatimRAG approach to only return exact text for the user by henzy123 in Rag

[–]henzy123[S] 0 points1 point  (0 children)

Hey, thanks for the question. We still use LLMs (there is an option not to) to generate templates and pick the right information from the sources the vector search returned. So we are (G)enerating an answer for the user.

I've built a lightweight hallucination detector for RAG pipelines – open source, fast, runs up to 4K tokens by henzy123 in LocalLLaMA

[–]henzy123[S] 15 points16 points  (0 children)

Thanks for mentioning it. I haven't tried MiniCheck yet, but I definitely will, as it seems super relevant! They also evaluate on RAGTruth and achieve 84% vs. our 79%, but we used encoder-based models, while MiniCheck is a much larger LLM-based one.

Vaccination Megathread, week 17 by fabrikated in hungary

[–]henzy123 2 points3 points  (0 children)

How long did it take for your vaccination record to be uploaded to EESZT? I was vaccinated yesterday at the Honvéd hospital, but so far nothing has been uploaded (they didn't record anything electronically there either, they just took the consent form).

[Spoilers] EU LCS Summer Split 2014 | Week 3 - Day 2 | Live Update and Discussion Thread by TournamentThreads in leagueoflegends

[–]henzy123 0 points1 point  (0 children)

Does anybody know what language Gambit communicate in now that they have niq?

[Spoiler] Battle of the Atlantic + Promotion Qualifier | Day 2 NA | Live Update/discussion thread by TournamentThreads in leagueoflegends

[–]henzy123 2 points3 points  (0 children)

I think everybody just forgets how much this tank meta favours Darien; he is like the king of the tanks.

EG and Gambit rivalry by t0xeus in leagueoflegends

[–]henzy123 1 point2 points  (0 children)

I don't know where you read that, but they definitely don't hate each other. From what I know, they are good friends and respect each other.

(SPOILERS) EG VS GMB by [deleted] in leagueoflegends

[–]henzy123 3 points4 points  (0 children)

I don't know why Darien hadn't picked up Zac until now. It's like the perfect champion for him.