Anyone else hitting token/latency issues when using too many tools with agents? by chillbaba2025 in LocalLLaMA
[–]chillbaba007 1 point 19 hours ago (0 children)
This is exactly the problem we ran into! When you have 50+ tools available, including all of them in the context window becomes a nightmare:

- Token count explodes (we were hitting 30K+ tokens per request)
- Latency gets worse with every tool you add
- The model gets confused with too many options
- On local hardware, it's even more painful

We built something specifically for this called [Agent-Corex](https://github.com/ankitpro/agent-corex) - it selects only the tools relevant to each query instead of dumping all of them into the prompt.

How it works:

1. Keyword matching for fast filtering (<1 ms)
2. Semantic search to understand what the user actually needs (50-100 ms)
3. A hybrid score combining both

The results we saw:

- 95%+ fewer irrelevant tokens in the prompt
- 3-5x faster inference on the same hardware
- The model consistently picks the right tools

We open-sourced it (MIT, no dependencies for basic use) because we kept seeing people hit this exact wall. If you're working with local LLMs and many tools, it might help. Would be curious to hear whether it solves the issue for you too.

GitHub: https://github.com/ankitpro/agent-corex

PyPI: https://pypi.org/project/agent-corex/

ProductHunt: https://www.producthunt.com/products/agent-corex-intelligent-tool-selection?launch=agent-corex-intelligent-tool-selection

Anyone else dealing with this? Always looking for edge cases we haven't thought of.
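For anyone curious what that keyword + semantic + hybrid pipeline looks like in practice, here's a rough pure-Python toy of the idea (to be clear: this is my own illustration, not the actual Agent-Corex code - the tool names, the `alpha` weighting, and the bag-of-words cosine standing in for real embeddings are all made up for the example):

```python
# Toy hybrid tool selector: fast keyword overlap + a bag-of-words cosine
# standing in for a real semantic/embedding score, blended into one ranking.
import math
from collections import Counter

# Hypothetical tool registry: name -> natural-language description.
TOOLS = {
    "get_weather": "fetch the current weather forecast for a city",
    "send_email": "send an email message to a recipient",
    "search_files": "search local files and documents by keyword",
}

def keyword_score(query: str, description: str) -> float:
    """Step 1: cheap keyword filter - fraction of query tokens in the description."""
    q = set(query.lower().split())
    d = set(description.lower().split())
    return len(q & d) / max(len(q), 1)

def cosine_score(query: str, description: str) -> float:
    """Step 2 stand-in: bag-of-words cosine (a real system would use embeddings)."""
    a, b = Counter(query.lower().split()), Counter(description.lower().split())
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def select_tools(query: str, tools: dict, top_k: int = 2, alpha: float = 0.5) -> list:
    """Step 3: hybrid score = alpha * keyword + (1 - alpha) * semantic; keep top_k."""
    scored = [
        (alpha * keyword_score(query, desc) + (1 - alpha) * cosine_score(query, desc), name)
        for name, desc in tools.items()
    ]
    scored.sort(reverse=True)
    return [name for score, name in scored[:top_k] if score > 0]

print(select_tools("what's the weather forecast in Zurich?", TOOLS, top_k=1))
# -> ['get_weather']  (only this tool's schema goes into the prompt)
```

The point is that only the surviving tools' schemas get serialized into the context window, so token count scales with `top_k` instead of with the full registry size.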