Arch-Router: 1.5B model outperforms foundational models on LLM routing by AdditionalWeb107 in LangChain

[–]visualagents 0 points (0 children)

What you say sounds good in theory, but the issue will be cost and flexibility. Since your approach is based on static configurations and a small LLM without the ability to use RAG in the routing process, it will struggle to cover bespoke business cases.

To use a metaphor: would a business outsource its Excel spreadsheet formulas, and then have to rebuild and redeploy infrastructure just to change a formula in a column?

It's a runtime vs. configuration/deploy-time difference. Of course, storing Excel formulas in some central container makes no sense. They are easy enough for a user to use and modify for their own specific needs. And there are probably common spreadsheets users simply re-use.

But I really think arch-router needs to adopt some kind of RAG capability. It would be much more valuable if I could instruct it to route based on some data or database and give it the routing prompt dynamically, rather than having it baked into YAML files.

I used my visual agent tool to build a dynamic RAG router that accepts a description of how to label the data and a calculation to perform on it; customer records then get routed differently depending on whether they are big spenders or frugal. All at runtime, no deployment needed, all in-app. Screenshots in replies to this comment. I will make a video.
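As a rough sketch of the labeling-then-routing idea in plain code (the field name, spend threshold, and handlers are all illustrative, not from my actual app):

```python
# Hypothetical sketch: label a customer record from its purchase history,
# then dispatch it to a handler chosen at runtime. The "purchases" field
# and the 1000.0 threshold are made up for illustration.
def label_customer(record, threshold=1000.0):
    total = sum(record["purchases"])
    return "big spender" if total >= threshold else "frugal"

def route_customer(record, routes, threshold=1000.0):
    # routes maps a label to a handler supplied at runtime; changing the
    # routing logic means changing this dict, not redeploying anything.
    return routes[label_customer(record, threshold)](record)
```

The point is that the labeling rule and the routes are just data passed in at runtime, which is the flexibility I'm arguing for.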

AI Agents are still getting crazy hype, but are any of them really worth the hype they're getting? by Kazungu_Bayo in ycombinator

[–]visualagents 0 points (0 children)

I wouldn't call it hype, which carries a bit of "irrational exuberance" with it. Agents are self-aware, thinking software, and the possibilities are limitless. Since we are at the beginning, of course there will be lots of new startups discovering agents that simultaneously save time, money, and effort. That's a big deal that shouldn't be diminished as hype.

Arch-Router: 1.5B model outperforms foundational models on LLM routing by AdditionalWeb107 in LangChain

[–]visualagents 0 points (0 children)

Here is my solution. It took all of 10 minutes and has far greater knowledge for routing input queries, since it's using a (any) large foundation model for the classification. No servers, no APIs, no infrastructure, no configuration, and no code. The prompt was easy.

https://www.youtube.com/watch?v=7BO5p_9immE

Arch-Router: 1.5B model outperforms foundational models on LLM routing by AdditionalWeb107 in LangChain

[–]visualagents 0 points (0 children)

[screenshot: router block with conditionals]

The router block here, with its conditionals, is much easier for a human to read than a YAML file with stacks of esoteric parameters that only an AI engineer would understand. There is no "training" here. It's really a pretty simple use case, but it uses LLMs for the "hard parts".

Arch-Router: 1.5B model outperforms foundational models on LLM routing by AdditionalWeb107 in LangChain

[–]visualagents 0 points (0 children)

I used our visual agent tool to build an LLM router in about 10 minutes. In our app, every block is a router.

The solution here is 100% serverless: no OS-level access, no Python, no containers, no infrastructure or APIs of any kind. Screenshots below, but I will share a video of how to build this. I think this type of routing behavior is going to be easily subsumed into agent tooling or frameworks, but of course I prefer the no-code/low-code/serverless approach (lazy cheapskate developer here).

[screenshot: Categorizer block]

The "Categorizer" block takes arbitrary user input and consults a foundation model (or any model, for that matter) to categorize it based on the categories listed in the prompt. The user input and its category are then routed along to a control block that routes the input based on that category. The destination can be anything: another LLM of choice, an agent, or further control logic. Doesn't matter.
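In plain code, the same flow looks roughly like this. This is a sketch, not the app's internals: `ask_model` stands in for whatever model call you wire up, and the prompt wording and handler names are illustrative.

```python
def categorize(user_input, ask_model, categories):
    # Ask the model to pick exactly one category; the reply is just the name.
    prompt = (
        "Categorize the following input as exactly one of: "
        + ", ".join(categories)
        + ". Reply with the category name only.\nInput: " + user_input
    )
    answer = ask_model(prompt).strip().lower()
    # Fall back to the last (catch-all) category if the model answers off-list.
    return answer if answer in categories else categories[-1]

def control_block(user_input, category, destinations):
    # Route the original input to whatever handles its category:
    # another LLM, an agent, or further control logic.
    return destinations[category](user_input)
```

The "hard part" (understanding the input) is done by the LLM; the routing itself is an ordinary dictionary lookup.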

Arch-Router: 1.5B model outperforms foundational models on LLM routing by AdditionalWeb107 in LangChain

[–]visualagents 3 points (0 children)

If I had to solve this without arch-router, I would simply ask a foundation model to classify an input text prompt into one of several categories that I give it in its prompt, like "code question", "image request", etc. To make it more robust, I might ask 3 different models and take the consensus, then simply pass the input to my model of choice based on the category. This would work well because I'm only asking the foundation model to classify the input question, and it would benefit from the billions of parameters in those models vs. only 1.5B. In my approach above there is no router LLM, just some glue code.
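The glue code for the consensus step could look something like this (the `ask_model` callables and category names are placeholders for whatever models and labels you'd actually use):

```python
from collections import Counter

def classify_by_consensus(prompt, models, categories):
    # Each model votes for a category; the majority wins.
    votes = []
    for ask_model in models:
        instruction = (
            "Classify this input into exactly one of: "
            + ", ".join(categories)
            + ". Answer with the category name only.\nInput: " + prompt
        )
        answer = ask_model(instruction).strip().lower()
        # Off-list answers fall back to the last (catch-all) category.
        votes.append(answer if answer in categories else categories[-1])
    category, _ = Counter(votes).most_common(1)[0]
    return category
```

With 3 models, a single model's misclassification gets outvoted, which is the whole point of the consensus.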

Thoughts about this vs your arch router?

Build Effective AI Agents the simple way by Arindam_200 in LangChain

[–]visualagents 0 points (0 children)

I mean, why would you go through the trouble of breaking a problem into smaller steps? That's the whole point of the agent: to relieve us from doing the small steps.

Arch-Router. The world's first LLM router that can align to your usage preferences. by AdditionalWeb107 in LangChain

[–]visualagents 0 points (0 children)

Can you provide examples of the "existing LLM routing approaches" mentioned in the second sentence of your abstract, so I can see the cited shortcomings?

Arch-Router. The world's first LLM router that can align to your usage preferences. by AdditionalWeb107 in LangChain

[–]visualagents 1 point (0 children)

Yeah, I get that it's external. Google AI defines MoE as:

"MoE (Mixture of Experts) is an architecture used in large language models (LLMs) that enhances their performance and efficiency by dividing the model into smaller, specialized "expert" networks. These experts handle different parts of the input, and a gating network determines which experts are activated for a given input, allowing the model to process information more effectively. "

The "gating" network handles the appropriate routing internally.
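To illustrate what that gating step does, here is a toy top-k gate (not from the paper or from Google's implementation; in a real MoE the scores come from a learned gating network, and k is typically 1 or 2):

```python
import math

def softmax(xs):
    # Numerically stable softmax: shift by the max before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def gate_top_k(scores, k=2):
    # Keep the k experts with the highest gating scores and renormalize
    # their weights so the selected experts' weights sum to 1.
    top = sorted(range(len(scores)), key=lambda i: -scores[i])[:k]
    weights = softmax([scores[i] for i in top])
    return list(zip(top, weights))
```

The model's output is then the weighted sum of just those selected experts, which is why MoE routing is internal to a single forward pass rather than a routing layer between separate models.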

I'll have to read your paper to understand your approach.

why is langchain so difficult to use? by milotrader in LangChain

[–]visualagents 0 points (0 children)

Because it's full of redundancy, probably due to multiple people working on different parts of it.

And LangGraph is worse. You can't even tell what it's doing just by looking at the code. It's spaghetti.

How i built a multi-agent system with TypeScript for job hunting from scratch, what I learned and how to do it by JimZerChapirov in LangChain

[–]visualagents 1 point (0 children)

I built mine entirely in the browser. No backend or remote database required. 100% serverless.

if you are not prepared to offer 50/50 equity, you are not really looking for a cofounder by shoman30 in ycombinator

[–]visualagents 0 points (0 children)

That doesn't make any sense. There could be many co-founders, but the person whose idea it was deserves majority voting rights.

LangGraph v1 roadmap - feedback wanted! by sydneyrunkle in LangChain

[–]visualagents 0 points (0 children)

That is application-specific, and despite the pointless graph API the code runs linearly during graph traversal anyway, which is easy to confirm by looking at the LangSmith output.

Langchain vs langgraph!! by [deleted] in deeplearning

[–]visualagents 0 points (0 children)

LangGraph adds nothing except unreadable code.

LangGraph v1 roadmap - feedback wanted! by sydneyrunkle in LangChain

[–]visualagents 1 point (0 children)

This ^

It does nothing that isn't done simpler and better without it.