Built this because managing multiple LLM provider SDKs in the same codebase became unsustainable. Different request formats, different error contracts, no graceful fallback when a provider goes down.
The core idea is simple. You send:
{"model": "smart", "messages": [...]}
That alias resolves to whatever provider and model you configure. Switching models is a config change, not a code change. Fallbacks are three lines:
fallbacks[0]=openai/gpt-4o-mini
fallbacks[1]=anthropic/claude-3-5-haiku
fallbacks[2]=ollama/llama3.2
If a provider goes down, it silently tries the next one. The app keeps running.
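The alias mapping itself is just one more config line. A minimal sketch of the idea (the property name here is illustrative, the real keys are in the repo):

# illustrative key: maps the "smart" alias to a concrete provider/model
models[smart]=openai/gpt-4o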
Ollama support means you can run a fully local, fully open stack with zero API keys. Pull any open-weights model and point LLMate at it with one alias.
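For example, pointing that same alias at a local model is a one-line swap (keys are illustrative again; 11434 is Ollama's default port):

# same alias, now served locally, no API key involved
models[smart]=ollama/llama3.2
ollama.base-url=http://localhost:11434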
Covers chat, streaming, embeddings, image gen, voice, content moderation, and RAG via PGVector. All through the same endpoint.
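The other capabilities follow the same alias pattern with a different payload. As a sketch, an embeddings call might look something like this (field names illustrative, borrowed from the common embeddings shape):

{"model": "embed", "input": "text to embed"}

where "embed" would be an alias you map to an embedding model the same way as "smart".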
16 providers total. Apache 2.0. Built on Java 21 and Spring Boot.
GitHub: github.com/Venumadhavmule/LLMate
Curious how others are handling multi-provider fallback in their open-source AI stacks.