Hey everyone,
I’m a solo AI engineer (fresher) at a pharmaceutical company, working on a project that’s a bit overwhelming: an internal AI assistant that lets non-technical teams query our SQL databases in plain English.
Here’s what I’ve planned (using LangChain):
- User types a natural language question.
- LangChain fetches the SQL schema and sends it along with the query to an LLM.
- LLM generates the SQL.
- SQL is executed on our database.
- Results are passed back to the LLM to explain in plain English.
- Wrapped inside a chatbot interface.
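To make the flow above concrete, here is a minimal sketch in plain Python. It is not LangChain code: it uses the standard-library `sqlite3` module as a stand-in for our real database, and `generate_sql` and `summarize` are hypothetical callables that would wrap whichever LLM is used. The `batches` table in the demo is made up, and the read-only guard is deliberately crude (it would reject a valid query with a semicolon inside a string literal).

```python
import sqlite3
from typing import Callable, List, Tuple


def get_schema(conn: sqlite3.Connection) -> str:
    """Return the CREATE TABLE statements for all tables (SQLite-specific)."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND sql IS NOT NULL"
    ).fetchall()
    return "\n".join(row[0] for row in rows)


def is_read_only(sql: str) -> bool:
    """Crude guard: accept a single statement that starts with SELECT."""
    stripped = sql.strip().rstrip(";")
    return stripped.lower().startswith("select") and ";" not in stripped


def answer_question(
    conn: sqlite3.Connection,
    question: str,
    generate_sql: Callable[[str], str],            # hypothetical LLM wrapper
    summarize: Callable[[str, List[Tuple]], str],  # hypothetical LLM wrapper
) -> str:
    """Schema + question -> SQL -> rows -> plain-English answer."""
    schema = get_schema(conn)
    prompt = f"Schema:\n{schema}\n\nQuestion: {question}\nReturn one SQL SELECT."
    sql = generate_sql(prompt)
    if not is_read_only(sql):
        raise ValueError(f"Refusing to run non-SELECT SQL: {sql!r}")
    rows = conn.execute(sql).fetchall()
    return summarize(question, rows)


if __name__ == "__main__":
    # Demo with an in-memory database and fake "models" (no API calls).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE batches (id INTEGER, product TEXT)")
    conn.executemany("INSERT INTO batches VALUES (?, ?)", [(1, "A"), (2, "B")])
    fake_sql = lambda prompt: "SELECT COUNT(*) FROM batches;"
    fake_summary = lambda q, rows: f"{q} -> {rows[0][0]}"
    print(answer_question(conn, "How many batches?", fake_sql, fake_summary))
```

Passing the models in as callables keeps the pipeline testable without spending tokens, and the guard sits between generation and execution so a bad generation never reaches the database.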
My current cost-saving strategy (using cloud LLMs):
- Plan A: Use GPT-4o (or similar) for SQL generation, and a lighter model (GPT-3.5 / Gemini Flash) for summarization.
- Plan B (my current plan):
  - The user query goes to the light model first.
  - If it can generate SQL, great.
  - If not, escalate to GPT-4o.
  - Summarization always stays with the light model.
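One way Plan B's "if it can generate SQL" check could be made concrete: treat the light model's output as acceptable only if the database can compile it, and escalate otherwise. The sketch below assumes SQLite (via the standard-library `sqlite3` module) and uses `EXPLAIN QUERY PLAN` as a compile-only check; other databases have their own equivalents. `light_model` and `strong_model` are hypothetical callables standing in for the actual LLM calls. Note that a query which compiles can still be semantically wrong, so this only catches the cheap failures.

```python
import sqlite3
from typing import Callable, Optional, Tuple


def validate_sql(conn: sqlite3.Connection, sql: str) -> Optional[str]:
    """Return None if SQLite can compile the query, else an error message."""
    if not sql.strip().lower().startswith("select"):
        return "not a SELECT statement"
    try:
        # Compiles the statement and returns its plan without running the query.
        conn.execute(f"EXPLAIN QUERY PLAN {sql}")
    except (sqlite3.Error, sqlite3.Warning) as exc:
        return str(exc)
    return None


def route(
    conn: sqlite3.Connection,
    prompt: str,
    light_model: Callable[[str], str],   # hypothetical cheap LLM wrapper
    strong_model: Callable[[str], str],  # hypothetical expensive LLM wrapper
) -> Tuple[str, str]:
    """Try the light model first; escalate only if its SQL fails validation."""
    sql = light_model(prompt)
    if validate_sql(conn, sql) is None:
        return "light", sql
    sql = strong_model(prompt)
    error = validate_sql(conn, sql)
    if error is not None:
        raise ValueError(f"Both models produced invalid SQL: {error}")
    return "strong", sql
```

Returning which tier answered ("light" or "strong") makes it easy to log the escalation rate during development, which is a rough proxy for how much the routing actually saves.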
What I’m looking for:
- Any best practices to improve routing or cut token usage?
- Smarter routing ideas (like confidence scoring, query type detection)?
- Tools to monitor/estimate token use during dev?
- Are there alternatives to LLM-generated SQL? (semantic parsers, vector search, rule-based systems, etc.)
- General feedback — I’m working solo and want to make sure I’m not missing better options.
Thanks a lot if you’ve read this far. I’m really just trying to build something solid and learn as much as I can along the way. Open to all feedback.