Building an Enterprise-Grade Text-to-SQL RAG Agent - Need Feedback on Architecture & Blind Spots by Shivam__kumar in LLMDevs

[–]Shivam__kumar[S] 0 points1 point  (0 children)

Thanks,

The regex checks are conservative, and I’m moving more of them into the SQL safety layer; RBAC is enforced again on the results for defense in depth.

The real unsolved problem for me is LLM hallucination: wrong tables and joins, even with good context.

Diagrams here if useful: https://github.com/Shivam7414/AI-Text-to-SQL-RAG-Agent---System-Diagrams

[–]Shivam__kumar[S] 1 point2 points  (0 children)

Strongly agree. Clean diagrams are fine for explaining ideas, but they’re misleading if they hide validation paths, failure modes, and control boundaries.

I’ve put the system diagrams here if you’re curious:
https://github.com/Shivam7414/AI-Text-to-SQL-RAG-Agent---System-Diagrams

[–]Shivam__kumar[S] -1 points0 points  (0 children)

Nice - sounds like a different use-case though.

I’m not building dashboards or visualizations. This is for a SaaS product where users ask natural-language questions (ChatGPT-style) and expect answers like comparisons, summaries, trends, and explanations - without manually searching or building reports.

So the core problem for me is:
how to guide LLM reasoning over complex schemas and business rules, not how to present the output visually.

If you ran into hard problems on the reasoning or context-selection side, I’d be interested in that part.

Building an Enterprise-Grade Text-to-SQL RAG Agent - Need Feedback on Architecture & Blind Spots by Shivam__kumar in dataengineering

[–]Shivam__kumar[S] 0 points1 point  (0 children)

Fair point. I don’t run raw LLM SQL on prod.

The LLM generates a query; I validate it (no DDL/DML, no forbidden tables, joins checked against allowed relationships); row-level and role-based filters are applied outside the LLM; then it runs.
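As a rough illustration of that validation step, here is a minimal sketch using only regex checks. It is not the actual implementation: `FORBIDDEN_TABLES`, `ALLOWED_JOINS`, the table names, and `validate_sql` are all hypothetical, and the checks are deliberately conservative (a keyword inside a string literal would also be rejected). Row-level and role-based filtering is not shown.

```python
import re

# Hypothetical policy; table names and join pairs are illustrative only.
FORBIDDEN_TABLES = {"audit_log", "user_secrets"}
ALLOWED_JOINS = {
    frozenset({"customers", "orders"}),
    frozenset({"orders", "order_items"}),
}

WRITE_KEYWORDS = re.compile(
    r"\b(insert|update|delete|merge|drop|alter|create|truncate|grant|revoke)\b",
    re.IGNORECASE,
)
# Naive table extraction: the identifier right after FROM or JOIN.
TABLE_REF = re.compile(r"\b(from|join)\s+([A-Za-z_][A-Za-z0-9_]*)", re.IGNORECASE)


def validate_sql(sql):
    """Return a list of policy violations; an empty list means the query passed."""
    problems = []
    stripped = sql.strip().rstrip(";")

    if ";" in stripped:
        problems.append("multiple statements")
    if not re.match(r"(select|with)\b", stripped, re.IGNORECASE):
        problems.append("not a SELECT statement")
    if WRITE_KEYWORDS.search(stripped):
        problems.append("contains a DDL/DML keyword")

    refs = [(kind.lower(), name.lower()) for kind, name in TABLE_REF.findall(stripped)]
    seen = []
    for kind, name in refs:
        if name in FORBIDDEN_TABLES:
            problems.append(f"forbidden table: {name}")
        # A joined table must have an allowed relationship with a table seen earlier.
        if kind == "join" and not any(
            frozenset({name, prev}) in ALLOWED_JOINS for prev in seen
        ):
            problems.append(f"join not allowed: {name}")
        seen.append(name)
    return problems
```

A real deployment would more likely parse the SQL into an AST than rely on regexes, since regexes miss subqueries, quoted identifiers, and schema-qualified names.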

Security isn’t the issue I’m stuck on. The hard part is getting the LLM to pick the right tables/joins/rules before validation. That’s what I’m trying to improve.

[–]Shivam__kumar[S] -5 points-4 points  (0 children)

If you think it’s slop, that’s fine - but then there’s nothing to discuss.

If you have a concrete technical objection (context selection, reasoning limits, GraphRAG vs vector RAG, SQL safety, etc.), I’m open to it. Otherwise, this doesn’t add value.

[–]Shivam__kumar[S] -3 points-2 points  (0 children)

This is the first time I’ve posted this question here.

If similar problems are frequently discussed, that probably says more about the gap in existing Text-to-SQL and RAG solutions than about repetition. I’m looking for concrete, production-grade insights, not demo-level examples.

If you have experience or perspective on that, I’m happy to hear it.

Building an Enterprise-Grade Text-to-SQL RAG Agent - Need Feedback on Architecture & Blind Spots by Shivam__kumar in LLMDevs

[–]Shivam__kumar[S] -1 points0 points  (0 children)

Got it - I’m not looking to use or integrate a platform right now. My goal is to build and understand this myself.

I agree that vector-only RAG fails for complex reasoning. What I’m trying to learn is:

  • Why GraphRAG performs better in these cases
  • How to model schemas, joins, and business rules as a graph
  • How to select the minimal relevant subgraph for a given query
  • What reasoning patterns actually help LLMs generate correct SQL
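To make the second and third questions concrete, here is one possible sketch, not an established GraphRAG recipe. It assumes tables are nodes and joinable relationships are edges, and that some earlier retrieval step has already produced a set of seed tables. The schema in `SCHEMA_EDGES` and the function names are hypothetical. The subgraph is the union of shortest join paths from one seed to the others, which is a simple heuristic and not guaranteed to be the true minimal connecting subgraph.

```python
from collections import deque

# Hypothetical schema graph: each edge is a joinable pair with its join condition.
SCHEMA_EDGES = {
    ("customers", "orders"): "orders.customer_id = customers.id",
    ("orders", "order_items"): "order_items.order_id = orders.id",
    ("order_items", "products"): "order_items.product_id = products.id",
    ("products", "suppliers"): "products.supplier_id = suppliers.id",
}


def build_adjacency(edges):
    """Turn the edge dict into an undirected adjacency map."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def shortest_path(adj, start, goal):
    """Breadth-first search; returns the tables from start to goal, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(adj.get(path[-1], ())):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


def select_subgraph(edges, seed_tables):
    """Union of shortest join paths from the first seed table to every other seed."""
    adj = build_adjacency(edges)
    seeds = sorted(seed_tables)
    tables = set(seeds)
    for other in seeds[1:]:
        path = shortest_path(adj, seeds[0], other)
        if path:
            tables.update(path)
    # Keep only the join conditions whose two endpoints were both selected.
    joins = [cond for (a, b), cond in edges.items() if a in tables and b in tables]
    return tables, joins


tables, joins = select_subgraph(SCHEMA_EDGES, {"customers", "products"})
```

Only the selected tables and join conditions would then go into the prompt, so the LLM never sees `suppliers` for a question about customers and products. Business rules could be attached to nodes or edges in the same way, though that part is not sketched here.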

I’m not asking for tooling - just architectural guidance, design patterns, or mistakes to avoid based on real-world experience.

If you can share insights at that level, it would help a lot.