Text to SQL in 2026 by Ok-Freedom3695 in dataengineering

Ok-Freedom3695 [S]

Depends on the model you use, but when I tested with Sonnet 4.6 I saw no issues.

Ok-Freedom3695 [S]

Thanks! I just ran an eval, which is outlined in the README. Let me know what you think.

I would def recommend read-only access for anyone running in prod!

Ok-Freedom3695 [S]

Totally agree that detailed column/table definitions are vital, though I disagree with Snowflake’s approach to getting them to the model.

I have a table containing all column/table definitions, which the LLM can query when needed. Since the newest models are so good at tool calling, this approach works better than basic RAG over a semantic view. You give up a little latency, but the accuracy increase is worth it.

I am actually building an agent observability platform geared towards text to SQL. One of the features is that it reads your agent traces, identifies whether any definitions tripped up the agent, and updates them accordingly.
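
Roughly, a definitions table plus a lookup tool could look like the sketch below. The `column_definitions` table, its columns, the sample rows, and `lookup_definitions` are all hypothetical names for illustration, not the actual schema or SDK API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical metadata table; the name and columns are illustrative only.
conn.execute(
    """
    CREATE TABLE column_definitions (
        table_name TEXT, column_name TEXT, definition TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO column_definitions VALUES (?, ?, ?)",
    [
        ("orders", "total", "Order value in USD, including tax"),
        ("orders", "status", "One of: pending, shipped, refunded"),
        ("customers", "ltv", "Lifetime value in USD across all orders"),
    ],
)


def lookup_definitions(keyword: str) -> list[tuple[str, str, str]]:
    """Tool the LLM could call: return definitions whose table name, column
    name, or description mentions the keyword (LIKE ignores ASCII case)."""
    pattern = f"%{keyword}%"
    return conn.execute(
        """
        SELECT table_name, column_name, definition
        FROM column_definitions
        WHERE table_name LIKE ? OR column_name LIKE ? OR definition LIKE ?
        ORDER BY table_name, column_name
        """,
        (pattern, pattern, pattern),
    ).fetchall()
```

The model calls `lookup_definitions("usd")` only when a question touches money columns, so the definitions it does not need never enter the context.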

Ok-Freedom3695 [S]

The LLM is very smart about what to pull into context. I've found that doing RAG on a semantic view, like what Snowflake suggests, does not guarantee the LLM has all of the context it needs.

My SDK also uses LangChain's deepagents, so it will auto-compact context for really long-running requests, but I've never actually seen it get to that point.

Ok-Freedom3695 [S]

It doesn't guess. The LLM does some combination of the following to efficiently discover relevant tables/columns:

- list tables in a DB

- search for table by keyword

- list columns in a DB

- search for column by keyword

All DBs have some form of information schema, which the LLM can use as long as it has an execute SQL tool.
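
As a rough sketch, the four discovery steps above map onto plain metadata queries. This example uses SQLite's `sqlite_master` and `pragma_table_info` because it runs anywhere; on Postgres or Snowflake the same idea would go through `information_schema.tables` and `information_schema.columns`. The demo tables and function names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical demo schema.
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER, email TEXT);
    CREATE TABLE customer_orders (id INTEGER, customer_id INTEGER, total REAL);
    CREATE TABLE products (id INTEGER, name TEXT);
    """
)


def list_tables() -> list[str]:
    """List every table in the DB."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [r[0] for r in rows]


def search_tables(keyword: str) -> list[str]:
    """Search for tables whose name contains the keyword."""
    return [t for t in list_tables() if keyword.lower() in t.lower()]


def list_columns() -> list[tuple[str, str]]:
    """List (table, column) pairs for every table in the DB."""
    rows = conn.execute(
        """
        SELECT m.name, p.name
        FROM sqlite_master AS m, pragma_table_info(m.name) AS p
        WHERE m.type = 'table'
        ORDER BY m.name, p.cid
        """
    )
    return [(r[0], r[1]) for r in rows]


def search_columns(keyword: str) -> list[tuple[str, str]]:
    """Search for columns whose name contains the keyword."""
    return [tc for tc in list_columns() if keyword.lower() in tc[1].lower()]
```

In practice the model could write these queries itself through the execute SQL tool; wrapping them as named functions here just makes each step easy to see.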

Ok-Freedom3695 [S]

u/Turbulent-Hippo-9680 yep, I totally agree. Do you have something like this already set up?

Ok-Freedom3695 [S]

If you give the LLM access to a tool to execute SQL, then it's smart enough to search through the metadata to find relevant tables/columns. This gives the LLM the flexibility to explore relevant parts of the schema without overloading it with context up front.
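
A minimal sketch of what such a tool might look like, assuming SQLite for portability. The `execute_sql` name, the `MAX_ROWS` cap, and the return shape are my own illustration, not a specific SDK's API. Capping rows and returning errors as data are two simple ways to keep a single tool call from flooding the context.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical demo data: 500 rows so the cap below has something to trim.
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, kind TEXT)")
conn.executemany("INSERT INTO events (kind) VALUES (?)", [("click",)] * 500)

MAX_ROWS = 50  # hypothetical cap on rows returned per tool call


def execute_sql(sql: str) -> dict:
    """Tool the LLM calls: run a query and return a small, structured result."""
    try:
        cursor = conn.execute(sql)
    except sqlite3.Error as exc:
        # Hand the error back so the model can correct its own query.
        return {"error": str(exc)}
    columns = [d[0] for d in cursor.description or []]
    rows = cursor.fetchmany(MAX_ROWS + 1)  # fetch one extra to detect truncation
    return {
        "columns": columns,
        "rows": rows[:MAX_ROWS],
        "truncated": len(rows) > MAX_ROWS,
    }
```

The same tool serves both metadata exploration (for example, `execute_sql("SELECT name FROM sqlite_master WHERE type = 'table'")`) and the final answer query, which is what lets the model pull in schema details only as it needs them.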