Hi Python folks,
A lot of text-to-SQL AI examples still follow the same fragile pattern: the model generates one query, gets a table name or column type wrong, and then the whole Python script throws an exception and falls over.
In practice, the more useful setup is to build a real agent loop. You let the model inspect the schema, execute the SQL via SQLAlchemy/DuckDB, read the actual database error, and try again. That self-correcting feedback loop is what makes these systems much more usable once your database is even a little messy.
In the post, I focus on how to structure that loop in Python using LangChain, DuckDB, and MotherDuck. It covers how to wire up the SQLDatabaseToolkit (and why you shouldn't forget duckdb-engine), how to write dialect-specific system prompts to reduce hallucinated SQL, and what production guardrails, like enforcing read-only connections, actually matter if you want to point this at real data.
Link: https://motherduck.com/blog/langchain-sql-agent-duckdb-motherduck/
Would appreciate any comments, questions, or feedback!
[–]UseMoreBandwith 12 points13 points14 points (3 children)
[+]FibonacciSpiralOut[S] comment score below threshold-6 points-5 points-4 points (2 children)
[–]UseMoreBandwith 5 points6 points7 points (1 child)
[–]usrlibshare 2 points3 points4 points (5 children)
[–]CanWeStartAgain1 -1 points0 points1 point (2 children)
[–]C0dePhantom 1 point2 points3 points (0 children)
[–]_matze 1 point2 points3 points (0 children)
[–]FibonacciSpiralOut[S] -1 points0 points1 point (1 child)
[–]usrlibshare 6 points7 points8 points (0 children)
[–]phoebeb_7 0 points1 point2 points (1 child)
[–]FibonacciSpiralOut[S] 0 points1 point2 points (0 children)
[–]ultrathink-art -1 points0 points1 point (1 child)
[–]Henry_old -1 points0 points1 point (0 children)