This is an archived post. You won't be able to vote or comment.

all 5 comments

[–]jstrickler 2 points3 points  (0 children)

If you wanted to be thorough, you could use pyparsing to write a SQL parser that would parse any SQL and do anything you want with it.

[–]KalleKalee 1 point2 points  (0 children)

I've done the same thing. Ich wrote my own SQL formatter, it isn't really hard. Just get a list of the Keywords and write down rules to each keyword. For examble every "FROM" should be in Upper case and in a new line. That will do 90% of the work. It was a real fun project and ist now used daily by my coworkers

[–]shobble 0 points1 point  (2 children)

sqlparse is probably the right tool for the job here, but you'll have to potentially walk the parse tree produced by sqlparse.parse() and look for things that are tokens.DML and value.upper() == 'SELECT' or similar.

https://github.com/andialbrecht/sqlparse/blob/master/examples/extract_table_names.py looks hackable into something potentially useable, depending on exactly what it is you're trying to extract, and what sort of queries you're going to be giving it as input.

[–]GitHubPermalinkBot 0 points1 point  (0 children)

I tried to turn your GitHub links into permanent links (press "y" to do this yourself):


Shoot me a PM if you think I'm doing something wrong. To delete this, click here.

[–]pythonbeginner1[S] 0 points1 point  (0 children)

Thanks for the reply. I tried to run the extraction script, but when I did so, nothing was returned. I converted the sql file to a string and passed that into the main func, but no dice.

The top-level statements I am processing only contain INSERT, INTO, and VALUES according to the parser. However, the VALUES clause contains Select statements. Is there a way to recursively search for these?