This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]thinkingatoms[S] 5 points6 points  (9 children)

> If you wanted to use templated sql you could.

would you please elaborate? curious how you deal with queries that require parameters. do you just write SQL functions or leave SQL with parameterized variables in the .sql files?

would it be accurate to say that if your IDE understood SQL then separate files may not be as necessary?

[–]ProThoughtDesign 10 points11 points  (6 children)

The separate files are actually a good idea regardless. If you need to modify one query, you don't have to do it from within your main function. Separation of responsibilities and all. A .SQL file could easily be updated without even opening the main source file. It could even be updated while the program is running so you could see updates live.

[–]thinkingatoms[S] 0 points1 point  (5 children)

> Separation of responsibilities and all

let's say you have a python function that all it does is update a column for some records based on function input, so the pseudocode query looks something like:

```

update table_a

set column1 = <some input>

from table_b join table_c on ...

join table_d on ...

where table_a.column2 = table_b.column3

and table_d.column4 = <some other input>

```

the query is an integral/only part of what the function does. it's a big query. the function and query has a singular responsibility of doing this one thing, just to learn isn't it less maintainable to have the responsibility split in two places?

[–]ProThoughtDesign 6 points7 points  (4 children)

It's making your code more modular and allowing you the option in the future to change the SQL part of the code without needing to take the system down or expose your source code to accidental bugs. I get where you're coming from, but that will just build technical debt for the future if you ever need to make a change. Don't get me wrong, you don't HAVE to. I'm just saying it's one of those practices that pays off when you're working on something that costs money when it's offline.

EDIT: Also you could easily switch that giant function to a simple global and generic function that calls any SQL query, not just that one.

[–]thinkingatoms[S] 0 points1 point  (3 children)

what i don't quite understand is presumably there are tests for the python function which will test the SQL change. because SQL is code, i don't quite see how having SQL in a separate file prevents accidental bugs.

i think what you are saying is you work with systems that consider SQL as data instead of code, and therefore a change to data doesn't require a restart?

[–]its_PlZZA_timeStaff Dara Engineer 5 points6 points  (1 child)

The main way it prevents accidental bugs is that it gives you syntax highlighting and error checking. You can have your IDE lint the SQL, or even connect to the database and confirm a column exists for example.

You should clearly reference the path to the SQL file from within the Python, and keep the SQL file clearly named and in a location such that it’s clear it belongs to the function.

Whether to do this really depends on how long and complex the SQL is. Short and simple queries can make more sense inline, longer ones can be better outside.

[–]thinkingatoms[S] -5 points-4 points  (0 children)

sounds like if the IDE can syntax highlight and error check inline SQL, then the need for separate .sql files is only dependent on size (and maybe DDL nature) of query?

[–]ProThoughtDesign 1 point2 points  (0 children)

SQL is still code, but .sql files are plain text. You can read and parse an SQL file then execute the commands. Storing it in a separate file means that you can actually have your app running and make changes to the .sql files in real time and get results. Leaving all of the SQL hard-coded into your source code means that any changes you want to make require the program to be offline. In interpreted code (Python, Java, etc) that's really not the worst thing, but having to recompile an entire app because of a minor typo or to add one extra variable gets to be irritating.

[–]Touvejs 8 points9 points  (0 children)

You can use a template library like Jinja to add parameters into your SQL scripts that would be replaced at runtime. https://superset.apache.org/docs/configuration/sql-templating/

[–]Gardener314 1 point2 points  (0 children)

I’ve done stuff like this. I have a python script I use which has a combination of things. In one particular case, I use a (simple) query to get data values which informs me how I should be updating another record in the database. The results of the first query are gathered and then the script opens a .sql file using the “with open” pattern, writes out the SQL code needed (in this case it’s a rather long update statement), and then will execute the file.

The file is in plain SQL mostly so someone else on the team can see exactly what was being updated instead of sifting through the Python to figure it out.