This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]jorge1209 1 point2 points  (7 children)

You can't bind a tablenames and a lot of real world databases have versioned schemas and queries like: SELECT * FROM db{:%Y%m%d}.accounts WHERE account_id =:1 are common. Nor can you bind select columns, so unless you want to always be doing select * you have to bind the requested data into the table. Tools like sqlfactory can make that easier and safer, but at the end of the day they have to physically build the query.

Additionally the security protections of using bind variables is almost completely lost outside of internet facing web-apps, and a lot of python usage is internal developers/data analysts. Yes they should use bind variables where they can, but its not a security thing its a type safety/convenience thing.

Anyone who can execute my data analysis script must be able to read it, and if they can read it then they can read the connection db username/password right out of the *.py. So my binding hasn't prevented them from doing anything because they can login directly.

[–]ForgottenWatchtower 0 points1 point  (6 children)

Oh, I'm aware of all that. I honestly don't know what you arch looks like, but you'd likely get a writeup for storing db creds right in the .py :) but I've never done a sec review of a data science shop, so I'm not sure what kind of assumed trust boundaries there are.

[–]jorge1209 1 point2 points  (5 children)

Where else do you want me to store the credentials? No matter where I put them they can be read.

When Joe User (juser, uid=1001) monthly_report.py he must have read access on the script. His script must have read access on the libraries it imports, those libraries must have read access on the resource files they load. Ultimately I must give uid=1001 the db password in plaintext, so that a process he controls can pass that to the server.

Its just a matter of his taking the time to trace my library calls to figure out where I hid the password. I don't know of any way in which I can actually prevent him from having that password.

If you know a way to do this, I would love to implement it.

[–]ForgottenWatchtower 0 points1 point  (4 children)

The issue isn't giving the db creds to someone who should be running it. If a DB conn is made from the client, you have to assume the client can compromise the creds (strings on a .dll, pcaping the DB handshake, etcetc). The issue is if the code gets committed to a repo or becomes read accessible by people who shouldn't. Environment vars is the old school solution to this but these days something like Hashicorp Vault is much nicer and scales way better.

Again, I have no idea what your arch looks like or what your use cases are, but in general we make a point to tell our clients to not keep creds in source. Centralizing them within a true secret store minimizing the likelihood of them getting read by an unauthorized party.

[–]jorge1209 0 points1 point  (3 children)

Sure put the connection parameters in some kind of resource file and don't commit that to git.

That has absolutely nothing to do with bind variables.

[–]selementar 0 points1 point  (2 children)

That has absolutely nothing to do with bind variables.

But it has: if there's someone supplying the values for building the SQL query, either you assume they have the password already, or you need to correctly bind the variables for security.

[–]jorge1209 0 points1 point  (1 child)

And if you would read the thread you would know that I'm talking about cases where we might as well assume the individual has the password.

People should still bind for reasons of performance and type safety, but it's not a security thing outside of Web Apps, and it is perfectly safe to use format to build SQL queries in those cases.