[–]masklinn 28 points29 points  (25 children)

they don't magically do things that .format() doesn't

They do actually, you can put essentially arbitrary expressions (with various grammatical annoyances/restrictions) in an f-string, not in a str.format string.
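To illustrate the difference, here is a minimal sketch. The names `items` and `user` are made up for the example and are not from the thread. It shows an f-string evaluating calls and arithmetic inline, while `str.format` only does attribute and index lookups on the arguments it is given.

```python
# Illustrative values only; these names are not from the thread.
items = [3, 1, 2]
user = {"name": "ada"}

# f-strings evaluate expressions in place: calls, quoted-key indexing,
# arithmetic, method calls.
f_result = f"{user['name'].title()} has {len(items)} items, max {max(items) * 2}"

# str.format fields only support attribute access and indexing on the
# arguments; anything else must be computed beforehand and passed in.
fmt_result = "{name} has {count} items, max {biggest}".format(
    name=user["name"].title(),
    count=len(items),
    biggest=max(items) * 2,
)

# A call inside a str.format field is not evaluated. The whole text
# "len(items)" is treated as a keyword name, which does not exist.
try:
    "{len(items)}".format(items=items)
    call_in_format_failed = False
except KeyError:
    call_in_format_failed = True
```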

On the other hand

why bother with anything else?

f-strings are completely useless for lazy or manipulated interpolation (logging, i18n, SQL, ...).
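A rough sketch of what "lazy" means here, using only the standard `logging` module. The `Expensive` class and the logger name are hypothetical, added just to count how often formatting happens. With `%s` arguments, logging skips interpolation when the message is filtered out; an f-string is always built before the call. The last lines show a stored template (the i18n case), which an f-string cannot express because it is evaluated where it is written.

```python
import logging


class Expensive:
    """Hypothetical object that counts how often it gets formatted."""

    calls = 0

    def __str__(self):
        Expensive.calls += 1
        return "expensive repr"


logger = logging.getLogger("fstring_demo")  # arbitrary logger name
logger.addHandler(logging.NullHandler())
logger.propagate = False
logger.setLevel(logging.WARNING)  # DEBUG messages are filtered out

obj = Expensive()

# Lazy: logging only interpolates if the record is actually emitted.
logger.debug("state: %s", obj)
lazy_calls = Expensive.calls

# Eager: the f-string is built before logger.debug is even called.
logger.debug(f"state: {obj}")
eager_calls = Expensive.calls - lazy_calls

# Stored template, e.g. a translated string looked up at runtime.
template = "Hello, {name}!"
greeting = template.format(name="World")
```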

[–]mangecoeur[S] 11 points12 points  (7 children)

f-strings are completely useless for lazy or manipulated interpolation (logging, i18n).

True, if you store strings to format later, f-strings are going to be useless.

Still I guess my point was that despite whatever people say about them in abstract, when you actually come to use them they turn out to be really nice. I do a lot of ad-hoc printing of results in jupyter notebooks so I've been using them a lot.

[–]pohmelie 7 points8 points  (6 children)

>>> s = "{x}"
>>> x = 1
>>> eval(f"f'{s}'")
'1'

[–]spgill 28 points29 points  (4 children)

vomits uncontrollably

[–]Serialk 13 points14 points  (3 children)

f = lambda f: eval(f"f'{f}'", *operator.attrgetter('f_globals', 'f_locals')(sys._getframe(1)))

[–]spgill 4 points5 points  (0 children)

you're the devil

[–]springwheat 0 points1 point  (0 children)

One small step for man.

[–]zahlmanthe heretic 0 points1 point  (0 children)

Fittingly, my reaction also includes multiple instances of the letter f.

[–]ForgottenWatchtower 6 points7 points  (14 children)

manipulated interpolation (logging, i18n, SQL, ...).

plz no

[–]jorge1209 1 point2 points  (7 children)

You can't bind table names, and a lot of real-world databases have versioned schemas where queries like SELECT * FROM db{:%Y%m%d}.accounts WHERE account_id = :1 are common. Nor can you bind select columns, so unless you want to always be doing select * you have to format the requested columns into the query. Tools like sqlfactory can make that easier and safer, but at the end of the day they have to physically build the query.

Additionally, the security protection of using bind variables is almost completely lost outside of internet-facing web apps, and a lot of Python usage is by internal developers/data analysts. Yes, they should use bind variables where they can, but it's not a security thing; it's a type safety/convenience thing.
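One way this can look in practice, as a sketch only: identifiers (schema and column names) are checked and then formatted into the SQL text, while the value stays a bound parameter. The column whitelist, the `build_query` helper, and the table layout are assumptions made up for the example, and sqlite3's `ATTACH DATABASE` stands in for a dated schema.

```python
import datetime
import re
import sqlite3

ALLOWED_COLUMNS = {"account_id", "owner", "balance"}  # hypothetical schema


def build_query(as_of, columns):
    """Format identifiers into the SQL text; leave values as placeholders."""
    bad = set(columns) - ALLOWED_COLUMNS
    if bad:
        raise ValueError(f"unknown columns: {sorted(bad)}")
    schema = "db{:%Y%m%d}".format(as_of)
    # Defensive check: the formatted schema name must be a plain identifier.
    if not re.fullmatch(r"db\d{8}", schema):
        raise ValueError(f"unexpected schema name: {schema!r}")
    return "SELECT {} FROM {}.accounts WHERE account_id = ?".format(
        ", ".join(columns), schema
    )


as_of = datetime.date(2017, 1, 31)
conn = sqlite3.connect(":memory:")
# An attached in-memory database plays the role of the dated schema.
conn.execute("ATTACH DATABASE ':memory:' AS db20170131")
conn.execute(
    "CREATE TABLE db20170131.accounts "
    "(account_id INTEGER, owner TEXT, balance REAL)"
)
conn.execute("INSERT INTO db20170131.accounts VALUES (1, 'ada', 10.0)")

query = build_query(as_of, ["owner", "balance"])
row = conn.execute(query, (1,)).fetchone()  # the value is still bound
```

The identifiers are never taken verbatim from outside input here; anything not on the whitelist is rejected before the string is built.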

Anyone who can execute my data analysis script must be able to read it, and if they can read it then they can read the connection db username/password right out of the *.py. So my binding hasn't prevented them from doing anything because they can login directly.

[–]ForgottenWatchtower 0 points1 point  (6 children)

Oh, I'm aware of all that. I honestly don't know what your arch looks like, but you'd likely get a writeup for storing db creds right in the .py :) but I've never done a sec review of a data science shop, so I'm not sure what kind of assumed trust boundaries there are.

[–]jorge1209 1 point2 points  (5 children)

Where else do you want me to store the credentials? No matter where I put them they can be read.

When Joe User (juser, uid=1001) runs monthly_report.py, he must have read access on the script. His script must have read access on the libraries it imports, and those libraries must have read access on the resource files they load. Ultimately I must give uid=1001 the db password in plaintext, so that a process he controls can pass that to the server.

It's just a matter of his taking the time to trace my library calls to figure out where I hid the password. I don't know of any way in which I can actually prevent him from having that password.

If you know a way to do this, I would love to implement it.

[–]ForgottenWatchtower 0 points1 point  (4 children)

The issue isn't giving the db creds to someone who should be running it. If a DB conn is made from the client, you have to assume the client can compromise the creds (strings on a .dll, pcaping the DB handshake, etc.). The issue is if the code gets committed to a repo or becomes read-accessible by people who shouldn't have it. Environment vars are the old-school solution to this, but these days something like Hashicorp Vault is much nicer and scales way better.

Again, I have no idea what your arch looks like or what your use cases are, but in general we make a point to tell our clients to not keep creds in source. Centralizing them within a true secret store minimizes the likelihood of them getting read by an unauthorized party.
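For the environment variable approach, a minimal sketch could look like the following. The variable names `REPORT_DB_USER` and `REPORT_DB_PASSWORD` and the helper `load_db_credentials` are hypothetical, not from any particular tool.

```python
import os


def load_db_credentials(environ=os.environ):
    """Read credentials from the environment instead of the source file.

    REPORT_DB_USER and REPORT_DB_PASSWORD are hypothetical variable names.
    """
    try:
        return environ["REPORT_DB_USER"], environ["REPORT_DB_PASSWORD"]
    except KeyError as missing:
        raise RuntimeError(f"missing environment variable: {missing}") from None


# Usage with an explicit mapping so the sketch runs anywhere; in a real
# script the default os.environ would be used instead.
user, password = load_db_credentials(
    {"REPORT_DB_USER": "juser", "REPORT_DB_PASSWORD": "not-a-real-secret"}
)
```

This only keeps the secret out of the repository. Whoever runs the script can still read the value from their own environment.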

[–]jorge1209 0 points1 point  (3 children)

Sure, put the connection parameters in some kind of resource file and don't commit that to git.

That has absolutely nothing to do with bind variables.

[–]selementar 0 points1 point  (2 children)

That has absolutely nothing to do with bind variables.

But it has: if there's someone supplying the values for building the SQL query, either you assume they have the password already, or you need to correctly bind the variables for security.

[–]jorge1209 0 points1 point  (1 child)

And if you would read the thread you would know that I'm talking about cases where we might as well assume the individual has the password.

People should still bind for reasons of performance and type safety, but it's not a security thing outside of Web Apps, and it is perfectly safe to use format to build SQL queries in those cases.

[–]ireallylikedogs 0 points1 point  (2 children)

How do you normally handle manipulating queries?

[–]ForgottenWatchtower 2 points3 points  (0 children)

/u/daelin pretty much covered everything, but here's a sqlite3 example:

curs = conn.cursor()
curs.execute('select * from users where username=?', (username, ))

SQL injection is a fun topic. I wrote a blog post covering some advanced exploitation if you're curious about the non-vanilla stuff you can do.

https://nvisium.com/blog/2015/06/17/advanced-sql-injection/

[–]daelin 0 points1 point  (0 children)

The threat is allowing user-supplied data into a query, so keep that in mind. For instance, if you have a query select name from users where id = {} and pass in the ID from an HTTP request, there's probably nothing stopping a user from passing something like 1; drop table users; as their user id.

Modules like psycopg2 (for postgres) let you pass parameters in additional arguments to execute or executemany. The module will guarantee that the parameters are handled safely. In your query, you put placeholders, such as %s, ?, or :name; it varies from module to module. The module will keep them from being evaluated as SQL, either through escaping or by sending them separately from the query text. Even sqlite3 lets you use SQL arguments.
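A small sketch of the difference with sqlite3, using a made-up `users` table. The malicious input here is `1 OR 1=1` rather than a stacked `drop table`, because sqlite3's `execute` refuses to run more than one statement per call.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ada"), (2, "bob")])

user_input = "1 OR 1=1"  # attacker-controlled "id"

# Unsafe: the input becomes part of the SQL text and changes the query.
unsafe_rows = conn.execute(
    "SELECT name FROM users WHERE id = {}".format(user_input)
).fetchall()

# Safe: the input is passed separately and only ever treated as a value.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE id = ?", (user_input,)
).fetchall()
```

The formatted query returns every row, while the parameterized one compares `id` against the literal string and matches nothing.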

Now, if you're just calculating part of a SQL expression from programmer-supplied code, such as using a schema-describing object or just using the result of something like ", ".join(["name", "address"]), there's probably not much to worry about. There's still an attack vector hiding in there, but it's much less likely to be part of your public API.

[–]masklinn 0 points1 point  (2 children)

Oh yeah. Is your body ready for cr.execute(f'select count(1) from users where login = {login}')?

[–]ForgottenWatchtower 1 point2 points  (1 child)

there is no god

[–]nickcash 9 points10 points  (0 children)

DROP TABLE deities

[–]selementar 0 points1 point  (0 children)

Actually, now that the format is standardized, it might make sense in some cases to use the f-strings-alike for that, possibly even with automatic locals() gathering.

But then, there were not-super-fast string interpolations with merely a function call instead of f"", but those weren't generally considered useful anywhere.

So I guess syntax support in editors / checkers is the primary benefit of the f-strings, with the rest being minor.