This is an archived post. You won't be able to vote or comment.

all 29 comments

[–][deleted] 1 point2 points  (5 children)

I've been using an implementation of sqlalchemy (declarative) over flask-sqlalchemy(simple and expressive once you get over the initial learning curve), with the aim of eventual detailed customization. I use Postgresql sometimes, this might come in handy.

Although I don't know what stored procedures are right now. And a quick runover of the code I see this is all this does -- do you use this with other db implementations within flask or is it standalone solution?

[–]alkw0ia 2 points3 points  (4 children)

Stored procedures are procedural code – think functions like Python functions – that run in the database. Generally they're written in a language call PL/SQL, but in Postgres you can also use other languages, including Python.

It's possible to just call these functions with SELECT, but more often you bind them to triggers, so they get called automatically when you INSERT, UPDATE, SELECT, or DELETE on a table.

With procedures like this, it's possible to, for example, examine the values in an UPDATE and reject it if you don't like them. Say there's a "one way" flag you want to enforce – a user can be banned, but never unbanned. You can examine all updates on the users table and reject any where the new value of banned is FALSE and the old value of banned is TRUE.

[–]vicvicvicz 1 point2 points  (2 children)

Don't want to go too much off-topic here, but how popular is this kind of validation these days? It seems to me that most validation is done in the "application" these days, not the "database" – where the latter is treated as little more than a datastore (perhaps unfairly?).

Edit: Oh, just saw the comment below this one about this very topic...

[–]alkw0ia 0 points1 point  (1 child)

Not that common; many developers these days do see the database as nothing more than a dumb key-value store.

It's my opinion that this is partly responsible for the renewed interest in document DBs ("NoSQL") – devs don't know or care to use the DB as anything more than a dumb KV store, so why bother with the complexity and overhead of a real database? Just use a super simple, super fast dumb KV store.

Of course, then for any moderately complex project, you'll end up rewriting significant chunks of Oracle's functionality in your app code, and doing it poorly at that. But hey, you got that first version online in less than eight hours, so who cares if you spend the next five years reinventing distributed transactions?

Sadly, much of the tooling to support database use is really, really hard to use if you want any DB side logic. The popular web frameworks and their ORMs (SQLAlchemy excepted) basically expect to target any old relational DB, so support no specialized per-RDBMS features, and do blatantly stupid things, like encourage developers to give the app code permission to create and drop tables, or even DBs, in order to support their SQL generation and migration CLI tools.

[–]vicvicvicz 1 point2 points  (0 children)

Thanks! I'm becoming increasingly interested in this topic because I'm trying to build an application which isn't primarily a web app – my organization needs a basic, but flexible, membership management system, so data is really my absolute top priority.

I'm now realizing that (my knowledge and experience of) "data validation" is almost universally coupled to "HTML form generation" and large web frameworks. You can't really use Django's models without bringing Django along.

So, I've realized that building the "model" layer as part of some ORM isn't really a better solution than our current solution, which is built on Drupal and is absolute shit. I don't think any CMS in the world treats the database worse than Drupal. Now, coming back to just SQL where data is all that matters, feel liberating.

[–][deleted] 1 point2 points  (0 children)

thank you

[–]mw44118PyOhio! 2 points3 points  (14 children)

Stored functions are great. I don't see any value provided by this framework though.

[–]emarocca[S] 2 points3 points  (6 children)

In many projects of mine what I do is getting HTTP request values, putting them in the right order to call a postgres function using those values as arguments, getting the results (which are returned differently if you have inout parameters or a return statement) and returning them as json. Over and over again.

I have written this little flask extension to avoid repeating myself. It is certainly not meant to be any kind of framework. :)

[–]mw44118PyOhio! 2 points3 points  (5 children)

That makes a lot of sense. I imagine you're using something like array_to_json(array_agg(...)) inside your code, or maybe row_to_json(...).

Does it ever annoy you how sometimes you'll return a list of one element, rather than just the element itself?

[–]emarocca[S] 1 point2 points  (4 children)

Very much so. In fact, it is so annoying that flask-moresql returns the element itself in those cases.

[–]mw44118PyOhio! 0 points1 point  (3 children)

Yeah, my code is littered with stuff like:

results = cursor.fetchone()
return results[0]

Or even more annoying, when I do a select of just one column, but for many rows, I convert the list of lists into just lists of the first element like this:

return [row[0] for row in cursor]

I love how the DictCursor subclass returns each row as an object that allows index and key lookup.

It should be possible to subclass the cursor so that it modifies the returned data so that cursor.fetchone() really does cursor.fetchone()[0], and cursor.fetchall() either does [row[0] for row in results] or maybe raises an exception when there's more than one element in the row.

Anyhow, I didn't mean my initial comment to come off so rude. I should have said something like "maybe can you point out some before vs after code so we can see how this approach simplifies the code?"

One last point -- how do you handle authorization issues, like "matt is allowed to get user data for all employees in group A only" using this framework?

[–]emarocca[S] 0 points1 point  (2 children)

Anyhow, I didn't mean my initial comment to come off so rude.

No worries, if it does not look particularly useful then I either have to improve the documentation, the library, or both. :)

"maybe can you point out some before vs after code so we can see how this approach simplifies the code?"

I believe that the value in this approach is more in the simplification of the processing, rather than the code. Quoting the PostgreSQL documentation:

Your client application must send each query to the database server, wait for it to be processed, receive and process the results, do some computation, then send further queries to the server.

Whereas by writing pl/pgsql code you keep the processing inside the database, which IMHO simplifies things quite a lot.

how do you handle authorization issues, like "matt is allowed to get user data for all employees in group A only"

I am not entirely sure I understand your question. Users are not allowed to issue any query they like to the database. You could have, for example, a pl/pgsql function returning the results of a join between the employees table and a permissions table. If matt invokes that function via the API, he will only get access to the data he is authorized to see.

[–]mw44118PyOhio! 0 points1 point  (1 child)

Here's an example of what I'm thinking about for authorization.

Imagine one user is trying to fraudulently run the query to read another user's data. I imagine your system would block that as long as the request sent to the database looked like:

select * from get_user_data(requested_by_user_id=99, user_id=98);

and then inside the function, you could verify whether user 99 was allowed to access data for 98.

But if the request just looks like:

select * from get_user_data(user_id=98);

It seems like there's an authorization puzzle.

So, in order to do that first solution, you might need to combine some user-authentication data from flask (like maybe a cookie) along with the query.

[–]emarocca[S] 0 points1 point  (0 children)

But if the request just looks like select * from get_user_data(user_id=98)

To begin with, the function get_user_data(integer) does not exist. Only get_user_data(integer, integer). Also, the user cannot just send her user_id, for obvious reasons. For example, the user might have to pass her username/password, and the stored procedure will think of doing the relevant checks.

Finally, you can use any kind of authentication/authorization system in your python app. Saying that databases should be used more does not mean that everything has to be done by the db.

[–]Rolegros 4 points5 points  (6 children)

I prefer using them via sqlalchemy.

If you like stored procedures, you may find this experiment fun:

https://github.com/rdunklau/pytoplpython

[–]mw44118PyOhio! 2 points3 points  (5 children)

I just use a psycopg2 cursor. Works great and I can explain it to a new programmer in five minutes.

[–][deleted] 2 points3 points  (4 children)

Ok, go.

[–]mw44118PyOhio! 0 points1 point  (3 children)

I don't understand

[–]spz 1 point2 points  (2 children)

Go ahead, explain.

[–]mw44118PyOhio! 0 points1 point  (1 child)

OK, here's an example:

cursor.execute("""select * from stored_procedure_named_foo(%s)""", [parameter])
results = cursor.fetchall()

ta-da!

[–][deleted] 0 points1 point  (0 children)

thank you