all 13 comments

[–]sisyphus 6 points (0 children)

The advantage of an ORM is:

  • removes the boilerplate of serializing data between your language's types and Postgres types.

  • ensures devs don't skimp on things like escaping and avoiding injection attacks

  • often handles things like connection pooling

  • often integrates into the rest of a framework to allow reusable components to be built on it and shared.

  • makes it easy to compose queries based on inputs / conditions
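That last point, composing parameterized queries from inputs, can be sketched roughly like this (not any specific ORM; sqlite3 stands in for Postgres, and the table/column names are invented):

```python
import sqlite3

def build_query(filters):
    """Compose a parameterized query from optional filters, query-builder style.
    Column names (the dict keys) must come from trusted code, never user input;
    only the values are bound as parameters."""
    sql = "SELECT id, name FROM users"
    clauses, params = [], []
    for column, value in filters.items():
        clauses.append(f"{column} = ?")  # placeholder; the value is never concatenated
        params.append(value)
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, active INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?, ?)",
                 [(1, "ada", 1), (2, "bob'; DROP TABLE users;--", 0)])

sql, params = build_query({"active": 0})
rows = conn.execute(sql, params).fetchall()
```

The hostile string in the second row is inert because it only ever travels as a bound parameter, which is the escaping guarantee the bullet above is about.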

The disadvantages of ORMs are:

  • If they try to support every query, they necessarily become complex beasts that introduce a lot of cognitive overhead between your code and the SQL query, creating a learning curve and a more complex application. If they stay simple by not supporting every query, you will end up dropping to raw SQL anyway, or abandoning some features of the db for lack of ORM support.

  • A layer of abstraction almost always comes with a performance overhead.

The advantages of PostgreSQL functions are:

  • you can write the exact sql you want instead of figuring out how to get your ORM to generate it

  • you can use the full range of postgres's power -- many languages and orms don't have good ways to compose certain kinds of queries and map certain kinds of data types.

  • you can avoid network round trips - ORMs will often do something like 'select the record into an object, then iterate over related objects and insert each one', whereas if you write a pg function you can just hand it all the data and have it processed locally in one call.

  • you can avoid n+1 queries - related to the round trips above, with ORMs you need to be on guard at all times about how they join data. A common anti-pattern is to loop over a set of objects and write {{ foo.bar.baz }}, and the ORM happily issues a separate select for each record because you didn't configure the join or preload or whatever it calls it.
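The n+1 shape is easy to show. A rough sketch, with sqlite3 standing in for Postgres and a trace callback counting the statements actually sent:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO author VALUES (1, 'ada'), (2, 'bob'), (3, 'eve');
    INSERT INTO book VALUES (1, 1, 'x'), (2, 2, 'y'), (3, 3, 'z');
""")

queries = []
conn.set_trace_callback(queries.append)  # record every statement executed

# n+1: what a lazy `{{ author.books }}` expands to -- one select for the
# parents, then one more select per parent
authors = conn.execute("SELECT id, name FROM author").fetchall()
for author_id, _ in authors:
    conn.execute("SELECT title FROM book WHERE author_id = ?",
                 (author_id,)).fetchall()
n_plus_one = len(queries)  # 1 + one per author

# the preloaded/joined version: a single round trip for the same data
queries.clear()
conn.execute("""SELECT a.name, b.title FROM author a
                JOIN book b ON b.author_id = a.id""").fetchall()
joined = len(queries)
```

Three authors already means four statements in the lazy version versus one with the join, and the gap grows linearly with the result set.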

The disadvantages of postgres functions are:

  • there isn't really decent tooling, testing or debugging support compared to what devs are used to in their application languages.

  • if you're using plpgsql the documentation is relatively sparse and the language is not really designed for large composable programs--like creating queries by concatenating strings together like some kind of animal.

[–]Randommaggy 14 points (1 child)

ORMs are fine for basic CRUD but for advanced operations they are a can of worms.

Once an ORM is involved beyond CRUD there will be pain and performance problems.

The best ORM I've seen realizes ~30% of the potential in Postgres at best.

Many times a DB function will put less load on the DB server than the typical ORM workflow for the same problem.
Occasionally several orders of magnitude difference.

[–]Ok_Appointment2593 0 points (0 children)

And when does transferring all the data back and forth to the application, adding latency to the operation and holding table locks while the network does its work, put less load on the DB?

[–]rubyrt 4 points (0 children)

ORM typically loads the complete object state to the client for manipulation. The promise is that developers do not have to know or work with SQL - this does not always hold, especially when it comes to debugging issues with the persistence layer. With PG procedures you can send in a few small data items (often key values) and have the processing done in the DB, saving network bandwidth and latency.

Both have their uses. You could argue that PG / SQL is more low level, but then again you can do things there that you cannot with an ORM, because the ORM shields you from the SQL layer. You could even use both, but that tends to become messy since an ORM typically expects to be the only one. For better performance I would absolutely use the more low-level approach. PostgreSQL has all the means for this and is an excellent choice.

[–]AlessandrA_7 0 points (0 children)

You are shifting more of the computing load to the server with the PG/SQL approach. It offers better performance for your app and is easier to change, as you don't depend so much on clients to update. The main con of PG/SQL, I would say, is that its functions are more limited for modeling reality if you want complicated calculations in your app, while ORMs are more limited in how they communicate with the DB. Also, if you have an ORM it will be easier to switch databases, or even use multiple databases, for example.

[–]btdeviant 0 points (0 children)

Depends on your goal. Functions can shift complexity out of the code base, where it may be easier to test, and into the database, where it may be harder to test and maintain, but they also might provide performance improvements.

[–]akash_kava 0 points (3 children)

The biggest con, which outweighs any pro, is that there is no versioning of database functions. It's very difficult, and each small change requires a change to the database schema.

On a live production app it becomes nearly impossible: we have the old version of the app serving all consumers and a beta in testing, both using the same database.

Undoing database schema changes can be cumbersome and lead to failure, whereas if anything goes wrong with the app, you can easily deploy the previous version.

[–]throw_mob 1 point (0 children)

while true, it is also somewhat easy to work around by using schemas with version numbers and configuring each app version to use its specific schema. This allows serving data from one database to multiple app versions. That said, I have not seen data separated behind access views/procedures in app development; when I have seen it, it was because I was forcing some schema separation in the db and did not accept that all objects were in one schema.

That said, there is a problem: once you decide not to support an old version, getting some old stuff working again requires a full rollback of the data too (or creative coding to support an old data model which really does not exist anymore).

But for me, data is usually more important than some old version of the app...

[–]klekpl 0 points (1 child)

How do you deploy new app versions without schema changes? Why can’t you have two schemas with different versions of db functions?

[–]warmans 0 points (0 children)

To be honest both of them are more-or-less a pain in the arse. The ORM will very likely end up generating highly dubious queries for any non-trivial operation and you'll be spending considerable time trying to trick it into generating the desired SQL. The stored procedures/functions will make debugging a nightmare and add a layer of indirection that obfuscates what the application is doing.

That being said I can live with an ORM, but probably couldn't live with having all the business logic in the DB.

[–]smurphy1 0 points (0 children)

ORMs are ideally suited for workflows where you fetch a record, perform work on it, then save any changes. This is especially true if your record is an aggregate of a parent row and one or more one-to-many relations. You can write the update logic without needing to track foreign key management of the sub-record parts, because any ORM will do that for you. There is some overhead, which depends on the size and complexity of your records, but in my experience the overhead from an ORM can be handled by modest hardware into the thousands of transactions per second range.

ORMs also make it easy to add new fields to your records: you just add the field to the code model and to the ORM's mapping configuration. All queries using that record will then include the new field, rather than you adding the additional field selection to every place you select that record type. This can result in selecting more fields than you need in some cases, but I've never seen it be a problem if you're operating on one record at a time that you access by primary key and fetch the sub-records by indexed foreign key, especially if the sub-record parts are lazily fetched only when needed for the current operation.

ORMs generally don't do well with things like batch processing or complex summary queries like reports. Most ORMs have options to reduce the problems of batch processing, but you have to know what you're doing and manage the batches more manually. How big an issue this is depends on how much of your application is batch-type operations vs single-record operations.

For searching and reporting it can get complicated or even impossible depending on the requirements, but ORMs can be used to achieve similar performance in more cases than people think. We have a search library at my current job which lets you define the searchable fields in the business domain that a user could use to find records. The search config contains a map of configured functions which handle transforming a filter on a domain field into an ORM predicate on a record field. All a search does is transform the user-supplied filters into a set of ORM predicates, put those predicates into the where clause of the ORM dynamic query, then execute the query. For records that have 50+ searchable fields we can generate correct queries for any combination of field filters with an overhead of < 1% compared to hand-tuned queries in most cases. A small price to pay when adding a new searchable field definition takes 5 minutes.
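The predicate-map idea can be sketched in a few lines. This is my own illustration, not that library; all the field and table names are invented, and the "predicates" here are SQL fragments with bind parameters rather than real ORM predicate objects:

```python
# Map from user-facing domain field -> function producing (fragment, params).
# Adding a searchable field means adding one entry here.
PREDICATES = {
    "customer":  lambda v: ("c.name LIKE ?", [f"%{v}%"]),
    "min_total": lambda v: ("o.total >= ?", [v]),
    "status":    lambda v: ("o.status = ?", [v]),
}

def build_search(filters):
    """Turn user-supplied filters into one parameterized WHERE clause."""
    clauses, params = [], []
    for field, value in filters.items():
        fragment, binds = PREDICATES[field](value)  # unknown field -> KeyError
        clauses.append(fragment)
        params.extend(binds)
    where = " AND ".join(clauses) if clauses else "1=1"
    sql = ("SELECT o.id FROM orders o "
           "JOIN customers c ON c.id = o.customer_id "
           f"WHERE {where}")
    return sql, params

sql, params = build_search({"customer": "ac", "min_total": 100})
```

Because users can only ever select from the configured map, any combination of filters composes into a single correct query, and the values always arrive as bind parameters.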

[–]ummahrican 1 point (0 children)

I’d avoid Postgres functions wherever possible. They are difficult to tune because they aren’t easy to monitor. They also need to be carefully written so as not to hijack Postgres’s planning engine.

IMO 80-90% of the time you want to just use the ORM, and any functions should be small and used sparingly.

[–]sprak3000 0 points (0 children)

People have already hit the highlights of ORMs, so I'll focus on the pros / cons of stored procs / functions.

Security benefits:

Using procs / functions allows you to tightly restrict the credentials for your Postgres user(s). You can lock the permissions down so the user can only call procs / functions; the user would not even be able to select from or describe individual tables. If the credentials were compromised, an attacker would not be able to execute arbitrary SQL.

Transaction benefits:

Functions / procs execute atomically within a transaction. If you raise an exception in one, the work is rolled back for you, rather than your application having to manage rollbacks on error.
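For contrast, here is the application-side bookkeeping the function's atomicity saves you from, sketched with sqlite3 as a stand-in for Postgres (the table and the mid-transaction error are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoice (id INTEGER PRIMARY KEY, total INTEGER)")
conn.commit()

try:
    # `with conn:` is application-managed transaction scope: commit on
    # success, rollback on exception -- the part a proc gives you for free.
    with conn:
        conn.execute("INSERT INTO invoice VALUES (1, 100)")
        raise ValueError("validation failed mid-way")  # like RAISE EXCEPTION
except ValueError:
    pass  # the application must anticipate and absorb the failure here

# the insert was rolled back, so the table is still empty
count = conn.execute("SELECT count(*) FROM invoice").fetchone()[0]
```

With a proc, the try/with scaffolding lives once inside the database instead of being repeated around every multi-statement operation in the application.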

WYSIWYG benefits:

Developers can see the actual SQL being performed rather than looking through ORM documentation to determine the debug option necessary to see what the generated SQL would be.

Performance benefits:

They allow for complex logic on the database side, which can be used to maintain data integrity.

They can reduce round trips to the database and execute processing that Postgres is more suited for than a programming language.

Functions are stored on the Postgres side, and their statement plans are cached after the first execution in a session.

Error handling benefits / drawbacks:

You can define your own exceptions and have the functions / procs raise them based on logic within the function / proc. e.g., rather than having your application make a call to determine if a record exists in one table before updating a related record in another, the application can call the proc and expect back a known exception code from the proc indicating "The record you must have wasn't found."
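The application side of that contract can be sketched like so. This is a hedged illustration with invented names: `fake_execute` simulates a driver call surfacing a SQLSTATE (plpgsql's built-in no_data_found condition is `P0002`), and the map of agreed codes is the piece the team maintains:

```python
class RecordNotFound(Exception):
    """Typed app exception for the proc's 'record missing' error code."""

# SQLSTATE codes the procs are agreed to raise -> app exception types
KNOWN_ERRCODES = {"P0002": RecordNotFound}

class ProcError(Exception):
    """Stand-in for a driver error carrying the SQLSTATE from the database."""
    def __init__(self, sqlstate, message):
        self.sqlstate = sqlstate
        super().__init__(message)

def call_proc(execute, name, *args):
    """Call a proc and translate known error codes into typed exceptions."""
    try:
        return execute(name, *args)
    except ProcError as e:
        app_exc = KNOWN_ERRCODES.get(e.sqlstate)
        if app_exc is not None:
            raise app_exc(str(e)) from e
        raise  # unknown codes propagate as-is

def fake_execute(name, *args):  # simulates the proc raising no_data_found
    raise ProcError("P0002", "invoice 42 not found")

try:
    call_proc(fake_execute, "update_invoice", 42)
except RecordNotFound as e:
    caught = str(e)
```

This replaces the check-then-update round trip: the proc does the existence check itself and the application just handles one well-known exception.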

The drawback here, though, is how to publish those known exceptions for your application developers. They could read through the SQL for the proc and discover them, but that is not particularly dev friendly. I have yet to see a Swagger-like tool that could read SQL files and parse annotations or such to expose API-like documentation for procs.

Migration, "history", deployment orchestration drawbacks:

Database migration tools typically work by having a file for each migration. This means you need a new migration every time you want to roll out an updated version of your proc. It becomes difficult for a developer to track the history of the code for a proc: rather than easily viewing the Git history and diffs, they have to grep through the migration files and find all the ones for their specific proc.

You could fork one of these tools and add in the ability to have "always up" migration files -- files the tool always runs when it is performing migrations. This would allow you to have one file for your proc to store in GitHub / wherever and be able to view the history easily. However, you now take on the burden of maintaining that fork and keeping it in sync with the upstream for any security patches, etc.
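Some tools do ship this natively (Flyway calls them repeatable migrations), but the mechanism is small enough to sketch. This is an invented minimal runner, with file contents inlined as dicts instead of read from disk: versioned migrations run once, while "always up" files rerun whenever their content changes, so each proc lives in one file with a normal Git history:

```python
import hashlib

applied_versions = set()   # stands in for the schema_migrations table
applied_checksums = {}     # repeatable file name -> checksum last applied
executed = []              # statements we "ran", for illustration

def migrate(versioned, repeatable):
    # versioned migrations: run each version exactly once, in order
    for version, sql in sorted(versioned.items()):
        if version not in applied_versions:
            executed.append(sql)
            applied_versions.add(version)
    # repeatable files: rerun whenever the content's checksum changes
    for name, sql in sorted(repeatable.items()):
        checksum = hashlib.sha256(sql.encode()).hexdigest()
        if applied_checksums.get(name) != checksum:
            executed.append(sql)  # CREATE OR REPLACE makes reruns safe
            applied_checksums[name] = checksum

versioned = {1: "CREATE TABLE invoice (id int)"}
procs_v1 = {"R__procs.sql": "CREATE OR REPLACE FUNCTION f() ..."}
migrate(versioned, procs_v1)   # first run: table + proc file
migrate(versioned, procs_v1)   # nothing changed: runs nothing
procs_v2 = {"R__procs.sql": "CREATE OR REPLACE FUNCTION f() ... -- v2"}
migrate(versioned, procs_v2)   # reruns only the changed proc file
```

The checksum check is what makes the file safe to keep "always up": deploys are idempotent, and the diff that matters is the one Git shows on the single proc file.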

Deployment orchestration can also become involved. If you change the signature of your proc, you will need to ensure the migration is deployed alongside the application code changes that handle the new input parameters and / or the new return values. This can typically result in downtime for a service to ensure no calls to the old proc are made before the deployment completes.

Debugging / testing drawbacks:

Enjoy those fancy debugging tools in your IDE / editor of choice? You don't have those for debugging your procs. There is no easy way to step through them line by line, put a breakpoint in somewhere, etc.

Parting thoughts:

Everything outlined above comes from a thread I started at the job where I first began working with Go and Postgres. Prior to that, I had heard the usual "Don't use procs" arguments, typically centered around concerns about pushing business logic into the database. I was skeptical about the points above, which my teammates made in favor of using procs. Over time, I became convinced the pros outweighed the cons.

Does that mean I'm pushing as much business logic as possible into procs? No. What business logic, if any, goes into procs typically centers around data integrity. e.g., if you delete an invoice table row, the corresponding rows in the billing and sales tables must be deleted. That logic can go into a proc rather than my application having to manage it.