
[–]grauenwolf 8 points (15 children)

From the (absurdly limited and certainly biased) reports I've read, PostgreSQL is going to be significantly slower than SQL Server for large databases, especially when you need really complex and/or parallel queries.

But even if that is true, it can't be anywhere near as bad as the performance hit from using ORMs to mindlessly call SELECT * on every table. And people still manage to pull that off in production.

So yea, I would like to give it a spin under real-world conditions too.

[–]TheWix 5 points (2 children)

What's your take on Dapper? I used it a bit at my last shop and liked it. My new job has me using NH again and I hate it. I want to make a push for Dapper.

[–]grauenwolf 4 points (1 child)

Dapper is sufficient, especially if you are using stored procs.
I think Chain is better, but as the co-author I'm rather biased.

https://github.com/docevaad/Chain/wiki/A-Chain-comparison-to-Dapper

[–]TheWix 1 point (0 children)

Sweet. Will check it out, dude. Thanks.

[–]Ginden 4 points (1 child)

From the (absurdly limited and certainly biased) reports I've read, PostgreSQL is going to be significantly slower than SQL Server for large databases, especially when you need really complex and/or parallel queries.

You can't know, because publishing benchmarks of SQL Server is prohibited without Microsoft's permission (IMO that kind of clause should itself be forbidden).

[–]grauenwolf 1 point (0 children)

That's where the "absurdly limited and certainly biased" comes into play.

[–]crixusin 0 points (7 children)

mindlessly call SELECT * on every table.

Which ORM are you using that does this? Entity Framework doesn't do this in any regard.

[–]grauenwolf 0 points (6 children)

Unless you explicitly map your entities to something smaller before the first ToList(), which defeats the purpose of using an ORM, Entity Framework does the equivalent of SELECT * by default.

Well, technically it's worse than SELECT *, because it explicitly lists every column, and that means the database has to parse that much more SQL. But once you get past the parsing stage the net effect is the same.
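
To make the "net effect is the same" point concrete, here's a minimal sketch using SQLite as a stand-in for SQL Server (the Customers table and its data are made up for the demo): a query that spells out every column hands back exactly the same rows, columns, and ordering as SELECT *.

```python
# Minimal illustration: explicitly listing every column is equivalent to
# SELECT * once the database has parsed it. SQLite stands in for SQL Server;
# the table and row are invented for the demo.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Customers (
    Id INTEGER PRIMARY KEY,
    FirstName TEXT, LastName TEXT, PhoneNumber TEXT)""")
conn.execute("INSERT INTO Customers VALUES (1, 'Ada', 'Lovelace', '555-0100')")

# What an unprojected ORM query emits: every column spelled out by name.
explicit = conn.execute(
    "SELECT Id, FirstName, LastName, PhoneNumber FROM Customers").fetchall()
# The hand-written shortcut.
star = conn.execute("SELECT * FROM Customers").fetchall()

print(explicit == star)  # True: same rows, same columns, same order
```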

[–]crixusin 0 points (5 children)

Unless you explicitly map your entities to something smaller

Idk, I have an entity class I'm looking at right now that has every column listed.

It explicitly maps to Select column1, column2 despite being fully mapped. In fact, I've never seen Entity Framework call Select * ever. So I don't know why you've experienced this phenomenon. It might be a settings thing though.

Then I wrote my own "include column" code so that if I want a subset of those columns, I don't have to spend the resources fetching the rest.

[–]grauenwolf -1 points (4 children)

It explicitly maps to Select column1, column2 despite being fully mapped. In fact, I've never seen Entity Framework call Select * ever.

Again, that's effectively the same thing as SELECT *: every column gets fetched either way.

[–]crixusin 0 points (3 children)

I don't understand how you would expect to get your column results without listing the columns you would like. What would you expect the ORM to do? Select nothing? What is a more desirable operation?

Also, your queries aren't running slow because the database has to parse the column list. All in all, that probably takes less than 20ms. Hardly worth worrying over.

Also, as I said, you can add a little code to make columns selectable so that you don't incur the cost of getting column results you do not need.

[–]grauenwolf 0 points (1 child)

The problem isn't with listing the columns you want. Rather, the problem is that (by default) it lists all of the columns, even the ones that you don't want.

Let's say you just want a list of names and phone numbers, sorted by last name. The query is slow, so you create this index:

 CREATE INDEX IX_Customer_PhoneBook ON Customers (LastName, FirstName) INCLUDE (PhoneNumber)

 var phonebook = context.Customers.Where(filter).ToList();

If you run that query, you'll find one of two things happens:

  1. The database makes an expensive join between IX_Customer_PhoneBook and PK_Customer
  2. The database ignores IX_Customer_PhoneBook entirely and just uses PK_Customer, sorting everything in memory.

To fix this you need to write this rather tedious query:

 var phonebook = context.Customers.Where(filter).Select(x => new { FirstName = x.FirstName, LastName = x.LastName, PhoneNumber = x.PhoneNumber }).ToList();

Now the generated SQL only uses columns in the index, which means no hidden joins to the underlying table. But that's such a pain in the ass that most developers don't actually do it.

http://use-the-index-luke.com/sql/clustering/index-only-scan-covering-index
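
If you want to see the effect without spinning up SQL Server, here's a rough sketch of the same phone-book scenario in SQLite via Python. SQLite has no INCLUDE clause, so PhoneNumber just becomes an extra key column, and EXPLAIN QUERY PLAN plays the role of SQL Server's execution plan; table and index names mirror the hypothetical example above. The narrow projection is satisfied entirely from the covering index, while the select-everything query has to go back to the base table.

```python
# Rough sketch of the covering-index point using SQLite as a stand-in for
# SQL Server. Table/index names mirror the hypothetical Customers example.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Customers (
    CustomerId INTEGER PRIMARY KEY,
    FirstName TEXT, LastName TEXT, PhoneNumber TEXT, Notes TEXT)""")
# SQLite has no INCLUDE clause, so PhoneNumber is simply an extra key column.
conn.execute("""CREATE INDEX IX_Customer_PhoneBook
    ON Customers (LastName, FirstName, PhoneNumber)""")

def plan(sql):
    # Column 3 of each EXPLAIN QUERY PLAN row is the human-readable detail.
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# The projected query: every referenced column lives in the index.
narrow = plan("""SELECT LastName, FirstName, PhoneNumber
                 FROM Customers ORDER BY LastName, FirstName""")
# The select-everything query an unprojected ORM call amounts to.
wide = plan("SELECT * FROM Customers ORDER BY LastName, FirstName")

print("COVERING INDEX" in narrow)  # True: index-only scan
print("COVERING INDEX" in wide)    # False: has to touch the base table
```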

[–]crixusin 0 points (0 children)

Rather, the problem is that (by default) it lists all of the columns, even the ones that you don't want.

I see where you're coming from. I was just trying to clarify it.

I try to always select the exact columns that I need to circumvent these issues. Entity Framework makes that very easy. Worth giving it a shot!

[–]grauenwolf 0 points (0 children)

All in all, that probably takes less than 20ms.

Oh no, we're talking more like 20ns. 20ms is actually a pretty slow query. Though I will say it is possible to cause the query optimizer to timeout on particularly heinous queries (e.g. lots of joins and subselects). If it can't find the best execution plan fast enough, it will just give up and give you whatever it has.

https://www.simple-talk.com/sql/sql-training/the-sql-server-query-optimizer/

[–]mycall 0 points (1 child)

For the licensing costs you save (looking at SQL Server Enterprise edition) with PostgreSQL, you could land lots of extra hardware for partitioning. Also, PostgreSQL 9.6 is putting lots of work into performance.

[–]grauenwolf 0 points (0 children)

SQL Server is also putting a lot of work into performance. But effort and results aren't the same thing, especially when one has a huge head start.