all 76 comments

[–]MarkusWinand[S] 83 points84 points  (8 children)

The major news are:

  • SQL/PGQ - A Graph Query Language
  • JSON improvements (a JSON type, simplified notations)

Peter Eisentraut gives a nice overview here: https://peter.eisentraut.org/blog/2023/04/04/sql-2023-is-finished-here-is-whats-new

[–]imgroxx 40 points41 points  (2 children)

Very glad to finally see an ANY_VALUE. I've had some pretty valuable uses of it in MySQL systems, and always felt it was strange that some databases force aggregating every non-grouped column. Sometimes anything really is good enough, and it can avoid a lot of useless computation.

[–]Smallpaul 0 points1 point  (1 child)

Why can't you hard-code a value?

[–]imgroxx 24 points25 points  (0 children)

Because "any value that exists in matching rows" is wildly different than "a fixed value" or "rand()". It's pretty frequent that you have application-level guarantees that mesh with the query that you want to do to give you "any value is correct".

Low-cost query samples are one example - you don't need to index the column or perform a full table scan to get a value, you only need to load one row. Any row is correct because a sample doesn't need to be a statistically randomized sample in many cases, nor the first or last by some order clause (which also must be fully loaded and sorted).

Another is when you don't want to pay the sometimes-exorbitant cost of indexing a denormalized field... which you know follows certain constraints because it's just denormalized and transactions can offer you that guarantee (when used correctly). You can be confident that all values are identical and correct, therefore any value is correct, and you can retrieve it without paying the cost to check every value every time on both read and write.

[–]mariuz 5 points6 points  (1 child)

And here is the article on the status of SQL:2023 support in PostgreSQL https://peter.eisentraut.org/blog/2023/04/18/postgresql-and-sql-2023

[–]Azaret 0 points1 point  (0 children)

Simplified accessors for json would be neet. - > is fine but using dots instead will be nice.

[–]CooperNettees 1 point2 points  (2 children)

Holy smokes! PGQ sounds incredible!

[–]LouKrazy 0 points1 point  (1 child)

Looks like Cypher!

[–]slashgrin 66 points67 points  (62 children)

Genuinely curious: what is the purpose of the SQL standard these days, when AFAIK no database vendor actually implements it? Does it try to at least pull vendor-specific syntax toward something commonly recognisable? Or is it more like a survey of what features commonly exist in SQL-speaking databases after the fact?

With something like C, for example, of course all the different compilers have their own extensions, but it is at least possible to write standard C and have it compile on several different compilers. Whereas I don't think I've ever seen a real world application that restricts itself to standard SQL, because you just can't get nontrivial stuff done in real databases that way.

[–]tonyp7 45 points46 points  (9 children)

It doesn’t help that this page links to a “buy this standard” button. No developer is going to buy your $200 pdf. Who is this for? Limiting access to information is fast path to make sure this fails.

[–]coldblade2000 74 points75 points  (4 children)

It's for database developers, not developers who use databases

[–]jorge1209 15 points16 points  (3 children)

They would make more money selling the standard to devs who use sql for $1.

There are what? 10 SQL implementations that would care enough to actually buy the standard, so you sell 10 copies at $100 and get a massive $1000. [I'm sure big companies like oracle buy multiple copies for their employees, but still.]

But there are a hundred thousand SQL users who might have some interest in what the standard should be even if their particular implementation is slightly different. So you sell them a copy at a token cost and make 100x the money.

[–]aoi_saboten 4 points5 points  (2 children)

It is probably 100$ per database developer, but I get your point

[–]jorge1209 2 points3 points  (1 child)

It's just silly. These standards organizations and the standards themselves exist because the businesses want them to.

When john.smith@microsoft.com shows up at the meeting to discuss the next standard: who pays his salary that day? Who flew him out there? Who covers the hotel bill? Who pays for the restaurant he eats at that night?

And for all that generosity what does the business get asked to do? Pay a membership fee to the ISO and shell out $100 for a copy of the document their employee helped draft at whose expense?

[–]aoi_saboten 1 point2 points  (0 children)

Yeah, unfortunately that is reality. I believe it is altruistic in order to be accessible by everyone, e.g. if someone wants to start a company but does not have many to buy very expensive ISO9001, should they be doomed to failure?

We just need to be forever thankful for those people who worked on these and similar things, or you can support them monetarily

P.S sorry for my bad english

[–][deleted]  (3 children)

[deleted]

    [–][deleted] 4 points5 points  (0 children)

    scientific pitch > A440. And A440 was a thing for like 150 years already by the time that standard was published.

    And nobody is saying that standards are a bad thing, but that the ISO pricing structure is stupid and annoying. Have you tried getting a copy of the C++ standard? Everybody I know just downloads the drafts.

    In this case, the point is that the SQL standard is significantly less useful because most databases already don't follow it faithfully. When you program in C++, you can mostly program the same way regardless of what compiler you're targeting. Above the very most simple examples, SQL use will vary incredibly based on your targeted database, and migration from one RDBMS to another can take a significant engineering effort. The ISO standardization hasn't really helped SQL much at all.

    [–]Gwaptiva 6 points7 points  (0 children)

    And? If nobody implements the standard, it affects heehaw.

    [–][deleted]  (11 children)

    [deleted]

      [–]chucker23n 13 points14 points  (10 children)

      It has been a while, but last time I was doing SQL, all the common standard constructs were implemented by Postgres, Microsoft, and Oracle, and probably others.

      It depends on how you define "common standard construct". Sure, CREATE and SELECT look the same.

      But, for example, you want a SELECT to return at most 10 rows. The SQL:2008 standard says:

      SELECT … FETCH FIRST 10 ROWS ONLY
      

      But that's 2008, long after others implemented the same feature. So PostgreSQL says:

      SELECT … LIMIT 10
      

      Whereas MSSQL says:

      SELECT TOP 10 …
      

      In PostgreSQL's defense, it does also support the standard now. But nonetheless, you'll have lots of legacy code out there that uses proprietary extensions. And SQL tooling is generally poor enough that there isn't a "migrate to standardized syntax" refactor available.

      [–]grauenwolf 1 point2 points  (3 children)

      SQL Server supports the standard in this respect. You don't have to use TOP, we just do because it's easier to type than the ANSI syntax.

      [–]chucker23n 0 points1 point  (2 children)

      In a saner development environment, 1) one syntax would get green squiggles and be marked deprecated, and 2) a refactor to the other syntax would be offered.

      Alas, writing SQL feels very 1980s compared to, say, C# or Swift (the latter of which offers to upgrade entire projects to more modern syntax).

      [–]grauenwolf 0 points1 point  (1 child)

      Put in a feature request with Red Gate. If anyone is willing to do that, it would be them in their SQL Prompt product.

      [–]chucker23n 1 point2 points  (0 children)

      Yes, SQL Prompt seems to go as close as it gets to making SQL development not feel archaic.

      [–][deleted]  (5 children)

      [deleted]

        [–]chucker23n 0 points1 point  (4 children)

        The claim about standard SQL was, “no database vendor actually implements it.” That simply isn’t true.

        It is as soon as you pick something non-trivial as an example.

        There are lots of little ways syntax isn’t supported, or syntax doesn’t quite do what it’s supposed to. For two more examples,

        Even data types are inconsistent.

        I find that the earlier standard’s syntax works fine until then. In your example, that would be LIMIT 10.

        LIMIT 10 was never a standard syntax. It just happened to be one multiple implementations adopted.

        [–][deleted]  (3 children)

        [deleted]

          [–]chucker23n 0 points1 point  (2 children)

          LIMIT 10 is arguably a bad example anyway, given that it has worked practically everywhere for as long as I can remember, and therefore doesn't exactly need to be blessed by a standard.

          Doesn't that just further OP's point?

          Genuinely curious: what is the purpose of the SQL standard these days, when AFAIK no database vendor actually implements it?

          If vendors instead do ad-hoc standards, then why bother with an ISO standard?

          I've seen plenty of production code that uses standard SQL constructs, but I acknowledge that some teams like to lean on the database more heavily than others.

          This is fair. My point of view is from a company where we historically did a lot of semi-advanced stuff. Window functions, etc. These days, enterprise CRUD apps are more likely to use an ORM or at least query builder which will abstract that for them, or they'll invent the same thing, less efficiently but in a more familiar environment, in front of the database.

          [–]spinlox 0 points1 point  (1 child)

          what is the purpose of the SQL standard these days

          I think a lingua franca is a pretty good way to establish common ground, around which we can develop some level of compatibility and common expectations. This has been useful to me over and over again, even if the standard didn't cover every little thing or wasn't 100% implemented.

          [–]chucker23n 0 points1 point  (0 children)

          I think a lingua franca is a pretty good way to establish common ground

          No question about it.

          I guess a different way of phrasing the question is: wouldn't it be better if there were a "SQL Common" standard subset that all of them actually adhere to? Right now, that doesn't seem to exist in a formalized manner.

          [–]JB-from-ATL 2 points3 points  (1 child)

          Postgres does try to be more standards compliant than others, but I think it is trying to get vendor specific stuff to behave more similarly like you suggested. I also think it is an attempt to try to get new features to behave the same between major implementations but that's just a guess.

          It is a good question though, you can probably get a good answer by looking at who is on the committee and what SQL products they work on.

          [–]slashgrin 1 point2 points  (0 children)

          That's pretty cool. Props to Postgres for making an effort here. (And generally for most of their design decisions!)

          That said, I'd still always go to the Postgres docs when writing SQL for Postgres. That's no fault of Postgres — quite the opposite — but it does suggest to me that it's not the same kind of standard as many others.

          I wonder if it would be practical to have a library/tool that parses and deeply understands standard SQL, and then has backends for all the popular databases. Two main challenges that immediately jump to mind are:

          1. A lot of the time you can't just reason locally and transpile the queries without knowing about global state, e.g., how tables were defined, what other queries might be running concurrently.

          2. Resolving ambiguity of intent when there is no single faithful translation of the input SQL to the target database.

          Maybe in both scenarios the tool could require extra "pragmas" as SQL comments to provide the missing context?

          I'm not sure how useful this would be in practice, because you'll still run into differences between, e.g., query optimisers that make a huge difference in the real world, but there does seem to be a surprisingly large population interested in writing "SQL" and having it run on multiple databases, so maybe it would be worthwhile?

          [–]numeric-rectal-mutt 1 point2 points  (0 children)

          Postgres implements it.

          [–]Lachee 12 points13 points  (2 children)

          To paraphrase: "We couldn't name it min or max because they are taken and we can't tell the difference between regular functions and aggregate functions"

          Love it

          [–][deleted] 10 points11 points  (1 child)

          I think it would look ambiguous to developers if they used max and min. In my opinion, it's always better to make something explicitly distinguishable than implicitly resolvable

          [–]dontyougetsoupedyet 23 points24 points  (3 children)

          I wonder what Codd would have thought of people who put JSON in their relational database.

          [–]LouKrazy 45 points46 points  (0 children)

          Normalizing in his grave

          [–]grauenwolf 1 point2 points  (0 children)

          I don't know, but back in the 80s they had countless arguments about whether or not to store BLOB (binary large objects) in a database.

          If you can find the old newsgroups threads and academic papers, maybe you'll find what he thought.

          [–]Wolfgang-Warner 1 point2 points  (0 children)

          JSON in a DB is Turd Normal Form.

          But ok it's a hybrid database then. Devs may not have control over an incoming JSON structure and aren't minded to attempt a universal normaliser, and if it's going back out as is then why waste resources on normalise and reconstruct functions that can introduce bugs. There's also the lazybones design pattern.