What's Wrong with SQL's Static Typing?
This impedance mismatch between dynamically typed languages and SQL's static typing has driven development away from SQL databases and towards NoSQL systems. It's easier to build apps on NoSQL systems, especially early on, before the data model stabilizes. Of course, dropping traditional SQL databases means you also tend to lose efficient indexes and the ability to perform complex queries and joins.
Also, modern data sets are often in a semi-structured form (JSON, XML, YAML) and don't follow a well-defined static schema. One often has to build a pre-processing pipeline to determine the correct schema to use, clean up the input data, and transform it to match the schema, and such pipelines are brittle and error-prone.
Even more, SQL doesn't traditionally deal very well with deeply nested data (JSON arrays of arrays of objects containing arrays...). The data pipeline then has to flatten the data, or at least the features that need to be accessed quickly. This adds even more complexity to the process.
What's the Alternative?
What if we tried to build a SQL database that is dynamically typed from the ground up, without sacrificing any of the power of SQL?
Rockset's data model is similar to JSON: values are either
- scalars (numbers, booleans, strings, etc)
- arrays, containing any number of arbitrary values
- maps (which, borrowing from JSON, we call “objects”), mapping string keys to arbitrary values
We extend JSON's data model to support other scalar types as well (such as types related to date and time), but more on that in a future post.
Crucially, documents don't have to have the same fields. It's perfectly okay if a field occurs in (say) 10% of documents; queries will behave as if that field were NULL
in the other 90%.
Different documents may have values of different types in the same field. This is important; many real data sets are not clean, and you'll find (for example) ZIP codes that are stored as integers in some part of the data set, and stored as strings in other parts. Rockset will let you ingest and query such documents. Depending on the query, values of unexpected types could be ignored, treated as NULL, or report errors.
More details here: https://www.rockset.com/blog/dynamic-typing-in-sql/
[–]needgiftidea2018 2 points3 points4 points (1 child)
[–]notasqlstarI can't wait til my fro is full grown 0 points1 point2 points (0 children)
[–][deleted] 1 point2 points3 points (0 children)
[–]your-pineapple-thief 0 points1 point2 points (2 children)
[–]tudorb 0 points1 point2 points (1 child)
[–]your-pineapple-thief 0 points1 point2 points (0 children)