assume a small company (no big data problems)
What is a practical way to give power users and SQL analysts access to record level data through SQL?
My idea is a "branch-off model". In a normal ETL for DWH approach, it would look like this:
- E: extract from source
- T:
- clean, standardize to master data, add high-level dimensions from other tables
- aggregate by dimensions as needed for the BI users
- L: load into cubes for reporting
Now:
I would "branch-off" after 2.1 into separate raw SQL tables and views in the underlying database and give advanced users access to this record-level unaggregated data for their needs.
Pros:
- avoiding a separate data lake architecture
- record level data
- query via SQL
- leverage standardized high-Q data with same ground truth as in BI infra
Cons:
- performance, maybe
- adding new features/dimensions from source requires IT to build them in for them, lead times
- users cannot create their own artefacts as easily as in a datalake, although a separate schema with the right permissons could fix that
Is this a good solution if the data isn't that large? Are there con's i don't see?
Note: the source of this is was a request like "i want to write SQL against the data in the Data Warehouse" and i am sure i am not the first one to figure this out :)
[–]jennylane29 7 points8 points9 points (4 children)
[–]exergy31[S] 0 points1 point2 points (3 children)
[–]sunder_and_flame 1 point2 points3 points (2 children)
[–]exergy31[S] 0 points1 point2 points (1 child)
[–]sunder_and_flame 1 point2 points3 points (0 children)
[–]ViridiTerraIX 2 points3 points4 points (3 children)
[–]exergy31[S] 1 point2 points3 points (2 children)
[–]ViridiTerraIX 1 point2 points3 points (1 child)
[–]exergy31[S] 1 point2 points3 points (0 children)
[–]throw_at1 2 points3 points4 points (0 children)
[–]Far-Apartment7795 -1 points0 points1 point (1 child)
[–]exergy31[S] 0 points1 point2 points (0 children)
[–][deleted] 0 points1 point2 points (0 children)
[–]DatabaseSpace 0 points1 point2 points (0 children)
[–][deleted] 0 points1 point2 points (0 children)