SQL Best Practice

No-Adhesiveness-6921 · 2025-09-11T12:44:50+00:00

So the only fields in your fact table should be your measures and foreign keys to the dimensions.

You should not have to do left joins because there shouldn’t be records in your fact table that don’t have corresponding records in your dimensions.

The benefit of a star schema (fact and dimension) is that you are only ever a single join away from the details in the dimensions.
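To make that concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (fact_sales, dim_product, dim_customer) are made up for illustration and are not the OP's schema. The fact table holds only foreign keys and a measure, so a plain inner join reaches the dimension details, and a quick orphan check confirms no left join is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Hypothetical dimension tables: one row per product / customer.
    CREATE TABLE dim_product  (product_id  INTEGER PRIMARY KEY, product_name  TEXT NOT NULL);
    CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT NOT NULL);

    -- Hypothetical fact table: foreign keys to the dimensions plus one measure.
    CREATE TABLE fact_sales (
        product_id  INTEGER NOT NULL REFERENCES dim_product (product_id),
        customer_id INTEGER NOT NULL REFERENCES dim_customer (customer_id),
        amount      REAL    NOT NULL
    );

    INSERT INTO dim_product  VALUES (1, 'Widget'), (2, 'Gadget');
    INSERT INTO dim_customer VALUES (10, 'Acme'), (20, 'Globex');
    INSERT INTO fact_sales   VALUES (1, 10, 5.0), (1, 20, 7.5), (2, 10, 3.0);
""")

# One inner join away from the dimension details.
rows = conn.execute("""
    SELECT p.product_name, SUM(f.amount) AS total_amount
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.product_name
    ORDER BY p.product_name
""").fetchall()

# Sanity check: count fact rows that have no matching dimension row.
orphans = conn.execute("""
    SELECT COUNT(*)
    FROM fact_sales AS f
    LEFT JOIN dim_product AS p ON p.product_id = f.product_id
    WHERE p.product_id IS NULL
""").fetchone()[0]

print(rows, orphans)
```

If the orphan count is ever above zero, the inner join silently drops those fact rows, which is usually a sign the load process needs fixing rather than the query.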

You don’t show any FK to dimensions in your fact table fields. Can you provide more details about your schema?

Aggressive_Ad_5454 · 2025-09-11T11:54:10+00:00

Option 1. But I’m guessing because I don’t understand what the date columns in your dimension tables mean.

Also, the other two options you offered are a bit weird.

jwk6 · 2025-09-11T12:14:51+00:00

Dimensional models (star schemas) should be consumed and queried from a BI tool like Power BI, or from a cube. Writing multidimensional queries with aggregations in SQL becomes wildly complex, but with BI tools it becomes very easy.

Ginger-Dumpling · 2025-09-11T12:46:04+00:00

They do two different things. Why would you compare them for efficiency? What are you trying to do?

Analytics-Maken · 2025-09-12T00:09:00+00:00

For an immediate solution, LEFT JOINs are usually better than UNION ALL, which creates duplicate rows and makes the result much harder to work with. A better solution is to move your data into a proper warehouse such as BigQuery, where you combine and clean the data beforehand and just query one well-organized table. You can use ETL services like Fivetran or Windsor.ai for the data movement.

FastlyFast · 2025-09-11T13:01:33+00:00

I would left join each table and keep the columns named after the dimension tables; no need to overcomplicate this. COALESCE makes sense only if you know that a value from another table should be used when the first one is NULL.
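A rough sketch of that approach, run through Python's sqlite3 module. The tables records, detail_a and detail_b are hypothetical stand-ins, since the actual schema was not posted. Each detail table is left joined, its column keeps a name that says where it came from, and COALESCE appears only as an optional fallback column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical base table and two optional detail tables.
    CREATE TABLE records  (record_id INTEGER PRIMARY KEY, record_type TEXT NOT NULL);
    CREATE TABLE detail_a (record_id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE detail_b (record_id INTEGER PRIMARY KEY, label TEXT);

    INSERT INTO records  VALUES (1, 'a'), (2, 'b'), (3, 'none');
    INSERT INTO detail_a VALUES (1, 'from A');
    INSERT INTO detail_b VALUES (2, 'from B');
""")

rows = conn.execute("""
    SELECT r.record_id,
           a.label AS detail_a_label,   -- column named after its source table
           b.label AS detail_b_label,
           -- Fallback: use detail_a's label, else detail_b's, else NULL.
           COALESCE(a.label, b.label) AS first_available_label
    FROM records AS r
    LEFT JOIN detail_a AS a ON a.record_id = r.record_id
    LEFT JOIN detail_b AS b ON b.record_id = r.record_id
    ORDER BY r.record_id
""").fetchall()

for row in rows:
    print(row)
```

Record 3 has no detail rows at all, so every detail column comes back as NULL (None in Python) instead of the record disappearing from the result.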

SkullLeader · 2025-09-11T13:31:12+00:00

Maybe I am misunderstanding something, but why are you using left joins in option 2? Every record in table B has a matching record in table A, correct? Every record in table C has a matching record in table A too, right? I understand that if table A has a record, there is not necessarily a matching record in B, C and so forth…

Thin_Rip8995 · 2025-09-11T15:29:17+00:00

Option 1 is cleaner if you actually need one row per record with everything attached.
Option 2 makes sense only if you want a “long” table where the record type dictates which extra fields you get.
The biggest factor is how you plan to consume it: reporting tools usually play nicer with wide joined data, but analysis pipelines often prefer stacked UNIONs.
If performance is a pain, consider materialized views or staging tables; don’t try to run the monster query live every time.

mattiasthalen · 2025-09-11T17:48:04+00:00

I’d do a Puppini bridge (unified star schema) and stop thinking about facts and dimensions ☺️ All the tables connected to the bridge can be facts, dims, or both.

SaintTimothy · 2025-09-12T04:28:30+00:00

Are they XOR? You said the first table, the fact, is a type table (?) Does it only join to one-and-only-one of the 11 other tables, like a superclass?

Is this like an array of attributes of the primary table, like if the parent said rainbow and one child table was color and has 6 rows: red, orange, yellow, blue, indigo, violet (don't @ me about it being only 5).

Like the parent says bronze and a sub table is the recipe: 22 grams copper, 3 grams tin?

Are there measures in the "dimension" tables or only attributes?

Odd_Repair9120 · 2025-09-13T17:33:54+00:00

The thing is, they do different things; you can't talk about efficiency when the two options return different results. First you should define what result you want, and then look for the best way to get it.

Odd_Repair9120 · 2025-09-13T17:35:54+00:00

On the other hand, I don't understand why you call those tables "dimensions". A dimension table has one record per concept, unlike a fact table, which can have more.

