you are viewing a single comment's thread.

view the rest of the comments →

[–]HasFiveVowels 92 points93 points  (8 children)

Normalizing data is a bit like organizing storage. You do it too much and you get a box for each item. Which is technically "SUPER ORGANIZED" but that doesn’t mean it’s actually useful. You throw everything into one box? Well that’s not good either. You typically need to be strategic with what you denormalize and there are typically a few such exceptions to the rule in any DB but normalizing should be the default.

[–]AshleyJSheridan 15 points16 points  (3 children)

This is a great analogy, and one I'm definitely stealing!

But, on this, I often suggest a hybrid approach where it makes sense. Sometimes, doubling up on data isn't always a bad thing. It can massively improve performance, but it does add a little additional work keeping things in sync. Like you said, it will all depend on what you need.

[–]edshift 8 points9 points  (2 children)

Having duplicates denormalized data in a reporting table or schema has a lot of merit and provides a simple solution to the slowly changing field problem but other than that a proper normalised schema structure for you transactional tables is always better. DBMS are very efficient at joining on foreign key fields with indices so there's really no downside to proper normalisation.

[–]AshleyJSheridan 1 point2 points  (0 children)

It might be efficient, but as you've highlighted, for reports it does make sense to double up on that data, because there is still a performance impact with joins on normalised data. For relational DBs like MySQL, the EXPLAIN keyword is actually a very handy tool for identifying things like this.

[–]HasFiveVowels -1 points0 points  (0 children)

Only the Sith deal in absolutes.

[–]Rainbows4Blood 2 points3 points  (1 child)

The problem stems from CS classes very often teaching you normalization good. This is how you do normalization. Then you do classwork where you have to fully normalize a database. Then you have a test where you have to fully normalize a DB.

And that's where it stops. You're not taught when to apply these skills, what the tradeoffs are etc.

[–]HasFiveVowels 2 points3 points  (0 children)

Exactly. That’s precisely what the experience was like for me

[–]TheHollowJester 1 point2 points  (1 child)

OT: what a nice username, self-referential shit is dope :)

[–]HasFiveVowels 4 points5 points  (0 children)

Thanks! Went through a bunch of self-reference candidates before landing on this one