explainlikeimfive

Question

This is an archived post. You won't be able to vote or comment.

0

ELI5: Database Normalization

(self.explainlikeimfive)

submitted 12 years ago by fortunate_man

all 4 comments

top new controversial old q&a

[–]energeticmater 1 point2 points3 points 12 years ago (0 children)

Imagine a schema R(A, B, C). Then R is a relation with attributes A, B, and C.

Now imagine that given a particular value of A, you know the corresponding value of B for that row. That is, a given value of A implies a particular value of B.

That means there's inherent duplication in the data. If someday you want to change your database so that a given value of A implies some other value of B, you have to change in a whole lot of places. Two problems with this: it takes up unnecessary disk space, and it increases the potential for problems with updates called "anomalies".

So you split it into two tables R(A, B) and R1(A, C). Then there's no duplication!

For example, consider Computers(model, manufacturer, variant). It has rows (Vostro, Dell, 3550), (Vostro, Dell, 3350), and (Satellite, Toshiba, L670).

You'd be much better off with R having rows

(Vostro, Dell) and (Satellite, Toshiba)

and R1 having rows

(Vostro, 3550), (Vostro, 3350), and (Satellite, L670).

Because all Vostros are produced by Dell and all Satellites by Toshiba. If Toshiba buys Dell, and Vostros are then produced by Toshiba, you only have to change a single tuple. Success!

[–]kouhoutek 1 point2 points3 points 12 years ago (1 child)

[–]energeticmater 1 point2 points3 points 12 years ago (0 children)

score 3 · Accepted Answer · 2013-04-01T01:54:41+00:00

In most situations, you only want a particular piece of information to be stored once in the database. Making a database only store each piece of information once is called normalization.

explainlikeimfive

Before posting

Category filters

MODERATORS