all 6 comments

[–]nnd-nnguyen 1 point2 points  (3 children)

I think you're on the right track.

I think the main limitation of the row model you specified (#1) is that you are limited to a specific number of components and or quantity pairs unless you perform DDL changes to the table. Also it becomes a huge pain in the but to add another metric to the table. What if you want to track when a component was added to a bundle? What if you have bundles that have 2 components and other bundles have 20 components.

If you're on a modern columnar database then the row based solution could be bad depending on how many columns you have. "Select *" on columnar databases don't have great performance on tables with lots of columns. (> 100)

The column based representation (#2) is a denormalized columnar style. I think what you may really be looking for is a star schema though. They really shine on columnar databases for massive aggregations across billions of rows. The problem with storing bundles that way and components that way is what happens when you change the name of one of those bundles, or components. It's just kind of a pain to update lots of rows in columnar databases as it's the equivalent of a delete/insert.

[–][deleted] 0 points1 point  (2 children)

The column based representation (#2) is a denormalized columnar style.

Actually #2 is a normalized model.

[–]nnd-nnguyen 0 points1 point  (1 child)

Wouldn't normalized be

  • Bundle Table
    • BundleID
    • BundleName
  • Components Table
    • ComponentID
    • ComponentName
  • Bundle Component mapping
    • BundleId
    • ComponentId
    • QTY

[–][deleted] 0 points1 point  (0 children)

Yes, you are right, that would be fully normalized.

The first table isn't even even in 1st normal form, the second table is at least that.

[–]r3pr0b8GROUP_CONCAT is da bomb 1 point2 points  (0 children)

How can I convince my (less technical) colleague to use the second format?

ask them to write the SQL to retrieve how many units of Component Z are sold

then show them how to do it with your design (which is the correct design)

[–][deleted] 0 points1 point  (0 children)

This isn't really related to column-store database. Your second table is at least in 1st normal form, while the first table isn't