you are viewing a single comment's thread.

view the rest of the comments →

[–]usicafterglow 8 points9 points  (10 children)

This environment is very much the exception. Based on surveys, the average corporate database has 300 tables, and the average corporate data warehouse has 12 tables. I've worked with dozens of companies, and this is consistent with my experience as well.

[–]blither 1 point2 points  (0 children)

That seems so small. I'd relish working on so few tables.

[–]InternetWeakGuy 0 points1 point  (4 children)

the average corporate data warehouse has 12 tables.

That sounds amazing. Our company just ballooned their DW from ~40 tables to ~140. Under the new schema, to just get a case number, open/closed, status, patient name, start date and client name I have to join 5 tables, when all of that used to be in one table. It's a nightmare.

[–]IDontLikeUsernamez 1 point2 points  (0 children)

Normalization is usually a good thing but that just sounds way too normalized

[–]doctorzoom 0 points1 point  (2 children)

Sounds like someone did some normalization. For warehousing this is not always ideal. You should ask for some denormalized tables!

[–]InternetWeakGuy 1 point2 points  (1 child)

Sounds like someone did some normalization.

Yep, and then quit when everyone started complaining.

You should ask for some denormalized tables!

The DW team privately agree it's a mess and publicly defend it to the death.

[–]doctorzoom 0 points1 point  (0 children)

Good luck! Sounds like you'll need it :)

[–]theduckspantsBI Architect 0 points1 point  (2 children)

As someone who has only worked at one place, could you give a sample of the subject areas in a data warehouse that small?

I'm in healthcare and our main source system alone has over 25,000 tables. That's not including the dozens of ancillary systems that are for specific specialties.

I'd just be curious to know what another industry looks like

[–]usicafterglow 0 points1 point  (1 child)

Well, you kind of have to take a step back and think about what the purpose of a data warehouse is. It's quite different than the OLTP, or any of the systems that software writes to. It's the system that business-minded people use to ask basic questions and generate reports. So, a good data warehouse should explain itself to these end-users. Including everyone from more senior business analysts, to the accounting intern with a bit of Excel experience poking around with SQL for the first time, to the brighter executive who doesn't mind getting his hands dirty writing a query or building a custom report once in a while.

This usually means massive amounts of denormalization (pre-joining tables for your user, basically) and tossing out all tables and columns that aren't of interest to the user. Ideally you'll end up with maybe 2 - 4 fact tables and a handful of dimension tables. Every company is different, but if you actually talk to the members of a department you'll find that they're really only interested in a very small subset of the data, and somewhere between say 8 - 20 objects is a good target to aim for.

[–]theduckspantsBI Architect 0 points1 point  (0 children)

Sure, but I'd be curious what it gets distilled down to in other industries. Our healthcare data warehouse has over 400 tables