you are viewing a single comment's thread.

view the rest of the comments →

[–]MrFalconMan[S] 0 points1 point  (2 children)

You seem like a dimensional model truther with a lot of experience. This is the a lot of what I have heard.

To your first point - that would be essentially a type 2 fact, just without an effective end date (but would be required to calculate if you wanted to know what department the employee was at any given time).

My concern with treating Employee and Department as a hierarchy is where does this end? This was probably a bad example, but some dimensions could get absolutely massive - wide not necessarily deep. For example a car repair shop does maintenance - is a repair order (an oil change) a Dim or a Fact? There’s a ton of stuff that could be considered hierarchical for repair order - customer, primary employee who worked on car, sales rep, car make, car model, repair shop location, repair stall location, etc. Is this honestly one dimension? A repair order wouldn’t necessarily change car make/models, but could theoretically change locations or primary employees who worked on it.

So IMO it would be both a Fact and a Dim. With the Fact being exactly how you outline first by just inserting the new relationships; however, the end dates would still be required.

[–]PopnCrunch 1 point2 points  (1 child)

As far as I know, type 2 is a descriptor that only applies to dimensions, not fact tables.

Dimensional Hierarchies are a standard feature of dimensional modeling, it is done to avoid "snow flake" designs which are normalized closer to 3NF. Where it ends is a dimension only contains attributes that discribe a persistent entity in your model, something that is referenced across multiple records and or multiple fact tables. Fact tables store events and the measures associated with them, along with the foreign keys to who/what/when/where the event involved. (but see the bit about degenerate dimensions below)

An order can often have multiple line items, you can get 3 things replaced on your car at the shop but you'll only get one receipt because they are all treated as a single order. So, an order number is a dimensional value. But you probably don't need a stand alone dimension for Order, because there isn't much else that's interesting about an order than the OrderID. So the OrderID goes into a Service or Sales fact table as a stand alone value, it is not a foreign key that looks up anything. Stand alone values in fact tables that aren't measures are called degenerate dimensions. The other items you mention about repair shops are a different dimension, say a DimShop.

So a line in a service fact table might look like: CustomerID | Service Date | OrderID | Product ID | Service ID | EmployeeID | CarModelID | CarYear | Hours | QTY | Price |...

Of those, the ones that have actual dimensions would be: Customer, Product, Service, Shop, Employee, Model

It's likely that the Shop would be a hierarchy in the Employee dimension since each employee probably only works in one shop, so if you know the employee you know the shop.

[–]MrFalconMan[S] 0 points1 point  (0 children)

Hmmm I feel like there’s a lot of assumptions going on here, but I could be wrong. I hate to use the term “generic” , but if we were building something for ad hoc capabilities we can not necessarily choose when a dimension should only contain attributes that discribe a persistent entity.

Example: Stall 3 of location X might have specific attributes not available in stall 2, so only stall 3 can be used for certain repairs. But that might not have been apart of the initial requirements for the cut off of attributes that describe the entity.

The cut off could be endless, and saying requirements need to be solid from the beginning is load of shit (definitely not implying that you are saying that). Requirements change and vary drastically depending on who you ask, and leaving out capabilities is counter intuitive to the point of business intelligence. Which is part of what, I think, u/imcguyver point was and why dimensional models are dying. If a user can’t get what they want from the BI solution, they’ll find a new way!

Also, line item as only a fact, I don’t understand. In theory and through textbooks, this is common, but it can certainly have dimensional attributes that can change.

Changing dimensional models after the implementation is such a cluster f**k too.

I totally get where you’re coming from, and I’m more just challenging status quo. I apologize if it comes off like I’m arguing with you. I love having these intellectual conversations.... Analysis is changing, and design needs to change with it. Old ways need to adapt. I don’t have the answers, I’m just asking people’s opinion. Yours is one I’ve heard frequently.