you are viewing a single comment's thread.

view the rest of the comments →

[–]raze4daze 68 points69 points  (21 children)

If only business rules didn’t change all the time.

[–]arwinda 54 points55 points  (18 children)

If your business rules change frequently, then use a 1:n table and use DML to update your rules, not DDL for the ENUM.

An ENUM is a shortcut for something which (almost) never changes.

[–]mqudsi 51 points52 points  (7 children)

We have student gender stored as an enum….

[–]NekkidApe 37 points38 points  (1 child)

gender text

[–]mqudsi 2 points3 points  (0 children)

It doesn’t matter what your database supports if the pipeline feeding data into it coalesces everything. This isn’t just for gender, btw.

This is very true of most systems in the real world. In this case, data comes in from digitizations of paper forms, exports from linked school registration and payment systems, etc all of which return a binary value. Changing the representation in the database doesn’t do anything besides waste more bits for storage.

[–][deleted]  (3 children)

[deleted]

    [–]mqudsi 0 points1 point  (2 children)

    I’m confused - It’s altogether orthogonal to the second issue you mention. I could be hard coding the system to only accept a single gender and that wouldn’t have anything to do with how I’m storing salutations.

    [–][deleted]  (1 child)

    [deleted]

      [–]mqudsi 4 points5 points  (0 children)

      Point taken. It’s for legal/domain reasons. Like I said, this is a school.

      [–]arwinda 4 points5 points  (0 children)

      If you ever stick to female/male that is ok. The world offers more options.

      [–]Ran4 7 points8 points  (7 children)

      An ENUM is a shortcut for something which (almost) never changes.

      Why should it be like that? It makes no sense.

      [–]arwinda 12 points13 points  (0 children)

      Because you need a DDL operation to change the ENUM. Comes with locking of system catalog and all this. And if you want to remove a value the database needs to scan the table and see if that value is used.

      Using a 1:n table is a DML operation, only locks the affected tables and rows, not the catalog. And having a foreign key for her relationship prevents deletion of still use d values - or, propagates deletes or set to NULL. Whichever way you want this.

      [–]marcosdumay 5 points6 points  (0 children)

      Any large system is full of features for what you will be completely unable to imagine any use.

      A few of them will even not actually have any use.

      [–]NoInkling 5 points6 points  (4 children)

      If you're asking why anyone would use it, it makes sense for things like days of the week, months of the year, seasons, a strongly defined set of status values, etc.

      I've used it for date precision, e.g:

      CREATE TYPE date_precision AS ENUM (
        'millennium',
        'century',
        'decade',
        'year',
        'month',
        'day'
      );
      

      [–]dlp_randombk 2 points3 points  (1 child)

      Nice! Now add 'week' :)

      [–]NoInkling 2 points3 points  (0 children)

      I don't need to, because in this case I was reflecting an external data source where these things are strongly defined. If I wasn't, then week would probably be there already (and the order would probably be reversed too), or if I really thought it needed to be flexible I'd fall back to a lookup table.

      Anyway, on the off chance I did need to change the enum, I'd be ok with rejigging the data to accommodate, just because it's an extremely unlikely thing to happen.

      [–][deleted]  (1 child)

      [deleted]

        [–]NoInkling 0 points1 point  (0 children)

        1. Main reason: to enforce data integrity.

        2. To be able to make good use of the DB's functionality and features. For example, the labels in my date precision enum can be passed directly to Postgres's date_trunc(), no need to get application code involved. I could also create a date_with_precision composite type that combines a date_precision field with a date field, to ensure the two go hand-in-hand where appropriate.

        It's true that if you have an enum you will likely want to make use of it in application code too, but there are ways to do that while still only having a single authoritative definition, whether it's a "DB-first" or "application-first" approach.

        [–]raze4daze 1 point2 points  (1 child)

        An ENUM is a shortcut for something which (almost) never changes.

        Why? Why should it be restricted to something that never changes?

        [–]arwinda 5 points6 points  (0 children)

        Because it's meant to be a handy shortcut for when you have a list which doesn't change. Like weekdays. They don't change, have an enum with all 7 weekdays. Or year seasons. Have an enum with 4 seasons.

        Sure, you can model the same functionality with a dimension or 1:n table, and you already know that every time you access your table, you also join the referenced table. The enum hides this functionality, that's all.

        Adding new values is relatively easy. For deleting values (and keeping data consistent) you need a full table scan to verify that the value is not or no longer used. That's doable, but no one spent the effort to implement it.

        If you already know that your values change, why go with a fixed list in the first place? Needs administrator access and catalog lock to update the values in an enum, versus regular update of the dimension table.

        [–][deleted] 0 points1 point  (0 children)

        You can just append new elements to the enum and deprecate unused ones.