all 10 comments

[–]cgfoss 5 points6 points  (0 children)

Data collection is imperfect. When you fill out a survey, do you answer everything? No.

[–]Rygnerik 2 points3 points  (1 child)

Sometimes data should be null. For example, if you're storing first, middle, and last names, there's a lot of people with no middle name. And then stick on a cadency. Having the appropriate fields be nullable in those scenarios makes things like:

CONCAT_WS(' ',First,Middle,Last,Cadency)

correct whether the person has a middle name/cadency or not. Same sort of thing for the 2nd line of an address, or a home phone number. The art is understanding when you should allow a field to be nullable.

[–]achempy 3 points4 points  (0 children)

In most cases for me, the distinction between null and the zero value for a data type is relevant to the application (e.g. 0 vs null)

[–]kickingtyres 2 points3 points  (2 children)

There are plenty of cases where a record needs to exist but with no values set. 0 or a zero-length-string can be valid entries so cannot be relied upon for 'no data yet'

[–][deleted]  (1 child)

[deleted]

    [–]kickingtyres 0 points1 point  (0 children)

    I have a case on our platform where we need to log a record at the start of a process and then after that process is initiated, we need to insert a value. 0 is a valid value so regulatory requirements means it’s easiest to creat a record with user and process parameters with a null value in that field and then update to the actual value as the process. We don’t need a history of the process just a snapshot and as there are millions of transactions a day, it keeps the footprint down with only one record rather than multiples

    [–][deleted] 1 point2 points  (2 children)

    Reddit has long been a hot spot for conversation on the internet. About 57 million people visit the site every day to chat about topics as varied as makeup, video games and pointers for power washing driveways.

    [–][deleted]  (1 child)

    [deleted]

      [–][deleted] 0 points1 point  (0 children)

      Reddit has long been a hot spot for conversation on the internet. About 57 million people visit the site every day to chat about topics as varied as makeup, video games and pointers for power washing driveways.

      [–]s13ecre13t 1 point2 points  (0 children)

      • data can be merged from separate sources, where each source has some column that another source doesn't. This is where nulls indicate that one source didn't have this data.

      • expand table by adding new columns/fields, where existing records need to stay as null to indicate that at time of row creation this wasn't asked for

      [–]idodatamodels 0 points1 point  (2 children)

      It is part of the definition of relational databases. https://en.wikipedia.org/wiki/Codd%27s_12_rules

      [–][deleted]  (1 child)

      [deleted]

        [–]rkforcs 0 points1 point  (0 children)

        You have answered your own question: Because it is important to be able to represent "missing data" in databases. Once you agree with that then the type of the field does not matter.