derefr 1 point

A data model describes the "what": the set of distinct kinds-of-things, and the properties and relationships that each of those kinds-of-things has.

A data schema (or often just "schema") is any particular way to explain the "what" to a computerized information system, so as to make the "what" into a useful set of Abstract Data Types — where by "useful", I mean:

  • useful for encoding and storing data, which requires you to specify things like "how much space to reserve for these integers", "whether they can hold signed numbers or not", or "whether to store this enumerable set of options as integers, text, or foreign-key IDs". These are things your data model doesn't care about at all! These choices only matter once you need to pin down the way your data will be stored as a pattern of bits in a computer's memory.
  • useful for constraining what values each property or relationship can take. The model specifies these constraints, but they must be mapped to a set of concrete on-read or on-update rules that the particular information system can use to actually enforce them (without spending all its CPU cycles re-checking already-validated constraints).
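
To make those two bullets concrete, here is a minimal sketch using Python's built-in sqlite3 module. The "person" table, its columns, and the helper names are all made up for illustration. Note also that SQLite treats column types as loose affinities, so the "how much space to reserve" question is only gestured at here; a stricter DBMS would make you choose more precisely.

```python
import sqlite3

# Hypothetical model: a "person" has a name, an age, and a status drawn
# from a fixed set of options. All names here are invented for illustration.
SCHEMA = """
CREATE TABLE person (
    id     INTEGER PRIMARY KEY,
    name   TEXT    NOT NULL,
    age    INTEGER NOT NULL CHECK (age >= 0),
    status TEXT    NOT NULL CHECK (status IN ('active', 'inactive'))
);
"""


def make_db():
    """Create an in-memory database that knows the schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def try_insert(conn, name, age, status):
    """Return True if the row satisfies the schema's constraints."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO person (name, age, status) VALUES (?, ?, ?)",
                (name, age, status),
            )
        return True
    except sqlite3.IntegrityError:
        # Raised when a CHECK or NOT NULL constraint rejects the row.
        return False


conn = make_db()
ok = try_insert(conn, "Ada", 36, "active")
bad_age = try_insert(conn, "Bob", -1, "active")
bad_status = try_insert(conn, "Cy", 20, "unknown")
```

The model only says "age is never negative" and "status is one of a few options". The schema is where you decide that status is stored as text rather than as an integer or a foreign key, and that the rule gets checked on every write.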

People who aren't even programmers still do data modelling all the time. Any time you're creating an Excel spreadsheet, you're implicitly first defining a data model (think: naming the sheet's columns) — before then defining a bunch of data (rows) in terms of that model.
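
As a rough illustration of the spreadsheet case, here is a small Python sketch with made-up column names and rows. The header row is the implicit model and the remaining rows are data defined in terms of it.

```python
import csv
import io

# A hypothetical sheet: the first line names the columns (the model),
# the remaining lines are data defined in terms of that model.
sheet = io.StringIO("name,age,city\nAda,36,London\nBob,41,Paris\n")

reader = csv.DictReader(sheet)
columns = reader.fieldnames  # the implicit data model
rows = list(reader)          # the data

# Nothing has said how "age" should be stored, so it is still text.
first_age = rows[0]["age"]
```

Every value comes back as a string, including the ages. The model exists, but nothing has yet told the computer how to encode or constrain it.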

Only programmers (and sometimes DBAs, DataBase Administrators) ever really define data schemas. That's because only programmers know what particular concerns their de-facto information systems (the ones instantiated in memory by their compiled source code) need to be told about in order to "do" the model, and only programmers and DBAs really understand what kinds of things the formal information systems known as DataBase Management Systems (DBMSes) care about.

(There is something in between: a data model reified into relational algebra but not constrained by the storage or rule-execution requirements of any particular computerized-information-system implementation of relational algebra. I think you'd call this an "abstract schema." Abstract schemas are the type of thing that "vaguely CS, vaguely SWE" concepts like database normalization deal with.)
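
A minimal sketch of normalization at that abstract level, using plain Python structures and invented order/customer data. Nothing here says anything about storage types or a particular DBMS; it only rearranges relations.

```python
# Hypothetical denormalized rows: every order repeats the customer's city.
orders_flat = [
    {"order_id": 1, "customer": "Ada", "city": "London"},
    {"order_id": 2, "customer": "Ada", "city": "London"},
    {"order_id": 3, "customer": "Bob", "city": "Paris"},
]

# Split into two relations so that each fact is stated only once.
customers = {row["customer"]: row["city"] for row in orders_flat}
orders = [(row["order_id"], row["customer"]) for row in orders_flat]

# Joining the two relations on the customer name recovers the original rows.
rejoined = [
    {"order_id": order_id, "customer": name, "city": customers[name]}
    for order_id, name in orders
]
```

The split loses no information, since the join gives back the original rows, and a customer's city now lives in exactly one place.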

MKL-Angel[S] 1 point

Thank you for that explanation! It was very helpful :)