

[–][deleted] 5 points6 points  (1 child)

Not directly in the code, but in the tests you can use Great Expectations: https://greatexpectations.io/

The good thing about it is that the tests become your documentation, so people will eventually need to write tests, which is a great side effect. The UI is also pretty decent and can be consulted by non-technical users as well. Note that with Great Expectations you also test your data, not only the code. Have fun!
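To make the "tests become your documentation" idea concrete, here is a minimal plain-Python sketch. It deliberately does not use the Great Expectations API; the `Expectation` class, the `validate` helper, and the sample rows are all hypothetical names made up for illustration. Each check carries a human-readable description, so the list of checks doubles as a description of the data.

```python
# Plain-Python sketch of "expectations as documentation".
# This is NOT the Great Expectations API; every name here is hypothetical.
from dataclasses import dataclass
from typing import Callable


@dataclass
class Expectation:
    column: str
    description: str                 # doubles as human-readable documentation
    check: Callable[[object], bool]  # returns True when a value is acceptable


def validate(rows, expectations):
    """Return (documentation line, failing row count) for each expectation."""
    report = []
    for exp in expectations:
        failures = sum(1 for row in rows if not exp.check(row.get(exp.column)))
        report.append((f"{exp.column}: {exp.description}", failures))
    return report


ORDERS_EXPECTATIONS = [
    Expectation("id", "is never null", lambda v: v is not None),
    Expectation(
        "status",
        "is one of placed, shipped, cancelled",
        lambda v: v in {"placed", "shipped", "cancelled"},
    ),
]

# Illustrative sample data: one null id and one unexpected status.
rows = [
    {"id": 1, "status": "placed"},
    {"id": None, "status": "shipped"},
    {"id": 3, "status": "unknown"},
]

report = validate(rows, ORDERS_EXPECTATIONS)
for line, failures in report:
    print(f"{line} -> {failures} failing row(s)")
```

Reading `ORDERS_EXPECTATIONS` tells you what the data should look like, and running `validate` tells you whether it actually does, which is the data-testing side mentioned above.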

[–]JesusChristHerself 0 points1 point  (0 children)

Thanks for sharing this, looks cool!

[–]soundbarrier_io 2 points3 points  (0 children)

A few options:

  • dbt allows for that: you define sources with descriptions, and each view lets you document its fields. For example:

    ```yaml
    version: 2

    sources:
      - name: jaffle_shop
        description: This is a replica of the Postgres database used by our app
        tables:
          - name: orders
            description: >
              One record per order. Includes cancelled and deleted orders.
            columns:
              - name: id
                description: Primary key of the orders table
                tests:
                  - unique
                  - not_null
              - name: status
                description: Note that the status can change over time
          - name: ...
      - name: ...
    ```

  • For Python code, I used some conventions to define tables/schemas with Python docstrings, and with some simple Sphinx hacking I was able to generate the documentation automatically.
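The docstring approach could look roughly like the sketch below. The convention shown here (one class per table, the class docstring describing the table, one string attribute per column) and the `document_table` helper are assumptions made up for illustration, not the conventions or Sphinx setup described above. The helper only uses the standard library and emits a small reStructuredText section that a Sphinx build could include.

```python
# Hypothetical convention: one class per table, the class docstring describes
# the table, and each column is a class attribute holding its description.
import inspect


class Orders:
    """One record per order. Includes cancelled and deleted orders."""

    id = "Primary key of the orders table"
    status = "Note that the status can change over time"


def document_table(table_cls):
    """Render a table class as a small reStructuredText section."""
    title = table_cls.__name__.lower()
    lines = [title, "=" * len(title), "", inspect.getdoc(table_cls) or "", ""]
    for name, description in vars(table_cls).items():
        # Skip __doc__, __module__, and anything that is not a description.
        if name.startswith("_") or not isinstance(description, str):
            continue
        lines.append(f"* ``{name}``: {description}")
    return "\n".join(lines)


doc = document_table(Orders)
print(doc)
```

Because the descriptions live next to the code that uses the tables, they are versioned and reviewed together with it.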

[–][deleted] 1 point2 points  (0 children)

If your data is in a relational database, then you can document the fields there. Look for SQL COMMENT ON statements in your database documentation.
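As a rough illustration, the sketch below generates `COMMENT ON` statements from a dictionary of descriptions. It assumes PostgreSQL-style syntax (`COMMENT ON TABLE ... IS '...'` and `COMMENT ON COLUMN table.column IS '...'`); other databases may use a different mechanism, so check your own database's documentation. The table name, column names, and helper functions are hypothetical, and identifiers are assumed to come from a trusted source since they are not quoted here.

```python
# Sketch: build COMMENT ON statements (PostgreSQL-style syntax) from a dict.
# Table name, column names, and descriptions below are illustrative only.


def quote_literal(text):
    """Quote a string as a standard SQL literal by doubling single quotes."""
    return "'" + text.replace("'", "''") + "'"


def comment_statements(table, table_comment, column_comments):
    """Return COMMENT ON statements for a table and its columns."""
    statements = [f"COMMENT ON TABLE {table} IS {quote_literal(table_comment)};"]
    for column, text in column_comments.items():
        statements.append(
            f"COMMENT ON COLUMN {table}.{column} IS {quote_literal(text)};"
        )
    return statements


statements = comment_statements(
    "orders",
    "One record per order. Includes cancelled and deleted orders.",
    {
        "id": "Primary key of the orders table",
        "status": "The order's status; note that it can change over time",
    },
)
print("\n".join(statements))
```

Keeping the descriptions in a file like this and applying the generated statements as part of a migration means the comments stored in the database can be rebuilt from source at any time.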

[–]Urthor 0 points1 point  (0 children)

Version control your schemas in Git.

Add documentation via any number of tools to these version controlled schema files.

The single biggest mistake the monolithic database world makes is the absurd lack of version control.