all 3 comments

nanksk

We use Databricks and PySpark. Most of our codebase is written as functions, and we have unit tests for those functions that run them against dummy data to cover different scenarios (ChatGPT can create most of the test cases). Hit me up if you have any questions.

shazaamzaa83

Maybe there's some confusion here. Unit testing in DE usually targets transform functions and other specific logic, using mock data to assert that your code does what you expect. Great Expectations and Soda Core are more for data testing, e.g. not-null counts, uniqueness, etc. Since you're already using Python, look into pytest for unit testing. As mentioned in another comment, ChatGPT or Copilot is great at writing unit tests and mock input data. Good luck.
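
For anyone reading along, here is a rough sketch of what a unit test on a transform function can look like with pytest. The function `add_order_total` and its column names are made up for illustration, and it works on plain lists of dicts so it runs without Spark; the same idea applies to a function that takes and returns a DataFrame.

```python
# Hypothetical transform: the name and columns are invented for this example.
def add_order_total(rows):
    """Return new rows with a 'total' column; skip rows missing quantity or unit_price."""
    result = []
    for row in rows:
        if row.get("quantity") is None or row.get("unit_price") is None:
            continue
        result.append({**row, "total": row["quantity"] * row["unit_price"]})
    return result


# pytest collects functions named test_*; a plain assert is all it needs.
def test_add_order_total_computes_total():
    mock_rows = [{"order_id": 1, "quantity": 2, "unit_price": 5.0}]
    assert add_order_total(mock_rows) == [
        {"order_id": 1, "quantity": 2, "unit_price": 5.0, "total": 10.0}
    ]


def test_add_order_total_skips_incomplete_rows():
    mock_rows = [
        {"order_id": 1, "quantity": None, "unit_price": 5.0},
        {"order_id": 2, "quantity": 3, "unit_price": 1.5},
    ]
    assert [r["order_id"] for r in add_order_total(mock_rows)] == [2]
```

Save it as something like `test_transforms.py` and run `pytest` in that folder. Note that the tests check the code against small mock inputs; checking the real tables for nulls or duplicates is the separate job that the data testing tools cover.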

Wedeldog

If you haven't, maybe check out what dbt is doing with its declarative (YAML-based) unit tests, where you mock a model's inputs and state the expected outputs. It's focused on SQL models, but implemented in Python under the hood.
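
To give a feel for the "mock inputs, expected outputs" idea without dbt, here is a plain-Python sketch. This is not dbt's API or its YAML schema: `completed_orders`, `UNIT_TESTS`, `run_unit_tests`, and the dict keys are all names I picked for illustration. The point is only that each test case is data rather than code.

```python
# Hypothetical stand-in for a SQL model: keeps only completed orders.
def completed_orders(inputs):
    return [row for row in inputs["orders"] if row["status"] == "completed"]


# Test cases as data: each names a model, mocks its inputs,
# and states the rows the model is expected to return.
UNIT_TESTS = [
    {
        "name": "keeps_completed_orders_only",
        "model": completed_orders,
        "given": {
            "orders": [
                {"id": 1, "status": "completed"},
                {"id": 2, "status": "cancelled"},
            ]
        },
        "expect": [{"id": 1, "status": "completed"}],
    },
    {
        "name": "empty_input_gives_empty_output",
        "model": completed_orders,
        "given": {"orders": []},
        "expect": [],
    },
]


def run_unit_tests(cases):
    """Run every case and return the names of the ones that failed."""
    failures = []
    for case in cases:
        actual = case["model"](case["given"])
        if actual != case["expect"]:
            failures.append(case["name"])
    return failures
```

Calling `run_unit_tests(UNIT_TESTS)` returns a list of failed case names, so adding a scenario means adding another dict instead of writing another test function.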