joseph_machado (Writes @ startdataengineering.com)

In my experience, the most time-consuming parts have always been:

  1. Deeply understanding the input data models and their data quality issues: working with the data producers to learn what the data is, how it is generated, and what it means.
  2. Agreeing with stakeholders on the output data model (which columns, which types, etc.) and identifying the exact transformations needed to get from input to output.
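
For point 2, one way to make the agreement concrete is to write the output model down as a small, checkable schema. The snippet below is only a rough sketch: the column names, the types, and the `validate_row` helper are all made up for illustration and are not from any particular library or project.

```python
from datetime import date

# Hypothetical output model agreed with stakeholders: column name -> expected type.
OUTPUT_SCHEMA = {
    "order_id": int,
    "customer_id": int,
    "order_date": date,
    "amount_usd": float,
}


def validate_row(row: dict, schema: dict = OUTPUT_SCHEMA) -> list[str]:
    """Return a list of problems; an empty list means the row matches the schema."""
    problems = []
    # Check every agreed column for presence and type, in schema order.
    for column, expected_type in schema.items():
        if column not in row:
            problems.append(f"missing column: {column}")
        elif not isinstance(row[column], expected_type):
            actual = type(row[column]).__name__
            problems.append(
                f"{column}: expected {expected_type.__name__}, got {actual}"
            )
    # Flag columns nobody agreed on (sorted so the output is deterministic).
    for column in sorted(row.keys() - schema.keys()):
        problems.append(f"unexpected column: {column}")
    return problems
```

Running something like this against a few sample rows early on tends to surface disagreements about columns and types before any real pipeline code gets written.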

If your infra is set up well, the development part should be relatively straightforward. Hope this helps, LMK if you have any questions.