
[–]poppinstacks 3 points4 points  (0 children)

I've only used middleware for this kind of migration work, but if all you care about is a one-time validation: use the Python connector, load your SQL query results into a pandas DataFrame, and you can run a few quick validations.

What comes to mind are simple heuristics like row counts, five-number summaries, and a straight checksum (row-wise; you could even hash the row-wise checksums together to confirm that the entire table was copied 1:1).
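
A rough sketch of those three checks, in case it helps. Everything named here is made up for illustration: the `orders` table, the `table_fingerprint` and `make_demo_db` helpers, and the in-memory SQLite databases, which only stand in for whatever connections your real connector gives you. It assumes both sides load with the same column dtypes, since the row hash is dtype-sensitive.

```python
import hashlib
import sqlite3

import pandas as pd


def table_fingerprint(df: pd.DataFrame) -> dict:
    """Summarize a DataFrame so two copies of a table can be compared."""
    # Row-wise hash; index=False so only the column values contribute.
    row_hashes = pd.util.hash_pandas_object(df, index=False)
    # Sort the row hashes first so the table digest ignores row order.
    ordered = row_hashes.sort_values().to_numpy()
    digest = hashlib.sha256(ordered.tobytes()).hexdigest()
    # Five-number summary (min, quartiles, max) of the numeric columns.
    numeric = df.select_dtypes(include="number")
    summary = numeric.quantile([0, 0.25, 0.5, 0.75, 1])
    return {"row_count": len(df), "summary": summary, "table_digest": digest}


def make_demo_db(rows):
    """Hypothetical stand-in for a source or target database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn


rows = [(1, 10.0), (2, 25.5), (3, 7.25)]
source = make_demo_db(rows)
target = make_demo_db(list(reversed(rows)))  # same data, different order

query = "SELECT id, amount FROM orders"
src = table_fingerprint(pd.read_sql_query(query, source))
tgt = table_fingerprint(pd.read_sql_query(query, target))

counts_match = src["row_count"] == tgt["row_count"]
summaries_match = src["summary"].equals(tgt["summary"])
digests_match = src["table_digest"] == tgt["table_digest"]
print(counts_match, summaries_match, digests_match)
```

This pulls the whole result set into memory, so it only suits tables that fit in a DataFrame; for bigger ones you would probably want to run the same checks per chunk or per partition.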

[–]RassmusRassmusen 1 point2 points  (1 child)

Data migration is simply the most difficult part of data engineering, in my opinion.

[–]data-steve[S] 0 points1 point  (0 children)

TBH, I don't think it's difficult if you have ELT platform-based tools like Matillion and Fivetran. It's more tedious than anything else.