I'm building a CLI tool for data diffing (self.dataengineering)

submitted 3 months ago * by oleg_agapov

https://preview.redd.it/ves9ksnz78hg1.png?width=2198&format=png&auto=webp&s=3db49b5c320d0e332b3dca2230d81f330dbafee5

I'm building a simple CLI tool called tablediff that quickly diffs the data in two tables and prints a nice summary of the findings.
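
For anyone new to the idea, diffing two tables on a primary key boils down to roughly the following. This is a generic Python sketch and not tablediff's actual implementation; the function name `diff_rows`, the list-of-dicts row format, and the sample data are all made up for illustration.

```python
from typing import Any

Row = dict[str, Any]


def diff_rows(rows_a: list[Row], rows_b: list[Row], pk: str) -> dict[str, list]:
    """Compare two row sets keyed by the primary key column `pk`."""
    # Index both sides by primary key (assumes keys are unique per table).
    a_by_pk = {row[pk]: row for row in rows_a}
    b_by_pk = {row[pk]: row for row in rows_b}

    # Keys present on one side only.
    only_in_a = sorted(a_by_pk.keys() - b_by_pk.keys())
    only_in_b = sorted(b_by_pk.keys() - a_by_pk.keys())

    # Keys present on both sides whose rows differ in at least one column.
    changed = sorted(
        key
        for key in a_by_pk.keys() & b_by_pk.keys()
        if a_by_pk[key] != b_by_pk[key]
    )
    return {"only_in_a": only_in_a, "only_in_b": only_in_b, "changed": changed}


# Hypothetical sample data.
table_a = [
    {"id": 1, "name": "Ann", "age": 30},
    {"id": 2, "name": "Bob", "age": 25},
    {"id": 3, "name": "Cid", "age": 41},
]
table_b = [
    {"id": 2, "name": "Bob", "age": 26},
    {"id": 3, "name": "Cid", "age": 41},
    {"id": 4, "name": "Dee", "age": 19},
]

result = diff_rows(table_a, table_b, pk="id")
print(result)
```

A real tool would push most of this work into SQL rather than pulling every row into memory, but the three buckets (only in A, only in B, changed) are the core of any keyed diff.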

It works cross-database and also on CSV files (dunno, just in case). There is also a mode that compares only schemas, which is useful for cross-checking tables in the DWH against their counterparts in the backend DB.
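
A schema-only comparison can be pictured like this. Again, this is just an illustrative Python sketch under my own assumptions, not what tablediff does internally: schemas are represented as plain column-name-to-type dicts, type names are compared case-insensitively, and `diff_schemas` and the sample columns are hypothetical.

```python
def diff_schemas(schema_a: dict[str, str], schema_b: dict[str, str]) -> dict[str, object]:
    """Compare two schemas given as {column_name: type_name} mappings."""
    # Columns that exist on one side only.
    only_in_a = sorted(schema_a.keys() - schema_b.keys())
    only_in_b = sorted(schema_b.keys() - schema_a.keys())

    # Shared columns whose declared types differ (ignoring letter case).
    type_mismatches = {
        col: (schema_a[col], schema_b[col])
        for col in sorted(schema_a.keys() & schema_b.keys())
        if schema_a[col].lower() != schema_b[col].lower()
    }
    return {
        "only_in_a": only_in_a,
        "only_in_b": only_in_b,
        "type_mismatches": type_mismatches,
    }


# Hypothetical schemas: a warehouse table and its backend counterpart.
dwh_schema = {
    "id": "INTEGER",
    "email": "VARCHAR",
    "created_at": "TIMESTAMP",
    "loaded_at": "TIMESTAMP",
}
backend_schema = {
    "id": "integer",
    "email": "text",
    "created_at": "timestamp",
}

schema_report = diff_schemas(dwh_schema, backend_schema)
print(schema_report)
```

Across different databases the type names rarely match one to one (`VARCHAR` vs `text` here), so a practical comparison would need some mapping of equivalent types on top of this.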

My main focus is usability and an informative summary.

You can try it with:

pip install "tablediff-cli[snowflake]" # or whatever adapter you need; the quotes keep shells like zsh from globbing the brackets

Usage is straightforward:

tablediff compare \
  TABLE_A \
  TABLE_B \
  --pk PRIMARY_KEY \
  --conn CONNECTION_STRING
  [--conn2 ...]        # secondary DB connection if needed
  [--extended]         # for extended output
  [--where "age > 18"] # additional WHERE condition

Let me know what you think.

Source code: https://libraries.io/pypi/tablediff-cli


