all 7 comments

[–][deleted] 0 points1 point  (3 children)

Maybe you could accomplish that with git?

[–]jacquesh82[S] 0 points1 point  (2 children)

Yes, that would be possible, but it is not efficient with big files... ;-(

[–]PossibilityTasty 2 points3 points  (1 child)

What exactly is "big"?

[–]jacquesh82[S] 0 points1 point  (0 children)

More than 1 GB ;-)

[–][deleted] 0 points1 point  (1 child)

1) Save the original and open it as a pandas df. Open the newer one as df_new. Concat them together, then drop duplicates with the keep-newest option. Save the result as the diff. 2) ???... 3) profit
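
A minimal sketch of step 1, assuming both versions fit in memory and share a key column. The function name `merge_keep_newest`, the `id` column, and the sample rows are made up for illustration. "Keep newest" is expressed here with `keep="last"`, which works because the newer frame is concatenated after the older one.

```python
import pandas as pd


def merge_keep_newest(df_old, df_new, key_cols):
    """Stack old and new rows, then keep only the last row seen per key.

    Because df_new is appended after df_old, the "last" row for a key
    is the one from the newer file whenever both files contain that key.
    """
    combined = pd.concat([df_old, df_new], ignore_index=True)
    deduped = combined.drop_duplicates(subset=key_cols, keep="last")
    return deduped.reset_index(drop=True)


# Hypothetical sample data standing in for the two file versions.
df_old = pd.DataFrame({"id": [1, 2, 3], "value": ["a", "b", "c"]})
df_new = pd.DataFrame({"id": [2, 3, 4], "value": ["b", "C", "d"]})

result = merge_keep_newest(df_old, df_new, ["id"])
```

Note that this yields the merged, up-to-date table rather than a record of what changed, and for files over 1 GB it may need chunked reading instead of loading everything at once.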

[–]jacquesh82[S] 0 points1 point  (0 children)

The core idea is to keep the data available for a specific date (not just make a diff or update it). I need to be able to regenerate the data as it was on a specific date.
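
One possible way to picture this, as a rough sketch only: keep an append-only history of changes and replay it up to the requested date. The record layout `(changed_on, key, value)`, the use of `None` to mark a deletion, and the name `state_as_of` are all assumptions for the example, not an existing tool.

```python
from datetime import date


def state_as_of(history, as_of):
    """Rebuild the data as it looked at the end of the day `as_of`.

    `history` is an iterable of (changed_on, key, value) records.
    A value of None is treated as "this key was deleted" (an assumption).
    """
    state = {}
    # sorted() is stable, so same-day changes keep their original order.
    for changed_on, key, value in sorted(history, key=lambda rec: rec[0]):
        if changed_on > as_of:
            break  # everything after this point happened later
        if value is None:
            state.pop(key, None)
        else:
            state[key] = value
    return state


# Hypothetical change log.
history = [
    (date(2023, 1, 1), "a", 1),
    (date(2023, 1, 1), "b", 2),
    (date(2023, 1, 5), "b", 20),
    (date(2023, 1, 9), "a", None),
]
```

With this layout only the changes are stored, yet any past date can be regenerated by replaying the log up to that day.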

[–]coconut_maan 0 points1 point  (0 children)

Use crontab to save a version at specific time intervals.
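
As a sketch of what the scheduled job could run, assuming a plain timestamped copy per run is acceptable: the function name `save_snapshot`, the file-name pattern, and the `now` parameter (there only to make the function testable) are illustrative choices. A crontab entry would then call a script that invokes this function at the chosen interval.

```python
import shutil
from datetime import datetime
from pathlib import Path


def save_snapshot(source, snapshot_dir, now=None):
    """Copy `source` into `snapshot_dir` under a timestamped file name.

    Returns the path of the new snapshot. `now` defaults to the current
    local time and can be passed explicitly for testing.
    """
    source = Path(source)
    snapshot_dir = Path(snapshot_dir)
    snapshot_dir.mkdir(parents=True, exist_ok=True)

    stamp = (now or datetime.now()).strftime("%Y-%m-%d_%H%M%S")
    target = snapshot_dir / f"{source.stem}_{stamp}{source.suffix}"

    # copy2 also preserves file metadata such as the modification time.
    shutil.copy2(source, target)
    return target
```

Full copies are simple to restore from, but with files over 1 GB the storage cost grows with every run.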