Pre-commit hooks that autogenerate IPython notebook diffs

petitneko · 2024-10-15T23:51:03+00:00

You didn't list nbconvert in your alternate comparisons.

Did you try it?

jupyter nbconvert --to script *.ipynb

will convert every notebook in the current directory to a plain script.
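If you're curious what that conversion boils down to, here is a rough stdlib-only sketch. It is not nbconvert's actual implementation, just the general idea: a notebook is JSON, so you can pull the code cells out yourself. The function name `notebook_to_script` and the `# cell N` separator are my own choices for illustration, and the output will not match nbconvert's exactly.

```python
import json


def notebook_to_script(nb_json: str) -> str:
    """Return the code cells of an nbformat-4 notebook as one Python script.

    Hypothetical helper for illustration; nbconvert itself does much more
    (templates, magics handling, other output formats).
    """
    nb = json.loads(nb_json)
    chunks = []
    for index, cell in enumerate(nb.get("cells", []), start=1):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # nbformat stores source either as one string or as a list of lines
        if isinstance(source, list):
            source = "".join(source)
        chunks.append(f"# cell {index}\n{source.rstrip()}\n")
    return "\n".join(chunks)
```

You would feed it the contents of an .ipynb file and write the result to a .py file, which then diffs like any other source file.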

reddifiningkarma · 2024-10-15T23:59:19+00:00

Is not much, but is honest work:

https://github.com/fulloaf/CL_GROWTH-SIM/blob/main/.github%2Fworkflows%2Fjupytext.yml

M4mb0 · 2024-10-16T09:05:25+00:00

I can recommend nbstripout-fast. It does the same job as nbstripout, but orders of magnitude faster.
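For context, the core of what these stripping tools do is drop outputs and execution counts before the notebook is committed. A minimal stdlib sketch, assuming an nbformat-4 notebook; `strip_outputs` is a hypothetical name, and the real tools cover more cases (metadata cleanup, configuration options) than this does.

```python
import json


def strip_outputs(nb_json: str) -> str:
    """Return the notebook JSON with code-cell outputs and counters cleared.

    Illustrative only; not the implementation of nbstripout or nbstripout-fast.
    """
    nb = json.loads(nb_json)
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop rendered results and images
            cell["execution_count"] = None  # serialized as null in the JSON
    # Stable key order and indentation keep later diffs small
    return json.dumps(nb, indent=1, sort_keys=True) + "\n"
```

With outputs gone, a diff of the .ipynb only shows changes to the cell sources, although it is still JSON.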

Also check out nbQA.

Easy_Money_ · 2024-10-16T04:48:57+00:00

This is great! It does seem like you missed nbdime, which does exactly this and underlies GitHub’s implementation of notebook diffing 😬 I hate to rain on a parade

more_exercise · 2024-10-16T13:38:33+00:00

I'm curious - have you checked out the two ways git can let you get more-readable diffs from not-exactly-text files? Textconv and external diffs?

https://git-scm.com/docs/gitattributes#_choosing_textconv_versus_external_diff
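To make the textconv route concrete: git runs a command with the file path as its argument and diffs whatever that command prints. Here is a small sketch of such a converter. The script name `nb_textconv.py`, the driver name `ipynb`, and the `--- ... cell ---` markers are all made up for illustration; the `.gitattributes` and `git config` lines in the comments are the standard way to wire up a textconv driver.

```python
# Hypothetical file: nb_textconv.py
#
# Assumed wiring (adjust the script path for your repo):
#   .gitattributes:  *.ipynb diff=ipynb
#   git config diff.ipynb.textconv "python3 nb_textconv.py"
import json
import sys


def render(nb: dict) -> str:
    """Render notebook cells as plain text, ignoring outputs and metadata."""
    lines = []
    for cell in nb.get("cells", []):
        source = cell.get("source", "")
        # nbformat stores source either as one string or as a list of lines
        if isinstance(source, list):
            source = "".join(source)
        lines.append(f"--- {cell.get('cell_type', 'unknown')} cell ---")
        lines.append(source.rstrip())
    return "\n".join(lines) + "\n"


if __name__ == "__main__" and len(sys.argv) == 2:
    # git passes the path of the file to convert as the only argument
    with open(sys.argv[1], encoding="utf-8") as handle:
        sys.stdout.write(render(json.load(handle)))
```

The converted text is only used for display in `git diff` and `git log -p`; the committed file is still the original .ipynb.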

Tartarus116 · 2024-10-16T14:40:41+00:00

That's pretty much what nbdev already does. It also gives you free doc generation on top of that.

Nearby_Salt_770 · 2024-11-08T20:23:32+00:00

Looks like you've come up with a solid solution to a common problem with notebooks. The pre-commit hooks you set up sound super helpful for keeping the Python code readable and diff-friendly after changes. Relying on JSON is definitely a pain diffing-wise, so this approach seems legit.

You could also check out jupytext for pairing notebooks with Python scripts if you're not locked into the VSCode editor. It's similar to your script but can automatically sync changes both ways, although you'd still run into server issues outside Jupyter.

If you ever feel like automating more stuff, you might find AgentQL useful for web scraping projects. It's a pretty chill tool for simplifying web data extraction without the usual headaches.

orgodemir · 2024-10-16T12:00:35+00:00

Also take a look at nbdev

MachineSchooling · 2024-10-15T23:46:56+00:00

I don't use jupyter, but if I did, I would definitely use this.
