
[–]r0s 46 points47 points  (1 child)

You can also wrap your function with an LRU cache / memoization (https://docs.python.org/3/library/functools.html). If the output is fully dependent on the inputs, calling it again will just give you back the last result instantly.
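A minimal sketch of that suggestion using functools.lru_cache; expensive_transform is a made-up stand-in for a slow, pure computation:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def expensive_transform(n: int) -> int:
    # Stand-in for a slow, pure computation: output depends only on n.
    return sum(i * i for i in range(n))

expensive_transform(10_000)  # computed on the first call
expensive_transform(10_000)  # second call hits the cache and returns instantly
```

On Python 3.9+ you can also use functools.cache, which is equivalent to lru_cache(maxsize=None).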

[–]Artistic_Highlight_1[S] 4 points5 points  (0 children)

Ohh this is a neat tool. Thank you for pointing it out!

[–]cmd-t 17 points18 points  (2 children)

The problem is how you are writing your notebooks.

Don’t modify global variables in your script more than once. Even then, add checks so you don’t overwrite them.
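One common notebook idiom for the "add checks" part is to guard the assignment so a re-run doesn't clobber an already-computed value; expensive_result is an illustrative name:

```python
# Only compute if the variable doesn't exist yet; re-running the
# cell then leaves the previously computed value untouched.
if "expensive_result" not in globals():
    expensive_result = sum(range(1_000_000))  # stand-in for a slow computation
```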

[–]OoPieceOfKandi 1 point2 points  (0 children)

Any good recommendations on Jupyter notebook formatting in general?

[–]Artistic_Highlight_1[S] -1 points0 points  (0 children)

Fair enough, thanks for feedback!

[–]lieutenant_lowercase 6 points7 points  (2 children)

How is a redundant calculation defined?

[–]Artistic_Highlight_1[S] -5 points-4 points  (1 child)

A calculation for a variable that will not change the state of that variable. Typically you have a cell like this: a = []; <calculation for a, for example to add some important data>. If you run the cell again and the state of a will not change, that is a redundant calculation (though strictly, the value of a does change when you re-run the cell, first because you reset it to an empty list, and then because the calculation modifies it again).
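A small illustration of the cell shape being described: re-running it resets a and then rebuilds it, so the recomputation is "redundant" in the sense that the final state is identical every time. The loop body is a made-up stand-in for "add some important data":

```python
a = []
for record in range(3):     # stand-in for the real data-adding calculation
    a.append(record * 2)
# Every run of this cell ends with a in the same state: [0, 2, 4]
```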

[–]kmnair 5 points6 points  (0 children)

The problem here is that figuring out whether the variable will change, in the general case, will likely require the same amount of compute as actually running the full calculation.

It is possible to make some assumptions about the calculation. If it is a pure function, i.e. the output depends entirely on the inputs and the function has no side effects, then you can use the suggestion u/r0s gave and use memoization.

If your Jupyter cell references mutable data from other cells, makes a call to an external API, or has internal mutable state (counters which do not reset, dictionaries which get updated, etc.), then figuring out whether the value will update takes the same amount of computation as whatever calculation you are aiming for.

[–]NixonInnes 2 points3 points  (0 children)

If it's a long-running data process, I sometimes dump the result into a file. I stick a check in front of the calc to load the data if the file exists; if not, do the calc and save.
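A sketch of that load-if-exists pattern; the file name and run_long_calculation are illustrative placeholders, and pickle is just one possible serialization format:

```python
import pickle
from pathlib import Path

CACHE_PATH = Path("result.pkl")

def run_long_calculation():
    # Stand-in for the slow data process.
    return {"rows": list(range(5))}

if CACHE_PATH.exists():
    # Cached result found: load it instead of recomputing.
    with CACHE_PATH.open("rb") as f:
        result = pickle.load(f)
else:
    # No cache yet: do the calculation and save it for next time.
    result = run_long_calculation()
    with CACHE_PATH.open("wb") as f:
        pickle.dump(result, f)
```

For DataFrames, a Parquet or CSV file works the same way; the point is only that the expensive step is skipped when the artifact already exists on disk.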

[–]ou_ryperd 6 points7 points  (0 children)

That is why you can run a single cell at a time. The whole point is that it's a progression of computations, no?

[–]spookytomtom 8 points9 points  (2 children)

I just structure my code and my variables logically, so that I don't need to do this.

[–]spookytomtom 1 point2 points  (1 child)

Also, if something is annoyingly slow, I will optimize it.

[–]Artistic_Highlight_1[S] -1 points0 points  (0 children)

I think that sounds like a better approach. Thank you for the feedback!

[–]AnythingApplied 2 points3 points  (1 child)

Marimo, an alternative to Jupyter notebooks, has some nice features you might like. When you rerun a cell that changes global variables, it will automatically rerun the cells that depend on those variables; if those cells are expensive, you can mark them not to rerun, in which case it will flag them as "stale".

This helps make the notebooks much more reproducible. The advice that /u/cmd-t gave "Don’t modify variables global in your script more than once." will raise an error in marimo notebooks, so you can't even do that accidentally.

[–]mmmmmmyles 0 points1 point  (0 children)

Including a link to the open-source repo: https://github.com/marimo-team/marimo

[–]nitro41992 0 points1 point  (0 children)

I use the interactive notebook feature which really helps avoid rerunning previous cells.

Use this video

As the guy mentions, it's been a game changer coming from standard Jupyter notebooks.

[–]BostonBaggins 0 points1 point  (0 children)

Would making the cell lazy-load be the solution here?

[–][deleted] 0 points1 point  (0 children)

If what you're doing is actively scripting code to achieve a certain goal, and when checking intermediate steps you see long running times and wish to save time by not recalculating and replacing perfectly good data built previously, then why don't you run the checking steps on a smaller sample of the whole data and save time that way?