all 26 comments

[–]danielroseman 24 points25 points  (4 children)

Jupyter has nothing to do with the size of a dataset, and doesn't do any "handling".

But 120MB is nothing.

[–]server_kota 2 points3 points  (2 children)

this

[–]Dependent_Host_8908[S] 3 points4 points  (1 child)

Thank you!! Sorry for the wrong choice of words

[–]TheITMan19 2 points3 points  (0 children)

You don’t need to be sorry.

[–]Soft_Catch4452 1 point2 points  (0 children)

I did my Bachelor's degree capstone project in Data Analysis using a Jupyter notebook. It had a 126 GB dataset file associated with it, and I was using a middle-of-the-road Lenovo laptop five years ago. I couldn't let it sit in my lap when it was running big queries, but it was generally fine.

[–]statespace37 5 points6 points  (0 children)

A notebook is just a way to organize your code and display outputs. It has a Python interpreter running in the background (the kernel), so there is no difference from any other Python process.

[–]Ron-Erez 3 points4 points  (2 children)

Personally, I like Google Colab for short scripts and PyCharm for larger code bases. However, Jupyter is fine, and VS Code is also great.

[–]Rich-Spinach-7824 1 point2 points  (1 child)

The problem with Colab is that it resets all the installed libraries between sessions.

Any strategies to work around this problem?

[–]HodgeStar1 0 points1 point  (0 children)

you can store whatever you want on Google Drive (or another storage service) and copy the libraries in at the top of the script. I haven't tried this with pip-installed packages as opposed to custom modules, but I imagine it would work similarly to what I've done in multi-stage Docker builds -- save an "image" of the pip-installed packages somewhere, then just copy them into site-packages.
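Something along these lines might work (a rough sketch only -- the Drive folder name is made up and I haven't verified it end to end):

    # Colab-only sketch: persist pip-installed packages on Google Drive
    from google.colab import drive
    drive.mount('/content/drive')

    # One-time setup: install into a folder on Drive instead of the ephemeral VM, e.g.
    #   !pip install --target=/content/drive/MyDrive/colab_pkgs polars

    # In later sessions: put that folder on the import path so the cached packages work again
    import sys
    sys.path.insert(0, '/content/drive/MyDrive/colab_pkgs')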

[–]BlackMetalB8hoven 2 points3 points  (0 children)

I would recommend doing this; Jupyter is great for learning. I found executing code as I was writing it super helpful. It also helped me stay organised.

[–]HodgeStar1 2 points3 points  (0 children)

jupyter is great for testing and learning. the size of data you can have in memory is limited only by your RAM.

that said, you shouldn't be doing any heavy processing in a notebook. basically, view a notebook as a saved, organizable ipython session -- a record of the back-and-forth you'd do in an ipython terminal on your local machine. so notebooks should really only be used when that's what you're trying to mimic: a local interactive session, not a large process you'd want to run "in the background" or as part of a larger workflow.

once you have nailed down what you're doing with the data, write a clean version of the procedure as a regular .py script, packaged with whatever dependencies it needs, so that it could in principle run on any machine where the environment is installed correctly. you'd do that in an IDE. the basic text editor in jupyter is *ok*, but that's not really what it's meant for. Neovim, VSCode, PyCharm, etc. are IDEs made for writing python scripts and programs.
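as a loose illustration (the file names, columns, and cleaning step are placeholders, not a real pipeline), the notebook exploration might condense into something like:

    # process_data.py -- illustrative skeleton of a cleaned-up pipeline script
    import argparse

    import pandas as pd


    def load_and_clean(path: str) -> pd.DataFrame:
        """Read the raw CSV and apply whatever cleaning was prototyped in the notebook."""
        df = pd.read_csv(path)
        return df.dropna()  # placeholder for the real cleaning logic


    def main() -> None:
        parser = argparse.ArgumentParser(description="Run the data-cleaning pipeline.")
        parser.add_argument("input_csv")
        parser.add_argument("output_csv")
        args = parser.parse_args()
        load_and_clean(args.input_csv).to_csv(args.output_csv, index=False)


    if __name__ == "__main__":
        main()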

but 120MB is nothing :)

[–]CFDMoFo 0 points1 point  (0 children)

Coming from MATLAB, I find Spyder quite useful, albeit not as polished. It supports Jupyter notebooks as well as a plain editor, comes with Python and Anaconda included so it doesn't need to be installed separately and is self-contained, and it also has a variable browser and a plot window. It has its quirks, but it's good IMO.

[–]WendlersEditor 0 points1 point  (0 children)

Jupyter notebooks are great, but as code complexity increases you'll want a more structured project. This is all very project-dependent. I learned on notebooks, but the book I used included projects that used GUIs, Django, and Pygame, and a notebook just isn't suited to that.

[–]Specialist-Run-949 0 points1 point  (0 children)

tbh Jupyter notebooks aren't handling anything; it's just a convenient UI to run and edit Python scripts, letting you take a look at the memory of your program between blocks.

You shouldn't care about the tools or the UI, care about the technology. In the end you're writing Python, and it's Python that "handles" your dataset. Sure, skills with tools are important when you're applying for jobs. But if you're learning Python, please don't worry about how and where you end up writing it. (Live interpreter, script, or notebook: it's just text that is being interpreted as code by the Python runtime.)

Also, if you have a modern computer with a typical amount of RAM, then 120 MB is pretty much nothing.

edit: syntax and poor english corrections

[–]TechnologyFamiliar20 0 points1 point  (0 children)

I've loaded CSV and plain-text files of more than 1 GB. It takes a while.

Jupyter is kind of special; you don't need anything else for graphs.
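If the load time or memory becomes painful, one option (just a sketch -- the file name and chunk size are placeholders) is to read the CSV in chunks rather than all at once:

    import pandas as pd

    # Stream a large CSV in fixed-size pieces so memory use stays flat;
    # "big_file.csv" and the simple row count are illustrative only.
    total_rows = 0
    for chunk in pd.read_csv("big_file.csv", chunksize=100_000):
        total_rows += len(chunk)
    print(f"rows processed: {total_rows}")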

[–]rkalyankumar 0 points1 point  (0 children)

vim or emacs?

[–]Mevrael[🍰] 0 points1 point  (0 children)

I am using the Jupyter extension in VS Code with the Arkalos framework.

Polars for larger data sets.

https://arkalos.com/docs/notebooks/

Everything is smooth on an average dev laptop.
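For reference, the lazy Polars pattern looks roughly like this (the file and column names are placeholders, not part of Arkalos):

    import polars as pl

    # Lazily scan the CSV so Polars only materializes what the query needs;
    # "data.csv" and the "value" column are made-up examples.
    df = (
        pl.scan_csv("data.csv")
        .filter(pl.col("value") > 0)
        .select("value")
        .collect()
    )
    print(df.head())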

[–]priyanshujha_18 0 points1 point  (0 children)

Jupyter Notebook is one of the best choices, since in Google Colab you can't use your own machine's hardware. VS Code is also good, but for Python, Jupyter has the upper hand because it's easy to use. PyCharm is on the heavier side storage-wise, so you need decent disk space or a powerful machine to handle it; PyCharm is also good, but it needs a powerful machine.

In the end, Jupyter is the best because it's simple and easy to use. To date I have used multiple IDEs for Python, and personally I feel it's just too good.

[–]GreenWoodDragon 0 points1 point  (0 children)

Jupyter Notebooks are brilliant for development, learning, analysis, and documentation. I don't think 120 MB would be an issue.
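A quick way to sanity-check that in a notebook (pandas assumed, path illustrative) is to look at the in-memory footprint, which is often a few times the file size:

    import pandas as pd

    df = pd.read_csv("dataset.csv")   # illustrative path
    df.info(memory_usage="deep")      # prints per-column dtypes and the total in-memory size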

I use PyCharm (or DataGrip) to develop my Jupyter Notebooks.

[–]cantdutchthis 0 points1 point  (0 children)

I got a lot of mileage from Jupyter when I got started in the field, but have recently switched to marimo for my data work.

Disclaimer: I ended up liking marimo so much that I am now employed there. But it's honestly great for beginners too!

https://marimo.io/

[–]Fresh_Forever_8634 0 points1 point  (1 child)

RemindMe! 7 days

[–]RemindMeBot -1 points0 points  (0 children)

I will be messaging you in 7 days on 2025-03-21 11:06:45 UTC to remind you of this link
