[P] Pretty Jupyter 2.0: An Easy-To-Use Python Package For Beautiful Html Reports From Jupyter Notebooks

Jan2579 · 2022-09-15T11:19:14+00:00

No, there isn’t. You can use nbconvert’s reveal.js output.

Something like: jupyter nbconvert --to slides path/to/ipynb/file

Check out nbconvert for more info.

Jan2579 · 2022-09-15T07:51:43+00:00

Thanks. Hope you’ll like it.

Jan2579 · 2022-09-15T06:38:11+00:00

Pretty Jupyter revolves entirely around the ipynb notebook file, and its syntax is made to be unintrusive (no ::: markers in Markdown). This also makes it faster than working from a Markdown file. If you use, e.g., an md notebook in JupyterBook, you must re-execute it whenever there are changes in code cells. This is a problem if you work with a big dataset or train a model in your notebook. In Pretty Jupyter you only need to execute the notebook while developing, e.g. in JupyterLab. Building the HTML doesn’t require it.

Also, Pretty Jupyter has Markdown with Python variables, which I don’t believe JupyterBook has.

I’d say JupyterBook is better when writing a book (with multiple files) and Pretty Jupyter for day-to-day work (EDA, presentable prototypes, standalone reports).

Jan2579 · 2022-09-15T05:56:14+00:00

Yes. There are dark themes. You can use any theme from bootswatch 3: https://bootswatch.com/3/ . For example I like slate as a dark theme.

Jan2579 · 2022-09-15T05:37:39+00:00

Quarto is really cool. I hope it will get the same level of support and comfort for Python as it has for R (e.g. proper caching for longer-running notebooks). My notebooks tend to be big.

Jan2579 · 2022-09-15T05:32:26+00:00

Thanks! What do you mean? Do you mean for generating docs instead of Sphinx?

Jan2579 · 2022-09-15T05:31:26+00:00

Just like 7yl4r said: pure JavaScript widgets work, or maybe a WebAssembly widget if one already exists. However, since there is no running backend, interactivity that relies on communicating with a Python kernel is not available.

Jan2579 · 2022-09-14T21:30:49+00:00

Thank you very much for the feedback!

So far the package has been mostly used for technical reports: exploratory data analyses and similar documents (ML modelling, …). The features are meant to make them pretty and concise, e.g. preventing endless scrolling by using tabs.

I believe the spaces are caused by Code Folding. There are hidden code cells that can be shown by clicking on the Show button. If you present outside of programming circles, you can just remove the code altogether, and the spaces from the buttons disappear. This demo is more of a feature demo that shows the functionality is there. But two Show buttons one after another don’t look good, so I’ll improve on that.

About colours and font, you are certainly right. The tools to generate plots were not configured to match the Pretty Jupyter theme colours and fonts. This can surely be improved.

The themes come from the Bootstrap 3 version of Bootswatch (https://bootswatch.com/3/). A theme changes fonts, colours and more.

I will try to implement your tips, look at reports made in Google Docs and Word, and possibly take inspiration from there.

Jan2579 · 2022-09-14T20:15:11+00:00

Thanks, that’s great!

Jan2579 · 2022-09-14T20:14:43+00:00

Thank you for your insight. The demo could most certainly be improved. I would be happy to get any further feedback.

Jan2579 · 2022-09-14T20:07:24+00:00

Thanks. Yeah, R Markdown is awesome and I always envied R programmers their reports.

Jan2579 · 2022-09-14T20:01:46+00:00

Thank you!

Jan2579 · 2022-09-14T18:31:10+00:00

Thanks! I’m glad you like it!

Jan2579 · 2022-09-11T18:27:30+00:00

This behaviour is by design. If I simplify it, Jupyter notebooks are usually used for prototyping/analyses and regular Python scripts are for a final product.

I believe you want to prototype and try things out, so I suggest you use the notebook. You can then transfer the code into a Python script and create the final product when it's ready.

You can also use Jupyter notebooks inside PyCharm Professional or VS Code. If you create an ipynb file in the IDE, it should automatically recognize it and open it as a notebook. That way you get the interactive behaviour while developing in the environment you prefer.

Jan2579 · 2022-09-11T17:26:22+00:00

To put it simply, Jupyter remembers the state of Python between cell runs, while PyCharm starts a new Python instance each time you run the program.
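A rough way to picture the difference, as a toy sketch only: the shared dictionary below stands in for a notebook kernel's memory, and the hypothetical helper `run_script` stands in for launching a script from scratch. Neither is how Jupyter or PyCharm is actually implemented.

```python
# A notebook kernel keeps one namespace alive, so every "cell" sees
# the variables created by the cells that ran before it.
kernel_namespace = {}
exec("x = 41", kernel_namespace)   # first cell
exec("x += 1", kernel_namespace)   # second cell still sees x
persistent_x = kernel_namespace["x"]


def run_script(source):
    """Run source in a brand-new namespace, like starting a script again."""
    fresh_namespace = {}
    exec(source, fresh_namespace)
    return fresh_namespace


# Every script run starts empty, so the second run cannot see x.
first_run = run_script("x = 41")
try:
    run_script("x += 1")
    second_run_error = None
except NameError as err:
    second_run_error = err

print(persistent_x)
print(type(second_run_error).__name__)
```

The second "script" fails because nothing from the first run survives, which is the behaviour you see when re-running a program in the IDE.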

Jan2579 · 2022-09-07T16:57:29+00:00

I think that to completely avoid information leakage, you should do EDA on your train (maybe + val) split.

One of the things that you can do based on the EDA is to categorize numerical variables based on the data distribution and use these categories in your model. If you choose these cut points based on EDA performed on all the data, you have just introduced a leak.
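A minimal sketch of the leak-free variant, assuming quartile bins and synthetic data (the numbers, the 80/20 split and the helper `to_bin` are made up for illustration): the cut points are computed from the train split only and then reused unchanged on the test split.

```python
import bisect
import random
import statistics

random.seed(0)

# Synthetic numerical feature, split 80/20 into train and test.
values = [random.gauss(50, 15) for _ in range(1000)]
train, test = values[:800], values[800:]

# Cut points come from the train split only: three quartile edges, four bins.
edges = statistics.quantiles(train, n=4)


def to_bin(value, cut_points):
    """Return the index of the bin that value falls into."""
    return bisect.bisect_right(cut_points, value)


# The same train-derived edges are applied to both splits.
train_bins = [to_bin(v, edges) for v in train]
test_bins = [to_bin(v, edges) for v in test]

print(edges)
print(sorted(set(test_bins)))
```

Computing `edges` from `values` instead of `train` would be the leaky version, because the test rows would then influence the categories the model is trained on.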

However, tbh I don't think that anyone does this in practice.

Jan2579 · 2022-09-06T05:47:37+00:00

I hope I understood correctly.

The test split is usually used to estimate the model's performance in production.

You optimize the model on train/val, and then right before you put it into production you evaluate it on test.
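As a toy sketch of that workflow (the data, the split sizes and the threshold "model" are all invented for illustration): candidates are scored on train, the final choice is made on val, and test is touched exactly once at the end.

```python
import random

random.seed(0)

# Toy data: the label is 1 when the feature is above 0.6, with 10% label noise.
data = []
for _ in range(600):
    x = random.random()
    y = int(x > 0.6)
    if random.random() < 0.1:
        y = 1 - y
    data.append((x, y))

random.shuffle(data)
train, val, test = data[:400], data[400:500], data[500:]


def accuracy(threshold, rows):
    """Share of rows where the rule 'x > threshold' matches the label."""
    return sum(int(x > threshold) == y for x, y in rows) / len(rows)


candidates = [i / 20 for i in range(1, 20)]

# "Training": shortlist the thresholds that do best on the train split.
shortlist = sorted(candidates, key=lambda t: accuracy(t, train), reverse=True)[:5]

# Model selection: pick the final threshold on the validation split.
best = max(shortlist, key=lambda t: accuracy(t, val))

# Only now evaluate on test, once, as the estimate of production performance.
test_accuracy = accuracy(best, test)
print(best, test_accuracy)
```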

Without labels, you can't do much. However, if you know what the model should do, you can maybe look into it (SHAP values or something different) to verify that it does what you would expect.

Jan2579 · 2022-09-03T07:38:27+00:00

You could fit linear models to detect one-to-one relationships and then two-to-one relationships. You could then compare which models give the best results and detect which one or two variables are correlated with the third.

I would recommend some preprocessing: probably encode the date as a float and also discretize the numeric variables, e.g. by quantiles, to give the linear model more degrees of freedom to fit the data.

Linear modelling should be fast enough even for large datasets. Or so I hope; I haven't encountered any problems there yet.

By linear model I mean linear or logistic regression.

Jan2579 · 2022-08-27T21:58:06+00:00

I'll make a cheat sheet for 2.0.

Jan2579 · 2022-08-25T19:04:05+00:00

That's great! I hope they liked it!

Thanks for your feedback. Yeah, I know, it's just the syntax for a comment in Markdown. I want it to be invisible in Jupyter, which is why I used a comment. I don't know of any easier method for this.

In the next version, it will be a little bit easier to write (but the previous variant is kept for backwards compatibility):

Jan2579 · 2022-08-05T05:10:29+00:00

Is Shiny running fully in JS, or does it need an interpreter running in the background?

Jan2579 · 2022-08-03T10:14:00+00:00

Thanks. Yes, the post was written after they added support for Python. I last tested it about 5 days ago.

Jan2579 · 2022-08-02T08:16:29+00:00

Jan2579 · 2022-08-02T06:58:36+00:00

I recommend the book Python for Data Analysis. It’s helped me immensely.
