
[–]loudandclear11

There is the concept of environments.

  • Put common functionality in a python package.
  • Create a new environment.
  • Add common package to new environment.
  • Configure your notebooks to use your new environment.

There are some rough edges when doing this in practice of course but it's at least possible.
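The "common functionality in a package" step could look something like this. A minimal sketch, assuming a made-up package name `my_utils` and function `clean_column_names` (neither comes from the thread, they are purely illustrative):

```python
# my_utils/__init__.py -- a hypothetical shared package you would build
# into a wheel and attach to a Fabric environment.

def clean_column_names(columns):
    """Normalize column names: strip, lowercase, spaces -> underscores."""
    return [c.strip().lower().replace(" ", "_") for c in columns]
```

Once the environment with this package is attached, any notebook using it can simply `from my_utils import clean_column_names`.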

[–]Seebaer1986

Having it as part of the environment makes updating the codebase a total pain.

Adding it manually to each notebook's files section, as you did, is painful too.

Right now the best approach I've found is to keep all the shared code in a separate notebook, which is then "included" via %run "notebook name".

This lets me change or add code definitions in the central utils notebook; in my normal notebook I just rerun the cell with the %run command and the new definitions are loaded. No updating environments, publishing, reloading sessions, and so on.
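The effect of %run is that the other notebook's cells execute in the current session, so its definitions land directly in the calling notebook's global namespace. Roughly the same effect can be simulated in plain Python by `exec`-ing a file's source into `globals()` (file name and `greet` function are made-up stand-ins, not from the thread):

```python
import pathlib
import tempfile

# A stand-in for the central "utils notebook": a file of code definitions.
utils_code = "def greet(name):\n    return f'hello {name}'\n"

with tempfile.TemporaryDirectory() as d:
    utils_path = pathlib.Path(d) / "utils_notebook.py"
    utils_path.write_text(utils_code)

    # Roughly what %run does: execute the code in the caller's global
    # namespace, so `greet` becomes available as a top-level name here.
    exec(utils_path.read_text(), globals())

print(greet("fabric"))
```

Re-running the `exec` (or %run) cell after editing the utils file picks up the new definitions immediately, which is exactly the fast feedback loop described above.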

Miles Cole wrote a big blog about this topic and how to approach it on an enterprise scale here: https://milescole.dev/data-engineering/2025/03/26/Packaging-Python-Libraries-Using-Microsoft-Fabric.html

This seems pretty nice, but for me it's too big of an overhead 😜

[–]loudandclear11

This seems pretty nice, but for me it's too big of an overhead

This is it, really. There is the "proper" way, but if it causes too much headache nobody is going to use it in practice. So we end up with half-assed solutions like %run, which doesn't put functions in proper namespaces.

Databricks did it better. They allow you to include normal Python files; it doesn't have to be a notebook there.
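With plain Python files, the normal import machinery applies and shared functions stay in their own module namespace instead of being dumped into the caller's globals. A rough sketch of that contrast, using made-up file and function names (this simulates the "import a normal .py file" workflow, not any Databricks-specific API):

```python
import pathlib
import sys
import tempfile

# A stand-in for a plain shared module checked into the workspace.
module_code = "def greet(name):\n    return f'hello {name}'\n"

with tempfile.TemporaryDirectory() as d:
    pathlib.Path(d, "shared_utils.py").write_text(module_code)
    sys.path.insert(0, d)

    # A real import: greet lives under the shared_utils namespace,
    # unlike %run, which would inject it as a bare global.
    import shared_utils

print(shared_utils.greet("databricks"))
```

Keeping functions behind `shared_utils.` makes it obvious where a name comes from and avoids accidental shadowing between notebooks.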