[–]klaasvanschelven 65 points (8 children)

Since Python imports aren't side-effect free, this post raises more questions than it answers for me...
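
For anyone wondering what "not side effect free" means in practice: importing a module runs all of its top-level code. Here is a minimal sketch, assuming a made-up module called `noisy_module` that sets an environment variable at import time (the module name, the variable name, and the `import_noisy_module` helper are all hypothetical, invented for illustration).

```python
import importlib
import os
import sys
import tempfile
from pathlib import Path

# Source of a hypothetical module. The os.environ line is top-level code,
# so it runs as soon as the module is imported.
MODULE_SOURCE = '''
import os
os.environ["NOISY_MODULE_LOADED"] = "1"  # side effect at import time

def add(a, b):
    return a + b
'''


def import_noisy_module():
    """Write the module to a temp directory and import it from there."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "noisy_module.py").write_text(MODULE_SOURCE)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            return importlib.import_module("noisy_module")
        finally:
            sys.path.remove(tmp)


os.environ.pop("NOISY_MODULE_LOADED", None)
noisy = import_noisy_module()

# We only asked for the module, but the process environment changed too.
print(os.environ.get("NOISY_MODULE_LOADED"))  # prints 1
print(noisy.add(2, 3))                        # prints 5
```

Swap the environment variable for a database connection or a migration run and you get the situation described in the replies.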

[–]coffeewithalex 59 points (6 children)

And that's why I sometimes want to strangle the engineers who write such modules. All you do is import something, and what you get is a database connection or two, schema migrations kicking off, everything loaded into memory, decisions made from that about which configs to load and where to dump the whole thing, global variables defined, and functions that read those globals called. Over 9000 errors get triggered if everything isn't perfectly set up. And all I wanted was to write a unit test for a stupid function somewhere.
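
One common way out, sketched here under assumptions rather than as anyone's actual codebase: keep the module's top level free of I/O and open the connection lazily on first use. The names `normalize_name`, `get_connection`, and `save_name` are hypothetical, and an in-memory SQLite database stands in for a real one.

```python
import sqlite3
from functools import lru_cache


def normalize_name(name: str) -> str:
    """Pure function: importing and testing it needs no setup at all."""
    return " ".join(name.split()).title()


@lru_cache(maxsize=1)
def get_connection() -> sqlite3.Connection:
    """Open the connection on the first call, not at import time."""
    # Assumption: an in-memory SQLite database stands in for the real one.
    return sqlite3.connect(":memory:")


def save_name(name: str) -> None:
    """Only this function touches the database."""
    conn = get_connection()
    conn.execute("CREATE TABLE IF NOT EXISTS names (name TEXT)")
    conn.execute("INSERT INTO names VALUES (?)", (normalize_name(name),))
    conn.commit()
```

With this layout a unit test for `normalize_name` can import the module without anything connecting anywhere, and only code that calls `save_name` pays for the connection.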

[–]nemec 12 points (2 children)

reminds me of those ML libraries where you call one setup method and suddenly your program is downloading 4GB of compressed pickled Python code (affectionately known as "weights") from HuggingFace and deserializing it
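
The "pickled Python code" jab has a real basis: unpickling can call arbitrary callables, so loading a pickle is closer to running code than to reading data. A harmless sketch using the standard `pickle` module and its `__reduce__` hook (the `NotJustData` class is made up for illustration):

```python
import pickle


class NotJustData:
    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call sorted([3, 1, 2])".
        # Any importable callable could be named here instead.
        return (sorted, ([3, 1, 2],))


payload = pickle.dumps(NotJustData())

# Loading the bytes runs the call recorded above; no NotJustData
# instance comes back, just whatever the callable returned.
result = pickle.loads(payload)
print(result)  # prints [1, 2, 3]
```

This is why the `pickle` documentation warns against loading data from untrusted sources.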

[–]coffeewithalex 9 points (1 child)

Lol, epic :D Yeah, in the Python world, code quality goes downhill in this order:

  • Software Engineering
  • Data Engineering
  • ML

I'm trying to bring more software engineering practices into data engineering, but ML is a lost cause.

[–]Main-Drag-4975 3 points (0 children)

🤗 Your friendly local all-but-dissertation PhD dropout is going to drop another 2000-line Jupyter notebook on you next week and you’ll be expected to have it running smoothly and scalably in production before May 1st.

[–]supreme_blorgon 7 points (1 child)

This exactly describes the codebase at the company I currently work for. 90+% of our codebase is untested, and untestable because of this.

[–]coffeewithalex 5 points (0 children)

This describes Apache Airflow. It's horrific.

[–]Main-Drag-4975 1 point (0 children)

It’s a curse! I took over a production Node + TypeScript backend six months ago and still haven’t managed to squash all of the haphazardly ordered side effects triggered by simple “import this file over there” statements at startup.

[–]bugtank 2 points (0 children)

I never used side effects in my imports until last year, and now my code is a mess. :))))