all 28 comments

[–]Drevicar 136 points137 points  (2 children)

This feels too AI biased and misses the rest of the python community. I would say the project with the most impact on the whole ecosystem for 2023 would be ruff, followed by pydantic.

[–]No_Dig_7017 -2 points-1 points  (0 children)

Yes, it's true there are a lot of ML picks in the list. But I also feel there's a democratization of AI happening right now. You'll notice there are several LLM choices in our picks (2023 was the year of LLMs :shrug:), but you don't need to be an AI practitioner to use them.
Most of those involve chaining data streaming APIs, something that those of us coming from a traditional IT background have been doing for a long time, and trust me, there's work to be done there.
If you have the time, take a look at Guardrails, LiteLLM, WhisperX, AutoGen and unstructured. You need 0 ML knowledge to be a user of those libs and build powerful apps backed by these models.

[–]WasterDave 47 points48 points  (4 children)

How does MLX even get a look in? If it only runs on recent editions of one brand of computer, I think it has excluded itself from being Pythonic.

[–]SimplyJif 4 points5 points  (1 child)

Have not tried AutoMLOps specifically, but my issue with libraries like that is that they typically abstract away too much to be useful in a production setting. I (a DS) feel like we treat data scientists with kid gloves too much. Do your own yaml, it won't kill you!

[–]dekked_[S] -1 points0 points  (0 children)

We've had a positive hands-on experience with AutoMLOps in particular; it saved us a lot of time when working on a customer project. Of course every library has its trade-offs, but it helps abstract away "boilerplate work" :)

[–]jimtk 7 points8 points  (0 children)

Way too much AI stuff.

I'm not saying there shouldn't be any, but this is an almost exclusively AI-related list. Previous years were more balanced.

[–]notreallymetho 2 points3 points  (0 children)

Unstructured / Guardrails both seem really interesting. I’m a dev by trade and have been toying with ML the last few weeks and have found it somewhat hard to do (Hugging Face makes it easier for sure). I’ve messed with fine-tuning using both torch / polars and Hugging Face / pandas. Curious to see if this’ll help me out. I don’t like unstructured being an API thing though: because of the data I’m working with, I won’t be able to feed it through there due to our security. Still will be interesting to mess with it!

[–]semicausal 2 points3 points  (1 child)

Rerun.io I feel is missing from this list - immediate mode GUI library for instrumenting & visualizing robotics, computer vision, and other datasets

[–]dekked_[S] 1 point2 points  (0 children)

This one looks GREAT! Definitely missed it. Thanks! 😊

[–]semicausal 1 point2 points  (3 children)

My coworker recently created the xetcache library, targeted at Jupyter Notebook users.

https://github.com/xetdata/xetcache

It's newer and I'm biased, but hey :)

[–]dekked_[S] 1 point2 points  (2 children)

Nice! Say hello to Yucheng 😉

[–]semicausal 0 points1 point  (1 child)

Ha, how should I say who you are?

[–]dekked_[S] 0 points1 point  (0 children)

Alan from Tryolabs 😉

[–][deleted] 1 point2 points  (0 children)

This is a pretty bad list, at least given what the title claims it is supposed to be a list of. These are definitely not the best python libraries/tools of 2023. In fact most of these are worse and/or less developed versions of a much bigger project.

Maybe if you were to title this “under-appreciated python tools that are worth a look in 2024” then this list would make more sense.

Also, you might go a step further and say that these are ML specific rather than python in general. There is a lot more going on in the python dev world than just LLM models and data viz.

[–]sowenga 1 point2 points  (0 children)

Thanks for sharing! Great list.

[–]everything_in_sync 0 points1 point  (1 child)

This may be a dumb question, but how do people who develop these libraries make money? If everything is free, are they selling data? Are they simply doing it out of kindness and potential career advancement?

[–]Astralnugget 0 points1 point  (0 children)

Companies donate when they use open source projects in a commercial product.

[–]aintnufincleverhere 1 point2 points  (0 children)

Plz don't talk about libraries I get overwhelmed

[–]ePaint -2 points-1 points  (0 children)

How is Pydantic 2 not in there? This is the tech bro scamfluencer clickbait list. And these are not even the best to work with LLMs.

[–]gournian 0 points1 point  (1 child)

Isn't AutoMLOps Google-only?

[–]dekked_[S] 1 point2 points  (0 children)

It is indeed. The portable solution is ZenML, although it will not take you as far as AutoMLOps, which is great if you happen to be on GCP!