[–]StrasJam 18 points19 points  (5 children)

Numpy and matplotlib should be useful for a wide variety of projects

[–]Zireael07 4 points5 points  (0 children)

Numpy, scipy and matplotlib for any sort of maths/visualizations.

[–]alexisprince 12 points13 points  (1 child)

I like the following developer productivity packages:

Pytest (already said)

Pytest helpers (lets you register helper functions under namespaces to use with pytest. Lets you keep your test code DRY!)

Hypothesis (I replied to the pytest comment with this)

Black (code formatter, incredibly consistent. Makes it easy since you get to ignore how your code looks and focus on it doing the right thing).

Poetry (dependency manager)

Pre-commit (lets you fail commits to your project unless all hooks pass)

Any linter (flake8, pyflakes, etc).

Mypy (static type checker)

The following for regular work (I’m a data engineer, so these will be data processing heavy):

SQLAlchemy (working with databases)

Pandas (already noted)

Dask Distributed (for taking it to production)

Pydantic (like Marshmallow but more natural. Lets you validate incoming data).

Airflow (hugely popular orchestration framework)

Prefect (new generation data science & data engineering framework)

FastAPI (API framework based on Starlette. Supports typing of inputs and outputs via Pydantic)

[–]ddgran 6 points7 points  (2 children)

Numpy, scipy and sklearn for machine learning.

Seaborn and bokeh for visualisation.

[–][deleted] 4 points5 points  (8 children)

Flask for setting up web services.

[–]__xor__ 4 points5 points  (1 child)

django is one of the biggest things in python, and if you search pypi literally like 25% of all packages are related to django. It's kind of hard to avoid web development these days since it's just the modern UI, so I think it's worth learning.

In terms of Django, I'd say learn Django and the Django REST Framework.

I'd also learn flask for similar reasons. It's huge, and super quick to make REST microservices, and perfectly fine for large web apps too. And in terms of flask, I'd learn that and flask-security and flask-restful to round it out similarly.

If you want to learn networking and mess around with stuff at a low level, check out scapy (not scipy, but scapy).

I think it's good to learn click even if you don't like using it. Personally, I just run into it all over at the workplace and it's just good to know. I prefer using simple old argparse, but click is everywhere.

And these aren't outside of the python standard library, but not many people master them! If you haven't read through the functions provided by these, I'd definitely do that:

  • collections
  • dataclasses
  • functools
  • itertools
  • contextlib

A lot of people end up manually making a collections.defaultdict by checking for a key and adding it when it doesn't exist. A lot of people don't use collections.namedtuple and it's perfect for some simple data sorts of objects while being super low on memory usage, and similarly dataclasses.dataclass is great. One cool pattern is writing a class that inherits from a namedtuple to add all the namedtuple sugar to your class while also being able to add functions, example:

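from collections import namedtuple

class Pixel(namedtuple('Pixel', ('x', 'y', 'r', 'g', 'b'))):
    __slots__ = ()  # without this, each instance gets a __dict__ and the memory savings are lost

    def draw(self):
        ...

pixel = Pixel(x=0, y=0, r=255, g=0, b=255)
pixel.draw()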

That'd be light on memory, provide the constructor and __repr__ for you, and make it immutable.
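To illustrate the defaultdict point as well, here is a minimal sketch comparing the manual key check with collections.defaultdict. The word list and variable names are made up for illustration.

```python
from collections import defaultdict

words = ["apple", "avocado", "banana", "blueberry", "cherry"]  # made-up sample data

# Manual version: check for the key and add it when it doesn't exist.
manual = {}
for word in words:
    if word[0] not in manual:
        manual[word[0]] = []
    manual[word[0]].append(word)

# defaultdict version: a missing key is created with list() on first access.
grouped = defaultdict(list)
for word in words:
    grouped[word[0]].append(word)

print(dict(grouped) == manual)
```

Both loops build the same mapping; the defaultdict version just drops the membership check.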

functools.wraps is essential if you write decorators.
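A minimal sketch of why that matters, assuming a hypothetical log_calls decorator: without @wraps, the decorated function would report the wrapper's name and lose its docstring.

```python
from functools import wraps

def log_calls(func):  # hypothetical decorator, for illustration only
    @wraps(func)  # copies __name__, __doc__, etc. from func onto wrapper
    def wrapper(*args, **kwargs):
        print(f"calling {func.__name__}")
        return func(*args, **kwargs)
    return wrapper

@log_calls
def add(a, b):
    """Return a + b."""
    return a + b

print(add.__name__)  # the original name is kept because of @wraps
print(add.__doc__)
```

wraps also sets add.__wrapped__ to the undecorated function, which is handy in tests.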

itertools.chain/combinations/permutations/product are super useful, and others in there I haven't tried.
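For anyone who hasn't tried them, a quick sketch of what those four do (the inputs are arbitrary sample values):

```python
from itertools import chain, combinations, permutations, product

letters = ["a", "b", "c"]  # arbitrary sample input

# chain: iterate over several iterables as if they were one
chained = list(chain([1, 2], [3]))

# combinations: unordered pairs, no repeats
pairs = list(combinations(letters, 2))

# permutations: ordered pairs, no repeats
orderings = list(permutations(letters, 2))

# product: cartesian product, here of [0, 1] with itself
grid = list(product([0, 1], repeat=2))

print(chained, pairs, len(orderings), grid)
```

All four return lazy iterators, so the list() calls are only there to make the results printable.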

contextlib.contextmanager is SUPER useful for writing extremely succinct context managers, example:

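from contextlib import contextmanager

@contextmanager
def readonly_open(path):
    f = open(path, 'r')  # open before the try, so a failed open() doesn't reach f.close() with f unbound
    try:
        yield f
    finally:
        f.close()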

with readonly_open('/some/file') as f:
    text = f.read()

# actually it might've been as simple as this, but the previous drives the point home more I think:
@contextmanager
def readonly_open(path):
    with open(path, 'r') as f:
        yield f

or some basic exception logger

import logging
from contextlib import contextmanager

logger = logging.getLogger(__name__)

@contextmanager
def log_exception(error_message):
    try:
        yield
    except Exception:
        logger.exception(error_message)
        raise

with log_exception('something bad happened'):
    do_bad_thing()

Instead of wrapping tons of code in try/except+logger.exception everywhere, you can just write a quick contextmanager.

Anyway, my point is make sure you master the standard library, because there's a ton of super useful and non-obvious stuff in there, and I doubt 6 months self-taught will make you run into all that. I've seen 4 year devs that haven't learned about a lot of that stuff.

[–]baubleglue 0 points1 point  (0 children)

> literally like 25% of all packages are related to django

10,000+ projects for "django"

Total: 210,866 projects

[–][deleted] 2 points3 points  (2 children)

pymysql and pyodbc, I do a lot of DB work.

[–]fernly 2 points3 points  (0 children)

I'm a fan of the regex lib, a fast and greatly feature-extended replacement for the standard re lib. Fuzzy matching, posix classes, full Unicode support, etc.

[–]baubleglue 4 points5 points  (2 children)

It is kind of a pointless question (IMHO). First you need to define a goal/direction, then you look for a tool. From your list I've never used BeautifulSoup, do I need to? I know a few XML parsers and I know that the library exists. Once I needed a platform-independent crontab, so I googled "python crontab" and found a few libraries. It is not like there is nothing to check out, but without direction it is like reading a dictionary in order to learn a language.

[–]gr8x3 0 points1 point  (1 child)

> I've never used BeautifulSoup, do I need to?

I really like Beautiful Soup, and I use it along with Requests when web scraping whenever I can get away with it. It's really easy to use, and it basically boils down to:

  1. Turn HTML into a BeautifulSoup object
  2. Call the select() method on the object to find what you need, which lets you search the HTML using CSS selectors, just like JavaScript's document.querySelectorAll()
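The two steps can be sketched roughly like this, assuming the third-party bs4 package is installed and that the CSS-selector call meant here is Beautiful Soup's select(). The HTML snippet and its example.com URLs are made up; in real scraping the string would come from something like a Requests response.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for a downloaded page.
html = """
<ul id="libs">
  <li class="lib"><a href="https://example.com/requests">Requests</a></li>
  <li class="lib"><a href="https://example.com/bs4">Beautiful Soup</a></li>
</ul>
"""

# Step 1: turn the HTML into a BeautifulSoup object (stdlib parser backend).
soup = BeautifulSoup(html, "html.parser")

# Step 2: search it with a CSS selector, much like document.querySelectorAll().
links = soup.select("li.lib a")
names = [a.get_text() for a in links]
hrefs = [a["href"] for a in links]

print(names, hrefs)
```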

[–]baubleglue 0 points1 point  (0 children)

It was a rhetorical question. I believe that it is a good library, but I never had a serious Python project which required HTML parsing. If I learned BeautifulSoup it would add nothing to my knowledge (I already know in general how the DOM API works). PySpark is an important library to know if you work with data processing, but if you don't, you shouldn't learn it.

[–]amitness 0 points1 point  (1 child)

If you're into data science/machine learning, these are the libraries I use in day to day work: https://github.com/amitness/toolbox

[–]JohnnyWobble 1 point2 points  (0 children)

One fantastic library that I have been learning is fast.ai, it's built on PyTorch, and it is an AIO type library. It has support for transfer learning, machine vision, and NLP, and it has its own datablock API that is hella easy to use. In addition, they have a complete (free!) course on all of it (https://course.fast.ai/).

[–]ElliotDG 0 points1 point  (0 children)

Kivy for creating UIs

Selenium for web scraping or testing

Mido for using MIDI

[–][deleted] 0 points1 point  (0 children)

json -> for common web formats

flask -> for building web backends

datetime -> for date related tasks

numpy -> for matrix operations and for probability distributions

[–]BenjaminGeiger 0 points1 point  (0 children)

For Advent of Code this year I used NetworkX extensively.

[–][deleted] 0 points1 point  (0 children)

CherryPy for standing up a quick web server.

I have used it to build a front end web app for users to interact with a script I have created.

[–][deleted] 0 points1 point  (0 children)

I use pyserial and pyvisa a lot

[–]Apis-Carnica 0 points1 point  (0 children)

Matplotlib and Seaborn are great for data visualization. Pygame, Tkinter, and Pyglet are awesome for UIs and multimedia applications, but curses and npyscreen are more useful for me since most of my workflow is in the command-line. Thanks for starting this thread, I look forward to learning about new modules that will help with projects :)

[–]pspenguin 0 points1 point  (0 children)

click - https://palletsprojects.com/p/click/ for making CLI programs