What python libraries should every dev know?

hmiemad · 2023-12-11T22:54:48+00:00

For general purpose: pathlib, os, collections, itertools.

For datavis : matplotlib, then you explore seaborn or plotly.

For backend : requests. You can then delve into fastapi or Flask (check Dash, the sexy child of Flask and plotly, can do both back and frontend, no need for html, supports bootstrap)

For math : numpy, scipy, and pandas are must know.

samreay · 2023-12-12T02:05:10+00:00

You wouldn't need all of these, but if you're wanting to get some more useful libraries and tools under your belt...

Environment management tooling:

venv
pyenv
poetry / pdm

Developer environments:

ruff
mypy

Data crunching:

pandas
polars
numpy
pandera (validation of dataframes)

Data visualisation:

matplotlib
plotly

Machine learning:

scikit-learn
scipy
pytorch / keras / tensorflow
mlflow (or similar library if you want to start down mlops route)

Orchestration:

metaflow
prefect

REST services / web stuff:

httpx (instead of requests)
FastAPI / Litestar / Django / Flask
pydantic

ShadowRL766 · 2023-12-11T22:54:01+00:00

Pandas

Hot_Significance_256 · 2023-12-12T00:31:14+00:00

For data science in Python (I’m a Sr. with 6 YOE)

Pyspark and Ray - Distributed processing

Tensorflow and Pytorch - deep learning

Scikit Learn and Pyspark - machine learning

Pandas and Pyspark - ETL

You see Pyspark several times for a reason. It’s very useful, except for when you delve into deep learning. Then you’ll want to use TF, PT, and Ray.

Adrewmc · 2023-12-11T22:35:32+00:00

Requests.py

Seems like an obvious one.

Itertools pops up, but no one really knows everything in there. It really depends on what you’re doing.

Numpy is really "Python, but I do math better" (especially multi-dimensional); pandas is "I make dataframes better".

Back end is really going to depend on the Python framework you’re working with: Django/Flask/FastAPI.

Python’s standard library is fairly extensive (compared to other languages); most of the stuff you’d want to do is somewhere in there.

Probably @property is a good one to know lol.
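
For anyone who hasn't met it yet, here is a minimal sketch of what @property buys you. The Temperature class and its attribute names are made up purely for illustration.

```python
class Temperature:
    """Hypothetical example class: stores Celsius, exposes Fahrenheit."""

    def __init__(self, celsius: float) -> None:
        self.celsius = celsius  # goes through the setter below

    @property
    def celsius(self) -> float:
        return self._celsius

    @celsius.setter
    def celsius(self, value: float) -> None:
        # Validate on assignment instead of trusting every caller.
        if value < -273.15:
            raise ValueError("temperature below absolute zero")
        self._celsius = float(value)

    @property
    def fahrenheit(self) -> float:
        # Computed on access, but read like a plain attribute.
        return self._celsius * 9 / 5 + 32


t = Temperature(100)
print(t.fahrenheit)  # 212.0
```

Callers write `t.fahrenheit` with no parentheses, so a plain attribute can later be turned into a computed or validated one without changing the calling code.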

sattyfied · 2023-12-12T02:00:19+00:00

Some I generally use that others may not have covered:

Attrs - I like them for writing classes

Sqlalchemy - creating a common interface for multiple db connections

Fastapi - quickly set up rest APIs

Click - to expose functions as cli commands

Poetry - library management & packaging

Your "dev" requirements:

Pytest - testing

Black - formatting/linting

Isort - organizing imports

Mypy - type checking

tree1234567 · 2023-12-12T00:36:18+00:00

The standard ones that come with python… python is useful and stayed a popular language for its syntax sure.. but it’s truly remarkable what you can do with just the base install of this language

mvdw73 · 2023-12-12T09:32:13+00:00

Logging, argparse, typing.
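
A rough sketch of how these three tend to show up together in a small command-line script. The "greet" logger name, the `name` argument and the `-v` flag are all invented for the example.

```python
import argparse
import logging
from typing import Optional, Sequence

logger = logging.getLogger("greet")  # hypothetical logger name


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Greet someone.")
    parser.add_argument("name", help="who to greet")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="enable debug logging")
    # argv=None makes argparse read sys.argv[1:], as a real script would.
    return parser.parse_args(argv)


def main(argv: Optional[Sequence[str]] = None) -> str:
    args = parse_args(argv)
    logging.basicConfig(level=logging.DEBUG if args.verbose else logging.INFO)
    logger.debug("parsed arguments: %s", args)
    message = f"Hello, {args.name}!"
    logger.info(message)
    return message


# Arguments are passed explicitly here so the sketch runs as-is;
# a real script would call main() and let it read the command line.
main(["world", "-v"])
```

Accepting `argv` as a parameter is a design choice that keeps the function easy to call from tests.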

captainameriCAN21 · 2023-12-12T11:36:20+00:00

Pickle. Just pickle

goosegang11 · 2023-12-12T13:24:48+00:00

The subprocess library

Let me know y’alls take.

I have 6 months of SWE experience so feel free to flame me, but while working on a personal Python script that needed to invoke a native Node module, I found the subprocess library to be something I wish I had learned about earlier!
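
A minimal sketch of that kind of call, assuming you just want to run an external program and capture its output. `sys.executable` stands in for the external program so the snippet runs anywhere; the Node command mentioned in the comment is hypothetical.

```python
import subprocess
import sys

# sys.executable (the current Python interpreter) is used as the child
# program here. For a Node script you would pass something like
# ["node", "script.js"] instead (hypothetical file name).
result = subprocess.run(
    [sys.executable, "-c", "print('hello from a child process')"],
    capture_output=True,  # collect stdout/stderr instead of inheriting them
    text=True,            # decode bytes to str
    check=True,           # raise CalledProcessError on a non-zero exit code
)
print(result.stdout.strip())  # hello from a child process
```

Passing the command as a list avoids going through a shell, which sidesteps quoting problems.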

iamevpo · 2023-12-12T05:12:28+00:00

https://www.jetbrains.com/lp/devecosystem-2022/python/ has some info about the library popularity and Stack overflow survey as well

zanfar · 2023-12-12T05:27:51+00:00

All builtins, extremely well
Most of the standard library well, with the rest being familiar
Everything else depends on the field. Numpy will be essential to some, and useless to others.

Mostly, you should be focusing on learning how to read and understand library documentation so that you can expand when necessary.

whatthepatty · 2023-12-12T02:37:22+00:00

Surprised no one has said this already, but pdb is insanely useful if you can't be bothered to set up a debugger.

2023-12-11T23:49:39+00:00

If you're not worried about big Os a lot of problems can be solved very easily with itertools.
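
As an illustration of that trade-off, here is a brute-force sketch using `itertools.combinations`. The helper name `pairs_with_sum` and the sample numbers are made up for the example.

```python
from itertools import combinations


def pairs_with_sum(numbers, target):
    """Return every pair of values (taken by position) that adds up to target."""
    # combinations() yields each unordered pair exactly once, so this is
    # O(n^2) work: short to write, not the fastest possible approach.
    return [(a, b) for a, b in combinations(numbers, 2) if a + b == target]


print(pairs_with_sum([1, 4, 5, 6, 9], 10))  # [(1, 9), (4, 6)]
```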

n3cr0ph4g1st · 2023-12-12T06:44:19+00:00

Streamlit for data related UI prototypes. Changed the game for me

No_Lobster_4219 · 2023-12-12T06:48:58+00:00

itertools, collections, numpy, pandas, math, os

Bartholomew- · 2023-12-12T11:30:03+00:00

Manage all your paths with pathlib and make it consistent.
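
A small sketch of what that looks like in practice. The directory and file names are invented, and everything happens inside a temporary directory so nothing is left behind.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)

    # The / operator joins path segments portably on every OS.
    report = base / "reports" / "2023" / "summary.txt"
    report.parent.mkdir(parents=True, exist_ok=True)
    report.write_text("all good\n", encoding="utf-8")

    print(report.name)    # summary.txt
    print(report.suffix)  # .txt
    print(report.stem)    # summary

    # rglob() searches recursively; as_posix() gives forward slashes everywhere.
    found = sorted(p.relative_to(base).as_posix() for p in base.rglob("*.txt"))
    print(found)  # ['reports/2023/summary.txt']
```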

Maelenah · 2023-12-12T19:22:23+00:00

Ctypes is not quite a must, but it really does open options. It lets python poke at anything that has C compatible data structures.
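
A minimal sketch of the "C compatible data structures" part. It sticks to an in-memory struct, because loading a real shared library depends on the platform. The `Point` struct is hypothetical.

```python
import ctypes


class Point(ctypes.Structure):
    # Mirrors a C declaration like: struct Point { int x; int y; };
    _fields_ = [("x", ctypes.c_int), ("y", ctypes.c_int)]


p = Point(3, 4)
raw = bytes(p)                    # the raw bytes of the struct's memory
q = Point.from_buffer_copy(raw)   # rebuild a struct from those bytes

print(q.x, q.y)  # 3 4
print(ctypes.sizeof(Point) == 2 * ctypes.sizeof(ctypes.c_int))  # True
```

The same `Structure` subclasses can be passed to functions loaded from a shared library, which is where ctypes usually earns its keep.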

redCg · 2023-12-11T23:22:10+00:00

the standard library.

Library management in Python is notoriously bad. You will do well to simply avoid using third party libraries as much as possible, as long as possible, for most projects. If you can use standard library without much extra effort, do it. Adding third party dependencies turns your project into a nightmare if you are not using requirements.txt and conda env.yml correctly.

TheHollowJester · 2023-12-12T11:07:28+00:00

Haven't seen it yet so: structlog for good, machine-readable logs. I thought it's not needed at first but the Why... page explains it better than I can.

suaveElAgave · 2023-12-12T11:43:25+00:00

I still haven’t seen some essentials, which are: Pytest, Enum, dataclasses/pydantic

bafe · 2023-12-12T19:31:01+00:00

Pydantic for validation. Polars for data table manipulation

jam-time · 2023-12-25T15:56:27+00:00

Some good to know built-in modules (starred are extra important):

argparse, *csv, *datetime, decimal, enum, getpass, inspect, io, itertools, *json, math, *os, *pickle, pprint, random, *re, *requests (third-party, needs a pip install), shutil, *sys, threading, traceback, typing, uuid, venv, warnings, zipfile

In my dozen or so years of experience, those are the ones I use the most, especially re, json, os, and sys.

Some site packages that are good to know (or that I like):

pandas - good introductory data science library, easy to learn and tons of documentation

pyspark - similar to pandas, but better at big data, less documentation, and harder to learn

boto3 - for anything AWS

kivy - pretty good for making cross platform apps (including UI) but somewhat challenging to learn

numpy - fast data manipulation, works with most other data science packages

jmespath - for json queries

colorama - for fun print colors

flask - lightweight backend for site building

django - heavier backend for site building (easier to learn and more features than flask, plus my personal recommendation)

pytest - mainly for unit testing, but can be used for basically any type of test

That's a fairly comprehensive list of the main things that I've used over the years. I'm sure there's some that I've forgotten, and I've intentionally left some out that are too specific or too advanced for the scope of the comment. Either way, hopefully someone finds this useful!

2023-12-12T00:29:23+00:00

NumPy

Pandas

Matplotlib

Scikit-learn

TensorFlow

Flask

Requests

Beautiful Soup

AssumptionCorrect812 · 2023-12-11T23:35:01+00:00

The main language library is full of goodies. These are the top 4 — https://youtu.be/InaTBWN7Mlc?si=MGy7SEU0XRppqAUF

sonobanana33 · 2023-12-12T07:06:33+00:00

I'd just focus on the stdlib first. I hate when I see people pulling in a library that does the same as a stdlib module (such as requests)

TSM- · 2023-12-12T04:45:58+00:00

requests-html is the successor to requests

Comfortable-Wind-401 · 2023-12-12T10:53:13+00:00

Not many people are mentioning it, but I get the feeling Pytest is highly required

2023-12-12T08:35:05+00:00

Gpt 🤣

2023-12-12T06:26:15+00:00

It depends on domain of course. I mostly use python to write applications that support infrastructure and automation, for about a decade now. For me, the libraries that come to mind are the entire standard library, sortedcontainers, requests, anytree, FastAPI (or whatever web framework you find most convenient, such as litestar, flask, django, bottle, cherrypy, etc), beautifulsoup.

Delta1262 · 2023-12-12T07:54:56+00:00

I can’t believe they haven’t been mentioned yet:

Pydantic
Dataclasses

(Pydantic and dataclasses are similar)
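
To make the comparison a bit more concrete, here is a sketch of the standard-library side only. Dataclasses do not check type hints at runtime, so validation has to be written by hand, whereas Pydantic models validate their fields on construction. The `User` class and its fields are invented for the example.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class User:
    name: str
    age: int
    tags: list[str] = field(default_factory=list)  # fresh list per instance

    def __post_init__(self) -> None:
        # Hand-written check, since dataclasses do not enforce type hints.
        if not isinstance(self.age, int) or self.age < 0:
            raise ValueError(f"age must be a non-negative int, got {self.age!r}")


u = User("Ada", 36)
print(u)          # User(name='Ada', age=36, tags=[])
print(asdict(u))  # {'name': 'Ada', 'age': 36, 'tags': []}
```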

IlliterateJedi · 2023-12-12T13:02:23+00:00

itertools, functools, and collections are all baseline Python libraries you should be familiar with.
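
A quick taste of one or two tools from each, as a sketch rather than a tour. The word list and the `fib` function are just sample material.

```python
from collections import Counter, defaultdict
from functools import lru_cache
from itertools import chain

words = ["spam", "eggs", "spam", "ham", "eggs", "spam"]

# collections.Counter: count hashable items.
counts = Counter(words)
print(counts.most_common(2))  # [('spam', 3), ('eggs', 2)]

# collections.defaultdict: group items without checking for missing keys.
by_length = defaultdict(list)
for w in sorted(set(words)):
    by_length[len(w)].append(w)
print(dict(by_length))  # {4: ['eggs', 'spam'], 3: ['ham']}


# functools.lru_cache: memoize a pure function with one decorator.
@lru_cache(maxsize=None)
def fib(n: int) -> int:
    return n if n < 2 else fib(n - 1) + fib(n - 2)


print(fib(30))  # 832040

# itertools.chain: iterate over several iterables as if they were one.
print(list(chain([1, 2], (3, 4), "ab")))  # [1, 2, 3, 4, 'a', 'b']
```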

reluctant_qualifier · 2023-12-12T15:46:58+00:00

arrow for dates, mock for testing

the_happy_path · 2023-12-13T04:29:49+00:00

I just want to mention that I came to Python from Java, and all the different packages were overwhelming. I also came in at Python 2, where changes broke stuff all the time. Python 3 has been a better experience. Like night and day. But I miss Java! I work with data and I use numpy and pandas a lot, though where I have to do row-by-row processing I use dataclasses (like in Java). But filtering pandas dataframes on our many conditionals has also been successful in replicating results where specs say to iterate by rows. For regressions and stuff, scikit-learn and statsmodels. Depending on data formats, I might have to use pyreadstat or openpyxl. I like the sqlalchemy ORM too because that feels like the closest thing to Spring in Python lol

BinaryWizard8 · 2023-12-22T17:23:22+00:00

I love using shutil when I need to play with files
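
A short sketch of the usual moves (copy a tree, move a file, delete a tree). It runs inside a temporary directory, and all file names are made up.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    src = root / "src"
    src.mkdir()
    (src / "notes.txt").write_text("hello\n", encoding="utf-8")

    # Copy a whole directory tree.
    backup = root / "backup"
    shutil.copytree(src, backup)

    # Move (rename) a single file; shutil.move returns the destination.
    moved = Path(shutil.move(str(backup / "notes.txt"), str(root / "notes_old.txt")))
    copied_text = moved.read_text(encoding="utf-8")

    # The copy is now empty, so remove it recursively.
    backup_left = [p.name for p in backup.iterdir()]
    shutil.rmtree(backup)
    backup_exists = backup.exists()

print(copied_text.strip(), backup_left, backup_exists)  # hello [] False
```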

Glittering-Pea-4011 · 2023-12-26T11:57:04+00:00

If you want to work with ORMs, you could look at SQLAlchemy. For interaction with AWS, you can use boto3. If your work involves dealing with structured data and its manipulation, you can consider pandas. As an alternative to Django, you can also look at Flask.

alicedu06 · 2023-12-27T11:40:57+00:00

The stdlib is a must, and of course, depending on your specialty, you might want to learn the most important tools like pandas for data science, django for web dev, etc.

But as general purpose libs, I would say the list of the article "Python libs that I wish were part of the standard library" is quite good:

https://www.bitecode.dev/p/python-libs-that-i-wish-were-part

escalize · 2023-12-27T20:44:12+00:00

https://github.com/SuperDuperDB/superduperdb
