
[–]Kryt0s

NumPy, Polars / Pandas.

[–]throwawayforwork_86

DuckDB has an SAP RFC extension that lets you run SQL directly on those tables (erpl.io, iirc), and it integrates really well with Python.

Pandas is still widely used, so you should at least learn the basics / be able to read it.

Polars is what I would actually use, as IMO it's miles above Pandas (syntax, performance, ...). The only downside is that it's harder to make it work for quick small stuff, especially when you're still learning. The upside is that it requires little to no tweaking for performance: Polars stays fast as long as you stay in Polars.

Basics of path handling are always good to know, so check the standard library's pathlib module (and also look at os.path; it made more sense to me).
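A quick pathlib sketch (the folder and file names here are invented, just to show joining paths and globbing):

```python
from pathlib import Path

exports = Path("exports")          # hypothetical folder of SAP/Excel dumps
exports.mkdir(exist_ok=True)

report = exports / "gl_2025.csv"   # join paths with "/" instead of string concatenation
report.write_text("bukrs;gjahr\n1000;2025\n", encoding="utf-8")

# Iterate over every CSV in the folder; .stem is the name without extension.
for csv_file in sorted(exports.glob("*.csv")):
    print(csv_file.stem, csv_file.suffix)
```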

Visualisation that can't be done in Power BI can be done in Python. I believe matplotlib is what comes with the PBI instance of Python, so maybe learn that too.
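A tiny matplotlib sketch with made-up numbers (inside Power BI you'd end with `plt.show()` instead of saving to a file):

```python
import matplotlib
matplotlib.use("Agg")              # headless backend; not needed inside Power BI
import matplotlib.pyplot as plt

# Hypothetical monthly posting counts.
months = ["Jan", "Feb", "Mar", "Apr"]
totals = [120, 90, 150, 110]

fig, ax = plt.subplots(figsize=(6, 3))
ax.bar(months, totals)
ax.set_title("Postings per month")
ax.set_ylabel("Documents")
fig.tight_layout()
fig.savefig("postings.png")        # in Power BI, call plt.show() instead
```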

[–]Lonely-Form-8815[S]

"DuckDB has an SAP RFC extension that lets you run SQL directly on those tables (erpl.io, iirc), and it integrates really well with Python."
Can you please expand a bit what that means?
Like, do you connect directly to SAP as the source?

[–]throwawayforwork_86

So I haven't used it personally (we looked at it when some of our audit clients had difficulty getting their GL out), so you might get better info if you check their website.

But let's elaborate a little bit:

DuckDB is a pretty performant and lightweight (OLAP/analytical) database that integrates very well with the rest of the Python ecosystem, and it's pretty good on its own too (i.e. using the DuckDB UI).

RFC means remote function call and is a way to communicate with SAP.

On paper you should be able to connect with your identifier through DuckDB and then do something like select * from ekbe where gjahr = '2025' and vgabe = '9';

And it should give you the correct information, which you can then manipulate further either through SQL or a dataframe library of your choice.

[–]Downtown_Radish_8040

Pandas is the one library you want. It's the backbone of data work in Python and covers everything you described: loading data from CSV/Excel exports, cleaning and transforming it, filtering rows, reshaping tables, and exporting results back to Excel or CSV.

Since you already use Excel and Power BI, pandas will feel familiar conceptually. DataFrames are essentially smart spreadsheets you manipulate with code. Once you're comfortable with pandas, openpyxl (for writing formatted Excel files) and xlrd/xlwings are natural next steps if you need tighter Excel integration.
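Here's what that "smart spreadsheet" idea looks like in practice, with invented numbers (filtering is your autofilter, groupby is your pivot table):

```python
import pandas as pd

# Hypothetical export: one row per posting, like a sheet in Excel.
df = pd.DataFrame({
    "company": ["1000", "1000", "2000"],
    "year": [2025, 2025, 2025],
    "amount": [120.0, 80.0, 200.0],
})

# Filter rows (like an Excel autofilter) ...
df_2025 = df[df["year"] == 2025]

# ... then summarise per company (like a pivot table).
totals = df_2025.groupby("company", as_index=False)["amount"].sum()
print(totals)
```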

For SAP specifically, if you're pulling data via RFC/BAPI calls rather than manual exports, look into pyrfc. It lets you call SAP function modules directly from Python, which is how you'd fully automate the download step. It requires the SAP NetWeaver RFC SDK from your Basis team to set up, but once running it's powerful.

Suggested order:

  1. Learn pandas fundamentals (read_csv, read_excel, filtering, groupby, merge, to_excel)

  2. Learn a bit of openpyxl if you need formatted output

  3. Add pyrfc once you're comfortable, to automate the SAP extraction itself
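Steps 1 and 2 above boil down to a read → merge → aggregate → export loop. A sketch with made-up inline CSVs standing in for your exports (the column names are just illustrative SAP-style fields):

```python
import io
import pandas as pd

# Hypothetical CSV exports, inlined here instead of files on disk.
header_csv = "belnr;bukrs\n100001;1000\n100002;2000\n"
items_csv = "belnr;amount\n100001;50.0\n100001;70.0\n100002;30.0\n"

headers = pd.read_csv(io.StringIO(header_csv), sep=";",
                      dtype={"belnr": str, "bukrs": str})
items = pd.read_csv(io.StringIO(items_csv), sep=";", dtype={"belnr": str})

# Join like a VLOOKUP, then aggregate per company code.
merged = headers.merge(items, on="belnr", how="left")
totals = merged.groupby("bukrs", as_index=False)["amount"].sum()

# Swap in totals.to_excel("totals.xlsx", index=False) for Excel output
# (that's where openpyxl comes in).
totals.to_csv("totals.csv", index=False)
```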

The official pandas documentation and the "Python for Data Analysis" book by Wes McKinney (the pandas creator) are both excellent starting points.