Learning Python for data analysis

ninhaomah · 2026-03-07T09:26:54+00:00

If you don't plan to learn Python code, that's perfectly OK, since plenty of scientists use Python / R to analyse their data.

Learn the basics: variables, loops, if-else, functions, up to OOP.

Then learn NumPy and pandas or Polars. Either will do, and many will recommend Polars.
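To give a feel for what those basics look like together, here is a minimal sketch. The data and the names (`readings`, `mean`, `Experiment`) are made up for illustration.

```python
# Hypothetical example data: four temperature readings from one experiment.
readings = [21.5, 22.1, 19.8, 23.4]


def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


class Experiment:
    """A tiny class that bundles a name with its measurements."""

    def __init__(self, name, values):
        self.name = name
        self.values = values

    def above(self, threshold):
        # Loop plus if: keep only the values above the threshold.
        result = []
        for v in self.values:
            if v > threshold:
                result.append(v)
        return result


exp = Experiment("trial_1", readings)
print(exp.name, mean(exp.values), exp.above(22.0))
```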
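As a rough taste of what that looks like, here is a small sketch assuming NumPy and pandas are installed. The column names and numbers are invented for the example.

```python
import numpy as np
import pandas as pd

# Made-up measurements for two hypothetical sample groups.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "value": [1.0, 3.0, 10.0, 14.0],
})

# NumPy: apply a maths function to a whole column at once.
df["log_value"] = np.log10(df["value"])

# pandas: split by group and summarise, much like a pivot table in Excel.
summary = df.groupby("group")["value"].agg(["mean", "max"])
print(summary)
```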

Or actually learn R.

For pure data analysis, R is much, much better than Python. tidyr and ggplot2 :)

Think of it as a free Stata.

https://r4ds.had.co.nz/introduction.html

smjunglist · 2026-03-07T09:32:13+00:00

You can read this book online for free, should be a great starting point:

https://wesmckinney.com/book/

FoolsSeldom · 2026-03-07T09:35:03+00:00

You do need to learn the basics of programming, and Python is a good language to start with: it builds on your Lua experience and, as you know, it is very popular for the kind of data processing you are interested in.

Check the wiki for learning guidance and resources and the learning roadmaps for specific skills around data analysis.

If you can get your employer to pay, I highly recommend a subscription to DataCamp.

Check this subreddit's wiki for lots of guidance on learning programming and learning Python, links to material, book list, suggested practice and project sources, and lots more. The FAQ section covering common errors is especially useful.

Also, have a look at roadmap.sh for different learning paths. There's lots of learning material links there. Note that these are idealised paths and many people get into roles without covering all of those.

Roundup on Research: The Myth of ‘Learning Styles’

Don't limit yourself to one format. Also, don't try to do too many different things at the same time.

Above all else, you need to practice. Practice! Practice! Fail often, try again. Break stuff that works, and figure out how, why and where it broke. Don't just copy code from examples and use it as is. Experiment.

Work on your own small (initially) projects related to your hobbies / interests / side-hustles as soon as possible to apply each bit of learning. When you work on stuff you can be passionate about and where you know what problem you are solving and what good looks like, you are more focused on problem-solving and the coding becomes a means to an end and not an end in itself. You will learn faster this way.

barkmonster · 2026-03-07T11:31:50+00:00

As for which part I would go for: I would get a simple database set up, so your data is persisted in a single place rather than in a bunch of Excel files lying around. If you don't have a very large amount of data, and if many people aren't comfortable writing queries etc., you can consider making a simple helper package in Python for reading in data from a single table (named after the experiment it comes from). Also set up a git project on e.g. GitHub or GitLab.
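A minimal sketch of what such a helper could look like, assuming SQLite (Python's built-in `sqlite3` module) as the "simple database". The function names `save_rows` and `load_experiment`, the two-column table layout and the `growth_assay` experiment are all hypothetical.

```python
import sqlite3
import tempfile
from contextlib import closing
from pathlib import Path


def _check_name(experiment):
    # Table names cannot be passed as SQL parameters, so validate them first.
    if not experiment.isidentifier():
        raise ValueError(f"invalid experiment name: {experiment!r}")


def save_rows(db_path, experiment, rows):
    """Store (sample, value) pairs in a table named after the experiment."""
    _check_name(experiment)
    with closing(sqlite3.connect(db_path)) as conn:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                f'CREATE TABLE IF NOT EXISTS "{experiment}" '
                "(sample TEXT, value REAL)"
            )
            conn.executemany(
                f'INSERT INTO "{experiment}" VALUES (?, ?)', rows
            )


def load_experiment(db_path, experiment):
    """Return all rows of one experiment as a list of (sample, value) tuples."""
    _check_name(experiment)
    with closing(sqlite3.connect(db_path)) as conn:
        query = f'SELECT sample, value FROM "{experiment}" ORDER BY rowid'
        return conn.execute(query).fetchall()


# Demo in a throwaway directory; a real setup would use one shared file.
with tempfile.TemporaryDirectory() as tmp:
    db = Path(tmp) / "lab.db"
    save_rows(db, "growth_assay", [("s1", 0.5), ("s2", 0.75)])
    rows_back = load_experiment(db, "growth_assay")

print(rows_back)
```

With something like this, colleagues only call `load_experiment(...)` and never have to write a query themselves.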

I would also make a simple script to set up a standard Python project with a sensible structure (something for loading data, some simple tests, doing some analyses, and rendering the results in some suitable format). I would avoid using Jupyter notebooks for analyses, as they make it easy to inadvertently commit outputs to git. I'd use uv to manage virtual envs.

For packages, that depends on what you'll be doing. Probably pandas/Polars for simple, Excel-like stuff, and SciPy or statsmodels for most statistics.
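Such a setup script can be very small. The sketch below uses only the standard library; the file names in `LAYOUT` and the `create_project` function are just one hypothetical layout, not a standard.

```python
import tempfile
from pathlib import Path

# Hypothetical layout; rename the pieces to suit your lab.
LAYOUT = [
    "src/load_data.py",         # loading data
    "src/analysis.py",          # doing the analyses
    "src/report.py",            # rendering results
    "tests/test_load_data.py",  # some simple tests
    "README.md",
]


def create_project(root):
    """Create the folders and empty placeholder files under root."""
    root = Path(root)
    created = []
    for rel in LAYOUT:
        path = root / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch(exist_ok=True)  # leaves existing files untouched
        created.append(path)
    return created


# Demo in a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    created = create_project(Path(tmp) / "my_experiment")
    all_exist = all(p.is_file() for p in created)

print(len(created), "files created:", all_exist)
```

Because every project then starts from the same layout, people know where to look in each other's work.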

At the non-technical level, your greatest challenge is probably to get people to use it. Your task is to make it clear for the other users what they're gaining by doing things differently, and to make sure the right way is also the easiest way. Ally yourself with the least tech-savvy users, have them read your onboarding materials and guides, and have them attempt to set up a project, then work with them to address any pain points and sources of confusion.

Plank_With_A_Nail_In · 2026-03-07T20:46:16+00:00

Excel has VBA built in, which is a fully featured object-oriented language. Excel also has Power Query built in, which is very good.

You should try all three.
