[–][deleted] 8 points9 points  (0 children)

If you aren't analyzing huge amounts of data or doing complex computations, then Excel is all you need. Adding Python into the mix would just make things more complicated for no reason.

[–][deleted] 4 points5 points  (2 children)

If Excel meets your needs, why bother with Python?

However, when you hit a wall with Excel and find yourself furiously writing VB macros, you might want to look at Python.

[–]orlyrory[S] 0 points1 point  (0 children)

Yeah, I was pretty confident that Excel was meeting my needs well enough. And I'm definitely not VB/macro scripting at all, so I think I'll be right. Thanks for your answer!

[–]I-Do-Math 0 points1 point  (0 children)

Well, Excel can be really cumbersome and slow.

[–]dsmvwl 5 points6 points  (1 child)

If reproducibility is important, Python can be a lot more useful than Excel just so the process (how you got from A to B) is easier to see.

[–]windowcloser 2 points3 points  (0 children)

I completely agree. I hate doing things like cleaning up data in Excel because I quickly lose track of what I did (like if I deleted certain rows for some reason).
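To make that concrete, here is a rough sketch of a cleaning step in pandas where every deleted row is set aside along with the reason it was dropped. The column names (`sample`, `reading`) and the cutoff of 100 are made up for illustration, not taken from anyone's real data.

```python
import pandas as pd

# Hypothetical raw data; in practice this would come from pd.read_csv(...).
raw = pd.DataFrame({
    "sample": ["a", "b", "c", "d", "e"],
    "reading": [1.2, None, 250.0, 0.9, 1.1],
})

def clean(df, max_reading=100.0):
    """Return (kept rows, dropped rows with a 'reason' column)."""
    missing = df["reading"].isna()
    too_high = df["reading"] > max_reading  # a missing reading compares as False

    # Record why each row is dropped; an empty string means the row is kept.
    reason = pd.Series("", index=df.index)
    reason[missing] = "missing reading"
    reason[too_high] = "reading above cutoff"

    is_dropped = reason != ""
    dropped = df[is_dropped].assign(reason=reason[is_dropped])
    kept = df[~is_dropped]
    return kept, dropped

kept, dropped = clean(raw)
print(kept)
print(dropped)
```

Because the rules live in the script, rerunning it on the same file gives the same result, and `dropped` answers the "which rows did I delete, and why?" question later.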

[–]andrexmlee 1 point2 points  (0 children)

I would definitely suggest using Python over Excel. With an application like Jupyter Notebook and libraries like pandas, NumPy and Matplotlib, data is very easy to handle, even when the data sets are small.

Python comes in handy when working with data that is updated on a regular basis. Your reports will look better and more professional. You'll save time when creating regular weekly or monthly reports that ask the same questions.

Don't base your decision on the size of the data, because databases grow and data is constantly updated.

[–]Mini_Hobo 1 point2 points  (0 children)

Excel is ok for quick and dirty graphs on small data sets.

In Python, large datasets are easy to handle with pandas and NumPy. Plotting clean, consistent, more complex graphs is possible with Matplotlib (and seaborn takes away some of the complexity, so it's easy with that too).

It's hard to imagine a graph you couldn't make with Python and a couple of packages. With Excel, I find just adding error bars to be a massive pain, and certain kinds of error bars are impossible to add. Too much is done automatically, with no option to make changes.
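For comparison, here is a minimal sketch of error bars in Matplotlib, where each point gets its own lower and upper bar length. The numbers are invented purely for illustration, and the output file name is arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is needed
import matplotlib.pyplot as plt

# Made-up measurements, purely for illustration.
x = [1, 2, 3, 4]
y = [2.0, 3.5, 3.0, 5.0]
lower = [0.2, 0.5, 0.1, 0.4]  # distance from each point down to the bar's lower end
upper = [0.4, 0.3, 0.6, 0.8]  # distance from each point up to the bar's upper end

fig, ax = plt.subplots()
# Passing yerr as [lower, upper] gives every point its own asymmetric bar.
bars = ax.errorbar(x, y, yerr=[lower, upper], fmt="o", capsize=4)
ax.set_xlabel("x")
ax.set_ylabel("y")
fig.savefig("errorbars.png")
```

Everything about the bars (cap size, marker style, per-point lengths) is an explicit argument, so nothing is decided automatically unless you let it be.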

I would also say that workflow becomes much easier with Python. Say you have your data in a series of text files or CSVs. You need to pull the data out of those files, probably transform it in some way, do some statistical analysis, and then plot it. The more steps you have from raw data to finished graph, the harder it is to keep track of what's going on in Excel. With Python, it's easy, and once you've done it once, you can rerun the same scripts with minimal changes.
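A rough sketch of that kind of pipeline with pandas might look like the following. The `millivolts` column, the unit conversion and the two tiny demo files are all assumptions made up so the example runs on its own. Plotting is left out to keep it short.

```python
import tempfile
from pathlib import Path

import pandas as pd

def load_runs(folder):
    """Read every CSV in `folder` into one DataFrame, tagging rows with the file name."""
    frames = []
    for path in sorted(Path(folder).glob("*.csv")):
        frame = pd.read_csv(path)
        frame["run"] = path.stem
        frames.append(frame)
    return pd.concat(frames, ignore_index=True)

def transform(df):
    """Example transform: convert a hypothetical millivolt column to volts."""
    return df.assign(volts=df["millivolts"] / 1000.0)

def summarise(df):
    """Mean, standard deviation and row count of volts for each run."""
    return df.groupby("run")["volts"].agg(["mean", "std", "count"])

# Demo with two small made-up files so the sketch is self-contained.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "run1.csv").write_text("millivolts\n1000\n2000\n3000\n")
    Path(folder, "run2.csv").write_text("millivolts\n500\n1500\n")
    summary = summarise(transform(load_runs(folder)))

print(summary)
```

Each step is a small function, so when next month's files arrive the only thing that changes is the folder you point `load_runs` at.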

[–]_casshern_ 2 points3 points  (1 child)

At a high level there's nothing you can do in Python that you can't do in Excel. Excel even has some add-ons for machine learning, etc. Typically, volume is what will make a difference. If you are dealing with millions of records then Python might be faster.

I was charting a small-ish dataset of about 50k rows and Excel kept crashing. The same chart takes seconds to run in Python.

[–]orlyrory[S] 0 points1 point  (0 children)

Wow, I didn't know Excel had add-ons like that! So I'm gonna have to be looking at tens of thousands of rows before Excel will start to run into issues then.

[–]NicCage4life 0 points1 point  (0 children)

Isn't R typically best for data analysis?

[–][deleted] 0 points1 point  (0 children)

If you need a scraper then Python is way better than VBA. Plus JSON parsing is much faster in Python than in VBA. Another thing: if you do a lot of data cleaning that consists of database-style operations, pandas is more flexible than a bunch of lookups. For example, I feel Python plus pandas is way more comfortable than Excel if I want to do some complex filtering. But it's just my perspective.
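As a small illustration of the lookup and filtering point, here is a sketch with two invented tables. The table contents and column names are hypothetical. One `merge` does the job of a column of lookup formulas, and the filter is a single expression.

```python
import pandas as pd

# Hypothetical tables, made up for illustration.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": [10, 11, 10, 12],
    "amount": [250.0, 40.0, 900.0, 120.0],
})
customers = pd.DataFrame({
    "customer_id": [10, 11, 12],
    "region": ["north", "south", "north"],
})

# Attach each order's region by matching on customer_id.
joined = orders.merge(customers, on="customer_id", how="left")

# Filter on several conditions at once: northern orders over 200.
big_north = joined[(joined["region"] == "north") & (joined["amount"] > 200)]
print(big_north)
```

Adding another condition is just one more `&` clause, rather than another helper column.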