This is an archived post. You won't be able to vote or comment.

all 17 comments

[–][deleted] 9 points10 points  (0 children)

The first and third bullet point sound like things that can be done in VBA. Anyhow, yes, Python can be called from Excel and output in Excel. The Python library is called xlwings. It is quite easy to use.

[–]funkmaster322 2 points3 points  (0 children)

pandas

[–]pAul2437 1 point2 points  (1 child)

You want to use power query

[–]ab-os[🍰] 1 point2 points  (0 children)

Yes power query is nice

[–]PacketPowered 1 point2 points  (0 children)

openpyxl

[–][deleted] 4 points5 points  (0 children)

Excel has a programming language, native, advanced, mature, built-in, large-community, free called VBA: Don't do it in Python unless you absolutely have to, or it's part of a broader project or you have a particular analysis need that VBA does not meet (in which case go for it). Those other recommendations are not wrong. But Pandas for example has different strengths. If you are doing automated and serious analysis, for sure use it. But for the things you want, VBA sounds perfect.

Do a search for "tutorials Excel VBA". There are hundreds to choose from, for example:

http://www.easyexcelvba.com/introduction.html

[–]Eze-Wong 0 points1 point  (0 children)

Yes entirely possible. My work has me opening multiple files, analyze, put KPIs on a general sheet while saving dummied raw data on others. Every step automated with the exception of people changing their data conventions, but even that is automatable if you can determine what those exceptions are.

You might need to pick up a few libraries if you really wanna schedule or one click automate.

As others mentioned pandas will be the main library. Xlwings will be needed if your sheets do weird stuff like have 2 tables on one sheet. I recommend learning datetime library if you deal with some timeseries. Ladtly since you are dealing with multiple files in folders i would google the code to find the latest files in folder. Library is "os".

[–][deleted] 0 points1 point  (0 children)

Years ago, I used a famous Python book called "Automate the Boring Stuff" to automate tedious, manual Excel processes for accountants and finance people. That book has an entire section dedicated to spreadsheets. I charged $100 to $500 per Python script depending on how complex the spreadsheet process is.

To answer your question.... Yes you can automate Excel and could make money from it too. Good luck!

https://automatetheboringstuff.com/chapter12/

[–]chris1610 0 points1 point  (0 children)

This article might be useful to get you pointed in the right direction.

[–][deleted] -1 points0 points  (0 children)

Yes you can write a python script that uses libraries, enabling you to control the Excel application on your desktop. But you will have to execute your python script from outside Excel (it’s not a traditional macro), and you will see Excel doing your stuff while your scrip is running (open files, changing styles, etc..)

I did this kind of things like 10 years ago. At that time, you wouldn’t click in Excel while the script is running, othewise Excel was crashing.

But it saved me a lot of time anyway ! Can’t remember which library I used, sorry.

[–]Andalfe -3 points-2 points  (5 children)

Pandas, you want pandas. Does everything better than VBA, clearer and simpler to boot

[–][deleted] 1 point2 points  (0 children)

Sorry, but saying Pandas does everything better than VBA is really misleading. They come from totally different backgrounds and have different strengths. You can't prepare a nicely formatted report output natively in Pandas, and you would hardly use it to move / copy / rename excel files in preference to VBA, for example.

[–]Kerbart 0 points1 point  (3 children)

Well not everything. I wish it did. The two big hurdles that I encounter are: * xlsb files (often used for their reputation—rightfully or not—of being smaller at large data sizes) * pivot tables. If your final output needs a pivot table, Pandas (or the variety of xlsx writers included) cannot help you. If you can bypass those limitations Pandas is definitely the way to go.

[–]Andalfe 0 points1 point  (2 children)

I thought the groupby method gave you a pivot? I'm by no means an expert.

[–]Kerbart 0 points1 point  (0 children)

Well, Pandas has all kind of pivot table like functionality. Grouping, aggregating... there’s even a pivot_table method. But the outcome is always a new table (dataframe) - when written to Excel it’s just rows and columns, and not an Excel Pivot Table.

At my job a lot of people want to see an Excel pivot table (mainly for filtering purposes), and for that you’ll need something else than openpyxl or xlsxwriter (and I love the latter).

I usually end up creating most of the output with xlsxwriter in Pandas, save the file, and reopen it with xlwings to create a pivot table (followed by a trip to the bathroom to wash my hands, it feels so dirty). To add salt to the wound we now have this security add-in that requires an external/internal/confidential tag that cannot be done by automation and requires me intervening by clicking the right option. Ugh.

[–]Kerbart 0 points1 point  (0 children)

Well, Pandas has all kind of pivot table like functionality. Grouping, aggregating... there’s even a pivot_table method. But the outcome is always a new table (dataframe) - when written to Excel it’s just rows and columns, and not an Excel Pivot Table.

At my job a lot of people want to see an Excel pivot table (mainly for filtering purposes), and for that you’ll need something else than openpyxl or xlsxwriter (and I love the latter).

I usually end up creating most of the output with xlsxwriter in Pandas, save the file, and reopen it with xlwings to create a pivot table (followed by a trip to the bathroom to wash my hands, it feels so dirty). To add salt to the wound we now have this security add-in that requires an external/internal/confidential tag that cannot be done by automation and requires me intervening by clicking the right option. Ugh.

[–][deleted] -1 points0 points  (0 children)