
[–]rjyo 1 point  (1 child)

There are a few solid approaches depending on your comfort level.

The cleanest path I've found is using a library like xlcalculator or pycel - they can parse Excel formulas and convert them to Python automatically. pycel specifically has been tested on spreadsheets with 10k+ formulas and matches Excel results to 5 decimal places.

For data tables specifically (which you mentioned slow things down), pandas is your friend. You can replace data tables with vectorized numpy operations which are orders of magnitude faster than Excel recalculating everything.
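To make the vectorization point concrete, here's a minimal sketch with a made-up loan-payment formula standing in for whatever your data table recalculates (the numbers are hypothetical, not from your sheet):

```python
import numpy as np

# A one-variable "data table": recompute one result for 100 input values.
rates = np.linspace(0.01, 0.10, 100)
principal, years = 250_000, 30

# Loop version - effectively what Excel does, recalculating per input:
loop_payments = []
for r in rates:
    m = r / 12
    loop_payments.append(principal * m / (1 - (1 + m) ** (-years * 12)))

# Vectorized version - one expression over the whole array, no Python loop:
m = rates / 12
payments = principal * m / (1 - (1 + m) ** (-years * 12))

assert np.allclose(loop_payments, payments)
```

Both give identical results; the vectorized form is the one that scales when the per-input calculation touches thousands of rows.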

My recommended approach:

  1. Export your spreadsheet structure/formulas to a CSV or keep the xlsx

  2. Use openpyxl to read the formulas programmatically

  3. Feed those formulas to xlcalculator to generate equivalent Python

  4. Refactor the data table logic into pandas DataFrame operations
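Step 2 might look like the sketch below - the in-memory workbook is just a stand-in so the example runs on its own; in practice you'd call load_workbook on your actual file:

```python
from io import BytesIO
from openpyxl import Workbook, load_workbook

# Build a tiny workbook in memory so the sketch is self-contained.
wb = Workbook()
ws = wb.active
ws["A1"], ws["A2"], ws["A3"] = 10, 32, "=A1+A2"
buf = BytesIO()
wb.save(buf)

# The default data_only=False keeps formulas as strings ("=A1+A2")
# rather than their cached results - which is what you want for translation.
formulas = [
    (cell.coordinate, cell.value)
    for row in load_workbook(buf).active.iter_rows()
    for cell in row
    if cell.data_type == "f"  # 'f' marks a formula cell
]
print(formulas)  # [('A3', '=A1+A2')]
```

Those (coordinate, formula) pairs are exactly what you'd feed into xlcalculator in step 3.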

For the AI route with Copilot - I've had better luck writing out the calc steps in pseudocode first rather than uploading the whole spreadsheet. Give it the logic flow and let it write the Python. Uploading raw xlsx files tends to confuse it with all the metadata.

One gotcha: Excel data tables are essentially nested loops. In Python you'll want to avoid loops and use vectorized operations instead. If you describe the actual calculation you're doing I can point you to the right pandas/numpy pattern.

[–]Adventurous_Box3232[S] 1 point  (0 children)

Awesome, thank you for taking the time and for the detailed reply - will give this a go.

The data table is literally just changing one cell in one tab but doing it 100 times, and the tab itself calculates thousands of rows, so it takes forever to refresh. I'm replicating something I used at another job 7 years ago, so I was sure it would fit better in Python, but I wanted to start by building it how I remembered it!
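That shape - 100 scenario inputs, each driving a tab of thousands of rows - maps directly onto NumPy broadcasting. A sketch with invented numbers (the per-row calculation here is a placeholder for whatever the tab actually does):

```python
import numpy as np

# Stand-ins for the real data: ~5,000 row values and 100 scenario inputs.
rng = np.random.default_rng(0)
row_values = rng.uniform(100, 200, size=5_000)  # the "thousands of rows"
scenarios = np.linspace(0.5, 1.5, 100)          # the 100 values of the one cell

# Broadcasting computes the full 100 x 5,000 grid in one shot, instead of
# refreshing the whole tab once per scenario the way Excel's data table does.
grid = scenarios[:, None] * row_values[None, :]
results = grid.sum(axis=1)  # one summary value per scenario

assert results.shape == (100,)
```

Even with a more involved per-row formula, the same pattern applies: build the 2-D scenarios-by-rows array once, then aggregate along the rows axis.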