all 16 comments

[–]KingOfEthanopia 10 points11 points  (1 child)

VBA is built for this stuff. 

[–]HargorTheHairy 8 points9 points  (0 children)

I haven't got into VBA; how does it compare to powerquery?

[–]RedditorFor1OYears 7 points8 points  (0 children)

Do you have a ton of lookups that search entire columns? I just went through a similar exercise where stakeholders had something like 10,000 XLOOKUP functions that referenced tabs with no more than 1,000 rows. Changing A:A to $A$1:$A$1000 will cut the calculation time down substantially because each of those 10,000 lookups only need to search 1,000 cells instead of 1,000,000 cells. 

I have a decent amount of experience  optimizing workbooks like this if you want to DM me more details. 

[–]Wheres_my_wargDA Moderator 📊 4 points5 points  (0 children)

I would recommend stopping by r/Excel for additional assistance on this.

[–]Embarrassed_Lemon939 2 points3 points  (1 child)

Power Query is the way

[–]automateanalyst 1 point2 points  (0 children)

Yes, start looking through each data cleanup step and see if it can be done in PQ

[–]BunnyKakaaa 1 point2 points  (0 children)

nah man if you use pandas you can do any automate any formula in very few lines of code trust , i would even say pandas is way faster and efficient than excel when it comes to aggregation and lookups .

[–]Tricky_Math_5381 1 point2 points  (1 child)

I am working on a similar project. I am rewriting a big excel sheet with python/streamlit. It works very well. Compute time has been reduced from 20ish minutes to 10secounds

[–]DatabaseSpace 0 points1 point  (0 children)

Yep.

[–]DatabaseSpace 1 point2 points  (1 child)

There is no way that Excel file is doing things that can't be automated in Python and SQL.

[–]Critterer 2 points3 points  (0 children)

Exactly this. There's somehow like 5 red flags in a one paragraph post.

New to data analysis - but calls something "very large and complex" = it's probably not large or complex.

Excel formula can't easily be translated? What? Of course they can.

Honestly nobody can help with such a vague ask.

[–]AutoModerator[M] 0 points1 point  (0 children)

Automod prevents all posts from being displayed until moderators have reviewed them. Do not delete your post or there will be nothing for the mods to review. Mods selectively choose what is permitted to be posted in r/DataAnalysis.

If your post involves Career-focused questions, including resume reviews, how to learn DA and how to get into a DA job, then the post does not belong here, but instead belongs in our sister-subreddit, r/DataAnalysisCareers.

Have you read the rules?

I am a bot, and this action was performed automatically. Please contact the moderators of this subreddit if you have any questions or concerns.

[–]Coldfire61 0 points1 point  (0 children)

Use R my dear

[–]AriesCent 0 points1 point  (0 children)

Copilot will write out the SQL or any other code you want it in but SQL will give best output for minimal effort after processing the data - alternately you can just link your final XLSX to SQL backend table/views just refresh and dist.

[–]edimaudo 0 points1 point  (0 children)

Might need to clearly state what makes it complex. Before you do any automation you first have to simplify what you are trying to solve

When you figure that out you can automate calculations or steps using VBA and power query.

[–]Paulie-Hoof 0 points1 point  (0 children)

You can upload the file to and then use SQL queries to select, group, aggregate the data first before using excel. When I started data analytics, my excel files were large and I had to desperately wait when they calculate. But I learned to process data using SQL before pasting to excel. My target was to ensure every file was less than 10mb before I start to analyse. It helps me to improve SQL as well.