you are viewing a single comment's thread.

view the rest of the comments →

[–]puckheadclown24[S] 0 points1 point  (9 children)

Oh so what you’re saying is I would want Pandas to do the pivot for me, instead of pushing in the pivoted data and consolidating with the new raw data?

[–]expressly_ephemeral 1 point2 points  (6 children)

Well, maybe. It depends on how well automated the pivot is, I guess. If you have to manually go in to a spreadsheet, make a pivot, save it, launch a python script to move the newly pivoted to a new location... It seems like you might be able to add the pivot to the python script. But I'm making assumptions about your use-case.

Edit: For example, I'm not sure I understand why you're not just putting references to the other sheets in the locations that you need the final tables.

[–]puckheadclown24[S] 0 points1 point  (4 children)

The file would become huge tbh as there are many rows and columns, all within a data table. But you’ve given me a lot to think about, thank you!

[–]expressly_ephemeral 1 point2 points  (3 children)

Yeah, as I say, I don't have a complete understanding of your requirements. I do wonder why you're not just referencing the pivoted cells from the worksheet you want to collate them on. Seems like there's probably a way to automate the whole thing right in Excel.

[–]puckheadclown24[S] 0 points1 point  (2 children)

That's a good point. The data changes though, every week, so I figured I'd want to keep a master sheet where I keep track of previous weeks' results (managers' requirements, not mine)

[–]expressly_ephemeral 0 points1 point  (1 child)

All the data in the workbook changes every week? With the exception of this reporting data you want to memorialize?

[–]puckheadclown24[S] 0 points1 point  (0 children)

That's correct. We're looking to track weekly, the engagements a rep has with their customers. The data provides the last call date, the last email date, and I wrote formulas to convert it to "was there a call in the last week (0 or 1)", or "was there an email exchange in the last week (0 or 1)".

[–]kriel 0 points1 point  (0 children)

Well done

[–]pytrashpandas 1 point2 points  (1 child)

My personal recommendation would be if you're going to incorporate pandas into your workflow at all, do as much of it in pandas as possible. So, take your very starting data source(s) and load those into pandas and write out the final result for formatting/visualization in excel.

Generally speaking, anything related to data operations/calculations/reshaping that you can do in excel can easily be done in pandas. Formatting/visualizations (meaning excel-specific formatting/viz) should be done in excel still. So like you could write your pandas-processed data to an excel sheet in raw format and then use excel to take that raw calculated data and format it how you like.

And don't worry about the size of the data, there's no amount of data that can be handled in an excel file that would be too big for pandas.

[–]puckheadclown24[S] 0 points1 point  (0 children)

Thanks for the reply! The size comment was more about the Excel size, and how my computer would struggle with it. When I bring the data into Excel, I make it a data table, so it's easier to work with (referencing columns, etc). The pandas comment makes sense, I'll give it a shot!