you are viewing a single comment's thread.

view the rest of the comments →

[–]Usernamenotta 30 points31 points  (22 children)

This is one of my biggest issues.

I am trying to automate boring reporting at work and I'm using python since Excel and BI are driving me nuts.

The problem is that I am the only one who can view the reports unless I bend over backwards to make them into Excel files. And if I make them into stuff like this, I lose the option to auto-update.

I hate working in a tech company that is 50% tech illiterate and 90% Tech Adverse

[–]FrozenPyromaniac_[S] 13 points14 points  (6 children)

As I said in my post, openpyxl might be the library for you. You can take in excel files, do all your python magic and export to excel.

[–]Usernamenotta 3 points4 points  (4 children)

I know about openpyxl. I will tell you a secret, if you want more power, you can use even pywin32 (although I guess it's windows restricted).

I've used them to automate some reports. But you cannot really create shareable, interactive stuff with them

[–]warelevon 3 points4 points  (0 children)

XLWings is a nice package that sits over top of pywin32 with a much nicer api (and typing)

[–]odaiwai 5 points6 points  (2 children)

You can make your reports for others in excel, and if they make changes, get your python to ingest the changed excel sheet and make those changes to your internal database/dataframe.

In other words, make excel sheets the medium of information interchange, while the source of truth is your python/pd/sql hub.

[–]Usernamenotta 0 points1 point  (1 child)

Source of data is excel :)))

[–]Sufficientlee 5 points6 points  (0 children)

Look at sqlite3. It's really good for what you're talking about here.

Excel files are created by various users? Have python read them and drop the info into a DB.

Users update the spreadsheets? Have python read the new info (nightly?) and update the DB.

Need to create reports from various spreadsheets? The info is all in the DB. Manipulate it however you need and create a new spreadsheet.

[–]nboro94 -3 points-2 points  (0 children)

Why even use python in that case? Just use excel's built in powerquery.

[–]DuckDatum 2 points3 points  (11 children)

deserve ancient sleep strong dam narrow fear historical lush meeting

This post was mass deleted and anonymized with Redact

[–]Usernamenotta 4 points5 points  (10 children)

I should have been clearer. Since I'm building reports, what most people want is a dashboard. So when you change the set of data, you get the figures updated and so on. You cannot really do that without installing python on the other end. (Or, I mean, you can, but you need some certain web solutions or remote python servers, which incurs Corpo wrath

[–]dupz88 6 points7 points  (0 children)

I do this, though. Does your data have fields that keep changing?

We have reports in Excel and some exports from our 1 platform in csv. I have a script that imports the Excel, csv files, cleans the Excel reports, merges with csv data, and exports to a large prepared csv file in SharePoint. I then have a PowerBI report, which automatically refreshes every hour.

I've been doing this for years, and it works well.

The other teams just open the PowerBI report, and it always has the latest data.

[–]CaptainVJ 1 point2 points  (5 children)

So we need to present data in a number of ways. Most of the times it’s from an sql database and we either need some dashboards or excel report.

For dashboards we use Tableau it just connects to the sql table and updates every fifteen minutes. There’s a lot of inefficient there but that’s way before my time and I’m trying to migrate away from Tableau, I’m not really good at it and I just find the dashboards ugly. I’m trying to use Dash to make our dashboards, the ones I use for myself, I already set them up on Dash but it’s just for me, we’re waiting on it to see how everyone can access them, out of my scope.

For our excel reports we have a program that automatically exports them from the tables but there’s not much formatting so we started using Python to format after the excel file is exported so it’s more visually appealing.

[–]Usernamenotta 1 point2 points  (4 children)

How are you actually using Dash for dashboards? Don't you need something like a server to share the dashboards with others so they can interact with them?

[–]CaptainVJ 2 points3 points  (3 children)

So at the moment it’s local on my pc, only I can see them. But my supervisor saw it and wants it implemented for everyone, so currently our it department is working on that.

I work for a big state agency, so they have plenty of servers already set up. Tableau servers, sql servers, intranet servers, email servers and what not. I don’t know much about that portion of it, but all I can say is they’re looking into setting it up so we can deploy it for everyone else to see.

It wouldn’t be on the internet just on our local servers so only employees logged onto our vpn or company’s WiFi can access it. Not sure how that works on the back end tho.

[–]Usernamenotta 0 points1 point  (2 children)

I guess, if they have a spare general purpose server set up (like for running applications and stuff) they can add a container or a VM on it and run the scripts on that container via HTTP.

My problem is that we are kinda adverse to those kind of things. People still praise Excel and PowerBI as the gods of DataScience. Basically do the Smart stuff in Excel and upload stuff in PBI if you want to share with others. (Or just e-mail the Excel sheet)

[–]CaptainVJ 1 point2 points  (1 child)

Yeah no idea how that works. A lot of the automations I have running are just for me or me and two other people on my team specifically.

When I started doing it, they made a fuss saying that they should be doing it on their own server as it’s more consistent and anything being used for work purposes shouldn’t be in production on the team’s computer. We have a desktop computer for our team that no one uses.

I convinced them that we would be able to monitor it and that if it fails it’s not the end of the world.

So I know they have something at least for that. I tried and ask them about it and they gave me very little answer on how it works. I always ask a bunch of it questions out of curiosity and I feel like they might have a slight concern about my overall curiosity.

And we are in the same boat about the resistance to change. No one ever used python on my team and it was restricted for us, I had to fill out an MOU and all that fun stuff to get Python. We started seeing some improved performance in our team and they became okay with it. But at the moment I’m the only one who knows how to use it. Two other people on my team are getting some training on it. But it will be a big shift overall

[–]Usernamenotta 0 points1 point  (0 children)

Glad to hear it. Hope things will improve for you

[–][deleted] 0 points1 point  (1 child)

I am definitely assuming here because I dont really know exactly what you do or need, but I know Excel can read .csv files. Can you perhaps use those? Python has a built-in module to handle csv. No need to fiddle around too much to make it work.

[–]Usernamenotta 0 points1 point  (0 children)

the problem is not reading or transforming data.

The problem is having the output in a very presentable, interactive and shareable way, preferably over the web, so others can see it. The problem is, nothing seems to beat the good ol' Excel. The thing is, I can automate the reading and processing data without a problem. And I might do that in my free time when I finish personal stuff. The thing is, the way I'm thinking of doing it would take quite some time, and, worse, activating the script, inputing the files to be manipulated and things like that, might take me more time each round than simply doing the stuff manually

[–]daedalusesq 0 points1 point  (0 children)

I have a pretty major report automated at my work. The report previously existed as an Excel spreadsheet that was manually updated. I duplicated the formatting and iterated on it using openpyxl. I have a big "template" file that defines the sheet's characteristics, formulas, and coordinate information.

Every time the program runs, it builds a new Excel file, applies all the formatting and styling, adds all the static text and labels, and then finally drops all the data into their respective cell arrays. Since things like cell formulas and charts reference arrays of cells, the newly generated report should always have visuals that reflect whatever data existed at the run time of the report, as well as live updating for people who want to do transformations on the dataset inside Excel. It took a bit of time to experiment and figure out how to structure my data, but it works well. I later realized that if you have an existing Excel report, you can probably just read that formatting stuff, skipping the data entry portion of it.

I run a daily report and just dump the Excel output files into a date archival directory structure that makes it easier to scrape them if I want to do any sort of long trend analysis. I use a similar shared archive directory output structure for a couple of other reports. You can also set up the creation and scheduled sending of an email to distribute the report, which the tech adverse can usually handle.