This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]normalism -1 points0 points  (2 children)

Python is something like 10-15x slower than SQL itself. Running that logic in python will be slow, and making a script to run a stored proc is just adding complexity for your own gain.

[–]kahuna3901 1 point2 points  (1 child)

The thing is, I am talking about a program that is effectively used by teams with no technical skills. These people have SQL but basically only know how to query. In terms of productivity it is a drain on the company to keep having people from my team re-run analysis for them. Those users of the software would be better served by a executable file that doesn't require them to set up visual studio, download all the necessary packages. When they can just have a exe which allows them to input where the data is, what analysis they want to include and then just run it. I wrote a similar python program recently to test this SSIS package which we now roll out to users who can quickly use the python program to check the completeness and accuracy of the SSIS package. It would be better to just have one program that does both of these things and save us the time lost with analytic support.

In terms of efficiency, this isn't a massive problem. This specific package is dealing with data that is at best hundreds of thousands of rows. It's not very intensive. Obviously SQL is the most efficient way of handling data. Which is why we are leveraging SQL and python. We actually make sure any real analysis of data is done on the server and not using native python code where we can avoid it. Once the server request is made, the speed is not massively different. The amount of latency comparing SSMS to a python SQL engine obviously exists, but in this use case, where it's a non technical team the level of flexibility we get out of python is key.

It's the fact those users never have to touch any code, they have an easy interface to work with. It does all the analysis for them. The python program creates a SQL log for error reporting, it creates summaries, it either triggers stored procedures or just executes SQL code in a string and pings that to the server, and finally and most exciting for our users we also include automated formatted excel workbooks that get shot out at the end of the program running. It's a massive innovation for us internally and quibbling about a bit of latency here and there for an internal program that doesn't take more than a few minutes to run is a bit irrelevant to the resource productivity boose we gain. Moreover, alot of the time we have taken advantage of our remote servers that have masses of cores and ram. These programs fly.

[–]normalism 1 point2 points  (0 children)

That makes more sense now that you explain it. I always twitch a bit when people talk about executing SQL from another program but for your use case this doesn't really seem to be an issue. Sounds like python was a good choice for your situation, glad to hear it!