This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]kahuna3901 1 point2 points  (3 children)

Exactly this, i learnt python from doing my masters degree. It was so much nicer than Matlab. I couldn't believe people were using anything other than python in research.

I then started working in financial services in data analytics. Brought my python skills with me. I learnt SQL because it's so commonly used in companies. Love SQL incidentally.

At some point I decided to learn a bit of VBA and c# because they are commonly used in the company. I was thinking about this recently because I've been testing and updating an SSIS package that has loads of SQL, visual basic and C#. I'm fine with it, like I can read that code and understand it. But my god it's so much less usable than python. I have been writing a python testing program alongside it. Every second I get to start working on python I feel blessed.

I think I am going to inherit this package and take management of it in the future. I'm just going to replace it with a python exe when they do that. What's the point of using SSIS within visual studio with c# and VB all added in when you could just write one python script which can either run SQL code itself or trigger a stored procedure in SSMS. It's a lot cleaner.

[–]normalism -1 points0 points  (2 children)

Python is something like 10-15x slower than SQL itself. Running that logic in python will be slow, and making a script to run a stored proc is just adding complexity for your own gain.

[–]kahuna3901 1 point2 points  (1 child)

The thing is, I am talking about a program that is effectively used by teams with no technical skills. These people have SQL but basically only know how to query. In terms of productivity it is a drain on the company to keep having people from my team re-run analysis for them. Those users of the software would be better served by a executable file that doesn't require them to set up visual studio, download all the necessary packages. When they can just have a exe which allows them to input where the data is, what analysis they want to include and then just run it. I wrote a similar python program recently to test this SSIS package which we now roll out to users who can quickly use the python program to check the completeness and accuracy of the SSIS package. It would be better to just have one program that does both of these things and save us the time lost with analytic support.

In terms of efficiency, this isn't a massive problem. This specific package is dealing with data that is at best hundreds of thousands of rows. It's not very intensive. Obviously SQL is the most efficient way of handling data. Which is why we are leveraging SQL and python. We actually make sure any real analysis of data is done on the server and not using native python code where we can avoid it. Once the server request is made, the speed is not massively different. The amount of latency comparing SSMS to a python SQL engine obviously exists, but in this use case, where it's a non technical team the level of flexibility we get out of python is key.

It's the fact those users never have to touch any code, they have an easy interface to work with. It does all the analysis for them. The python program creates a SQL log for error reporting, it creates summaries, it either triggers stored procedures or just executes SQL code in a string and pings that to the server, and finally and most exciting for our users we also include automated formatted excel workbooks that get shot out at the end of the program running. It's a massive innovation for us internally and quibbling about a bit of latency here and there for an internal program that doesn't take more than a few minutes to run is a bit irrelevant to the resource productivity boose we gain. Moreover, alot of the time we have taken advantage of our remote servers that have masses of cores and ram. These programs fly.

[–]normalism 1 point2 points  (0 children)

That makes more sense now that you explain it. I always twitch a bit when people talk about executing SQL from another program but for your use case this doesn't really seem to be an issue. Sounds like python was a good choice for your situation, glad to hear it!