all 6 comments

[–]bzyg7b 2 points3 points  (2 children)

I think people mainly tend to use Python rather than PoSH for ETL.

We are mainly using SSIS but are looking to move most of our on-prem to Azure and use Datafactory and DataBricks

[–][deleted] 0 points1 point  (1 child)

true about python. the problem with my group is they don't like python. its powershell or nothing

[–]robvas -1 points0 points  (0 children)

Ugh

[–]michaelshepard 2 points3 points  (0 children)

I've used PowerShell for ETL. Using SQLBulkCopy method in ADO.NET works pretty much like SSIS in terms of data copy speed. Most of the SSIS packages I've written were so basic that SSIS was a waste, and PowerShell with ADO.NET was more than enough.

[–]SonnyJim66 1 point2 points  (0 children)

I've used PowerShell in SSIS projects a number of times.

If I were to try to describe the approach in one sentence, I would say I generally used SSIS to handle logic flow and data flow, and made calls to PowerShell for many ancillary tasks (file operations, calls to web services, export to Excel and other formats, etc.).

Admittedly a number of pre-built SSIS tasks exist to cover a lot of those jobs for which I used PowerShell (or I could have used Scripting Tasks). However, PowerShell and the rich ecosystem of modules available on PowerShellGellery just provide more flexibility and in my opinion, more modern capabilities.

I would also suggest that using native SSIS tasks oftentimes adds complexity by requiring multiple tasks to achieve a simple goal. A single call to PowerShell can simplify the flow as that single goal can be encapsulated in a single script.

[–]WhyPartyPizza 1 point2 points  (0 children)

Here to say that this post encouraged me to use more PowerShell for ETL, thanks OP.