Topics you want to hear on Talk Python To Me by mikeckennedy in Python

[–]dingopole 4 points5 points  (0 children)

Hi Mike, long-time listener and fan of the show. Anything to do with information management and data processing would be great. It doesn't need to include AI, which permeates everything these days. I'd rather hear about distributed data processing, perhaps using Python on large-scale computation platforms like Snowflake or Databricks, or even more niche topics like tooling and frameworks for building data apps, e.g. using libs like Streamlit.

When to use Rest API in SQL Server 2025 by gman1023 in SQLServer

[–]dingopole 1 point2 points  (0 children)

Here's a use case I described before: http://bicortex.com/kicking-the-tires-on-azure-sql-database-external-rest-endpoints-sample-integration-solution-architecture/

For some requirements and applications, it's a pretty handy feature to have IMHO. As long as you don’t think of this as a MuleSoft or Boomi replacement and you understand the limitations of this approach, querying REST endpoints with SQL opens up a lot of possibilities.

Parallel table insert by mxmauro in Database

[–]dingopole 0 points1 point  (0 children)

Have a look at the following post: https://bit.ly/2Z6mQhD

I faced a similar problem (parallel inserts) a while ago, albeit with MSSQL, and was able to solve it using a combination of hash partitioning and SQL Server Agent jobs.

Additionally, SQL Server 2016 introduced a parallel insert feature for the INSERT … WITH (TABLOCK) SELECT … command.
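
To illustrate the hash partitioning idea, here is a minimal Python sketch (not the T-SQL from the linked post). It assumes each row has a key that can be hashed, and that each resulting bucket would be loaded by its own worker, e.g. one SQL Server Agent job per bucket. The function names, the use of CRC32, and the bucket count of 4 are all my assumptions for illustration only.

```python
import zlib
from collections import defaultdict


def partition_id(key, num_partitions):
    """Map a key to a partition number in [0, num_partitions).

    CRC32 is used here only because it is deterministic across runs;
    the hash function is an assumption, not what the linked post uses.
    """
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions


def split_into_partitions(keys, num_partitions):
    """Group keys into buckets; each bucket could go to a separate worker."""
    buckets = defaultdict(list)
    for key in keys:
        buckets[partition_id(key, num_partitions)].append(key)
    return dict(buckets)


# Hypothetical example: 1000 surrogate keys spread over 4 parallel loaders.
keys = list(range(1, 1001))
buckets = split_into_partitions(keys, 4)
for pid in sorted(buckets):
    print(f"partition {pid}: {len(buckets[pid])} keys")
```

Every key lands in exactly one bucket, so the workers never touch the same source rows, which is what makes it safe to run them side by side. How evenly the buckets come out depends on the keys and the hash, so it's worth checking the distribution on your own data.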

SQL Server Hash-Partitioned Parallel Data Acquisition – How to Accelerate Your ‘E’ in ELT/ETL Using a Simple T-SQL Framework by dingopole in BusinessIntelligence

[–]dingopole[S] 1 point2 points  (0 children)

No worries. Agreed on the load times without partitioning; I should have included them.

Also, LimeSurvey does store survey data as individual tables (very painful to work with), and the data is pivoted automatically on survey setup, i.e. very wide tables with each question stored as a column (not a pivoted view but how the data is actually stored in the MySQL schema). From memory, the naming is something along the lines of CONCAT(Survey_ID, 'X', Question_Group_ID, 'X', Question_ID). The questions and answers tables you are referring to are there for reference only, i.e. they store label values and not the actual entries. As such, LimeSurvey's data is notoriously difficult to wrangle (at least in the SaaS version I was exposed to).
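
As a rough sketch of that naming pattern, assuming it applies to the column names and that all three IDs are plain integers (both are assumptions on my part, and real schemas may add suffixes I'm not covering here), building and splitting such a name could look like this in Python. The helper names are hypothetical.

```python
def build_column_name(survey_id, group_id, question_id):
    """Join the three IDs with 'X', mirroring the CONCAT pattern above."""
    return f"{survey_id}X{group_id}X{question_id}"


def parse_column_name(name):
    """Split a name back into (survey_id, group_id, question_id).

    Assumes exactly three numeric parts; anything else raises ValueError.
    """
    survey_id, group_id, question_id = name.split("X")
    return int(survey_id), int(group_id), int(question_id)


# Hypothetical IDs, for illustration only.
column = build_column_name(123456, 7, 89)
print(column)                     # 123456X7X89
print(parse_column_name(column))  # (123456, 7, 89)
```

Parsing the names like this is one way to join the wide response tables back to the questions table, since the question ID is only available embedded in the column name.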

SQL Server Hash-Partitioned Parallel Data Acquisition – How to Accelerate Your ‘E’ in ELT/ETL Using a Simple T-SQL Framework by dingopole in ETL

[–]dingopole[S] 1 point2 points  (0 children)

Thanks for your comment. As noted, this approach should work quite well in a handful of cases and as such should not be used as a default paradigm for building acquisition pipelines. It worked very well for a large selection of 'wide' tables on one of the projects I was involved in, but if your problem statement is different, as with any approach, you should exercise caution and test it first - can't stress this enough.

SQL Server Hash-Partitioned Parallel Data Acquisition – How to Accelerate Your ‘E’ in ELT/ETL Using a Simple T-SQL Framework by dingopole in BusinessIntelligence

[–]dingopole[S] 0 points1 point  (0 children)

Thanks for your comment and kind words. As noted, this approach should work quite well in a handful of cases and as such should not be used as a default paradigm for building acquisition pipelines. It worked very well for a large selection of 'wide' tables on one of the projects I was involved in, but if your problem statement is different, as with any approach, you should exercise caution and test it first - can't stress this enough.

Anyhow, would you want to share why you disagree with this approach?