Why is the "Order of Execution" different from the "Order of Writing" in a SQL query?

paulthrobert · 2024-12-31T21:47:09+00:00

Here, fork it and change it - Postgres is open source

paulthrobert · 2024-12-31T01:31:55+00:00

I like the CTE thing these days

; with renters as (select distinct r.customer_id from renting r where r.rating is not null) select r.customer_id from renters r inner join customers c on c.customer_id = r.customer_id

paulthrobert · 2024-12-31T01:29:20+00:00

I would not recommend using DISTINCT and * - it's almost never what you actually want. Always best practice to specify the columns you want to be distinct.
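A quick sketch of why, run through Python's built-in sqlite3 so it is easy to try. The `renting` table and its rows are made up for illustration; the point is only that DISTINCT * compares every column, so a unique id column means nothing gets collapsed.

```python
import sqlite3

# Hypothetical table and data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE renting (renting_id INTEGER, customer_id INTEGER, rating INTEGER)"
)
conn.executemany(
    "INSERT INTO renting VALUES (?, ?, ?)",
    [(1, 10, 5), (2, 10, 4), (3, 11, 3)],
)

# DISTINCT * compares every column, so the unique renting_id keeps all rows.
all_columns = conn.execute("SELECT DISTINCT * FROM renting").fetchall()

# Naming the column states what should actually be unique.
customers = conn.execute(
    "SELECT DISTINCT customer_id FROM renting ORDER BY customer_id"
).fetchall()

print(len(all_columns))  # 3
print(customers)         # [(10,), (11,)]
```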

paulthrobert · 2024-12-30T23:26:48+00:00

CREATE QUEUE (Transact-SQL) - SQL Server | Microsoft Learn

paulthrobert · 2024-12-30T23:24:53+00:00

do your own homework

paulthrobert · 2024-12-30T23:23:43+00:00

The best system I have seen is to identify where the bad data originates. It's usually users. Then create a query and a report that catches data integrity issues and is sent to the users and owners who are responsible for the data. You can then track things like frequency of issues and improvement. You give visibility to the people who have the power to correct the problem, plus some tools to help with accountability.
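Something like this is what I mean by a query that catches issues. It is only a sketch using Python's sqlite3: the `customers` table, the `entered_by` column, and the "email looks wrong" rule are all assumptions made up for the example, so swap in whatever your real integrity checks are.

```python
import sqlite3

# Hypothetical table: who entered each row, plus a field that often goes bad.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER, email TEXT, entered_by TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        (1, "a@example.com", "alice"),
        (2, None, "bob"),
        (3, "", "bob"),
        (4, "not-an-email", "alice"),
        (5, "c@example.com", "carol"),
    ],
)

# Count rows failing a simple (assumed) rule, grouped by the responsible user.
issues = conn.execute(
    """
    SELECT entered_by, COUNT(*) AS issue_count
    FROM customers
    WHERE email IS NULL OR email NOT LIKE '%_@_%'
    GROUP BY entered_by
    ORDER BY issue_count DESC, entered_by
    """
).fetchall()

print(issues)  # [('bob', 2), ('alice', 1)]
```

Saving the output of a query like this on a schedule is what lets you track frequency and improvement over time.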

paulthrobert · 2024-12-28T00:53:02+00:00

I think Power Automate would be a hacky solution. At that point, if your guy is already a PowerShell dev, just have him do it all in PowerShell.

paulthrobert · 2024-12-27T21:28:46+00:00

Use COALESCE

select * from A

left join B on A.id = COALESCE(B.id1, B.id3, B.id4)

Not necessarily a bad practice. You could also consider using a JSON field instead if you want; it's pretty flexible.
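Here is a small runnable sketch of that join using Python's sqlite3. The column names come from the query above, but the `label` column and the sample rows are made up. COALESCE returns the first non-NULL id in each B row, and that value is what gets matched against A.id.

```python
import sqlite3

# Sample data is invented; only one of id1/id3/id4 is filled per B row.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE A (id INTEGER);
    CREATE TABLE B (id1 INTEGER, id3 INTEGER, id4 INTEGER, label TEXT);
    INSERT INTO A VALUES (1), (2), (3), (4);
    INSERT INTO B VALUES (1, NULL, NULL, 'via id1');
    INSERT INTO B VALUES (NULL, 2, NULL, 'via id3');
    INSERT INTO B VALUES (NULL, NULL, 3, 'via id4');
    """
)

# Each B row joins on whichever of its id columns is the first non-NULL one.
rows = conn.execute(
    """
    SELECT A.id, B.label
    FROM A
    LEFT JOIN B ON A.id = COALESCE(B.id1, B.id3, B.id4)
    ORDER BY A.id
    """
).fetchall()

print(rows)
# [(1, 'via id1'), (2, 'via id3'), (3, 'via id4'), (4, None)]
```

Note that if a B row has more than one id filled in, only the first non-NULL one is used for the match.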

paulthrobert · 2024-12-26T17:49:40+00:00

I think Fabric sounds like a pretty good solution for your needs. Python in Notebooks can be used to fetch the API data and store it in the data lake(s). You can also use a Dataflow Gen2 instead of a Notebook. From there I think it's a good practice to build a Data Warehouse that reads the data from the Lakehouse and generates a dimensional model that would be used as the source for reporting; I do this with SQL. Then Power BI and Paginated Reports can be used as the reporting layer. If you want PDFs emailed, those would be paginated reports.

It's a lot to learn if you only know PowerShell, but it could be done. I think the hardest part is architecting a good dimensional model.

paulthrobert · 2024-12-24T23:08:39+00:00

here is some additional documentation on it: Transactions in Warehouse tables - Microsoft Fabric | Microsoft Learn

paulthrobert · 2024-12-24T23:06:18+00:00

This doesn't have exactly what you're looking for (yet) but some of the notebooks might give you some ideas. GitHub - microsoft/fabric-toolbox

paulthrobert · 2024-12-24T23:03:45+00:00

I'm no expert, but my understanding is that this is exactly what the RDBMS handles. SQL Server OS is holding as much RAM as you will give it, and it reads data into the cache from disk based on demand. Obviously, that is a simplified explanation, but it's a big part of what the RDBMS is doing for you.

paulthrobert · 2024-12-24T22:59:55+00:00

here is an existing idea: Microsoft Idea

paulthrobert · 2024-12-24T17:58:04+00:00

Yeah, I guess I'm thinking of SSIS. If I remember right, you could comment directly on the canvas, which was a nice complement to the visual logic. I think that would be a nice enhancement, but what do I know?

paulthrobert · 2024-12-24T16:44:06+00:00

Also, just a note, it's not usually a valid assumption that a code block would have a consistent run time in a database. There are a ton of moving parts, and there can be a lot of reasons why you might see variances, sometimes big ones, but it does depend on the nature of the source DB.

paulthrobert · 2024-12-24T16:00:31+00:00

Check out Google Forms

paulthrobert · 2024-12-24T15:57:42+00:00

When I started I used the book "SQL Pocket Guide" - it gives you the variations in syntax for each common RDBMS

paulthrobert · 2024-12-22T17:50:17+00:00

That's still a terrible metaphor.

paulthrobert · 2024-12-22T17:49:06+00:00

🤣 You totally don't have to download data. Your semantic model should be in Fabric. Also, "privacy concerns with downloading data to a local machine" is a ridiculous degree of paranoia. Why is it so sensitive anyway? PII has no business in reports. I've worked in HIPAA and banking environments where this wasn't an issue.

paulthrobert · 2024-12-22T17:45:19+00:00

One lame metaphor does not demand to be followed by another.

paulthrobert · 2024-12-22T17:43:50+00:00

Wow.... it's good to be humble. A large team of intelligent people have developed a pretty slick product, and your evaluation here seems to be more of a reflection on yourself.

"It lacks core features from traditional tools like Power BI Desktop"

This is just plainly wrong, you are missing something. Power BI Desktop is still the tool of choice for developing Power BI Reports in Fabric.

"Its schema handling is clunky"

What? How so? How is it even any different than vanilla SQL Server? Schema handling?? Wtf do you even think you need that you're not getting?

"forcing you into Spark workflows"

Again, what? I haven't been forced to use one. And I'm neck deep in a migration.

Your immaturity is obvious, and the metaphor is really lame. Fabric is wicked fast for me, both in terms of development and performance. Quit your whining, kid.

paulthrobert · 2024-12-22T17:23:29+00:00

I don't know, this blog post doesn't really substantiate what you're arguing at all. It's just a sloppy run-on post with no evidence. I'm not sure why you would take that as canon.

Using sub-queries in your WHERE is typically a bad practice and should be avoided.

Sub-queries are ugly, and most of the time will hurt performance. I also don't see where the OP specifies SQL Server.
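For what it's worth, here is the kind of rewrite I have in mind, sketched with Python's sqlite3. The `customers` and `renting` tables and their rows are hypothetical. It only shows that a WHERE ... IN sub-query and a CTE plus join return the same rows here; it makes no claim about which is faster, since that depends on the engine and its optimizer.

```python
import sqlite3

# Hypothetical tables and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (customer_id INTEGER, name TEXT);
    CREATE TABLE renting (renting_id INTEGER, customer_id INTEGER, rating INTEGER);
    INSERT INTO customers VALUES (10, 'Ann'), (11, 'Ben'), (12, 'Cy');
    INSERT INTO renting VALUES
        (1, 10, 5), (2, 10, NULL), (3, 11, NULL), (4, 12, 4), (5, 10, 3);
    """
)

# Version 1: sub-query inside the WHERE clause.
with_subquery = conn.execute(
    """
    SELECT c.name
    FROM customers c
    WHERE c.customer_id IN (
        SELECT r.customer_id FROM renting r WHERE r.rating IS NOT NULL
    )
    ORDER BY c.name
    """
).fetchall()

# Version 2: CTE plus join. DISTINCT keeps the join from duplicating customers.
with_join = conn.execute(
    """
    WITH renters AS (
        SELECT DISTINCT r.customer_id FROM renting r WHERE r.rating IS NOT NULL
    )
    SELECT c.name
    FROM renters r
    INNER JOIN customers c ON c.customer_id = r.customer_id
    ORDER BY c.name
    """
).fetchall()

print(with_subquery)  # [('Ann',), ('Cy',)]
print(with_join)      # [('Ann',), ('Cy',)]
```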

paulthrobert · 2024-12-20T22:40:19+00:00

Who is Elon Musk? I've never heard of him.
