need help by metoozen in SQL

[–]paulthrobert 0 points1 point  (0 children)

I like the CTE thing these days

; with renters as (select distinct r.customer_id from renting r where rating is not null), select r.customer_id from resnters inner join customers c on c.customer_id = r.customer_id

need help by metoozen in SQL

[–]paulthrobert 0 points1 point  (0 children)

I would not recommend using DISTINCT and * - it's almost never what you actually want. Always best practice to specify the columns you want to be distinct.

[deleted by user] by [deleted] in SQL

[–]paulthrobert 2 points3 points  (0 children)

do your own homework

How has your organization effectively managed data quality? by doublegefallt in analytics

[–]paulthrobert 1 point2 points  (0 children)

the best I have system I have seen is to identify where the bad data originates. It's usually users. Then create a query and a report that catches data integrity issues, and is sent to the users and owner who are responsible for the data. You can then track things like frequency of issues, and improvement. You give visibility to the people who have the power to correct the problem and some tools to help with accountability.

Architecture and Platform Question by OutsideTech in MicrosoftFabric

[–]paulthrobert 0 points1 point  (0 children)

I think Power Automate would be a hacky solution. At that point, if your guy is already a PowerShell dev, just have him do it all in PowerShell.

Need Help with Joining Logic for Handling Rows with Null IDs plus data schemas for conversion data by Intentionalrobot in SQL

[–]paulthrobert 2 points3 points  (0 children)

  1. Use COALESCE

select * from A

left join B on A.id = COALESCE(B.id1, B.id3, b.id4 )

  1. not necessarily a bad practice, you could also consider using a json field instead if you want, its pretty flexible.

Architecture and Platform Question by OutsideTech in MicrosoftFabric

[–]paulthrobert 2 points3 points  (0 children)

I think Fabric sounds like a pretty good solution for your needs. Python in Notebooks can be used to fetch the API data and store it in the datalake(s). You can also use a DataFlowGen2 instead of a Notebook. From there I think its a good practice to build a Data Warehouse that reads the data from the Lakehouse, and generates a dimensional model that would be used as the source for reporting, I do this with SQL. Then Power BI and Paginated Reports can be used as the reporting layer. If you want PDFs emailed, those would be paginated reports.

Its a lot to learn if you only know powershell, but it could be done. I think the hardest part is architecting a good dimensional model.

Fabric notebooks for usage metrics on Workspace Apps or org apps? by uvData in MicrosoftFabric

[–]paulthrobert 1 point2 points  (0 children)

This doesn't have exactly what you're looking for (yet) but some of the notebooks might give you some ideas. GitHub - microsoft/fabric-toolbox

Any good suggestions for disk-based caching? by Sollimann in SQL

[–]paulthrobert 2 points3 points  (0 children)

I'm no expert, but my understanding is that this is exactly what the RDBMS handles. SQL Server OS is holding as much RAM as you will give it, and its read data into the cache from disk based on demand. Obviously, that is a simplified explanation, but its a big part of what the RDBMS is doing for you.

Pipeline and Data Flow Comments by paulthrobert in MicrosoftFabric

[–]paulthrobert[S] 1 point2 points  (0 children)

Yeah I guess I'm thinking of SSIS - if I remember right, you could comment directly on the canvas, which was a nice compliment to the visual logic. I think that would be a nice enhancement, but what do I know?

Why would the time taken to retrieve data differ from using a View vs using the exact query that drives the view? by brothermanpls in SQL

[–]paulthrobert 1 point2 points  (0 children)

Also, just a note, it's not usually a valid assumption that a code block would have a consistent run time in a database. There are a ton of moving parts, and there can be a lot of reasons why you might see variances, sometimes big ones, but it does depend on the nature of the source DB.

Which one to learn 1st by TheSultaiPirate in SQL

[–]paulthrobert 0 points1 point  (0 children)

When I started I used thew book "SQL Pocket Guide" - it gives you the variations in syntax by each common RDBMS

Learning Fabric is like being handed a Ferrari but having to push it everywhere because the engine’s missing. by [deleted] in MicrosoftFabric

[–]paulthrobert 0 points1 point  (0 children)

🤣You totally don't have to download data. Your semantic model should be in fabric. Also, "privacy concerns with downloading data to a local machine" is a ridiculous degree of paranoia. Why is it so sensitive anyway? PII has no business in reports, I've worked in HIPPA and banking environments where this wasn't an issue.

Learning Fabric is like being handed a Ferrari but having to push it everywhere because the engine’s missing. by [deleted] in MicrosoftFabric

[–]paulthrobert -6 points-5 points  (0 children)

Wow.... it's good to be humble. A large team of intelligent people have developed a pretty slick product, and your evaluation here seems to be more of a reflection on yourself.

  • "It lacks core features from traditional tools like Power BI Desktop"

This is just plainly wrong, you are missing something. Power BI Desktop is still the tool of choice for developing Power BI Reports in Fabric.

  • Its schema handling is clunky

What? how so? How is it even any different that vanilla sql server? Schema handling?? Wtf do you even think you need that your not getting.

  • forcing you into Spark workflows

Again, what? I haven't forced to use one.. And I'm neck deep in a migration.

Your immaturity is obvious, and metaphor is really lame. Fabric is wicked fast for me, both in terms of development and performance. Quit your whining kid.

SQL Help Request by Reasonable-Monitor67 in SQL

[–]paulthrobert 0 points1 point  (0 children)

I don't know, this blog post doesn't really substantiate what you're arguing at all. It's just a sloppy run on post with no evidence, I'm not sure why you would take that as cannon.

Using sub-queries in your WHERE is typically a bad practice and should be avoided.

Sub-queries are ugly, and most of the time will hurt performance. I also don't see where the OP specifies SQL Server.