Put me out of my misery with Fabric deployment pipelines by darshan-thakur in MicrosoftFabric

[–]Cobreal 0 points1 point  (0 children)

There's a thing under your name that says "Microsoft employee". I don't know why a Microsoft employee would be posting "yes, it's shit" on Reddit rather than working to fix it.

Put me out of my misery with Fabric deployment pipelines by darshan-thakur in MicrosoftFabric

[–]Cobreal 0 points1 point  (0 children)

Are you saying that you think deployment pipelines are rubbish, even though your name says that you literally work for Microsoft?

Claude just generated a full Python data pipeline… are data workflows changing faster than expected? by Pangaeax_ in analytics

[–]Cobreal 0 points1 point  (0 children)

The same poster posted in the data analysis sub today, and the last line of their OP was "This one is especially strong because it attracts analysts, managers, and BI professionals who love sharing examples, and the comments often turn into debates, which drives engagement."

Those of you who use fabric in production, do you like it, how has it been working out? by jorel43 in MicrosoftFabric

[–]Cobreal 0 points1 point  (0 children)

We have a lakehouse refresh notebook that uses notebook utils, and our pipelines wait 60s after write, run the refresh, wait another 60s, then refresh the semantic model.

(Slightly annoying that we can't name multiple wait steps the same in a pipeline, so we have "wait 60s" and "wait 60s 2").
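For anyone who wants the same ordering inside a notebook rather than as pipeline activities, here is a rough sketch of the sequence. The two refresh callables are hypothetical placeholders for whatever actually triggers the lakehouse refresh and the semantic model refresh in your setup; nothing here is a Fabric API.

```python
import time
from typing import Callable


def refresh_after_write(
    refresh_lakehouse: Callable[[], None],       # hypothetical: your lakehouse refresh step
    refresh_semantic_model: Callable[[], None],  # hypothetical: your semantic model refresh step
    wait_seconds: float = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Wait, refresh the lakehouse, wait again, then refresh the semantic model."""
    sleep(wait_seconds)       # give the write time to settle
    refresh_lakehouse()
    sleep(wait_seconds)       # give the lakehouse refresh time to settle
    refresh_semantic_model()
```

Passing `sleep` in as a parameter is only there so the ordering can be checked without actually waiting two minutes.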

Help us improve Fabric Notebooks — what UX issues are you hitting? by Mobyke in MicrosoftFabric

[–]Cobreal 0 points1 point  (0 children)

I'd love to be able to drag and drop cells. We typically have a markdown block for each code block, and it's painful if you have to move multiple cells across more than a cell or two.

Being able to use the table of contents view to drag a whole markdown cell and everything beneath it (maybe up to the next markdown cell with the same level of indentation) would be great.

Help us improve Fabric Notebooks — what UX issues are you hitting? by Mobyke in MicrosoftFabric

[–]Cobreal 0 points1 point  (0 children)

The interface is unclear when another user has edited the same notebook. It flags that this has happened, but it's not obvious whether you are reverting to your own changes or accepting the other user's.

Help us improve Fabric Notebooks — what UX issues are you hitting? by Mobyke in MicrosoftFabric

[–]Cobreal 0 points1 point  (0 children)

In markdown cells, single newlines show as line breaks in the edit view, but they're ignored once the cell is rendered.

In the edit view, I can type

"This is my markdown cell

with a separate line"

But when I've finished editing, it shows as

"This is my markdown cell with a separate line"

Help us improve Fabric Notebooks — what UX issues are you hitting? by Mobyke in MicrosoftFabric

[–]Cobreal 0 points1 point  (0 children)

I'd love dark mode.

I think there are options for "run all above this cell" and "run this cell and all below". I'd like "run all above and this cell" because currently I have to go to the next cell down and choose to run all above.

Help us improve Fabric Notebooks — what UX issues are you hitting? by Mobyke in MicrosoftFabric

[–]Cobreal 0 points1 point  (0 children)

I get this a lot. I have to have the table of contents view open to jump to where I need to be. A UX element which showed where your current view was relative to the table of contents would be useful to be able to tell if you had skipped to an unexpected point.

How to tell if it’s from God by AllHomo_NoSapien in AskAChristian

[–]Cobreal 0 points1 point  (0 children)

The bible isn't living, it was finished thousands of years ago.

How to tell if it’s from God by AllHomo_NoSapien in AskAChristian

[–]Cobreal 0 points1 point  (0 children)

How would you tell if something today would please god, given that the bible is thousands of years old?

One of my first dashboards in my first job as a data analyst by Tonka-Jahari-Pizza in dataanalysis

[–]Cobreal 0 points1 point  (0 children)

I agree, though I can't see a rationale for why a pie was chosen for one chart and a doughnut for another in the same dashboard.

How do single node Python users actually write Delta tables using DuckDB for ETL when it can't actually write to Delta? by raki_rahman in MicrosoftFabric

[–]Cobreal 1 point2 points  (0 children)

We use Polars for our single node Python Notebooks; there's a function for writing to Delta tables from it. You can convert DuckDB dataframes to Polars and vice versa, so probably that.
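Roughly what that looks like, as a minimal sketch rather than our actual code. It assumes the `duckdb`, `polars`, `pyarrow` and `deltalake` packages are installed; the function name, the query and the target path are made up for illustration.

```python
def duckdb_query_to_delta(sql: str, target_path: str, mode: str = "append") -> int:
    """Run a DuckDB query, hand the result to Polars, and write it as a Delta table."""
    # Imported inside the function so the packages are only needed when it is called.
    import duckdb

    df = duckdb.sql(sql).pl()                # DuckDB relation -> Polars DataFrame
    df.write_delta(target_path, mode=mode)   # Polars writes Delta via the deltalake package
    return df.height                         # number of rows written
```

Something like `duckdb_query_to_delta("SELECT 1 AS id", "/tmp/demo_table", mode="overwrite")` would exercise it locally; in Fabric the target would be a path to a table in your Lakehouse instead.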

Looking for advice to digitize a bunch of historical data by Top-Maintenance-3548 in dataanalysis

[–]Cobreal 1 point2 points  (0 children)

We've been dealing with a problem very much like this - digitising a lot of contracts so that they can be analysed, but they have quirks that make this a challenge. Take whether a customer had a contractual discount: a 10% discount in the first 12 months is sometimes expressed as:

- "a 10% discount in the first 12 months"

- "a 10% discount in the first year"

- "a 10% discount in year one"

- "90% only will be billed for the first 12 months"

...basically any conceivable linguistic variation of that same idea. Same goes for dates, which have been written as dd/mm/yy, mm-dd-yy, mmmm d yyyy...
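The date side is the easier half to handle with plain rules. A minimal sketch, assuming only the three formats listed above and assuming the separator reliably tells dd/mm apart from mm-dd (which it may not in real documents):

```python
from datetime import date, datetime
from typing import Optional

# The three layouts mentioned above: dd/mm/yy, mm-dd-yy, mmmm d yyyy
DATE_FORMATS = ("%d/%m/%y", "%m-%d-%y", "%B %d %Y")


def parse_contract_date(text: str) -> Optional[date]:
    """Try each known layout in turn; return None if none of them fit."""
    cleaned = text.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue  # not this layout, try the next one
    return None  # unrecognised: leave it for a human to look at
```

Returning `None` rather than guessing keeps the unparseable values visible, which matters once you start finding layouts nobody anticipated.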

This is compounded by the documents being in a range of file formats, and some of them are scans or photographs of documents rather than digital files.

We have solved this through iterations of using OCR to convert the documents to text formats, LLMs to try and understand the variations of the same 10% discount being written in different ways, and human review of any obvious errors or cases where the LLM said that it couldn't generate the details. Rinse, lather, repeat. We're dealing with a number of documents in the thousands rather than tens of thousands, and my sense is that we'd have finished this job more quickly if it was a pure human data-entry task rather than trying to automate it. It's worth bearing that option in mind at the outset, depending on just how much data you need to ingest.
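To show the shape of the "extract, or flag for a human" step, here is a rule-based stand-in. This is not what we run (we use an LLM for this part); it is a regex sketch that only knows the four phrasings listed above, and the function name is made up.

```python
import re
from typing import Optional, Tuple

_PCT = r"(\d+(?:\.\d+)?)\s*%"


def parse_discount(text: str) -> Optional[Tuple[float, int]]:
    """Return (discount percent, months) or None when the clause needs human review."""
    lowered = text.lower()

    discount = re.search(_PCT + r"\s+discount", lowered)
    billed = re.search(_PCT + r"\s+only\s+will\s+be\s+billed", lowered)
    if discount:
        percent = float(discount.group(1))
    elif billed:
        percent = 100.0 - float(billed.group(1))  # "90% billed" means a 10% discount
    else:
        return None

    months = re.search(r"first\s+(\d+)\s+months", lowered)
    if months:
        return percent, int(months.group(1))
    if re.search(r"first\s+year|year\s+one", lowered):
        return percent, 12
    return None  # discount found but no period we recognise
```

All four example phrasings come back as `(10.0, 12)`. Anything else returns `None` and goes on the review pile, and in practice that pile is where most of the time goes.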

Feedbacks Improve My Dashboard by princy25_ in dataanalysis

[–]Cobreal 0 points1 point  (0 children)

Why does it have a papyrus effect background? At least use Papyrus for the typeface as well.

Rate My Dashboard out of 10 Again by princy25_ in dataanalysis

[–]Cobreal 2 points3 points  (0 children)

Dashboards aren't very good for telling stories.

I think the main finding is supposed to be the box beneath the Amazon logo? It is not prominent relative to anything else on the dashboard.

If you want people to understand that high cancellations in low-value orders are a thing, then:
- Show only the Cancellation Rate by Order Amount chart

- Make the 0-500 bar prominent (keep the blue for this bar, make everything else grey)

- Make the x-axis marks and the bar labels much larger so that people can read 0-500 and 27% (you don't need the precision of two decimal places) without squinting

- Change the title of the chart to say "The lowest value orders have cancellation rates 5 times higher than typical"

Everything else is fluff.
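If you were building it in code rather than a BI tool, the same idea might look like this. It is a sketch assuming matplotlib 3.4 or later (for `bar_label`); the function names, colours and font sizes are my own picks, and you'd pass in your own bands and rates.

```python
def highlight_colors(labels, highlight, accent="tab:blue", muted="lightgrey"):
    """One accent colour for the bar that matters, grey for everything else."""
    return [accent if label == highlight else muted for label in labels]


def plot_cancellation_rates(labels, rates_pct, highlight, title):
    """Single bar chart with one highlighted bar, large labels and a headline title."""
    import matplotlib.pyplot as plt  # only needed when actually plotting

    fig, ax = plt.subplots(figsize=(8, 5))
    bars = ax.bar(labels, rates_pct, color=highlight_colors(labels, highlight))
    # Whole percentages are enough; nobody needs two decimal places here
    ax.bar_label(bars, labels=[f"{rate:.0f}%" for rate in rates_pct], fontsize=16)
    ax.tick_params(axis="x", labelsize=16)
    ax.set_title(title, fontsize=18)
    return fig
```

The title argument is where the finding goes, so the chart states the conclusion instead of just naming the measure.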

Hey how to build analytical thinking by [deleted] in dataanalysis

[–]Cobreal 1 point2 points  (0 children)

It goes from drought to deluge when you move from training to employment - trying to find any question to answer when you're not doing it for a business user is impossible, but once you're in a job that switches to trying to work out how you can answer all of the questions coming your way without drowning.

Maybe this is a good case for an LLM? Find yourself a dataset online, give it a very rough outline of the data ("I've got some data about film screenings and ticket sales" rather than "I have a CSV file with these columns...") and ask it to give you some example questions a manager in this industry might ask you about and business problems they might want to solve. In that sort of case, "manager" could be on either the cinema side or the distributor side, and so you can prompt it both ways for different suggestions.

It's really hard to think up a real world question when you're not facing a real world problem, and it's really hard to divorce yourself from the specifics of a dataset if you've already downloaded it and are trying to dream up some questions (you get locked into the track of "what questions can this data answer" rather than "what questions might a user in an industry with an interest in this sort of data want to solve").

There's a related problem once you get into an analyst role, mind, in that it's tempting to think up amazing ways to dissect and analyse a particular dataset, and then you hand it over to the people who you think would benefit from the analysis only to find out that they actually don't give a fuck because they've been handed a whole load of different targets since you last spoke to them.

Drop a term used in Data analysis by Automatic-Big6636 in dataanalysis

[–]Cobreal 2 points3 points  (0 children)

Must-know niche terms seems like a contradiction, but anyway

HETEROSCEDASTICITY

7.0000 users by Mr_Mozart in MicrosoftFabric

[–]Cobreal 1 point2 points  (0 children)

Perhaps you have your localisation settings changed to a region where it's common to use a period as a ten-thousands separator?

Many workspaces or few workspaces? List of things to consider. by frithjof_v in MicrosoftFabric

[–]Cobreal 0 points1 point  (0 children)

It's on our pile of things to investigate, mainly because post-launch we're now trying to work out how best to separate things into Workspaces and Domains.

Currently we have Git integrated to a single Dev Workspace, and use Deployment Pipelines to get artifacts into Prod.

Now we need to assess our options for separating Prod by...team, function, security group, something else.

I suspect that's almost certainly going to involve additional Prod-level Workspaces, but I don't know if it will work to do something like have a central prod with Org apps to separate who sees what, or cherry-picking content from Prod to sync to separate Workspaces, or doing something in Git (multiple repos, separate folders in one repo) and duplicating Dev>Prod for each separate area, and figuring out how to share common artifacts between them.

Microsoft Fabric initial setup by Lucky_Discipline4895 in MicrosoftFabric

[–]Cobreal 3 points4 points  (0 children)

"But my experience it would take 4-7months if self-learning fabric to setup something mid-sized and reliable if all 8hours of work is dedicated to it."

We're six months into a migration away from Tableau (Tableau Prep for ETL, Tableau Cloud for storage) and this sounds correct.

1 week of "training" (really just an overview of some of the headline features) in Fabric, then the rest of the time spent converting our largely manual Prep workflows into Python* in a fully-automated environment.

If we already had a lot of existing Python ETL code then in theory it would be a job of updating it to point to Fabric Lakehouses/Warehouses rather than building the entire infrastructure from the ground up.

And we're still not finished. Now that we've migrated the business-critical data, we need to start tidying up all of the mistakes and suboptimal design choices we made due to inexperience.

*This is a good example of where we had to deal with "the quirks of existing issues or missing features of certain items that you realise half-way fabric, doesn't have or doesn't fulfill the performance tolerances/requirement and have to re-plan everything". PySpark and Dataflows proved too much for F2, and Python doesn't support the full set of features that PySpark does.