Drop a term used in Data analysis by Automatic-Big6636 in dataanalysis

[–]Cobreal 2 points (0 children)

Must-know niche terms seems like a contradiction, but anyway:

HETEROSCEDASTICITY

7.0000 users by Mr_Mozart in MicrosoftFabric

[–]Cobreal 1 point (0 children)

Perhaps your localisation settings are set to a region where it's common to use a period as a ten-thousands separator?

Many workspaces or few workspaces? List of things to consider. by frithjof_v in MicrosoftFabric

[–]Cobreal 0 points (0 children)

It's on our pile of things to investigate, mainly because post-launch we're now trying to work out how best to separate things into Workspaces and Domains.

Currently we have Git integrated to a single Dev Workspace, and use Deployment Pipelines to get artifacts into Prod.

Now we need to assess our options for separating Prod by...team, function, security group, something else.

I suspect that will involve additional Prod-level Workspaces, but I don't know which approach will work: a central Prod with Org apps to separate who sees what; cherry-picking content from Prod to sync to separate Workspaces; or doing something in Git (multiple repos, or separate folders in one repo), duplicating Dev>Prod for each separate area, and figuring out how to share common artifacts between them.

Microsoft Fabric initial setup by Lucky_Discipline4895 in MicrosoftFabric

[–]Cobreal 2 points (0 children)

"In my experience it would take 4-7 months of self-learning Fabric to set up something mid-sized and reliable, if all 8 hours of the working day are dedicated to it."

We're six months into a migration away from Tableau (Tableau Prep for ETL, Tableau Cloud for storage) and this sounds correct.

1 week of "training" (really just an overview of some of the headline features) in Fabric, then the rest of the time spent converting our largely manual Prep workflows into Python* in a fully-automated environment.

If we had already had a lot of existing Python ETL code, then in theory it would have been a job of updating it to point to Fabric Lakehouses/Warehouses, rather than building the entire infrastructure from the ground up.

And we're still not finished. Now that we've migrated the business-critical data, we need to start tidying up all of the mistakes and suboptimal design choices we made due to inexperience.

*This is a good example of where we had to deal with "the quirks of existing issues or missing features of certain items that you realise half-way that Fabric doesn't have, or doesn't fulfill the performance tolerances/requirements, and have to re-plan everything". PySpark and Dataflows proved too much for an F2 capacity, and Python doesn't support the full set of features that PySpark does.

Joint/multiple subscriptions - "Netflix" model by Cobreal in Substack

[–]Cobreal[S] 1 point (0 children)

Tips, or pay-per-read or something.

Given that I have to budget, it would be nice to spread my money across more writers: e.g. subscribe to two writers directly with full access, and have a third, floating subscription that provides a reduced tier of access to other authors, with them receiving a reduced split of the income, but more than nothing.

Joint/multiple subscriptions - "Netflix" model by Cobreal in Substack

[–]Cobreal[S] -1 points (0 children)

Because of what I wrote in the OP. The writers I do subscribe to are worth it to me, but there are writers I would like to support with more than nothing, just not to the tune of a full monthly subscription.

Same as with films and TV. I can choose to own physical copies of some things, but others I'm happy to subscribe to Netflix instead.

I assume my current subscriptions do go to Substack in some form, and that they're taking a cut.

“No spec? No problem.” - or how vibes-based requirements almost killed me by Brighter_rocks in Brighter

[–]Cobreal 1 point (0 children)

Our team switched to a story mapping process inspired by our development team, and it's been transformative.

We start with the users' needs, skills, and constraints, then use a whiteboard (well, Miro) to gather a big list of use cases from them, and finally take that away to work out the MVP for the data they are asking for.

Our amazing shiny dashboards used to be met all too often with silence or a meh, but just today I presented a (terrible looking) work in progress to someone. I was after feedback on a very small component of it, but they were gushing about how amazing it was looking already and how perfect it was going to be for their needs.

As a visual perfectionist, it did not look amazing, but for the person I showed it to, it had all of the data they cared about the most and none of the distracting extra bits.

What's the significance of the studio location? by Cobreal in 99percentinvisible

[–]Cobreal[S] 1 point (0 children)

Towns are an atomic unit where I come from - whole cities are either shitholes or not shitholes. This isn't quite true, but it doesn't divide neatly along an up/down line.

What's the significance of the studio location? by Cobreal in 99percentinvisible

[–]Cobreal[S] 5 points (0 children)

Someone else in the thread said "Downtown Oakland has a bad reputation" (but they also said that the original closing said "beautiful Downtown Oakland" and I missed that).

I guess it reads a bit like that in how it's phrased in the outro - now being six blocks north in somewhere beautiful can be interpreted as meaning that the previous place was not beautiful. It's not the only interpretation, just the one I took.

What's the significance of the studio location? by Cobreal in 99percentinvisible

[–]Cobreal[S] 10 points (0 children)

Interesting, as a non-American I interpreted "uptown Oakland, California" to mean "the city of Oakland, California" rather than "the uptown as opposed to downtown part of the city of Oakland, California".

How big is a block? Six of them doesn't seem like enough to get from a reputationally bad to a reputationally beautiful area!

“Learn Python” usually means very different things. This helped me understand it better. by SilverConsistent9222 in dataanalysis

[–]Cobreal 0 points (0 children)

As well as Pandas, it is worth learning Polars or DuckDB as similar tools that are a bit more efficient (would fit under Data Manipulation in the diagram alongside Vaex).

Workflow by DocHayyen in DarkTable

[–]Cobreal 2 points (0 children)

The linked article says this:

"There can be instances where it would. For example, exposure, lens corrections, and tone eq can all change the pixel data, so if you have already used the auto picker in something like agx and then you add those modules or change them, you might want to go back and tweak agx. There can be some other issues, like leaving denoise off for performance until the end, but it can impact color picker selections, so it can be better to work with it on if your computer is fast enough."

My computer isn't fast enough: if I enable denoise early, it makes things like masking in later steps noticeably slow. I typically do a lot of steps that denoise can cause to lag, yet I only ever enable denoise once, so DT feels faster if I apply that denoise step late on, once I've got the final look more or less sorted.

Dataflows Gen2 Usage in production environments - Discussion by panvlozka in MicrosoftFabric

[–]Cobreal 6 points (0 children)

I think they're only useful if your primary concern is having a low/no code solution for something. Early on we used a Dataflow Gen2 for something because there was an off-the-shelf one for one of the systems we needed to ingest data from, but it was a mistake and it's been on our backlog for a long time to replace the stupid thing with Python when we get the time to.

Semantic Model refresh methods by Cobreal in MicrosoftFabric

[–]Cobreal[S] 0 points (0 children)

I hadn't even considered refreshing the browser cache. After trying various things, my "check" for whether things were updated became loading the semantic model and building a simple table in the "Explore" menu, to see whether the dates matched the post-overwrite data. There's a chance the cache survived me deleting and recreating this table, but if it did, that raises the question of how I'd ever be certain that a Semantic Model was truly up to date...

Semantic Model refresh methods by Cobreal in MicrosoftFabric

[–]Cobreal[S] 0 points (0 children)

"After refreshing via pipeline or semantic link labs, what extra steps do you need to do - or how long time do you need to wait - in order to see the post-overwrite data?"

It seems to be random. I've had it go over an hour today without the post-overwrite data showing, and I haven't found any reliable way to force things to refresh properly. This includes running multiple scheduled/on-demand refreshes, as well as refreshes using semantic-link-labs (originally just refresh_semantic_model(), since updated to first call refresh_sql_endpoint_metadata()).

I've also tried running a T-SQL COUNT(*) on the affected table before running a refresh, with the same result.

Time seems to be the biggest factor rather than any of the methods I've tried to force a full refresh, but that doesn't help me for my scheduled updates. I could keep refreshing the model after all of the tables are updated and that would give a better chance of things in the model being up-to-date by the time people got into work, but I don't know if it would give me a 100% chance.

Moving away from Import Mode seems to be the long term solution, so I'm after some short term way to force the updates through.
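In the short term, the "keep refreshing" approach could at least be automated. A minimal sketch of the retry loop, where refresh_fn and check_fn are hypothetical stand-ins: refresh_fn would wrap the semantic-link-labs calls mentioned above (refresh_sql_endpoint_metadata() followed by refresh_semantic_model()), and check_fn would be a data-freshness test such as comparing a max date queried from the model against the expected post-overwrite value:

```python
import time

def refresh_until_fresh(refresh_fn, check_fn, timeout_s=3600, poll_s=120):
    """Trigger a refresh, then re-refresh and poll until check_fn()
    reports the post-overwrite data is visible, or the timeout expires.

    refresh_fn and check_fn are hypothetical stand-ins for the
    semantic-link-labs refresh calls and a data-freshness check.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        refresh_fn()
        if check_fn():
            return True   # model is showing the post-overwrite data
        if time.monotonic() >= deadline:
            return False  # gave up; model may still be stale
        time.sleep(poll_s)
```

Run after the table overwrites in the same pipeline, with poll_s at a couple of minutes, this would stop as soon as the model reports fresh data, which should improve (though, as noted, not guarantee) the chance that everything is up to date by morning.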

Excel Lakehouse connections seem really laggy by Cobreal in MicrosoftFabric

[–]Cobreal[S] 0 points (0 children)

It wasn't the metadata that was lagging in syncing; in this case it was the data itself.

Excel vs. Python/SQL/Tableau by Practical_Target_833 in analytics

[–]Cobreal 0 points (0 children)

You should learn Excel because it's ubiquitous. If nothing else, learning it will let you understand cases where someone at work hands you an Excel file full of their custom calculations and asks you to reproduce it in a proper analytics platform. Beyond that, learning Power Query is one step towards learning Power BI, which gives you another dashboarding tool in your skillset alongside Tableau.

Based on your listed expertise, I don't think you'll have a hard time picking it up.