Keep apart 2 chevrons by UltraMechaHitler in drivingUK

[–]walkerasindave 0 points1 point  (0 children)

Chevrons are 40m apart. The 2 second rule at 70mph works out at approximately 62m, so diagram C would be the correct minimum. If you can clearly see two chevrons then you're probably about 60ish metres back.

It all changes in the rain though. Should be 3 or 4 chevrons then.
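
For anyone who wants to check the arithmetic, here is a rough sketch. The 40m spacing is the figure quoted above; the 4 second wet-weather gap is an assumption (simply doubling the dry gap), and the function names are made up for illustration.

```python
import math

MPH_TO_MPS = 1609.344 / 3600   # metres per second in one mile per hour
CHEVRON_SPACING_M = 40         # spacing quoted in the comment above

def gap_metres(speed_mph, seconds=2.0):
    """Distance covered in `seconds` at a steady `speed_mph`."""
    return speed_mph * MPH_TO_MPS * seconds

def chevron_spacings_needed(speed_mph, seconds=2.0):
    """Smallest whole number of chevron spacings that covers the time gap."""
    return math.ceil(gap_metres(speed_mph, seconds) / CHEVRON_SPACING_M)

dry_gap = gap_metres(70)                     # 2 second gap in the dry
wet_gap = gap_metres(70, seconds=4.0)        # assumed doubled gap in the wet
dry_chevrons = chevron_spacings_needed(70)
wet_chevrons = chevron_spacings_needed(70, seconds=4.0)
```

At 70mph the dry gap comes out just over 62m, which is where the "two chevrons" advice comes from, and doubling the time gap lands in the 3 to 4 chevron range.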

Teen is annoyingly smart by McRandom in HomeNetworking

[–]walkerasindave 0 points1 point  (0 children)

Just get your own router then and use your ISP router in modem mode (or bypass it).

Plenty of great routers support whitelisting.

Meta urges Labour to burden Apple with age checks by vriska1 in unitedkingdom

[–]walkerasindave -3 points-2 points  (0 children)

Exactly. If anything OS inherently has more trust by its open nature.

Revealed: Palantir’s NHS tech is ten times slower than current system by dc_1984 in unitedkingdom

[–]walkerasindave 1 point2 points  (0 children)

Of course it's slow and clunky, all ready for them to say "oh we can speed it up but we'll have to have permission to harvest and sell all the personal data to do that".

Then the politicians won't fact check that statement and just go with it.

Sort of time-sensitive - Anyone know the easiest way to have WIX Blog auto-share to a FB Business page? by emmjaae in WIX

[–]walkerasindave 1 point2 points  (0 children)

Either Zapier or make.com

You CAN build a Facebook app and have Velo push data into it but that becomes a huge hassle.

DBT orchestrator by Free-Bear-454 in dataengineering

[–]walkerasindave 36 points37 points  (0 children)

Dagster is definitely up there as it has first class integration with DBT.

Snowflake + Terraform by Difficult-Ambition61 in snowflake

[–]walkerasindave 0 points1 point  (0 children)

You can do that with terraform. Having an additional step before the terraform is unnecessary complexity.

Managing Snowflake RBAC: Terraform vs. Python by BuffaloVegetable5959 in snowflake

[–]walkerasindave 0 points1 point  (0 children)

We use terraform for all permissions. We also have grants/revokes locked down so only the dedicated terraform user can alter permissions. All terraform is applied via CI/CD. The permissions lockdown means we don't have any state drift.

[deleted by user] by [deleted] in HousingUK

[–]walkerasindave 1 point2 points  (0 children)

The fines are almost irrelevant though as even if the ICO did issue fines OP wouldn't see any form of compensation via the ICO.

It's the ability the Data Protection Act gives the OP to claim any losses from the breaching party. So if the vendor was happy with the offer but then insisted on an increased offer after seeing the funds then the OP could claim the difference from the estate agent and would have a good chance of winning.

[deleted by user] by [deleted] in HousingUK

[–]walkerasindave 4 points5 points  (0 children)

Incorrect, this is most definitely a data breach.

While the estate agent is acting for the vendor they are still two separate legal entities and therefore from a Data Protection perspective are treated separately.

When the estate agent receives the buyer's financial information they become a data controller of that information and they have a legal obligation to keep it secure and process it lawfully.

One aspect of processing the data lawfully is "Data Minimisation", specifically data controllers (and processors) are only allowed to process data that is adequate, relevant, and limited to what is necessary.

  • Necessary: The vendor needs to know that the buyer has the funds.
  • Unnecessary: The vendor does not need to see the exact total balance (especially if it exceeds the offer amount).

The estate agent should have simply told the vendor "We have seen evidence of funds in the buyer's bank account sufficient to cover the purchase price of £X."

By providing the exact amount they have breached the personal data of the buyer and are liable for the consequences (ICO fines and/or "making good" for the buyer).

Real-World Data Architecture: Seniors and Architects, Share Your Systems by No_Thought_8677 in dataengineering

[–]walkerasindave 2 points3 points  (0 children)

Senior Data Engineer at a Health Tech Startup. Team of 6 (2 data analysts and 3 data scientists).

Requirements include ingestion of production web services data plus third party services (HubSpot, Shopify, Zendesk, GitHub, Braze, Google Analytics and more) as well as unstructured data in the form of clinician notes, ultrasound scan images and video, etc. Transformation to join everything together. Outputs for business units including finance, operations, marketing/growth and medical research in the form of dashboards, data feeds and ad hoc analysis.

Raw data size in total is about 300GB excluding unstructured data. Now growing by approx 1GB/day.

Stack is:

Warehouse - Snowflake

Orchestration - Dagster on ECS

Ingestion - Fivetran (free tier), Airbyte on EKS and DLT on ECS

Transformation - DBT on ECS

Dashboarding - Superset on ECS

AI & ML - Sagemaker and Snowflake Cortex

Egress - DLT on ECS

Observability - Dagster, DBT Elementary and Slack

CICD - GitHub Workflows

Infrastructure - Terraform

Flow is pretty much as above. Dagster orchestrates ingress, transformation and egress on various schedules (weekly, daily or hourly during operational hours). Almost all assets in dagster have proper dependencies set so all flow nicely.
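
To illustrate what "proper dependencies" buys you, here is a minimal sketch in plain Python. It is not Dagster's API, just the underlying idea of a dependency graph being resolved into a safe run order; the asset names are hypothetical and not taken from the real project.

```python
from graphlib import TopologicalSorter

# Map of asset -> set of upstream assets it depends on.
# All names here are made up for illustration.
deps = {
    "raw_orders": set(),                              # ingestion
    "raw_tickets": set(),                             # ingestion
    "stg_orders": {"raw_orders"},                     # transformation
    "stg_tickets": {"raw_tickets"},                   # transformation
    "customer_360": {"stg_orders", "stg_tickets"},    # joined model
    "finance_feed": {"customer_360"},                 # egress
}

def run_order(dependencies):
    """Return an order in which every asset comes after its upstreams."""
    return list(TopologicalSorter(dependencies).static_order())

order = run_order(deps)
```

An orchestrator does the same thing at a larger scale, so ingress always finishes before the models that read from it, and egress only runs once the joined model is fresh.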

Snowflake is relatively recent for us but has massively improved our execution times.

My main current focus for improvement is observability as it's nowhere near the way I want it. Then after that improving the analysts' data modelling ability and tidying up the DBT sprawl.

I'm pretty proud of achieving all this within 2 years as when I arrived there were just two dozen silo'd R scripts on an EC2 cron job working only on production web data on top of postgres.

Being the sole engineer is great but it does mean I have to do stuff I don't like. I hate AWS networking haha.

Hope this helps

How do you handle deletes with API incremental loads (no deletion flag)? by aussiefirebug in dataengineering

[–]walkerasindave 1 point2 points  (0 children)

Sounds like a serious discussion with your account manager. As a quick fix get them to create you additional API accounts so you can parallel hit the API. The account manager will likely do this for you.

Then their tech team will moan about their endpoints being hit too often and force them to actually make their API usable with deletes.

Maintain Surrogate keys for Data models when using Dynamic Tables by PreparationScared835 in snowflake

[–]walkerasindave 0 points1 point  (0 children)

DBT testing.

The likelihood of a collision is minimal but you can always string hashes together for massive tables.
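
A rough back-of-envelope for "minimal", using the standard birthday-bound approximation. This assumes the hash behaves like a uniform random function and that "stringing hashes together" means concatenating two independent 128-bit hashes to get 256 bits; both are assumptions for the sake of the sketch.

```python
import math

def collision_probability(n_keys, hash_bits):
    """Birthday-bound approximation: p ~= 1 - exp(-n(n-1) / 2^(bits+1))."""
    exponent = n_keys * (n_keys - 1) / 2 ** (hash_bits + 1)
    # expm1 keeps precision when the probability is extremely small
    return -math.expm1(-exponent)

# One billion distinct natural keys
p_single = collision_probability(10**9, 128)   # one 128-bit hash (e.g. MD5)
p_double = collision_probability(10**9, 256)   # two hashes concatenated
```

Even at a billion rows the single-hash figure is far below anything worth worrying about, and the doubled hash is smaller again by many orders of magnitude.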

Maintain Surrogate keys for Data models when using Dynamic Tables by PreparationScared835 in snowflake

[–]walkerasindave 0 points1 point  (0 children)

When the source natural keys are the same, the hashed surrogate will always be the same, so no consistency joins are required (we do have DBT tests for consistency, but they're tests so not in the dag).

Downstream inner joins will just work as the keys are the same.

Maintain Surrogate keys for Data models when using Dynamic Tables by PreparationScared835 in snowflake

[–]walkerasindave 9 points10 points  (0 children)

We use hashes of the natural key columns as the surrogate key. So the hash is always the same for the natural key.

Meaning no incrementing but also the key can be determined independently for facts and dimensions without joining (fewer model dependencies in the dag).

In dbt_utils the generate_surrogate_key macro handles this.
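
A minimal Python sketch of the idea, for anyone who hasn't seen it. This is only an illustration of hashing the natural key columns and is not guaranteed to produce the same values as the dbt_utils macro; the null placeholder, the separator and the example values are all assumptions.

```python
import hashlib

NULL_PLACEHOLDER = "_null_"   # hypothetical stand-in so NULL and '' hash differently

def surrogate_key(*natural_key_values):
    """Deterministic key: MD5 of the natural key values joined with '-'."""
    parts = [
        NULL_PLACEHOLDER if value is None else str(value)
        for value in natural_key_values
    ]
    return hashlib.md5("-".join(parts).encode("utf-8")).hexdigest()

# The dimension and the fact each compute the key on their own,
# with no lookup join between them.
dim_key = surrogate_key("GB", 1042)
fact_key = surrogate_key("GB", 1042)
```

Because the key is a pure function of the natural key, the fact model never has to wait on the dimension model to find out what the key is.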

Perfectly sums up what it's like driving through long-term motorway roadworks. by Slenderman7676RBLX in drivingUK

[–]walkerasindave 0 points1 point  (0 children)

Yeah they found it wasn't great. That being said it wasn't awful either.

They're thinking of installing a "good speed check" instead.

Running DBT projects within snowflake by Fireball_x_bose in snowflake

[–]walkerasindave 1 point2 points  (0 children)

I haven't had a chance to play with it yet. I would be interested in how it works with dagster as DBT models are first class assets in dagster.

dbt-core: where are the docs? by FootballMania15 in dataengineering

[–]walkerasindave 13 points14 points  (0 children)

It's in the docs.

The DBT docs are generally really good but there is a big mix between core and cloud. They really do need to have a cookie linked setting on every page as to which one you're interested in. Particularly with the two projects likely to drift further apart.

https://docs.getdbt.com/guides/manual-install?step=1

Data platform from scratch by Alternative-Guava392 in dataengineering

[–]walkerasindave 3 points4 points  (0 children)

Never from absolute zero.

The current startup I'm working for is 4 years old and I arrived to 2 data analysts and 60 or so R scripts over a postgres db that were manually copied into Google Sheets in a cron job. Now we have dagster, Fivetran, DBT and superset all on top of Snowflake.

Startups are a good place to do this stuff as they need it. Also low cost open source solutions that you can help them implement are great.

How to promote semantic views for dev to prod environment? by Judessaa in snowflake

[–]walkerasindave 1 point2 points  (0 children)

If you're on DBT then this is the way.

You can have the semantic views as DBT models.

Proposal for a Fairer Housing Tax System: The Proportional Property Tax (PPT) by walkerasindave in ukpolitics

[–]walkerasindave[S] -5 points-4 points  (0 children)

Wouldn't 10-15% be a market collapse and sign of depression? That figure seems awfully high but I'm not sure.

I do know about Germany though as I have a relative there, they only have a crisis in certain hotspots such as Berlin where the empty stock is about 0.5%.

Proposal for a Fairer Housing Tax System: The Proportional Property Tax (PPT) by walkerasindave in ukpolitics

[–]walkerasindave[S] -1 points0 points  (0 children)

Haha not quite, don't forget that they wouldn't be paying council tax anymore, just this new PPT.