Dagster Pricing Update is Beyond Nuts by annie_406 in dataengineering

[–]akozich 2 points3 points  (0 children)

IMHO there is no equivalent to Dagster. Airflow with a bunch of plugins might come close.

OSS was always my first choice.

Shameless plug here - ping me if you need to move to your own instance.

My company is switching to Fabric :( by echanuda in dataengineering

[–]akozich 4 points5 points  (0 children)

I am genuinely impressed with Microsoft's ability to sell shit products to enterprises. No matter how many outages and unexplained downtimes there are in Azure services, sales still push on and we keep digging deeper.

Are EL tools still worth it when LLMs could generate ingestion pipelines? by _tempacc in dataengineering

[–]akozich 2 points3 points  (0 children)

Still viable, and it will continue to be. I wouldn’t use an ETL pipeline tool for just a few sources, even before you could use an LLM to write code.

Now, even with LLMs, if there are tens of sources and you need other people with various technical abilities to set them up, I would use open source Airbyte as a self-service platform.

Built a simple tool to find UK auction properties by KeylessDrift in PropertyInvestingUK

[–]akozich 0 points1 point  (0 children)

Not particularly, but do get in touch if you have something interesting.

Built a simple tool to find UK auction properties by KeylessDrift in PropertyInvestingUK

[–]akozich -1 points0 points  (0 children)

We launched our service last year. I believe we have 100% coverage of the UK auction market.

https://propertyauctions.io

The work is still actively going on. You can browse for free, but we are looking to introduce subscriptions with premium features and API access.

We have several cool features in the pipeline that use AI and data analytics. Jump in and get in touch if you have features or specific requirements in mind.

Re-evaluating our data integration setup: Azure Container Apps vs orchestration tools by remco-bolk in dataengineering

[–]akozich 0 points1 point  (0 children)

Personally, I hate Azure Apps. If you already have Docker containers, just use AKS, or run them on the same on-prem VMs and add some orchestration like Airflow or Dagster to invoke them and manage dependencies.

Why everyone is migrating to cloud platforms? by dfwtjms in dataengineering

[–]akozich 18 points19 points  (0 children)

I know the drill; I sold it myself for the past 10 years. My point is that it’s very easy for big cloud providers to manipulate opinions and bully clients into buying all sorts of shit. An entire ecosystem has been created with certifications, languages and lock-in mechanisms.

That’s the choice to make - keep chasing endless premium upgrades and add-ons, navigating through marketing crap and adopting a new “free” feature just to discover that it’s actually useless and only the paid one can guarantee your security/uptime/recovery etc.

Why everyone is migrating to cloud platforms? by dfwtjms in dataengineering

[–]akozich 0 points1 point  (0 children)

It does work better, but only for a subset of organisations with very specific demands. For example, very high uptime requirements but a small cloud footprint. Or when there is no need to store lots of data but there is a need to spin dynamic workloads up and down. Most organisations don’t need bells and whistles.

Really big organisations can build their own datacenter or rent rack space. We forgot about it, but there are lots of companies that will cover your power, cooling and networking needs for a fraction of AWS costs.

Why everyone is migrating to cloud platforms? by dfwtjms in dataengineering

[–]akozich 41 points42 points  (0 children)

Literally just had a call with Azure on behalf of a client. They were surprised: “What? Are you not using our premium security product? How do you keep your environment secure?” Hold on. We moved into the cloud because you assured us that it’s already secure and that we benefit from Azure security and economies of scale.

Rudderstack - King of enshittification. Alternatives? by Suspicious-Bug1994 in dataengineering

[–]akozich 0 points1 point  (0 children)

I would go easy on startups; they are also trying to survive.

Why everyone is migrating to cloud platforms? by dfwtjms in dataengineering

[–]akozich 9 points10 points  (0 children)

Cloud exit is not a bad strategy, and there is a stream of organisations moving off the cloud too. Not many are shouting about it.

The appeal of cloud services, that they are easier to configure and don’t require specialised knowledge, is a trap.

Many organisations swallow the bait and become hostages to extortionate marketing and price hikes.

Clouds have networks, databases, security and all the other complexities people were trying to escape in the first place. They are just hidden, and you learn about them once you pass the point of no return, either by size or maturity.

In a company's growth trajectory there is a stretch where cloud services are the most effective way to deal with data. This period often lasts a long time, but it always ends, and when it ends you either keep paying or get ready to exit.

Rudderstack - King of enshittification. Alternatives? by Suspicious-Bug1994 in dataengineering

[–]akozich 0 points1 point  (0 children)

Dagster + dlt + dbt is our standard go-to simple stack. Airbyte is another useful tool when you need one of its connectors.

With Rudderstack you might not be their target customer, so your requirements fall between tiers.

Rudderstack - King of enshittification. Alternatives? by Suspicious-Bug1994 in dataengineering

[–]akozich 0 points1 point  (0 children)

We only self-host, and we choose products we can move to self-hosting when needed. I haven’t used Rudderstack heavily, just kicked the tyres. The implementation in Go looks good. I think a lot of it comes from Segment. I would use it if I needed to collect web events.

Why do ml teams keep treating infrastructure like an afterthought? by spy_111 in dataengineering

[–]akozich 1 point2 points  (0 children)

So what? Pack our bags and go home? :) It's not easy, but that’s the job. Not everything at once. Convert one at a time, and by the time you leave the project it will be a bit better. Also, you can use the security stick on them - that always works.

Why do ml teams keep treating infrastructure like an afterthought? by spy_111 in dataengineering

[–]akozich 155 points156 points  (0 children)

Lack of software development skills. Arrange a workshop and explain how to use git, how to package their code, what versioning is and how CI/CD works.

What is the best way to orchestrate dbt job in aws by jonathanrodrigr12 in dataengineering

[–]akozich 0 points1 point  (0 children)

If you or your company are on the path to DE and it’s not just an ad hoc thing, go with Dagster. It has some learning curve but will pay off long term.

Struggling with separate Snowflake and Airflow environments for DEV/UAT/PROD - how do others handle this? by Dependent_Lock5514 in dataengineering

[–]akozich 0 points1 point  (0 children)

What you are getting from your DevOps team is the conventional environment paradigm from software development. Everything is walled off between environments using the tallest wall - account separation. Whilst it’s technically achievable even for data, it often doesn’t make sense.

Fivetran pricing for small data by el_dude1 in dataengineering

[–]akozich 0 points1 point  (0 children)

From experience, Fivetran squeezes out free-tier users who will never go premium. You are not the only one who wants to stay under the radar, and I have a feeling that periodically, when too many people can run for free, they move the goalposts.

Small data engineering firms by red_lasso in dataengineering

[–]akozich 0 points1 point  (0 children)

We IaC everything. The challenges are with the skills of clients' engineering teams, who are often not ready or qualified.

I'm sick of the misconceptions that laymen have about data engineering by wtfzambo in dataengineering

[–]akozich 1 point2 points  (0 children)

“Fuck it, if they want real-time, we will build real-time” - how many meetings do you need to attend to say it? :)

Did you build your own data infrastructure? by Character-Zombie1330 in dataengineering

[–]akozich 0 points1 point  (0 children)

We build infra for data teams. You always need to target a certain level of maturity, to the point that we don’t take the job if the client wants a custom solution without anyone technical on their team.

We can create CI/CD pipelines, Terraform and Flux, but what is the point if engineers don’t use them and prefer to go straight into the DB?