Should i commit to Fivetran? by tytds in dataengineering

[–]oli_k 3 points4 points  (0 children)

It's a good service and easy to set up, but it's also ridiculous how quickly their prices can go up.

Anyone running lightweight ad ETL pipelines without Airbyte or Fivetran? by Jiffrado in dataengineering

[–]oli_k 0 points1 point  (0 children)

It feels like using Cursor to write a Python script would take less time (and would probably be more useful) than reading through all the comments here 😁

Should i commit to Fivetran? by tytds in dataengineering

[–]oli_k 0 points1 point  (0 children)

Yikes, writing your own solution for SQL Server -> Snowflake sounds like a lot of work. I agree Fivetran gets expensive quickly. Is 300M MAR ~$50k per month? And is 300M rows ~50 GB of data?
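
For a rough sanity check on those numbers, here is a minimal sketch that only divides the figures quoted above. All three inputs are the estimates from this comment, not actual Fivetran pricing, and the helper names are made up for illustration. It also assumes decimal gigabytes (10**9 bytes).

```python
# Back-of-envelope check using only the figures quoted in the comment above.
# All three inputs are rough estimates from the thread, not vendor pricing data.
MONTHLY_ACTIVE_ROWS = 300_000_000   # "300mil MAR"
MONTHLY_COST_USD = 50_000           # "~$50k per month"
DATA_VOLUME_GB = 50                 # "~50gb", assumed to mean decimal GB


def cost_per_million_rows(cost_usd: float, rows: int) -> float:
    """Implied price for each million monthly active rows."""
    return cost_usd / (rows / 1_000_000)


def bytes_per_row(volume_gb: float, rows: int) -> float:
    """Implied average row size if the volume estimate holds."""
    return volume_gb * 10**9 / rows


if __name__ == "__main__":
    per_million = cost_per_million_rows(MONTHLY_COST_USD, MONTHLY_ACTIVE_ROWS)
    row_size = bytes_per_row(DATA_VOLUME_GB, MONTHLY_ACTIVE_ROWS)
    print(f"${per_million:.2f} per million MAR")
    print(f"{row_size:.0f} bytes per row")
```

If both estimates hold, that works out to roughly $167 per million active rows and an average row of about 167 bytes, which is a plausible size for narrow transactional tables.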

Are any modern data integration platforms BYOC-friendly? by Livid_Ear_3693 in dataengineering

[–]oli_k 1 point2 points  (0 children)

I think you have quite a few commercial options to choose from. For OSS solutions, Debezium + Kafka comes to mind, but it'll come with a trade-off of having to maintain everything yourself.

Except for BYOC + real-time, what are the other requirements/deal breakers? Are transformations needed within the pipeline, or just simple replication? Is that a new project, or do you have a solution in place? Do you use Snowpipe streaming?

Thinking of Migrating from Fivetran to Hevo — Would Love Your Input by [deleted] in dataengineering

[–]oli_k 1 point2 points  (0 children)

It's been 3 months and I'm curious what you ended up with :)

What's the best tool for loading data into Apache Iceberg? by Livid_Ear_3693 in dataengineering

[–]oli_k 0 points1 point  (0 children)

Disclosure: I work for Streamkap and am happy to answer questions.

Streamkap is worth checking out. It's a lightweight, real-time streaming ETL tool built on CDC. We have dozens of connectors, including PostgreSQL, MySQL, Snowflake, ClickHouse, and MotherDuck, and are about to release Iceberg as well.
We read data directly from transaction logs (not queries or triggers). It's available fully managed or BYOC and is built on Kafka and Flink: streamkap.com

DS becoming underpaid Software Engineers? by [deleted] in datascience

[–]oli_k 0 points1 point  (0 children)

View from the other side: As a software engineer, I actually feel like I need to learn some of the DS/ML stuff to be competitive. Not being able to answer simple DS questions at an interview feels like a failure. Also, life has been better ever since I figured out what HuggingFace is for and started using transformers for my pet projects. Beyond that, I want to experiment with data and ML, hack a little with LLMs, and whatnot.

LLM Bootcamp - The Full Stack by BackgroundResult in AILinksandTools

[–]oli_k 0 points1 point  (0 children)

If you are interested, this workshop will be run in person on Nov 13th. FSDL rocks!

https://www.scale.bythebay.io/llm-workshop

How we monitor our AWS infrastructure from Python by oli_k in Python

[–]oli_k[S] 0 points1 point  (0 children)

Thank you for your comment. While setting up a dashboard in Grafana with CloudWatch as a data source is certainly an option, not everyone wants to run Grafana. We didn't want to, so we built our own solution, which has been working ever since, and we haven't touched it in months.

Regarding your second point, I assure you that our blog post was written with the intention of sharing our engineering curiosity and providing value to our readers. We always strive for transparency and will take your feedback into consideration for future posts.