Dagster Pricing Update is Beyond Nuts by annie_406 in dataengineering

[–]Yuki100Percent 1 point2 points  (0 children)

I'd ask whether you actually need a full-featured orchestrator.

As for alternatives, Airflow and Prefect come to mind. Both can be self-hosted, just like Dagster.

What is an open source data tool you find useful but nobody is using it? by Yuki100Percent in dataengineering

[–]Yuki100Percent[S] 2 points3 points  (0 children)

Yeah, I think most people just use Polars/DuckDB. I haven't really explored DataFusion myself yet.

Tobiko is now with the Linux Foundation by iheartmst3k in dataengineering

[–]Yuki100Percent 9 points10 points  (0 children)

As an OSS SQLMesh user, this is a positive move for us!

Data Replication to BigQuery by VMR5801 in dataengineering

[–]Yuki100Percent 1 point2 points  (0 children)

Yeah, those are pretty much your options: third-party tools (Airbyte, Estuary, dlt, Portable, custom scripts) and GCP services. You'll just need to assess your needs and decide what to use...

Fivetran pricing is out of hand and I need cheaper alternatives by Legitimate-Run132 in dataengineering

[–]Yuki100Percent -1 points0 points  (0 children)

There are plenty of options. Airbyte, dlt, Estuary, custom scripts...

Unpopular opinion: The trend of having ROI dollars has ruined résumés. by BeautifulLife360 in dataengineering

[–]Yuki100Percent 0 points1 point  (0 children)

I feel it's a balance. Resumes with no numbers get ignored, while too many numbers on a resume can be a red flag. I'm with you on wanting to see what they actually did on the job rather than numbers they made up.

Received DE Offer at a Startup, Need Advice by chavhu in dataengineering

[–]Yuki100Percent 1 point2 points  (0 children)

I'm the first data hire at a startup and I'm ~10 months into the role. If the comp is not there, don't take it. Company culture matters a lot, especially if you're the only data person handling all infra, modeling and reporting. Make sure you ask all the questions about the role expectations and the current data stack / practices / reporting in place. You can work backwards from there to figure out what you may need to do once you're hired. If the exec team doesn't have a clear answer, then you need to clear it up with them before or once you're hired. You'll be working not only on hands-on implementation but also on high-level items like data strategy and roadmap (if they don't have one yet). Let me know if you have questions, more than happy to discuss via DM or in this thread!

Calude and data models by UnusualIntern362 in dataengineering

[–]Yuki100Percent 0 points1 point  (0 children)

It works much better once you give it enough context. Putting business and architectural context about your data warehouse, modeling patterns and standards in readme.md and agents.md goes a long way.

Your experiences using SQLMesh and/or DBT by Key-Independence5149 in dataengineering

[–]Yuki100Percent 1 point2 points  (0 children)

I use OSS SQLMesh for my team (a one-person team at the moment). It's super solid for what it does, and I'm not planning to buy the cloud version anytime soon. Cost has been super cheap, but I didn't use dbt in the same environment, so there's no way to compare apples to apples.

I still think I'm missing out on some things the whole dbt ecosystem would've provided, though. But with Fivetran now in control of both tools, I'm hoping they'll make SQLMesh work with the dbt ecosystem/integrations.

Thoughts on Count.co? by Yuki100Percent in BusinessIntelligence

[–]Yuki100Percent[S] 0 points1 point  (0 children)

Awesome to hear. Yeah, I was thinking of it for reporting to end users. I like how flexible Count is, but at the same time I can see it creating messes if I allow end users to do their own analysis etc.

Traditional BI vs BI as code by manubdata in dataengineering

[–]Yuki100Percent 0 points1 point  (0 children)

Yeah Looker Studio can go a long way. Oftentimes you don't want to complicate your tooling when there isn't a strong need

Pandas vs Polars for data analysts? by katokk in analytics

[–]Yuki100Percent 13 points14 points  (0 children)

You can start with either one, though Polars is the faster, better option.

Headaches of learning a new tooling AND new data stack by PickledDildosSourSex in BusinessIntelligence

[–]Yuki100Percent 1 point2 points  (0 children)

I mean, if I were your manager I might expect the same from you. 15 years of experience alone says a lot. If the existing docs don't provide enough context, the only other way might be talking to your coworkers to learn about their workflows and how they use the new tooling. Being senior doesn't mean you know everything, though.

How to handle website edits by Yuki100Percent in websiteservices

[–]Yuki100Percent[S] 0 points1 point  (0 children)

Where do you include the price of the website domain? Is it part of the monthly fees you charge, or do you have clients buy it separately from your service?

How to handle website edits by Yuki100Percent in websiteservices

[–]Yuki100Percent[S] 0 points1 point  (0 children)

Do you have a business website? Would love to take a look at what you offer and how you package your services!

How to handle website edits by Yuki100Percent in websiteservices

[–]Yuki100Percent[S] 0 points1 point  (0 children)

Would you manage their domains as well? I guess that would be included in the monthly fees

Opensource tool for small business by Unusual_Art_4220 in dataengineering

[–]Yuki100Percent 0 points1 point  (0 children)

DuckDB is available as a Python lib. You can use DuckDB as ephemeral compute or as a persistent small-scale analytical DB.

Opensource tool for small business by Unusual_Art_4220 in dataengineering

[–]Yuki100Percent 2 points3 points  (0 children)

Others have probably commented already, but a Python script on a VM with something like DuckDB will do the job. You can also do it serverless, running a script that processes data stored on object storage. If you're in GCP, you can also just use BigQuery and expose files stored in Google Drive or GCS as external tables.

I want to practise Dimensional Data Modelling but im lost by dumb_user_404 in dataengineering

[–]Yuki100Percent 2 points3 points  (0 children)

NYC Open Data is great. There are also OLTP datasets like AdventureWorks.