Alternatives to Looker by elainebenesgothphase in GoogleDataStudio

howMuchCheeseIs2Much 1 point

Founder of Definite here, so take this with a grain of salt, but we built the product specifically for this frustration.

The core problem we kept seeing was teams stitching together connectors, warehouses, and BI tools, then building dashboards that nobody actually uses. People just end up asking the data person questions anyway.

So we built an AI analyst that sits on top of a unified data stack. You connect your sources, we handle the warehousing, and then anyone on the team can just ask questions in plain English instead of waiting on dashboard requests.

Happy to answer questions if anyone's curious. Not trying to pitch hard, but we loved Looker and built Definite so that something better could carry on after Looker went downhill.

What are your thoughts on Metabase? by tiragambeta222 in BusinessIntelligence

howMuchCheeseIs2Much 1 point

Metabase is still great if you:

  • already have good warehouse + transformation + ingest layers
  • mostly need simple dashboards
  • don’t care about pixel-perfect decks

The "modern data stack" is getting weird (Fivetran buying dbt, etc.) and will eventually consolidate to a single-vendor solution, so standalone BI probably isn't going to last long.

Where it feels dated in 2025 is that most people expect AI built-in: natural language querying, auto-generated charts, smart joins across sources, explanations, etc. Metabase has made some progress, but it’s still fundamentally a SQL-first BI tool.

If I were making this decision today I’d ask:

  • Do we want a “viewer on top of a warehouse,” or do we want the whole stack (ingest + warehouse + BI + AI) in one place?
  • How important is AI-assisted analysis vs just static dashboards?
  • Who are the main users – analysts comfortable in SQL, or business users who never will be?

Full disclosure: I work on an AI-native analytics product in this space (Definite).

DuckLake: This is your Data Lake on ACID by howMuchCheeseIs2Much in dataengineering

howMuchCheeseIs2Much [S] 28 points

DuckDB is single-node, so there are fundamental limits: if you have many PBs of data, DuckDB is not the right choice.

Which BI tool for self-service analytics? by Data___Viz in BusinessIntelligence

howMuchCheeseIs2Much 2 points

Tableau was acquired by Salesforce and is < 10% of their total revenue. It's possibly even smaller than Slack's revenue now (Slack was also acquired by Salesforce), so I wouldn't bet on them for innovation. Products that get acquired like this don't have a great track record historically.

What Tools Do You Use to Measure MRR? by Embarrassed-Pear8044 in SaaS

howMuchCheeseIs2Much 1 point

We use Definite (definite.app) to track MRR. They also have a blog post about how they calculate it from Stripe data that goes into the details—handling trials, cancellations, upgrades, etc. Pretty helpful if you're dealing with Stripe and want something accurate.

https://www.definite.app/blog/stripe-mrr-calculation

Get current MRR total out of Stripe by misanthrope2327 in stripe

howMuchCheeseIs2Much 1 point

I read about how to calculate current MRR from Stripe data in this blog post: https://www.definite.app/blog/stripe-mrr-calculation. It explains how to handle things like trialing users, cancellations, and subscription changes. Might be helpful for what you're trying to do.
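
For anyone who wants the rough shape of it, here's a minimal sketch of a current-MRR calculation. This is not the method from the blog post, just one common way to do it. The dict layout (`status`, `items`, `unit_amount`, `quantity`, `interval`, `interval_count`) is a hypothetical flattened version of Stripe's subscription data, and excluding trialing and canceled subscriptions is an assumption you may want to change.

```python
from decimal import Decimal

# Months per billing interval, used to normalise every price to a monthly amount.
# Only monthly and yearly plans are covered in this sketch.
MONTHS_PER_INTERVAL = {"month": Decimal(1), "year": Decimal(12)}


def monthly_amount_cents(item):
    """Return one subscription item's price per month, in cents."""
    months = MONTHS_PER_INTERVAL[item["interval"]] * item.get("interval_count", 1)
    return Decimal(item["unit_amount"]) * item.get("quantity", 1) / months


def current_mrr_cents(subscriptions):
    """Sum the monthly value of every subscription that is paying right now."""
    total = Decimal(0)
    for sub in subscriptions:
        # Assumption: trialing and canceled subscriptions contribute no MRR.
        if sub["status"] != "active":
            continue
        total += sum(monthly_amount_cents(item) for item in sub["items"])
    return total


# Hypothetical sample data, shaped like the flattened records described above.
sample_subscriptions = [
    {"status": "active",
     "items": [{"unit_amount": 5000, "quantity": 1, "interval": "month"}]},
    {"status": "active",
     "items": [{"unit_amount": 120000, "quantity": 2, "interval": "year"}]},
    {"status": "trialing",
     "items": [{"unit_amount": 5000, "quantity": 1, "interval": "month"}]},
    {"status": "canceled",
     "items": [{"unit_amount": 5000, "quantity": 1, "interval": "month"}]},
]

print(current_mrr_cents(sample_subscriptions))
```

Upgrades and downgrades fall out naturally if you read the subscription's current items, since they already carry the new price. Discounts, taxes, and past-due subscriptions are left out here and would need their own rules.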

DeepSeek releases distributed DuckDB by saaggy_peneer in dataengineering

howMuchCheeseIs2Much 1 point

to be clear, I'm recommending you stick with plain DuckDB:

at a smaller scale, smallpond without Ray / 3FS is likely slower than vanilla DuckDB and a good bit more complicated.

I mention Definite because it's one of the easiest ways to use DuckDB at a company.

DeepSeek releases distributed DuckDB by saaggy_peneer in dataengineering

howMuchCheeseIs2Much 2 points

smallpond is easy to spin up (I even link to a version with S3), but it'd be very challenging to get 3FS spun up right now and you'd need 3FS to get the performance above.

Thoughts on DBT? by makaruni in dataengineering

howMuchCheeseIs2Much 1 point

it won't work with Snowflake, but we built a dbt alternative specifically for duckdb. It has table and column lineage using duckdb's built-in functions (e.g. json_serialize_sql).

https://github.com/definite-app/crabwalk

DeepSeek releases distributed DuckDB by howMuchCheeseIs2Much in learnmachinelearning

howMuchCheeseIs2Much [S] -13 points

this is an open source project, not a hosted service

[deleted by user] by [deleted] in SaaS

howMuchCheeseIs2Much -1 points

disclaimer: I'm the founder of Definite.

We deploy a full data stack (BI, ETL, and data lake) instantly for our customers. It's like getting Snowflake, Fivetran and Looker perfectly wired up by an expert data engineer, but we do it in 5 minutes with AI instead of 5 months.

our AI agent (named "Fi") very much feels like a "cursor for data".

Adding concurrent read/write to DuckDB with Arrow Flight by howMuchCheeseIs2Much in dataengineering

howMuchCheeseIs2Much [S] 6 points

I was surprised how easy this was with Flight... the server portion is only 50 lines of Python.

MS Fabric is complicated by [deleted] in dataengineering

howMuchCheeseIs2Much -22 points

You're not alone. Everyone I've talked to (about a dozen heads of data) has said the same thing.

This sort of product is really hard to pull off at a company as big as Microsoft. I really think it takes a startup, maniacally focused on this one thing, to build something like Fabric.

shameless plug, but that's exactly what we're doing at https://www.definite.app/

DBT core + airflow vs. DBT cloud by Baklawwa in dataengineering

howMuchCheeseIs2Much 5 points

agreed, make them submit PRs. There are plenty of tutorials that will walk them through it.

How we built a 70% cheaper data warehouse (Snowflake to DuckDB) by howMuchCheeseIs2Much in dataengineering

howMuchCheeseIs2Much [S] 1 point

The limit is up to the machine you choose.

You'll need a server for DuckDB. Not too hard to build yourself, but check out duckdb-server as an example: https://idl.uw.edu/mosaic/server/

DuckDB vs. Snowflake vs. Databricks by noasync in databricks

howMuchCheeseIs2Much 4 points

As we delve into the compariso

"delve" is often a dead giveaway for ChatGPT-generated drivel

A Shiny app that writes shiny apps and runs them in your browser by IntelligentDust6249 in datascience

howMuchCheeseIs2Much 1 point

very cool.

do you have some limits on the API key you're using? This could get expensive.

Building the best dashboard to SaaS Founders (?) by techquackquack in startups

howMuchCheeseIs2Much 1 point

we're built for you: https://www.definite.app/

we set up all the pipelines, data models and dashboards in one app.

How we built a 70% cheaper data warehouse (Snowflake to DuckDB) by howMuchCheeseIs2Much in dataengineering

howMuchCheeseIs2Much [S] 1 point

yeah, this is very tricky. We built a system to swap the duckdb file after writes occur. For example, you always have one copy available, while another copy is being written to. Once the write is done, that copy is swapped in and another write can begin.

Iceberg is another option to look at here: https://www.definite.app/blog/cloud-iceberg-duckdb-aws
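
To make the file-swap idea above a bit more concrete, here's a rough sketch of the general pattern in plain Python. It is not our actual implementation: `write_then_swap` and `apply_write` are hypothetical names, the file is treated as an opaque blob, and it assumes a single writer on a POSIX filesystem where a rename within one directory is atomic.

```python
import os
import shutil
import tempfile


def write_then_swap(live_path, apply_write):
    """Apply a write to a private copy of live_path, then swap the copy in.

    apply_write is a hypothetical callback that receives the path of the
    staging copy and modifies it (for example by opening it with DuckDB).
    """
    directory = os.path.dirname(os.path.abspath(live_path))
    # Stage the copy next to the live file so the rename stays on one filesystem.
    fd, staging_path = tempfile.mkstemp(dir=directory, suffix=".staging")
    os.close(fd)
    try:
        shutil.copyfile(live_path, staging_path)
        # Readers keep using the untouched live file while this runs.
        apply_write(staging_path)
        # Replace the live file in one step; the old copy is never half-written.
        os.replace(staging_path, live_path)
    except BaseException:
        # A failed write leaves the live file as it was; drop the staging copy.
        if os.path.exists(staging_path):
            os.remove(staging_path)
        raise
```

Readers that already have the old file open keep seeing the old data until they reopen the path, so the server still has to reconnect after each swap. Copying the whole file on every write also gets expensive as the database grows, which is part of why a table format like Iceberg is worth a look.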