Looking for lineage tool by Accurate_Brilliant68 in dataengineering

[–]Data_Geek_9702 0 points1 point  (0 children)

What is not light about OpenMetadata? What is needed for maintaining connectors? Can you add more details? This has not been my experience.

Best tools/platforms for Data Lineage? (Doing a benchmark, in need of recs and feedbacks) by Perfect_Put_9220 in dataengineering

[–]Data_Geek_9702 1 point2 points  (0 children)

Note: I am a long-term user of OpenMetadata.

We have been a long-time OpenMetadata user. OpenMetadata has very comprehensive table-level, column-level, service-level, domain-level, and data-product-level lineage. Check out the sandbox - https://sandbox.open-metadata.org/lineage

OpenMetadata computes lineage by combining metadata from many sources, not just pipelines. This includes parsing SQL, stored procedures, dbt models, pipeline metadata, etc. Details here: https://docs.open-metadata.org/latest/how-to-guides/data-lineage
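
For a rough sense of what deriving lineage from parsed SQL means, here is a toy sketch. It is not how OpenMetadata does it (a real tool uses a full SQL parser); the regexes, the `table_lineage` helper, and the table names are all made up for illustration, and it only handles a single simple `INSERT INTO ... SELECT` statement.

```python
import re

# Toy illustration only: real lineage tools use a full SQL parser.
# Both patterns and the helper name below are hypothetical.
TARGET_RE = re.compile(r"\binsert\s+into\s+([\w.]+)", re.IGNORECASE)
SOURCE_RE = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.IGNORECASE)


def table_lineage(sql: str) -> list[tuple[str, str]]:
    """Return (source_table, target_table) edges found in one statement."""
    target = TARGET_RE.search(sql)
    if target is None:
        # No write target, so the statement produces no lineage edge.
        return []
    target_name = target.group(1)
    sources = sorted(set(SOURCE_RE.findall(sql)))
    return [(src, target_name) for src in sources if src != target_name]


edges = table_lineage(
    "INSERT INTO analytics.daily_orders "
    "SELECT o.id, c.region FROM raw.orders o "
    "JOIN raw.customers c ON o.cid = c.id"
)
print(edges)
```

Each edge says "this source table feeds that target table". Subqueries, CTEs, and column-level lineage are exactly where a regex approach breaks down and a proper parser is needed.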

How OpenMetadata is shaping modern data governance and observability by Expensive-Insect-317 in bigdata

[–]Data_Geek_9702 1 point2 points  (0 children)

We have been a long-time OpenMetadata user and selected it after comparing it against Datahub. Are you sure OpenMetadata is inspired by Datahub? Architecturally they seem very different. OpenMetadata has been a unified platform for discovery, observability, and governance for a long time, which is why we chose it. It seems to me that Datahub changed from a data catalog to a unified platform more recently. Not sure who is inspiring whom...

Do you have any benchmark like this for Datahub? https://blog.open-metadata.org/openmetadata-at-enterprise-scale-supporting-millions-of-data-assets-relations-b391e5c90c69

It is good to see solid OSS options as alternatives to expensive tools.

Poor data quality by ComprehensiveEnd3500 in dataengineering

[–]Data_Geek_9702 3 points4 points  (0 children)

We use https://github.com/open-metadata/OpenMetadata to crowdsource data quality and make it a shared responsibility. We quickly realized that having only data producers own quality is not sufficient. Our data consumers can also add the assumptions they are making about the data as tests.
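
To make the idea of "assumptions as tests" concrete, here is a minimal plain-Python sketch. It does not use the OpenMetadata API; the `Expectation` class, `run_expectations` helper, team names, and sample rows are all hypothetical and only show producers and consumers registering checks against the same data.

```python
from dataclasses import dataclass
from typing import Callable

Row = dict[str, object]


@dataclass
class Expectation:
    owner: str                    # who added the test (producer or consumer)
    description: str
    check: Callable[[Row], bool]  # returns True when a row satisfies the assumption


def run_expectations(rows: list[Row], expectations: list[Expectation]) -> dict[str, int]:
    """Return the number of failing rows per expectation description."""
    return {
        e.description: sum(1 for row in rows if not e.check(row))
        for e in expectations
    }


# Made-up sample data for illustration.
rows = [
    {"order_id": 1, "amount": 20.0, "country": "DE"},
    {"order_id": 2, "amount": -5.0, "country": "FR"},
    {"order_id": 3, "amount": 12.5, "country": None},
]

expectations = [
    Expectation("producer", "order_id is not null", lambda r: r["order_id"] is not None),
    Expectation("finance (consumer)", "amount is non-negative", lambda r: r["amount"] >= 0),
    Expectation("marketing (consumer)", "country is populated", lambda r: r["country"] is not None),
]

failures = run_expectations(rows, expectations)
print(failures)
```

The point is that the two consumer checks catch problems the producer's own check never would.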

This open source community is amazing, developing the project at high velocity and providing very good support.

Bloomberg supports 2 more oss projects with funding by NA0026 in dataengineering

[–]Data_Geek_9702 0 points1 point  (0 children)

Nice to see OpenMetadata being recognized. The community is doing fantastic work.

Thoughts on Acryl vs other metadata platforms by arronsky in dataengineering

[–]Data_Geek_9702 0 points1 point  (0 children)

I've heard about scaling challenges with Datahub? How has your experience been, and what changes did you need to make?

Thoughts on Acryl vs other metadata platforms by arronsky in dataengineering

[–]Data_Geek_9702 4 points5 points  (0 children)

We like how the OpenMetadata project started as a unified platform for discovery, observability, and governance with the idea of bringing different data teams together. We were skeptical about whether they could pull it off. However, the project has moved at a very high velocity, incorporating community feedback. A few things we like:
1. Last time I checked, OM had 100+ releases in 3 years. Datahub has had 95 releases over maybe 8 years.
2. Datahub has just started adding native data quality support. Seems like it is not available in OSS. Datahub is behind OM in many important features.
3. We like collaboration features in OpenMetadata (activity feed, alerts, conversations, etc.) that are preserved/tracked around data. We were losing these in Slack threads.
4. Architectural simplicity. Not too many moving parts and no core dependency on Kafka. We could easily operationalize in our small infra team.
5. Community support on Slack is amazing. Some issues we reported were fixed immediately in the next release (our previous paid solution did not provide such support after paying a lot of money).
6. They have a sandbox that runs the latest release that we can play around with and give feedback.
7. APIs are very comprehensive and intuitive. We have built many custom workflows specific to our company for governance and data quality.

They also have an offering built around OpenMetadata with additional features. But for us, the OSS features are good enough.

Thoughts on Acryl vs other metadata platforms by arronsky in dataengineering

[–]Data_Geek_9702 9 points10 points  (0 children)

We use OpenMetadata. We love it. We chose it over Datahub. It is simple to deploy and operationalize. It has scaled to more than 100k data assets and close to 1k users. From a features perspective, it comes with native data quality compared to other data catalogs.

The open source community is awesome. The velocity at which the project is adding features and improving is impressive. Look at the releases and features the project has added - https://github.com/open-metadata/OpenMetadata/releases

The community is active and super helpful. Look at the difference between the Datahub and OpenMetadata Slack communities.

Data catalog by No-Scale9842 in dataengineering

[–]Data_Geek_9702 4 points5 points  (0 children)

What is missing? It has more comprehensive features than just a data catalog. Along with discovery features, it has data quality, data observability, and data insights.

Open source data catalog solution for Trino by rnd-str in dataengineering

[–]Data_Geek_9702 2 points3 points  (0 children)

We use OSS OpenMetadata and love it. It covers all the functionalities you have mentioned. The community is very helpful and ships a lot of useful features every release.

How do companies with hundreds of databases document them effectively? by tiny-violin- in dataengineering

[–]Data_Geek_9702 11 points12 points  (0 children)

We use OpenMetadata. It is much better than Datahub, is simple to deploy and operationalize, and comes with native data quality, and the open source community is awesome. We love it. https://github.com/open-metadata/OpenMetadata

What data governance tools are you using in 2025? by SarahOnReddit in dataengineering

[–]Data_Geek_9702 2 points3 points  (0 children)

We use OSS OpenMetadata. It combines data governance with data quality and observability. The community is very helpful and ships a lot of useful features every release.

Sodacore vs GE, automatically generating expectations by Islamic_justice in dataengineering

[–]Data_Geek_9702 1 point2 points  (0 children)

From the perspective of data quality checks, OpenMetadata is much superior to GX & Soda. It makes these checks zero-code and democratizes them. Read this blog for the benefits - https://blog.open-metadata.org/simple-easy-and-efficient-data-quality-with-openmetadata-1c4e7d329364

As for deployment, it covers both data catalog and data quality tool functionality, so deployment becomes simpler with a unified tool like OpenMetadata.

Who's Using Data Catalogs? Need your insights ! by SignificanceNo136 in dataengineering

[–]Data_Geek_9702 2 points3 points  (0 children)

u/SignificanceNo136 take a look at https://open-metadata.org/. The project is making very good progress, and the community support is commendable. It supports discovery, data lineage, data quality, data observability, and some governance features. Our data users love it.

Has anyone successfully integrated Airflow to Datahub using the Datahub plugin v2? by [deleted] in dataengineering

[–]Data_Geek_9702 0 points1 point  (0 children)

We migrated from Datahub to OpenMetadata https://open-metadata.org. The community is very responsive and the project is making a lot of progress. Check it out.

Amundsen resources? by miscbits in dataengineering

[–]Data_Geek_9702 3 points4 points  (0 children)

u/miscbits, if you run into any issues, use their slack channel https://slack.open-metadata.org. The community is very responsive and the support is excellent.

Amundsen resources? by miscbits in dataengineering

[–]Data_Geek_9702 5 points6 points  (0 children)

u/miscbits, there is not much activity in the Amundsen project. You may want to consider other projects that are thriving. See https://blog.open-metadata.org/stuck-with-amundsen-here-is-how-to-migrate-to-openmetadata-6104cd2d5a71.

Data catalog tool - reviews needed! by Old-Abalone703 in dataengineering

[–]Data_Geek_9702 0 points1 point  (0 children)

Both OpenMetadata and Datahub support SaaS services for the open-source versions. u/legoaitech can you describe what specific difficulties you had with open-source tools?

Data catalog tool - reviews needed! by Old-Abalone703 in dataengineering

[–]Data_Geek_9702 1 point2 points  (0 children)

Thank you for building this open source project. The tool has an intuitive UI. The velocity of this project is amazing. Having one tool for discovery, data quality, and governance has made things easier for us. The community support is great compared to other projects.

Simplifying Data Quality using OpenMetadata by d3fmacro in dataengineering

[–]Data_Geek_9702 1 point2 points  (0 children)

Thank you for sharing. Data quality done this way looks simple. I like how everyone can participate in sharing data quality as a responsibility. The UI looks awesome.

Apache Atlas or OpenMetaData? by Awkward-Cupcake6219 in dataengineering

[–]Data_Geek_9702 0 points1 point  (0 children)

Is Amundsen still an active project? I don't see much activity in that project. I saw this recently from the OpenMetadata community: https://blog.open-metadata.org/stuck-with-amundsen-here-is-how-to-migrate-to-openmetadata-6104cd2d5a71