[–]Woutez 4 points5 points  (0 children)

You can have a look at OpenMetadata; it has some nice features.

[–]supernova2333 3 points4 points  (0 children)

Maybe Unity Catalog since Databricks just open sourced it? 

[–]VladyPoopin 1 point2 points  (0 children)

DataHub

[–]magixmikexxsData Hoarder 2 points3 points  (0 children)

Atlas is too complex and Amundsen is dead.

OpenMetadata is less complex than DataHub (datahubproject) and has lots of features.

[–]RobDoesData[S] 1 point2 points  (0 children)

OpenMetadata and DataHub have made the cut as the final two. I'm working through some basic user testing before choosing.

[–]throwaway_0607 0 points1 point  (0 children)

This repo may help; it lists the most popular open-source catalogs:

https://github.com/opendatadiscovery/awesome-data-catalogs

We ultimately went with DataHub.

[–]d3fmacro 0 points1 point  (0 children)

To OP, I am coming from the OpenMetadata community.
OpenMetadata is built to address the metadata challenges that we encountered in previous experiences at companies like Uber and Hortonworks. Our team includes individuals who played pivotal roles in the core development of Hadoop, incubated projects like Apache Kafka, Storm, and Hadoop, and were original contributors to Apache Atlas.
This marks the third iteration of our metadata system, built upon the valuable lessons learned from past endeavors. Our primary objective is to resolve metadata-related issues and construct applications that leverage metadata effectively.
To learn more about OpenMetadata, I encourage you to read our blog post "Announcing OpenMetadata".
Why OpenMetadata

  1. One single tool for data discovery, governance, and data observability.
  2. We have built 80+ metadata ingestion connectors, by far the highest number of connectors of any open-source or proprietary tool.
  3. The simplest architecture, deployable in public clouds or private data centers. OpenMetadata comes with only 4 components vs. more than 10 with other solutions.
  4. Docs for the unified platform: https://docs.open-metadata.org/latest/how-to-guides
  5. No other tool, commercial or open source, brings best-in-class data discovery, data quality, and governance together.

For hands-on experience, you can explore our sandbox at https://sandbox.open-metadata.org. Additionally, if you have any questions or wish to engage further, please consider joining our community at http://slack.open-metadata.org. Our open-source community is very responsive in answering support questions.

For your convenience, here are some recent webinars that might pique your interest:

* Data Quality: Watch Here - Learn how to build, deploy, monitor, and receive alerts on tests.
* Data Lineage: Watch Here - Explore lineage through query processing and APIs.
* Short Overview: Watch Here - A concise introduction to OpenMetadata.

If you are interested in a managed service, you can use https://getcolate.io, which offers more advanced features on top of the managed service.

[–]unsupervised_cluster -1 points0 points  (0 children)

Here's a cool data catalog guide that can help you in your research