What is Amundsen Data Catalog?
Amundsen is an open source data discovery platform and metadata engine developed by the Lyft Engineering team. The Amundsen data catalog was built to improve the productivity and efficiency of data practitioners at Lyft.
It was open-sourced in October 2019, a year after launching in production. Since then, Amundsen has enjoyed an active community of users, many of whom have extended it to build their own data catalogs.
The main capabilities of Amundsen include:
- Easy data discovery
- Automated and curated metadata that powers a range of use cases
- Ability to share knowledge & context with coworkers
- Enabling learning from data usage
Amundsen Data Catalog Demo
Here's a hosted demo environment that should give you a fair sense of the Lyft Amundsen data catalog platform:
Sandbox (Coming Soon)
For a quick catch-up, also explore this video and the others in the channel, which features Amundsen data catalog demos, community meetings, conference presentations, and more.
Amundsen Data Catalog Toolkit
While you try the demo and go through the video resources to understand how the Amundsen data catalog works, you may also want to keep the following resources handy:
- Amundsen GitHub Repo
- Link to connect with the Lyft Amundsen community on Slack
- A quick view of Amundsen data catalog architecture
- Link to the official Amundsen website
Are you evaluating the Amundsen data catalog for querying, lineage, profiling, and other specific use cases? Trying the open source data catalog tool hands-on is an important step in this evaluation process. What other crucial steps should you take while evaluating a data catalog? Get hold of this checklist to stay on track!
Also interested in other open source data catalogs? Check out this compilation of the most popular open source data catalog tools to ensure you aren't missing out on any of them.