COST OPTIMIZATION

Clean up data landscape by removing duplicate assets

Prevent duplicate assets to reduce costs.

🔄 Tasks

Detect duplicates:
1. Analyze lineage: Use lineage to detect when a downstream asset that already exists is being created again.
2. Calculate differences: Compare the new asset with existing data assets using a data diff tool.
Recommend alternatives: Present the user with similar existing data assets as alternatives.

The user can then decide whether to reuse an existing (similar) data asset or continue with creating their own asset.
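The lineage step above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: it assumes lineage is available as a simple mapping from each asset to the set of upstream assets it is built from, and it treats "same upstream set" as the signal for a likely duplicate. The `find_lineage_duplicates` function and the asset names are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List

# Hypothetical lineage map: asset name -> upstream assets it is derived from.
Lineage = Dict[str, FrozenSet[str]]


def find_lineage_duplicates(new_upstreams: Iterable[str], lineage: Lineage) -> List[str]:
    """Return existing assets built from exactly the same upstream assets.

    Assumption: identical upstream sets suggest the new asset may duplicate
    an existing one. A real lineage system would likely also compare the
    transformations, not only the inputs.
    """
    target = frozenset(new_upstreams)
    return sorted(name for name, ups in lineage.items() if ups == target)


# Example lineage with made-up asset names.
lineage: Lineage = {
    "orders_daily": frozenset({"orders", "customers"}),
    "revenue_summary": frozenset({"orders"}),
}

candidates = find_lineage_duplicates({"customers", "orders"}, lineage)
print(candidates)
```

Assets returned here are only candidates. The data diff step would then confirm whether their contents actually overlap.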

For example, imagine a user writing a query that creates an output table. The workflow would use a data diff tool to compare that output table with existing tables. If the overlap is significant, the user could reuse the existing table instead. This reinforces the existing table as a data product and avoids creating a duplicate asset.
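As a rough sketch of the comparison in this example, the snippet below scores how much of a candidate output table already exists in each catalog table and recommends those above a threshold. It stands in for a real data diff tool, which is not named here. The row representation (tuples), the `row_overlap` and `recommend_alternatives` helpers, the table names, and the 0.8 threshold are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Tuple

Row = Tuple  # Assumption: each row is a hashable tuple of column values.


def row_overlap(candidate: Iterable[Row], existing: Iterable[Row]) -> float:
    """Fraction of the candidate's distinct rows that also appear in `existing`."""
    cand = set(candidate)
    if not cand:
        return 0.0
    return len(cand & set(existing)) / len(cand)


def recommend_alternatives(
    candidate: Iterable[Row],
    catalog: Dict[str, List[Row]],
    threshold: float = 0.8,  # Hypothetical cutoff for "significant" overlap.
) -> List[Tuple[str, float]]:
    """Return (table name, overlap) pairs at or above the threshold, best first."""
    cand = list(candidate)
    scores = [(name, row_overlap(cand, rows)) for name, rows in catalog.items()]
    matches = [(name, score) for name, score in scores if score >= threshold]
    return sorted(matches, key=lambda item: item[1], reverse=True)


# Made-up data: the query output and two existing tables.
output_table = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
catalog = {
    "customer_orders": [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e")],
    "customer_sample": [(1, "a")],
}

print(recommend_alternatives(output_table, catalog))
```

The overlap is measured relative to the candidate table, so an existing table that is a superset of the new output still scores 1.0. That matches the reuse scenario, where the existing table already contains everything the user needs.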

🎉 Outcome

Prevent duplicate assets to reduce costs.

COST OPTIMIZATION

Purge stale or unused assets

Maintain a clean data landscape.

COST OPTIMIZATION

Dynamic data pipeline optimization

Reduce unnecessary data processing and improve resource utilization.

COST OPTIMIZATION

Allocate compute resources dynamically

Improve resource utilization and reduce processing delays.
