On August 10th, dbt_artifacts v1.0.0 was published. We’re super excited about this release, not only because it greatly improves the package for users, but also because it represents a wonderful collaboration between Brooklyn Data engineers and others in the data community.
What is dbt_artifacts?
dbt_artifacts is a package for modeling a dbt project and its run metadata. It includes the following models to help you understand the current state of a dbt project and its performance over time.
- dim_dbt__current_models
- dim_dbt__exposures
- dim_dbt__models
- dim_dbt__seeds
- dim_dbt__snapshots
- dim_dbt__sources
- dim_dbt__tests
- fct_dbt__invocations
- fct_dbt__model_executions
- fct_dbt__seed_executions
- fct_dbt__snapshot_executions
- fct_dbt__test_executions
It has many use cases, from identifying flaky tests to finding the slowest-running models for performance optimization.
What makes v1 so great?
This release is a complete rewrite of the package. Instead of loading dbt's JSON artifact files, it uses the `graph` and `results` context variables that dbt makes available. This solves several issues from the pre-v1 releases:
- Overcomes the 16MB variant limit in Snowflake
- Uses the `on-run-end` hook, which always fires regardless of run result status. This mitigates an issue where dbt_artifacts would not run in dbt Cloud if previous steps had failed.
- Smooths the path for additional database support. This release includes support for Databricks, and support for BigQuery is already underway!
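As a rough sketch of what the hook-based approach looks like in practice, the snippet below wires the package into a project's `dbt_project.yml`. The `on-run-end` key and the `results` context variable are standard dbt; the macro name `upload_results` is an assumption here, so check the README of the version you install before copying it.

```yaml
# dbt_project.yml (fragment)
# Assumption: the package exposes a macro named `upload_results` that
# accepts dbt's `results` context variable. Verify against the README
# for your installed version.
on-run-end:
  - "{{ dbt_artifacts.upload_results(results) }}"
```

Because `on-run-end` fires even when earlier nodes fail, a hook like this should still record the invocation for failed runs.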
Version 1 is also significantly faster, since it no longer needs to process any JSON files and all of its models are now views.
If you're an existing dbt_artifacts user, there's a straightforward process for migrating to v1.
A huge thank you to the Brooklyn Data engineers who contributed to v1:
Have fun, and happy data modeling!