Top Gradient

Feature overview

The SELECT dbt integration allows you better understand and optimize Snowflake spend attributed to dbt.

  • Dissect your dbt spend across a variety of different dimensions such as project, target, materialization and resource type.
  • Identify expensive models and tests, and understand how their cost and performance varies over time.
  • Drill into the execution history of any dbt resource and view query profile diagnostics for each run.
  • Receive tailored optimization recommendations to improve model cost and performance

dbt model timing chart

With SELECT's dbt integration, you can also understand the performance of each dbt run with our model timing chart, located on the Timeline tab of each dbt run page:

SELECT Snowflake dbt model run timing chart

A great feature of this chart is you can click straight into a given model's run to understand why it took the time it did using our version of the Snowflake query profile:

dbt runs view

Understand the performance and cost of your dbt project over time using the dbt runs view. Filter for specific commands, targets, users and more. Click into individual runs in the table or chart to view more detail and access the timing chart.

Setup

SELECT uses query tags and comment metadata to power this integration. To enable this integration, follow the configuration steps below.

Step 1: Update packages.yml

Add the following entry to your dbt project's packages.yml file in the root of the dbt project, creating the file if it doesn't exist:

1packages:
2 - package: get-select/dbt_snowflake_query_tags
3 version: [">=2.0.0", "<3.0.0"]

Step 2: Adding the query tags

Option 1: If running dbt >= 1.2, simply configure the dispatch search order in dbt_project.yml.

1dispatch:
2 - macro_namespace: dbt
3 search_order:
4 - <YOUR_PROJECT_NAME>
5 - dbt_snowflake_query_tags
6 - dbt

Option 2: If running dbt < 1.2, create a folder named macros in your dbt project's top level directory (if it doesn't exist). Inside, make a new file called query_tags.sql with the following content:

1{% macro set_query_tag() -%}
2{% do return(dbt_snowflake_query_tags.set_query_tag()) %}
3{% endmacro %}
4
5{% macro unset_query_tag(original_query_tag) -%}
6{% do return(dbt_snowflake_query_tags.unset_query_tag(original_query_tag)) %}
7{% endmacro %}

Step 3: Configure the query comments

Add the following config to dbt_project.yml.

1query-comment:
2 comment: '{{ dbt_snowflake_query_tags.get_query_comment(node) }}'
3 append: true # Snowflake removes prefixed comments.

That's it! All dbt-issued queries will now be tagged.

Example dbt query metadata

This configuration adds metadata describing each dbt-issued query to a tag on the query. Once enabled, SELECT will begin to provide insights, analytics and recommendations on your dbt project. An example of the metadata added to each query is shown below:

1{
2 "dbt_snowflake_query_tags_version": "2.0.0",
3 "app": "dbt",
4 "dbt_version": "1.4.0",
5 "project_name": "my_project",
6 "target_name": "dev",
7 "target_database": "dev",
8 "target_schema": "larry_goldings",
9 "invocation_id": "c784c7d0-5c3f-4765-805c-0a377fefcaa0",
10 "node_name": "orders",
11 "node_alias": "orders",
12 "node_package_name": "my_project",
13 "node_original_file_path": "models/staging/orders.sql",
14 "node_database": "dev",
15 "node_schema": "mart",

Step 4: All done!

That's all that's required for setup. The next time your dbt project runs, the generated queries will automatically be tagged with this metadata. SELECT will then process this query metadata and present it back to you in the web application.

Leveraging dbt meta & tags

Consider the example dbt_project.yml below, which implements a combination of dbt meta fields and tags.

1# dbt_project.yml
2name: 'select_dbt'
3version: '1.0.0'
4config-version: 2
5
6models:
7 +meta:
8 # Set defaults for all models
9 team: 'data'
10 owner: "@james.fischer"
11 business_owner: "sales_team"
12 model_maturity: "production"
13 refresh_frequency: "daily"
14 domain: "customer_analytics"
15 contains_pii: true

The file structure for this dbt project would look something like what is shown below. All resources under the engineering directory would be get a team meta field added with the value engineering. Similarly, the one model in the forecasts directory would get a team field of finance added. It would also have tags: ["sensitive", "critical"].

1.
2├── dbt_project.yml
3├── packages.yml
4└── models/
5 ├── engineering/
6 | ├── syncs.sql
7 | └── ai.sql
8 ├── marketing/
9 | ├── attribition.sql
10 | └── page_clicks.sql
11 └── forecasts/
12 └── forecast.sql

If you would like to set meta per model instead of per folder or "resource path", here is how you would do that:

1models:
2 - name: customers
3 description: 1 row per customer.
4 config:
5 meta:
6 team: "data_team"
7 business_owner: "sales_team"
8 domain: "customer_analytics"
9 model_maturity: "production"
10 refresh_frequency: "daily"
11 contains_pii: true
12 columns:
13 - name: customer_id
14 description: The unique key.
15 data_tests:

To expose dbt meta fields in SELECT, head to the dbt settings and specify which fields you'd like to filter on. No such specification is required for dbt tags.

SELECT dbt meta keys settings

Once added, you can filter and slice by this metadata throught the UI!

Looking to do cost chargebacks or showbacks using dbt metadata? You can now flexibly allocate costs using dbt meta fields and tags you've specified using SELECT's Usage Groups feature!

Accessing the dbt query metadata

If you would like to access the dbt query metadata created by our dbt package, you have two options.

Option 1: Snowflake Query History

You can access the dbt query metadata by parsing the query comments/tags our package adds by querying the Snowflake query history. Here is an example SQL script you can run in your account:

1with
2dbt_queries as (
3 select
4 start_time,
5 query_text,
6 execution_time/1000 as execution_time_s,
7 try_parse_json(query_tag) as dbt_query_tag_meta,
8 try_parse_json(regexp_substr(query_text, $$\/\*\s({"app":\s"dbt".*})\s\*\/$$, 1, 1, 'ie')) as dbt_comment_meta
9 from snowflake.account_usage.query_history
10 where
11 dbt_query_tag_meta is not null
12 and dbt_comment_meta is not null
13 and start_time > current_date - 7
14)
15select

Option 2: Via a data share

All data visible in the SELECT web application can be accessed through a secure data share, including a table named dbt_queries which includes all of the metadata (and more) shown above. See our data sharing documentation for more details.

Get up and running in less than 15 minutes

Connect your Snowflake account and instantly understand your savings potential.

CTA Screen