Integrations

Custom Workloads

What are custom workloads?

Many Snowflake users leverage the platform for unique workloads, such as custom ETL pipelines, machine learning applications, or analytics applications. It's important for users to monitor these workloads to understand their cost and performance over time.

Feature overview

The SELECT custom workload integration allows you better understand and optimize Snowflake spend associated with these unique workloads.

  • Easily dissect your workload spend across a variety of different dimensions such as environment, team and tenant_id.
  • Identify expensive workloads, and understand how their cost and performance varies over time.
  • Drill into the execution history of any workload and view query profile diagnostics for each run.
  • Receive tailored optimization recommendations to improve workload cost and performance
SELECT Looker integration dashboard overview

Integrating with custom workloads

To support custom workload monitoring (i.e. queries run by custom tooling), SELECT utilizes query metadata. The query metadata is a JSON object, and can be set in either the query's tag or in a comment. The schema is as follows:

{
# [REQUIRED]
"select_schema_version": "1.0.0",
# [REQUIRED]
# identifiers
"app": "generated_tables", # first grouping category
"workload_id": "my_etl_job", # secondary grouping category
# [OPTIONAL]
# If a workload contains multiple queries, include a run_id to group them together
# If run_id isn't present, each query is considered an individual run
# This value might be a date but can be anything
"run_id": "2020-01-01",
# [OPTIONAL]
# First class filters supported in UI
"environment": "dev",
"team": "data",
"tenant_id": "customer 1",
# [OPTIONAL]
# Custom tags - you can put whatever you want in here!
# These will be shown on the individual run page
"creator": "Hermione Granger",
"description": "Processes the data",
"link": "https://airflow.org.net/dag/my_etl_job",
"git_sha": "abc123"
}

Example query comments

Both multi-line and single-line comment syntax are supported. If using single-line comments, ensure that the final character in the comment is a closing curly brace and not a semi-colon.

Multi-line comment

select * from my_table
/*
{
"select_schema_version": "1.0.0",
"app": "ingest",
"workload_id": "facebook_ads"
}
*/

Single-line comment:

select * from my_table
-- {"select_schema_version": "1.0.0", "app": "ingest", "workload_id": "facebook_ads"}

Example query tag

Alternatively, query tags can be used. The downside to using query tags is the need to both set and unset the query tag in the session, adding latency and additional complexity.

{"select_schema_version": "1.0.0", "app": "ingest", "workload_id": "facebook_ads"}

Example Python implementation using query comments

import json
import snowflake.connector
# Connect to Snowflake & create a cursor
conn = snowflake.connector.connect(
user='<your_username>',
password='<your_password>',
account='<your_account>',
)
cursor = conn.cursor()
# Generate query comment
select_query_comment = {
"select_schema_verison": "1.0.0",
"app": "data_transformations",
"workload_id": "customers",
"environment": "prod"
}
select_query_comment_str = f"\n\n/* {json.dumps(select_query_comment)} */"
## Execute query with comment
query = """
create or replace table db.analytics.customers as (
select *
from raw_customers
where
not deleted
and created_at > '2019-01-01'
)
"""
cursor.execute(query + select_query_comment_str)
Previous
dbt