In Snowflake, a Dynamic Table is a table that materializes the result of a SQL query and is automatically kept up to date on a schedule. Much like a view, the SQL that defines a dynamic table can include significant complexity: multiple joined tables, different join types (left join, full join, Cartesian), unions, calculations, etc. But unlike a view, which never persists the data, the result of a dynamic table is persisted as a physical table.
Dynamic Tables are refreshed at a frequency you specify, called “target lag” or target_lag. Therefore, selecting (reading) from a dynamic table is often much more performant than reading a view with a lot of complex joins.
Dynamic tables are a simple and cost-effective way to create performant data pipelines that always stay up to date.
In Snowflake, a Materialized View is a view of a single table. It persists as a physical table, so it is fast to query, and it is kept up to date in real time, just like a view.
Below are the key differences between Materialized Views and Dynamic Tables.
In Snowflake, a Materialized View cannot have joins. This is a major limitation that severely restricts the usefulness of Materialized Views.
A Dynamic Table in Snowflake is much more like a Materialized View in other databases, such as PostgreSQL, where there are very few limitations on the SQL you can write. However, in Postgres, a Materialized View must be manually refreshed with the refresh materialized view command, whereas a Dynamic Table in Snowflake can automatically be refreshed based on target lag.
Materialized views in Snowflake have the advantage that they are always kept up to date. They are real-time, with nothing you need to do to refresh them.
Dynamic Tables, however, refresh on a schedule. The end user may need to be aware of how fresh the data is.
When querying the base tables of a materialized view, Snowflake’s query optimizer may rewrite your query to use the materialized view instead.
When querying the base tables of a dynamic table, Snowflake does not rewrite the query to use the dynamic table.
While materialized views have two advantages (always fresh and query rewrite), the fact that they must be based on a single table limits their usefulness so much that reaching for a dynamic table will be far more common.
Use a materialized view when all you are doing is aggregating, adding calculations to, or transforming a single table.
Use a Dynamic Table for more complex use cases.
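As a quick illustration (a minimal sketch using hypothetical sales and regions tables, and the TRANSFORMING warehouse from the example later in this post), a single-table aggregation can be a materialized view, while anything involving a join must be a dynamic table:

-- single-table aggregation: a materialized view works here
create materialized view daily_sales as
select
  sale_date,
  sum(amount) as total_amount
from sales
group by sale_date;

-- a join rules out a materialized view, so use a dynamic table
create or replace dynamic table sales_by_region
  target_lag = '1 hour'
  warehouse = 'TRANSFORMING'
as
select
  regions.region_name,
  sum(sales.amount) as total_amount
from sales
inner join regions on sales.region_id = regions.region_id
group by 1;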
A dynamic table is created much like the familiar CTAS, but with a few extra parameters: target_lag and warehouse, plus the many more optional parameters shown below.
Full syntax, copied from Snowflake docs:
CREATE [ OR REPLACE ] [ TRANSIENT ] DYNAMIC TABLE [ IF NOT EXISTS ] <name>
  (
    -- Column definition
    <col_name> <col_type>
      [ [ WITH ] MASKING POLICY <policy_name> [ USING ( <col_name> , <cond_col1> , ... ) ] ]
      [ [ WITH ] TAG ( <tag_name> = '<tag_value>' [ , <tag_name> = '<tag_value>' , ... ] ) ]
      [ COMMENT '<string_literal>' ]
    -- Additional column definitions
    [ , <col_name> <col_type> [ ... ] ]
  )
  TARGET_LAG = { '<num> { seconds | minutes | hours | days }' | DOWNSTREAM }
  WAREHOUSE = <warehouse_name>
  [ REFRESH_MODE = { AUTO | FULL | INCREMENTAL } ]
  [ INITIALIZE = { ON_CREATE | ON_SCHEDULE } ]
  [ CLUSTER BY ( <expr> [ , <expr> , ... ] ) ]
  [ DATA_RETENTION_TIME_IN_DAYS = <integer> ]
  [ MAX_DATA_EXTENSION_TIME_IN_DAYS = <integer> ]
  [ COMMENT = '<string_literal>' ]
  [ [ WITH ] ROW ACCESS POLICY <policy_name> ON ( <col_name> [ , <col_name> ... ] ) ]
  [ [ WITH ] TAG ( <tag_name> = '<tag_value>' [ , <tag_name> = '<tag_value>' , ... ] ) ]
  [ REQUIRE USER ]
  AS <query>
Practically speaking, your create statement may look like this:
create or replace dynamic table my_table
  target_lag = '1 Day'
  warehouse = 'TRANSFORMING'
  refresh_mode = 'incremental'
as
select
  customers.name,
  count(*) as total_orders
from orders
inner join customers using (customer_id)
group by 1
The refresh is initialized immediately, because the default value for initialize is on_create.
Let’s dig into the two unique arguments you will use the most: target_lag and refresh_mode.
Target lag is the maximum amount of time you want the data to lag behind changes in the source data. It is expressed as an integer and a datetime part (seconds, minutes, hours, days). For example, 5 minutes is a valid target lag.
If you stack a dynamic table on top of another dynamic table, you can set target_lag = downstream for all dynamic tables in the chain except for the last one, which should have a time schedule. Consider the example below:
In the image above, we have two dynamic tables chained together. The first one (orders_joined) has a target lag of downstream. The second one (orders_aggregated) has a target lag of 1 day. In this case, the entire DAG will update once daily. You don’t need to worry about setting a schedule on each table.
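As a minimal sketch (reusing the hypothetical orders and customers tables from the earlier create statement), that chain could be defined like this:

-- first table: refreshes only when a downstream table needs it
create or replace dynamic table orders_joined
  target_lag = 'downstream'
  warehouse = 'TRANSFORMING'
as
select orders.*, customers.name
from orders
inner join customers using (customer_id);

-- last table: its 1 day target lag drives the refresh cadence of the whole DAG
create or replace dynamic table orders_aggregated
  target_lag = '1 day'
  warehouse = 'TRANSFORMING'
as
select
  name,
  count(*) as total_orders
from orders_joined
group by 1;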
If the last dynamic table in your DAG has a target lag of downstream, your data will never refresh!
Snowflake provides this callout in the docs: Target lag is not a guarantee. Instead, it is a target that Snowflake attempts to meet. Data in dynamic tables is refreshed as closely as possible within the target lag. However, target lag may be exceeded due to factors such as warehouse size, data size, query complexity, and similar factors.
The Refresh Mode can be auto, full, or incremental. The default refresh mode is auto, which attempts to refresh incrementally and falls back to a full refresh if it cannot.
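To see which mode Snowflake actually chose for a table created with auto, you can check the output of show dynamic tables, which includes refresh_mode and refresh_mode_reason columns (a quick check, assuming the my_table name from the earlier example):

-- check the effective refresh mode of a dynamic table
show dynamic tables like 'my_table';
-- look at the refresh_mode and refresh_mode_reason columns in the output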
Dynamic Tables refresh incrementally based on changes in the source data. You don’t need to provide any info about a primary key or how to check for changes; Snowflake does it all for you, like magic! For more info on Changes, check out the Snowflake Docs and our blog post on Streams.
It is worth mentioning that Dynamic Tables are related to Streams in Snowflake, as they use the same change tracking technology under the hood (changes). In fact, these features are maintained by the same team at Snowflake!
Dynamic Tables are a declarative, user-friendly way to build a pipeline.
Streams are imperative, and much more customization is involved to make the pipeline work.
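To make the contrast concrete, here is a rough sketch (hypothetical table, stream, and task names throughout) of the imperative Stream-plus-Task pattern that a single dynamic table definition replaces:

-- track changes on the source table
create or replace stream orders_stream on table orders;

-- a scheduled task that consumes the stream whenever it has data
create or replace task process_orders
  warehouse = 'TRANSFORMING'
  schedule = '60 minute'
when
  system$stream_has_data('ORDERS_STREAM')
as
  insert into orders_summary
  select customer_id, count(*) as order_count
  from orders_stream
  group by 1;

-- tasks are created in a suspended state and must be resumed
alter task process_orders resume;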
A dynamic table cannot be refreshed incrementally if:
- current_timestamp or random is used.
- pivot, unpivot, union, minus, intersect, or except is used. (union all is supported for incremental! Except in some edge cases.)
- partition_by clauses are used in various window functions.
- in, any, all, or exists are used.

There are several more limitations worth watching out for! Above, I only listed the limitations I think you will face most often. For a full explanation of Incremental Refresh limitations for Dynamic Tables, please read this page.
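For example, a hypothetical definition like the one below can never refresh incrementally because it uses current_timestamp; created with refresh_mode = 'auto', Snowflake falls back to a full refresh:

create or replace dynamic table orders_with_load_time
  target_lag = '1 hour'
  warehouse = 'TRANSFORMING'
  refresh_mode = 'auto' -- auto falls back to full here due to current_timestamp
as
select
  *,
  current_timestamp as last_refreshed_at
from orders;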
auto refresh may be a nice choice… but…
If you successfully create a Dynamic Table using incremental and one day Snowflake cannot refresh it incrementally due to one of the documented limitations, the refresh will fail. It will fail silently unless you’ve engineered some type of alerting mechanism to monitor the refreshes. To avoid this headache, you can use auto to allow Snowflake to fully refresh your table when needed. Then, monitor your dynamic tables to observe how Snowflake is handling your refreshes.
However, Snowflake does give a warning about using auto refresh mode: For consistent behavior, explicitly set the refresh mode on all production tables. The behavior of AUTO might change between Snowflake releases, which can cause unexpected changes in performance if used in production pipelines.
The easiest way to monitor Dynamic Tables is using the Snowsight UI.
In the Data tab on the left sidebar in Snowsight, navigate to your dynamic table. Click it, then click on the Refresh History tab.
A few super useful things to call out:
For automated monitoring of refresh failures, see our articles on how to send alerts to Slack and Microsoft Teams.
Not only can you monitor individual DAGs by selecting any dynamic table from the Data tab, you can also see the status of all dynamic tables in one place using the “Monitoring” tab.
Here, you can sort on “Refresh Status” to find failed refreshes.
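If you prefer SQL to the UI, a query along these lines (a sketch using the information_schema.dynamic_table_refresh_history table function) surfaces recent failed refreshes:

select
  name,
  state,
  state_message,
  refresh_start_time
from table(information_schema.dynamic_table_refresh_history())
where state = 'FAILED'
order by refresh_start_time desc;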
Snowflake provides several properties that can be updated via alter dynamic table. Here is the exhaustive list from Snowflake’s documentation:
ALTER DYNAMIC TABLE [ IF EXISTS ] <name> { SUSPEND | RESUME }

ALTER DYNAMIC TABLE [ IF EXISTS ] <name> RENAME TO <new_name>

ALTER DYNAMIC TABLE [ IF EXISTS ] <name> SWAP WITH <target_dynamic_table_name>

ALTER DYNAMIC TABLE [ IF EXISTS ] <name> REFRESH [ COPY SESSION ]

ALTER DYNAMIC TABLE [ IF EXISTS ] <name> { clusteringAction }

ALTER DYNAMIC TABLE [ IF EXISTS ] <name> { tableColumnCommentAction }

ALTER DYNAMIC TABLE <name> { SET | UNSET } COMMENT = '<string_literal>'

ALTER DYNAMIC TABLE [ IF EXISTS ] <name> dataGovnPolicyTagAction

ALTER DYNAMIC TABLE [ IF EXISTS ] <name> searchOptimizationAction

ALTER DYNAMIC TABLE [ IF EXISTS ] <name> SET
  [ TARGET_LAG = { '<num> { seconds | minutes | hours | days }' | DOWNSTREAM } ]
  [ WAREHOUSE = <warehouse_name> ]
  [ DATA_RETENTION_TIME_IN_DAYS = <integer> ]
  [ MAX_DATA_EXTENSION_TIME_IN_DAYS = <integer> ]
  [ DEFAULT_DDL_COLLATION = '<collation_specification>' ]

ALTER DYNAMIC TABLE [ IF EXISTS ] <name> UNSET
  [ DATA_RETENTION_TIME_IN_DAYS ],
  [ MAX_DATA_EXTENSION_TIME_IN_DAYS ],
  [ DEFAULT_DDL_COLLATION ]
-- suspend the dynamic table / turn off automatic updates
alter dynamic table my_dynamic_table suspend;

-- turn auto updates back on
alter dynamic table my_dynamic_table resume;

-- update target lag
alter dynamic table my_dynamic_table set target_lag = '2 Days';

-- change target_lag from a time interval to downstream
alter dynamic table my_dynamic_table set target_lag = 'downstream';

-- manually trigger the refresh of a dynamic table right now
alter dynamic table my_dynamic_table refresh;
It is notable that you cannot add a column, change the SQL, or change the refresh_mode via an alter command; you must replace the table completely. This can be done with drop table my_dynamic_table and then recreating it. Or, you can use create or replace in the DDL: create or replace dynamic table my_dynamic_table...
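For example, to add a column to the dynamic table we created earlier, you would re-run the full DDL (latest_order_date and orders.order_date are hypothetical here):

create or replace dynamic table my_table
  target_lag = '1 Day'
  warehouse = 'TRANSFORMING'
  refresh_mode = 'incremental'
as
select
  customers.name,
  count(*) as total_orders,
  max(orders.order_date) as latest_order_date -- hypothetical new column
from orders
inner join customers using (customer_id)
group by 1;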
Let’s create a simple example that everyone can follow along with. For this example, we will copy the sample data from the snowflake_sample_data database. (Unfortunately, you cannot clone or create dynamic tables on top of data from a share, so we will just duplicate the data, then create the dynamic tables.)
Here is a picture of the DAG we are creating:
Let’s create some new objects so we are all working from the same starting point.
use role sysadmin;
create warehouse example_wh_xs
  warehouse_size = xsmall
  auto_suspend = 60
  auto_resume = true;

use role securityadmin;
create role example_role;
grant role example_role to user jeff; -- swap in your user
grant all on warehouse example_wh_xs to role example_role;

use role sysadmin;
create database dynamic_demo;
grant ownership on database dynamic_demo to role example_role;

use role example_role;
create schema dynamic_demo.sample_data;
Now let’s copy some sample data into our new database. (Again, clone will not work!)
use schema dynamic_demo.sample_data;

create or replace table orders as
select * from snowflake_sample_data.tpch_sf10.orders;

create or replace table customer as
select * from snowflake_sample_data.tpch_sf10.customer;
Now let’s create two dynamic tables: a table to join our two new tables, and a table which aggregates the data. The first dynamic table in the DAG will have a target lag of downstream. The second dynamic table will have a target lag of 1 day. This will control the lag for the entire DAG.
create or replace dynamic table obt_orders
  target_lag = 'downstream'
  warehouse = 'EXAMPLE_WH_XS'
  refresh_mode = 'incremental'
as
select *
from orders
join customer
  on orders.o_custkey = customer.c_custkey;

create or replace dynamic table current_month_fulfillments
  target_lag = '1 day'
  warehouse = 'EXAMPLE_WH_XS'
  refresh_mode = 'incremental'
as
with current_month as (
  select date_trunc('month', max(o_orderdate)) as current_month_start
  from obt_orders
)
select
  c_mktsegment as market_segment,
  sum(o_totalprice) as order_amount
from obt_orders, current_month
where o_orderstatus in ('F', 'P')
  and date_trunc('month', o_orderdate) = current_month_start
group by 1;
The current_month_fulfillments table should have no data, because none of the orders have F or P status for that month. (Check the where clause on our query above.)
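You can verify this with a quick count:

select count(*) as row_count
from current_month_fulfillments;
-- expected: row_count = 0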
Now let’s update the raw data and manually refresh the current_month_fulfillments table:
-- yes, the max order month in the data is August 1998!
update orders
set o_orderstatus = 'F'
where date_trunc('month', o_orderdate) = '1998-08-01';
-- 12,466 rows updated

-- manually refresh the last table in the DAG so we don't have to wait:
alter dynamic table current_month_fulfillments refresh;
We can see the first table refreshed automatically, because it was set to target_lag = 'downstream':
We see that 12.5K rows were deleted and re-inserted.
When we check the second table in the DAG, we can see that 5 rows were inserted:
Click on the “Graph” tab, which shows a great view of all of the tables in the DAG. The dynamic tables will show a status: “Succeeded” in green or “Failed” in red.
drop database dynamic_demo;

use role sysadmin;
drop warehouse example_wh_xs;

use role securityadmin;
drop role example_role;
Creating a dynamic table in dbt is as easy as adding this config to your normal .sql file:
{{ config(
    materialized="dynamic_table",
    on_configuration_change="apply" | "continue" | "fail",
    target_lag="downstream" | "<integer> seconds | minutes | hours | days",
    snowflake_warehouse="<warehouse-name>",
    refresh_mode="AUTO" | "FULL" | "INCREMENTAL",
    initialize="ON_CREATE" | "ON_SCHEDULE",
) }}
Or in a property file:
version: 2

models:
  - name: [<model-name>]
    config:
      materialized: dynamic_table
      on_configuration_change: apply | continue | fail
      target_lag: downstream | <time-delta>
      snowflake_warehouse: <warehouse-name>
      refresh_mode: AUTO | FULL | INCREMENTAL
      initialize: ON_CREATE | ON_SCHEDULE
When dbt runs the first time, it will create the dynamic table. On subsequent runs, it will see the table already exists and skip it. The table is only refreshed according to the target_lag, not by running dbt.
As mentioned earlier, Snowflake does not support changing the SQL for a dynamic table. Therefore, any change to your model definition requires a --full-refresh. Running dbt with --full-refresh will drop and re-create the dynamic table.
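For example, to rebuild just one changed model from the command line (assuming a model named current_month_fulfillments):

dbt run --full-refresh --select current_month_fulfillments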
Dynamic Tables:
Incremental Models:
Dynamic tables have quite a few limitations. For a full list of limitations, I suggest referring to the docs. But here are the limitations I think you are most likely to face:
- changes in Snowflake leverages time travel. Time travel must be enabled, which is controlled by the data_retention_time_in_days of the upstream tables.
- Dynamic table refreshes do not appear in the access_history view.
- You cannot use sequences, e.g. select my_sequence.nextval.
- You cannot use sample or tablesample in the dynamic table definition.

Here are my top recommendations to consider when using Dynamic Tables:
- Use a downstream target lag on all tables except for the last table in the DAG.

With dynamic tables, you are charged for three main things:
1. Storage of the materialized results.
2. Virtual warehouse compute to run the refreshes.
3. Cloud services compute to detect changes and orchestrate refreshes.
Snowflake recommends using a dedicated warehouse to monitor cost for dynamic tables; however, I would recommend not proliferating warehouses.
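If you do go the dedicated-warehouse route, a query like this sketch (DYNAMIC_TABLE_WH is a hypothetical warehouse name) shows its daily credit consumption from account_usage:

select
  date_trunc('day', start_time) as usage_date,
  sum(credits_used) as credits_used
from snowflake.account_usage.warehouse_metering_history
where warehouse_name = 'DYNAMIC_TABLE_WH' -- hypothetical dedicated warehouse
group by 1
order by 1 desc;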
Alternatively, you can use a tool like SELECT, which will automatically show you the cost of each dynamic table and how it is changing over time. You can also see the cost of your entire dynamic table DAG, which can often let you spot bigger issues, like things running too frequently.
Dynamic tables are a great tool to have in your data pipeline toolbelt. The automated refreshes and simplicity of not needing to specify incremental logic make dynamic tables a very attractive option. Just be aware of the many limitations of dynamic tables, especially the limitations we discussed around incremental refresh, which are significant.
Hopefully you now feel ready to use dynamic tables in your data pipelines! I’d love to hear about your experiences with dynamic tables.
Jeff Skoldberg is a Sales Engineer at SELECT, helping customers get maximum value out of the SELECT app to reduce their Snowflake spend. Prior to joining SELECT, Jeff was a Data and Analytics Consultant with 15+ years experience in automating insights and using data to control business processes. From a technology standpoint, he specializes in Snowflake + dbt + Tableau. From a business topic standpoint, he has experience in Public Utility, Clinical Trials, Publishing, CPG, and Manufacturing.