In Snowflake's November 2022 release, they quietly announced some exciting new SQL syntax, EXCLUDE
and RENAME
, which allows users to remove and rename specific columns when running a SELECT style query. This is particularly exciting, since it is very common when operating in non-production SQL workflows to run `SELECT ` style exploratory queries like:
select * from table where id = 5
Users often do this in order to investigate a particular record or subset of records. But what happens when you want to exclude a specific column (like a large text column), or rename one of them? Users are forced to type out all of the columns they want, along with any desired renaming using the traditional AS
syntax:
select column_1, column_2, column_3 as column_3_renamed, column_4, column_5, column_6, column_7, column_8, column_9, -- columns 10 and 11 are intentionally left out column_12 from table where id = 5
Not the best when you're trying to get answers, quickly.
Instead of typing out each column, a common ask{% superscript id=1 /%} from users is the ability to specify a subset of columns to exclude from the table. In many cases, this is much more efficient to write since the number of columns to be excluded is often much lower than the number to be included. As of Snowflake's version 6.37 release, you can now include an EXCLUDE
clause in SELECT *
style SQL queries where you specify one or more columns that you don't want returned from the table:
select * exclude(column_10, column_11) from table where id = 5
This can also be used to exclude a single column:
select * exclude (column_10) from table where id = 5
Brackets are optional when excluding a single column. The SQL above can be rewritten as:
select * exclude column_10 from table where id = 5
Another common need for excluding columns is when joining multiple tables. Imagine you have a query like this{% superscript id=2 /%}:
select * from orders join customers on orders.customer_id=customers.customer_id join items on orders.order_id=items.order_id
In this example, we have two order_id
columns and two customer_id
columns in the output, since they are present in multiple tables. We can easily exclude these by changing our SQL to:
select orders.*, customers.* exclude customer_id, items.* exclude order_id from orders join customers on orders.customer_id=customers.customer_id join items on orders.order_id=items.order_id
Snowflake's EXCLUDE
syntax is similar to BigQuery's and Databrick'sEXCEPT
syntax:
-- Snowflake select * exclude(column_10, column_11) from table -- BigQuery/Databricks select * except(column_10, column_11) from table
For anyone familiar with DuckDB, Snowflake follows the same EXCLUDE
syntax that they use. Both databases presumably chose to avoid using the EXCEPT
keyword to provide this functionality, since it is already used in set operations.
While this new syntax is great for speeding up adhoc workflows, it is not recommended for production SQL code. It's better to be explicit about the exact columns you are selecting. Writing out all column names improves readability and auditability. Users reading your SQL code can immediately understand which columns you are using from which table. It can also lead to clearer errors. If a column is suddenly removed from a table, your SQL code will immediately fail. This is much better than a downstream application suddenly erroring due to a missing column.
No. Using the EXCLUDE
keyword to remove a few columns does not result in a different query plan or query performance compared to explicitly writing out all columns. You can validate this behaviour by running both types of queries on a dataset in your Snowflake account and inspecting the query profile for each query.
Prior to the new RENAME
functionality, users were forced to write out every column name as soon as they needed to rename at least one when running a SELECT *
query. Now, users can easily select all columns while renaming a subset:
select * rename (column_3 as column_3_renamed, column_5 as column_5_renamed) from table where id = 5
This is much better than having having to list out every field just to rename one or two:
-- old method 👎 select column_1, column_2, column_3 as column_3_renamed, column_4, column_5 as column_5_renamed, column_6, ... column_12 from table where id = 5
Similar to EXCLUDE
, this can also be done for a single column, brackets optional:
select * rename (column_3 as column_3_renamed) from table where id = 5
EXCLUDE
and RENAME
can easily be combined, as shown in the example below:
select * exclude(column_10, column_11) rename column_3 as column_3_renamed from table where id = 5
Much better than the code from the beginning of the blog post:
select column_1, column_2, column_3 as column_3_renamed, column_4, column_5, column_6, column_7, column_8, column_9, -- columns 10 and 11 are intentionally left out column_12 from table where id = 5
{% footnote id=1 /%} To give you a sense of demand, this stack overflow question, asked almost 14 years ago, has 1.3 million views and over 1000 upvotes. {% scroll-back id=1 /%}
{% footnote id=2 /%} Shoutout to Nate Sooter for motivating this example with his recent tweet! {% scroll-back id=2 /%}
Ian is the Co-founder & CEO of SELECT, a SaaS Snowflake cost management and optimization platform. Prior to starting SELECT, Ian spent 6 years leading full stack data science & engineering teams at Shopify and Capital One. At Shopify, Ian led the efforts to optimize their data warehouse and increase cost observability.
Get up and running with SELECT in 15 minutes.
Gain visibility into Snowflake usage, optimize performance and automate savings with the click of a button.