All posts

Exclude and rename columns when using SELECT * in Snowflake

Date
  • Ian Whitestone
    Co-founder & CEO of SELECT

Why do we need EXCLUDE & RENAME?

In Snowflake's November 2022 release, they quietly announced some exciting new SQL syntax, EXCLUDE and RENAME, which allows users to remove and rename specific columns when running a SELECT * style query. This is particularly exciting, since it is very common when operating in non-production SQL workflows to run SELECT * style exploratory queries like:

select *
from table
where id = 5

Users often do this in order to investigate a particular record or subset of records. But what happens when you want to exclude a specific column (like a large text column), or rename one of them? Users are forced to type out all of the columns they want, along with any desired renaming using the traditional AS syntax:

select
column_1,
column_2,
column_3 as column_3_renamed,
column_4,
column_5,
column_6,
column_7,
column_8,
column_9, -- columns 10 and 11 are intentionally left out
column_12
from table
where id = 5

Not the best when you're trying to get answers, quickly.

How to exclude columns when running a SELECT * SQL statement in Snowflake?

Instead of typing out each column, a common ask 1 from users is the ability to specify a subset of columns to exclude from the table. In many cases, this is much more efficient to write since the number of columns to be excluded is often much lower than the number to be included. As of Snowflake's version 6.37 release, you can now include an EXCLUDE clause in SELECT * style SQL queries where you specify one or more columns that you don't want returned from the table:

select
*
exclude(column_10, column_11)
from table
where id = 5

This can also be used to exclude a single column:

select
*
exclude (column_10)
from table
where id = 5

Brackets are optional when excluding a single column. The SQL above can be rewritten as:

select
*
exclude column_10
from table
where id = 5

Using Snowflake's EXCLUDE with multiple tables

Another common need for excluding columns is when joining multiple tables. Imagine you have a query like this 2:

select *
from orders
join customers
on orders.customer_id=customers.customer_id
join items
on orders.order_id=items.order_id

In this example, we have two order_id columns and two customer_id columns in the output, since they are present in multiple tables. We can easily exclude these by changing our SQL to:

select
orders.*,
customers.* exclude customer_id,
items.* exclude order_id
from orders
join customers
on orders.customer_id=customers.customer_id
join items
on orders.order_id=items.order_id

Snowflake's equivalent of BigQuery's & Databrick's EXCEPT

Snowflake's EXCLUDE syntax is similar to BigQuery's and Databrick's EXCEPT syntax:

-- Snowflake
select
*
exclude(column_10, column_11)
from table
-- BigQuery/Databricks
select
*
except(column_10, column_11)
from table

For anyone familiar with DuckDB, Snowflake follows the same EXCLUDE syntax that they use. Both databases presumably chose to avoid using the EXCEPT keyword to provide this functionality, since it is already used in set operations.

Should you use EXCLUDE in production SQL code?

While this new syntax is great for speeding up adhoc workflows, it is not recommended for production SQL code. It's better to be explicit about the exact columns you are selecting. Writing out all column names improves readability and auditability. Users reading your SQL code can immediately understand which columns you are using from which table. It can also lead to clearer errors. If a column is suddenly removed from a table, your SQL code will immediately fail. This is much better than a downstream application suddenly erroring due to a missing column.

Does using EXCLUDE in Snowflake result in a different query plan or performance?

No. Using the EXCLUDE keyword to remove a few columns does not result in a different query plan or query performance compared to explicitly writing out all columns. You can validate this behaviour by running both types of queries on a dataset in your Snowflake account and inspecting the query profile for each query.

How to rename columns when running a SELECT * SQL statement in Snowflake?

Prior to the new RENAME functionality, users were forced to write out every column name as soon as they needed to rename at least one when running a SELECT * query. Now, users can easily select all columns while renaming a subset:

select
*
rename (column_3 as column_3_renamed, column_5 as column_5_renamed)
from table
where id = 5

This is much better than having having to list out every field just to rename one or two:

-- old method 👎
select
column_1,
column_2,
column_3 as column_3_renamed,
column_4,
column_5 as column_5_renamed,
column_6,
...
column_12
from table
where id = 5

Similar to EXCLUDE, this can also be done for a single column, brackets optional:

select
*
rename (column_3 as column_3_renamed)
from table
where id = 5

Combining EXCLUDE and RENAME

EXCLUDE and RENAME can easily be combined, as shown in the example below:

select
*
exclude(column_10, column_11)
rename column_3 as column_3_renamed
from table
where id = 5

Much better than the code from the beginning of the blog post:

select
column_1,
column_2,
column_3 as column_3_renamed,
column_4,
column_5,
column_6,
column_7,
column_8,
column_9, -- columns 10 and 11 are intentionally left out
column_12
from table
where id = 5

Notes

1 To give you a sense of demand, this stack overflow question, asked almost 14 years ago, has 1.3 million views and over 1000 upvotes.

2 Shoutout to Nate Sooter for motivating this example with his recent tweet!

Ian Whitestone
Co-founder & CEO of SELECT
Ian is the Co-founder & CEO of SELECT, a software product which helps users automatically optimize, understand and monitor Snowflake usage. Prior to starting SELECT, Ian spent 6 years leading full stack data science & engineering teams at Shopify and Capital One. At Shopify, Ian led the efforts to optimize their data warehouse and increase cost observability.

Get up and running with SELECT in 15 minutes.

Snowflake optimization & cost management platform

Gain visibility into Snowflake usage, optimize performance and automate savings with the click of a button.

SELECT web application screenshot

Want to hear about our latest Snowflake learnings? 
Subscribe to get notified.