Understand column usage & clustering health
- Date
- Fernando BritoSoftware Engineer at SELECT
- Ian GraySoftware Engineer at SELECT
The tables page now features a brand new columns tab, designed to help you understand:
- How often each column in a table is being accessed
- The clustering health of each column in your table. A score of 100% means the column is well clustered, and therefore including a filter on this column in your query will likely result in very good query pruning. We’ve found this feature incredibly helpful on our internal Snowflake account, as we’ve learned about columns we previously didn’t know were well clustered (i.e.
query_id
in this screenshot). This feature is also very helpful for understanding how tables without automatic clustering enabled are naturally clustered.
Additional setup required
The ability to list all columns in a table and analyze their clustering health requires a new stored procedure to be installed and granted access to your SELECT Snowflake user. Learn more in our setup documentation.
In conjunction with these, you can now filter workloads by the columns they’ve accessed. In the example below, you can see that all columns accessing the query_id
column are pruning very well. The pruning efficiencies of over 90% indicate that most micro-partitions are being automatically removed.
Not available for Snowflake Standard Edition Customers
Filtering by columns accessed relies on the Snowflake access history view which is not available for customers on the Standard Edition of Snowflake. Standard edition users can still analyze their clustering health (first screenshot)!