Snowflake Access History: 8 ways to audit your account

Ian Whitestone, Co-founder & CEO of SELECT

Access History is a view in the ACCOUNT_USAGE schema of the shared SNOWFLAKE database, and one of the most useful datasets for auditing and understanding usage in your Snowflake account. In this post, I'll dive into what data Access History contains, then share a number of examples you can run in your account today.


What is in the Snowflake Access History?

Access History contains 1 row per query executed in your account. For each query, it stores several columns describing the objects the query accessed and/or modified.

To start with, there are three columns that are helpful for looking up the queries you are interested in:

  • query_id: the unique identifier for the query
  • user_name: the user who ran the query
  • query_start_time: when the query started

If additional information about the query is required, like the role it was executed with or the warehouse it ran on, you can join the access history dataset with the Snowflake Query History view.
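As a minimal sketch of that join (column names per the standard ACCOUNT_USAGE views):

```sql
-- Enrich Access History with the role and warehouse from Query History
select
    access_history.query_id,
    access_history.user_name,
    access_history.query_start_time,
    query_history.role_name,
    query_history.warehouse_name
from snowflake.account_usage.access_history
inner join snowflake.account_usage.query_history
    on access_history.query_id = query_history.query_id
limit 100
;
```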

In terms of what objects the query accessed, the view provides two columns:

  • direct_objects_accessed: a JSON array of data objects the query directly accesses
  • base_objects_accessed: a JSON array of data objects that a query either directly or indirectly accesses (i.e. the underlying tables that populate a view)

For objects a query modified, there are two columns:

  • objects_modified: a JSON array specifying the objects modified by the query. This is populated for INSERT, UPDATE, DELETE, MERGE, CREATE TABLE AS, and similar queries that insert, update or delete records in a table
  • objects_modified_by_ddl: a JSON object containing information about a DDL operation performed on a database, schema, table, view and/or column (e.g. an ALTER TABLE statement)
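As a sketch of how you might use this column (note that objects_modified_by_ddl is a single JSON object rather than an array, so no flattening is needed):

```sql
-- List recent queries that ran DDL, with the object and operation type
select
    query_id,
    query_start_time,
    objects_modified_by_ddl:"objectName"::text as object_name,
    objects_modified_by_ddl:"objectDomain"::text as object_domain,
    objects_modified_by_ddl:"operationType"::text as operation_type
from snowflake.account_usage.access_history
where
    objects_modified_by_ddl is not null
order by query_start_time desc
limit 100
;
```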

Direct vs. Base Objects Accessed

To understand the difference between direct and base objects accessed, consider the following query, which accesses two columns from a view named user_sales_summary:

select
    user_name,
    total_sales
from user_sales_summary
The direct_objects_accessed column would contain one entry for the direct access of the user_sales_summary view, while the base_objects_accessed column would contain two entries for the two underlying tables (users and sales) that power the view.
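To make this concrete, here's an illustrative example of what the two columns could contain for the query above (the objectId values are hypothetical, and the columns arrays for the base tables are elided):

```
-- direct_objects_accessed: one entry, for the view itself
[
  {
    "objectDomain": "View",
    "objectId": 101,
    "objectName": "ANALYTICS.PUBLIC.USER_SALES_SUMMARY",
    "columns": [
      {"columnId": 1, "columnName": "USER_NAME"},
      {"columnId": 2, "columnName": "TOTAL_SALES"}
    ]
  }
]

-- base_objects_accessed: two entries, for the underlying tables
[
  {"objectDomain": "Table", "objectId": 201, "objectName": "ANALYTICS.PUBLIC.USERS", "columns": [...]},
  {"objectDomain": "Table", "objectId": 202, "objectName": "ANALYTICS.PUBLIC.SALES", "columns": [...]}
]
```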

[Image: Snowflake access history base vs. direct objects accessed]

Access History Data Retention

Similar to other account usage views like the Snowflake Query History, Access History retains data for the last 365 days.

Is Access History available to all Snowflake customers?

The Access History view is only available to customers on Snowflake Enterprise Edition or higher.


Now that we’ve covered the basics, let’s get into some real examples you can run in your account to answer a variety of common questions.

1. Find all tables a given user accessed in the last 30 days

The query below shows how to find all tables accessed by a given user in the last 30 days. Because the base_objects_accessed column is an array, we must use a lateral join combined with the flatten table function to explode each entry in the array into a separate row. You’ll see this pattern used throughout the blog post.

with
-- This will output 1 row per table accessed in a query
access_history_flattened as (
    select
        access_history.query_id,
        access_history.query_start_time,
        access_history.user_name,
        objects_accessed.value:objectName::text as object_name,
        objects_accessed.value:objectDomain::text as object_domain,
        objects_accessed.value:columns as columns_array
    from snowflake.account_usage.access_history,
        lateral flatten(access_history.base_objects_accessed) as objects_accessed
    where
        access_history.query_start_time > current_date - 30
)
select
    object_name,
    count(*) as number_of_times_accessed
from access_history_flattened
where
    user_name = 'IAN'
    and object_domain = 'Table'
group by 1
order by 2 desc
;

Note how I filter on object_domain='Table'. You can modify this as required to answer related questions like:

  • What views did a user access?
  • What functions did they use?

2. Find all tables accessed in a schema

To find all tables accessed in a particular schema, we can leverage the access_history_flattened CTE from above. The object_name in Access History is always the fully qualified name, i.e. in the format database_name.schema_name.table_name. As a result, we can parse the object name to get the database and schema names, then filter as needed:

with
-- This will output 1 row per table accessed in a query
access_history_flattened as (
    select
        access_history.query_id,
        access_history.query_start_time,
        access_history.user_name,
        objects_accessed.value:objectName::text as object_name,
        objects_accessed.value:objectDomain::text as object_domain,
        objects_accessed.value:columns as columns_array
    from snowflake.account_usage.access_history,
        lateral flatten(access_history.base_objects_accessed) as objects_accessed
),
access_history_flattened_w_names as (
    select
        *,
        split(object_name, '.')[0]::string as database_name,
        split(object_name, '.')[1]::string as schema_name,
        split(object_name, '.')[2]::string as table_name
    from access_history_flattened
)
select
    table_name,
    count(*) as number_of_times_accessed
from access_history_flattened_w_names
where
    object_domain = 'Table'
    and database_name = 'ANALYTICS'
    and schema_name = 'PUBLIC'
group by 1
order by 2 desc
;

3. Return all users who accessed a specific table in the last 30 days

Imagine you are trying to identify users that may have accessed sensitive data in a table. You can leverage the access_history view to quickly identify the full list of users:

with
-- This will output 1 row per table accessed in a query
access_history_flattened as (
    select
        access_history.query_id,
        access_history.query_start_time,
        access_history.user_name,
        objects_accessed.value:objectName::text as object_name,
        objects_accessed.value:objectDomain::text as object_domain,
        objects_accessed.value:columns as columns_array
    from snowflake.account_usage.access_history,
        lateral flatten(access_history.base_objects_accessed) as objects_accessed
    where
        access_history.query_start_time > current_date - 30 -- adjust as needed
),
access_history_flattened_w_names as (
    select
        *,
        split(object_name, '.')[0]::string as database_name,
        split(object_name, '.')[1]::string as schema_name,
        split(object_name, '.')[2]::string as table_name
    from access_history_flattened
)
select
    user_name,
    count(*) as number_of_times_accessed
from access_history_flattened_w_names
where
    object_domain = 'Table'
    and database_name = 'ANALYTICS'
    and schema_name = 'PUBLIC'
    and table_name = 'TABLE_WITH_SENSITIVE_DATA'
group by 1
order by 2 desc
;

4. Identify Unused Tables

I previously wrote about how to identify unused tables by leveraging the access history view. You can refer to that post for a detailed explanation. Here is the code you can use:

with
access_history_flattened as (
    select
        access_history.query_id,
        access_history.query_start_time,
        access_history.user_name,
        objects_accessed.value:objectId::integer as table_id,
        objects_accessed.value:objectName::text as object_name,
        objects_accessed.value:objectDomain::text as object_domain,
        objects_accessed.value:columns as columns_array
    from snowflake.account_usage.access_history,
        lateral flatten(access_history.base_objects_accessed) as objects_accessed
),
table_access_history as (
    select
        query_id,
        query_start_time,
        user_name,
        object_name as fully_qualified_table_name,
        table_id
    from access_history_flattened
    where
        object_domain = 'Table'
),
table_access_summary as (
    select
        table_id,
        max(query_start_time) as last_accessed_at,
        max_by(user_name, query_start_time) as last_accessed_by,
        max_by(query_id, query_start_time) as last_query_id
    from table_access_history
    group by 1
),
table_storage_metrics as (
    select
        id as table_id,
        table_catalog || '.' || table_schema || '.' || table_name as fully_qualified_table_name,
        (active_bytes + time_travel_bytes + failsafe_bytes + retained_for_clone_bytes) / power(1024, 4) as total_storage_tb
    from snowflake.account_usage.table_storage_metrics
    where
        not deleted
)
select
    table_storage_metrics.*,
    table_access_summary.* exclude (table_id)
from table_storage_metrics
left join table_access_summary
    on table_storage_metrics.table_id = table_access_summary.table_id
where
    coalesce(last_accessed_at, date '1900-01-01') < (current_date - 30) -- Modify as needed
order by table_storage_metrics.total_storage_tb desc
;

5. Identify Unused Views

We can modify the query from above to identify views that have not been used in the last 30 days. Two changes are needed: filter on object_domain = 'View', and flatten direct_objects_accessed instead of base_objects_accessed, since queries against a regular view are resolved to the view's underlying tables in base_objects_accessed. Because views don't consume storage, we also drop the join to table_storage_metrics and simply report each view's last access. One caveat: views that were never accessed within the 365 day retention window won't appear in this output at all.

with
access_history_flattened as (
    select
        access_history.query_id,
        access_history.query_start_time,
        access_history.user_name,
        objects_accessed.value:objectId::integer as view_id,
        objects_accessed.value:objectName::text as object_name,
        objects_accessed.value:objectDomain::text as object_domain
    -- Regular views are resolved to their underlying tables in
    -- base_objects_accessed, so we flatten direct_objects_accessed instead
    from snowflake.account_usage.access_history,
        lateral flatten(access_history.direct_objects_accessed) as objects_accessed
),
view_access_history as (
    select
        query_id,
        query_start_time,
        user_name,
        object_name as fully_qualified_view_name,
        view_id
    from access_history_flattened
    where
        object_domain = 'View'
),
view_access_summary as (
    select
        view_id,
        any_value(fully_qualified_view_name) as fully_qualified_view_name,
        max(query_start_time) as last_accessed_at,
        max_by(user_name, query_start_time) as last_accessed_by,
        max_by(query_id, query_start_time) as last_query_id
    from view_access_history
    group by 1
)
select *
from view_access_summary
where
    last_accessed_at < (current_date - 30) -- Modify as needed
order by last_accessed_at
;

6. Identify the most common columns accessed in a given table

The examples thus far have solely addressed questions about table/schema access. We can go a layer deeper to analyze column usage for a given table by leveraging the columns array present in the base/direct_objects_accessed fields. After performing an extra lateral flatten, we can get a dataset with 1 row per column accessed in a query (see access_history_flattened_columns CTE).

with
-- This will output 1 row per table accessed in a query
access_history_flattened as (
    select
        access_history.query_id,
        access_history.query_start_time,
        access_history.user_name,
        objects_accessed.value:objectName::text as object_name,
        objects_accessed.value:objectDomain::text as object_domain,
        objects_accessed.value:columns as columns_array
    from snowflake.account_usage.access_history,
        lateral flatten(access_history.base_objects_accessed) as objects_accessed
    where
        access_history.query_start_time > current_date - 30 -- adjust as needed
),
-- 1 row per column accessed in a query
access_history_flattened_columns as (
    select
        access_history_flattened.* exclude columns_array,
        columns_accessed.value:columnName::text as column_name
    from access_history_flattened,
        lateral flatten(access_history_flattened.columns_array) as columns_accessed
)
select
    column_name,
    count(*) as number_of_times_accessed
from access_history_flattened_columns
where
    object_domain = 'Table'
    and object_name = 'ANALYTICS.PUBLIC.SALES'
group by 1
order by 2 desc
;

7. Return all users who accessed a specific column in the last 30 days

Working off the example from above, we can easily identify users who have accessed a specific column:

with
-- This will output 1 row per table accessed in a query
access_history_flattened as (
    select
        access_history.query_id,
        access_history.query_start_time,
        access_history.user_name,
        objects_accessed.value:objectName::text as object_name,
        objects_accessed.value:objectDomain::text as object_domain,
        objects_accessed.value:columns as columns_array
    from snowflake.account_usage.access_history,
        lateral flatten(access_history.base_objects_accessed) as objects_accessed
    where
        access_history.query_start_time > current_date - 30 -- adjust as needed
),
-- 1 row per column accessed in a query
access_history_flattened_columns as (
    select
        access_history_flattened.* exclude columns_array,
        columns_accessed.value:columnName::text as column_name
    from access_history_flattened,
        lateral flatten(access_history_flattened.columns_array) as columns_accessed
)
select
    user_name,
    count(*) as number_of_times_accessed
from access_history_flattened_columns
where
    object_domain = 'Table'
    and object_name = 'ANALYTICS.PUBLIC.SALES'
    and column_name = 'AMOUNT' -- unquoted column names are stored in uppercase
group by 1
order by 2 desc
;

8. Identify all queries that have modified a table

When investigating why or how a given table changes, it can be helpful to quickly identify the queries or users that modified the object. Or perhaps you want to see how often a table is being updated. Using similar approaches to ones from above, we can identify all queries that have modified a table by flattening the objects_modified column:

with
-- This will output 1 row per table modified by a query
access_history_flattened as (
    select
        access_history.query_id,
        access_history.query_start_time,
        access_history.user_name,
        objects_modified.value:objectName::text as object_name,
        objects_modified.value:objectDomain::text as object_domain
    from snowflake.account_usage.access_history,
        lateral flatten(access_history.objects_modified) as objects_modified
    where
        access_history.query_start_time > current_date - 30 -- adjust as needed
)
select
    query_id,
    user_name,
    query_start_time
from access_history_flattened
where
    object_domain = 'Table'
    and object_name = 'ANALYTICS.PUBLIC.SALES'
;
Ian Whitestone
Co-founder & CEO of SELECT
Ian is the Co-founder & CEO of SELECT, a SaaS Snowflake cost management and optimization platform. Prior to starting SELECT, Ian spent 6 years leading full stack data science & engineering teams at Shopify and Capital One. At Shopify, Ian led the efforts to optimize their data warehouse and increase cost observability.
