Tracking what happens inside your data pipeline or application is essential. Without proper logging and tracing, keeping things running smoothly is challenging: you need alerts that provide context when something goes wrong, and when discrepancies occur you must trace the issue through your transformation logic. For debugging, you need insight into your code's behavior, such as variable values, output messages, and formula results, typically captured as log messages in various forms. Snowflake offers native logging and tracing features built on industry-standard libraries, making them easily accessible to developers. Let's explore how to set up event logging and tracing in Snowflake using these features.
Event tables were released as a public preview in May 2023. They are a special type of Snowflake table with several differences compared to standard tables.
Typical use cases for event tables include capturing logging information from the handler code of your stored procedures and UDFs, or collecting tracing data from native apps.
Enabling an event table for your account is done in a few steps.
First, we have to create an event table. This is done with a dedicated CREATE EVENT TABLE statement:
CREATE EVENT TABLE my_db.logging.my_event_table;
You do not have to specify the table columns because an event table comes with a predefined set of columns. For now, you can have only one active event table for the whole account.
To put the event table to use, we have to associate it with our account. This is done with an ALTER ACCOUNT statement, which means it can only be done with the ACCOUNTADMIN role. In addition, you need either the OWNERSHIP or INSERT privilege on the event table.
ALTER ACCOUNT SET EVENT_TABLE = my_db.logging.my_event_table;
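If you want to double-check which event table is currently associated with your account, you can inspect the account parameter (an optional sanity check):

```sql
-- Shows the current value of the EVENT_TABLE account parameter.
SHOW PARAMETERS LIKE 'EVENT_TABLE' IN ACCOUNT;
```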
Now we can enrich the UDF/UDTF or stored procedure with logging code. Depending on your handler language, you can use these native logging APIs and libraries:
Language | Logging Library
---|---
Java | SLF4J API
JavaScript | Snowflake JavaScript API
Python | logging module
Scala | SLF4J API
Snowflake Scripting | Snowflake SYSTEM$LOG function
Let's take Python as an example.
CREATE OR REPLACE FUNCTION my_UDF()
RETURNS VARCHAR
LANGUAGE PYTHON
RUNTIME_VERSION = '3.8'
HANDLER = 'run'
AS $$
import logging
logger = logging.getLogger("my_logger")

def run():
    logger.info("Processing start")
    ...
    logger.error("Some error in your code")
    # 'value' is produced by the code elided above
    return value
$$;
To start logging, we have to import the logging module and instantiate a logger object. Then we can use it the same way as in any standard Python app and log at different levels, such as INFO, WARNING, or ERROR.
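One detail worth noting: events are only captured when the LOG_LEVEL parameter is set to a level at least as verbose as the messages you emit. A minimal sketch, assuming the UDF above has been created:

```sql
-- Capture INFO and more severe messages; LOG_LEVEL can also be set
-- on the function, database, or account instead of the session.
ALTER SESSION SET LOG_LEVEL = INFO;

-- Invoke the UDF so its logger calls produce events.
SELECT my_UDF();
```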
Let's take one more example, for SQL scripting, and see how to add logging when your handler code is in SQL. In Snowflake Scripting we must use the SYSTEM$LOG function. It also supports different log message levels, such as info, warning, or error.
We have a simple stored procedure returning a table. If we want to enrich it with a logging info message, we can do the following:
create or replace procedure returning_table()
returns table(id number, name varchar)
language sql
as
declare
result RESULTSET DEFAULT (SELECT 1 id, 'test value' name);
begin
SYSTEM$LOG('info', 'Returning a table');
return table(result);
end;
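To see the procedure in action, we can call it and then look for the message in the event table. A sketch, assuming the event table created earlier; note that events are ingested asynchronously, so the row may appear with a short delay:

```sql
CALL returning_table();

-- The severity and the message text land in the record and value columns.
SELECT timestamp,
       record:"severity_text"::varchar AS severity,
       value
FROM my_db.logging.my_event_table
ORDER BY timestamp DESC;
```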
We have added the logging into our handler code; now we want to check the logged events. It's time to query the event table. Each logged message contains a number of predefined attributes, such as the timestamp, the severity, and the message value.
For a complete list of event table columns, refer to the documentation. Some of the columns are key-value pairs that store multiple attributes. You can extract them with queries like the following:
Another use case for event tables is collecting trace data from your code. Trace data is structured logging information in the form of key-value pairs, which can provide a more detailed overview of your code's behaviour than log data usually does. Let's go through an example where we collect trace data from a UDF written in Python. First, here is a query that extracts the resource attributes and the recorded values from the event table:
select
resource_attributes:"snow.database.id"::number,
resource_attributes:"snow.database.name"::varchar,
resource_attributes:"snow.executable.name"::varchar,
resource_attributes:"snow.executable.type"::varchar,
resource_attributes:"snow.owner.name"::varchar,
resource_attributes:"snow.query.id"::varchar,
resource_attributes:"snow.warehouse.name"::varchar,
resource_attributes:"telemetry.sdk.language"::varchar,
record,
value
from my_event_table;
We have to import the snowflake-telemetry package, which contains the required methods. We can use the set_span_attribute method to set key-value pairs on the span object. A span object holds the telemetry data created when the function or procedure executes successfully; it represents the execution unit of the UDF or stored procedure. You can add multiple events to that execution unit with the add_event method.
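Putting those pieces together, a minimal sketch of a traced Python UDF could look as follows (the function name and the attribute values are illustrative, and tracing must be enabled first, for example with ALTER SESSION SET TRACE_LEVEL = ALWAYS):

```sql
CREATE OR REPLACE FUNCTION my_traced_udf()
RETURNS VARCHAR
LANGUAGE PYTHON
RUNTIME_VERSION = '3.8'
PACKAGES = ('snowflake-telemetry-python')
HANDLER = 'run'
AS $$
from snowflake import telemetry

def run():
    # Attach a key-value pair to the span created for this execution
    telemetry.set_span_attribute("my_udf.version", "1.0")
    # Record a named event, with its own attributes, on the same span
    telemetry.add_event("processing_started", {"input_rows": 100})
    return 'done'
$$;
```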
If you want to know more about how Snowflake represents trace events, refer to their documentation.
You can use the event table to collect logging events and telemetry data in your native apps. This requires additional configuration, since the native app code runs in the consumer account, where the events are collected; both the provider and the consumer accounts need to be set up. Both consumer and provider have access to the event table and the logged entries, so it's important for the consumer to review what kind of information is logged and shared with the provider before enabling it.
Snowflake provides a detailed overview with step-by-step instructions for both providers and consumers. If you want to learn more about the overall setup, check the documentation.
Event tables can easily be combined with Snowflake Alerts to automate notifications based on logged events. Let's create an alert that runs once per hour and sends an email if an error has been logged since the last run.
CREATE OR REPLACE ALERT alert_logged_errors
  WAREHOUSE = my_warehouse
  SCHEDULE = '60 MINUTE'
  IF (EXISTS (
    SELECT *
    FROM my_event_table
    WHERE record:"severity_text"::VARCHAR = 'ERROR'
      AND timestamp BETWEEN SNOWFLAKE.ALERT.LAST_SUCCESSFUL_SCHEDULED_TIME()
                        AND SNOWFLAKE.ALERT.SCHEDULED_TIME()
  ))
  THEN CALL SYSTEM$SEND_EMAIL(...);
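One thing to remember: a newly created alert is suspended by default, so it has to be resumed before the schedule starts running:

```sql
-- Activate the alert; without this, the schedule never fires.
ALTER ALERT alert_logged_errors RESUME;
```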
There is a limit on the log and trace payload size: it can't exceed 1 MB.
The event table is listed in the ACCOUNT_USAGE TABLES view together with all other tables in your account, with EVENT TABLE as the value of the TABLE_TYPE column.
select *
from snowflake.account_usage.tables
where table_name = 'MY_EVENT_TABLE'
;
The event table can't be updated. If you try to run an UPDATE statement, you will get an error saying that the UPDATE statement's target must be a table. Only a subset of standard table operations is supported on event tables.
Collecting log events is billed as a serverless feature: Snowflake uses Snowflake-managed resources to collect events, meaning they don't require one of your virtual warehouses. If you want to check how much you have been charged for this feature, you can query the EVENT_USAGE_HISTORY view:
select
start_time,
end_time,
credits_used,
bytes_ingested
from snowflake.account_usage.EVENT_USAGE_HISTORY
order by start_time desc;
Tomas is a longstanding Snowflake Data SuperHero and general Snowflake subject matter expert. His extensive experience in the data world spans over a decade, during which he has served as a Snowflake data engineer, architect, and admin on various projects across diverse industries and technologies. Tomas is a core community member, actively sharing his expertise and inspiring others. He's also an O'Reilly instructor, leading live online training sessions.