Tracking what happens inside your data pipeline or application is essential. Without proper logging and tracing, keeping things running smoothly is challenging: you need alerts that provide context when something goes wrong, and when discrepancies occur you must trace the issue through your transformation logic. For debugging, you need insight into your code's behavior, such as variable values, output messages, and formula results, typically captured as log messages in various forms. Snowflake offers native logging and tracing features built on industry-standard libraries, making them easily accessible to developers. Let's explore how to set up event logging and tracing in Snowflake using these features.
Event tables were released as a public preview in May 2023. They are a special type of Snowflake table with several differences compared to standard tables.
Typical use cases for event tables include capturing logging information from the handler code of your stored procedures and UDFs, or collecting tracing data from native apps.
Enabling an event table for your account is done in a few steps.
First, we have to create an event table. This is done with a dedicated CREATE EVENT TABLE statement:
CREATE EVENT TABLE my_db.logging.my_event_table;
You do not have to specify the table columns because an event table comes with a predefined set of columns. For now, you can have only one active event table for the whole account.
To put the event table to use, we have to associate it with our account. This is done with an ALTER ACCOUNT statement, which means it can only be done with the ACCOUNTADMIN role. In addition, you need either the OWNERSHIP or INSERT privilege on the event table.
ALTER ACCOUNT SET EVENT_TABLE = my_db.logging.my_event_table;
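If you want to double-check which event table is currently associated with your account, you can inspect the account parameter (an optional sanity check):

```sql
-- Shows the current value of the EVENT_TABLE account parameter.
SHOW PARAMETERS LIKE 'EVENT_TABLE' IN ACCOUNT;
```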
Now we can enrich the UDF/UDTF or stored procedure with logging code. Depending on your handler language, you can use these native logging APIs and libraries:
Language | Logging Library
---|---
Java | SLF4J API
JavaScript | Snowflake JavaScript API
Python | logging module
Scala | SLF4J API
Snowflake Scripting | Snowflake SYSTEM$LOG function
Let's take Python as an example.
CREATE OR REPLACE FUNCTION my_UDF()
RETURNS VARCHAR
LANGUAGE PYTHON
RUNTIME_VERSION = '3.8'
HANDLER = 'run'
AS $$
import logging
logger = logging.getLogger("my_logger")

def run():
    logger.info("Processing start")
    ...
    logger.error("Some error in your code")
    # 'value' is produced by the code elided above
    return value
$$;
To start logging, we have to import the logging module and instantiate a logger object. Then we can use it the same way as in any standard Python app and log at different levels, such as INFO, WARNING, or ERROR.
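One detail worth noting: events are only captured when the LOG_LEVEL parameter is set to a level at least as verbose as the messages you emit. A minimal sketch, assuming the UDF above has been created:

```sql
-- Capture INFO and more severe messages; LOG_LEVEL can also be set
-- on the function, database, or account instead of the session.
ALTER SESSION SET LOG_LEVEL = INFO;

-- Invoke the UDF so its logger calls produce events.
SELECT my_UDF();
```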
Let's take one more example, for SQL scripting, and see how to add logging when your handler code is in SQL. In Snowflake Scripting we must use the SYSTEM$LOG function. It also supports different log message levels, such as info, warning, or error.
We have a simple stored procedure returning a table. If we want to enrich it with a logging info message, we can do the following:
create or replace procedure returning_table()
returns table(id number, name varchar)
language sql
as
declare
result RESULTSET DEFAULT (SELECT 1 id, 'test value' name);
begin
SYSTEM$LOG('info', 'Returning a table');
return table(result);
end;
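To see the procedure in action, we can call it and then look for the message in the event table. A sketch, assuming the event table created earlier; note that events are ingested asynchronously, so the row may appear with a short delay:

```sql
CALL returning_table();

-- The severity and the message text land in the record and value columns.
SELECT timestamp,
       record:"severity_text"::varchar AS severity,
       value
FROM my_db.logging.my_event_table
ORDER BY timestamp DESC;
```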
We have added the logging into our handler code; now we want to check the logged events. It's time to query the event table. Each logged message contains a number of predefined attributes, such as the timestamp, the severity, and the message value.
For a complete list of event table columns, refer to the documentation. Some of the columns are key-value pairs that store multiple attributes. You can extract them with queries like the following:
Another use case for event tables is collecting trace data from your code. Trace data is structured logging information in the form of key-value pairs, which can provide a more detailed overview of your code's behaviour than log data usually does. Let's go through an example where we collect trace data from a UDF written in Python. First, here is a query that extracts the resource attributes and the recorded values from the event table:
select
resource_attributes:"snow.database.id"::number,
resource_attributes:"snow.database.name"::varchar,
resource_attributes:"snow.executable.name"::varchar,
resource_attributes:"snow.executable.type"::varchar,
resource_attributes:"snow.owner.name"::varchar,
resource_attributes:"snow.query.id"::varchar,
resource_attributes:"snow.warehouse.name"::varchar,
resource_attributes:"telemetry.sdk.language"::varchar,
record,
value
from my_event_table;
We have to import the snowflake-telemetry package, which contains the required methods. We can use the set_span_attribute method to set key-value pairs on the span object. A span object holds the telemetry data created when the function or procedure executes successfully; it represents the execution unit of the UDF or stored procedure. You can add multiple events to that execution unit with the add_event method.
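Putting those pieces together, a minimal sketch of a traced Python UDF could look as follows (the function name and the attribute values are illustrative, and tracing must be enabled first, for example with ALTER SESSION SET TRACE_LEVEL = ALWAYS):

```sql
CREATE OR REPLACE FUNCTION my_traced_udf()
RETURNS VARCHAR
LANGUAGE PYTHON
RUNTIME_VERSION = '3.8'
PACKAGES = ('snowflake-telemetry-python')
HANDLER = 'run'
AS $$
from snowflake import telemetry

def run():
    # Attach a key-value pair to the span created for this execution
    telemetry.set_span_attribute("my_udf.version", "1.0")
    # Record a named event, with its own attributes, on the same span
    telemetry.add_event("processing_started", {"input_rows": 100})
    return 'done'
$$;
```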
If you want to know more about how Snowflake represents trace events, refer to their documentation.
You can use the event table to collect logging events and telemetry data in your native apps. This requires additional configuration, since the native app code runs in the consumer account, where the events are collected; both the provider and the consumer accounts need to be set up. Both consumer and provider have access to the event table and the logged entries, so it's important for the consumer to review what kind of information is logged and shared with the provider before enabling it.
Snowflake provides a detailed overview with step-by-step instructions for both providers and consumers. If you want to learn more about the overall setup, check the documentation.
Event tables can easily be combined with Snowflake Alerts to automate notifications based on logged events. Let's create an alert that runs once per hour and sends an email if an error has been logged since the last run.
CREATE OR REPLACE ALERT alert_logged_errors
  WAREHOUSE = my_warehouse
  SCHEDULE = '60 MINUTE'
  IF (EXISTS (
    SELECT *
    FROM my_event_table
    WHERE record:"severity_text"::VARCHAR = 'ERROR'
      AND timestamp BETWEEN SNOWFLAKE.ALERT.LAST_SUCCESSFUL_SCHEDULED_TIME()
                        AND SNOWFLAKE.ALERT.SCHEDULED_TIME()
  ))
  THEN CALL SYSTEM$SEND_EMAIL(...);
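One thing to remember: a newly created alert is suspended by default, so it has to be resumed before the schedule starts running:

```sql
-- Activate the alert; without this, the schedule never fires.
ALTER ALERT alert_logged_errors RESUME;
```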
There is a limit on the log and trace payload size: it can't exceed 1 MB.
The event table is listed in the ACCOUNT_USAGE TABLES view together with all other tables in your account, with EVENT TABLE as the value of the TABLE_TYPE column.
select *
from snowflake.account_usage.tables
where table_name = 'MY_EVENT_TABLE'
;
The event table can't be updated. If you try to run an UPDATE statement, you will get an error saying that the UPDATE statement's target must be a table. Only a subset of standard table operations is supported on event tables.
Collecting log events is billed as a serverless feature: Snowflake uses Snowflake-managed resources to collect events, meaning they don't require one of your virtual warehouses. If you want to check how much you have been charged for this feature, you can query the EVENT_USAGE_HISTORY view:
select
start_time,
end_time,
credits_used,
bytes_ingested
from snowflake.account_usage.EVENT_USAGE_HISTORY
order by start_time desc;
Tomas is a longstanding Snowflake Data SuperHero and general Snowflake subject matter expert. His extensive experience in the data world spans over a decade, during which he has served as a Snowflake data engineer, architect, and admin on various projects across diverse industries and technologies. Tomas is a core community member, actively sharing his expertise and inspiring others. He's also an O'Reilly instructor, leading live online training sessions.