Snowflake 101: Working with Stored Procedures

Date: Sunday, March 17, 2024

Tomáš Sobotík
Senior Data Engineer & Snowflake SME at Norlys

Stored procedures have been a staple of RDBMS for ages, predating the cloud era and dedicated transformation tools. They have played a crucial role in building data pipelines and automating databases. In this post, we'll explore what stored procedures are and how to harness their potential in modern cloud-based platforms like Snowflake.

What are Stored Procedures?

Stored procedures enhance the SQL language with procedural logic, allowing the use of conditional statements, loops, and the creation of functionalities otherwise impossible without these constructs.

Snowflake stored procedures example SQL definition

Difference between UDFs and Stored Procedures

In addition to stored procedures, you can create user-defined functions (UDFs). Understanding the distinction between them is crucial to deciding when to use UDFs and when to opt for stored procedures (SP).

Stored procedures are typically employed for administrative actions in Snowflake, such as cleaning up old data, dropping unused tables, or custom data backup. UDFs, on the other hand, come into play when you need to calculate and return a value as part of a SELECT SQL query, like revenue for a given sales executive or employee bonuses.

The invocation of stored procedures uses the CALL keyword as an independent statement:

CALL my_stored_procedure(input_param);

UDFs, however, are called as part of a SELECT statement:

SELECT column1, my_udf(input_parameter) FROM table1;

Another distinction lies in the return value; stored procedures may not return anything, simply performing a task. UDFs, on the contrary, must always return a value.

Let's explore some typical use cases and determine whether stored procedures or UDFs are more suitable:

Use case	UDF or Stored Procedure?
Creating a new user and dedicated warehouse	Stored Procedure
Deleting all temporary tables	Stored Procedure
Cleaning up unused tables	Stored Procedure
Determining the city from an IP address	UDF
Extracting browser type from a user agent string	UDF
Calculating discounts for orders	UDF
Loading a table from a stage	Stored Procedure

Supported Programming Languages

Snowflake supports various languages for developing stored procedures. Choosing the right one depends on several factors:

Personal language preference and knowledge
Availability of necessary libraries
Consistency with existing code in supported languages
Whether you want to keep the code in-line or externally (as a standalone file in a stage)

Here is a summary of available languages along with information on whether they support in-line or staged handling:

Language	Handler Location
Java	In-line or staged
JavaScript	In-line
Scala	In-line or staged
Snowflake SQL Scripting	In-line

Both in-line and staged options have their pros and cons. In-line development may be simpler, allowing you to fine-tune the code and add it directly to the SP definition. Staged handling is advantageous for compiled code (Java, Scala), enabling the reuse of existing compiled code in Snowflake stored procedures.

How to Create a Stored Procedure

Using Snowflake SQL Scripting

Let's delve into the syntax of creating stored procedures (SP) using the Snowflake SQL Scripting language and decipher the meaning behind different keywords. As shown in the picture, a stored procedure consists of the following:

Stored procedure name
Input parameters (arguments) along with their data types
The return type of the stored procedure output
The language specification
The body of the stored procedure, where the actual logic will live

The hello_world stored procedure above simply returns the message passed in as an argument. By calling it, we can see the output printed:

Using Javascript

Let's modify the hello_world example and create it in JavaScript. Here is the code for the same procedure but now in JavaScript:

create or replace procedure hello_world_javascript(message varchar)
returns varchar not null
language javascript
as
$$
    return message;
$$;

We defined JavaScript as the language, using different characters to enclose the procedure body ($$).

Calling the procedure is the same:

Snowflake call javascript stored procedure

Using Python

In this last example, we'll create the same hello_world procedure in Python:

create or replace procedure hello_world_python(message varchar)
returns varchar not null
language python
runtime_version = 3.11
handler = 'hello_world'
packages = ('snowflake-snowpark-python')
as
$$
def hello_world(session, message):
    return message;
$$;

You can notice multiple changes compared to SQL Scripting or JavaScript. Let’s update our diagram and highlight the changes:

For Python, we must specify the Python version to use
We must define a handler. This is the python function that will be run in the stored procedure
We must specify any Python packages we will use in our stored procedure

Snowflake Python stored procedure syntax

Calling the Python stored procedure is the same as any other language:

Snowflake Python stored procedure output

Calling (Running) Stored Procedures

Next, let’s talk about how you can run (or “call”) a stored procedure, especially with arguments.

Using Arguments in Stored Procedures

Stored procedures support defining arguments that you can pass from outside. We’ve used the argument called message in our hello_world example. You can also see how to specify the value when calling the stored procedure.

Arguments can be optional if they have a defined default value in SP definition. In that case, you can omit the argument when calling SP, and the default value will be used. Let’s modify our example to use an optional argument.

create or replace procedure hello_world(message varchar DEFAULT 'Hello from Stored procedure written in SQL scripting!')
returns varchar not null
language sql
as
begin
  return message;
end;

Working with Optional and Required Arguments

In case you have both optional and required arguments, you have to specify the required ones first. Now we can call the procedure without the argument value, and Snowflake will use the default one.

Snowflake call stored procedure no argument

But, of course, we can overwrite the default value and pass a new value when calling the procedure:

Snowflake call stored procedure with argument

How to Use Bind Variables in a SQL Statement?

Arguments are also often used to pass values into SQL queries as bind variables inside stored procedures. This allows the SQL query to be dynamically created based on the arguments. Examples of this include passing a date, customer identifier, or table name that should be cleared. Let’s create a simple example where we are going to dynamically construct a SELECT statement and return values from a table for a given id:

-- create a test table
create table SP_TEST (
    id number,
    value varchar
);

-- insert data
insert into SP_TEST values (1, 'value 1');
insert into SP_TEST values (2, 'value 2');

-- create a stored procedure
CREATE PROCEDURE DYNAMIC_QUERY(ID number)
RETURNS VARCHAR
language sql
as

The stored procedure code contains two new concepts. First is how to bind input arguments to SQL statements. Here we use a colon character before the argument name :id. The code also declares an output variable and passes the result of the SELECT statement into this variable as part of SQL code. This is done by the :into <variable_name> syntax.

Privileges and Execution Types

Stored procedures are database objects like tables or views. This means they are owned by a role. Apart from the OWNERSHIP privilege, there is also a USAGE privilege on the SP that can be granted to other roles. There is one more topic related to privileges and running stored procedures. Snowflake supports two different modes for how you can interact with them. You can either use Caller’s Rights or Owner’s Rights. When you create a procedure, you define whether you want to use the caller’s or owner’s rights. The default option is owner’s rights. Let’s explain both modes.

Caller’s Rights

When you run a SP with caller’s rights, it means the SP uses the privileges that the caller has. The procedure also has access to the caller’s session information - it can access session variables and use them in queries or modify them. Such session changes will persist after the SP call is done. SP uses database privileges that the caller has - it can access the same DB object as the caller. If SP contains any statement or DB object that the caller doesn't have permission to use, then the SP will raise a permissions error.

Owner’s Rights

In this case, the procedure runs with the rights of the owner and is decoupled from the caller. This means the SP can do things that the caller can’t do directly. This is a great example of how to delegate some tasks to other roles without giving them the privileges to do so. You can create an SP to delete old data from tables and give it to users for use without giving them the DELETE privilege on those tables. The procedure running with owner’s rights does not have access to the session information of the caller and can’t modify it either. SP also does not have access to variables created outside the SP - if that is needed, you have to pass the value as an input argument.

What Mode to Use?

It depends on your needs. In case you want to delegate something to other users, use the owner’s rights. Use the same mode if it doesn’t prevent SP from working properly or you don’t want to expose the procedure code to callers. On the other hand, use the caller’s rights mode if you need to have access to current session information or the procedure uses only objects that the caller owns or can use. Let’s summarize both options in the table:

Caller’s Rights	Owner’s Rights
Runs with privileges of the caller	Runs with privileges of the SP owner
Has access to caller’s current session	Limited access to caller’s session
Changes in session persist after end of SP call	Can’t change session state
Can view, set, unset caller’s session variables and parameters	Can’t view, set, unset caller’s session variables and parameters
Can use only objects and operations which caller can do	Decoupled from caller’s privileges

Return Types

Procedures might return a value (number, string, boolean). It is also very very common for a SP to return nothing, since it may just execute some code and not need to return anything.

A special return type from SPs is a table. Let’s go through examples and show how to write a procedure without a return value and a procedure that returns a table.

create or replace procedure no_return_value()
returns varchar null
language sql
as
begin
    select 1;
    select 2;
    select 3;
    select 4;
end;

When you check CREATE PROCEDURE statement in docs, you will find out that RETURNS keyword is mandatory. To define stored procedure with no return value you have to define some return data type and omit the return keyword in SP code block. It’s not necessary to define return type specifically with the NULL definition. The code would work without that as well, i.e. RETURNS VARCHAR.

When you call such a procedure, you will get a null value as output:

Snowflake call stored procedure with no output

To return a table, you have to specify the return type as a table. If you know the data types of returned columns you can specify them right away with column names. Otherwise, you can do it during runtime.

To return a table, we need to introduce one more data type called RESULTSET. This data type can hold the result of a SQL query. If you need to process it further you can iterate over rows or just return the result as we are going to do in this example:

create or replace procedure returning_table()
returns table(id number, name varchar)
language sql
as
declare
    result RESULTSET DEFAULT (SELECT 1 id, 'test value' name);
begin
    return table(result);
end;

In this example, we assign a default value to our result variable and then return it inside the table() function. When you call this SP you will get a table with two columns as output:

Snowflake call stored procedure with table return value

Execute Multiple SQL Statements by Looping Through Results

We have gone through basic concepts around writing stored procedures in Snowflake. Let’s try to put all those concepts together now and create some real use cases to showcase how to use stored procedures for real tasks. One of the common administrative tasks could be executing multiple SQL statements dynamically created based on the results of another query. For instance, you might want to clean all empty tables. You have to find all tables with 0 records first and then iterate through and delete them. You can solve it with the following stored procedure:

create or replace procedure clean_empty_tables(db_name varchar)
returns varchar
language sql
as
declare
    table_list RESULTSET DEFAULT (SELECT TABLE_NAME, TABLE_TYPE, TABLE_SCHEMA FROM SNOWFLAKE.ACCOUNT_USAGE.TABLES WHERE table_catalog = :db_name and row_count = 0 and deleted is null);
    c1 CURSOR FOR table_list;
    drop_statement varchar;
begin
    for record in c1 do
        drop_statement := 'DROP ' || record.table_type || ' ' || :db_name || '.' || record.table_schema || '.' || record.table_name || ';';
        execute immediate drop_statement;
    end for;
return 'cleaning done;';
end;

Let’s go through the code to explain the logic.

We pass the database name as a parameter to clean tables and views only in the defined DB.
We declare a RESULTSET variable table_list holding the empty table names as a result of the query.
We also define a cursor for iterating through the results and a variable for the DROP statement which we will be dynamically creating and executing in the loop.
The procedure body then just goes through the query result row by row and constructs a drop statement based on values retrieved from the cursor. We listed both views and tables but if you want to clean only tables, you have to modify the input query.
Once the drop statement is created we just execute it through EXECUTE IMMEDIATE.

When you call the stored procedure it should return a message 'cleaning done'. You can find the dropped tables/views in the query history or you can modify the return message and return also dropped object names.

Snowflake stored procedure query history

Documenting Stored Procedures

A stored procedure is a piece of code and as such, it should be properly commented to simplify future maintenance and usage. There are multiple options on how you can do it. Either you will leverage some external tool like a data catalog or internal wiki and describe the stored procedure there. You should always focus on two aspects:

Documentation for users and callers
Documentation for programmers

Here are some examples of what should be included in the documentation:

Describe the logic of the stored procedure and intention
Describe authors
Describe location - database and schema
Describe input parameters - names, data types, and meaning
Describe return values, possible errors, and exceptions
Describe prerequisites
What are the needed privileges

You can add comments directly into the source code and describe the algorithm. The last recommendation from my side would be to keep the stored procedures' source code under version control (GIT). Remember that stored procedures are not part of Time Travel in Snowflake. You can’t get the previous version of the code directly in Snowflake.

Stored Procedure Examples

Let's summarize all the stored procedure examples we showed in this post for quick reference:

Hello World

SQL Scripting

create or replace procedure hello_world(message varchar)
returns varchar not null
language sql
AS
begin
    return message;
end;

Javascript

create or replace procedure hello_world_javascript(message varchar)
returns varchar not null
language javascript
as
$$
    return message;
$$;

Python

create or replace procedure hello_world_python(message varchar)
returns varchar not null
language python
runtime_version = 3.11
handler = 'hello_world'
packages = ('snowflake-snowpark-python')
as
$$
def hello_world(session, message):
    return message;
$$;

Dynamically generate SQL query using arguments

-- create a test table
create table SP_TEST (
    id number,
    value varchar
);

-- insert data
insert into SP_TEST values (1, 'value 1');
insert into SP_TEST values (2, 'value 2');

-- create a stored procedure
CREATE PROCEDURE DYNAMIC_QUERY(ID number)
RETURNS VARCHAR
language sql
as

Return a table from a SQL query

create or replace procedure returning_table()
returns table(id number, name varchar)
language sql
as
declare
    result RESULTSET DEFAULT (SELECT 1 id, 'test value' name);
begin
    return table(result);
end;

Execute multiple SQL statements

create or replace procedure clean_empty_tables(db_name varchar)
returns varchar
language sql
as
declare
    table_list RESULTSET DEFAULT (SELECT TABLE_NAME, TABLE_TYPE, TABLE_SCHEMA FROM SNOWFLAKE.ACCOUNT_USAGE.TABLES WHERE table_catalog = :db_name and row_count = 0 and deleted is null);
    c1 CURSOR FOR table_list;
    drop_statement varchar;
begin
    for record in c1 do
        drop_statement := 'DROP ' || record.table_type || ' ' || :db_name || '.' || record.table_schema || '.' || record.table_name || ';';
        execute immediate drop_statement;
    end for;
return 'cleaning done;';
end;

Send alerts for suspended tasks

Here's an example from a previous blog post. This stored procedure, named task_state_monitor, takes a task_name as a parameter and checks the state of the task. If the state is 'suspended', it sends an email alert and returns a message. Otherwise, it returns a message indicating that the task state is okay.

create or replace procedure task_state_monitor(task_name string)
returns varchar not null
language SQL
AS
$$
DECLARE
    task_state string;
    c CURSOR FOR SELECT "state" from table(result_scan(last_query_id())) where "name" = ?;
BEGIN
    show tasks;
    open c USING (task_name);
    fetch c into task_state;
    IF(task_state = 'suspended') THEN 
        CALL SYSTEM$SEND_EMAIL(
            'my_email_int',