Ever since I started using Snowflake, there's one thing I always thought was missing: a native Git integration. Without version control, it was easy to mess up your infrastructure, but thankfully this is no longer the case. Snowflake's Git integration was released in April 2024 and was a feature I personally requested multiple times! Let’s take a closer look at this feature, explain how to use it, and showcase some typical use cases.
The Snowflake Git integration allows you to integrate your Snowflake account with one of the supported Git platforms (I.e. GitHub, GitLab, etc.) natively and synchronize the contents of a remote repository with your Snowflake account. For this synchronization, Snowflake uses a special type of stage called a repository stage. This repository stage is then synchronized with the remote Git repository and becomes a local repository with a full clone of the remote repository, including everything you would expect—branches, commits, etc.
With this synchronization, you have a full clone of the repository available directly in your Snowflake account. You can use the fetched files in your Snowflake applications or write UDFs/stored procedures where the handler code is stored in a remote Git repository and synchronized with the repository stage. This allows you to keep it under version control easily and refresh the stage whenever the repository is updated.
You can also use any file from any branch or commit directly in Snowflake.
Thanks to this integration, you can simplify the development lifecycle for your code if you want to keep it under version control. Let’s take a stored procedure handler code as an example. Before this integration, you had to manage version control on your own and outside of Snowflake. You wrote and tested your handler code in VS Code. When you finished development, you had to commit the changes to a remote repository to keep them under version control. At the same time, you also had to deploy the code into Snowflake and create a stored procedure with that code as the handler.
This meant either manually uploading the code to a Snowflake stage and then creating the stored procedure or using the Snowflake CLI to handle both uploading and stored procedure creation. Either way, this was an additional step that was not directly connected to your versioned code. If you needed to modify the code, you had to go through the entire process again—develop the code, commit it to the repository, and update the handler file in Snowflake.
Let’s look at how the native Git integration can streamline this process. Since you can reference the handler code directly in the repository stage, whenever you commit changes and synchronize your repository stage, the stored procedure handler is automatically updated and remains under version control. No more multiple standalone processes (version control & SP updates)—just a single streamlined workflow.
The Snowflake Git integration allows you to connect Git repositories from the following supported platforms:
In this example, we are going to write a simple Hello World stored procedure handler in Python, commit the changes to a remote GitHub repository, and bring it into Snowflake through the Git integration. Then, we will modify the handler code and see how easily we can update the SP handler code in Snowflake thanks to the Git integration. First, let’s look at a simple diagram illustrating what we are going to build:
Let's walk through each step in more detail.
I created a simple handler function called hello_world and pushed it to my repository. Now, we can proceed with creating the necessary objects in Snowflake and linking this remote repository to a new repository stage in Snowflake.
Let’s start with secret creation. I use a personal access token for authentication.
We also need a new API integration to connect Snowflake with GitHub. The API_ALLOWED_PREFIXES
parameter points to my GitHub account URL, and for authentication, we use the secret created in the previous step.
Finally, we can create a Git repository stage and pass the values we created earlier. The origin is our Git repository that we want to connect to.
We have a repository stage in place; let’s sync it with the remote repository.
And that’s it! Now, we have a working integration between Snowflake and the remote Git repository. Our stored procedure handler is sourced from the repository stage, which is synchronized with the remote repository. Let’s create a stored procedure:
How can we try running it?
Let’s now modify the code to see how easily we can fetch updates in Snowflake. Again, we are going to make these updates locally in VS Code and commit them to the remote repository. Let’s make a simple change to convert the message to uppercase.
Once the changes are committed to the repository, we have to update (synchronize) our repository stage in Snowflake.
ALTER GIT REPOSITORY snowflake_git_demo FETCH;
Now, we have the updated handler code available in Snowflake! That’s it—nothing more is needed. No procedure recreation or anything like that. We can just run the stored procedure to verify that the code is updated:
There is a way to automate this. Suppose you have a standard way of working with Git, where you keep your code changes in standalone feature branches that need to be merged into the main branch. We can build a GitHub Actions workflow that runs the ALTER GIT REPOSITORY
command when the PR is merged, so the repository stage is automatically updated every time you push new code into your main branch!
Here is a simple example. You can use either SnowSQL or SnowCLI to run the SQL statement.
TORY command when the PR is merged, so the repository stage is automatically updated every time you push new code into your main branch!
Here is a simple example. You can use either SnowSQL or SnowCLI to run the SQL statement.
name: Deploying Stored procedure updates
env:
SNOWSQL_ACCOUNT: ${{secrets.SF_ACCOUNT}}
SNOWSQL_USER: ${{secrets.SF_USER}}
SNOWSQL_PWD: ${{secrets.SF_PASSWORD}}
SNOWSQL_DATABASE: ${{ secrets.SF_DATABASE }}
SNOWSQL_SCHEMA: ${{ secrets.SF_SCHEMA }}
SNOWSQL_ROLE: ${{ secrets.SF_ROLE }}
SNOWSQL_WAREHOUSE: ${{ secrets.SF_WAREHOUSE }}
on:
And here is the output from the GitHub Actions workflow run:
Now that we've walked through how to set up the Git integration, there are more features which you may want to explore:
Our example showed just two operations related to Git repositories in Snowflake: the creation of the repository stage and the ALTER command for fetching remote updates. Let’s go through other examples of what you might want to do with your repositories:
List repository branches
We can list all branches in our repository stage, along with the path and commit hash.
SHOW GIT BRANCHES IN snowflake_git_demo;
List repository files
Similarly to listing files in any internal or external stage, we can also list the files in the repository stage using the LIST command:
LS @snowflake_git_demo
Repository stage
But for the repository stage, we can list files by individual branch name:
LS @snowflake_git_demo/branches/main
Individual commit hash
LS @snowflake_git_demo/commits/<<add_my_commit_hash>>
List files by tag name
LS @snowflake_git_demo/tags/tag_name
As with many other Snowflake objects, there are also SHOW
and DESCRIBE
commands for repository stages that can display useful metadata, such as where your repository stage is located (database and schema name), what the origin of the remote repo is, which API integration is used to connect Git and Snowflake, and what the secret name is that holds the secret for the remote Git repository connection.
Keeping files in a repository is useful, but what about running the code stored in those files? Have you thought about that? Snowflake has the EXECUTE IMMEDIATE FROM
command, which allows you to run SQL code directly from files. You can store your account configuration (user, roles, warehouse creation) in SQL files in the repo, and thanks to this command, run it directly from the file:
EXECUTE IMMEDIATE FROM @snowflake_git_demo/branches/main/sql/users.sql
You can also copy code from files stored in the repository stage into your worksheets. You can copy content from either .sql or .py files. Then, you can edit or run such code in Snowsight worksheets. You just need to navigate to the file you want to copy from and then select the Copy into worksheet option.
If you want to save your changes back into the repository, you have to do it in your local copy of the repository because currently, repositories are read-only in Snowflake, except for in Notebooks, which can also write back. So you can only use Notebooks to write back to the repositories. Other limitations of the feature at the time of writing include:
The end-to-end example showed one of the possible use cases for using the Git integration in Snowflake: keeping your procedure/UDF code under version control in a remote repository. But it’s not the only use case for this feature.
Git integration is also crucial for managing your account using DevOps practices. You can keep the definitions of your Snowflake objects (databases, warehouses, users, roles, etc.) in a remote repository as SQL files with CREATE
or ALTER
definitions and run them as part of your CI/CD pipelines. Thanks to repository stages, you have a copy of those definitions available right in Snowflake, and you can run the code there directly without needing any external runner to deploy those objects!
Another use case could be related to your native/Streamlit apps, where you can have the code natively integrated with remote repositories. Last but not least, if you use Snowflake for your data transformations—DAGs with tasks or dynamic tables—you can keep the pipeline definitions under version control and easily integrate them with repository stages and the Snowflake environment.
Tomas is a longstanding Snowflake Data SuperHero and general Snowflake subject matter expert. His extensive experience in the data world spans over a decade, during which he has served as a Snowflake data engineer, architect, and admin on various projects across diverse industries and technologies. Tomas is a core community member, actively sharing his expertise and inspiring others. He's also an O'Reilly instructor, leading live online training sessions.
Get up and running with SELECT in 15 minutes.
Gain visibility into Snowflake usage, optimize performance and automate savings with the click of a button.