Information Security at SELECT

Date

This document describes some of our key security practices and outlines our approach to continuously enhancing the security of our systems.

While not intended to outline every policy, procedure, and control at SELECT, we expect this document will act as a starting point for conversation - demonstrating our commitment to providing a secure and reliable platform for Snowflake metadata ingestion and analysis.

For additional information, or to discuss security-related questions, do not hesitate to contact us at [email protected].

Niall Woodward

CTO & CISO, SELECT


Security Operations

Our Team

Our team has expertise managing Snowflake accounts, analytics infrastructure, and data-driven products for public companies. We were previously responsible for managing data warehouse optimization for hundreds of companies, including many companies in the Fortune 500.

We know the value of privacy and security by design, and have architected our team, our operations, and our systems with this in mind. Everyone on our team has spent time working with major cloud infrastructure providers and data warehouses, and is well versed in cloud security. We limit access to our own Google Cloud infrastructure to the minimum number of team members.

Our Operations

To iterate quickly on our operations, we manage key company policies, standard operating procedures, and security controls through Drata, our compliance management partner. This provides version control, automated notifications, and change logs for each policy, procedure, and control.

Policies

It is important to set expectations for our leadership and our employees. As such, we maintain policies covering secure software development, data protection, vulnerability management, and more. Reach out to our team to obtain a copy of our policies for review.

Monitoring

We monitor our production systems and our critical vendors for availability and address issues in a timely manner. Each policy, procedure, and control has a named owner, and we conduct regular reviews to ensure our operations remain aligned with the data we process, the risks we face, and the clients we serve.

Controls

We have detailed a number of our security controls in the sections that follow. For additional information, or questions about our other controls, reach out to [email protected].

Our security model

Secure by Design

SELECT only requires read access to a customer's Snowflake account metadata database. This database only includes metadata about how the customer is using Snowflake. No actual customer data or sensitive information is stored in this database. Some examples of this information include, but are not limited to:

  • The number of tables in each database, and the size of each table
  • The amount the customer was billed by Snowflake on a particular day
  • How frequently Snowflake's automatic clustering service is running
  • Metadata about the queries being run in an account (query runtime, tables accessed, etc.)

This data is stored in SELECT's own Snowflake account where insights can be derived and presented to our customers. We follow a principle of least privilege, and only extract the minimum subset of metadata required for SELECT's services.

Zero access to any customer data!

We do not have read or write access to any of the customer's data that is stored in Snowflake. Access is tightly controlled during onboarding, when the customer creates a new user for SELECT with an extremely limited set of permissions.

SELECT Architecture

The security features discussed above can be further visualized in the diagram below outlining SELECT's secure & limited data access architecture:

  1. A user with limited, read-only access to the Snowflake metadata database is created by the customer in their Snowflake account. This user can only access the Snowflake metadata database and does not have any access whatsoever to customer data.
  2. SELECT uses this user to extract a subset of the customer's Snowflake metadata into our Snowflake account.
  3. Insights are derived from this dataset and presented to the customer in SELECT's web application.

Figure: SELECT system architecture with limited access
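
As a rough illustration of step 1, the customer-side setup might look like the statements below. This is a sketch rather than SELECT's actual onboarding script: the names select_role, select_user, and select_wh are hypothetical, and it assumes a warehouse already exists for running the metadata queries.

```sql
-- Hypothetical names throughout: select_role, select_user, select_wh.
-- Run by the customer with a sufficiently privileged role.

create role if not exists select_role;

-- Read-only access to the shared SNOWFLAKE metadata database.
-- No privileges are granted on any customer database, schema, or table.
grant imported privileges on database snowflake to role select_role;

-- Assumption: a warehouse is needed to execute queries against the views.
grant usage on warehouse select_wh to role select_role;

create user if not exists select_user
    default_role = select_role
    default_warehouse = select_wh;

grant role select_role to user select_user;
```

The role in this sketch carries only the privileges shown, so it has no path to the customer's own tables.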

In addition to being secure by design and not accessing or storing any customer data, SELECT also follows various security and software engineering best practices, as outlined in the rest of this document.

Query Text Sanitization

While rare, we recognize it is possible that Snowflake users within a customer's company may inadvertently include sensitive information in their query text. For example, an engineer may be debugging an issue and query all notifications sent to a particular user. They may even store some notes to themselves in the query comments:

/*
Customers to investigate:
1. Joe Smith - [email protected] - 123456789
2. Steve Jones - [email protected] - 987654321
*/
select
    notification_id,
    date_sent
from notifications_sent
where
    email = '[email protected]'
    or phone_number = 123456789 -- 987654321

SELECT is designed for this worst-case scenario and can strip out any literal values and sensitive comments before storing the metadata in our account. This functionality can be enabled upon request. Using the query example above, we would store only the following query_text in our database:

select
    notification_id,
    date_sent
from notifications_sent
where
    email = $1
    or phone_number = $2

Similar scrubbing can be performed across any free-form text fields ingested from the customer's Snowflake account metadata database into SELECT's database.

What Snowflake metadata do we access?

SELECT accesses Snowflake usage metadata to present users with insights and recommendations related to cost & performance optimization. More information on the exact views we access and their purpose is provided below.

Account Usage

The following views from the account_usage schema are accessed. All views contain metadata about the customer's Snowflake usage. Examples include performance statistics about historical queries run, billing amounts for different Snowflake services, and performance data for virtual warehouses. Please refer to the Snowflake documentation for each view if additional information is required. The account usage views accessed are required to present customers with comprehensive cost and performance insights.

  • snowflake.account_usage.query_history
  • snowflake.account_usage.warehouse_events_history
  • snowflake.account_usage.warehouse_load_history
  • snowflake.account_usage.warehouse_metering_history
  • snowflake.account_usage.stage_storage_usage_history
  • snowflake.account_usage.database_storage_usage_history
  • snowflake.account_usage.storage_usage
  • snowflake.account_usage.metering_daily_history
  • snowflake.account_usage.metering_history
  • snowflake.account_usage.task_history
  • snowflake.account_usage.task_versions
  • snowflake.account_usage.serverless_task_history
  • snowflake.account_usage.automatic_clustering_history
  • snowflake.account_usage.materialized_view_refresh_history
  • snowflake.account_usage.pipe_usage_history
  • snowflake.account_usage.query_acceleration_history
  • snowflake.account_usage.search_optimization_history
  • snowflake.account_usage.replication_usage_history
  • snowflake.account_usage.access_history
  • snowflake.account_usage.tables
  • snowflake.account_usage.table_storage_metrics
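
To give a sense of the kind of metadata-only query these views support, the sketch below totals warehouse credits per day over the last 30 days. It is an illustrative example, not one of SELECT's own queries, and the output column names usage_date and total_credits are chosen here for readability.

```sql
-- Illustrative only: daily credit consumption per virtual warehouse.
-- Reads usage metadata; no customer tables are referenced.
select
    warehouse_name,
    date_trunc('day', start_time) as usage_date,
    sum(credits_used) as total_credits
from snowflake.account_usage.warehouse_metering_history
where start_time >= dateadd('day', -30, current_timestamp())
group by warehouse_name, date_trunc('day', start_time)
order by usage_date, warehouse_name;
```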

Organization Usage

The following views from the organization_usage schema are accessed:

  • snowflake.organization_usage.contract_items: Contains information about a customer's current Snowflake contract. We use this to help provide users with budgeting forecasts.
  • snowflake.organization_usage.remaining_balance_daily: Contains information about a customer's remaining contract balance. Required to determine the effective rates to apply when calculating costs and for budget forecasting.
  • snowflake.organization_usage.rate_sheet_daily: Contains information about the effective rates applied on each day. Required to calculate spend data.
  • snowflake.organization_usage.usage_in_currency_daily: Contains information about how much a customer is being billed each day. Required to provide customers with Snowflake spend analytics.

Security Essentials

Fixed IP address

We support Snowflake’s network policies for customers who want to restrict inbound traffic to trusted IP address ranges. Please reach out to [email protected] to receive the fixed IP address ranges for our cloud infrastructure.
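
A minimal sketch of how a customer might apply such a restriction to the SELECT user is shown below. The policy and user names are hypothetical, and 192.0.2.0/24 is a documentation-only placeholder range, not a SELECT address. The real ranges must be requested from our team.

```sql
-- Hypothetical names: select_network_policy, select_user.
-- 192.0.2.0/24 is a placeholder; substitute the ranges provided by SELECT.
create network policy select_network_policy
    allowed_ip_list = ('192.0.2.0/24');

-- Apply the policy to the SELECT user only, leaving other users unaffected.
alter user select_user set network_policy = select_network_policy;
```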

Key-Pair Authentication

SELECT's service requires a new Snowflake user with read-only access to the Snowflake metadata database to be created. In addition to limited access and firewall restrictions, SELECT customers can further secure this user by using key-pair authentication instead of the traditional username & password approach.
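
As a sketch of what enabling key-pair authentication involves on the customer side, assuming an RSA key pair for the user has already been generated and using the hypothetical user name select_user:

```sql
-- Hypothetical user name: select_user.
-- Replace the placeholder with the public key body, excluding the
-- BEGIN/END PUBLIC KEY delimiter lines.
alter user select_user set rsa_public_key = '<public_key_contents>';

-- Confirm the key was registered by checking the RSA_PUBLIC_KEY_FP property.
describe user select_user;
```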

Serverless Cloud Infrastructure

We do not maintain any of our own physical infrastructure and rely on Google Cloud Platform, our cloud provider, to host SELECT. We make use of serverless infrastructure wherever possible to ensure systems are automatically and regularly updated, continually monitored, and assessed for vulnerabilities. Google Cloud provides an extensive list of compliance assurances, including SOC 1, SOC 2, SOC 3, PCI, and ISO 27001.

We use software development best practices

This includes version control, declarative infrastructure, service-oriented architecture, and test-driven development. We release changes to production environments via continuous integration and continuous deployment (CI/CD).

Hosting

All of our production systems and databases run in Google Cloud facilities in the US East regions. For full information on the measures Google has implemented to secure their facilities, visit the Google Cloud Compliance page.

Physical & environmental security

SELECT relies on Google Cloud and their robust controls to manage the physical and environmental security of our systems. Visit the Google Cloud Compliance page for more information.

Encryption at rest and in transit

All application web traffic (in transit) uses HTTPS encryption and data stored (at rest) is encrypted by Google Cloud Platform with AES-256 encryption. We make full use of Google's secret management portfolio to store sensitive data like API keys. You can learn more about their capabilities for encryption at rest and in transit.

Password Policies

For third-party software, we use Google as our SAML provider for Single Sign On when available, and we enforce two-factor authentication whenever possible. We have defined best practices for password creation, and when SSO is not available, we require employees to use the 1Password password manager to generate and store secure passwords.

Access Control - Secrets and Snowflake Metadata

Client secrets are provided directly by users, who must be authenticated via our web application using a verified email address. Secrets are stored programmatically and securely. Production systems are restricted so that application servers are granted access only when needed. Access to these production systems is restricted to our core engineering team.

Client Access Control - Production Systems

For our production systems, SELECT leverages Auth0 by Okta for client authentication to ensure secure access to our application (you can read more about Auth0's security practices here).

Internally, role-based access control is in place to protect our code base and production systems. Access is granted on a need-to-know basis, following the principle of least privilege.

Security Measures

Monitoring & Incident Response

We have automated monitoring and alerting for our critical systems and services. To handle and resolve issues that arise, our engineering team maintains a 24/7 on-call rotation.

Personal Account Information

We aim to minimize the amount of personal data we collect and store about our clients; however, we do collect and store information such as name, email address, and billing address in the normal course of business. You can learn more about the data we collect in our privacy policy.

Additionally, we may use tools to track usage of our product, such as analytics tools and server logs, which may receive information such as IP address and, potentially, email address and/or name.

When personal data is stored within our application, it is stored securely and encrypted while in transit and at rest.

Risk Assessment & Risk Management

It is important to constantly re-evaluate the risks to our business, to evaluate the effectiveness of our operations, and to constantly improve our controls.

As such, we track our IT assets and review access on a regular cadence, update our architecture diagrams whenever we make large changes to our systems, and re-evaluate the risks to our business on a continuous basis. When we sign contracts, we review them to make sure our policies, procedures, and controls align with the expectations of our clients.

We strive for a culture of open dialogue within the company about the latest security threats and best practices, and our leadership team is expected to stay up to date on the latest regulatory and compliance considerations that are relevant to our business.

Vulnerability Disclosures

If you identify a security concern with SELECT, please contact [email protected]. We will review your disclosure, respond to you within five business days of receipt, and take the necessary steps to remediate.

Please make a good-faith effort to avoid privacy violations as well as destruction, interruption, or degradation of services and/or data.

Snowflake Metadata Handling

How is Snowflake metadata managed on the SELECT platform?

SELECT is a cloud-based SaaS web application hosted on Google Cloud Platform in the US East regions. Customer Snowflake metadata is securely copied into SELECT's Snowflake account, hosted in the same cloud region.

The SELECT application is multi-tenant. Snowflake metadata collected for each customer is segregated and stored in a dedicated cloud storage bucket and Snowflake database schema.

What metadata is stored?

Some examples of the Snowflake metadata stored include, but are not limited to:

  • The amount each virtual warehouse was billed per hour
  • The number of tables in each database, and the size of each table
  • The amount the customer was billed by Snowflake on a particular day
  • How frequently Snowflake's automatic clustering service is running
  • Metadata about the queries being run in an account (query runtime, tables accessed, etc.)

See above for a full list of metadata we access.

How is it stored?

Snowflake metadata is stored in both Google Cloud Storage and Snowflake. Data is automatically encrypted at rest using AES-256 encryption in all locations. All application web traffic uses HTTPS encryption.

What are the retention policies?

We store a subset of the customer’s Snowflake metadata in our cloud environment. Most metadata we store contains 365 days of usage information. This data is retained as long as the customer is using the product, and is automatically deleted once the customer stops using the product.

FAQ

What region(s) are our services located in?

Our platform is hosted on Google Cloud Platform and Snowflake in the US East regions.

Does SELECT have access to any of our company data in Snowflake?

No. We only have access to the Snowflake metadata database. This includes object metadata and usage metrics for your account. For example, the number and names of the tables in your databases, or the historical queries that have been run in the account. We cannot access any of the underlying tables or datasets in your account(s).

How does SELECT mitigate the risk of customer data being exposed in raw query text?

As discussed in our query text sanitization practices, we do not store any raw query text. All numbers and strings are removed from the query text before it is stored in our databases.

Does SELECT hold any security certifications such as SOC 2 or ISO 27001?

Yes, we are SOC 2 Type 2 certified. Please reach out to [email protected] to request a copy of our SOC 2 report.

How is customer data protected, and who has access to SELECT data?

All data is encrypted in transit and at rest. Access to a Snowflake customer's metadata is restricted to scenarios which warrant access, such as investigating application issues. In the event access is required, the employee must submit an access request detailing their use case and rationale, and receive approval, before being temporarily granted access to the customer's metadata.

Does SELECT support 2FA, SSO, or any other defensive options?

Yes. We support most SSO methods (Okta, Microsoft Azure/Entra AD, etc.) and 2FA.

In terms of application security, how does SELECT deal with security reports received from security researchers?

SELECT receives reports at [email protected]. We review every report that we receive. We do not have a formal bug bounty program, but we do follow a defined process and a set of policies and standards when handling security reports.

Do you have a security contact person in case of breaches?

The team responds to messages at [email protected]. Upon request, a security employee can also be temporarily assigned to your account.

In terms of logging, do you log access activities of SELECT's employees who have access to the data?

Yes.