Information Security at SELECT
This document describes some of our key security practices and outlines our approach to continuously enhancing the security of our systems.
While not intended to outline every policy, procedure, and control at SELECT, we expect this document will act as a starting point for conversation - demonstrating our commitment to providing a secure and reliable platform for Snowflake metadata ingestion and analysis.
For additional information, or to discuss security related questions, do not hesitate to contact us at [email protected].
Ian & Niall
Our team is entirely based in Canada and the United Kingdom, and has expertise managing Snowflake accounts, analytics infrastructure, and data-driven products for public companies. We were previously responsible for managing data warehouse optimization for multiple companies, many of them in the Fortune 500.
We know the value of privacy and security by design, and have architected our team, our operations, and our systems with this in mind. Everyone on our team has spent time working with major cloud infrastructure providers and data warehouses, and is well versed in cloud security. We limit access to our own Google Cloud infrastructure to the minimum number of team members necessary.
To ensure that we are able to iterate quickly when it comes to operations, we manage key company policies, standard operating procedures, and security controls through GitHub. This provides version control, automated notifications, and change logs for each policy, procedure and control.
It is important to set expectations of our leadership and our employees. As such, we have two key manuals:
- Employee Manual that includes our code of conduct, our exceptions policy, and our disciplinary policy.
- Security Manual covering access control, acceptable use, encryption, incident response, password management, and our software development lifecycle (among other key items).
We monitor our production systems and our critical vendors for availability and address issues in a timely manner. Each of our policies, procedures, and controls has a named owner, and we conduct regular reviews to ensure our operations remain aligned with the data we process, the risks we face, and the clients we serve.
We have detailed a number of our security controls in the sections that follow. For additional information, or questions about our other controls, reach out to [email protected].
Our security model
Secure by Design
SELECT only requires read access to a customer's Snowflake account metadata database. This database only includes metadata about how the customer is using Snowflake. No actual customer data or sensitive information is stored in this database. Some examples of this information include, but are not limited to:
- The number of tables in each database, and the size of each table
- The amount the customer was billed by Snowflake on a particular day
- How frequently Snowflake's automatic clustering service is running
- Metadata about the queries being run in an account (query runtime, tables accessed, etc.)
This data is stored in SELECT's own Snowflake account where insights can be derived and presented to our customers. We follow a principle of least privilege, and only extract the minimum subset of metadata required for SELECT's services.
Zero access to any customer data!
We do not have read or write access to any of the customer's data that is stored in Snowflake. This access is tightly controlled during the onboarding process where customers create a new user for SELECT with an extremely limited set of permissions.
The security features discussed above can be further visualized in the diagram below outlining SELECT's secure & limited data access architecture:
- A user with limited, read only access to the Snowflake metadata database is created by the customer in their Snowflake account. This user can only access the Snowflake metadata database and does not have any access whatsoever to customer data.
- SELECT uses this user to extract a subset of the customer's Snowflake metadata into our Snowflake account.
- Insights are derived from this dataset and presented to the customer in SELECT's web application.
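The limited user described above could be provisioned with a handful of Snowflake statements. The following is a minimal sketch only, not SELECT's actual onboarding script: the function name, the user, role, and warehouse names, and the decision to grant warehouse usage are all assumptions made for illustration, and authentication settings (password or key pair) are omitted.

```python
import re

# Hypothetical helper: builds the SQL a customer might run to create a
# metadata-only user. All names passed in are illustrative assumptions.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")


def limited_user_statements(user: str, role: str, warehouse: str) -> list[str]:
    """Return Snowflake statements for a user that can only read the SNOWFLAKE database."""
    for name in (user, role, warehouse):
        # Reject anything that is not a plain unquoted identifier.
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"not a plain identifier: {name!r}")
    return [
        f"CREATE ROLE IF NOT EXISTS {role};",
        # Read access to the shared SNOWFLAKE metadata database only.
        f"GRANT IMPORTED PRIVILEGES ON DATABASE SNOWFLAKE TO ROLE {role};",
        # Assumption: a warehouse is needed to run the metadata queries.
        f"GRANT USAGE ON WAREHOUSE {warehouse} TO ROLE {role};",
        f"CREATE USER IF NOT EXISTS {user} DEFAULT_ROLE = {role} DEFAULT_WAREHOUSE = {warehouse};",
        f"GRANT ROLE {role} TO USER {user};",
    ]


for statement in limited_user_statements("select_user", "select_role", "select_wh"):
    print(statement)
```

Because the role receives no grants on any customer database, schema, or table, a user holding only this role cannot read or write customer data.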
In addition to being secure by design and not accessing or storing any customer data, SELECT also follows various security and software engineering best practices, as outlined in the rest of this document.
Query Text Sanitization
While rare, we recognize it is possible that Snowflake users within a customer's company may inadvertently include sensitive information in their query text. For example, an engineer may be debugging an issue and query all notifications sent to a particular user. They may even store some notes to themselves in the query comments:
/*
Customers to investigate:
1. Joe Smith - [email protected] - 123456789
2. Steve Jones - [email protected] - 987654321
*/
select
    notification_id,
    date_sent
from notifications_sent
where
    email = '[email protected]'
    or phone_number = 123456789 -- 987654321
SELECT is designed for this worst case scenario and can strip out any literal values and sensitive comments before storing the metadata in our account. This functionality can be enabled upon request. Using the same query example from above, we would only store the following query_text in our database:
select
    notification_id,
    date_sent
from notifications_sent
where
    email = $1
    or phone_number = $2
Similar scrubbing can be performed across any free-form text fields ingested from the customer's Snowflake account metadata database into SELECT's database.
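The sanitization described in this section can be sketched in a few lines of Python. This is an illustrative approximation under stated assumptions, not SELECT's implementation: the function name `sanitize_query_text` is hypothetical, the regular expression covers only simple SQL (single-quoted strings, plain numbers, non-nested comments), and the email addresses in the sample are made up.

```python
import re

# Comments and literals are matched in a single pass so that, for example,
# a "--" inside a string literal is not mistaken for a comment.
_TOKEN = re.compile(
    r"""
    (?P<block_comment>/\*.*?\*/)                    # block comment, non-nested
    | (?P<line_comment>--[^\n]*)                    # line comment to end of line
    | (?P<string>'(?:[^']|'')*')                    # string literal, '' escapes a quote
    | (?P<quoted_ident>"(?:[^"]|"")*")              # quoted identifier, kept as is
    | (?P<word>[A-Za-z_][A-Za-z0-9_$]*)             # keyword or identifier, kept as is
    | (?P<number>\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)    # numeric literal
    """,
    re.VERBOSE | re.DOTALL,
)


def sanitize_query_text(query_text: str) -> str:
    """Drop comments and replace each literal with a numbered placeholder."""
    counter = 0

    def replace(match: re.Match) -> str:
        nonlocal counter
        kind = match.lastgroup
        if kind in ("block_comment", "line_comment"):
            return ""
        if kind in ("string", "number"):
            counter += 1
            return f"${counter}"
        return match.group(0)  # identifiers and keywords are left untouched

    stripped = _TOKEN.sub(replace, query_text)
    # Remove trailing spaces and the blank lines left behind by comments.
    lines = [line.rstrip() for line in stripped.splitlines()]
    return "\n".join(line for line in lines if line.strip())


example = """
/* Customers to investigate:
   1. Joe Smith - joe@example.com - 123456789 */
select
    notification_id,
    date_sent
from notifications_sent
where
    email = 'joe@example.com'
    or phone_number = 123456789 -- 987654321
"""
print(sanitize_query_text(example))
```

Matching identifiers explicitly keeps digits inside names such as `table1` from being replaced. A production implementation would more likely rely on a full SQL tokenizer than on a regular expression.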
What Snowflake metadata do we access?
SELECT accesses Snowflake usage metadata in order to present users with insights and recommendations related to cost & performance optimization. More information on the exact views we access and their purpose is provided below.
The following views from the account_usage schema are accessed. All views contain metadata about the customer's Snowflake usage. Examples include performance statistics about historical queries run, billing amounts for different Snowflake services, and performance data for virtual warehouses. Please refer to the Snowflake documentation for each view if additional information is required. The account usage views accessed are required to present customers with comprehensive cost and performance insights.
The following views from the organization_usage schema are accessed:
snowflake.organization_usage.contract_items: Contains information about a customer's current Snowflake contract. We use this to help provide users with budgeting forecasts.
snowflake.organization_usage.remaining_balance_daily: Contains information about a customer's remaining contract balance. Required to determine the effective rates to apply when calculating costs and for budget forecasting.
snowflake.organization_usage.rate_sheet_daily: Contains information about the effective rates applied on each day. Required to calculate spend data.
snowflake.organization_usage.usage_in_currency_daily: Contains information about how much a customer is being billed each day. Required to provide customers with Snowflake spend analytics.
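To illustrate why daily effective rates are needed for spend calculations, here is a small worked example. It is a sketch under assumptions: the function name, the dictionary layout, the usage types, and all figures are hypothetical and are not taken from the actual view schemas or from SELECT's calculation logic.

```python
from decimal import Decimal


def daily_spend(usage, rates):
    """Multiply each day's usage by that day's effective rate for the same usage type."""
    spend = {}
    for (day, usage_type), units in usage.items():
        rate = rates[(day, usage_type)]  # effective rate on that specific day
        spend[day] = spend.get(day, Decimal("0")) + units * rate
    return spend


# Made-up figures: units consumed per (day, usage type).
usage = {
    ("2024-01-01", "compute"): Decimal("10"),
    ("2024-01-01", "storage"): Decimal("2"),
    ("2024-01-02", "compute"): Decimal("4"),
}
# Made-up effective rates per unit for the same keys.
rates = {
    ("2024-01-01", "compute"): Decimal("3.00"),
    ("2024-01-01", "storage"): Decimal("23.00"),
    ("2024-01-02", "compute"): Decimal("2.50"),
}

for day, amount in daily_spend(usage, rates).items():
    print(day, amount)
```

Because the rate is looked up per day, a change in the effective rate partway through a contract is reflected in the spend for the affected days only.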
Fixed IP address
We also support Snowflake’s network policies for customers who want to restrict inbound traffic to trusted IP address ranges. Please reach out to [email protected] to receive the fixed IP address ranges for our cloud infrastructure.
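A customer could restrict the SELECT user to trusted addresses with a Snowflake network policy along the following lines. This is a sketch only: the helper name, the policy and user names, and the IP ranges (taken from the reserved documentation ranges) are placeholders, and the real ranges must be obtained from SELECT as described above.

```python
import ipaddress


def network_policy_statements(policy: str, user: str, cidr_ranges: list[str]) -> list[str]:
    """Build statements that create a network policy and attach it to one user."""
    # IPv4Network raises ValueError for malformed ranges or ranges with host bits set.
    validated = [str(ipaddress.IPv4Network(cidr)) for cidr in cidr_ranges]
    ip_list = ", ".join(f"'{cidr}'" for cidr in validated)
    return [
        f"CREATE NETWORK POLICY {policy} ALLOWED_IP_LIST = ({ip_list});",
        # Applying the policy at user level leaves other users unaffected.
        f"ALTER USER {user} SET NETWORK_POLICY = {policy};",
    ]


# Placeholder ranges from the reserved documentation blocks, not SELECT's real IPs.
for statement in network_policy_statements(
    "select_policy", "select_user", ["192.0.2.0/24", "198.51.100.7"]
):
    print(statement)
```

Validating the ranges before generating the statements catches typos early, before a mistyped range locks out the integration.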
Serverless Cloud Infrastructure
We do not maintain any of our own physical infrastructure and rely on Google Cloud Platform, our cloud provider, to host SELECT. We make use of serverless infrastructure wherever possible to ensure systems are automatically and regularly updated, continually monitored, and assessed for vulnerabilities. Google Cloud provides an extensive list of compliance assurances, including SOC 1, SOC 2, SOC 3, PCI, and ISO 27001.
We use software development best practices
This includes version control, declarative infrastructure, service oriented architecture, and test-driven development. We release changes to production environments via continuous integration, continuous deployment (CI/CD).
All of our production systems and databases are running on Google Cloud facilities, hosted in the US. For full information on the measures Google has implemented to secure their facilities, visit the Google Cloud Compliance page.
Physical & environmental security
SELECT relies on Google Cloud and their robust controls to manage the physical and environmental security of our systems. Visit the Google Cloud Compliance page for more information.
Encryption at rest and in transit
All application web traffic (in transit) uses HTTPS encryption and data stored (at rest) is encrypted by Google Cloud Platform with AES-256 encryption. We make full use of Google's secret management portfolio to store sensitive data like API keys. You can learn more about their capabilities for encryption at rest and in transit.
For third-party software, we use Google as our SAML provider for single sign-on (SSO) when available, and we enforce two-factor authentication whenever possible. We have defined best practices for password creation, and when SSO is not available, we require employees to use the 1Password password manager to generate and store secure passwords.
Access Control - Secrets and Snowflake Metadata
Client secrets are provided directly by users that must be authenticated via our web application using a verified email. Secrets are programmatically and securely stored. Production systems are restricted so that application servers are authorized with access only when needed. Access to these production systems is restricted to our core engineering team.
Client Access Control - Production Systems
For our production systems, SELECT leverages Auth0 by Okta for client authentication to ensure secure access to our application (you can read more about Auth0's security practices here).
Internally, role-based access control is in place to protect our code base and production systems, and is granted on a need to know basis leveraging the principle of least privilege.
Monitoring & Incident Response
We have automated monitoring and alerting for our critical systems and services. To handle and resolve issues that arise, our engineering team maintains a 24/7 on-call rotation.
Personal Account Information
Additionally, we may leverage tools to track usage of our product such as analytics tools and server logs that may receive information such as IP address or potentially email address and / or name.
When personal data is stored within our application, it is stored securely and encrypted while in transit and at rest.
Risk Assessment & Risk Management
It is important to constantly re-evaluate the risks to our business, to evaluate the effectiveness of our operations, and to constantly improve our controls.
As such, we track our IT assets and review access on a regular cadence, we update the architecture diagrams for our systems whenever we make large changes, and we re-evaluate the risks to our business on a continuous basis. When we sign contracts, we review them to make sure our policies, procedures, and controls align with the expectations of our clients.
We strive for a culture of open dialogue within the company about the latest security threats and best practices, and our leadership team is expected to stay up to date on the latest regulation and compliance considerations that are relevant to our business.
If you identify a security concern with SELECT, please contact [email protected]. We will review your disclosure, respond to you within five business days of receipt, and take the necessary steps to remediate.
Please make a good faith effort to avoid privacy violations as well as destruction, interruption, or degradation of services and/or data.
Does SELECT have access to any of our data in Snowflake?
No. We only have access to the Snowflake metadata database. This includes object metadata and usage metrics for your account. For example, the number and names of the tables in your databases, or the historical queries that have been run in the account. We cannot access any of the underlying tables or datasets in your account(s).
How does SELECT mitigate the risk of customer data being exposed in the raw query text?
As discussed in the Query Text Sanitization section, when query text sanitization is enabled we do not store any raw query text. All numbers, strings, and comments are removed from the query text before it is stored in our databases.
Does SELECT hold any security certifications such as SOC2, ISO27001?
No. We are currently pursuing SOC2 certification and believe that we already adhere to the standards set forth in these certifications.
How is customer data protected and who has access to SELECT data?
All data is encrypted in transit and at rest. Only a subset of senior employees has access to data on the SELECT side, and the list of people with access is regularly reviewed and revised by the team.
Does SELECT support 2FA, SSO, or any other defensive options?
Yes. We support SAML SSO and 2FA is on the roadmap.
In terms of application security, how does SELECT deal with security reports received from security researchers?
SELECT receives reports at [email protected]. We review every report we receive. We do not have a formal bug bounty program, but we follow a defined process and a set of policies and standards for handling security reports.
Do you have a list of third parties that SELECT uses?
Do you have a security contact person in case of breaches?
The team responds to messages at [email protected]. Upon request a security employee can be temporarily assigned to your account as well.
In terms of logging, do you log access activities of SELECT's employees who have access to the data?
Does SELECT store any information regarding Snowflake metadata in the database?
Yes. We store basic metadata about a customer's Snowflake account in our database so that we can show the costs associated with its Snowflake resources. We do not have the ability to access any of the underlying datasets or resources in your Snowflake account(s).