Snowflake
Why connect Sifflet to Snowflake?
By connecting Sifflet to Snowflake, you'll be able to leverage Sifflet's three core capabilities:
- Catalog all your data assets (with valuable metadata retrieved from Snowflake and enriched directly in Sifflet).
- Run monitors on your data assets to detect issues as soon as they arise and allow data consumers to trust the data.
- Benefit from Sifflet's end-to-end lineage that showcases your data pipelines and the assets upstream and downstream of your data platform.
Integration guide
You can integrate Sifflet with Snowflake by following these steps:
- Create a dedicated read-only role, user, and warehouse (the warehouse is optional but recommended)
- Grant privileges to databases to be discovered and monitored
- Create the Snowflake source in Sifflet
1. Create the Sifflet entities
To create the dedicated user, role, and warehouse, you will need to run the following SQL queries:
-- Set your variable values: role, user, password, and warehouse
set role_name = 'ROLE_CHANGE_ME'; -- TO REPLACE VALUE, make sure this role is not already created
set user_name = 'USER_CHANGE_ME'; -- TO REPLACE VALUE, make sure this user is not already created
set user_password = 'password_change_me'; -- TO REPLACE VALUE
set warehouse_name = 'SIFFLET_WAREHOUSE'; -- TO REPLACE VALUE
use role accountadmin; -- needed to create user/role
-- Create Sifflet Role
create role if not exists identifier($role_name);
grant role identifier($role_name) to role SYSADMIN;
-- Create Sifflet User
create user if not exists identifier($user_name)
password = $user_password
default_role = $role_name
default_warehouse = $warehouse_name;
grant role identifier($role_name) to user identifier($user_name);
-- Create a dedicated warehouse
create warehouse if not exists identifier($warehouse_name)
warehouse_size = xsmall
warehouse_type = standard
auto_suspend = 5
auto_resume = true
initially_suspended = true;
-- Grant the role access to the warehouse
grant USAGE,MONITOR on warehouse identifier($warehouse_name) to role identifier($role_name);
-- Grant access to query history
grant imported privileges on database "SNOWFLAKE" to role identifier($role_name);
Snowflake is case-sensitive
Please create the Snowflake entities with the following recommendations:
- Use an uppercase role name, and make sure the role does not already exist
- Use an uppercase username, and make sure the user does not already exist
- Use the same case for the warehouse name, database name and schema name as the ones in your Snowflake instance
2. Grant privileges to databases to be discovered and monitored
Run the following SQL queries for every schema you want to see in Sifflet:
-- Read-only access to specific schemas
set database_name = 'DATABASE_CHANGE_ME'; -- TO REPLACE VALUE, database you want monitored
set schema_name = 'DATABASE_NAME.SCHEMA_NAME'; -- TO REPLACE VALUE, fully qualified schema you want monitored
grant USAGE on database identifier($database_name) to role identifier($role_name);
grant USAGE on schema identifier($schema_name) to role identifier($role_name);
grant SELECT on all tables in schema identifier($schema_name) to role identifier($role_name);
grant SELECT on future tables in schema identifier($schema_name) to role identifier($role_name);
grant SELECT on all external tables in schema identifier($schema_name) to role identifier($role_name);
grant SELECT on future external tables in schema identifier($schema_name) to role identifier($role_name);
grant SELECT on all views in schema identifier($schema_name) to role identifier($role_name);
grant SELECT on future views in schema identifier($schema_name) to role identifier($role_name);
grant SELECT on all streams in schema identifier($schema_name) to role identifier($role_name);
grant SELECT on future streams in schema identifier($schema_name) to role identifier($role_name);
grant SELECT on all dynamic tables in schema identifier($schema_name) to role identifier($role_name);
grant SELECT on future dynamic tables in schema identifier($schema_name) to role identifier($role_name);
grant VIEW LINEAGE on account to role identifier($role_name);
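Because the statements above have to be repeated for every schema, it can be convenient to generate them. The following is a minimal sketch, not part of Sifflet or Snowflake: `build_grant_statements` is a hypothetical helper, and it assumes the role, database, and schema names are plain uppercase identifiers that need no quoting. It writes the names literally instead of using session variables, and it leaves out the account-level VIEW LINEAGE grant, which only needs to be run once.

```python
# Object types covered by the read-only grants in this guide.
OBJECT_TYPES = ["tables", "external tables", "views", "streams", "dynamic tables"]


def build_grant_statements(role, database, schemas):
    """Return the read-only GRANT statements for several schemas of one database.

    Hypothetical helper: names are inserted as-is, so they are assumed to be
    plain identifiers that do not require quoting.
    """
    statements = [f"grant USAGE on database {database} to role {role};"]
    for schema in schemas:
        qualified = f"{database}.{schema}"
        statements.append(f"grant USAGE on schema {qualified} to role {role};")
        for object_type in OBJECT_TYPES:
            # One grant for existing objects and one for objects created later.
            for scope in ("all", "future"):
                statements.append(
                    f"grant SELECT on {scope} {object_type} "
                    f"in schema {qualified} to role {role};"
                )
    return statements
```

For example, `print("\n".join(build_grant_statements("SIFFLET_ROLE", "ANALYTICS", ["SALES", "FINANCE"])))` prints a script that can be reviewed and then run in a Snowflake worksheet. The role, database, and schema names in that call are placeholders.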
3. Create the Snowflake source in Sifflet
Create a new credential
To create the Snowflake credential, follow the steps below:
- Navigate to the "Credentials" page (you can find it in the side panel by selecting "Integrations" and then "Credentials") and then click on "New credential"
- In the "Credential" area, enter the credentials corresponding to the Sifflet-specific user created above:
- For password-based authentication: copy the text below and replace the placeholders with the correct username and password:
{ "user": "<username>", "password": "<password>" }
- For key-pair authentication: copy the text below and replace the placeholders with the correct username and unencrypted private key:
{ "user": "<username>", "private_key": "<private_key>" }
Key-pair authentication
Sifflet supports key pair authentication for Snowflake, offering enhanced authentication security as an alternative to using a username and password.
To use key pair authentication, create the key pair by following the guide provided by Snowflake and then use the unencrypted private key when adding the credentials to Sifflet as mentioned above.
When generating the private key, make sure to add the -nocrypt option to your command.
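A private key file spans several lines, while a JSON string cannot contain raw line breaks, so hand-editing the credential text is error-prone. The sketch below builds the credential text with Python's standard `json` module, which escapes line breaks automatically. `build_credential` is a hypothetical helper, and whether Sifflet expects the key with its PEM header and footer lines is an assumption to verify against your own setup.

```python
import json


def build_credential(user, password=None, private_key=None):
    """Return the credential JSON text for exactly one authentication method.

    Hypothetical helper: the field names ("user", "password", "private_key")
    are the ones shown in this guide.
    """
    if (password is None) == (private_key is None):
        raise ValueError("provide exactly one of password or private_key")
    payload = {"user": user}
    if password is not None:
        payload["password"] = password
    else:
        # json.dumps escapes the key's line breaks as \n inside the string.
        payload["private_key"] = private_key
    return json.dumps(payload)
```

For key-pair authentication, the key can be read from the unencrypted file generated earlier, for example `build_credential("SIFFLET_USER", private_key=open("rsa_key.p8").read())`, where both the username and the file name are placeholders.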
Create a new source
To connect to Snowflake on Sifflet, you will need two items:
- The connection details: your Account Identifier, the Warehouse name, your Database name, and Schema.
- The credential: the username and password or private key you previously created.
Retrieving the Snowflake account identifier
To retrieve your account identifier:
- Click on the arrow at the bottom left corner of the screen.
- Hover over the middle part of the widget.
- Click on the link icon.
This will copy your Snowflake console URL to your clipboard.
The Account Identifier is the string that precedes "snowflakecomputing.com" in your Snowflake console URL.
For instance:
- For "https://abcd123.eu-west-2.snowflakecomputing.com", your Account Identifier will be "abcd123.eu-west-2".
- For "https://acme-marketing_test_account.snowflakecomputing.com", your Account Identifier will be "acme-marketing_test_account".
- For "https://app.snowflake.com/eu-west-2.aws/abcd123", your Account Identifier will be "abcd123.eu-west-2".
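The rule for "snowflakecomputing.com" URLs can be sketched in a few lines of Python. `account_identifier_from_url` is a hypothetical helper, not a Sifflet or Snowflake function, and it only covers URLs whose hostname ends in "snowflakecomputing.com". URLs of the "app.snowflake.com" form are rejected here and have to be converted by hand as in the last example above.

```python
from urllib.parse import urlparse

SUFFIX = ".snowflakecomputing.com"


def account_identifier_from_url(url):
    """Return the part of the hostname that precedes '.snowflakecomputing.com'.

    Hypothetical helper. Note that urlparse lowercases the hostname.
    """
    host = urlparse(url).hostname or ""
    identifier = host[: -len(SUFFIX)] if host.endswith(SUFFIX) else ""
    if not identifier:
        raise ValueError(f"not a snowflakecomputing.com URL: {url!r}")
    return identifier
```

Applied to the first two examples above, it returns "abcd123.eu-west-2" and "acme-marketing_test_account".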
Several account identifiers for one Snowflake instance
There might be several ways to access your Snowflake environment (more details in Snowflake's docs). For instance:
- https://<orgname>-<account_name>.snowflakecomputing.com
- https://<accountlocator>.<region>.<cloud>.snowflakecomputing.com
If you used one specific URL when connecting other tools, such as a BI tool, to your Snowflake instance, please use the same one when adding the Snowflake source in Sifflet.
By default, Sifflet refreshes Snowflake metadata once per day. However, you can use the frequency parameter to pick a different refresh schedule if required.
You can refer to this page for details on adding a source in Sifflet.
FAQ
What tables are accessed by Sifflet, and how are they used?
Sifflet accesses various tables in your Snowflake account to provide a complete user experience. Restricting access to some of those databases/schemas/tables might reduce the number of functionalities Sifflet provides.
Accessed table | Usage |
---|---|
SNOWFLAKE.ACCOUNT_USAGE.ACCESS_HISTORY (only for Snowflake Enterprise or higher) | Data usage computation, SQL transformation and lineage computation |
SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY | Data usage computation, SQL transformation and lineage computation |
SNOWFLAKE.ACCOUNT_USAGE.OBJECT_DEPENDENCIES | Lineage computation |
SNOWFLAKE.ACCOUNT_USAGE.TABLES | Data usage computation |
SNOWFLAKE.ACCOUNT_USAGE.TAG_REFERENCES | Retrieving Snowflake tags |
Any allowed [DB].[SCHEMA] | Connection test |
Any allowed [DB].[SCHEMA].[TABLE] | Execution of any user-defined monitors and column-level AI suggestions |
Any allowed [DB].INFORMATION_SCHEMA.TABLES | Freshness evaluation (Update Time Gap) of the tables in the [DB] database |
I refreshed the Snowflake sources on Sifflet but some of my latest SQL transformations, tags, etc. do not appear. Why?
Snowflake reports a range of data latency times, caused by the process of extracting the data from Snowflake's internal metadata store.
Overall, for ACCOUNT_USAGE tables, you should expect a latency of up to 3 hours from Snowflake. For more detailed information, please refer to the "Data latency" section of the Snowflake Account Usage documentation.