Redshift
Why connect Sifflet to Redshift?
By connecting Sifflet to Redshift, you'll be able to leverage Sifflet's three core capabilities:
- Catalog all your data assets (with valuable metadata retrieved from Redshift and enriched directly in Sifflet).
- Run monitors on your data assets to detect issues as soon as they arise and allow data consumers to trust the data.
- Benefit from Sifflet's end-to-end lineage that showcases your data pipelines and the assets upstream and downstream of your data platform.
Redshift Serverless lineage
Although Sifflet supports both provisioned Redshift clusters and Redshift Serverless, lineage is currently only available for provisioned Redshift clusters. To benefit from our end-to-end lineage when using Redshift Serverless, you can leverage our dbt integration or our declarative lineage framework.
Integration guide
You can integrate Sifflet with Redshift by following these steps:
-
Create a dedicated user for Sifflet
-
Grant the Sifflet user the minimal permissions required for it to operate
-
Create a Redshift source in Sifflet
1. Create a dedicated user for Sifflet
To create the user, you will need to run the following SQL query:
-- Create a user for Sifflet
CREATE USER <username> PASSWORD '<password_to_replace>' SYSLOG ACCESS UNRESTRICTED;
Choosing and storing the username and password
We recommend choosing an identifiable username (for example, "sifflet_user") and a secure password. Store them both carefully, as you will need them to configure the new source in Sifflet later.
2. Grant Sifflet the required permissions
You can now grant the required permissions to the newly created user ("sifflet_user" for instance) by running the below queries. Ensure that you replace <schema_name>
with the name of the schema you want Sifflet to monitor and <username>
with the name of the newly created user:
-- Grant permission to the Sifflet user to read all available schemas
GRANT SELECT ON svv_all_schemas to <username>
-- Grant permission to the Sifflet user to read query history
GRANT SELECT ON TABLE SVL_STATEMENTTEXT to <username>
-- Grant permissions to the Sifflet user to read table metadata
GRANT SELECT ON svv_table_info to <username>
-- Repeat this per schema
GRANT SELECT ON ALL TABLES IN SCHEMA <schema_name> to <username>
GRANT USAGE ON SCHEMA <schema_name> to <username>
ALTER DEFAULT PRIVILEGES IN SCHEMA <schema_name> GRANT SELECT ON TABLES to <username>
Required permissions for field-level lineage
The permission on table
SVL_STATEMENTTEXT
enables our field-level lineage feature (see here for more details). For more details, you can refer to the AWS documentation available here.
3. Create a new source in Sifflet
Create a new credential
To create the Redshift credential, follow the below steps:
- Navigate to the "Credentials" page (you can find it in the side panel by selecting "Integrations" and then "Credentials") and then click on "New credential"
- In the "Credential" area, enter the credentials corresponding to the Sifflet-specific user created above:
{
"user": "<username>",
"password": "<password>"
}
Create a new source
To connect Sifflet to Redshift, you will need to provide the below information:
- The connection details: your Host, Port, Database and Schema.
- Connect using SSL or not: Sifflet currently accepts certificates from all certificate authorities. More details are available here.
- The credential created above.
Source refresh frequency
By default, Sifflet refreshes Redshift metadata once per day. However, you can use the frequency parameter to pick a different refresh schedule if required.
You can also refer to this page for more details on how to add and configure a source in Sifflet.
Updated 9 months ago