Databricks
To connect Sifflet to Databricks, you need an account with admin rights so that you can set up read-only access for Sifflet.
You can integrate by following these steps:
- Create a dedicated Service Principal with its token or use a user's Personal Access Token
- Grant permissions
- Connect to Sifflet
Unity Catalog and Databricks SQL
We currently support Databricks with Unity Catalog and Databricks SQL. For other configurations, don't hesitate to reach out.
1- Create a Service Principal or a user Personal Access Token
A service principal is a dedicated identity created for use with automated tools such as Sifflet. In line with Databricks' documentation, we recommend using a service principal instead of a Personal Access Token belonging to a Databricks user.
Service Principal
- To create a service principal, you have the two options below:
- Directly from your Databricks admin console: from your username on the top right, go to Admin Console -> Service Principals tab -> click on Add service principal -> Add new service principal. You can name it, for instance, Sifflet service principal
- Or programmatically: you can refer to Databricks' documentation here
The service principal should have only Databricks SQL access as Entitlements.
- Once created, allow this service principal to use tokens. More information here
- The last step is to generate a token for it. More information here. Save the token as you will need it to connect to Sifflet.
User Personal Access Token
You can create a user personal access token by referring to Databricks' documentation here. Save the token as you will need it to connect to Sifflet.
2- Grant permissions
You can grant permissions to the service principal (or user) at the Catalog, Schema, or Table level.
Grant to existing and future tables
- Granting permissions at Catalog level will automatically propagate the permissions for the existing and future Schemas (and consequently Tables)
- Granting permissions at Schema level will automatically propagate the permissions for the existing and future Tables
To grant access, you have the two options below:
- Directly from Databricks console: in Data Science & Engineering -> navigate to the Catalog or Schema you want to add to Sifflet -> Permissions tab -> click on Grant and choose the Data Reader preset permissions for the service principal/user

- Or run the SQL queries below.
For the service principal, you will need its Application ID, which can be found in Admin Console -> Service principals tab.
For a user Personal Access Token, replace the Application ID with the user name.

Granting permissions at Catalog level:
GRANT USE_CATALOG ON CATALOG <catalog_name> TO `<Application_ID>`;
GRANT USE_SCHEMA ON CATALOG <catalog_name> TO `<Application_ID>`;
GRANT SELECT ON CATALOG <catalog_name> TO `<Application_ID>`;
Or at Schema level:
GRANT USE_CATALOG ON CATALOG <catalog_name> TO `<Application_ID>`;
GRANT USE_SCHEMA ON SCHEMA <catalog_name>.<schema_name> TO `<Application_ID>`;
GRANT SELECT ON SCHEMA <catalog_name>.<schema_name> TO `<Application_ID>`;
Or for specific tables:
GRANT USE_CATALOG ON CATALOG <catalog_name> TO `<Application_ID>`;
GRANT USE_SCHEMA ON SCHEMA <catalog_name>.<schema_name> TO `<Application_ID>`;
GRANT SELECT ON TABLE <catalog_name>.<schema_name>.<table_name> TO `<Application_ID>`;
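After running the statements above, it can help to confirm that the privileges were recorded as expected. The following is a minimal sketch using Databricks SQL's SHOW GRANTS statement, with the same placeholders as the GRANT statements. Run only the line matching the level at which you granted the permissions, and treat the output as a quick check rather than a complete audit.

```sql
-- Placeholders (<catalog_name>, <schema_name>, <table_name>, <Application_ID>)
-- are the same as in the GRANT statements and must be replaced with your own values.

-- If you granted at Catalog level:
SHOW GRANTS `<Application_ID>` ON CATALOG <catalog_name>;

-- If you granted at Schema level:
SHOW GRANTS `<Application_ID>` ON SCHEMA <catalog_name>.<schema_name>;

-- If you granted on specific tables:
SHOW GRANTS `<Application_ID>` ON TABLE <catalog_name>.<schema_name>.<table_name>;
```

For a user Personal Access Token, replace the Application ID with the user name, as for the GRANT statements.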
3- Connect to Sifflet
Create the Secret
To add the newly created token in Sifflet, please follow the below steps:
- In "Integration" --> submenu "Secrets", create a new secret
- In the "Secret" area, copy-paste the token
Add the datasource
Warehouse
You can use an existing Warehouse or create a new dedicated one for Sifflet. You can follow the instructions here to create a new one.
You can choose the cluster size depending on the number of data assets you want to monitor. As a reference, X-Small is enough for environments with thousands of tables or fewer.
- First, let's find the information that Sifflet will require to connect. In your Databricks environment, go to SQL Warehouse -> choose the Warehouse Sifflet will use -> navigate to the Connection Details tab.

- Back in Sifflet:
- Go to Integration --> click "+ New"
- Fill out the necessary information that was collected in the previous step.
- Host: corresponds to the Server hostname on Databricks, with a format xxxxx.cloud.databricks.com
- Port: 443
- Http Path: corresponds to the HTTP path on Databricks
- Catalog: the Catalog you want to add to Sifflet
- Schema: the Schema you want to add to Sifflet
- Secret: the name of the secret containing the token
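If the connection does not work on the first try, one way to narrow down the cause is to run a few read-only queries on the chosen Warehouse while authenticated with the same token, for instance from a SQL client configured with the connection details above. This is only a sketch of such a check, not a step required by Sifflet. The placeholders are assumptions to be replaced with your own Catalog, Schema, and Table names.

```sql
-- Shows which identity the token authenticates as
-- (expected to be the user name, or the Application ID for a service principal).
SELECT current_user();

-- Should list the schemas the identity can see in the Catalog.
SHOW SCHEMAS IN <catalog_name>;

-- Should list the tables the identity can see in the Schema.
SHOW TABLES IN <catalog_name>.<schema_name>;

-- Should return a row if SELECT was granted on the table.
SELECT * FROM <catalog_name>.<schema_name>.<table_name> LIMIT 1;
```

A permission error on one of these queries usually points back to the grants in step 2, whereas an authentication error points to the token stored in the Sifflet secret.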
You can refer to this page for more detailed information on adding a data source in Sifflet.