FAQ

BigQuery external tables

I have external tables on BigQuery. How does Sifflet handle them?

Sifflet treats your external tables the same way as any regular table on BigQuery: you can find them in the catalog and monitor them.

Sifflet currently supports the following external table types. Some of them require additional permissions, depending on the external system:

External Table type

Supported

Additional rights required

GCS (Cloud Storage)

You can grant the service account the following permissions to give it Viewer access to all GCS buckets and objects:
storage.buckets.list
storage.buckets.get
storage.objects.list
storage.objects.get

OR

You can choose to grant the service account Viewer access only on the buckets referenced by your external tables.

Google Drive, Google Sheets

You can grant the service account Viewer access only on directories/files referenced by your external tables.

GCP BigTable

Unable to add a BigQuery dataset

When creating the source, I cannot find the dataset I want to add after pressing "List Datasets". Is there a configuration issue?

Sifflet displays only the datasets the service account has access to. If a dataset is missing from the list, review the service account's permissions on that dataset.

Managing permissions at the dataset granularity

I would like to manage the permissions for the service account at the dataset level. How should I do this?

  • Create a role as described in the doc but without the jobs.create permission:
bigquery.datasets.get  
bigquery.tables.get  
bigquery.tables.getData  
bigquery.tables.list  
bigquery.jobs.listAll
  • Create another role with only the following permission:
bigquery.jobs.create
  • Create a service account as described above and assign it the second role (with the jobs.create permission). This grants the service account that permission at the project level but does not give it access to any dataset.
  • For each dataset that you wish to see in Sifflet, follow the steps below:
    • In the BigQuery Explorer panel, select a dataset
    • Click Sharing -> Permissions
    • Click Add Principal
    • In the New Principals field, select or enter the service account you created for Sifflet
    • In the "Select a role" list, select the custom role you created in the first step (the one with the 5 permissions)
    • Click Save
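
The role split described in the steps above can be sketched as a small Python helper that builds the matching gcloud commands. This is a minimal sketch, not an official Sifflet procedure: the role IDs, the project ID, and the service account email are hypothetical placeholders, and the per-dataset grant is still done in the BigQuery console as described in the steps.

```python
import shlex

# Permissions for the first role, granted per dataset in the BigQuery console.
DATASET_ROLE_PERMISSIONS = [
    "bigquery.datasets.get",
    "bigquery.tables.get",
    "bigquery.tables.getData",
    "bigquery.tables.list",
    "bigquery.jobs.listAll",
]
# Permission for the second role, granted at the project level.
JOB_ROLE_PERMISSIONS = ["bigquery.jobs.create"]


def create_role_command(project_id, role_id, title, permissions):
    """Build a `gcloud iam roles create` command as an argument list."""
    return [
        "gcloud", "iam", "roles", "create", role_id,
        f"--project={project_id}",
        f"--title={title}",
        f"--permissions={','.join(permissions)}",
    ]


def bind_project_role_command(project_id, role_id, service_account_email):
    """Build the command that binds a custom role at the project level."""
    return [
        "gcloud", "projects", "add-iam-policy-binding", project_id,
        f"--member=serviceAccount:{service_account_email}",
        f"--role=projects/{project_id}/roles/{role_id}",
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    project = "my-project"
    account = "sifflet@my-project.iam.gserviceaccount.com"
    commands = [
        create_role_command(project, "sifflet_dataset_reader",
                            "Sifflet dataset reader", DATASET_ROLE_PERMISSIONS),
        create_role_command(project, "sifflet_job_runner",
                            "Sifflet job runner", JOB_ROLE_PERMISSIONS),
        bind_project_role_command(project, "sifflet_job_runner", account),
    ]
    for command in commands:
        print(shlex.join(command))
```

Only the job-runner role is bound at the project level here. The dataset-reader role is created but deliberately left unbound, so that it can be granted dataset by dataset.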

Billing on another BigQuery project

I would like to bill my queries to a project other than the one I use in production. Is it possible?

To better manage the costs associated with the queries executed by Sifflet, you can bill them to a separate project, for instance an empty one.

The service account will need the following permissions on the billing project:

bigquery.jobs.create

On the queried project, the permissions needed are:

bigquery.datasets.get  
bigquery.tables.get  
bigquery.tables.getData  
bigquery.tables.list  
bigquery.jobs.listAll

You then only need to specify the ID of the billing project in your data source parameters.
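
The permission split between the two projects can be expressed as a small pre-flight check. This is an illustrative sketch only: the project IDs and the `granted` mapping are hypothetical, and in practice you would fill that mapping from your own IAM review.

```python
# Permissions required on the billing project and on the queried project.
BILLING_PERMISSIONS = {"bigquery.jobs.create"}
QUERIED_PERMISSIONS = {
    "bigquery.datasets.get",
    "bigquery.tables.get",
    "bigquery.tables.getData",
    "bigquery.tables.list",
    "bigquery.jobs.listAll",
}


def missing_permissions(granted, billing_project, queried_project):
    """Return, per project, the required permissions that are not granted.

    `granted` maps a project ID to the set of permissions the service
    account holds on it. An empty result means nothing is missing.
    """
    required = {billing_project: set(BILLING_PERMISSIONS)}
    # If both roles fall on the same project, the requirements are merged.
    required.setdefault(queried_project, set()).update(QUERIED_PERMISSIONS)

    missing = {}
    for project, permissions in required.items():
        gap = permissions - granted.get(project, set())
        if gap:
            missing[project] = sorted(gap)
    return missing


if __name__ == "__main__":
    # Hypothetical project IDs and grants, for illustration only.
    granted = {
        "billing-sandbox": {"bigquery.jobs.create"},
        "prod-analytics": QUERIED_PERMISSIONS - {"bigquery.jobs.listAll"},
    }
    print(missing_permissions(granted, "billing-sandbox", "prod-analytics"))
```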

Worker Project Ids

What is the Worker Project Ids parameter?

Worker Project Ids is an optional parameter that takes a list of IDs of the projects you use in BigQuery to run queries against the datasets in the Project Id project.

The job history of these worker projects can contain useful usage and lineage information about the datasets in the Project Id project.

If the provided service account has sufficient permissions to query the job history of the worker projects (bigquery.datasets.get to list the datasets and bigquery.jobs.listAll to list the jobs), Sifflet uses that history to extract additional lineage information on the data source's datasets.

If the service account doesn't have sufficient permissions on these projects, neither the connection test nor the ingestion will fail. The application will simply not use their job history.
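
The behavior described above can be modeled as a simple filter. This is a sketch of the documented rule, not Sifflet's actual implementation: the function name, the project IDs, and the `granted` mapping are hypothetical.

```python
# Permissions needed to read the job history of a worker project.
WORKER_HISTORY_PERMISSIONS = frozenset({
    "bigquery.datasets.get",
    "bigquery.jobs.listAll",
})


def usable_worker_projects(worker_project_ids, granted):
    """Return the worker projects whose job history can be read.

    Projects on which the service account lacks a required permission
    are skipped without raising, mirroring the fact that neither the
    connection test nor the ingestion fails in that case.
    """
    usable = []
    for project_id in worker_project_ids:
        if WORKER_HISTORY_PERMISSIONS <= granted.get(project_id, set()):
            usable.append(project_id)
    return usable


if __name__ == "__main__":
    # Hypothetical worker projects and grants, for illustration only.
    granted = {
        "etl-workers": {"bigquery.datasets.get", "bigquery.jobs.listAll"},
        "bi-workers": {"bigquery.datasets.get"},
    }
    print(usable_worker_projects(["etl-workers", "bi-workers", "unknown"], granted))
```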