Apache Airflow ᴮᴱᵀᴬ

This page covers Sifflet's Airflow integration, which lets you see Airflow metadata in Sifflet. If you instead want to trigger Sifflet operations from within your Airflow DAGs, refer to the documentation on Sifflet's custom operators and their purposes.

By integrating Sifflet with Airflow, you can leverage our full data stack approach to obtain a complete view of your data pipelines, from your orchestrator to the data warehouse and your BI tool.

Once the integration is configured, you will have at a glance:

  • Various metadata about your DAGs, such as "Last/Next Execution Date" and "Last Updated Date".
  • The latest status of your DAG runs, allowing you to detect failures as soon as they happen.

"example_bash_operator"'s last run was successful, while "example_branch_datetime_operator_2"'s last run failed and needs attention

This page covers integrating Sifflet with a self-hosted Airflow instance. If you are using a cloud-managed variation instead (Amazon MWAA or Cloud Composer), refer to the dedicated page for that service.

To integrate Airflow with Sifflet, follow these steps:

  1. Create a dedicated read-only user
  2. Connect to Sifflet

📘 Supported Airflow versions

We currently support any self-hosted Airflow instance (version 2.0.0+) in addition to cloud-managed variations (Amazon MWAA on AWS and Cloud Composer on GCP).

1. Create a read-only user

In Airflow, create a dedicated user for Sifflet with the "Viewer" role.
Choose a "User Name" (for instance, "sifflet_user") and a secure password. Store them carefully, as you will need them later when configuring the connection in Sifflet.

Sample configuration for a Sifflet user in Airflow
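
If you prefer to create the user from a terminal rather than the Airflow UI, the Airflow 2 CLI provides an `airflow users create` command. The sketch below only assembles that command. The helper name, the first and last name values, and the e-mail address are hypothetical placeholders, not values required by Sifflet.

```python
import shlex


def build_create_user_command(username, email, password,
                              first_name="Sifflet", last_name="Integration"):
    """Return the argument list for `airflow users create` with the Viewer role.

    The first/last name defaults are illustrative placeholders only.
    """
    return [
        "airflow", "users", "create",
        "--username", username,
        "--firstname", first_name,
        "--lastname", last_name,
        "--role", "Viewer",  # read-only role, as described in this step
        "--email", email,
        "--password", password,
    ]


if __name__ == "__main__":
    # Hypothetical values: replace them before running the printed command.
    command = build_create_user_command(
        "sifflet_user", "sifflet@example.com", "change-me"
    )
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(command))
```

Keep in mind that a password passed on the command line can end up in shell history, so you may prefer the UI flow described above for production instances.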

2. Connect to Sifflet

Add an Airflow secret

To create the Airflow secret, follow the steps below:

  • In "Integration", open the "Secrets" tab and create a new secret.
  • In the "Secret" area, paste the text below and replace the <username> and <password> placeholders with the credentials you created in step 1:
{
  "user": "<username>",
  "password": "<password>"
}
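
Passwords containing quotes or backslashes are easy to break when edited into JSON by hand. As a minimal sketch, the secret text can be generated with Python's standard `json` module, which handles the escaping. The function name `build_airflow_secret` is a hypothetical helper and not part of Sifflet.

```python
import json


def build_airflow_secret(username: str, password: str) -> str:
    """Return the secret body in the {"user", "password"} shape shown above."""
    if not username or not password:
        raise ValueError("username and password must both be non-empty")
    # json.dumps escapes quotes, backslashes and non-ASCII characters safely.
    return json.dumps({"user": username, "password": password}, indent=2)


if __name__ == "__main__":
    # Hypothetical credentials: use the ones created in step 1.
    print(build_airflow_secret("sifflet_user", "change-me"))
```

The printed text can then be pasted into the "Secret" area as-is.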

Create a new Airflow integration

To connect Airflow with Sifflet, you will need three items:

  • Connection details:
    • Host: You can add the entire URL. For instance, if your URL is http://xxxxx.yy, your Host value would be http://xxxxx.yy.
    • Port: The port used to interact with Airflow's REST API. By default, this is 8080.
  • Secret: the secret created above, which holds the username and password from step 1.
  • Frequency: determines how often the information is refreshed.

The different details that you need to provide when configuring the integration
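
Before saving the integration, it can help to confirm that the host, port, and credentials actually reach Airflow's REST API. The sketch below makes a few assumptions: that the stable REST API is served under `/api/v1`, that basic authentication is enabled on your Airflow instance, and that host and port combine as `<host>:<port>`. The function names are hypothetical, and this is not necessarily how Sifflet itself queries Airflow.

```python
import base64
import json
import urllib.request


def build_dags_request(host: str, port: int, username: str,
                       password: str) -> urllib.request.Request:
    """Build a GET request for Airflow's DAG list using HTTP basic auth."""
    base = host.rstrip("/")  # tolerate a trailing slash in the Host value
    url = f"{base}:{port}/api/v1/dags?limit=1"
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    headers = {"Authorization": f"Basic {token}", "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers)


def can_list_dags(host: str, port: int, username: str, password: str,
                  timeout: float = 10.0) -> bool:
    """Return True if the API answers with a DAG list for these credentials."""
    request = build_dags_request(host, port, username, password)
    # urlopen raises urllib.error.HTTPError on 401/403 and URLError on network issues.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return "dags" in payload


if __name__ == "__main__":
    # Hypothetical values: use your own host and the credentials from step 1.
    print(can_list_dags("http://xxxxx.yy", 8080, "sifflet_user", "change-me"))
```

A 401 or 403 error here usually points to the credentials or the API authentication settings, while a connection error points to the Host or Port values.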

You can also refer to this page on adding a data source in Sifflet.