Azure Data Factory ᴮᴱᵀᴬ

Integrate Azure Data Factory (ADF) with Sifflet to gain comprehensive visibility into your data orchestration. This integration ingests ADF assets into the Sifflet Catalog, allowing you to monitor pipeline status, view operational metadata, and centralize your data observability strategy.

🚀

Coming Soon

We are actively enhancing the ADF integration. Look out for these upcoming features:

  • Granular Lineage: Visualize asset-level lineage to understand exactly which datasets are consumed and generated by each data factory.
  • Pipeline Alerting: Receive real-time notifications via Slack, Microsoft Teams, or Email whenever a pipeline fails or experiences delays.

Before configuring the integration, ensure you have the following:

  • Admin access to your Sifflet instance.
  • Permissions in the Azure Portal to create App Registrations and manage IAM (Identity and Access Management) for the target Resource Group.

Step 1: Azure Configuration

To allow Sifflet to read metadata from ADF, you must create a Service Principal (App Registration) and assign it a custom role with specific read permissions.

1.1 Create an Azure App Registration

  1. Log in to the Azure Portal.
  2. Navigate to Microsoft Entra ID (formerly Azure Active Directory) > App registrations.
  3. Click + New registration.
  4. Name the application (e.g., sifflet-adf-app).
  5. Click Register.
  6. Important: Copy the Application (client) ID from the Overview page. You will need this for Sifflet.
  7. Navigate to Certificates & secrets > Client secrets.
  8. Click + New client secret. Add a description and expiry.
  9. Important: Copy the Client Secret Value immediately. Azure displays it only once, and it cannot be retrieved after you leave the page. You will need this for Sifflet.

1.2 Create a Custom Role

To adhere to the principle of least privilege, create a custom role with only the necessary read permissions.

  1. Navigate to Resource groups and select the Resource Group containing your Data Factories (e.g., MyAdfResourceGroup).
  2. In the left panel, click Access control (IAM).
  3. Click + Add > Add custom role.
  4. Name the role (e.g., sifflet-adf-role).
  5. In the Permissions tab, click + Add permissions. Search for and add the following permissions (all under the Microsoft Data Factory group):
    1. Read Data Factory: Microsoft.DataFactory/factories/read
    2. Read Pipeline: Microsoft.DataFactory/factories/pipelines/read
    3. Read the Result of Query Pipeline: Microsoft.DataFactory/factories/querypipelineruns/read
    4. Read any Dataset: Microsoft.DataFactory/factories/datasets/read
    5. Read Linked Service: Microsoft.DataFactory/factories/linkedServices/read
  6. Review and click Create.
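
If you prefer to script this step rather than click through the portal, the same role can be expressed as a JSON role definition. The sketch below builds one in Python, in the shape the Azure CLI accepts through `az role definition create --role-definition`. The function name `build_role_definition` and the all-zero subscription ID are illustrative assumptions; the role name, resource group, and actions are taken from the steps above.

```python
import json

# Read-only actions listed in step 5 above.
SIFFLET_ADF_ACTIONS = [
    "Microsoft.DataFactory/factories/read",
    "Microsoft.DataFactory/factories/pipelines/read",
    "Microsoft.DataFactory/factories/querypipelineruns/read",
    "Microsoft.DataFactory/factories/datasets/read",
    "Microsoft.DataFactory/factories/linkedServices/read",
]


def build_role_definition(subscription_id: str, resource_group: str,
                          role_name: str = "sifflet-adf-role") -> dict:
    """Return a custom role definition scoped to a single resource group.

    Hypothetical helper for illustration; it is not part of Sifflet or Azure.
    """
    scope = f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
    return {
        "Name": role_name,
        "IsCustom": True,
        "Description": "Read-only access to Data Factory metadata for Sifflet.",
        "Actions": list(SIFFLET_ADF_ACTIONS),
        "NotActions": [],
        # Limiting the scope to one resource group keeps the role least-privilege.
        "AssignableScopes": [scope],
    }


if __name__ == "__main__":
    # Placeholder subscription ID (assumption); replace it with your own.
    role = build_role_definition("00000000-0000-0000-0000-000000000000",
                                 "MyAdfResourceGroup")
    print(json.dumps(role, indent=2))
```

Saving the printed JSON to a file and passing it to the CLI should produce a role equivalent to the one created in the portal, provided your account is allowed to create custom roles at that scope.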

1.3 Assign the Role

  1. Stay in the Access control (IAM) section of your Resource Group.
  2. Click + Add > Add role assignment.
  3. Search for and select the custom role you just created (sifflet-adf-role). Click Next.
  4. Under Assign access to, ensure "User, group, or service principal" is selected.
  5. Click + Select members.
  6. Search for the application created in Step 1.1 (sifflet-adf-app) and select it.
  7. Click Select, then Review + assign.
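
Before moving on to Sifflet, it can be useful to confirm that the service principal can actually read your factories. The sketch below is one way to do that with only the Python standard library: it requests a token through the Microsoft identity platform client-credentials flow, then lists the factories in the resource group through the Azure Resource Manager REST API. The function names are hypothetical, and the `2018-06-01` API version is an assumption about what your environment supports. This is an independent sanity check, not a description of how Sifflet connects.

```python
import json
import urllib.parse
import urllib.request

ARM = "https://management.azure.com"


def token_request(tenant_id: str, client_id: str, client_secret: str):
    """Build the URL and form body for a client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": f"{ARM}/.default",
    }).encode()
    return url, body


def factories_url(subscription_id: str, resource_group: str) -> str:
    """Build the ARM URL that lists Data Factories in one resource group."""
    return (f"{ARM}/subscriptions/{subscription_id}"
            f"/resourceGroups/{resource_group}"
            "/providers/Microsoft.DataFactory/factories?api-version=2018-06-01")


def list_factory_names(tenant_id, subscription_id, resource_group,
                       client_id, client_secret):
    """Return the names of the factories visible to the service principal.

    Makes real network calls; urllib raises HTTPError on a 401 or 403.
    """
    url, body = token_request(tenant_id, client_id, client_secret)
    with urllib.request.urlopen(urllib.request.Request(url, data=body)) as resp:
        token = json.load(resp)["access_token"]
    request = urllib.request.Request(
        factories_url(subscription_id, resource_group),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as resp:
        return [factory["name"] for factory in json.load(resp).get("value", [])]
```

A 403 response from the second call usually suggests that the role assignment is missing or has not yet taken effect; new assignments can take a few minutes to propagate.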

Step 2: Sifflet Configuration

Once the Azure permissions are set, configure the connection within Sifflet.

2.1 Create a Credential

  1. In Sifflet, navigate to Integrations > Credentials in the sidebar.
  2. Click New credential.
  3. Name your credential (e.g., my-adf-resource-group-sp).
  4. In the Secret input, paste the following JSON structure using the values retrieved in Step 1.1:
    {
      "clientId": "<application (client) id>",
      "clientSecret": "<client secret value>"
    }
    }
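
Hand-editing JSON makes it easy to leave in a stray space or an unescaped character. As a minimal sketch, the hypothetical helper below generates the secret body with `json.dumps`, so the output is always valid JSON with the two keys shown above. The function name and the validation rules are assumptions for illustration, not part of Sifflet.

```python
import json


def build_credential_secret(client_id: str, client_secret: str) -> str:
    """Return the JSON text to paste into the Secret input.

    Hypothetical helper: trims surrounding whitespace and rejects empty values.
    """
    client_id = client_id.strip()
    client_secret = client_secret.strip()
    if not client_id or not client_secret:
        raise ValueError("clientId and clientSecret must both be non-empty")
    # json.dumps escapes quotes and backslashes, so the result stays valid JSON.
    return json.dumps(
        {"clientId": client_id, "clientSecret": client_secret},
        indent=2,
    )


if __name__ == "__main__":
    # Placeholder values (assumptions); use the ones copied in Step 1.1.
    print(build_credential_secret("<client id>", "<client secret value>"))
```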

2.2 Create the ADF Source

  1. Navigate to Integrations > Sources.

  2. Click New Source and select ADF.

  3. Fill in the configuration details:

    1. Azure AD tenant id: Found in Microsoft Entra ID > Overview.
    2. Subscription id: Found in the Azure Subscriptions page.
    3. Resource Group: The name of the group you configured (e.g., MyAdfResourceGroup).
    4. Credential: Select the credential created in Step 2.1 (my-adf-resource-group-sp).
  4. Click Load Assets and select the specific Data Factories you wish to monitor.

    Choosing the data factories to include in the integration.

  5. Click Test Connection to verify the setup.

  6. Click Save.

Verification

Once the integration is saved, Sifflet will begin ingesting metadata. You can verify the connection by:

  • Navigating to the Catalog.
  • Filtering by the ADF source.
  • Confirming that your pipelines appear and their latest status is visible.