End-to-End Observability with Our Revamped Airflow Integration!

We're thrilled to announce a massive upgrade to our Apache Airflow integration! We've moved beyond simply listing DAGs in the catalog to providing deep, actionable insights that connect your data pipelines to the assets they produce. This update gives you a complete picture of your data's journey, from orchestration to consumption.

Here’s what’s new:

Airflow DAGs Directly in Your Lineage

You can now visualize the direct relationship between your Airflow DAGs and your data assets. Once you add a simple query tag to the queries your Airflow operators run, Sifflet automatically maps which DAGs generate or update specific tables and views. This closes a critical gap in data lineage, letting you trace any asset back to the pipeline that produces it.

  • Benefit: Instantly identify which pipeline populates a given dataset for faster debugging and impact analysis.
  • How to start: Check out our new documentation to learn how to tag your queries.
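
As a rough illustration of what query tagging can look like, the sketch below prepends a SQL comment naming the DAG and task to a query before it is handed to an operator. The comment syntax, the `airflow:` prefix, the JSON keys, and the `tag_query` helper are all assumptions made for this example, not Sifflet's actual tag format; the documentation linked above defines the real one.

```python
import json


def tag_query(sql: str, dag_id: str, task_id: str) -> str:
    """Prepend a SQL comment identifying the Airflow DAG and task issuing the query.

    Hypothetical tag format: the real syntax and key names that Sifflet
    expects are defined in its documentation, not here.
    """
    payload = json.dumps({"dag_id": dag_id, "task_id": task_id}, sort_keys=True)
    # Refuse identifiers that would close the SQL comment early.
    if "*/" in payload:
        raise ValueError("DAG and task identifiers must not contain '*/'")
    return f"/* airflow: {payload} */\n{sql.strip()}"


# Example table, DAG, and task names are made up for illustration.
tagged = tag_query(
    "INSERT INTO analytics.daily_orders SELECT * FROM staging.orders",
    dag_id="daily_orders_refresh",
    task_id="load_daily_orders",
)
print(tagged)
```

The resulting string could then be passed as the `sql` argument of whichever Airflow SQL operator the DAG already uses, so that the tag travels with the query into the warehouse's query history.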

[Figure: Airflow DAG within the Sifflet lineage]

Live Airflow DAG Status in Sifflet

No more switching between tools to check if a pipeline ran successfully. Sifflet now pulls the latest run status for each of your DAGs and displays it directly in the catalog and lineage.

  • Benefit: Monitor the health of your data pipelines from the same platform you use to monitor your data quality.

📄 Pipeline Context on Asset Pages

When you link a DAG to an asset, that pipeline context now appears directly on the asset's page. See which DAG is responsible for the data without leaving the asset view, and navigate to the DAG page for its description, its owner(s), and its most recent run status.

  • Benefit: Gain immediate, valuable context about an asset's provenance and health, empowering data consumers and accelerating root cause analysis.

🔮 What's Coming Next

This is just the beginning of our push for comprehensive pipeline monitoring. Here's a sneak peek at what our team is working on:

  • Smarter Root Cause Analysis: Our AI agent, Sage, will soon incorporate Airflow DAG status into its incident analysis. It will automatically flag failed or delayed DAGs as the likely root cause of data quality issues.
  • Task-Level Granularity: Soon, you'll be able to drill down even further with detailed metadata and status for individual Airflow tasks.
  • Expanded Orchestrator Support: We're bringing these same powerful capabilities to other leading workflow orchestrators, including Databricks Workflows and Azure Data Factory.

We encourage you to explore the new Airflow integration today! As always, we'd love to hear your feedback.