Joining Tables

Sifflet allows you to define joins between datasets as part of your monitor configuration. This enables powerful monitoring use cases where you can group by or filter on fields that reside in a secondary dataset.


🧠 Why Use Joins?

Joins help you:

  • Group by fields that don’t exist in your base dataset
    e.g., CUSTOMER_TYPE from a joined CUSTOMERS table
  • Filter based on attributes stored in a related dataset
    e.g., exclude dummy users or customers from a specific region

➕ Adding a Join

To define a join in the UI:

  1. In the monitor builder, click on + Add join
  2. Choose the table or view you want to join with
  3. Select the join type:
    • Inner
    • Left
    • Right
    • Outer (coming soon)
  4. Define the join condition by specifying matching columns between the datasets

📊 Group By from Joined Table

Once you've added a join, you can group your monitor by any field in the joined dataset.

Example: Group by CITY from a STORES table joined via store_id.

🔍 Filtering with Joined Fields

Joins also enable filtering based on data from the joined dataset.

Examples:

  • Exclude records where CUSTOMERS.is_dummy = TRUE
  • Include only users where COUNTRY = 'FRANCE'

You can add such conditions in the filter section after defining your join.

🛠️ YAML Support

You can also define joins in your monitor YAML configuration if you're using Data Quality as Code.

datasets:
  - uri: snowflake://sifflet-internal/DEMO.TEST.AGG_CASE_WHEN
joins:
  - dataset:
      uri: snowflake://sifflet-internal/DEMO.TEST.ALLUSERS
    joinCondition:
      kind: Equality
      fieldPairs:
        - leftField:
            dataset:
              uri: snowflake://sifflet-internal/DEMO.TEST.AGG_CASE_WHEN
            name: FIRSTNAME
          rightField:
            dataset:
              uri: snowflake://sifflet-internal/DEMO.TEST.ALLUSERS
            name: FIRSTNAME