Joining Tables

Sifflet allows you to define joins between datasets as part of your monitor configuration. This enables powerful monitoring use cases where you can group by or filter on fields that reside in a secondary dataset.


🧠 Why Use Joins?

Joins help you:

  • Group by fields that don’t exist in your base dataset
    e.g., CUSTOMER_TYPE from a joined CUSTOMERS table
  • Filter based on attributes stored in a related dataset
    e.g., exclude dummy users or customers from a specific region

➕ Adding a Join

To define a join in the UI:

  1. In the monitor builder, click on + Add join
  2. Choose the table or view you want to join with
  3. Select the join type:
    • Inner
    • Left
    • Right
    • Outer (coming soon)
  4. Define the join condition by specifying matching columns between the datasets


📊 Group By from Joined Table

Once you've added a join, you can group your monitor by any field in the joined dataset.

Example: Group by CITY from a STORES table joined via store_id.


🔍 Filtering with Joined Fields

Joins also enable filtering based on data from the joined dataset.

Examples:

  • Exclude records where CUSTOMERS.is_dummy = TRUE
  • Include only users where COUNTRY = 'FRANCE'

You can add such conditions in the filter section after defining your join.



🛠️ YAML Support

You can also define joins in your monitor YAML configuration if you're using Data Quality as Code.

datasets:
  - uri: snowflake://sifflet-internal/DEMO.TEST.AGG_CASE_WHEN
joins:
  - dataset:
      id: 899c8d66-1ca7-4af9-b2f3-9ddd13cd187f
      name: ALLUSERS
      datasource:
        id: 7557b14e-abf3-4844-9c9a-f91abfb68f1e
        name: "TEST"
      uri: snowflake://sifflet-internal/DEMO.TEST.ALLUSERS
    joinCondition:
      kind: Equality
      fieldPairs:
        - leftField:
            dataset:
              id: dd30550c-03c1-4c74-93a6-aa34a8cc7f85
              name: AGG_CASE_WHEN
              datasource:
                id: 7557b14e-abf3-4844-9c9a-f91abfb68f1e
                name: "TEST"
              uri: snowflake://sifflet-internal/DEMO.TEST.AGG_CASE_WHEN
            name: FIRSTNAME
          rightField:
            dataset:
              id: 899c8d66-1ca7-4af9-b2f3-9ddd13cd187f
              name: ALLUSERS
              datasource:
                id: 7557b14e-abf3-4844-9c9a-f91abfb68f1e
                name: "TEST"
              uri: snowflake://sifflet-internal/DEMO.TEST.ALLUSERS
            name: FIRSTNAME

Need help defining joins or YAML-based monitors? Reach out to your Sifflet Customer Success Manager.