Thresholds

Thresholds in Data Quality Monitors

In Sifflet's Data Quality Monitors, thresholds define when alerts are triggered based on the values of monitored fields. Below are the types of thresholds you can use in YAML-based configuration.

Threshold Types

Data Quality Monitors support the following types of thresholds:

  • Dynamic Threshold (Default): Uses machine learning (ML) to automatically adjust thresholds based on seasonal patterns and detect anomalies.
  • Static Threshold: Predefined fixed limits, set as a minimum, a maximum, or both.
  • Relative Percentage Threshold: Compares current values to previous values by a percentage change.
  • Relative Threshold (Absolute Mode): Compares current values to previous values by a fixed absolute value change.

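If no threshold is specified, a default is applied that depends on the monitor type: a dynamic threshold for Volume and Metrics Monitors, and a static threshold of 0 for Field Format and Field Profiling Monitors.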


Dynamic Threshold

Dynamic thresholds use machine learning to detect seasonality and anomalies in the data. These are ideal for time-series data where you want the system to adapt to changing trends.

threshold:
  kind: Dynamic
  sensitivity: 26
  • sensitivity: Adjusts how reactive the threshold is to data changes. A higher value means more sensitivity to smaller deviations.

Use Case: Detect seasonal variations in sales data automatically.


Static Threshold

Static thresholds are fixed limits. Alerts are triggered when the monitored value exceeds or falls below these specified boundaries.

Static Threshold (Minimum Only)

threshold:
  kind: Static
  min: 0
  • min: The lower bound that triggers an alert if the value falls below it.

Use Case: Alert when a count falls below a required minimum, for example when no sales are made.

Static Threshold (With Min and Max)

threshold:
  kind: Static
  min: 0
  max: 1000000
  isMaxInclusive: false
  • min: The lower bound that triggers an alert if the value falls below it.
  • max: The upper bound that triggers an alert if the value exceeds it.
  • isMaxInclusive: Whether the upper bound is inclusive (true) or exclusive (false).

Use Case: Use both min and max thresholds to check that the count stays within a valid range, for example alerting when sales drop below zero or exceed a defined upper limit.
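To make the bound semantics concrete, the following Python sketch checks a value against a static range. It is an illustration only, not Sifflet's implementation. The function name violates_static is hypothetical, and the sketch assumes that an inclusive bound treats the bound value itself as acceptable, and that bounds are inclusive when the flag is omitted.

```python
from typing import Optional


def violates_static(
    value: float,
    min_bound: Optional[float] = None,
    max_bound: Optional[float] = None,
    is_min_inclusive: bool = True,  # assumed default, not confirmed by the docs
    is_max_inclusive: bool = True,  # assumed default, not confirmed by the docs
) -> bool:
    """Return True if `value` falls outside the accepted range (an alert)."""
    if min_bound is not None:
        # Inclusive: the bound itself is accepted, so only value < min alerts.
        # Exclusive: the bound itself is rejected, so value <= min alerts.
        below = value < min_bound if is_min_inclusive else value <= min_bound
        if below:
            return True
    if max_bound is not None:
        above = value > max_bound if is_max_inclusive else value >= max_bound
        if above:
            return True
    return False


# Mirrors the example above: min 0, max 1000000, exclusive upper bound.
print(violates_static(500, 0, 1_000_000, is_max_inclusive=False))        # False
print(violates_static(1_000_000, 0, 1_000_000, is_max_inclusive=False))  # True
print(violates_static(-1, 0, 1_000_000, is_max_inclusive=False))         # True
```

Under this reading, setting isMaxInclusive to false means a value of exactly 1000000 already triggers an alert.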


Relative Thresholds

Relative Percentage Threshold

Relative Percentage thresholds compare the current value with the previous time window and alert when the value changes by more than a specified percentage.

threshold:
  kind: Static
  comparisonMode: RelativePercentage
  min: -5%
  isMinInclusive: false
  max: 5%
  isMaxInclusive: false
  • comparisonMode: Set to RelativePercentage to compare by percentage.
  • min: The lower percentage change that triggers an alert.
  • max: The upper percentage change that triggers an alert.
  • isMinInclusive / isMaxInclusive: Whether the min/max values are inclusive (true) or exclusive (false).

Use Case: Trigger alerts if the count changes by more than ±5% from the previous period.
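The percentage comparison can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not Sifflet's actual computation: the function names are hypothetical, the change is assumed to be measured relative to the previous window's value, and exclusive bounds are assumed to mean that a change of exactly -5% or +5% already alerts.

```python
def percent_change(current: float, previous: float) -> float:
    """Percentage change from `previous` to `current`."""
    if previous == 0:
        # Undefined in this sketch; how the product handles it is not documented here.
        raise ValueError("percentage change is undefined when the previous value is 0")
    return (current - previous) * 100.0 / abs(previous)


def violates_relative_percentage(
    current: float,
    previous: float,
    min_pct: float = -5.0,
    max_pct: float = 5.0,
) -> bool:
    """Return True if the change falls outside the open interval (min_pct, max_pct)."""
    change = percent_change(current, previous)
    # Both bounds exclusive, matching isMinInclusive/isMaxInclusive: false above.
    return change <= min_pct or change >= max_pct


print(violates_relative_percentage(1030, 1000))  # +3%  -> False
print(violates_relative_percentage(1060, 1000))  # +6%  -> True
print(violates_relative_percentage(900, 1000))   # -10% -> True
```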

Relative Threshold (Absolute Mode)

Relative thresholds in absolute mode compare the current value with the previous time window and alert on the absolute change rather than the percentage change.

threshold:
  kind: Static
  comparisonMode: Relative
  min: -5
  isMinInclusive: false
  max: 100
  isMaxInclusive: false
  • comparisonMode: Set to Relative to compare by absolute value change.
  • min: The lower absolute value change that triggers an alert.
  • max: The upper absolute value change that triggers an alert.
  • isMinInclusive / isMaxInclusive: Whether the min/max values are inclusive (true) or exclusive (false).

Use Case: Trigger alerts if the value drops by more than 5 units or increases by more than 100 units.
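The absolute-mode comparison reduces to a difference between two windows. The Python sketch below illustrates it under stated assumptions and is not Sifflet's implementation: the function name is hypothetical, and exclusive bounds are assumed to mean that a change of exactly -5 or +100 already alerts.

```python
def violates_relative_absolute(
    current: float,
    previous: float,
    min_delta: float = -5,
    max_delta: float = 100,
) -> bool:
    """Return True if (current - previous) falls outside the open interval (min_delta, max_delta)."""
    delta = current - previous
    # Both bounds exclusive, matching isMinInclusive/isMaxInclusive: false above.
    return delta <= min_delta or delta >= max_delta


print(violates_relative_absolute(250, 200))  # change of +50  -> False
print(violates_relative_absolute(190, 200))  # change of -10  -> True
print(violates_relative_absolute(320, 200))  # change of +120 -> True
```

Note that the bounds need not be symmetric: here a small drop alerts while a much larger increase is tolerated.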


Default Behavior

If no threshold is specified in the monitor configuration:

  • For Volume and Metrics Monitors: a dynamic threshold is used by default. This ensures that even without explicit threshold settings, the monitor will dynamically adjust and detect anomalies using machine learning.
  • For Field Format and Field Profiling Monitors: a static threshold of 0 is used by default. This ensures that even without explicit threshold settings, the monitor will trigger when duplicates, nulls, or incorrect formats are detected.