Monitors library
A monitor can be set on either a table or a field to check specific data quality criteria. Monitors run on a flexible schedule and can alert you on Slack or by email when a breach occurs.
The table below lists the available monitors with a brief description of what each one does.
Type | Monitor | Applicable To | Description |
---|---|---|---|
Metadata | Completeness (ML-Based) | Table | Counts the new ingested rows and compares them to expectations based on past behavior. |
Metadata | Duplicates (ML-Based) | Table | Computes the duplication rate on row-level and compares it to expectations based on past behavior. |
Metadata | Freshness (ML-Based) | Table | Detects how frequently new rows are ingested into your table. |
Metadata | Schema Change | Table | Detects any new change to the schema: new field(s), removed field(s), existing field(s) with updated types or names. |
Metrics | Average (static thresholds) | Fields: Numeric | The monitor fails if the average of the field is outside of a given range. |
Metrics | Values count (static thresholds) | Fields: All | The monitor fails if the number of unique values of the field is outside of a given range. |
Metrics | Quantile (static thresholds) | Fields: Numeric | The monitor fails if a quantile of the field is outside of a given range. |
Metrics | Values (static thresholds) | Fields: Numeric | The monitor fails if the chosen field has one or more values outside of a given range. |
Metrics | Standard Deviation (static thresholds) | Fields: Numeric | The monitor fails if the standard deviation of the field is outside of a given range. |
Metrics | Variance (static thresholds) | Fields: Numeric | The monitor fails if the variance of the field is outside of a given range. |
Smart Metrics | Interlinked Metrics | Fields: Numeric | The monitor fails if the defined metrics diverge significantly from each other. |
Smart Metrics | Metrics (dynamic thresholds) | Fields: Numeric | The monitor fails if the selected statistical transformation of the field behaves differently than it did in the past. |
Smart Metrics | Metrics Custom (dynamic thresholds) | Table | The monitor fails if the time series returned by the query behave differently than they did in the past. |
Field profiling | Distribution Change | Fields: All | The monitor fails if the distribution of a given field has changed abnormally compared to a given previous run. |
Field profiling | Duplicates in % (static thresholds) | Fields: All | The monitor fails if the duplicate rate of the chosen field exceeds a given threshold. |
Field profiling | Duplicates in # (dynamic thresholds) | Fields: All | The monitor fails if the count of duplicate values of the field is abnormal compared to expectations based on past behavior. |
Field profiling | Duplicates in % (dynamic thresholds) | Fields: All | The monitor fails if the % of duplicate values of the field is abnormal compared to expectations based on past behavior. |
Field profiling | Low Cardinality | Fields: All | The monitor fails if the number of distinct values of the chosen field exceeds a given threshold, or if the set of distinct values has changed since the previous run (e.g. from ['dog', 'cat'] to ['dog', 'rabbit', 'turtle']). |
Field profiling | Not after date | Fields: Timestamps, Dates | The monitor fails if the table has rows after a given date. |
Field profiling | Not before date | Fields: Timestamps, Dates | The monitor fails if the table has rows before a given date. |
Field profiling | Not in the list | Fields: String | The monitor fails if the chosen field has values that are not present in the given list. |
Field profiling | Null in # (static thresholds) | Fields: All | The monitor fails if the chosen field has values that are empty/null. |
Field profiling | Null in # (dynamic thresholds) | Fields: All | The monitor fails if the count of null values of the field is abnormal compared to expectations based on past behavior. |
Field profiling | Null in % (dynamic thresholds) | Fields: All | The monitor fails if the % of null values of the field is abnormal compared to expectations based on past behavior. |
Field profiling | Unique | Fields: All | The monitor fails if the chosen field contains duplicate values. |
Format validation | Is an email | Fields: String | The monitor fails if the chosen field contains at least one row that does not have an email format. |
Format validation | Is a phone number | Fields: String | The monitor fails if the chosen field contains at least one row that does not have a phone number format. |
Format validation | Is UUID | Fields: String | The monitor fails if the chosen field contains at least one row that does not have a UUID format. |
Format validation | Matches regex | Fields: String | The monitor fails if the selected field contains at least one row that does not match the format specified by the given regular expression. |
Custom | SQL | Table | Advanced template to write custom monitors based on business specifics. The SQL query must describe a quality breach on one or more tables within the same data source. |
Custom | Conditional rules | Table | Like the SQL template, this template lets you write custom monitors based on business use cases. It uses conditional statements, so no SQL syntax is needed. The monitor fails if any values match the filtering criteria set by the conditional rules. |
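
To make the pass/fail semantics of a few rows in the table more concrete, here is a minimal Python sketch of how some static-threshold, field-profiling, and format-validation checks could be expressed over a list of field values. This is an illustration only, not how the product implements its monitors: every function name and parameter below is hypothetical, and the low-cardinality check assumes that "threshold" refers to the number of distinct values.

```python
import re
from statistics import mean


def average_in_range(values, low, high):
    """Average (static thresholds): pass only if the mean lies within [low, high]."""
    return low <= mean(values) <= high


def null_count(values):
    """Null in #: count entries that are None or empty strings."""
    return sum(1 for v in values if v is None or v == "")


def not_in_list(values, allowed):
    """Not in the list: return the values that are absent from the allowed list."""
    allowed_set = set(allowed)
    return [v for v in values if v not in allowed_set]


def low_cardinality_breach(current, previous, max_distinct):
    """Low Cardinality (assumed reading): breach if there are more than
    max_distinct distinct values, or if the set of distinct values changed
    since the previous run."""
    cur, prev = set(current), set(previous)
    return len(cur) > max_distinct or cur != prev


def matches_regex(values, pattern):
    """Matches regex: pass only if every value fully matches the pattern."""
    compiled = re.compile(pattern)
    return all(compiled.fullmatch(v) is not None for v in values)
```

For example, `low_cardinality_breach(['dog', 'rabbit', 'turtle'], ['dog', 'cat'], 5)` reports a breach because the set of distinct values changed, even though the count stays under the limit.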