Time parameters

There are three main time parameters to consider when configuring your rules:

  • Rule run frequency
  • Time window
  • Time offset

Frequency

Frequency is how often your rule is run. It can be defined either with a shortcut (@daily, @hourly) or with a cron expression. If you are not familiar with cron expressions, you will find more info here.

📘

Optimize your run frequency

You should adapt your run frequency to the refresh frequency of your data.
If your data is updated by a batch job at 2am every day, running your rules hourly would be suboptimal.
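As an illustration of the advice above, the sketch below expands the two shortcuts mentioned in this section into their conventional five-field cron equivalents and builds a once-a-day expression for a table refreshed at 2am. The helper names `expand_frequency` and `daily_at` are hypothetical, the one-hour margin after the batch is an arbitrary choice, and the mapping assumes Sifflet follows the usual cron meaning of @daily and @hourly; the time zone used by the scheduler is not covered here.

```python
# Conventional cron equivalents of the shortcuts named in this section.
# Fields: minute, hour, day of month, month, day of week.
# Assumption: the shortcuts carry their usual cron meaning.
CRON_SHORTCUTS = {
    "@hourly": "0 * * * *",  # at minute 0 of every hour
    "@daily": "0 0 * * *",   # at 00:00 every day
}


def expand_frequency(frequency: str) -> str:
    """Return the cron expression for a known shortcut, or the input unchanged."""
    return CRON_SHORTCUTS.get(frequency, frequency)


def daily_at(hour: int, minute: int = 0) -> str:
    """Build a cron expression that fires once a day at the given time."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be in 0-23 and minute in 0-59")
    return f"{minute} {hour} * * *"


# The batch lands at 2am, so one run at 3am (hypothetical margin)
# checks each load once instead of 24 times a day.
print(expand_frequency("@daily"))
print(daily_at(3))
```

With a daily batch, a single scheduled run shortly after the load covers all new data; an hourly schedule would rescan the same rows 23 more times.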

Time window

The time window is the time interval that is analyzed every time the rule is run. Unless your tables are frequently reloaded in full, setting a time window is recommended to optimize your resources: Sifflet will scan only the data ingested during the configured time window.

For instance, let's say you want to monitor a table that is updated daily. Setting a time window on this table will require:

  • A specific field that represents the time dimension (for instance, the creation time of the data entry)
  • The historical time window range that you want the rule to scan (for example, one day, one week, or one month)

Some templates require a time window, whereas it is optional for others. You will be asked to select the field representing the time dimension, the unit of time, and the number of units.
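The three inputs described above (time field, unit, and amount) can be sketched as a simple row filter. This is only an illustration of the concept, not how Sifflet implements it: the function names are hypothetical, the half-open interval is an assumption, and only units that `datetime.timedelta` supports directly are handled (months are left out).

```python
from datetime import datetime, timedelta

# Units accepted by this sketch; each maps to a timedelta keyword argument.
SUPPORTED_UNITS = {"hours", "days", "weeks"}


def window_bounds(reference: datetime, amount: int, unit: str) -> tuple[datetime, datetime]:
    """Return (start, end) for a window of `amount` units ending at `reference`."""
    if unit not in SUPPORTED_UNITS:
        raise ValueError(f"unsupported unit: {unit}")
    return reference - timedelta(**{unit: amount}), reference


def rows_in_window(rows: list[dict], time_field: str,
                   start: datetime, end: datetime) -> list[dict]:
    """Keep rows whose time field falls in [start, end) (assumed half-open)."""
    return [row for row in rows if start <= row[time_field] < end]


# Example: the last 30 days on field "Date", evaluated on 2024-03-31.
rows = [
    {"id": 1, "Date": datetime(2024, 2, 15)},  # older than the window
    {"id": 2, "Date": datetime(2024, 3, 5)},
    {"id": 3, "Date": datetime(2024, 3, 30)},
]
start, end = window_bounds(datetime(2024, 3, 31), amount=30, unit="days")
scanned = rows_in_window(rows, "Date", start, end)
print([row["id"] for row in scanned])
```

Only rows 2 and 3 fall inside the window, which is why a time window reduces the amount of data scanned on each run.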

Time window on the last 30 days, on field "Date"

In this diagram, the frequency and time window settings are optimal

📘

Optimize your time window setup

Your time window should be based on the ingestion frequency of your table, which should also be your run frequency if you have followed the advice above. If the table is updated daily, for example, then a time window of one day will check the newly ingested data each time the rule runs.

Time offset

By default, Sifflet runs the monitoring rules on today's date. In some cases, your pipelines are configured so that tables are updated with a time delay: for instance, a sales table updated every morning with the data for T-2 days. To take this into account, Sifflet allows you to change the reference date by using an offset.
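The effect of an offset on the reference date can be sketched as follows. The function name `scanned_days` is hypothetical, and the sketch assumes that the window is counted in whole days and includes the reference date itself; the exact boundary behavior in Sifflet may differ.

```python
from datetime import date, timedelta


def scanned_days(today: date, window_days: int, offset_days: int = 0) -> list[date]:
    """List the days covered by a window ending on the reference date.

    The reference date is `today` shifted back by `offset_days`.
    Assumption: the window includes the reference date.
    """
    reference = today - timedelta(days=offset_days)
    return [reference - timedelta(days=i) for i in range(window_days - 1, -1, -1)]


today = date(2024, 3, 10)

# No offset: the rule looks at today's date, for which the sales table
# (loaded with data for T-2) has no rows yet.
print(scanned_days(today, window_days=1))

# Offset of 2 days: the rule looks at 2024-03-08, the day that was just loaded.
print(scanned_days(today, window_days=1, offset_days=2))
```

Without the offset, a rule on the T-2 sales table would always inspect a day that has not been loaded yet; shifting the reference date aligns the scan with the data that actually arrived.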

Time offset of 1 day

In this diagram, there is an offset of one time window

Parameters at dataset level

The time window and offset can also be configured at dataset level. When you do so, every rule created on this asset will have those parameters by default. To do so, select an asset in the Catalog and go to Overview. If Sifflet detects at least one temporal field, you will be able to configure a time window and an offset.
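The idea of dataset-level defaults can be pictured with the small sketch below. It is a conceptual model only: `TimeSettings` and `effective_settings` are hypothetical names, and the assumption that a rule can carry its own settings that take precedence over the dataset defaults is inferred from the phrase "by default" rather than stated in this section.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TimeSettings:
    """Hypothetical container for the time parameters of a rule or dataset."""
    time_field: str
    window_days: int
    offset_days: int = 0


def effective_settings(dataset_default: Optional[TimeSettings],
                       rule_settings: Optional[TimeSettings]) -> Optional[TimeSettings]:
    """Use the rule's own settings if present, else fall back to the dataset's."""
    return rule_settings if rule_settings is not None else dataset_default


# Dataset-level default: last 30 days on field "Date", with a 1-day offset.
dataset_default = TimeSettings(time_field="Date", window_days=30, offset_days=1)

# A rule created without its own settings picks up the dataset default.
print(effective_settings(dataset_default, None))

# A rule with explicit settings keeps them (assumed behavior).
print(effective_settings(dataset_default, TimeSettings("Date", window_days=1)))
```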

Setting the time window and offset at dataset level