Time-based Data Aggregation

Overview

Time-based Data Aggregation is a setting that defines how often data points are created. It can be aligned with the Monitor Run Schedule, but it does not have to be.

How to use it

Where to find it

The Time-based Data Aggregation setting is available in the Monitor Setup. It lets you adjust the data point creation frequency to better match your data monitoring needs.

How does it work

The two graphs below show the results of the same monitor run on the exact same dataset with the same settings. The only difference is the Time-based Data Aggregation parameter: daily in the upper image and hourly in the lower image. Notice the improved precision achieved with the same Scheduling Frequency.

Graph with the Data Aggregation parameter set to "daily"
Graph with the Data Aggregation parameter set to "hourly"
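
The difference between the two graphs can be illustrated with a minimal sketch that groups the same events into daily and hourly data points. This is not the product's implementation: the function aggregate_counts and the sample events are hypothetical, and a simple event count stands in for whatever metric the monitor computes.

```python
from collections import Counter
from datetime import datetime, timedelta


def aggregate_counts(timestamps, granularity):
    """Count events per time bucket (hypothetical helper).

    granularity is "hourly" or "daily"; each bucket becomes one data point.
    """
    counts = Counter()
    for ts in timestamps:
        if granularity == "hourly":
            bucket = ts.replace(minute=0, second=0, microsecond=0)
        elif granularity == "daily":
            bucket = ts.replace(hour=0, minute=0, second=0, microsecond=0)
        else:
            raise ValueError(f"unsupported granularity: {granularity}")
        counts[bucket] += 1
    return dict(sorted(counts.items()))


# Made-up data: one event per hour for a day, plus a burst after 12:00.
start = datetime(2024, 1, 1)
events = [start + timedelta(hours=h) for h in range(24)]
events += [start + timedelta(hours=12, minutes=m) for m in range(1, 10)]

daily = aggregate_counts(events, "daily")    # a single data point for the day
hourly = aggregate_counts(events, "hourly")  # one data point per hour

print(len(daily), len(hourly))
```

With daily aggregation the midday burst is folded into one total for the day, while with hourly aggregation it stands out in the 12:00 data point, even though both results come from a single run over the same data.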

Impact on ML monitors Time Window (Model Training Period)

To ensure a sustainable level of performance, the maximum value of the Time Window (Model Training Period) parameter is limited, depending on the chosen Time-based Data Aggregation setting, as follows.

Time-based Data Aggregation       Time Window (Model Training Period) maximum value
monthly                           5 years
weekly                            3 years
daily                             3 years
hourly                            45 days
intra-hourly (enterprise-only)    7 days
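
The limits above can be captured in a small lookup, sketched here for illustration. The names MAX_TIME_WINDOW_DAYS and exceeds_limit are hypothetical, years are approximated as 365 days, and how the product itself reacts to a value over the limit is not specified here, so the sketch only reports whether a requested window is too long.

```python
# Maximum Time Window per aggregation setting, in days.
# Assumption: 1 year is approximated as 365 days.
MAX_TIME_WINDOW_DAYS = {
    "monthly": 5 * 365,
    "weekly": 3 * 365,
    "daily": 3 * 365,
    "hourly": 45,
    "intra-hourly": 7,  # enterprise-only
}


def exceeds_limit(aggregation, requested_days):
    """Return True if the requested Time Window is longer than the maximum."""
    if aggregation not in MAX_TIME_WINDOW_DAYS:
        raise ValueError(f"unknown aggregation setting: {aggregation}")
    return requested_days > MAX_TIME_WINDOW_DAYS[aggregation]


print(exceeds_limit("hourly", 60))  # 60 days is above the 45-day maximum
print(exceeds_limit("daily", 365))  # 1 year is within the 3-year maximum
```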

Examples

  • hourly - well suited to monitors that track how the number of transactions evolves during the day. You get 1 data point per hour, so even if the monitor runs once per day, you can still detect, for example, a lunchtime spike in grocery items.
  • daily - useful for monitoring the sum of sales during the day. For example, run a monitor on a weekly schedule and create 1 data point per day. This lets you account for differences between particular days of the week.
  • weekly, monthly - suited to logistics and supply chain scenarios, for example monitoring the relationship between incoming and outgoing packages on a weekly basis.
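
To make the daily example more concrete, the sketch below groups daily data points by day of the week, which is one simple way to see why per-day data points help account for weekday differences. The function weekday_baseline and the sales figures are made up for illustration and do not describe how the monitor's model works.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean


def weekday_baseline(daily_points):
    """Average the daily values per weekday (0 = Monday, 6 = Sunday)."""
    by_weekday = defaultdict(list)
    for day, value in daily_points.items():
        by_weekday[day.weekday()].append(value)
    return {wd: mean(values) for wd, values in sorted(by_weekday.items())}


# Made-up daily sales for two weeks: weekends sell twice as much as weekdays.
start = date(2024, 1, 1)  # a Monday
daily_sales = {}
for i in range(14):
    day = start + timedelta(days=i)
    daily_sales[day] = 200 if day.weekday() >= 5 else 100

baseline = weekday_baseline(daily_sales)
print(baseline)
```

A single weekly total would hide the weekend pattern, whereas one data point per day keeps it visible even when the monitor itself runs only once a week.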