This article provides an overview of history digests. For full documentation, see the History Digests API.
What are History Digests for?
History data is unstructured and large. Any data that can be expressed in JSON can be stored in the subject or in an event in a history, so any collection of fields might exist in each subject/event. Because of the large number of events and their unstructured nature, it can be slow and cumbersome to get answers to certain questions from the history service. Questions like:
- "How many different reports have come from this postcode?"
- "Did we close more cases in August than July?"
What does a Digester do?
Digesters take the entire history database and copy a subset of the data into a separate database table. This has two advantages over the complete history database:
- The data is structured
- The data is small
Only the fields that were specified are copied into the digest, so none of the other fields in each event are present. Only the histories and events of interest to the digester are copied, so the bulk of the history data is left behind. This small table, with only the necessary columns, can answer the sort of questions posed above very quickly.
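As a rough illustration of the idea (not the service's actual implementation), the following Python sketch copies one chosen field out of a couple of unstructured history records into uniform rows. The record shapes and the field names (`postcode`, `notes`, `colour`) are hypothetical and only stand in for "any data that can be expressed in JSON".

```python
# Conceptual sketch only: hypothetical history records, not the real storage format.
histories = [
    {"id": 1, "subject": {"postcode": "AB1 2CD", "notes": "free text"}},
    {"id": 2, "subject": {"postcode": "EF3 4GH", "colour": "red", "extra": [1, 2]}},
]


def digest_row(history, fields):
    """Keep only the named subject fields; everything else is left behind."""
    row = {"history_id": history["id"]}
    for field in fields:
        # A field missing from a history becomes None, so every row has the same columns.
        row[field] = history["subject"].get(field)
    return row


# One structured row per history, containing only the column we asked for.
rows = [digest_row(h, ["postcode"]) for h in histories]
```

Every row ends up with the same small set of columns, which is what makes the digest table quick to query.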
Digests vs Aggregations
Most of these types of questions rely on aggregate functions like
Digest tables are strictly one-row-per-history and when a history is deleted so too is that row of the digest. All the aggregations should be performed either as the data is queried from the view or in post processing of the queried data. The two main challenges of designing a digest are deciding what data to keep in the digest and working out how to query that data to produce meaningful information.
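To make the split between digesting and aggregating concrete, here is a minimal Python sketch of aggregation done in post-processing. It assumes the digest rows have already been queried into memory and that the digest has a hypothetical `closed_at` column; neither the column name nor the row shape comes from the History Digests API.

```python
from collections import Counter
from datetime import datetime

# Hypothetical digest rows: one row per history, as the digest table guarantees.
digest_rows = [
    {"history_id": 1, "closed_at": datetime(2023, 7, 14, 9, 30)},
    {"history_id": 2, "closed_at": datetime(2023, 8, 2, 16, 5)},
    {"history_id": 3, "closed_at": datetime(2023, 8, 21, 11, 0)},
    {"history_id": 4, "closed_at": None},  # case still open
]


def closed_per_month(rows):
    """Count closed cases per (year, month); the aggregation happens here, not in the digest."""
    counts = Counter()
    for row in rows:
        closed_at = row["closed_at"]
        if closed_at is not None:
            counts[(closed_at.year, closed_at.month)] += 1
    return counts


counts = closed_per_month(digest_rows)
# "Did we close more cases in August than July?"
more_in_august = counts[(2023, 8)] > counts[(2023, 7)]
```

The digest only stores the per-history value; the month-by-month comparison is worked out after the rows are read.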
Designing a Digester
registerDigest() builds your digester. It includes different levels of filter which can:
- Target specific histories
- Only consider histories that contain certain events
- Pick the events within a history that will be digested
Each row in your digest table corresponds to one history, and the columns in your table will be one of seven types, populated using one of nine column operations, including the ability to create child tables.
That level of flexibility can make a digester difficult to get your head around!
Which Histories?
There are two filters you can use to select the histories to be digested. They can be used on their own or together.
The top-level
"filter": {
    "key": "labela",
    "EQ": "Customer Enquiry"
}
The top-level
"eventFilter": {
    "key": "event",
    "EQ": "Mail Sent"
}
Generally the top-level
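The way the two history-level filters narrow down the set of histories can be sketched in Python. This is a conceptual model, not the service's code, and it rests on several assumptions: that the first filter is tested against fields of the history's subject, that the event filter passes when at least one event in the history matches, that a history must pass both filters when both are supplied, and that only the `EQ` comparison shown above needs modelling. The record shapes are hypothetical.

```python
def matches(flt, record):
    """True when record[flt["key"]] equals flt["EQ"]. Only EQ is modelled here."""
    return record.get(flt["key"]) == flt["EQ"]


def select_histories(histories, subject_filter=None, event_filter=None):
    """Return the histories that pass every filter supplied (assumed AND semantics)."""
    selected = []
    for history in histories:
        if subject_filter is not None and not matches(subject_filter, history["subject"]):
            continue
        if event_filter is not None and not any(
            matches(event_filter, event) for event in history["events"]
        ):
            continue
        selected.append(history)
    return selected


# Hypothetical sample data.
histories = [
    {"id": 1, "subject": {"labela": "Customer Enquiry"}, "events": [{"event": "Mail Sent"}]},
    {"id": 2, "subject": {"labela": "Customer Enquiry"}, "events": [{"event": "Opened"}]},
    {"id": 3, "subject": {"labela": "Complaint"}, "events": [{"event": "Mail Sent"}]},
]
subject_filter = {"key": "labela", "EQ": "Customer Enquiry"}
event_filter = {"key": "event", "EQ": "Mail Sent"}

both = [h["id"] for h in select_histories(histories, subject_filter, event_filter)]
subject_only = [h["id"] for h in select_histories(histories, subject_filter=subject_filter)]
event_only = [h["id"] for h in select_histories(histories, event_filter=event_filter)]
```

With this sample data, only history 1 passes both filters, while each filter on its own lets two histories through.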
Which Events?
Once you have created filters that select relevant histories, the columns in your digester use an
Each column you define always has a
Child tables have their own set of columns. These columns have their own operations.
A column operation used on its own will find the first or last value (as appropriate) from any event in the history. You can also use a column-level
For example, this operation would record the timestamp of the first time an event called "Mail Sent" was recorded:
{
    "name": "first_response_time",
    "type": "datetime",
    "eventFilter": {
        "key": "event",
        "EQ": "Mail Sent"
    },
    "operation": "firsttimestamp"
}
And this would count how many events called "Mail Sent" there are in the history:
{
    "name": "purchase_count",
    "type": "int",
    "eventFilter": {
        "key": "event",
        "EQ": "Mail Sent"
    },
    "operation": "countevents"
}
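What these two column definitions compute for a single history can be modelled in a few lines of Python. This is only a sketch of the behaviour described above, not the digester itself: the `timestamp` field name on each event is hypothetical, "first" is assumed to mean the earliest timestamp, and only the `EQ` comparison is modelled.

```python
from datetime import datetime


def matches(flt, event):
    """True when event[flt["key"]] equals flt["EQ"]. Only EQ is modelled here."""
    return event.get(flt["key"]) == flt["EQ"]


def first_timestamp(events, event_filter=None):
    """Earliest timestamp among the matching events, or None if nothing matches."""
    times = [
        e["timestamp"]
        for e in events
        if event_filter is None or matches(event_filter, e)
    ]
    return min(times) if times else None


def count_events(events, event_filter=None):
    """Number of events in the history that pass the column-level filter."""
    return sum(1 for e in events if event_filter is None or matches(event_filter, e))


# Hypothetical events belonging to one history.
events = [
    {"event": "Opened", "timestamp": datetime(2023, 8, 1, 9, 0)},
    {"event": "Mail Sent", "timestamp": datetime(2023, 8, 2, 10, 0)},
    {"event": "Mail Sent", "timestamp": datetime(2023, 8, 5, 12, 0)},
]
mail_sent = {"key": "event", "EQ": "Mail Sent"}

# The two values that would land in this history's digest row.
row = {
    "first_response_time": first_timestamp(events, mail_sent),
    "purchase_count": count_events(events, mail_sent),
}
```

Each function reduces all of a history's matching events to a single value, which is how one history becomes one row.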
Performing the Digest
Once a digest has been registered, digestHistories() performs the digest operation. That function is often called from an End Point, which can in turn be called by a Scheduled Task. Where we have used digests to provide data for dashboards, the scheduled task generally calls the End Point every five minutes.