Toggle menu

History Digest Failures

This article describes a fault finding technique used when investigating a scheduled history digest that failed to complete.

Understanding the digestion process

History Digest Overview exist to turn potentially large unstructured history data into known, manageable, structured data that can be used for things like reporting and dashboards.

Digestion is a fast and efficient process because no data ever leaves the database server. It runs queries which, in essence, do nothing more than query some values from the history tables and insert them into the digest table. This happens in batches so that the database server isn't overwhelmed.

Within a batch, the digestion happens column by column in a two stage process. Firstly any new histories (in the batch) which are eligible to be digested are identified and the row for that history is inserted into the digest table (with a default value). Secondly the digest rows are updated with any new values from the batch of events.

Finding the undigestable data

It is not unusual to be confronted by a call to digestHistories() which fails. You usually get told something like "<unintelligible database error> while digesting data". It might seem unreasonable that the message doesn't tell you a problem value or the table that contained the error, however it's understandable once you realise that the query which failed is accessing multiple rows and columns at the same time.

The most powerful tool in fault finding is to identify which Event in the history data contains the undigestable value.

Calls to digestHistories() can contain a batchsize parameter which defaults to 250,000. When the call to digestHistories() fails you know that the problem data is somewhere in the next batchsize events. By steadily reducing the batchsize you can narrow down the list of events which might be causing the problem. You can use a binary search and narrow it down to one event.

This query will tell you which events have already been digested into which columns:

SELECT WkHistoryDigestTables.Name, WkHistoryDigestCols.Name, WkHistoryDigestCols.Type, WkHistoryDigestCols.Op, WkHistoryDigestCols.K, WkHistoryDigestCols.CursorPos
FROM WkHistoryDigestTables, WkHistoryDigestCols
WHERE WkHistoryDigestTables.ID = WkHistoryDigestCols.TableID

The CursorPos is the ID of the last event which was successfully examined during digestion into a particular column.

This query will let you look at the raw history data for a couple of events:

SELECT *
FROM WkHistoryEvents, WkHistoryValues
WHERE WkHistoryEvents.ID = WkHistoryValues.EventID
AND WkHistoryEvents.ID IN (12512, 12513)

You might be able to identify the problem at this stage.

Scratch Digests

It might be useful to digest the data into a smaller or simpler digest. In the past we have been able to prove which column of the digest was having a problem by digesting the same history data into a digest that contained only one column.

Additional Parameters

The method digesthistories() has a parameter "getsql":true which will make it give back the sql of all the queries it ran. The problem is that it doesn't return the queries if an error was thrown while one of them was running.

Last modified on 19 January 2022

Share this page

Facebook icon Twitter icon email icon

Print

print icon