
Queries and Performance

Your queries to the Workflow worker can potentially return huge amounts of data. This article looks at different scenarios and how more selective queries can make things faster.

The focus here is on the standard workflow worker methods for returning data. You can get even better performance by using the search collection. See Process Instance Search Indexing for more information.

Querying Process Instances

getProcessInstances and getHistoricProcessInstances can potentially return every active and finished process instance in the workflow engine. This may add up to thousands of instances, each of which could contain hundreds of process variables. Fetching all of this data and then throwing 95% of it away, because you only wanted a few details from a couple of instances, is inefficient but, luckily, easily avoided.

Both functions can return information about active instances (getHistoricProcessInstances does so when you set "finished": false), so when should you use each one?

Basic Calls and Results

Without setting any additional parameters, these are the results of a call to each function, returning information about a single instance.

getHistoricProcessInstances

function(params, credentials) {
    let resp = this.callWorkerMethod("workflow", "getHistoricProcessInstances", {
        "processInstanceBusinessKey": "0265-7976-9399-0675"
    });
    return resp;
}

{
    "businessKey": "0265-7976-9399-0675",
    "startUserId": "anonymous",
    "startTime": "2019-02-27T08:42:39Z",
    "id": "120023",
    "processDefinitionName": "messages",
    "endTime": null,
    "processDefinitionId": "messages:6:120021"
}

getProcessInstances

function(params, credentials) {
    let resp = this.callWorkerMethod("workflow", "getProcessInstances", {
        "processInstanceBusinessKey": "0265-7976-9399-0675"
    });
    return resp;
}

{
    "businessKey": "0265-7976-9399-0675",
    "isSuspended": false,
    "processVariables": {
        "_logFinalSummary": false,
        "form_TEXT1": "Some text",
        "_historyLabels": {
            "labela": "messages",
            "labelc": null,
            "labelb": "0265-7976-9399-0675",
            "labeld": null,
            "labele": null
        },
        "_startDate": "2019-02-27T08:42:39Z",
        "initiator": "anonymous",
        "_startForm": "FORMTOSTARTASIMPLEPROCESS",
        "_startDateString": "2019-02-27 08:42:39",
        "_initiatorProxy": "",
        "_businessKey": "0265-7976-9399-0675",
        "_processDefinitionID": "messages:6:120021",
        "_summaryForm": "FORMTOSTARTASIMPLEPROCESS",
        "_startPageURL": "http://timssite/icm/admin/icmapps/index.cfm?AppDir=icmFormApp:5",
        "_processDefinitionKey": "messages"
    },
    "processDefinition": {
        "id": "messages:6:120021",
        "formProperties": {},
        "category": "http://www.gossinteractive.com/processdef",
        "diagramResourceName": "messages-V-7.messages.png",
        "description": "messages",
        "isSuspended": false,
        "name": "messages",
        "identityLinks": [{
            "taskId": null,
            "groupId": null,
            "processInstanceId": null,
            "userId": "anonymous",
            "type": "candidate",
            "processDefinitionId": "messages:6:120021"
        }],
        "deploymentId": "120018",
        "resourceName": "messages-V-7.bpmn20.xml",
        "key": "messages",
        "version": 6
    },
    "activityId": "sid-E6880901-D6DB-48D1-8BEB-3C5E39AC47D7",
    "id": "120023",
    "startTime": "2019-02-27T08:42:39Z",
    "startUserId": "anonymous",
    "parentId": null,
    "isEnded": false,
    "processInstanceId": "120023",
    "rawFormData": {
        "TEXT1": "Some text"
    },
    "processDefinitionId": "messages:6:120021"
}

Quite a difference! The historic instances request defaults virtually all of its parameters to false, whereas the request to running instances includes process variables, form data and the definition itself, unless you explicitly exclude them.

getProcessInstance

If you know the process instance you are after (by its business key or instance ID) you could also use getProcessInstance. This returns even more information by default, as it includes information about all of the current tasks. However, as you'll only ever get a single instance back, performance won't be significantly different from the other functions.

Returning Variables

If you need variables back with your process instances, getProcessInstances and getProcessInstance let you set includeProcessVariables and includeTaskVariables to true or false (true by default). When true, they return all of the variables in each instance. However, this can be an expensive operation and it's far better to only request the variables you actually need.
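
As a minimal sketch of switching those defaults off, the end point below asks getProcessInstances for an instance without its variables. The parameter names are the ones mentioned above; the helper leanInstanceParams and the function name endpoint are hypothetical and only exist to keep the example self-contained.

```javascript
// Builds getProcessInstances parameters that skip variable loading.
// leanInstanceParams is a hypothetical helper, not part of the worker API.
function leanInstanceParams(businessKey) {
    return {
        "processInstanceBusinessKey": businessKey,
        "includeProcessVariables": false,
        "includeTaskVariables": false
    };
}

// End point body in the same style as the other examples in this article.
function endpoint(params, credentials) {
    let resp = this.callWorkerMethod("workflow", "getProcessInstances",
        leanInstanceParams("0265-7976-9399-0675"));
    return resp;
}
```

If only a handful of variables are needed, the variableList approach is likely to be a better fit than turning variables off entirely.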

getHistoricProcessInstances lets you set a variableList. This will limit the variables returned. For example:

function(params, credentials) {
    let resp = this.callWorkerMethod("workflow", "getHistoricProcessInstances", {
        "processInstanceBusinessKey": "0265-7976-9399-0675",
        "variableList": ["form_TEXT1"]
    });
    return resp;
}

Would return:

{
    "jsonrpc": "2.0",
    "id": 266,
    "result": {
        "id": "_99",
        "result": {
            "result": {
                "count": 1,
                "list": [{
                    "startUserId": "anonymous",
                    "startTime": "2019-02-27T08:42:39Z",
                    "id": "120023",
                    "businessKey": "0265-7976-9399-0675",
                    "processDefinitionName": "messages",
                    "processVariables": {
                        "form_TEXT1": "Some text"
                    },
                    "endTime": null,
                    "processDefinitionId": "messages:6:120021"
                }]
            }
        },
        "jsonrpc": "2.0"
    }
}

Using the Search Collection

A range of process and task data is also indexed by the platform's search engine, as described in Process Instance Search Indexing. Returning data from the search collection is much faster than using the worker methods described above.

The API lets you construct Filters using the keys described in getQueryProcesses and getQueryTasks to quickly return indexed data. It's particularly useful if you need to query:

  • All tasks assigned to a user
  • All tasks a user is a candidate for
  • All processes started by a user

One disadvantage of using the search collection is that it is a secondary data store that takes time to update, potentially up to 15 seconds. It will also only return process variables that have been set as searchable.

Truncation of Query Results

The workflow engine has an internal limit that prevents more than 20,000 process variables from being returned by any one query.

This will affect queries that return process or task descriptions, as well as queries that explicitly return process or task variables. Paging using firstResult/maxResults will only take place within this limited results set.

When result truncation has happened, the count returned by the worker will be higher than the number of items in the results list (note that this can also happen if maxResults has been specified).
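
A rough way to tell these cases apart is to compare the count with the list length and the paging values that were sent. The sketch below assumes result is the object holding count and list (as in the example response earlier), that count is the total number of matches, and that firstResult and maxResults are the values used in the request. The function name and the three labels are made up for illustration.

```javascript
// Classifies a result set as complete, paged, or possibly truncated.
// Assumes `result` has a numeric `count` and an array `list`.
function classifyResultSet(result, firstResult = 0, maxResults = null) {
    const returned = result.list.length;
    // Matches that lie beyond the end of this page.
    const remaining = result.count - (firstResult + returned);
    if (remaining <= 0) {
        return "complete";
    }
    // A full page explains why count is higher than the list length.
    if (maxResults !== null && returned >= maxResults) {
        return "more pages";
    }
    // Fewer items than requested, yet more matches exist.
    return "possibly truncated";
}
```

A "possibly truncated" answer is only a hint: it means the shortfall is not explained by paging, which is when the batched approach is worth considering.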

If you need to return process/task descriptions or variables for a large results set, you could do this by:

  1. Performing an initial query that does not request this information to retrieve a list of relevant process/task IDs.
  2. Making a series of batched queries to getProcessInstances/getTasks to fill in the additional information. For processes, the easiest way is probably to use the processInstanceIds parameter to explicitly list the processes to query in each batch. To page through tasks, sort by create time ascending and, for each batch, set the taskCreateAfter parameter from the most recently created task returned by the previous batch. Batches paged this way will overlap slightly, but this should not be an issue as long as the results of the initial query and the batched queries are matched up using task IDs.
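
The process half of this approach could be sketched as follows. This is an illustration under assumptions, not a definitive implementation: chunk, fetchDetailsInBatches and the fetchBatch callback are hypothetical names, and fetchBatch is assumed to take an array of process instance IDs and return an array of instance objects that each carry an id.

```javascript
// Splits an array into consecutive slices of at most `size` items.
function chunk(items, size) {
    if (!Number.isInteger(size) || size < 1) {
        throw new RangeError("size must be a positive integer");
    }
    const batches = [];
    for (let i = 0; i < items.length; i += size) {
        batches.push(items.slice(i, i + size));
    }
    return batches;
}

// Calls fetchBatch once per slice of IDs and merges the results.
// Results are keyed by instance id, so repeated instances collapse.
function fetchDetailsInBatches(ids, batchSize, fetchBatch) {
    const byId = new Map();
    for (const batch of chunk(ids, batchSize)) {
        for (const instance of fetchBatch(batch)) {
            byId.set(instance.id, instance);
        }
    }
    // Keep the order of the initial query; skip IDs with no details.
    return ids.filter(id => byId.has(id)).map(id => byId.get(id));
}
```

Inside an end point, fetchBatch might wrap a call to getProcessInstances with "processInstanceIds" set to the batch and return the list from the response; the exact nesting of that response should be checked against a real call. The same merge-by-ID idea covers the overlap between task batches.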

Last modified on 1 August 2023
