In a multi-node NLB K2 Server configuration (nodeA, nodeB and nodeC), any one of the three nodes can at any time lock a specific process instance for execution. If the responsible server fails for any reason while it is locking or executing that process instance, the instance stays locked against that server and is not executed by any other K2 Server node.
Traditionally, the failed K2 Server node was the only server able to execute the corresponding process instance. The process instance therefore lay dormant until the failed server came back online to continue processing.
An enhancement has been implemented in the K2 Server that allows any other node to pick up and continue execution of the process instance, but only if the first K2 Server has been down for a predetermined, configurable amount of time. By default, this period is 15 minutes: the other K2 Server nodes give the failing node 15 minutes to come back online and continue executing the process instance before one of them overrides the lock and executes the process instance itself.
Obviously, if the failing K2 Server comes back online within the allotted time period, it will stay responsible for the execution of the process instance.
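The takeover rule described above can be illustrated with a minimal sketch. This is not the K2 Server implementation; the function name, the `lock_owner_last_seen` timestamp, and the choice to allow takeover exactly at the timeout boundary are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Default from the article; in practice the value comes from ServerDownTimeout.
DEFAULT_SERVER_DOWN_TIMEOUT_MINUTES = 15


def may_take_over(lock_owner_last_seen, now,
                  timeout_minutes=DEFAULT_SERVER_DOWN_TIMEOUT_MINUTES):
    """Return True when another node may override the failed node's lock.

    lock_owner_last_seen is a hypothetical timestamp of the last moment the
    owning node was known to be online.
    """
    return now - lock_owner_last_seen >= timedelta(minutes=timeout_minutes)


last_seen = datetime(2024, 1, 1, 12, 0)
# Ten minutes after the failure: the owner is still within its grace period.
print(may_take_over(last_seen, datetime(2024, 1, 1, 12, 10)))
# Fifteen minutes after the failure: the grace period has elapsed.
print(may_take_over(last_seen, datetime(2024, 1, 1, 12, 15)))
```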
To change the default period of 15 minutes, locate the file:
<install dir>\K2 blackpearl\HostServer\bin\K2server.setup
Add the following setting to the appropriate section, and set the number of minutes that online servers wait for a failing server:
<Cluster ServerDownTimeout="15" />
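To double-check the value after editing, the attribute can be read back with a short script. The sketch below is only an illustration: it parses an XML string rather than the real file, the `<Settings>` wrapper element is hypothetical (the article does not show the file layout), and the fallback to 15 simply mirrors the default stated above.

```python
import xml.etree.ElementTree as ET

DEFAULT_TIMEOUT_MINUTES = 15  # default stated in the article


def read_server_down_timeout(xml_text):
    """Return ServerDownTimeout in minutes, or the default if it is not set."""
    root = ET.fromstring(xml_text)
    # Accept a <Cluster> element whether it is the root or nested deeper.
    cluster = root if root.tag == "Cluster" else root.find(".//Cluster")
    if cluster is None:
        return DEFAULT_TIMEOUT_MINUTES
    return int(cluster.get("ServerDownTimeout", DEFAULT_TIMEOUT_MINUTES))


# Hypothetical wrapper element; the real file layout is not shown in the article.
sample = '<Settings><Cluster ServerDownTimeout="30" /></Settings>'
print(read_server_down_timeout(sample))
```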
Note: The circumstances described in this article are one scenario under which this issue may occur or is known to occur. The description is intended to be specific to the scenario described and does not take into account all possible scenarios or circumstances.
To test the K2 Server failover support:
- In an NLB cluster environment containing more than 1 server, start a few new process instances.
- In the K2 Server database:
  - Query the _Server table to retrieve a list of Servers/ServerIDs.
  - Monitor the _ProcInst table’s ServerID field to make sure that process instances are executed by all Servers (a ServerID of 0 means that any server can execute the item).
- Stop one of the servers as soon as it is active and has some instances in a running state.
- After 15 minutes, process instances which were previously assigned to the stopped server should now be continued or executed by an active running server.
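The monitoring part of these steps can be sketched as follows. The example uses an in-memory SQLite database as a stand-in for the K2 Server database so that it runs anywhere. Only the `_Server` and `_ProcInst` table names and the `ServerID` field come from this article; the other columns, the sample rows, and the simulated reassignment are assumptions for illustration.

```python
import sqlite3

# In-memory SQLite stand-in for the K2 Server database. Only the table names
# and the ServerID field come from the article; other columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE _Server (ID INTEGER, Name TEXT)")
conn.execute("CREATE TABLE _ProcInst (ID INTEGER, ServerID INTEGER)")
conn.executemany("INSERT INTO _Server VALUES (?, ?)",
                 [(1, "nodeA"), (2, "nodeB")])
conn.executemany("INSERT INTO _ProcInst VALUES (?, ?)",
                 [(10, 1), (11, 2), (12, 2), (13, 0)])


def instances_per_server(connection):
    """Count process instances per ServerID (0 means any server may execute)."""
    rows = connection.execute(
        "SELECT ServerID, COUNT(*) FROM _ProcInst "
        "GROUP BY ServerID ORDER BY ServerID"
    )
    return dict(rows.fetchall())


before = instances_per_server(conn)
# Simulate failover: instances held by stopped server 2 are picked up by server 1.
conn.execute("UPDATE _ProcInst SET ServerID = 1 WHERE ServerID = 2")
after = instances_per_server(conn)
print(before, after)
```

Comparing the per-server counts before stopping a node and again after the timeout has passed shows whether the stopped server's instances have moved to a running server.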
No error is raised. There is no failover capability for the running process instances because the available servers do not detect that the server has gone down; they therefore do not reallocate the existing work until they are restarted.
This Hotfix is contained within the latest K2 Update. Install the update package to resolve the issue.