
The failure recovery subsystem of JBossTS ensures that the results of a transaction are applied consistently to all resources affected by the transaction, even if any of the application processes or the machine hosting them crash or lose network connectivity. In the case of a machine (system) crash or network failure, recovery will not take place until the system or network is restored, but the original application does not need to be restarted: recovery responsibility is delegated to the Recovery Manager process (see below). Recovery after failure requires that information about the transaction and the resources involved survives the failure and is accessible afterward: this information is held in the ActionStore, which is part of the ObjectStore. If the ObjectStore is destroyed or modified, recovery may not be possible.
Until the recovery procedures are complete, resources affected by a transaction that was in progress at the time of the failure may be inaccessible. For database resources, this may be reported as tables or rows held by "in-doubt transactions".

The Recovery Manager

The Recovery Manager is a daemon process responsible for performing crash recovery. Only one Recovery Manager runs per node. The Object Store provides persistent data storage for transactions to log data. During normal transaction processing, each transaction logs the persistent data needed for the commit phase to the Object Store. When a transaction commits successfully this data is removed; if the transaction fails, the data remains in the Object Store.
The Recovery Manager functions by:
  • Periodically scanning the Object Store for transactions that may have failed. Failed transactions are indicated by log data that is still present after the period within which the transaction would normally have been expected to finish.
  • Checking with the application process which originated the transaction whether the transaction is still in progress or not.
  • Recovering the transaction by re-activating the transaction and then replaying phase two of the commit protocol.
To start the Recovery Manager issue the following command:
java com.arjuna.ats.arjuna.recovery.RecoveryManager
If the -test flag is used with the Recovery Manager then it will display a "Ready" message when initialised, i.e.,
java com.arjuna.ats.arjuna.recovery.RecoveryManager -test
On initialization the Recovery Manager first loads in configuration information via a properties file. This configuration includes a number of recovery activators and recovery modules, which are then dynamically loaded.
Each recovery activator, which implements the com.arjuna.ats.arjuna.recovery.RecoveryActivator interface, is used to instantiate a recovery class related to the underlying communication protocol. Since version 3.0 of JBossTS, the Recovery Manager is not specifically tied to an Object Request Broker (ORB). To specify a recovery instance able to manage the OTS recovery protocol, the RecoveryActivator interface is provided to identify a specific transaction protocol. For instance, when used with OTS, the RecoveryActivator has the responsibility to create a RecoveryCoordinator object able to respond to the replay_completion operation.

All RecoveryActivator instances implement the same interface. They are loaded via the following property:
<property
  name="com.arjuna.ats.arjuna.recovery.recoveryActivator_<number>"
  value="RecoveryClass"/>

For instance, the RecoveryActivator provided in the JTS/OTS distribution, which must not be commented out, is as follows:

<property
  name="com.arjuna.ats.arjuna.recovery.recoveryActivator_1"
  value="com.arjuna.ats.internal.jts.orbspecific.recovery.RecoveryEnablement"/>
Each recovery module, which implements the com.arjuna.ats.arjuna.recovery.RecoveryModule interface, is used to recover a different type of transaction or resource; however, every recovery module inherits the same basic behaviour.
Recovery consists of two separate passes/phases separated by two timeout periods. The first pass examines the object store for potentially failed transactions; the second pass performs crash recovery on failed transactions. The timeout between the first and second pass is known as the backoff period. The timeout between the end of the second pass and the start of the first pass is the recovery period. The recovery period is larger than the backoff period.

The Recovery Manager invokes the first pass upon each recovery module, applies the backoff period timeout, invokes the second pass upon each recovery module and finally applies the recovery period timeout before restarting the first pass again.
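The cycle described above can be sketched in a few lines of Java. This is only an illustration of the ordering of passes and timeouts, not the JBossTS implementation: the TwoPassModule interface, the logging helper and the bounded cycles parameter are hypothetical names introduced here, and the sleeps use milliseconds so that the demo finishes quickly.

```java
import java.util.ArrayList;
import java.util.List;

public class TwoPassRecoveryLoop {

    /** Local stand-in for a recovery module; NOT the JBossTS RecoveryModule interface. */
    interface TwoPassModule {
        void firstPass();   // look for transactions that may have failed
        void secondPass();  // recover the ones that are still outstanding
    }

    /** Sleeps without forcing callers to handle InterruptedException. */
    static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
    }

    /** One cycle: first pass on every module, backoff, second pass on every module. */
    static void runCycle(List<TwoPassModule> modules, long backoffMillis) {
        for (TwoPassModule m : modules) {
            m.firstPass();
        }
        pause(backoffMillis);
        for (TwoPassModule m : modules) {
            m.secondPass();
        }
    }

    /** Runs a bounded number of cycles, waiting the recovery period after each one. */
    static void run(List<TwoPassModule> modules, long backoffMillis,
                    long recoveryPeriodMillis, int cycles) {
        for (int i = 0; i < cycles; i++) {
            runCycle(modules, backoffMillis);
            pause(recoveryPeriodMillis);
        }
    }

    /** Hypothetical module that records each call in a shared log. */
    static TwoPassModule logging(String name, List<String> log) {
        return new TwoPassModule() {
            public void firstPass()  { log.add(name + ":first"); }
            public void secondPass() { log.add(name + ":second"); }
        };
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        // Short millisecond values are used here purely to keep the demo fast.
        run(List.of(logging("A", log), logging("B", log)), 10, 20, 1);
        System.out.println(log);
    }
}
```

Note that every module completes its first pass before the backoff period starts, so the backoff is applied once per cycle rather than once per module.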

The recovery modules are loaded via the following recovery extension property:
com.arjuna.ats.arjuna.recovery.recoveryExtension<number>=<RecoveryClass>
The default RecoveryExtension settings are:
<property name="com.arjuna.ats.arjuna.recovery.recoveryExtension1"
  value="com.arjuna.ats.internal.arjuna.recovery.AtomicActionRecoveryModule"/>
<property name="com.arjuna.ats.arjuna.recovery.recoveryExtension2"
  value="com.arjuna.ats.internal.txoj.recovery.TORecoveryModule"/>
<property name="com.arjuna.ats.arjuna.recovery.recoveryExtension3"
  value="com.arjuna.ats.internal.jts.recovery.transactions.TopLevelTransactionRecoveryModule"/>
<property name="com.arjuna.ats.arjuna.recovery.recoveryExtension4"
  value="com.arjuna.ats.internal.jts.recovery.transactions.ServerTransactionRecoveryModule"/>
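As a rough illustration of how a set of recoveryExtension properties might be turned into an ordered list of module class names, the following Java sketch collects every property whose name starts with the recoveryExtension prefix and sorts the entries by property name. The class RecoveryExtensionOrder and its method are hypothetical, and plain lexicographic ordering of the names is an assumption made for this sketch; it is not taken from the JBossTS source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.TreeMap;

public class RecoveryExtensionOrder {

    static final String PREFIX = "com.arjuna.ats.arjuna.recovery.recoveryExtension";

    /**
     * Returns the configured class names ordered by property name.
     * Assumption: ordering is plain string comparison of the property names.
     */
    static List<String> orderedClassNames(Properties props) {
        TreeMap<String, String> sorted = new TreeMap<>(); // keeps keys in sorted order
        for (String name : props.stringPropertyNames()) {
            if (name.startsWith(PREFIX)) {
                sorted.put(name, props.getProperty(name).trim());
            }
        }
        return new ArrayList<>(sorted.values());
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(PREFIX + "2",
                "com.arjuna.ats.internal.txoj.recovery.TORecoveryModule");
        props.setProperty(PREFIX + "1",
                "com.arjuna.ats.internal.arjuna.recovery.AtomicActionRecoveryModule");
        // Unrelated properties are ignored because they do not match the prefix.
        props.setProperty("com.arjuna.ats.arjuna.recovery.recoveryBackoffPeriod", "10");

        for (String className : orderedClassNames(props)) {
            System.out.println(className);
        }
    }
}
```

Under string comparison a name ending in 10 sorts before one ending in 2, so if more than nine modules are configured it may be safer to check the resulting order than to rely on the numeric suffix alone.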

Configuring the Recovery Manager

Periodic Recovery
The backoff period and recovery period are set using the following properties:
com.arjuna.ats.arjuna.recovery.recoveryBackoffPeriod (default 10 secs)
com.arjuna.ats.arjuna.recovery.periodicRecoveryPeriod (default 120 secs)
Expired entry removal
The operation of the recovery subsystem will cause some entries to be made in the ObjectStore that will not be removed in normal progress. The RecoveryManager has a facility for scanning for these and removing items that are very old. Scans and removals are performed by implementations of the com.arjuna.ats.arjuna.recovery.ExpiryScanner interface. Implementations of this interface are loaded by giving the class name as the value of a property whose name begins with com.arjuna.ats.arjuna.recovery.expiryScanner.
The RecoveryManager calls the scan() method on each loaded ExpiryScanner implementation at an interval determined by the property com.arjuna.ats.arjuna.recovery.expiryScanInterval. This value is given in hours; the default is 12. A value of zero suppresses all expiry scanning. If the value supplied is positive, the first scan is performed when the RecoveryManager starts; if the value is negative, the first scan is delayed until after the first interval (using the absolute value).
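The sign convention for expiryScanInterval is easy to misread, so here is a minimal Java sketch of the rule as stated above. The ExpiryScanSchedule class and its field names are hypothetical and exist only for this illustration; the sketch shows how a configured value maps to an initial delay and a repeat interval, and does not reflect how the RecoveryManager is actually implemented.

```java
public class ExpiryScanSchedule {

    final boolean enabled;          // false when the configured value is zero
    final long initialDelayHours;   // wait before the first scan
    final long intervalHours;       // wait between subsequent scans

    private ExpiryScanSchedule(boolean enabled, long initialDelayHours, long intervalHours) {
        this.enabled = enabled;
        this.initialDelayHours = initialDelayHours;
        this.intervalHours = intervalHours;
    }

    /** Interprets a configured expiryScanInterval value (in hours). */
    static ExpiryScanSchedule fromInterval(int configuredHours) {
        if (configuredHours == 0) {
            return new ExpiryScanSchedule(false, 0, 0); // zero: no scanning at all
        }
        // Widen to long before abs() so Integer.MIN_VALUE is handled correctly.
        long interval = Math.abs((long) configuredHours);
        // Positive: scan at startup. Negative: wait one full interval first.
        long initialDelay = configuredHours > 0 ? 0 : interval;
        return new ExpiryScanSchedule(true, initialDelay, interval);
    }

    @Override
    public String toString() {
        return enabled
                ? "first scan after " + initialDelayHours + "h, then every " + intervalHours + "h"
                : "scanning disabled";
    }

    public static void main(String[] args) {
        for (int value : new int[] {12, -12, 0}) {
            System.out.println(value + " -> " + fromInterval(value));
        }
    }
}
```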

The default ExpiryScanner is:
<property
  name="com.arjuna.ats.arjuna.recovery.expiryScannerTransactionStatusManager"
  value="com.arjuna.ats.internal.arjuna.recovery.ExpiredTransactionStatusManagerScanner"/>
 

The following table summarizes the properties used by the Recovery Manager. By default, these properties are defined in the properties file named RecoveryManager-properties.xml.

| Name | Description | Possible Value | Default Value |
|---|---|---|---|
| com.arjuna.ats.arjuna.recovery.periodicRecoveryPeriod | Interval in seconds between initiating the periodic recovery modules. | Value in seconds | 120 |
| com.arjuna.ats.arjuna.recovery.recoveryBackoffPeriod | Interval in seconds between first and second pass of periodic recovery. | Value in seconds | 10 |
| com.arjuna.ats.arjuna.recovery.recoveryExtensionX | Indicates a periodic recovery module to use. X is the occurrence number of the recovery module among a set of recovery modules. These modules are invoked in sort-order of names. | The class name of the periodic recovery module | JBossTS provides a set of classes given in the RecoveryManager-properties.xml file |
| com.arjuna.ats.arjuna.recovery.recoveryActivator_X | Indicates a recovery activator to use. X is the occurrence number of the recovery activator among a set of recovery activators. | The class name of the periodic recovery activator | JBossTS provides one class that manages the recovery protocol specified by the OTS specification |
| com.arjuna.ats.arjuna.recovery.expiryScannerXXX | Expiry scanners to use (order of invocation is random). Names must begin with "com.arjuna.ats.arjuna.recovery.expiryScanner". | Class name | JBossTS provides one class given in the RecoveryManager-properties.xml file |
| com.arjuna.ats.arjuna.recovery.expiryScanInterval | Interval, in hours, between running the expiry scanners. This can be quite long. The absolute value determines the interval: if the value is negative, the scan will NOT be run until after one interval has elapsed. If positive, the first scan will be immediately after startup. Zero will prevent any scanning. | Value in hours | 12 |
| com.arjuna.ats.arjuna.recovery.transactionStatusManagerExpiryTime | Age, in hours, for removal of transaction status manager item. This should be longer than any ts-using process will remain running. Zero = never removed. | Value in hours | 12 |
| com.arjuna.ats.arjuna.recovery.transactionStatusManagerPort | Use this to fix the port on which the TransactionStatusManager listens. | Port number (short) | Use a free port |

Copyright 2002-2005 Arjuna Technologies. Copyright 2008 JBoss, a division of Red Hat. All Rights Reserved.