JBoss.orgCommunity Documentation

Chapter 8. HTTP Services

8.1. Configuring load balancing using Apache and mod_jk
8.1.1. Download the software
8.1.2. Configure Apache to load mod_jk
8.1.3. Configure worker nodes in mod_jk
8.1.4. Configuring JBoss to work with mod_jk
8.2. Configuring HTTP session state replication
8.2.1. Enabling session replication in your application
8.2.2. HttpSession Passivation and Activation
8.2.3. Configuring the JBoss Cache instance used for session state replication
8.3. Using FIELD level replication
8.4. Using Clustered Single Sign On
8.4.1. Configuration
8.4.2. SSO Behavior
8.4.3. Limitations
8.4.4. Configuring the Cookie Domain

HTTP session replication is used to replicate the state associated with web client sessions to other nodes in a cluster. Thus, in the event one of your nodes crashes, another node in the cluster will be able to recover. Two distinct functions must be performed:

State replication is directly handled by JBoss. When you run JBoss in the all configuration, session state replication is enabled by default. Just configure your web application as <distributable> in its web.xml (see Section 8.2, “Configuring HTTP session state replication”), deploy it, and its session state is automtically replicated across all JBoss instances in the cluster.

However, load-balancing is a different story; it is not handled by JBoss itself and requires an external load balancer. This function could be provided by specialized hardware switches or routers (Cisco LoadDirector for example) or by specialized software running on commodity hardware. As a very common scenario, we will demonstrate how to set up a software load balancer using Apache httpd and mod_jk.

Note

A load-balancer tracks HTTP requests and, depending on the session to which the request is linked, it dispatches the request to the appropriate node. This is called load-balancing with sticky-sessions or session affinity: once a session is created on a node, every future request will also be processed by that same node. Using a load-balancer that supports sticky-sessions but not configuring your web application for session replication allows you to scale very well by avoiding the cost of session state replication: each request for a session will always be handled by the same node. But in case a node dies, the state of all client sessions hosted by this node (the shopping carts, for example) will be lost and the clients will most probably need to login on another node and restart with a new session. In many situations, it is acceptable not to replicate HTTP sessions because all critical state is stored in a database or on the client. In other situations, losing a client session is not acceptable and, in this case, session state replication is the price one has to pay.

Apache is a well-known web server which can be extended by plugging in modules. One of these modules, mod_jk, has been specifically designed to allow the forwarding of requests from Apache to a Servlet container. Furthermore, it is also able to load-balance HTTP calls to a set of Servlet containers while maintaining sticky sessions, which is what is most interesting for us in this section.

Modify APACHE_HOME/conf/httpd.conf and add a single line at the end of the file:

# Include mod_jk's specific configuration file  
Include conf/mod-jk.conf  
            

Next, create a new file named APACHE_HOME/conf/mod-jk.conf:

# Load mod_jk module
# Specify the filename of the mod_jk lib
LoadModule jk_module modules/mod_jk.so
 
# Where to find workers.properties
JkWorkersFile conf/workers.properties

# Where to put jk logs
JkLogFile logs/mod_jk.log
 
# Set the jk log level [debug/error/info]
JkLogLevel info 
 
# Select the log format
JkLogStampFormat  "[%a %b %d %H:%M:%S %Y]"
 
# JkOptions indicates to send SSK KEY SIZE
JkOptions +ForwardKeySize +ForwardURICompat -ForwardDirectories
 
# JkRequestLogFormat
JkRequestLogFormat "%w %V %T"
               
# Mount your applications
JkMount /application/* loadbalancer
 
# You can use external file for mount points.
# It will be checked for updates each 60 seconds.
# The format of the file is: /url=worker
# /examples/*=loadbalancer
JkMountFile conf/uriworkermap.properties               

# Add shared memory.
# This directive is present with 1.2.10 and
# later versions of mod_jk, and is needed for
# for load balancing to work properly
JkShmFile logs/jk.shm 
              
# Add jkstatus for managing runtime data
<Location /jkstatus/>
    JkMount status
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
</Location>

Please note that two settings are very important:

In addition to the JkMount directive, you can also use the JkMountFile directive to specify a mount points configuration file, which contains multiple Tomcat forwarding URL mappings. You just need to create a uriworkermap.properties file in the APACHE_HOME/conf directory. The format of the file is /url=worker_name. To get things started, paste the following example into the file you created:

# Simple worker configuration file

# Mount the Servlet context to the ajp13 worker
/jmx-console=loadbalancer
/jmx-console/*=loadbalancer
/web-console=loadbalancer
/web-console/*=loadbalancer

This will configure mod_jk to forward requests to /jmx-console and /web-console to Tomcat.

You will most probably not change the other settings in mod_jk.conf. They are used to tell mod_jk where to put its logging file, which logging level to use and so on.

Next, you need to configure mod_jk workers file conf/workers.properties. This file specifies where the different Servlet containers are located and how calls should be load-balanced across them. The configuration file contains one section for each target servlet container and one global section. For a two nodes setup, the file could look like this:

# Define list of workers that will be used
# for mapping requests
worker.list=loadbalancer,status

# Define Node1
# modify the host as your host IP or DNS name.
worker.node1.port=8009
worker.node1.host=node1.mydomain.com 
worker.node1.type=ajp13
worker.node1.lbfactor=1
worker.node1.cachesize=10

# Define Node2
# modify the host as your host IP or DNS name.
worker.node2.port=8009
worker.node2.host= node2.mydomain.com
worker.node2.type=ajp13
worker.node2.lbfactor=1
worker.node2.cachesize=10

# Load-balancing behaviour
worker.loadbalancer.type=lb
worker.loadbalancer.balance_workers=node1,node2
worker.loadbalancer.sticky_session=1
#worker.list=loadbalancer

# Status worker for managing load balancer
worker.status.type=status

Basically, the above file configures mod_jk to perform weighted round-robin load balancing with sticky sessions between two servlet containers (i.e. JBoss AS instances) node1 and node2 listening on port 8009.

In the workers.properties file, each node is defined using the worker.XXX naming convention where XXX represents an arbitrary name you choose for each of the target Servlet containers. For each worker, you must specify the host name (or IP address) and the port number of the AJP13 connector running in the Servlet container.

The lbfactor attribute is the load-balancing factor for this specific worker. It is used to define the priority (or weight) a node should have over other nodes. The higher this number is for a given worker relative to the other workers, the more HTTP requests the worker will receive. This setting can be used to differentiate servers with different processing power.

The cachesize attribute defines the size of the thread pools associated to the Servlet container (i.e. the number of concurrent requests it will forward to the Servlet container). Make sure this number does not outnumber the number of threads configured on the AJP13 connector of the Servlet container. Please review http://jakarta.apache.org/tomcat/connectors-doc/config/workers.html for comments on cachesize for Apache 1.3.x.

The last part of the conf/workers.properties file defines the loadbalancer worker. The only thing you must change is the worker.loadbalancer.balanced_workers line: it must list all workers previously defined in the same file: load-balancing will happen over these workers.

The sticky_session property specifies the cluster behavior for HTTP sessions. If you specify worker.loadbalancer.sticky_session=0, each request will be load balanced between node1 and node2; i.e., different requests for the same session will go to different servers. But when a user opens a session on one server, it is always necessary to always forward this user's requests to the same server, as long as that server is available. This is called a "sticky session", as the client is always using the same server he reached on his first request. To enable session stickiness, you need to set worker.loadbalancer.sticky_session to 1.

The preceding discussion has been focused on using mod_jk as a load balancer. The content of the remainder our discussion of clustering HTTP services in JBoss AS applies no matter what load balancer is used.

In Section 8.1.3, “Configure worker nodes in mod_jk”, we covered how to use sticky sessions to make sure that a client in a session always hits the same server node in order to maintain the session state. However, sticky sessions by themselves are not an ideal solution. If a node goes down, all its session data is lost. A better and more reliable solution is to replicate session data across the nodes in the cluster. This way, if a server node fails or is shut down, the load balancer can fail over the next client request to any server node and obtain the same session state.

To enable replication of your web application sessions, you must tag the application as distributable in the web.xml descriptor. Here's an example:

<?xml version="1.0"?> 
<web-app  xmlns="http://java.sun.com/xml/ns/j2ee"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
          xsi:schemaLocation="http://java.sun.com/xml/ns/j2ee 
          http://java.sun.com/xml/ns/j2ee/web-app_2_4.xsd" 
          version="2.4">

    <distributable/>

</web-app>

You can futher configure session replication using the replication-config element in the jboss-web.xml file. However, the replication-config element only needs to be set if one or more of the default values described below is unacceptable. Here is an example:

<!DOCTYPE jboss-web PUBLIC
    -//JBoss//DTD Web Application 5.0//EN
    http://www.jboss.org/j2ee/dtd/jboss-web_5_0.dtd>

<jboss-web>
   
   <replication-config>
      <cache-name>custom-session-cache</cache-name>
      <replication-trigger>SET</replication-trigger>
      <replication-granularity>ATTRIBUTE</replication-granularity>
      <replication-field-batch-mode>true</replication-field-batch-mode>
      <use-jk>false</use-jk>
      <max-unreplicated-interval>30</max-unreplicated-interval>
      <snapshot-mode>instant</snapshot-mode>
      <snapshot-interval>1000</snapshot-interval>
      <session-notification-policy>com.example.CustomPolicy</session-notification-policy>
   </replication-config>

</jboss-web>

All of the above configuration elements are optional and can be ommitted if the default value is acceptable. A couple are commonly used; the rest are very infrequently changed from the defaults. We'll cover the commonly used ones first.

The replication-trigger element determines when the container should consider that session data must be replicated across the cluster. The rationale for this setting is that after a mutable object stored as a session attribute is accessed from the session, in the absence of a setAttribute call the container has no clear way to know if the object (and hence the session state) has been modified and needs to be replicated. This element has 3 valid values:

In all cases, calling setAttribute marks the session as needing replication.

The replication-granularity element determines the granularity of what gets replicated if the container determines session replication is needed. The supported values are:

The other elements under the replication-config element are much less frequently used.

The cacheName element indicates the name of the JBoss Cache configuration that should be used for storing distributable sessions and replicating them around the cluster. This element allows webapps that need different caching characteristics to specify the use of separate, differently configured, JBoss Cache instances. In AS 4 the cache to use was a server-wide configuration that could not be changed per webapp. The default value is standard-session-cache if the replication-granularity is not FIELD, field-granularity-session-cache if it is. See Section 8.2.3, “Configuring the JBoss Cache instance used for session state replication” for more details on JBoss Cache configuration for web tier clustering.

The replication-field-batch-mode element indicates whether you want all replication messages associated with a request to be batched into one message. Only applicable if replication-granularity is FIELD. If this is set to true, fine-grained changes made to objects stored in the session attribute map will replicate only when the http request is finished; otherwise they replicate as they occur. Setting this to false is not advised. Default is true.

The useJK element indicates whether the container should assume a JK-based software load balancer (e.g. mod_jk, mod_proxy, mod_cluster) is used for load balancing for this webapp. If set to true, the container will examine the session id associated with every request and replace the jvmRoute portion of the session id if it detects a failover.

The default value is null (i.e. unspecified), in which case the session manager will use the presence or absence of a jvmRoute configuration on its enclosing JBoss Web Engine (see Section 8.1.4, “Configuring JBoss to work with mod_jk”) as indicating whether JK is used.

The only real reason to set this element is to set it to false for a particular webapp whose URL's the JK load balancer doesn't handle. Even doing that isn't really necessary.

The max-unreplicated-interval element configures the maximum interval between requests, in seconds, after which a request will trigger replication of the session's timestamp regardless of whether the request has otherwise made the session dirty. Such replication ensures that other nodes in the cluster are aware of the most recent value for the session's timestamp and won't incorrectly expire an unreplicated session upon failover. It also results in correct values for HttpSession.getLastAccessedTime() calls following failover.

A value of 0 means the timestamp will be replicated whenever the session is accessed. A value of -1 means the timestamp will be replicated only if some other activity during the request (e.g. modifying an attribute) has resulted in other replication work involving the session. A positive value greater than the HttpSession.getMaxInactiveInterval() value will be treated as a likely misconfiguration and converted to 0; i.e. replicate the metadata on every request. Default value is 60.

The snapshot-mode element configures when sessions are replicated to the other nodes. Possible values are instant (the default) and interval.

The typical value, instant, replicates changes to the other nodes at the end of requests, using the request processing thread to perform the replication. In this case, the snapshot-interval property is ignored.

With interval mode, a background task is created that runs every snapshot-interval milliseconds, checking for modified sessions and replicating them.

Note that this property has no effect if replication-granularity is set to FIELD. If it is FIELD, instant mode will be used.

The snapshot-interval element defines how often (in milliseconds) the background task that replicates modified sessions should be started for this web app. Only meaningful if snapshot-mode is set to interval.

The session-notification-policy element specifies the fully qualified class name of the implementation of the ClusteredSessionNotificationPolicy interface that should be used to govern whether servlet specification notifications should be emitted to any registered HttpSessionListener, HttpSessionAttributeListener and/or HttpSessionBindingListener.

Event notifications that may make sense in a non-clustered environment may or may not make sense in a clustered environment; see https://jira.jboss.org/jira/browse/JBAS-5778 for an example of why a notification may not be desired. Configuring an appropriate ClusteredSessionNotificationPolicy gives the application author fine-grained control over what notifications are issued.

In AS 5.0.0.GA the default value if not explicitly set is the LegacyClusteredSessionNotificationPolicy, which implements the behavior in previous JBoss versions. In the AS 5.1.0 release this was changed to IgnoreUndeployLegacyClusteredSessionNotificationPolicy, which implements the same behavior except for in undeployment situations, during which no HttpSessionListener and HttpSessionAttributeListener notifications are sent.

Passivation is the process of controlling memory usage by removing relatively unused sessions from memory while storing them in persistent storage. If a passivated session is requested by a client, it can be "activated" back into memory and removed from the persistent store. JBoss AS 5 supports passivation of HttpSessions from webapps whose web.xml includes the distributable tag (i.e. clustered webapps).

Passivation occurs at 3 points during the lifecycle of a web application:

A session will be passivated if one of the following holds true:

In both cases, sessions are passivated on a Least Recently Used (LRU) basis.

Session passivation behavior is configured via the jboss-web.xml deployment descriptor in your webapp's WEB-INF directory.

<!DOCTYPE jboss-web PUBLIC
    -//JBoss//DTD Web Application 5.0//EN
    http://www.jboss.org/j2ee/dtd/jboss-web_5_0.dtd>

<jboss-web>
   
   <max-active-sessions>20</max-active-sessions>
   <passivation-config>
      <use-session-passivation>true</use-session-passivation>
      <passivation-min-idle-time>60</passivation-min-idle-time>
      <passivation-max-idle-time>600</passivation-max-idle-time>
   </passivation-config>


</jboss-web>

Note that the number of sessions in memory includes sessions replicated from other cluster nodes that are not being accessed on this node. Be sure to account for that when setting max-active-sessions. Note also that the number of sessions replicated from other nodes may differ greatly depending on whether buddy replication is enabled. In an 8 node cluster where each node is handling requests from 100 users, with total replication each node will have 800 sessions in memory. With buddy replication with the default numBuddies setting of 1, each node will have 200 sessions in memory.

The container for a distributable web application makes use of JBoss Cache to provide HTTP session replication services around the cluster. The container integrates with the CacheManager service to obtain a reference to a JBoss Cache instance (see Section 3.2.1, “The JBoss AS CacheManager Service”).

The name of the JBoss Cache configuration to use is controlled by the cacheName element in the application's jboss-web.xml (see Section 8.2.1, “Enabling session replication in your application”). In most cases, though, this does not need to be set as the default values of standard-session-cache and field-granularity-session-cache (for applications configured for FIELD granularity) are appropriate.

The JBoss Cache configurations in the CacheManager service expose are large number of options. See Chapter 11, JBoss Cache Configuration and Deployment and the JBoss Cache documentation for a more complete discussion. However, the standard-session-cache and field-granularity-session-cache configurations are already optimized for the web session replication use case, and most of the settings should not be altered. However, there are a few items that an JBoss AS administrator may wish to change:

  • cacheMode

    By default, REPL_ASYNC, meaning a web request thread sending a session replication message to the cluster does not wait for responses from other cluster nodes confirming they have received and processed the message. Alternative REPL_SYNC offers greater guarantees that the session state was received, but at a significant performance cost. See Section 11.1.2, “Cache Mode”.

  • enabled property in the buddyReplicationConfig section

    Set to true to enable buddy replication. See Section 11.1.8, “Buddy Replication”. Default is false.

  • numBuddies property in the buddyReplicationConfig section

    Set to a value greater than the default 1 to increase the number of backup nodes onto which sessions are replicated. Only relevant if buddy replication is enabled. See Section 11.1.8, “Buddy Replication”.

  • buddyPoolName property in the buddyReplicationConfig section

    A way to specify a preferred replication group when buddy replication is enabled. JBoss Cache tries to pick a buddy who shares the same pool name (falling back to other buddies if not available). Only relevant if buddy replication is enabled. See Section 11.1.8, “Buddy Replication”.

  • multiplexerStack

    Name of the JGroups protocol stack the cache should use. See Section 3.1.1, “The Channel Factory Service”.

  • clusterName

    Identifying name JGroups will use for this cache's channel. Only change this if you create a new cache configuration, in which case this property should have a different value from all other cache configurations.

If you wish to use a completely new JBoss Cache configuration rather than editing one of the existing ones, please see Section 11.2.1, “Deployment Via the CacheManager Service”.

FIELD-level replication only replicates modified data fields inside objects stored in the session. Its use could potentially drastically reduce the data traffic between clustered nodes, and hence improve the performance of the whole cluster. To use FIELD-level replication, you have to first prepare (i.e., bytecode enhance) your Java class to allow the session cache to detect when fields in cached objects have been changed and need to be replicated.

The first step in doing this is to identify the classes that need to be prepared. This is done via annotations. For example:

@org.jboss.cache.pojo.annotation.Replicable
public class Address 
{
...
}

If you annotate a class with @Replicable, then all of its subclasses will be automatically annotated as well. Similarly, you can annotate an interface with @Replicable and all of its implementing classes will be annotated. For example:

@org.jboss.cache.pojo.annotation.Replicable
public class Person 
{
...
}

public class Student extends Person
{
...
}

There is no need to annotate Student. POJO Cache will recognize it as @Replicable because it is a sub-class of Person.

JBoss AS 5 requires JDK 5 at runtime, but some users may still need to build their projects using JDK 1.4. In this case, annotating classes can be done via JDK 1.4 style annotations embedded in JavaDocs. For example:

/**
 * Represents a street address.
 *
 * @@org.jboss.cache.pojo.annotation.Replicable
 */
public class Address 
{
...
}

Once you have annotated your classes, you will need to perform a pre-processing step to bytecode enhance your classes for use by POJO Cache. You need to use the JBoss AOP pre-compiler annotationc and post-compiler aopc to process the above source code before and after they are compiled by the Java compiler. The annotationc step is only need if the JDK 1.4 style annotations are used; if JDK 5 annotations are used it is not necessary. Here is an example of how to invoke those commands from command line.

$ annotationc [classpath] [source files or directories]
$ javac -cp [classpath] [source files or directories]
$ aopc [classpath] [class files or directories]

Please see the JBoss AOP documentation for the usage of the pre- and post-compiler. The JBoss AOP project also provides easy to use ANT tasks to help integrate those steps into your application build process.

Finally, let's see an example on how to use FIELD-level replication on those data classes. First, we see some servlet code that reads some data from the request parameters, creates a couple of objects and stores them in the session:

Person husband = new Person(getHusbandName(request), getHusbandAge(request));
Person wife = new Person(getWifeName(request), getWifeAge(request));
Address addr = new Address();
addr.setPostalCode(getPostalCode(request));

husband.setAddress(addr);
wife.setAddress(addr); // husband and wife share the same address!

session.setAttribute("husband", husband); // that's it.
session.setAttribute("wife", wife); // that's it.

Later, a different servlet could update the family's postal code:

Person wife = (Person)session.getAttribute("wife");
// this will update and replicate the postal code
wife.getAddress().setPostalCode(getPostalCode(request)); 

Notice that in there is no need to call session.setAttribute() after you make changes to the data object, and all changes to the fields are automatically replicated across the cluster.

Besides plain objects, you can also use regular Java collections of those objects as session attributes. POJO Cache automatically figures out how to handle those collections and replicate field changes in their member objects.

JBoss supports clustered single sign-on, allowing a user to authenticate to one web application and to be recognized on all web applications that are deployed on the same virtual host, whether or not they are deployed on that same machine or on another node in the cluster. Authentication replication is handled by JBoss Cache. Clustered single sign-on support is a JBoss-specific extension of the non-clustered org.apache.catalina.authenticator.SingleSignOn valve that is a standard part of Tomcat and JBoss Web. Both the non-clustered and clustered versions allow users to sign on to any one of the web apps associated with a virtual host and have their identity recognized by all other web apps on the same virtual host. The clustered version brings the added benefits of enabling SSO failover and allowing a load balancer to direct requests for different webapps to different servers, while maintaining the SSO.

To enable clustered single sign-on, you must add the ClusteredSingleSignOn valve to the appropriate Host elements of the JBOSS_HOME/server/all/deploy/jbossweb.sar/server.xml file. The valve element is already included in the standard file; you just need to uncomment it. The valve configuration is shown here:

<Valve className="org.jboss.web.tomcat.service.sso.ClusteredSingleSignOn" />

The element supports the following attributes:

  • className is a required attribute to set the Java class name of the valve implementation to use. This must be set to org.jboss.web.tomcat.service.sso.ClusteredSingleSign.

  • cacheConfig is the name of the cache configuration (see Section 3.2.1, “The JBoss AS CacheManager Service”) to use for the clustered SSO cache. Default is clustered-sso.

  • treeCacheName is deprecated; use cacheConfig. Specifies a JMX ObjectName of the JBoss Cache MBean to use for the clustered SSO cache. If no cache can be located from the CacheManager service using the value of cacheConfig, an attempt to locate an mbean registered in JMX under this ObjectName will be made. Default value is jboss.cache:service=TomcatClusteringCache.

  • cookieDomain is used to set the host domain to be used for sso cookies. See Section 8.4.4, “Configuring the Cookie Domain” for more. Default is "/".

  • maxEmptyLife is the maximum number of seconds an SSO with no active sessions will be usable by a request. The clustered SSO valve tracks what cluster nodes are managing sessions related to an SSO. A positive value for this attribute allows proper handling of shutdown of a node that is the only one that had handled any of the sessions associated with an SSO. The shutdown invalidates the local copy of the sessions, eliminating all sessions from the SSO. If maxEmptyLife were zero, the SSO would terminate along with the local session copies. But, backup copies of the sessions (if they are from clustered webapps) are available on other cluster nodes. Allowing the SSO to live beyond the life of its managed sessions gives the user time to make another request which can fail over to a different cluster node, where it activates the the backup copy of the session. Default is 1800, i.e. 30 minutes.

  • processExpiresInterval is the minimum number of seconds between efforts by the valve to find and invalidate SSO's that have exceeded their 'maxEmptyLife'. Does not imply effort will be spent on such cleanup every 'processExpiresInterval', just that it won't occur more frequently than that. Default is 60.

  • requireReauthentication is a flag to determine whether each request needs to be reauthenticated to the security Realm. If "true", this Valve uses cached security credentials (username and password) to reauthenticate to the JBoss Web security Realm each request associated with an SSO session. If false, the valve can itself authenticate requests based on the presence of a valid SSO cookie, without rechecking with the Realm. Setting to true can allow web applications with different security-domain configurations to share an SSO. Default is false.