Hibernate.org Community Documentation

HIBERNATE - Relational Persistence for Idiomatic Java

Using JBoss Cache as a Hibernate Second Level Cache

3.5.0.alpha1

Legal Notice

Preface
1. Introduction
1.1. Overview
1.2. Requirements
1.2.1. Dependencies
1.2.2. JTA Transactional Support
1.3. Configuration Basics
2. Core Concepts
2.1. Types of Cached Data
2.1.1. Entities
2.1.2. Collections
2.1.3. Queries
2.1.4. Timestamps
2.2. Key JBoss Cache Behaviors
2.2.1. Replication vs. Invalidation vs. Local Mode
2.2.2. Synchronous vs. Asynchronous
2.2.3. Locking Scheme
2.2.4. Isolation Level
2.2.5. Initial State Transfer
2.2.6. Cache Eviction
2.2.7. Buddy Replication and Cache Loading
2.3. Matching JBC Behavior to Types of Data
2.3.1. The RegionFactory Interface
2.3.2. The CacheManager API
2.3.3. Sharable JGroups Resources
2.3.4. Bringing It All Together
3. Configuration
3.1. Configuring the Hibernate Session Factory
3.1.1. Basics
3.1.2. Specifying the RegionFactory Implementation
3.1.3. The SharedJBossCacheRegionFactory
3.1.4. The JndiSharedJBossCacheRegionFactory
3.1.5. The MultiplexedJBossCacheRegionFactory
3.1.6. The JndiMultiplexedJBossCacheRegionFactory
3.1.7. Legacy Configuration Properties
3.2. Configuring JBoss Cache
3.2.1. Configuring a Single Standalone Cache
3.2.2. Managing Multiple Caches via a CacheManager
3.2.3. JBoss Cache Configuration Details
3.2.4. Standard JBoss Cache Configurations
3.3. JGroups Configuration
3.3.1. Transport -- UDP vs. TCP
3.3.2. Standard JGroups Configurations
4. Cache Eviction
4.1. Overview
4.1.1. The Eviction Process
4.1.2. Eviction Regions
4.1.3. Eviction Policies
4.2. Organization of Data in the Cache
4.2.1. Region Prefix and Region Name
4.2.2. Entities
4.2.3. Collections
4.2.4. Queries
4.2.5. Timestamps
4.3. Example Configuration
4.4. Best Practices
5. Architecture
5.1. Hibernate Interface to the Caching Subsystem
5.2. Single JBoss Cache Instance Architecture
5.3. Multiple JBoss Cache Instance Architecture

This document is focused on the use of the JBoss Cache clustered transactional caching library as a tool for caching data in a Hibernate-based application.

Working with object-oriented software and a relational database can be cumbersome and time consuming in today's enterprise environments. Hibernate is an object/relational mapping tool for Java environments. The term object/relational mapping (ORM) refers to the technique of mapping a data representation from an object model to a relational data model with a SQL-based schema.

Hibernate not only takes care of the mapping from Java classes to database tables (and from Java data types to SQL data types), but also provides data query and retrieval facilities and can significantly reduce development time otherwise spent with manual data handling in SQL and JDBC.

In any application that works with a relational database, caching data retrieved from the database can potentially improve application performance. Hibernate provides a number of facilities for caching data on the database client side, i.e. in the Java process in which Hibernate is running. The primary facility is the Hibernate Session itself, which maintains a transaction-scoped cache of persistent data. If you wish to cache data beyond the scope of a transaction, it is possible to configure a cluster or JVM-level (technically a SessionFactory-level) cache on a class-by-class, collection-by-collection and query-by-query basis. This type of cache is referred to as a Second Level Cache.

Hibernate provides a pluggable architecture for implementing its Second Level Cache, allowing it to integrate with a number of third-party caching libraries. This document is focused on the use of the JBoss Cache clustered transactional caching library as an implementation of the Second Level Cache. It specifically focuses on JBoss Cache 3.

If you are new to Hibernate and Object/Relational Mapping or even Java, please follow these steps:

  1. Read the Hibernate Reference Documentation, particularly the Introduction and Architecture sections.

  2. Have a look at the eg/ directory in the Hibernate distribution; it contains a simple standalone application. Copy your JDBC driver to the lib/ directory and edit etc/hibernate.properties, specifying correct values for your database. From a command prompt in the distribution directory, type ant eg (using Ant), or under Windows, type build eg.

  3. Use the Hibernate Reference Documentation as your primary source of information. Consider reading Java Persistence with Hibernate (http://www.manning.com/bauer2) if you need more help with application design or if you prefer a step-by-step tutorial. Also visit http://caveatemptor.hibernate.org and download the example application for Java Persistence with Hibernate.

  4. FAQs are answered on the Hibernate website.

  5. Third party demos, examples, and tutorials are linked on the Hibernate website.

  6. The Community Area on the Hibernate website is a good resource for design patterns and various integration solutions (Tomcat, JBoss AS, Struts, EJB, etc.).

If you are new to the Hibernate Second Level Cache or to JBoss Cache, please follow these steps:

  1. Read the Hibernate Reference Documentation, particularly the Second Level Cache and Configuration sections.

  2. Read the JBoss Cache User Guide, Core Edition.

  3. Use this guide as your primary source of information on the usage of JBoss Cache 3 as a Hibernate Second Level Cache.

If you have questions, use the user forum linked on the Hibernate website. The user forum on the JBoss Cache website is also useful. We also provide a JIRA issue tracking system for bug reports and feature requests. If you are interested in the development of Hibernate, join the developer mailing list. If you are interested in translating this documentation into your language, contact us on the developer mailing list.

Commercial development support, production support, and training for Hibernate are available through Red Hat, Inc. (see http://www.hibernate.org/SupportTraining/). Hibernate is a Professional Open Source project and a critical component of the JBoss Enterprise Application Platform.

JBoss Cache is a tree-structured, clustered, transactional cache. It includes support for maintaining cache consistency across multiple cache instances running in a cluster. It integrates with JTA transaction managers, supporting transaction-scoped locking of cache elements and automatic rollback of cache changes upon transaction rollback. It supports both pessimistic and optimistic locking, with the tree-structure of the cache allowing maximum concurrency.

All of these features make JBoss Cache an excellent choice for use as a Hibernate Second Level Cache, particularly in a clustered environment. A Hibernate Session is a transaction-scoped cache of persistent data -- data accessed via the Session is cached in the Session for the duration of the current transaction, and then is cleared. A Second Level Cache is an optional cluster or JVM-level cache whose contents are maintained beyond the life of a transaction and whose contents can be shared across transactions. Use of a Second Level Cache is configured as part of the configuration of the Hibernate SessionFactory. If a Second Level Cache is enabled, caching of an instance of a particular entity class or of results of a particular query can be configured on a class-by-class, collection-by-collection and query-by-query basis. See the Hibernate Reference Documentation for more on Second Level Cache basics and how to configure entity classes, collections and queries for caching.

The JBoss Cache Second Level Cache integration supports the transactional and read only cache concurrency strategies discussed in the Hibernate Reference Documentation. It supports query caching and is, of course, cluster safe.

The key steps in using JBoss Cache as a second level cache are to:

See Chapter 3, Configuration for full details on configuration.



[1] JGroups is the group communication library used by JBoss Cache for intra-cluster communication.

This chapter focuses on some of the core concepts underlying how the JBoss Cache-based implementation of the Hibernate Second Level Cache works. There's a fair amount of detail, and you don't need to master all of it to use JBoss Cache with Hibernate. However, understanding the basic concepts here will help you make sense of the typical configurations discussed in the next chapter.

If you want to skip the details for now, feel free to jump ahead to Section 2.3.4, “Bringing It All Together”.

The Second Level Cache can cache four different types of data: entities, collections, query results and timestamps. Proper handling of each of the types requires slightly different caching semantics. A major improvement in Hibernate 3.3 was the addition of the org.hibernate.cache.RegionFactory SPI, which allows Hibernate to tell the caching integration layer what type of data is being cached. Based on that knowledge, the cache integration layer can apply the semantics appropriate to that type.

Entities are the most common type of data cached in the second level cache. Entity caching requires the following semantics in a clustered cache:

Hibernate supports caching of query results in the second level cache. The HQL statement that comprised the query is cached (including any parameter values) along with the primary keys of all entities that comprise the result set.

The semantics of query caching are significantly different from those of entity caching. A database row that reflects an entity's state can be locked, with cache updates applied with that lock in place. The semantics of entity caching take advantage of this fact to help ensure cache consistency across the cluster. There is no clear database analogue to a query result set that can be efficiently locked to ensure consistency in the cache. As a result, the fail-fast semantics used with the entity caching put operation are not available; instead query caching has semantics akin to an entity insert, including costly synchronous cluster updates and the JBoss Cache two phase commit protocol. Furthermore, Hibernate must aggressively invalidate query results from the cache any time any instance of one of the entity classes involved in the query's WHERE clause changes. All such query results are invalidated, even if the change made to the entity instance would not have affected the query result. It is not performant for Hibernate to try to determine if the entity change would have affected the query result, so the safe choice is to invalidate the query. See Section 2.1.4, “Timestamps” for more on query invalidation.

The effect of all this is that query caching is less likely to provide a performance boost than entity/collection caching. Use it with care and benchmark your application with it enabled and disabled. Be careful about replicating query results; caching them locally only on the node that executed the query will be more performant unless the query is quite expensive, is very likely to be repeated on other nodes, and is unlikely to be invalidated out of the cache.[2]

The JBoss Cache-based implementation of query caching adds a couple of interesting semantics, both designed to ensure that query cache operations don't block transactions from proceeding:

  • The insertion of a query result into the cache is very much like the insertion of a new entity. The difference is that it is possible for two transactions, possibly on different nodes, to try to insert the same query at the same time. (If this happened with entities, the database would throw an exception with a primary key violation before any caching work could start.) This could lead to long delays as the transactions compete for cache locks. To prevent such delays, the cache integration layer will set a very short (a few ms) lock timeout before attempting to cache a query result. If there is any sort of locking conflict, it will be detected quickly, and the attempt to cache the result will be quietly abandoned.

  • A read of a query result does not result in any long-lasting read lock in the cache. Thus, the fact that an uncommitted transaction had read a query result does not prevent concurrent transactions from subsequently invalidating that result and caching a new result set. However, an insertion of a query result into the cache will result in an exclusive write lock that lasts until the transaction that did the insert commits; this lock will prevent other transactions from reading the result. Since the point of query caching is to improve performance, blocking on a cache read for an extended period seems suboptimal. So, the cache integration code will set a very low lock acquisition timeout before attempting the read; if there is a lock conflict, the read will silently fail, resulting in a cache miss and a re-execution of the query against the database.
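The two behaviors above can be pictured with a small sketch. This is not the integration layer's code and it does not use JBoss Cache locks; it stands in a plain java.util.concurrent.Semaphore for the cache lock. The class name, the method names and the 15 ms timeout are all assumptions made for illustration (the text only says "a few ms").

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Illustrative only: query cache access that gives up quickly on a lock conflict. */
final class FailFastQueryCache {
    // Assumed value; stands in for the "very short" lock timeout.
    private static final long LOCK_TIMEOUT_MS = 15;

    // One permit plays the role of the exclusive lock on a cached query result.
    private final Semaphore lock = new Semaphore(1);
    private final Map<String, Object> results = new ConcurrentHashMap<>();

    private boolean acquireQuickly() {
        try {
            return lock.tryAcquire(LOCK_TIMEOUT_MS, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Tries to cache a result; returns false if the attempt was abandoned. */
    boolean tryPut(String queryKey, Object result) {
        if (!acquireQuickly()) {
            return false; // lock conflict: quietly skip caching
        }
        try {
            results.put(queryKey, result);
            return true;
        } finally {
            lock.release();
        }
    }

    /** Returns the cached result, or null (a cache miss) on conflict or absence. */
    Object tryGet(String queryKey) {
        if (!acquireQuickly()) {
            return null; // lock conflict: caller re-executes the query
        }
        try {
            return results.get(queryKey);
        } finally {
            lock.release();
        }
    }

    /** Simulates an uncommitted transaction holding the write lock. */
    void holdExclusiveLock() { lock.acquireUninterruptibly(); }

    /** Simulates that transaction committing. */
    void releaseExclusiveLock() { lock.release(); }

    public static void main(String[] args) {
        FailFastQueryCache cache = new FailFastQueryCache();
        cache.holdExclusiveLock();
        System.out.println("put while locked succeeded? " + cache.tryPut("q", "rows"));
        cache.releaseExclusiveLock();
        System.out.println("put after commit succeeded? " + cache.tryPut("q", "rows"));
    }
}
```

In both directions the caller loses at most the timeout: a failed put means the result is simply not cached, and a failed get is treated as an ordinary cache miss.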

Timestamp caching is an internal detail of query caching. As part of each query result, Hibernate stores the timestamp of when the query was executed. There is also a special area in the cache (the timestamps cache) where, for each entity class, the timestamp of the last update to any instance of that class is stored. When a query result is read from the cache, its timestamp is compared to the timestamps of all entities involved in the query. If any entity has a later timestamp, the cached result is discarded and a new query against the database is executed.
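As a rough illustration of the comparison just described, the sketch below keeps a map of last-update timestamps per entity class and checks a cached query result against it. It is a simplified model, not Hibernate's implementation; the class and method names are hypothetical, and entity classes are identified by plain strings.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative only: timestamp-based validation of cached query results. */
final class TimestampCheckSketch {
    // Stands in for the timestamps cache: entity class name -> time of last update.
    private final Map<String, Long> lastUpdate = new ConcurrentHashMap<>();

    /** Records that some instance of the given entity class was updated. */
    void recordUpdate(String entityName, long timestamp) {
        lastUpdate.merge(entityName, timestamp, Math::max);
    }

    /** A cached result is usable only if no involved entity class has a later timestamp. */
    boolean isUpToDate(long resultTimestamp, Set<String> entityNames) {
        for (String name : entityNames) {
            Long updated = lastUpdate.get(name);
            if (updated != null && updated > resultTimestamp) {
                return false; // discard the cached result and re-run the query
            }
        }
        return true;
    }

    public static void main(String[] args) {
        TimestampCheckSketch timestamps = new TimestampCheckSketch();
        Set<String> involved = new HashSet<>(Arrays.asList("Customer", "Order"));
        timestamps.recordUpdate("Customer", 100L);
        System.out.println("usable before update: " + timestamps.isUpToDate(150L, involved));
        timestamps.recordUpdate("Order", 200L);
        System.out.println("usable after update: " + timestamps.isUpToDate(150L, involved));
    }
}
```

Note that the check only knows which entity classes changed, not which rows, which is why a single update can invalidate every cached query that touches that class.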

The semantics of the timestamp cache are quite different from those of the entity, collection and query caches.

JBoss Cache is a very flexible tool and includes a great number of configuration options. See the JBoss Cache User Guide for an in depth discussion of these options. Here we focus on the main concepts that are most important to the Second Level Cache use case. This discussion will focus on concepts; see Section 3.2, “Configuring JBoss Cache” for details on the actual configurations involved.

JBoss Cache provides three different choices for how a node in the cluster should interact with the rest of the cluster when its local state is updated:

If the MVCC or PESSIMISTIC node locking schemes are used, JBoss Cache supports different isolation level configurations that specify how different transactions coordinate the locking of nodes in the cache: READ_COMMITTED and REPEATABLE_READ. These are somewhat analogous to database isolation levels; see the JBoss Cache User Guide for an in depth discussion of these options. In both cases, cache reads do not block for other reads. In both cases a transaction that writes to a node in the cache tree will hold an exclusive lock on that node until the transaction commits, causing other transactions that wish to read the node to block. In the REPEATABLE_READ case, the read lock held by an uncommitted transaction that has read a node will cause another transaction wishing to write to that node to block until the read transaction commits. This ensures the reader transaction can read the node again and get the same results, i.e. have a repeatable read.

READ_COMMITTED allows the greatest concurrency, since reads don't block each other and also don't block a write.

If the deprecated OPTIMISTIC node locking scheme is used, any isolation level configuration is ignored by the cache. Optimistic locking provides a repeatable read semantic but does not cause writes to block for reads.

In most cases, a REPEATABLE_READ setting on the cache is not needed, even if the application wants repeatable read semantics. This is because the Second Level Cache is just that -- a secondary cache. The primary cache for an entity or collection is the Hibernate Session object itself. Once an entity or collection is read from the second level cache, it is cached in the Session for the life of the transaction. Subsequent reads of that entity/collection will be resolved from the Session cache itself -- there will be no repeated read of the Second Level Cache by that transaction. So, there is no benefit to a REPEATABLE_READ configuration in the Second Level Cache.

The only exception to this is if the application uses Session's evict() or clear() methods to remove data from the Session cache and during the course of the same transaction wants to read that same data again with a repeatable read semantic.
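The reasoning above can be made concrete with a toy model of the two cache levels. This is a sketch under simplifying assumptions, not Hibernate code: both levels are plain maps, and the class and method names are hypothetical. It shows that a transaction reads the second level at most once per item unless the item is evicted from the session.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: a session-level cache in front of a shared second level cache. */
final class TwoLevelLookupSketch {
    private final Map<String, Object> secondLevel;               // shared across transactions
    private final Map<String, Object> session = new HashMap<>(); // lives for one transaction
    private int secondLevelReads = 0;

    TwoLevelLookupSketch(Map<String, Object> secondLevel) {
        this.secondLevel = secondLevel;
    }

    /** Resolves from the session first; falls back to the second level only on a miss. */
    Object get(String key) {
        if (session.containsKey(key)) {
            return session.get(key);
        }
        secondLevelReads++;
        Object value = secondLevel.get(key);
        if (value != null) {
            session.put(key, value);
        }
        return value;
    }

    /** Mirrors the effect of removing an item from the session cache. */
    void evict(String key) {
        session.remove(key);
    }

    int secondLevelReads() {
        return secondLevelReads;
    }

    public static void main(String[] args) {
        Map<String, Object> shared = new HashMap<>();
        shared.put("Customer#1", "v1");
        TwoLevelLookupSketch tx = new TwoLevelLookupSketch(shared);
        tx.get("Customer#1");
        shared.put("Customer#1", "v2"); // another transaction changes the shared cache
        System.out.println("second read sees: " + tx.get("Customer#1"));
        System.out.println("second level reads: " + tx.secondLevelReads());
    }
}
```

Only after evict() does the transaction go back to the second level, which is the one case where the cache's isolation level would matter.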

Note that for query and timestamp caches, the behavior of the Hibernate/JBC integration will not allow repeatable read semantics even if JBC is configured for REPEATABLE_READ. A cache read will not result in a read lock in the cache being held for the life of the transaction. So, for these caches there is no benefit to a REPEATABLE_READ configuration.

The preceding discussion has gone into a lot of detail about what Hibernate wants to accomplish as it caches data, and what JBoss Cache configuration options are available. What should be clear is that the configurations that are best for caching one type of data are not the best (and are sometimes completely incorrect) for other types. Entities likely work best with synchronous invalidation; timestamps require replication; query caching might do best in local mode.

Prior to Hibernate 3.3 and JBoss Cache 2.1, the conflicting requirements between the different cache types led to a real dilemma, particularly if query caching was enabled. This conflict arose because all four cache types needed to share a single underlying cache, with a single configuration. If query caching was enabled, the requirements of the timestamps cache basically forced use of synchronous replication, which is the worst performing choice for the more critical entity cache and is often inappropriate for the query cache.

With Hibernate 3.3 and JBoss Cache 2.1 it has become possible, even easy, to use separate underlying JBoss Cache instances for the different cache types. As a result, the entity cache can be optimally configured for entities while the necessary configuration for the timestamps cache is maintained.

There were three key changes that make this improvement possible:

JGroups is the group communication library JBoss Cache uses to send messages around a cluster. Each cache has a JGroups Channel; different channels around the cluster that have the same name and compatible configurations detect each other and form a group for message transmission.

A Channel is a fairly heavy object, typically using a good number of threads, several sockets and some good sized network I/O buffers. Creating multiple different channels in the same VM was therefore costly, and was an administrative burden as well, since each channel would need separate configuration to use different network addresses or ports. Architecturally, this militated against having multiple JBoss Cache instances in an application, since each would need its own Channel.

Added in JGroups 2.5 and much improved in the JGroups 2.6 series is the concept of sharable JGroups resources. Basically, the heavyweight JGroups elements can be shared. An application (e.g. the Hibernate/JBoss Cache integration layer) uses a JGroups ChannelFactory. The ChannelFactory is provided with a set of named channel configurations. When a Channel is needed (e.g. by a JBoss Cache instance), the application asks the ChannelFactory for the channel by name. If different callers ask for a channel with the same name, the ChannelFactory ensures that they get channels that share resources.
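The idea of handing out channels by name while sharing the heavyweight parts can be sketched as follows. This is not the JGroups API: the classes Transport and Channel and the method createChannel below are hypothetical stand-ins that only model the sharing behavior described above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative only: callers asking for the same named configuration share resources. */
final class SharedResourceFactorySketch {

    /** Stands in for the heavyweight elements (threads, sockets, I/O buffers). */
    static final class Transport {
        final String configName;

        Transport(String configName) {
            this.configName = configName;
        }
    }

    /** Lightweight per-caller handle that delegates to a shared transport. */
    static final class Channel {
        final String owner;
        final Transport transport;

        Channel(String owner, Transport transport) {
            this.owner = owner;
            this.transport = transport;
        }
    }

    private final Map<String, Transport> transports = new ConcurrentHashMap<>();

    /** Creates a channel for the caller, reusing the transport for that configuration name. */
    Channel createChannel(String configName, String owner) {
        Transport shared = transports.computeIfAbsent(configName, Transport::new);
        return new Channel(owner, shared);
    }

    int transportCount() {
        return transports.size();
    }

    public static void main(String[] args) {
        SharedResourceFactorySketch factory = new SharedResourceFactorySketch();
        Channel entities = factory.createChannel("udp", "entity-cache");
        Channel timestamps = factory.createChannel("udp", "timestamps-cache");
        System.out.println("shared transport: " + (entities.transport == timestamps.transport));
        System.out.println("transports created: " + factory.transportCount());
    }
}
```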

The effect of all this is that if a user wants to use four separate JBoss Cache instances, one for entity caching, one for collection caching, one for query caching and one for timestamp caching, those four caches can all share the same underlying JGroups resources.

The task of a Hibernate Second Level Cache user is to:

See Section 3.3, “JGroups Configuration” for more on JGroups.

So, we've seen that Hibernate caches up to four different types of data (entities, collections, queries and timestamps) and that Hibernate + JBoss Cache gives you the flexibility to use a separate underlying JBoss Cache, with different behavior, for each type. You can actually deploy four separate caches, one for each type.

In practice, four separate caches are unnecessary. For example, entity and collection caching have similar enough semantics that there is no reason not to share a JBoss Cache instance between them. The queries can usually use the same cache as well. Similarly, queries and timestamps can share a JBoss Cache instance configured for replication, with the hibernate.cache.jbc.query.localonly=true configuration letting you turn off replication for the queries if you want to.

Here's a decision tree you can follow:

  1. Decide if you want to enable query caching.

  2. Decide if you want to use invalidation or replication for your entities and collections. Invalidation is generally recommended for entities and collections.

  3. If you are using query caching, from the above decision tree you've either got your timestamps sharing a cache with other data types, or they are by themselves. Either way, the cache being used for timestamps must have initial state transfer enabled. Now, if the timestamps are sharing a cache with entities, collections or queries, decide whether you want initial state transfer for that other data. See Section 2.2.5, “Initial State Transfer” for the implications of this. If you don't want initial state transfer for the other data, you'll need to have a separate cache for the timestamps.

  4. Finally, if your queries are sharing a cache configured for replication, decide if you want the cached query results to replicate. (The timestamps cache must replicate.) If not, you'll want to set the hibernate.cache.jbc.query.localonly=true option when you configure your SessionFactory.

Once you've made these decisions, you know whether you need just one underlying JBoss Cache instance, or more than one. Next we'll see how to actually configure the setup you've selected.
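To make the outcome of these decisions concrete, here is a deliberately simplified sketch that turns three of the choices into a count of JBoss Cache instances. It is an interpretation of the rules stated in this chapter (timestamps need replication and initial state transfer; everything else can share), not logic taken from Hibernate. The class, method and parameter names are hypothetical, and it ignores finer points such as where the query results live.

```java
/** Illustrative only: how many underlying caches the decisions above lead to. */
final class CacheLayoutSketch {

    /**
     * @param queryCaching              whether query caching is enabled
     * @param entitiesUseInvalidation   true for invalidation, false for replication
     * @param stateTransferForEntities  whether initial state transfer is acceptable
     *                                  for entity and collection data
     */
    static int cacheInstancesNeeded(boolean queryCaching,
                                    boolean entitiesUseInvalidation,
                                    boolean stateTransferForEntities) {
        if (!queryCaching) {
            return 1; // no timestamps cache is needed without query caching
        }
        // Timestamps must replicate and must use initial state transfer, so they can
        // only share a cache that is configured the same way.
        boolean timestampsCanShare = !entitiesUseInvalidation && stateTransferForEntities;
        return timestampsCanShare ? 1 : 2;
    }

    public static void main(String[] args) {
        System.out.println("invalidation + query caching: "
                + cacheInstancesNeeded(true, true, false) + " cache instance(s)");
        System.out.println("no query caching: "
                + cacheInstancesNeeded(false, true, false) + " cache instance(s)");
    }
}
```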



[2] See the discussion of the hibernate.cache.jbc.query.localonly property in Section 3.1, “Configuring the Hibernate Session Factory” for more on how to only cache query results locally.

There are three main areas of configuration involved in using JBoss Cache 3 for your Hibernate Second Level Cache: configuring the Hibernate SessionFactory, configuring the underlying JBoss Cache instance(s), and configuring the JGroups ChannelFactory. If you use the standard JBoss Cache and JGroups configurations that ship with the hibernate-jbosscache.jar, then all you need to worry about is the SessionFactory configuration.

There are five basic steps to configuring the SessionFactory:

  • Make sure your Hibernate is configured to use JTA transactions. See Section 1.2.2, “JTA Transactional Support” for details.

  • Tell Hibernate you want to enable caching of entities and collections. No need to set this property if you don't:

    hibernate.cache.use_second_level_cache=true
  • Tell Hibernate you want to enable caching of query results. No need to set this property if you don't:

    hibernate.cache.use_query_cache=true
  • If you have enabled caching of query results, tell Hibernate if you want to suppress costly replication of those results around the cluster. No need to set this property if you want query results replicated:

    hibernate.cache.jbc.query.localonly=true
  • Finally, you need to tell Hibernate what RegionFactory implementation to use to manage your caches. You do this by setting the hibernate.cache.region.factory_class configuration option.

    hibernate.cache.region.factory_class=
       org.hibernate.cache.jbc.MultiplexedJBossCacheRegionFactory

    To determine the correct factory class, you must decide whether you need just one underlying JBoss Cache instance to support the different types of caching you will be doing, or whether you need more than one. See Chapter 2, Core Concepts and particularly Section 2.3.4, “Bringing It All Together” for more on how to make that decision. Once you know the answer, see Section 3.1.2, “Specifying the RegionFactory Implementation” to find the factory class that best meets your needs.

    Once you've specified your factory class, there may be other factory-class-specific configurations you may want to set. The available options are explained below.
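If you assemble the configuration in code rather than in a properties file, the cache-related steps above might look like the following sketch. It uses only java.util.Properties and the property names shown in this section; the class and method names are hypothetical, and the JTA setup from the first step is deliberately left out. The resulting Properties object could then be handed to Hibernate, for example through the Configuration class's addProperties method.

```java
import java.util.Properties;

/** Illustrative only: collects the cache-related SessionFactory settings. */
final class SecondLevelCacheSettings {

    static Properties build(boolean queryCache, boolean queryLocalOnly) {
        Properties props = new Properties();
        // Enable caching of entities and collections.
        props.setProperty("hibernate.cache.use_second_level_cache", "true");
        if (queryCache) {
            // Enable caching of query results.
            props.setProperty("hibernate.cache.use_query_cache", "true");
            if (queryLocalOnly) {
                // Suppress replication of query results around the cluster.
                props.setProperty("hibernate.cache.jbc.query.localonly", "true");
            }
        }
        // Choose the RegionFactory implementation.
        props.setProperty("hibernate.cache.region.factory_class",
                "org.hibernate.cache.jbc.MultiplexedJBossCacheRegionFactory");
        return props;
    }

    public static void main(String[] args) {
        Properties props = build(true, true);
        for (String name : props.stringPropertyNames()) {
            System.out.println(name + "=" + props.getProperty(name));
        }
    }
}
```

The local-only flag is only set when query caching is on, since it has no meaning otherwise.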

The MultiplexedJBossCacheRegionFactory supports a number of additional configuration options:

Many of the default values name JBoss Cache configurations in the standard jbc-configs.xml file found in the hibernate-jbosscache.jar. See Section 3.2.4, “Standard JBoss Cache Configurations” for details on those configurations. If you want to set hibernate.cache.jbc.configs and use your own JBoss Cache configuration file, you can still take advantage of these default names; just name the configurations in your file to match.

This all looks a bit complex, so let's show what happens if you just configure the defaults, with query caching enabled:

hibernate.cache.use_second_level_cache=true
hibernate.cache.use_query_cache=true
hibernate.cache.region.factory_class=
   org.hibernate.cache.jbc.MultiplexedJBossCacheRegionFactory

You would end up using two JBoss Cache instances:

  • One, for the entities, collections and queries, would use the optimistic-entity configuration. This cache would use optimistic locking, synchronous invalidation and would disable initial state transfer.

  • The second, for timestamps, would use the timestamps-cache configuration. This cache would use pessimistic locking, asynchronous replication and would enable initial state transfer.

See Section 3.2.4, “Standard JBoss Cache Configurations” for more on these standard cache configurations.

If you hadn't set hibernate.cache.use_query_cache=true you'd just have the single optimistic-entity cache, shared by the entities and collections.

JBoss Cache provides a great many different configuration options; here we are just going to look at a few that are most relevant to the Second Level Cache use case. Please see the JBoss Cache User Guide for full details.

Let's look at how to specify a few of the JBoss Cache configuration options that are most relevant to the Hibernate use case.

The JBoss Cache CacheMode attribute encapsulates whether the cache uses replication, invalidation or local mode, as well as whether messages should be synchronous or asynchronous. See Section 2.2.1, “Replication vs. Invalidation vs. Local Mode” and Section 2.2.2, “Synchronous vs. Asynchronous” for a discussion of these concepts.

The CacheMode is configured as follows:

<!-- Legal modes are LOCAL
                     REPL_ASYNC
                     REPL_SYNC
                     INVALIDATION_ASYNC
                     INVALIDATION_SYNC
-->
<attribute name="CacheMode">INVALIDATION_SYNC</attribute>
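Each CacheMode value therefore answers two questions at once. The sketch below pulls the two answers apart using only the five mode names listed above; it works on plain strings and is not the JBoss Cache API. The class, enum and method names are hypothetical.

```java
/** Illustrative only: splits a CacheMode name into its two underlying choices. */
final class CacheModeSketch {

    enum Kind { LOCAL, REPLICATION, INVALIDATION }

    /** How the node interacts with the rest of the cluster. */
    static Kind kindOf(String cacheMode) {
        if (cacheMode.equals("LOCAL")) {
            return Kind.LOCAL;
        }
        if (cacheMode.equals("REPL_SYNC") || cacheMode.equals("REPL_ASYNC")) {
            return Kind.REPLICATION;
        }
        if (cacheMode.equals("INVALIDATION_SYNC") || cacheMode.equals("INVALIDATION_ASYNC")) {
            return Kind.INVALIDATION;
        }
        throw new IllegalArgumentException("Unknown CacheMode: " + cacheMode);
    }

    /** Whether cluster messages are synchronous; LOCAL sends no messages at all. */
    static boolean isSynchronous(String cacheMode) {
        kindOf(cacheMode); // rejects unknown names
        return cacheMode.endsWith("_SYNC");
    }

    public static void main(String[] args) {
        String mode = "INVALIDATION_SYNC";
        System.out.println(mode + " -> " + kindOf(mode) + ", synchronous=" + isSynchronous(mode));
    }
}
```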

The JBoss Cache NodeLockingScheme attribute configures whether optimistic locking or pessimistic locking should be used. See Section 2.2.3, “Locking Scheme” for a discussion of locking.

The NodeLockingScheme is configured as follows:

<!-- Node locking scheme:
        MVCC
        PESSIMISTIC (default)
-->
<attribute name="NodeLockingScheme">MVCC</attribute>

The JBoss Cache IsolationLevel attribute configures whether READ_COMMITTED or REPEATABLE_READ semantics are needed if pessimistic locking is used. It has no effect if optimistic locking is used. See Section 2.2.4, “Isolation Level” for a discussion of isolation levels.

The IsolationLevel is configured as follows:

<!-- Isolation level:
        READ_COMMITTED
        REPEATABLE_READ
-->
<attribute name="IsolationLevel">READ_COMMITTED</attribute>

See Section 2.2.5, “Initial State Transfer” for a discussion of the concept of initial state transfer.

Initial State Transfer is configured as follows:

<!-- Whether or not to fetch state on joining a cluster. -->
<attribute name="FetchInMemoryState">false</attribute>

Hibernate ships with a number of standard JBoss Cache configurations in the hibernate-jbosscache.jar's jbc-configs.xml file. The following table highlights the key features of each configuration.


A few notes on the above table:

These standard configurations are a good choice for many applications. The primary reason users may want to use their own configurations is to support more complex eviction setups. See Chapter 4, Cache Eviction for more on the kinds of things you can do with eviction.

JGroups configuration is a complex area that goes well beyond the scope of this document. Users interested in exploring the details are encouraged to visit the JGroups website at http://www.jgroups.org as well as the JGroups wiki page at jboss.com.

The jgroups-stacks.xml file found in the org.hibernate.cache.jbc.builder package in the hibernate-jbosscache.jar provides a good set of standard JGroups configurations; these should be suitable for most needs. If you need to create your own configuration set, we recommend that you start with this file as a base.

The JGroups transport refers to the mechanism JGroups uses for sending messages to the group members. Choosing which transport to use is the main JGroups-related decision users will need to make. There are three transport types:

Eviction refers to the process by which old, relatively unused, or excessively voluminous data can be dropped from the cache, allowing the cache to remain within a memory budget. Generally, applications that use the Second Level Cache should configure eviction, unless only a relatively small amount of reference data is cached. This chapter provides a brief overview of how JBoss Cache eviction works, and then explains how to configure eviction to effectively manage the data stored in a Hibernate Second Level Cache. A basic understanding of JBoss Cache eviction and of concepts like FQNs is assumed; see the JBoss Cache User Guide for more information.

The JBoss Cache eviction process is fairly straightforward. Whenever a node in a cache is read or written to, added or removed, the cache finds the eviction region (see below) that contains the node and passes an eviction event object to the eviction policy (see below) associated with the region. The eviction policy uses the stream of events it receives to track activity in the region. Periodically, a background thread runs and contacts each region's eviction policy. The policy uses its knowledge of the activity in the region, along with any configuration it was provided at startup, to determine which if any cache nodes should be evicted from memory. It then tells the cache to evict those nodes. Evicting a node means dropping it from the cache's in-memory state. The eviction only occurs on that cache instance; there is no cluster-wide eviction.
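The event-then-sweep cycle described above can be modeled in a few lines. The following is a toy least-recently-used region, not one of the JBoss Cache eviction policy classes; the class and method names are hypothetical, nodes are identified by FQN strings, and a maxNodes of 0 is treated as "no limit", which is an assumption made for this sketch.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;

/** Illustrative only: tracks node activity and picks least recently used nodes to evict. */
final class LruRegionSketch {
    private final int maxNodes;
    // Access-ordered map: iteration runs from least to most recently used.
    private final LinkedHashMap<String, Boolean> activity =
            new LinkedHashMap<>(16, 0.75f, true);

    LruRegionSketch(int maxNodes) {
        this.maxNodes = maxNodes;
    }

    /** Eviction event: a node in this region was read, written or added. */
    void onNodeVisited(String fqn) {
        activity.put(fqn, Boolean.TRUE);
    }

    /** Eviction event: a node in this region was removed. */
    void onNodeRemoved(String fqn) {
        activity.remove(fqn);
    }

    /** The periodic sweep: returns the nodes to drop from this instance's memory. */
    List<String> process() {
        List<String> evicted = new ArrayList<>();
        if (maxNodes <= 0) {
            return evicted; // assumed meaning: no size limit
        }
        Iterator<String> it = activity.keySet().iterator();
        while (activity.size() > maxNodes && it.hasNext()) {
            evicted.add(it.next());
            it.remove();
        }
        return evicted;
    }

    public static void main(String[] args) {
        LruRegionSketch region = new LruRegionSketch(2);
        region.onNodeVisited("/appA/Customer/1");
        region.onNodeVisited("/appA/Customer/2");
        region.onNodeVisited("/appA/Customer/3");
        region.onNodeVisited("/appA/Customer/1"); // touched again, now most recent
        System.out.println("evicted: " + region.process());
    }
}
```

Because each cache instance feeds its own events into its own policy, two peers running this logic with different access patterns would evict different nodes.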

An important point to understand is that eviction proceeds independently on each peer in the cluster, with what gets evicted depending on the activity on that peer. There is no "global eviction" where JBoss Cache removes a piece of data in every peer in the cluster in order to keep memory usage inside a budget. The Hibernate/JBC integration layer may remove some data globally, but that isn't done for the kind of memory management reasons we're discussing in this chapter.

An effect of this is that even if a cache is configured for replication, if eviction is enabled the contents of a cache will be different between peers in the cluster; some may have evicted some data, while others will have evicted different data. What gets evicted is driven by what data is accessed by users on each peer.

Controlling when data is evicted from the cache is a matter of setting up appropriate eviction regions and configuring appropriate eviction policies for each region.

An Eviction Policy is a class that knows how to handle eviction events to track the activity in its region. It may have a specialized set of configuration properties that give it rules for when a particular node in the region should be evicted. It can then use that configuration and its knowledge of activity in the region to determine what nodes to evict.

JBoss Cache ships with a number of eviction policies. See the JBoss Cache User Guide for a discussion of all of them. Here we are going to focus on just two.

In order to understand how to configure eviction, you need to understand how Hibernate organizes data in the cache.

So far we've been looking at things in the abstract; let's see an example of how this comes together. In this example, imagine we have a Hibernate application with the following characteristics.

Let's see a possible eviction configuration for this scenario:

<attribute name="EvictionPolicyConfig">
  <config>
         
    <attribute name="wakeUpIntervalSeconds">5</attribute>
    <attribute name="policyClass">org.jboss.cache.eviction.LRUPolicy</attribute>
          
    <!--  
      Default region to pick up anything we miss in the more
      specific regions below.
    -->
    <region name="/_default_">
       <attribute name="maxNodes">500</attribute>
       <attribute name="timeToLiveSeconds">300</attribute>
       <attribute name="minTimeToLiveSeconds">120</attribute>
    </region>
            
    <!--  Don't ever evict modification timestamps -->
    <region name="/TS" 
       policyClass="org.jboss.cache.eviction.NullEvictionPolicy"/>
            
    <!-- Reference data -->
    <region name="/appA/reference">
       <!-- Keep all reference data if it's being used -->
       <attribute name="maxNodes">0</attribute>
       <!-- Keep it around a long time (4 hours) -->
       <attribute name="timeToLiveSeconds">14400</attribute>
       <attribute name="minTimeToLiveSeconds">120</attribute>
    </region>
            
    <!-- Be more aggressive about queries on reference data -->
    <region name="/appA/reference/QUERY">
       <attribute name="maxNodes">200</attribute>
       <attribute name="timeToLiveSeconds">1000</attribute>
       <attribute name="minTimeToLiveSeconds">120</attribute>
    </region>
            
    <!-- 
       Lots of entity instances from this package, but different
       users are unlikely to share them. So, we can cache
       a lot, but evict unused ones pretty quickly.
    -->
    <region name="/appA/org/example/hibernate">
       <attribute name="maxNodes">50000</attribute>
       <attribute name="timeToLiveSeconds">1200</attribute>
       <attribute name="minTimeToLiveSeconds">120</attribute>
    </region>
            
    <!-- Clean up misc queries very promptly -->
    <region name="/appA/org/hibernate/cache/StandardQueryCache">
       <attribute name="maxNodes">200</attribute>
       <attribute name="timeToLiveSeconds">240</attribute>
       <attribute name="minTimeToLiveSeconds">120</attribute>
    </region>
            
  </config>
</attribute>
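The configuration nests regions, for example /appA/reference/QUERY inside /appA/reference. The Java sketch below shows one way to decide which region a given node falls into. It assumes that the most specific (longest) matching region wins and that anything unmatched falls back to /_default_. RegionLookupSketch and its resolve method are hypothetical names for illustration, not part of JBoss Cache, and the node paths in the usage note are made up.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; illustrates longest-prefix region matching only.
final class RegionLookupSketch {
    static final String DEFAULT_REGION = "/_default_";

    // Region names taken from the example configuration above.
    static final List<String> EXAMPLE_REGIONS = Arrays.asList(
            "/_default_",
            "/TS",
            "/appA/reference",
            "/appA/reference/QUERY",
            "/appA/org/example/hibernate",
            "/appA/org/hibernate/cache/StandardQueryCache");

    static String resolve(List<String> regions, String fqn) {
        String best = DEFAULT_REGION;
        int bestLength = -1;
        for (String region : regions) {
            if (region.equals(DEFAULT_REGION)) {
                continue; // the default region is only a fallback
            }
            // Match whole path segments only, so that "/appA/referenceData"
            // is not treated as being inside "/appA/reference".
            boolean contains = fqn.equals(region) || fqn.startsWith(region + "/");
            if (contains && region.length() > bestLength) {
                best = region;
                bestLength = region.length();
            }
        }
        return best;
    }
}
```

For instance, resolve(EXAMPLE_REGIONS, "/appA/reference/QUERY/q1") selects /appA/reference/QUERY rather than its parent, while a path such as "/appB/other" falls through to /_default_.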

Notes on the above:

Some best practices to follow:

We've now gone through all the main concepts and the configuration details. In this chapter we'll look under the covers at the architectural design of the Hibernate/JBoss Cache integration. Readers who aren't interested in these internals can skip this chapter.

The rest of Hibernate interacts with the Second Level Cache subsystem via the org.hibernate.cache.RegionFactory interface. Which implementation of the interface is used is determined by the value of the hibernate.cache.region.factory_class configuration property. The interface itself is straightforward:

void start(Settings settings, Properties properties) 
       throws CacheException;

void stop();

boolean isMinimalPutsEnabledByDefault();

long nextTimestamp();

EntityRegion buildEntityRegion(String regionName, 
                               Properties properties, 
                               CacheDataDescription metadata) 
            throws CacheException;

CollectionRegion buildCollectionRegion(String regionName, 
                                       Properties properties, 
                                       CacheDataDescription cdd) 
            throws CacheException;

QueryResultsRegion buildQueryResultsRegion(String regionName, 
                                           Properties properties) 
            throws CacheException;

TimestampsRegion buildTimestampsRegion(String regionName, 
                                       Properties properties) 
            throws CacheException;
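As a small illustration of how an implementation is selected, the following Java sketch builds the relevant configuration properties. The helper class RegionFactoryPropsSketch is hypothetical, and the class name passed in the usage note is a placeholder rather than a real factory; substitute the fully qualified name of the RegionFactory implementation you actually intend to use.

```java
import java.util.Properties;

// Hypothetical helper; only assembles standard Hibernate property keys.
final class RegionFactoryPropsSketch {

    static Properties secondLevelCacheProperties(String regionFactoryClassName) {
        Properties props = new Properties();
        // Turn on the second level cache.
        props.setProperty("hibernate.cache.use_second_level_cache", "true");
        // Name the RegionFactory implementation Hibernate should instantiate.
        props.setProperty("hibernate.cache.region.factory_class", regionFactoryClassName);
        return props;
    }
}
```

A call such as secondLevelCacheProperties("com.example.MyRegionFactory") yields properties that can be supplied to Hibernate alongside the rest of the session factory configuration. The same values could equally be set in hibernate.cfg.xml or hibernate.properties.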

Next, we'll look at how the JBoss Cache integration implements these interfaces, first in the case where a single JBoss Cache instance is used, then in the case where multiple instances are desired.

The following diagram illustrates the key elements involved when a single JBoss Cache instance is used:

The situation when multiple JBoss Cache instances are used is very similar to the single cache case: