Chapter 4. Cache Eviction

Eviction refers to the process by which old, relatively unused, or excessively voluminous data can be dropped from the cache, allowing the cache to remain within a memory budget. Generally, applications that use the Second Level Cache should configure eviction, unless only a relatively small amount of reference data is cached. This chapter provides a brief overview of how JBoss Cache eviction works, and then explains how to configure eviction to effectively manage the data stored in a Hibernate Second Level Cache. A basic understanding of JBoss Cache eviction and of concepts like FQNs is assumed; see the JBoss Cache User Guide for more information.

4.2. Organization of Data in the Cache

In order to understand how to configure eviction, you need to understand how Hibernate organizes data in the cache.

4.2.1. Region Prefix and Region Name

All FQNs in a second level cache include two elements:

A Region Prefix, which is simply any value assigned to the hibernate.cache.region_prefix Hibernate configuration property. If no Region Prefix is set, this portion of the FQN is omitted.
If different session factories share the same underlying JBoss Cache instance(s), assigning a distinct Region Prefix to each is strongly encouraged. This helps ensure that the different session factories cache their data in different subtrees in JBoss Cache.
A Region Name, which is either
- Any value assigned to a <cache> element's region attribute in a class or collection mapping. See Section 4.2.2, “Entities” for an example.
- Any value assigned to a Hibernate Query object's cacheRegion property. See Section 4.2.4, “Queries” for an example.
- The escaped class name of the type being cached. An escaped class name is simply a fully-qualified class name with any . replaced with a / -- for example org/example/Foo.
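The escaping rule described in the last item can be sketched in a few lines of Java. The class and method names here are hypothetical, chosen only for illustration; they are not part of the Hibernate or JBoss Cache API.

```java
// Illustrative sketch only: EscapedClassNameSketch and escape() are
// hypothetical names, not part of Hibernate or JBoss Cache.
final class EscapedClassNameSketch {

    private EscapedClassNameSketch() {
    }

    /** Replaces every '.' in a fully-qualified class name with '/'. */
    static String escape(String fullyQualifiedClassName) {
        return fullyQualifiedClassName.replace('.', '/');
    }

    public static void main(String[] args) {
        // prints org/example/Foo
        System.out.println(escape("org.example.Foo"));
    }
}
```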

4.2.2. Entities

The FQN for the cache region where entities of a particular class are stored is derived as follows:

/ + Region Prefix + / + Region Name + /ENTITY

If no region prefix was specified, the leading / and Region Prefix are not included in the FQN. So, if hibernate.cache.region_prefix was set to "appA" and a class was mapped like this:

<class name="org.example.Foo">
    <cache usage="transactional" region="foo_region"/>
    ....
</class>

The FQN of the region where Foo entities would be cached is /appA/foo_region/ENTITY.

If the class mapping does not include a region attribute, the region name is based on the name of the entity class, e.g.

<class name="org.example.Bar">
    <cache usage="transactional"/>
    ....
</class>

In this case, the FQN of the region where Bar entities would be cached is /appA/org/example/Bar/ENTITY.
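As a rough illustration of the derivation rule, the following Java sketch assembles a region FQN from its parts. It is not how Hibernate builds the FQN internally; the class and method names are hypothetical, and the region name is assumed to be passed in already escaped (with / in place of . for class names).

```java
// Illustrative sketch only: RegionFqnSketch is a hypothetical helper,
// not part of Hibernate or JBoss Cache.
final class RegionFqnSketch {

    private RegionFqnSketch() {
    }

    /**
     * Builds "/" + regionPrefix + "/" + regionName + "/" + typeSuffix.
     * The leading "/" and prefix are left out when no prefix is set.
     * regionName is assumed to be already escaped, e.g. org/example/Bar.
     */
    static String of(String regionPrefix, String regionName, String typeSuffix) {
        StringBuilder fqn = new StringBuilder();
        if (regionPrefix != null && !regionPrefix.isEmpty()) {
            fqn.append('/').append(regionPrefix);
        }
        fqn.append('/').append(regionName);
        fqn.append('/').append(typeSuffix);
        return fqn.toString();
    }

    public static void main(String[] args) {
        // prints /appA/foo_region/ENTITY
        System.out.println(of("appA", "foo_region", "ENTITY"));
        // prints /appA/org/example/Bar/ENTITY
        System.out.println(of("appA", "org/example/Bar", "ENTITY"));
        // prints /org/example/Bar/ENTITY (no region prefix configured)
        System.out.println(of(null, "org/example/Bar", "ENTITY"));
    }
}
```

The type suffix is a parameter because only the final element of the pattern varies by the kind of data being cached.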

4.2.3. Collections

The FQN for the cache region where a particular collection is stored is derived as follows:

/ + Region Prefix + / + Region Name + /COLL

So, let's say our example Foo entity included a collection field bars that we wanted to cache:

<class name="org.example.Foo">
    <cache usage="transactional"/>
    ....
    <set name="bars">
        <cache usage="transactional" region="foo_region"/>
        <key column="FOO_ID"/>
        <one-to-many class="org.example.Bar"/>
    </set>
</class>

The FQN of the region where the collection would be cached is /appA/foo_region/COLL.

If the collection's <cache> element did not include a region, the FQN would be /appA/org/example/Foo/COLL.

4.2.4. Queries

Queries follow this pattern:

/ + Region Prefix + / + Region Name + /QUERY

Say we had the following query (again with a region prefix set to "appA"):

List blogs = sess.createQuery("from Blog blog " + 
                              "where blog.blogger = :blogger")
    .setEntity("blogger", blogger)
    .setMaxResults(15)
    .setCacheable(true)
    .setCacheRegion("frontpages")
    .list();

The FQN of the region where this query's results would be cached is /appA/frontpages/QUERY.

If the call to setCacheRegion("frontpages") were omitted, the Region Name portion of the FQN would be based on a Hibernate class: /appA/org/hibernate/cache/StandardQueryCache/QUERY.

4.2.5. Timestamps

Timestamps follow this pattern:

/TS/ + Region Prefix + /org/hibernate/cache/UpdateTimestampsCache

again with a / and the Region Prefix portion omitted if no region prefix was set.

Note that in the timestamps case the special constant ("TS") comes at the start of the FQN rather than the end. This makes it easier to ensure that eviction is never enabled for the timestamps region.
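To make the difference in layout concrete, here is a small Java sketch of the timestamps pattern. The helper names are hypothetical and the code is not taken from Hibernate. It shows that every timestamps FQN sits under the /TS root whatever the region prefix is, which is why a single eviction region rooted at /TS is enough to cover all of them.

```java
// Illustrative sketch only: TimestampsFqnSketch is a hypothetical helper,
// not part of Hibernate or JBoss Cache.
final class TimestampsFqnSketch {

    // Escaped name of the Hibernate class used as the region name.
    static final String TIMESTAMPS_REGION = "org/hibernate/cache/UpdateTimestampsCache";

    private TimestampsFqnSketch() {
    }

    /** Builds "/TS/" + regionPrefix + "/" + region name; the prefix is optional. */
    static String timestampsFqn(String regionPrefix) {
        StringBuilder fqn = new StringBuilder("/TS");
        if (regionPrefix != null && !regionPrefix.isEmpty()) {
            fqn.append('/').append(regionPrefix);
        }
        return fqn.append('/').append(TIMESTAMPS_REGION).toString();
    }

    /** True if the FQN is /TS itself or lies anywhere beneath it. */
    static boolean isUnderTimestampsRoot(String fqn) {
        return fqn.equals("/TS") || fqn.startsWith("/TS/");
    }

    public static void main(String[] args) {
        // prints /TS/appA/org/hibernate/cache/UpdateTimestampsCache
        System.out.println(timestampsFqn("appA"));
        // prints /TS/org/hibernate/cache/UpdateTimestampsCache
        System.out.println(timestampsFqn(null));
        // prints true
        System.out.println(isUnderTimestampsRoot(timestampsFqn("appB")));
    }
}
```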

4.3. Example Configuration

So far we've been looking at things in the abstract; let's see an example of how this comes together. In this example, imagine we have a Hibernate application with the following characteristics.

Query caching is enabled.
There is a region prefix set as part of the Hibernate configuration: hibernate.cache.region_prefix=appA
Some cacheable entities and collections have a region name of "reference" set in their Hibernate mapping.
Some cacheable queries have the "reference" region name set when they are created.
Other cacheable entities and collections in the org.example.hibernate package don't have a region name set in their Hibernate mapping.
Other cacheable queries don't have a region name set when they are created.

Let's see a possible eviction configuration for this scenario:

<attribute name="EvictionPolicyConfig">
  <config>
         
    <attribute name="wakeUpIntervalSeconds">5</attribute>
    <attribute name="policyClass">org.jboss.cache.eviction.LRUPolicy</attribute>
          
    <!--  
      Default region to pick up anything we miss in the more
      specific regions below.
    -->
    <region name="/_default_">
       <attribute name="maxNodes">500</attribute>
       <attribute name="timeToLiveSeconds">300</attribute>
       <attribute name="minTimeToLiveSeconds">120</attribute>
    </region>
            
    <!--  Don't ever evict modification timestamps -->
    <region name="/TS" 
       policyClass="org.jboss.cache.eviction.NullEvictionPolicy"/>
            
    <!-- Reference data -->
    <region name="/appA/reference">
       <!-- Keep all reference data if it's being used -->
       <attribute name="maxNodes">0</attribute>
       <!-- Keep it around a long time (4 hours) -->
       <attribute name="timeToLiveSeconds">14400</attribute>
       <attribute name="minTimeToLiveSeconds">120</attribute>
    </region>
            
    <!-- Be more aggressive about queries on reference data -->
    <region name="/appA/reference/QUERY">
       <attribute name="maxNodes">200</attribute>
       <attribute name="timeToLiveSeconds">1000</attribute>
       <attribute name="minTimeToLiveSeconds">120</attribute>
    </region>
            
    <!-- 
       Lots of entity instances from this package, but different
       users are unlikely to share them. So, we can cache
       a lot, but evict unused ones pretty quickly.
    -->
    <region name="/appA/org/example/hibernate">
       <attribute name="maxNodes">50000</attribute>
       <attribute name="timeToLiveSeconds">1200</attribute>
       <attribute name="minTimeToLiveSeconds">120</attribute>
    </region>
            
    <!-- Clean up misc queries very promptly -->
    <region name="/appA/org/hibernate/cache/StandardQueryCache">
       <attribute name="maxNodes">200</attribute>
       <attribute name="timeToLiveSeconds">240</attribute>
       <attribute name="minTimeToLiveSeconds">120</attribute>
    </region>
            
  </config>
</attribute>

Notes on the above:

The wakeUpIntervalSeconds configuration controls how often the background eviction process kicks in to evict nodes.
The first policyClass configuration sets the default eviction policy class to use for each region. Here we want to use the standard LRUPolicy. This can be overridden on a per-region basis, as is done here for the /TS region.
We set up a /_default_ region. Having such a region is a requirement if eviction is used. Here we don't expect any data to end up in this default region. However, if someone mistakenly adds a new entity type that doesn't fall into one of our other regions, we may not have a large memory budget for it, so we evict fairly aggressively.
Evicting timestamps is forbidden, so we add a /TS region that disables it. Here we see how to override the default eviction policy.
The /appA/reference region covers our reference data entities and collections. This is our most likely to be reused data, so we configure the cache to be very slow to evict it.
The queries related to our reference data are less likely to be reused, and may take up a lot of memory, so we override the /appA/reference region with a /appA/reference/QUERY region that is more aggressive about eviction.
The org.example.hibernate package includes a lot of entity classes like Order, where there are hundreds of thousands of records in the database. These are unlikely to be reused across users, but we have a lot of users and want to be able to cache many of them so a user can have fast access to his or her data during the course of their interaction with the system. So we create a /appA/org/example/hibernate region with a high maxNodes value but a fairly low timeToLiveSeconds. The low time-to-live ensures an Order is evicted quickly once a user is done with it.
Finally, cacheable queries that aren't assigned to the reference region will end up in /appA/org/hibernate/cache/StandardQueryCache. We've elected not to keep these around long at all.
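The notes above rely on a node being governed by the most specific eviction region that contains it, with /_default_ catching everything else. The following Java sketch models that lookup for the regions in the example configuration. It is a simplified model under that assumption, not JBoss Cache's actual region-resolution code, and the node FQNs passed to it (including the trailing key elements) are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: a simplified model of region lookup, assuming
// the most specific (longest) matching region wins. Not JBoss Cache code.
final class EvictionRegionLookupSketch {

    static final String DEFAULT_REGION = "/_default_";

    // Region names taken from the example eviction configuration.
    static final List<String> REGIONS = Arrays.asList(
            "/TS",
            "/appA/reference",
            "/appA/reference/QUERY",
            "/appA/org/example/hibernate",
            "/appA/org/hibernate/cache/StandardQueryCache");

    private EvictionRegionLookupSketch() {
    }

    /** Returns the longest configured region that contains nodeFqn. */
    static String regionFor(String nodeFqn) {
        String best = DEFAULT_REGION;
        int bestLength = -1;
        for (String region : REGIONS) {
            boolean covers = nodeFqn.equals(region) || nodeFqn.startsWith(region + "/");
            if (covers && region.length() > bestLength) {
                best = region;
                bestLength = region.length();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // prints /appA/reference/QUERY
        System.out.println(regionFor("/appA/reference/QUERY/someQueryKey"));
        // prints /appA/reference
        System.out.println(regionFor("/appA/reference/ENTITY/someEntityKey"));
        // prints /_default_
        System.out.println(regionFor("/appA/com/other/Thing/ENTITY/someEntityKey"));
    }
}
```

Under this model, a query cached with the "reference" region name lands in the more aggressive /appA/reference/QUERY region, while reference entities and collections stay in the slower-to-evict /appA/reference region.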

4.4. Best Practices

Some best practices to follow:

Set hibernate.cache.region_prefix in your configuration. It makes it simple to ensure the different session factories don't step on each other if they share a JBoss Cache instance.
Always set up an eviction region for the /TS FQN that uses the NullEvictionPolicy. This will ensure that timestamps never get evicted. Even if you are not doing query caching or aren't caching timestamps in a particular cache, this is still a good practice, as it costs almost nothing and helps to ensure that timestamp eviction doesn't slip in unnoticed later.
Assign a region to your entities, collections and queries rather than relying on class names to compose the FQN. It makes it easier to set up eviction, and helps prevent your eviction setup breaking if class names are refactored.
Assign a different region name to entities, collections or queries that have different desirable eviction characteristics. Put objects like frequently used reference data in one region, and data probably only accessed by a single user in another. Aggressively evict the latter region; be less aggressive with the former, if you evict it at all.
In some cases, there is an external application (i.e. outside of Hibernate's control) that can modify data in the database. Generally, a Second Level Cache should not be used in this sort of case, since it can result in data in the cache being out of date with respect to the database. But sometimes application designers can tolerate having out-of-date data in the cache. In this sort of situation, use an LRUPolicy with a fairly low maxAgeSeconds. This will ensure that out-of-date data eventually gets flushed from the cache.