Hibernate.org Community Documentation
Eviction refers to the process by which old, relatively unused, or excessively voluminous data can be dropped from the cache, allowing the cache to remain within a memory budget. Generally, applications that use the Second Level Cache should configure eviction, unless only a relatively small amount of reference data is cached. This chapter provides a brief overview of how JBoss Cache eviction works, and then explains how to configure eviction to effectively manage the data stored in a Hibernate Second Level Cache. A basic understanding of JBoss Cache eviction and of concepts like FQNs is assumed; see the JBoss Cache User Guide for more information.
The JBoss Cache eviction process is fairly straightforward. Whenever a node in a cache is read or written to, added or removed, the cache finds the eviction region (see below) that contains the node and passes an eviction event object to the eviction policy (see below) associated with the region. The eviction policy uses the stream of events it receives to track activity in the region. Periodically, a background thread runs and contacts each region's eviction policy. The policy uses its knowledge of the activity in the region, along with any configuration it was provided at startup, to determine which if any cache nodes should be evicted from memory. It then tells the cache to evict those nodes. Evicting a node means dropping it from the cache's in-memory state. The eviction only occurs on that cache instance; there is no cluster-wide eviction.
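The flow just described can be sketched in a few lines of Java. This is only an illustration of the idea, not JBoss Cache code: the class EvictionFlowSketch, its EventType enum and its method names are hypothetical, and the background thread is reduced to a method that something else would call periodically.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch of one region's event handling; not the JBoss Cache API.
final class EvictionFlowSketch {

    enum EventType { ADDED, VISITED, REMOVED }

    static final class Event {
        final String fqn;
        final EventType type;
        final long timeMillis;

        Event(String fqn, EventType type, long timeMillis) {
            this.fqn = fqn;
            this.type = type;
            this.timeMillis = timeMillis;
        }
    }

    // Threads using the cache only enqueue events; they never evict anything.
    private final Queue<Event> pending = new ConcurrentLinkedQueue<>();

    // The policy's view of the region: FQN -> time of last use.
    // Only the periodic eviction thread is assumed to touch this map.
    private final Map<String, Long> lastUsed = new LinkedHashMap<>();

    /** Called by the cache whenever a node in the region is touched. */
    void onEvent(String fqn, EventType type, long timeMillis) {
        pending.add(new Event(fqn, type, timeMillis));
    }

    /** Called periodically: absorb queued events into the policy's view. */
    Map<String, Long> processPending() {
        Event e;
        while ((e = pending.poll()) != null) {
            if (e.type == EventType.REMOVED) {
                lastUsed.remove(e.fqn);
            } else {
                lastUsed.put(e.fqn, e.timeMillis);
            }
        }
        return lastUsed;
    }
}
```

With a view like this in hand, the policy can decide which nodes to evict on each wake-up without slowing down the threads that read and write the cache.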
An important point to understand is that eviction proceeds independently on each peer in the cluster, with what gets evicted depending on the activity on that peer. There is no "global eviction" where JBoss Cache removes a piece of data in every peer in the cluster in order to keep memory usage inside a budget. The Hibernate/JBC integration layer may remove some data globally, but that isn't done for the kind of memory management reasons we're discussing in this chapter.
An effect of this is that even if a cache is configured for replication, if eviction is enabled the contents of a cache will be different between peers in the cluster; some may have evicted some data, while others will have evicted different data. What gets evicted is driven by what data is accessed by users on each peer.
Controlling when data is evicted from the cache is a matter of setting up appropriate eviction regions and configuring appropriate eviction policies for each region.
JBoss Cache stores its data in a set of nodes organized in a tree structure. An eviction region is just a portion of the tree to which an eviction policy has been assigned. The name of the region is the FQN of the topmost node in that portion of the tree. An eviction configuration always needs to include a special region named _default_; this region is rooted in the root node of the tree and includes all nodes not covered by other regions.
It's possible to define regions that overlap. In other words, one region can be defined for /a/b/c, and another defined for /a/b/c/d (which is just the d subtree of the /a/b/c sub-tree). The algorithm that assigns eviction events to eviction regions handles scenarios like this consistently by always choosing the first region it encounters. So, if the algorithm needed to decide how to handle an event affecting /a/b/c/d/e, it would start from there and work its way up the tree until it hits the first defined region - in this case /a/b/c/d.
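The walk up the tree can be sketched as follows. This is a minimal illustration rather than the actual JBoss Cache algorithm: the RegionLookup class is hypothetical, FQNs are handled as plain strings that are assumed to start with /, and anything that matches no defined region falls through to the default region.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of JBoss Cache.
final class RegionLookup {

    static final String DEFAULT_REGION = "/_default_";

    private final Set<String> regions;

    RegionLookup(String... regionFqns) {
        this.regions = new HashSet<>(Arrays.asList(regionFqns));
    }

    /** Returns the closest enclosing region for the FQN, or the default region. */
    String resolve(String fqn) {
        String current = fqn;
        while (!current.isEmpty()) {
            if (regions.contains(current)) {
                return current;
            }
            int lastSlash = current.lastIndexOf('/');
            if (lastSlash < 0) {
                break; // not an absolute FQN; fall back to the default region
            }
            // Drop the last element and try the parent node.
            current = current.substring(0, lastSlash);
        }
        return DEFAULT_REGION;
    }

    public static void main(String[] args) {
        RegionLookup lookup = new RegionLookup("/a/b/c", "/a/b/c/d");
        System.out.println(lookup.resolve("/a/b/c/d/e")); // prints /a/b/c/d
        System.out.println(lookup.resolve("/a/b/c/x"));   // prints /a/b/c
        System.out.println(lookup.resolve("/z"));         // prints /_default_
    }
}
```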
An Eviction Policy is a class that knows how to handle eviction events to track the activity in its region. It may have a specialized set of configuration properties that give it rules for when a particular node in the region should be evicted. It can then use that configuration and its knowledge of activity in the region to determine what nodes to evict.
JBoss Cache ships with a number of eviction policies. See the JBoss Cache User Guide for a discussion of all of them. Here we are going to focus on just two.
The org.jboss.cache.eviction.LRUPolicy evicts nodes that have been Least Recently Used. It has the following configuration parameters:
maxNodes - This is the maximum number of nodes allowed in this region. 0 denotes no limit. If the region has more nodes than this, the least recently used nodes will be evicted until the number of nodes equals this limit.
timeToLiveSeconds - The amount of time (in seconds) a node may go without being read or written to before it should be evicted. 0 denotes no limit. Nodes that exceed this limit will be evicted whether or not a maxNodes limit has been breached.
maxAgeSeconds - Lifespan of a node (in seconds), regardless of idle time, before the node is swept away. 0 denotes no limit. Nodes that exceed this limit will be evicted whether or not a maxNodes or timeToLiveSeconds limit has been breached.
minTimeToLiveSeconds - The minimum amount of time a node must be allowed to live after being accessed before it is allowed to be considered for eviction. 0 denotes that this feature is disabled, which is the default value. Should be set to a value less than timeToLiveSeconds. It is recommended that this be set to a value slightly greater than the maximum amount of time a transaction that affects the region should take to complete. Configuring this is particularly important when optimistic locking is used in conjunction with invalidation.
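To see how the four parameters interact, here is a simplified model of the rules above. It is a sketch under stated assumptions, not the LRUPolicy implementation: the LruSketch and NodeInfo classes are hypothetical, times are plain seconds, and we assume that minTimeToLiveSeconds shields a recently used node from all of the other rules.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative model of the rules described above; all names are hypothetical.
final class LruSketch {

    /** What the policy is assumed to know about one node (times in seconds). */
    static final class NodeInfo {
        final String fqn;
        final long createdAt;
        final long lastUsedAt;

        NodeInfo(String fqn, long createdAt, long lastUsedAt) {
            this.fqn = fqn;
            this.createdAt = createdAt;
            this.lastUsedAt = lastUsedAt;
        }
    }

    private final int maxNodes;
    private final long timeToLiveSeconds;
    private final long maxAgeSeconds;
    private final long minTimeToLiveSeconds;

    LruSketch(int maxNodes, long timeToLiveSeconds,
              long maxAgeSeconds, long minTimeToLiveSeconds) {
        this.maxNodes = maxNodes;
        this.timeToLiveSeconds = timeToLiveSeconds;
        this.maxAgeSeconds = maxAgeSeconds;
        this.minTimeToLiveSeconds = minTimeToLiveSeconds;
    }

    /** Returns the FQNs to evict, least recently used first. */
    List<String> selectForEviction(List<NodeInfo> nodes, long now) {
        List<NodeInfo> lruFirst = new ArrayList<>(nodes);
        lruFirst.sort(Comparator.comparingLong((NodeInfo n) -> n.lastUsedAt));

        List<String> toEvict = new ArrayList<>();
        int remaining = lruFirst.size();
        for (NodeInfo node : lruFirst) {
            long idle = now - node.lastUsedAt;
            long age = now - node.createdAt;
            // Assumption: a node used very recently is never considered.
            if (minTimeToLiveSeconds > 0 && idle < minTimeToLiveSeconds) {
                continue;
            }
            boolean tooIdle = timeToLiveSeconds > 0 && idle > timeToLiveSeconds;
            boolean tooOld = maxAgeSeconds > 0 && age > maxAgeSeconds;
            boolean overLimit = maxNodes > 0 && remaining > maxNodes;
            if (tooIdle || tooOld || overLimit) {
                toEvict.add(node.fqn);
                remaining--;
            }
        }
        return toEvict;
    }

    public static void main(String[] args) {
        // maxNodes=2, timeToLiveSeconds=300, maxAgeSeconds=0 (off), minTimeToLiveSeconds=120
        LruSketch policy = new LruSketch(2, 300, 0, 120);
        List<NodeInfo> nodes = new ArrayList<>();
        nodes.add(new NodeInfo("/a", 0, 100)); // idle 900s: past timeToLiveSeconds
        nodes.add(new NodeInfo("/b", 0, 800)); // idle 200s: evicted to reach maxNodes
        nodes.add(new NodeInfo("/c", 0, 850)); // idle 150s: kept, region is within budget
        nodes.add(new NodeInfo("/d", 0, 950)); // idle 50s: protected by minTimeToLiveSeconds
        System.out.println(policy.selectForEviction(nodes, 1000)); // prints [/a, /b]
    }
}
```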
The org.jboss.cache.eviction.NullEvictionPolicy is a simple policy that very efficiently does ... nothing. It is used to efficiently short-circuit eviction handling for regions where you don't want objects to be evicted (e.g. the timestamps cache, which should never have data evicted). Since the NullEvictionPolicy doesn't actually evict anything, it doesn't take any configuration parameters.
In order to understand how to configure eviction, you need to understand how Hibernate organizes data in the cache.
All FQNs in a second level cache include two elements:
A Region Prefix, which is simply any value assigned to the hibernate.cache.region_prefix Hibernate configuration property. If no Region Prefix is set, this portion of the FQN is omitted.
If different session factories are sharing the same underlying JBoss Cache instance(s) it is strongly encouraged that a distinct Region Prefix be assigned to each. This will help ensure that the different session factories cache their data in different subtrees in JBoss Cache.
A Region Name, which is one of the following:
Any value assigned to a <cache> element's region attribute in a class or collection mapping. See Section 4.2.2, “Entities” for an example.
Any value assigned to a Hibernate Query object's cacheRegion property. See Section 4.2.4, “Queries” for an example.
The escaped class name of the type being cached. An escaped class name is simply a fully-qualified class name with any . replaced with a / -- for example org/example/Foo.
The FQN for the cache region where entities of a particular class are stored is derived as follows:
/ + Region Prefix + / + Region Name + /ENTITY
If no region prefix was specified, the leading / and Region Prefix are not included in the FQN.
So, if hibernate.cache.region_prefix was set to "appA" and a class was mapped like this:
<class name="org.example.Foo">
    <cache usage="transactional" region="foo_region"/>
    ....
</class>
The FQN of the region where Foo entities would be cached is /appA/foo_region/ENTITY.
If the class mapping does not include a region attribute, the region name is based on the name of the entity class, e.g.
<class name="org.example.Bar">
    <cache usage="transactional"/>
    ....
</class>
the FQN of the region where Bar entities would be cached is /appA/org/example/Bar/ENTITY.
The FQN for the cache region where the collections of a particular class are stored is derived as follows:
/ + Region Prefix + / + Region Name + /COLL
So, let's say our example Foo entity included a collection field bars that we wanted to cache:
<class name="org.example.Foo">
    <cache usage="transactional"/>
    ....
    <set name="bars">
        <cache usage="transactional" region="foo_region"/>
        <key column="FOO_ID"/>
        <one-to-many class="org.example.Bar"/>
    </set>
</class>
The FQN of the region where the collection would be cached would be /appA/foo_region/COLL.
If the collection's <cache> element did not include a region, the FQN would be /appA/org/example/Foo/COLL.
Queries follow this pattern:
/ + Region Prefix + / + Region Name + /QUERY
Say we had the following query (again with a region prefix set to "appA"):
List blogs = sess.createQuery("from Blog blog " + "where blog.blogger = :blogger")
    .setEntity("blogger", blogger)
    .setMaxResults(15)
    .setCacheable(true)
    .setCacheRegion("frontpages")
    .list();
The FQN of the region where this query's results would be cached would be /appA/frontpages/QUERY.
If the call to setCacheRegion("frontpages") were omitted, the Region Name portion of the FQN would be based on a Hibernate class:
/appA/org/hibernate/cache/StandardQueryCache/QUERY
Timestamps follow this pattern:
/TS/ + Region Prefix + /org/hibernate/cache/UpdateTimestampsCache
again with a / and the Region Prefix portion omitted if no region prefix was set.
Note that in the timestamps case the special constant ("TS") comes at the start of the FQN rather than the end. This makes it easier to ensure that eviction is never enabled for the timestamps region.
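The four patterns can be collected into one small helper. This is a sketch for working out which FQN a given mapping produces, written from the rules in this chapter; the RegionFqns class and its method names are hypothetical and are not part of Hibernate or JBoss Cache.

```java
// Hypothetical helper that composes region FQNs from the patterns above.
final class RegionFqns {

    /** Turns org.example.Foo into org/example/Foo. */
    static String escape(String className) {
        return className.replace('.', '/');
    }

    // An unset prefix contributes nothing, not even its leading slash.
    private static String prefixPart(String regionPrefix) {
        return (regionPrefix == null || regionPrefix.isEmpty()) ? "" : "/" + regionPrefix;
    }

    static String entity(String regionPrefix, String regionName) {
        return prefixPart(regionPrefix) + "/" + regionName + "/ENTITY";
    }

    static String collection(String regionPrefix, String regionName) {
        return prefixPart(regionPrefix) + "/" + regionName + "/COLL";
    }

    static String query(String regionPrefix, String regionName) {
        return prefixPart(regionPrefix) + "/" + regionName + "/QUERY";
    }

    /** The TS constant comes first, before the optional prefix. */
    static String timestamps(String regionPrefix) {
        return "/TS" + prefixPart(regionPrefix)
                + "/org/hibernate/cache/UpdateTimestampsCache";
    }

    public static void main(String[] args) {
        System.out.println(entity("appA", "foo_region"));
        // prints /appA/foo_region/ENTITY
        System.out.println(entity("appA", escape("org.example.Bar")));
        // prints /appA/org/example/Bar/ENTITY
        System.out.println(query("appA", "frontpages"));
        // prints /appA/frontpages/QUERY
        System.out.println(timestamps(null));
        // prints /TS/org/hibernate/cache/UpdateTimestampsCache
    }
}
```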
So far we've been looking at things in the abstract; let's see an example of how this comes together. In this example, imagine we have a Hibernate application with the following characteristics.
Query caching is enabled.
There is a region prefix set as part of the Hibernate configuration: hibernate.cache.region_prefix=appA
Some cacheable entities and collections have a region name of "reference" set in their Hibernate mapping.
Some cacheable queries have the "reference" region name set when they are created.
Other cacheable entities and collections in the org.example.hibernate package don't have a region name set in their Hibernate mapping.
Other cacheable queries don't have a region name set when they are created.
Let's see a possible eviction configuration for this scenario:
<attribute name="EvictionPolicyConfig">
   <config>
      <attribute name="wakeUpIntervalSeconds">5</attribute>
      <attribute name="policyClass">org.jboss.cache.eviction.LRUPolicy</attribute>

      <!-- Default region to pick up anything we miss in the more specific regions below. -->
      <region name="/_default_">
         <attribute name="maxNodes">500</attribute>
         <attribute name="timeToLiveSeconds">300</attribute>
         <attribute name="minTimeToLiveSeconds">120</attribute>
      </region>

      <!-- Don't ever evict modification timestamps -->
      <region name="/TS" policyClass="org.jboss.cache.eviction.NullEvictionPolicy"/>

      <!-- Reference data -->
      <region name="/appA/reference">
         <!-- Keep all reference data if it's being used -->
         <attribute name="maxNodes">0</attribute>
         <!-- Keep it around a long time (4 hours) -->
         <attribute name="timeToLiveSeconds">14400</attribute>
         <attribute name="minTimeToLiveSeconds">120</attribute>
      </region>

      <!-- Be more aggressive about queries on reference data -->
      <region name="/appA/reference/QUERY">
         <attribute name="maxNodes">200</attribute>
         <attribute name="timeToLiveSeconds">1000</attribute>
         <attribute name="minTimeToLiveSeconds">120</attribute>
      </region>

      <!-- Lots of entity instances from this package, but different users are
           unlikely to share them. So, we can cache a lot, but evict unused ones
           pretty quickly. -->
      <region name="/appA/org/example/hibernate">
         <attribute name="maxNodes">50000</attribute>
         <attribute name="timeToLiveSeconds">1200</attribute>
         <attribute name="minTimeToLiveSeconds">120</attribute>
      </region>

      <!-- Clean up misc queries very promptly -->
      <region name="/appA/org/hibernate/cache/StandardQueryCache">
         <attribute name="maxNodes">200</attribute>
         <attribute name="timeToLiveSeconds">240</attribute>
         <attribute name="minTimeToLiveSeconds">120</attribute>
      </region>
   </config>
</attribute>
Notes on the above:
The wakeUpIntervalSeconds configuration controls how often the background eviction process kicks in to evict nodes.
The first policyClass configuration sets the default eviction policy class to use for each region. Here we want to use the standard LRUPolicy. This can be overridden on a per-region basis, as is done here for the /TS region.
We set up a /_default_ region. Having such a region is a requirement if eviction is used. Here we don't expect any data to end up in this default region, but if by mistake someone adds a new entity type that doesn't fall into one of our other regions, we may not have a large memory budget for it, so we evict fairly aggressively.
Evicting timestamps is forbidden, so we add a /TS region that disables it. Here we see how to override the default eviction policy.
The /appA/reference region covers our reference data entities and collections. This is the data most likely to be reused, so we configure the cache to be very slow to evict it.
The queries related to our reference data are less likely to be reused, and may take up a lot of memory, so we override the /appA/reference region with a /appA/reference/QUERY region that is more aggressive about eviction.
The org.example.hibernate package includes a lot of entity classes like Order, where there are hundreds of thousands of records in the database. These are unlikely to be reused across users, but we have a lot of users and want to be able to cache many of them so that users have fast access to their own data during the course of their interaction with the system. So we create a /appA/org/example/hibernate region with a high maxNodes value but a fairly low timeToLiveSeconds. The low time-to-live ensures an Order is evicted quickly once a user is done with it.
Finally, cacheable queries that aren't assigned to the reference region will end up in /appA/org/hibernate/cache/StandardQueryCache. We've elected not to keep these around long at all.
Some best practices to follow:
Set hibernate.cache.region_prefix in your configuration. It makes it simple to ensure the different session factories don't step on each other if they share a JBoss Cache instance.
Always set up an eviction region for the /TS FQN that uses the NullEvictionPolicy. This will ensure that timestamps never get evicted. Even if you are not doing query caching or aren't caching timestamps in a particular cache, this is still a good practice, as it costs almost nothing and helps to ensure that timestamp eviction doesn't slip in unnoticed later.
Assign a region to your entities, collections and queries rather than relying on class names to compose the FQN. It makes it easier to set up eviction, and helps prevent your eviction setup breaking if class names are refactored.
Assign different region names to entities, collections or queries that call for different eviction characteristics. Put objects like frequently used reference data in one region, and data probably only accessed by a single user in another. Aggressively evict the latter region; be less aggressive with the former, if you evict it at all.
In some cases, there is an external application (i.e. outside of Hibernate's control) that can modify data in the database. Generally, a Second Level Cache should not be used in this sort of case, since it can result in data in the cache being out of date with respect to the database. But sometimes application designers can tolerate having out-of-date data in the cache. In this sort of situation, use an LRUPolicy with a fairly low maxAgeSeconds. This will ensure that out-of-date data eventually gets flushed from the cache.
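As a small illustration of the first practice, the snippet below builds a separate set of Hibernate properties for each session factory, each with its own region prefix. It is only a sketch: the CacheSettings class and the prefix values are hypothetical, and the resulting Properties would still need to be handed to your Hibernate configuration along with the rest of your settings.

```java
import java.util.Properties;

// Hypothetical helper: one Properties object per session factory.
final class CacheSettings {

    static Properties forSessionFactory(String regionPrefix) {
        Properties props = new Properties();
        props.setProperty("hibernate.cache.use_second_level_cache", "true");
        props.setProperty("hibernate.cache.use_query_cache", "true");
        // A distinct prefix keeps each factory's data in its own subtree.
        props.setProperty("hibernate.cache.region_prefix", regionPrefix);
        return props;
    }

    public static void main(String[] args) {
        Properties appA = forSessionFactory("appA");
        Properties appB = forSessionFactory("appB");
        System.out.println(appA.getProperty("hibernate.cache.region_prefix")); // prints appA
        System.out.println(appB.getProperty("hibernate.cache.region_prefix")); // prints appB
    }
}
```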
Copyright © 2009 Red Hat, Inc.