
Chapter 5. Repositories

5.1. Repository connectors
5.2. Repository Service
5.3. Out-of-the-box repository connectors
5.3.1. In-memory connector
5.3.2. JBoss Cache connector
5.3.3. Federating connector
5.4. Writing custom connectors
5.4.1. Creating the Maven 2 project
5.4.2. Implementing a RepositorySource
5.4.3. Implementing a RepositoryConnection
5.4.4. Testing custom connectors
5.4.5. Configuring and deploying custom connectors
5.5. Graph API for using connectors
5.6. Summary

There is a lot of information stored in many different places: databases, repositories, SCM systems, registries, file systems, services, etc. The purpose of the federation engine is to allow applications to use the JCR API to access that information as if it were all stored in a single JCR repository, but to really leave the information where it is.

Why not just copy or move the information into a JCR repository? Moving it is probably pretty difficult, since most likely there are existing applications that rely upon that information being where it is. All of those applications would break or have to change. And copying the information means that we'd have to continually synchronize the changes. This is not only a lot of work, but it often makes it difficult to know whether the information is accurate and which copy is "the master" data.

JBoss DNA lets us leave the information where it is, yet access it through the JCR API as if it were in one big repository. One major benefit is that existing applications that use the information in the original locations don't break, since they can keep using the information. But now our JCR clients can access all of that information, too. And if our federating JBoss DNA repository is configured to allow updates, JCR client applications can change the information in the repository and JBoss DNA will propagate those changes down to the original source, making those changes visible to all the other applications.

In short, all clients see the correct information, even when it changes in the underlying systems. But the JCR clients can get to all of the information in one spot, using one powerful standard API.

With JBoss DNA, your applications use the JCR API to work with the repository, but the DNA repository transparently fetches the information from different kinds of repositories and storage systems, not just a single purpose-built store. This is fundamentally what makes JBoss DNA different.

How does JBoss DNA do this? At the heart of JBoss DNA and its JCR implementation is a simple graph-based repository connector system. Essentially, JBoss DNA's JCR implementation uses a single repository connector to access all content:


That single repository connector could use an in-memory repository, a JBoss Cache instance (including those that are clustered and replicated), or a federated repository where content from multiple sources is unified.


Really, the federated connector gives us all kinds of possibilities, since we can layer that connector on top of lots of connectors to other individual sources. This simple connector architecture, along with a good library of connectors (which is what we're planning to create), is fundamentally what makes JBoss DNA so powerful and flexible.

For instance, we want to build a connector to other JCR repositories, and another that accesses the local file system. We've already started on a Subversion connector, which will allow JCR to access the files in an SVN repository (and perhaps push changes into SVN through a commit). And of course we want to create a connector that accesses data and metadata from relational databases. For more information, check out our roadmap. Of course, if we don't have a connector to suit your needs, you can write your own.


It's even possible to put a different API layer on top of the connectors. For example, the New I/O (JSR-203) API offers the opportunity to build new file system providers. This would be very straightforward to put on top of a JCR implementation, but it could be made even simpler by putting it on top of a DNA connector. In both cases, it'd be a trivial mapping from nodes that represent files and folders into JSR-203 files and directories, and events on those nodes could easily be translated into JSR-203 watch events. Then, simply choose a DNA connector and configure it to use the source you want to use.


Before we go further, let's define some terminology regarding connectors.

As an example, consider that we want JBoss DNA to give us access through JCR to the schema information contained in a relational database. We first have to develop a connector that allows us to interact with relational databases using JDBC. That connector would contain a JdbcRepositorySource Java class that implements RepositorySource, and that has all of the various JavaBean properties for setting the name of the driver class, URL, username, password, and other properties. (Or we might have a JavaBean property that defines the JNDI name where we can find a JDBC DataSource instance pointing to our JDBC database.)

Our new connector would also have a JdbcRepositoryConnection Java class that implements the RepositoryConnection interface. This class would probably wrap a JDBC database connection, and would implement the execute(...) method such that the nodes exposed by the connector describe the schema of the database. For example, the connector might represent each database table as a node with the table's name, with properties that describe the table (e.g., the description, whether it's a temporary table), and with child nodes that represent each of the columns, keys and constraints.

To use our connector in an application that uses JBoss DNA, we need to create an instance of the JdbcRepositorySource for each database instance that we want to access. If we have 3 MySQL databases, 9 Oracle databases, and 4 PostgreSQL databases, then we'd need to create a total of 16 JdbcRepositorySource instances, each with the properties describing a single database instance. Those sources are then available for use by JBoss DNA components, including the JCR implementation.
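
Continuing this hypothetical example, configuring one of those source instances might look something like the sketch below. The JdbcRepositorySource class and its setters (other than setName) are purely illustrative and do not ship with JBoss DNA.

// Hypothetical connector class; the setters mirror the JavaBean properties described
// above and are illustrative only.
JdbcRepositorySource customers = new JdbcRepositorySource();
customers.setName("CustomersDB");                        // the name used to look up the source later
customers.setDriverClassName("com.mysql.jdbc.Driver");   // JDBC driver class
customers.setUrl("jdbc:mysql://dbhost:3306/customers");  // database URL
customers.setUsername("dna");
customers.setPassword("secret");
// Or, instead of the individual connection details, point at an existing DataSource in JNDI:
// customers.setDataSourceJndiName("java:/CustomersDS");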

So far we've learned what a repository connector is and how connectors are used to establish connections to the underlying sources and access the content in those sources. In the next section, we'll show how these source instances can be configured and managed, and how their connections are pooled. After that, we'll review JBoss DNA's existing connectors and show how to create your own.

The JBoss DNA RepositoryService is the component that manages the repository sources and the connections to them. RepositorySource instances can be programmatically added to the service, but the service can actually read its configuration from a configuration repository (which, by the way, is represented by just another RepositorySource instance that's usually added programmatically to the service). The service connects to the configuration repository, reads the content in a particular area, and automatically sets up the RepositorySource instances per the information found in the configuration repository.

The RepositoryService also transparently maintains for each source a pool of reusable connections. The pooling properties can be controlled via the configuration repository, or adjusted programmatically.

Using a repository, then, involves simply asking the RepositoryService for a RepositoryConnection to the repository given the repository's name. If a source exists with that name, the service checks out a connection from the source's pool. The resulting connection is actually a wrapper around the underlying pooled connection, so the component that requested the connection can simply close it, and under the covers the actual connection is simply returned to the pool.

To instantiate the RepositoryService, we need to first have a few other objects:

  • An ExecutionContextFactory instance, as discussed earlier.

  • A RepositoryLibrary instance that manages the list of RepositorySource instances, properly injects the execution contexts into each repository source, and provides a configurable pool of connections for each source.

  • A configuration repository that contains descriptions of all of the repository sources as well as any information those sources need. Because this is a regular repository, this could be a simple repository with content loaded from an XML file, or it could be a shared central repository with information about all of the JBoss DNA repositories used across your organization.

With these components in place, we can then instantiate the RepositoryService and start it (using its ServiceAdministrator). During startup, the service reads the configuration repository and loads any defined RepositorySource instances into the repository library, using the class loader factory (available in the ExecutionContext) to load the RepositorySource implementation classes.

Here's sample code that shows how to set up and start the repository service. You can see something similar in the example application in the startRepositories() method of the org.jboss.example.dna.repository.RepositoryClient class.

 // Create the factory for execution contexts, and create one ...
 ExecutionContextFactory contextFactory = new BasicExecutionContextFactory();
 ExecutionContext context = contextFactory.create();

 // Create the library for the RepositorySource instances ...
 RepositoryLibrary sources = new RepositoryLibrary(contextFactory);

 // Load into the source manager the repository source for the configuration repository ...
 InMemoryRepositorySource configSource = new InMemoryRepositorySource();
 configSource.setName("Configuration");
 sources.addSource(configSource);

 // Now instantiate the Repository Service ...
 RepositoryService service = new RepositoryService(sources, configSource.getName(), context);
 service.getAdministrator().start();
 

After startup completes, the repositories are ready to be used. The client application obtains the list of repositories and presents them to the user. When the user selects one, the client application begins navigating that repository at its root node (e.g., the "/" path). As the user types commands to list the contents of the current node or to "change directories" to a different node, the client application obtains the information for the node using a simple procedure:

  1. Get a connection to the repository.

  2. Using the connection, find the current node and read its properties and children, putting the information into a plain old Java object (POJO).

  3. Close the connection to the repository (in a finally block to ensure it always happens), as sketched below.
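
Here's a rough sketch of that procedure, where 'sources' and 'context' are the objects created in the startup example above. The way the connection is obtained (createConnection(...)), the ReadNodeRequest constructor, and the accessors used to read the results off the request (getProperties() and getChildren()) are assumptions on our part, so check the JavaDocs for the exact signatures.

Location location = ...  // identifies the current node by path and/or identification properties
RepositoryConnection connection = sources.createConnection("MyRepository");  // factory method name assumed
try {
    // Ask the connector for the node's properties and children in a single request ...
    ReadNodeRequest request = new ReadNodeRequest(location);  // constructor form assumed
    connection.execute(context, request);
    // ... and copy the results into a simple POJO (accessor names assumed)
    for ( Property property : request.getProperties() ) { ... }
    for ( Location child : request.getChildren() ) { ... }
} finally {
    connection.close();  // under the covers, the connection is simply returned to the pool
}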

A number of repository connectors are already available in JBoss DNA, and are outlined in the following sections. Note that we do want to build more connectors in the upcoming releases.

The in-memory repository connector is a simple connector that creates a transient, in-memory repository. This repository is used as a very simple in-memory cache or as a standalone transient repository.

The InMemoryRepositorySource class provides a number of JavaBean properties that control its behavior:

Table 5.1. InMemoryRepositorySource properties

  • name: The name of the repository source, which is used by the RepositoryService when obtaining a RepositoryConnection by name.

  • jndiName: Optional property that, if used, specifies the name in JNDI where an InMemoryRepository instance can be found. This is an advanced property that is infrequently used.

  • rootNodeUuid: Optional property that, if used, defines the UUID of the root node in the in-memory repository. If not used, then a new UUID is generated.

  • retryLimit: Optional property that, if used, defines the number of times that any single operation on a RepositoryConnection to this source should be retried following a communication failure. The default value is '0'.

  • defaultCachePolicy: Optional property that, if used, defines the default for how long the information provided by this source may be cached by other, higher-level components. The default value of null implies that this source does not define a specific duration for caching information provided by this repository source.

The JBoss Cache repository connector allows a JBoss Cache instance to be used as a JBoss DNA (and thus JCR) repository. This provides a repository that is an effective, scalable, and distributed cache, and is often paired with other repository sources to provide a local or federated repository.

The JBossCacheSource class provides a number of JavaBean properties that control its behavior:

Table 5.2. JBossCacheSource properties

  • name: The name of the repository source, which is used by the RepositoryService when obtaining a RepositoryConnection by name.

  • cacheFactoryJndiName: Optional property that, if used, specifies the name in JNDI where an existing JBoss Cache Factory instance can be found. That factory would then be used if needed to create a JBoss Cache instance. If no value is provided, then the JBoss Cache DefaultCacheFactory class is used.

  • cacheConfigurationName: Optional property that, if used, specifies the name of the configuration that is supplied to the cache factory when creating a new JBoss Cache instance.

  • cacheJndiName: Optional property that, if used, specifies the name in JNDI where an existing JBoss Cache instance can be found. This should be used if your application already has a cache that is used, or if you need to configure the cache in a special way.

  • uuidPropertyName: Optional property that, if used, defines the property that should be used to find the UUID value for each node in the cache. "dna:uuid" is the default.

  • retryLimit: Optional property that, if used, defines the number of times that any single operation on a RepositoryConnection to this source should be retried following a communication failure. The default value is '0'.

  • defaultCachePolicy: Optional property that, if used, defines the default for how long the information provided by this source may be cached by other, higher-level components. The default value of null implies that this source does not define a specific duration for caching information provided by this repository source.

The federated repository source provides a unified repository consisting of information that is dynamically federated from multiple other RepositorySource instances. This is a very powerful repository source that appears to be a single repository, when in fact the content is stored and managed in multiple other systems. Each FederatedRepositorySource is typically configured with the name of another RepositorySource that should be used as the local, unified cache of the federated content. The configuration also contains the names of the other RepositorySource instances that are to be federated along with the Projection definition describing where in the unified repository the content is to appear.


The federation connector works by effectively building up a single graph by querying each source and merging or unifying the responses. This information is cached, which improves performance, reduces the number of (potentially expensive) remote calls, reduces the load on the sources, and helps mitigate problems with source availability. As clients interact with the repository, this cache is consulted first. When the requested portion of the graph (or "subgraph") is contained completely in the cache, it is returned immediately. However, if any part of the requested subgraph is not in the cache, each source is consulted for its contributions to that subgraph, and any results are cached.

This basic flow makes it possible for the federated repository to build up a local cache of the integrated graph (or at least the portions that are used by clients). In fact, the federated repository caches information in a manner that is similar to that of the Domain Name System (DNS). As sources are consulted for their contributions, the source also specifies whether it is the authoritative source for this information (some sources that are themselves federated may not be the information's authority), whether the information may be modified, the time-to-live (TTL) value (the time after which the cached information should be refreshed), and the expiration time (the time after which the cached information is no longer valid). In effect, the source has complete control over how the information it contributes is cached and used.

The federated repository also needs to incorporate negative caching, which is the storage of the knowledge that something does not exist. Sources can be configured to contribute information only below certain paths (e.g., /A/B/C), and the federation engine can take advantage of this by never consulting that source for contributions to information on other paths. Below those paths, however, any negative responses must also be cached (with appropriate TTL and expiration parameters), both so that the source isn't permanently excluded (it may have information to contribute at a later time) and so that it isn't checked too frequently.

The federated repository uses the other RepositorySource instances that are to be federated, plus a RepositorySource that is to be used as the cache of the unified contents. These are configured in another RepositorySource that is treated as a configuration repository. The FederatedRepositorySource class uses JavaBean properties to define the name of the configuration repository and the path to the "dna:federation" node in that configuration repository containing the information about the cache and federated sources. The graph structure expected at this location is as follows:

<!-- Define the federation configuration. -->
<dna:federation xmlns:dna="http://www.jboss.org/dna"
                xmlns:jcr="http://www.jcp.org/jcr/1.0"
                dna:timeToCache="100000" >
    <!-- Define how the content in the 'Cache' source is to map to the federated cache -->
    <dna:cache>
        <dna:projection jcr:name="Cache" dna:projectionRules="/ => /" />
    </dna:cache>
    <!-- Define how the content in the two sources maps to the federated/unified repository.
         This example puts the 'Cars' and 'Aircraft' content underneath '/vehicles', but the
         'Configuration' content (which is defined by this file) will appear under '/'. -->
    <dna:projections>
        <dna:projection jcr:name="Cars" dna:projectionRules="/Vehicles => /" />
        <dna:projection jcr:name="Aircraft" dna:projectionRules="/Vehicles => /" />
        <dna:projection jcr:name="Configuration" dna:projectionRules="/ => /" />
    </dna:projections>
</dna:federation>

Note

We're using XML to represent a graph structure, since the two map pretty well. Each XML element represents a node, and XML attributes represent properties on a node. The name of the node is defined by either the jcr:name attribute (if it exists) or the name of the XML element. And we use XML namespaces to define the namespaces used in the node and property names. By the way, this is exactly how the XML graph importer works.

Notice that there is a cache projection and three source projections, and each projection defines one or more projection rules that are of the form:

pathInFederatedRepository => pathInSourceRepository

So, a projection rule /Vehicles => / projects the entire contents of the source so that it appears in the federated repository under the "/Vehicles" node.

The FederatedRepositorySource class provides a number of JavaBean properties that control its behavior:

Table 5.3. FederatedRepositorySource properties

  • name: The name of the repository source, which is used by the RepositoryService when obtaining a RepositoryConnection by name.

  • repositoryName: The name for the federated repository.

  • configurationSourceName: The name of the RepositorySource that should be used as the configuration repository, and which defines how this federated repository is to be set up and configured. This name is supplied to the RepositoryConnectionFactory that is provided to this instance when added to the RepositoryLibrary.

  • configurationSourcePath: The path to the node in the configuration repository below which a "dna:federation" node exists with the graph structure describing how this federated repository is to be configured.

  • securityDomain: Optional property that, if used, specifies the name of the JAAS application context that should be used to establish the execution context for this repository. This should correspond to the JAAS login configuration located within the JAAS login configuration file, and should be used only if a "username" property is defined.

  • username: Optional property that, if used, defines the name of the JAAS subject that should be used to establish the execution context for this repository. This should be used if a "securityDomain" property is defined.

  • password: Optional property that, if used, defines the password of the JAAS subject that should be used to establish the execution context for this repository. If the password is not provided but values for the "securityDomain" and "username" properties are, then authentication will use the default JAAS callback handlers.

  • retryLimit: Optional property that, if used, defines the number of times that any single operation on a RepositoryConnection to this source should be retried following a communication failure. The default value is '0'.

  • defaultCachePolicy: Optional property that, if used, defines the default for how long the information provided by this source may be cached by other, higher-level components. The default value of null implies that this source does not define a specific duration for caching information provided by this repository source.

There may come a time when you want to tackle creating your own repository connector. Maybe the connectors we provide out-of-the-box don't work with your source. Maybe you want to use a different cache system. Maybe you have a system that you want to make available through a JBoss DNA repository. Or, maybe you're a contributor and want to help us round out our library with a new connector. No matter what the reason, creating a new connector is pretty straightforward, as we'll see in this section.

Creating a custom connector involves the following steps:

  1. Create a Maven 2 project for your connector;

  2. Implement the RepositorySource interface, using JavaBean properties for each bit of information the implementation will need to establish a connection to the source system.

    Then, implement the RepositoryConnection interface with a class that represents a connection to the source. The execute(ExecutionContext, Request) method should process any and all requests that may come down the pike, and the results of each request can be put directly on that request.

    Don't forget unit tests that verify that the connector is doing what it's expected to do. (If you'll be committing the connector code to the JBoss DNA project, please ensure that the unit tests can be run by others that may not have access to the source system. In this case, consider writing integration tests that can be easily configured to use different sources in different environments, and try to make the failure messages clear when the tests can't connect to the underlying source.)

  3. Configure JBoss DNA to use your connector. This may involve just registering the source with the RepositoryService, or it may involve adding a source to a configuration repository used by the federated repository.

  4. Deploy the JAR file with your connector (as well as any dependencies), and make them available to JBoss DNA in your application.

Let's go through each one of these steps in more detail.

The first step is to create the Maven 2 project that you can use to compile your code and build the JARs. Maven 2 automates a lot of the work, and since you're already set up to use Maven, using Maven for your project will save you a lot of time and effort. Of course, you don't have to use Maven 2, but then you'll have to get the required libraries and manage the compiling and building process yourself.

Note

JBoss DNA may provide in the future a Maven archetype for creating connector projects. If you'd find this useful and would like to help create it, please join the community.

In lieu of a Maven archetype, you may find it easier to start with a small existing connector project. The dna-connector-inmemory project is small, but it may be tough to separate the stuff that every connector needs from the extra code and data structures that manage the content. See the subversion repository: http://anonsvn.jboss.org/repos/dna/trunk/extensions/dna-connector-inmemory/

You can create your Maven project any way you'd like. For examples, see the Maven 2 documentation. Once you've done that, just add the following dependency to your project's pom.xml dependencies section:



<dependency>
  <groupId>org.jboss.dna</groupId>
  <artifactId>dna-graph</artifactId>
  <version>0.3</version>
</dependency>
     

This is the only dependency required for compiling a connector - Maven pulls in all of the dependencies needed by the 'dna-graph' artifact. Of course, you'll still have to add dependencies for any library your connector needs to talk to its underlying system.

As for testing, you probably will want to add more dependencies, such as those listed here:



<dependency>
  <groupId>org.jboss.dna</groupId>
  <artifactId>dna-graph</artifactId>
  <version>0.3</version>
  <type>test-jar</type>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>org.jboss.dna</groupId>
  <artifactId>dna-common</artifactId>
  <version>0.3</version>
  <type>test-jar</type>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>junit</groupId>
  <artifactId>junit</artifactId>
  <version>4.4</version>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>org.hamcrest</groupId>
  <artifactId>hamcrest-library</artifactId>
  <version>1.1</version>
  <scope>test</scope>
</dependency>
<!-- Logging with Log4J -->
<dependency>
  <groupId>org.slf4j</groupId>
  <artifactId>slf4j-log4j12</artifactId>
  <version>1.4.3</version>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>log4j</groupId>
  <artifactId>log4j</artifactId>
  <version>1.2.14</version>
  <scope>test</scope>
</dependency>
     

Testing JBoss DNA connectors does not require a JCR repository or the JBoss DNA services. (For more detail, see the testing section.) However, if you want to do integration testing with a JCR repository and the JBoss DNA services, you'll need additional dependencies (e.g., dna-repository and any other extensions).

At this point, your project should be set up correctly, and you're ready to move on to writing the Java implementation for your connector.

As mentioned earlier, a connector consists of the Java code that is used to access content from a system. Perhaps the most important class that makes up a connector is the implementation of the RepositorySource. This class is analogous to JDBC's DataSource in that it is instantiated to represent a single instance of a system that will be accessed, and it contains enough information (in the form of JavaBean properties) so that it can create connections to the source.

Why is the RepositorySource implementation a JavaBean? Well, this is the class that is instantiated, usually reflectively, and so a no-arg constructor is required. Using JavaBean properties makes it possible to reflect upon the object's class to determine the properties that can be set (using setters) and read (using getters). This means that an administrative application can instantiate, configure, and manage the objects that represent the actual sources, without having to know anything about the actual implementation.
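
To see why this matters, here's a small sketch of how an administrative tool might instantiate and configure a source knowing nothing but its class name. Only the standard reflection and java.beans introspection APIs are used; the class name is hypothetical, and a real tool would handle the checked exceptions these calls declare.

// Instantiate the source reflectively from a configured class name (hypothetical class name) ...
Class<?> sourceClass = Class.forName("org.example.connector.jdbc.JdbcRepositorySource");
Object source = sourceClass.newInstance();  // relies on the required no-argument constructor

// ... then discover its JavaBean properties and set them, with no compile-time knowledge of the class.
java.beans.BeanInfo beanInfo = java.beans.Introspector.getBeanInfo(sourceClass);
for ( java.beans.PropertyDescriptor descriptor : beanInfo.getPropertyDescriptors() ) {
    if ( "name".equals(descriptor.getName()) ) {
        descriptor.getWriteMethod().invoke(source, "CustomersDB");
    }
}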

So, your connector will need a public class that implements RepositorySource and provides JavaBean properties for any kind of inputs or options required to establish a connection to and interact with the underlying source. Most of the semantics of the class are defined by the RepositorySource interface and the interfaces it inherits. However, there are a few characteristics that are worth mentioning here.

Each connector is responsible for determining whether and how long DNA is to cache the content made available by the connector. This is referred to as the caching policy, and consists of a time-to-live (TTL) value representing the number of milliseconds that a piece of data may be cached. After the TTL has passed, the information is no longer used.

DNA allows a connector to use a flexible and powerful caching policy. First, each connection returns the default caching policy for all information returned by that connection. Often this policy can be configured via properties on the RepositorySource implementation. This is optional, meaning the connector can return null if it does not wish to have a default caching policy.

Second, the connector is able to override its default caching policy on individual requests (which we'll cover in the next section). Again, this is optional, meaning that a null caching policy on a request implies that the request has no overridden caching policy.

Third, if the connector has no default caching policy and none is set on the individual requests, DNA uses whatever caching policy is set up for that component using the connector. For example, the federating connector allows a default caching policy to be specified, and this policy is used should the sources being federated not define their own caching policy.

In summary, a connector has total control over whether and for how long the information it provides is cached.
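
As a concrete illustration, a connector's default caching policy might look something like this sketch. It assumes CachePolicy is the simple interface described above, with a getTimeToLive() accessor that returns milliseconds; check the CachePolicy JavaDoc for the exact contract.

// A minimal caching policy: cache everything this connector returns for five minutes.
// (Assumes CachePolicy declares a getTimeToLive() accessor returning milliseconds.)
public class FiveMinuteCachePolicy implements CachePolicy {
    public long getTimeToLive() {
        return 5L * 60 * 1000;  // five minutes, expressed in milliseconds
    }
}

A connection's getDefaultCachePolicy() method (part of the RepositoryConnection interface shown later in this chapter) could then return an instance of this class, or null if the connector does not want a default policy.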

Sometimes it is necessary (or easier) for a RepositorySource implementation to look up an object in JNDI. One example of this is the JBoss Cache connector: while the connector can instantiate a new JBoss Cache instance, more interesting use cases involve JBoss Cache instances that are set up for clustering and replication, something that is generally difficult to configure in a single JavaBean. Therefore the JBossCacheSource has optional JavaBean properties that define how it is to look up a JBoss Cache instance in JNDI.

This is a simple pattern that you may find useful in your connector. Basically, if your source implementation can look up an object in JNDI, simply use a single JavaBean String property that defines the full name that should be used to locate that object in JNDI. Usually it's best to include "Jndi" in the JavaBean property name so that administrative users understand the purpose of the property. (And some may suggest that any optional property also use the word "optional" in the property name.)
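
Here's what that pattern might look like in a hypothetical source. The class and property names are illustrative; only the standard javax.naming API is used for the lookup.

// Hypothetical source illustrating the "look it up in JNDI if a JNDI name is configured" pattern.
public class MyRepositorySource /* implements RepositorySource */ {

    private String cacheJndiName;  // optional: full JNDI name of an existing cache instance

    public String getCacheJndiName() { return cacheJndiName; }
    public void setCacheJndiName( String cacheJndiName ) { this.cacheJndiName = cacheJndiName; }

    protected Object lookupCacheInJndi() throws javax.naming.NamingException {
        if (cacheJndiName == null || cacheJndiName.trim().length() == 0) {
            return null;  // nothing configured, so the source creates its own instance instead
        }
        javax.naming.Context jndi = new javax.naming.InitialContext();
        return jndi.lookup(cacheJndiName);  // whatever object was registered under that name
    }
}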

Another characteristic of a RepositorySource implementation is that it provides hints about which features it supports. This is defined on the interface as a method that returns a RepositorySourceCapabilities object. This class currently provides methods that say whether the connector supports updates, whether it supports same-name-siblings (SNS), and whether the connector supports listeners and events.

Note that these may be hard-coded values, or the connector's response may be determined at runtime by various factors. For example, a connector may interrogate the underlying system to decide whether it can support updates.

The RepositorySourceCapabilities can be used as is (the class is immutable), or it can be subclassed to provide more complex behavior. It is important, however, that the capabilities remain constant throughout the lifetime of the RepositorySource instance.
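
For example, a connector that determines at startup whether its underlying system allows writes might subclass the capabilities as in the sketch below. The supportsUpdates() method name is taken from the description above, so verify it (and the availability of a no-argument superclass constructor) against the RepositorySourceCapabilities JavaDoc.

// Sketch: capabilities whose "supports updates" answer is decided once, at construction time.
public class MyCapabilities extends RepositorySourceCapabilities {

    private final boolean updatesAllowed;

    public MyCapabilities( boolean updatesAllowed ) {
        this.updatesAllowed = updatesAllowed;  // e.g., the result of interrogating the underlying system
    }

    @Override
    public boolean supportsUpdates() {
        return updatesAllowed;
    }
}

The RepositorySource implementation would construct this object once and return the same instance thereafter, keeping the capabilities constant for the lifetime of the source.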

Note

Why a concrete class and not an interface? By using a concrete class, connectors inherit the default behavior. If additional capabilities need to be added to the class in future releases, connectors may not have to override the defaults. This provides some insulation against future enhancements to the connector framework.

As we'll see in the next section, the main method connectors have to process requests takes an ExecutionContext, which contains the JAAS security information of the subject performing the request. This means that the connector can use this to determine authentication and authorization information for each request.

Sometimes that is not sufficient. For example, it may be that the connector needs its own authorization information so that it can establish a connection (even if user-level privileges still use the ExecutionContext provided with each request). In this case, the RepositorySource implementation will probably need JavaBean properties that represent the connector's authentication information. This may take the form of a username and password, or it may be properties that are used to delegate authentication to JAAS. Either way, just realize that it's perfectly acceptable for the connector to require its own security properties.

One job of the RepositorySource implementation is to create connections to the underlying sources. Connections are represented by classes that implement the RepositoryConnection interface, and creating this class is the next step in writing a repository connector. This is what we'll cover in this section.

The RepositoryConnection interface is pretty straightforward:

/**
 * A connection to a repository source.
 * <p>
 * These connections need not support concurrent operations by multiple threads.
 * </p>
 */
@NotThreadSafe
public interface RepositoryConnection {

    /**
     * Get the name for this repository source. This value should be the same as that returned
     * by the same RepositorySource that created this connection.
     * 
     * @return the identifier; never null or empty
     */
    String getSourceName();

    /**
     * Return the transactional resource associated with this connection. The transaction manager 
     * will use this resource to manage the participation of this connection in a distributed transaction.
     * 
     * @return the XA resource, or null if this connection is not aware of distributed transactions
     */
    XAResource getXAResource();

    /**
     * Ping the underlying system to determine if the connection is still valid and alive.
     * 
     * @param time the length of time to wait before timing out
     * @param unit the time unit to use; may not be null
     * @return true if this connection is still valid and can still be used, or false otherwise
     * @throws InterruptedException if the thread has been interrupted during the operation
     */
    boolean ping( long time, TimeUnit unit ) throws InterruptedException;

    /**
     * Set the listener that is to receive notifications to changes to content within this source.
     * 
     * @param listener the new listener, or null if no component is interested in the change notifications
     */
    void setListener( RepositorySourceListener listener );

    /**
     * Get the default cache policy for this repository. If none is provided, a global cache policy
     * will be used.
     * 
     * @return the default cache policy
     */
    CachePolicy getDefaultCachePolicy();

    /**
     * Execute the supplied commands against this repository source.
     * 
     * @param context the environment in which the commands are being executed; never null
     * @param request the request to be executed; never null
     * @throws RepositorySourceException if there is a problem loading the node data
     */
    void execute( ExecutionContext context,
                  Request request ) throws RepositorySourceException;

    /**
     * Close this connection to signal that it is no longer needed and that any accumulated 
     * resources are to be released.
     */
    void close();
}

While most of these methods are straightforward, a few warrant additional information. The ping(...) method allows DNA to check the connection to see if it is alive. This method can be used in a variety of situations, ranging from verifying that a RepositorySource's JavaBean properties are correct to ensuring that a connection is still alive before returning it from a connection pool.

DNA hasn't yet defined the event mechanism, so connectors don't have any methods to invoke on the RepositorySourceListener. This will be defined in the next release, so feel free to manage the listeners now. Note that by default the RepositorySourceCapabilities returns false for supportsEvents().

The most important method on this interface, though, is the execute(...) method, which serves as the mechanism by which the component using the connector accesses and manipulates the content exposed by the connector. The first parameter to this method is the ExecutionContext, which contains the information about the environment as well as the subject performing the request. This was discussed earlier.

The second parameter, however, represents a request that is to be processed by the connector. Request objects can take many different forms, as there are different classes for each kind of request (see the table below). Each request contains the information a connector needs to do the processing, and it also is the place where the connector places the results (or the error, if one occurs).

How do the requests reference a node (or nodes)? Since requests are coming from a client, the client may identify a particular node using a Location object that is created with:

  • the Path to the node; or

  • one or more identification properties that are likely source-specific and that are represented with Property objects; or

  • a combination of both.

So, when a client knows the path or the identification properties, they can create a Location. However, all of the requests return Location objects, so oftentimes the client simply uses the location from a previous request. Since Location is an immutable class, it is perfectly safe to reuse them.

One more thing about locations: while the request may have an incomplete location (e.g., a path but no identification properties), the connector is expected to set on the request the actual location that contains the path and all identification properties. So as long as the client reuses the actual locations in subsequent requests, the connectors will have the benefit of having both the path and identification properties. Connectors can then be written to leverage this information, although the connector should still perform as expected when requests have incomplete locations.

Table 5.4. Types of Requests

  • ReadNodeRequest: A request to read from the source a node's properties and children. The node may be specified by path and/or by identification properties. The connector returns all properties and the locations for all children, or sets a PathNotFoundException error on the request if the node did not exist. If the node is found, the connector sets on the request the actual location of the node (including the path and identification properties).

  • ReadAllPropertiesRequest: A request to read from the source all of the properties of a node. The node may be specified by path and/or by identification properties. The connector returns all properties that were found on the node, or sets a PathNotFoundException error on the request if the node did not exist. If the node is found, the connector sets on the request the actual location of the node (including the path and identification properties).

  • ReadPropertyRequest: A request to read from the source a single property of a node. The node may be specified by path and/or by identification properties, and the property is specified by name. The connector returns the property if found on the node, or sets a PathNotFoundException error on the request if the node or property did not exist. If the node is found, the connector sets on the request the actual location of the node (including the path and identification properties).

  • ReadAllChildrenRequest: A request to read from the source all of the children of a node. The node may be specified by path and/or by identification properties. The connector returns an ordered list of locations for each child found on the node, an empty list if the node had no children, or sets a PathNotFoundException error on the request if the node did not exist. If the node is found, the connector sets on the request the actual location of the parent node (including the path and identification properties).

  • ReadBlockOfChildrenRequest: A request to read from the source a block of children of a node, starting with the nth child. This is designed to allow paging through the children, which is much more efficient for large numbers of children. The node may be specified by path and/or by identification properties, and the block is defined by a starting index and a count (i.e., the block size). The connector returns an ordered list of locations for each of the node's children found in the block, or an empty list if there are no children in that range. The connector also sets on the request the actual location of the parent node (including the path and identification properties), or sets a PathNotFoundException error on the request if the parent node did not exist.

  • ReadNextBlockOfChildrenRequest: A request to read from the source a block of children of a node, starting with the children that immediately follow a previously-returned child. This is designed to allow paging through the children, which is much more efficient for large numbers of children. The node may be specified by path and/or by identification properties, and the block is defined by the location of the node immediately preceding the block and a count (i.e., the block size). The connector returns an ordered list of locations for each of the node's children found in the block, or an empty list if there are no children in that range. The connector also sets on the request the actual location of the parent node (including the path and identification properties), or sets a PathNotFoundException error on the request if the parent node did not exist.

  • ReadBranchRequest: A request to read a portion of a subgraph that has as its root a particular node, up to a maximum depth. This request is an efficient mechanism when a branch (or part of a branch) is to be navigated and processed, and replaces some non-trivial code to read the branch iteratively using multiple ReadNodeRequests. The connector reads the branch to the specified maximum depth, returning the properties and children for all nodes found in the branch. The connector also sets on the request the actual location of the branch's root node (including the path and identification properties). The connector sets a PathNotFoundException error on the request if the node at the top of the branch does not exist.

  • CreateNodeRequest: A request to create a node at the specified location, setting on the new node the properties included in the request. The connector creates the node at the desired location, adjusting any same-name-sibling indexes as required. (If an SNS index is provided in the new node's location, existing children with the same name after that SNS index will have their SNS indexes adjusted. However, if the requested location does not include an SNS index, the new node is added after all existing children, and its SNS index is set accordingly.) The connector also sets on the request the actual location of the new node (including the path and identification properties). The connector sets a PathNotFoundException error on the request if the parent node does not exist.

  • RemovePropertiesRequest: A request to remove a set of properties on an existing node. The request contains the location of the node as well as the names of the properties to be removed. The connector performs these changes and sets on the request the actual location (including the path and identification properties) of the node. The connector sets a PathNotFoundException error on the request if the node does not exist.

  • UpdatePropertiesRequest: A request to set or update properties on an existing node. The request contains the location of the node as well as the properties to be set and those to be deleted. The connector performs these changes and sets on the request the actual location (including the path and identification properties) of the node. The connector sets a PathNotFoundException error on the request if the node does not exist.

  • RenameNodeRequest: A request to change the name of a node. The connector changes the node's name, adjusts all SNS indexes accordingly, and returns the actual locations (including the path and identification properties) of both the original location and the new location. The connector sets a PathNotFoundException error on the request if the node does not exist.

  • CopyBranchRequest: A request to copy a portion of a subgraph that has as its root a particular node, up to a maximum depth. The connector copies the branch from the original location, up to the specified maximum depth, and places a copy of the node as a child of the new location. The connector also sets on the request the actual location (including the path and identification properties) of the original location as well as the location of the new copy. The connector sets a PathNotFoundException error on the request if the node at the top of the branch does not exist.

  • MoveBranchRequest: A request to move a subgraph that has a particular node as its root. The connector moves the branch from the original location and places it as a child of the specified new location. The connector also sets on the request the actual location (including the path and identification properties) of the original and new locations. The connector will adjust SNS indexes accordingly. The connector sets a PathNotFoundException error on the request if the node that is to be moved or the new location do not exist.

  • DeleteBranchRequest: A request to delete an entire branch specified by a single node's location. The connector deletes the specified node and all nodes below it, and sets the actual location, including the path and identification properties, of the node that was deleted. The connector sets a PathNotFoundException error on the request if the node being deleted does not exist.

  • CompositeRequest: A request that actually comprises multiple requests (none of which will be a composite). The connector simply processes all of the requests in the composite request, but should set on the composite request any error (usually the first error) that occurs during processing of the contained requests.

Although there are already over a dozen different kinds of requests, we anticipate adding more in future releases. For example, DNA will likely support searching repository content in sources through an additional subclass of Request. Getting the version history for a node will likely be another kind of request added in an upcoming release.

A connector is technically free to implement the execute(...) method in any way, as long as the semantics are maintained. But DNA provides a RequestProcessor class that can simplify writing your own connector and at the same time help insulate your connector from new kinds of requests that may be added in the future. The RequestProcessor is an abstract class that defines a process(...) method for each concrete Request subclass. In other words, there is a process(CompositeRequest) method, a process(ReadNodeRequest) method, and so on.

To use this in your connector, simply create a subclass of RequestProcessor, overriding all of the abstract methods and optionally overriding any of the other methods that have a default implementation.
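
A skeleton of such a subclass might look like the following sketch. Only two of the process(...) overrides are shown (the remaining abstract methods are elided), the comments stand in for source-specific logic, and the constructor simply mirrors the way the processor is instantiated in the execute(...) example below.

// Sketch of a RequestProcessor subclass; the remaining abstract process(...) methods are omitted.
public class MyRequestProcessor extends RequestProcessor {

    public MyRequestProcessor( ExecutionContext context ) {
        super(context);  // matches the instantiation shown in the execute(...) example below
    }

    @Override
    public void process( ReadNodeRequest request ) {
        // Find the node in the underlying system using the request's location, then place its
        // properties, children, and actual Location on the request (or set an error on it).
    }

    @Override
    public void process( ReadAllChildrenRequest request ) {
        // Find the node and add to the request a Location for each of its children, in order.
    }
}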

Note

In many cases, the default implementations of the process(...) methods are sufficient but probably not efficient or optimum. If that is the case, simply provide your own methods that perform the request in a manner that is efficient for your source. However, if performance is not a big issue, all of the concrete methods will provide the correct behavior. And remember, you can always provide better implementations later.

Then, in your connector's execute(ExecutionContext, Request) method, instantiate your RequestProcessor subclass and call its process(Request) method, passing in the execute(...) method's Request parameter. The RequestProcessor will determine the appropriate method given the actual Request object and will then invoke that method:

public void execute( final ExecutionContext context,
                     final Request request ) throws RepositorySourceException {
    // Instantiate your RequestProcessor subclass (MyRequestProcessor is a hypothetical name) ...
    RequestProcessor processor = new MyRequestProcessor(context);
    processor.process(request);
}

If you do this, the bulk of your connector implementation may be in the RequestProcessor implementation methods. Not only is this pretty maintainable, it also lends itself to easier testing. And should any new request types be added in the future, your connector may work just fine without any changes. In fact, if the RequestProcessor class can implement meaningful methods for those new request types, your connector may "just work". Or, at least, your connector will still be binary compatible, even if it won't support any of the new features.

Finally, how should the connector handle exceptions? As mentioned above, each Request object has a slot where the connector can set any exception encountered during processing. This not only handles the exception, but in the case of CompositeRequests it also correctly associates the problem with the request. However, it is perfectly acceptable to throw an exception if the connection becomes invalid (e.g., there is a communication failure) or if a fatal error would prevent subsequent requests from being processed.

After building your connector project, you need to configure the JBoss DNA components your application is using so that they use your connector. In a lot of cases, this will entail instantiating your connector's RepositorySource class, setting the various properties, and registering it with a RepositoryLibrary. Or it may entail adding your source to a configuration repository and letting the RepositoryService instantiate and set up your RepositorySource instance. Or you can just instantiate and set it up manually, passing the instance to whatever component needs it.
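
For the programmatic route, registering a custom source looks just like registering the in-memory configuration source shown earlier; the source class and its connection property below are hypothetical.

// Instantiate and configure the custom source (hypothetical class and connection property) ...
MyRepositorySource mySource = new MyRepositorySource();
mySource.setName("My Source");
mySource.setConnectionInfo("...");  // whatever your connector needs to reach its system

// ... and register it with the library so that other JBoss DNA components can use it by name.
sources.addSource(mySource);  // 'sources' is the RepositoryLibrary created earlier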

And of course you have to make the JAR file containing your connector (as well as any dependency JARs) available to your application's classpath.

So far we've talked about repositories, repository connectors, and how connectors respond to the different kinds of requests. Normally you'd code to the JCR API and use our JCR implementation. However, what does your code look like if you want to use the connectors directly, without using our JCR implementation? After all, you may be a contributor to JBoss DNA, or you may want to take advantage of our connectors without all the overhead of JCR.

One option, of course, is to explicitly create the different requests and pass them to the connector's execute(...) method. While this is the most efficient approach (and one taken in some key DNA components), you probably want something that is much less verbose and much easier to use. This is where the DNA graph API comes in.

JBoss DNA's Graph API was designed as a lightweight public API for working with graph information, and it insulates components from the underlying requests and from interacting with connectors. The Graph class is the primary class in the API, and each instance represents a single, independent view of the graph of content from a single connector. Graph instances return snapshots of state, and those snapshots never change after they're retrieved. To obtain a Graph instance, use the static create(...) method, supplying the name of the source, a RepositoryConnectionFactory from which a RepositoryConnection can be obtained, and the ExecutionContext.

The Graph class basically represents an internal domain-specific language (DSL), designed to be easy to use in an application. The Graph API makes extensive use of interfaces and method chaining, so that methods return a concise interface that has only those methods that make sense at that point. In fact, this should be really easy to use if your IDE has code completion. Just remember that under the covers, a Graph is just building Request objects, submitting them to the connector, and then exposing the results.

Let's look at some examples of how the Graph API works. This first example returns a map of properties (keyed by property name) for a node at a specific Path:

Path path = ...
Map<Name,Property> propertiesByName = graph.getPropertiesByName().on(path);

This next example shows how the graph can be used to obtain and loop over the properties of a node:

Path path = ...
for ( Property property : graph.getProperties().on(path) ) {
    ...
}

Likewise, the next example shows how the graph can be used to obtain and loop over the children of a node:

Path path = ...
for ( Location child : graph.getChildren().of(path) ) {
    Path childPath = child.getPath();
    ...
}

Notice that the examples pass a Path instance to the on(...) and of(...) methods. Many of the Graph API methods accept a variety of parameter types, including String, Path, Location, UUID, and Property parameters. This should make the API easy to use in many different situations.

Of course, changing content is more interesting and offers more possibilities. Here are a few examples:

Path path = ...
Location location = ...
Property idProp1 = ...
Property idProp2 = ...
UUID uuid = ...
graph.move(path).into(idProp1, idProp2);
graph.copy(path).into(location);
graph.delete(uuid);
graph.delete(idProp1,idProp2);

The methods shown above work immediately, as soon as each request is built. However, there is another way to use the Graph object, and that is in a batch mode. Simply create a Graph.Batch object using the batch() method, create the requests on that batch object, and then execute all of the commands on the batch by calling its execute() method. That execute() method returns a Results interface that can be used to read the node information retrieved by the batched requests.

Method chaining works really well with the batch mode, since multiple commands can be assembled together very easily:

Path path = ...
String path2 = ...
Location location = ...
Property idProp1 = ...
Property idProp2 = ...
UUID uuid = ...
UUID uuid2 = ...
graph.batch().move(path).into(idProp1, idProp2).and().copy(path2).into(location).and().delete(uuid).execute();
Results results = graph.batch().read(path2).
                    and().readChildren().of(idProp1,idProp2).
                    and().readSubgraphOfDepth(3).at(uuid2).
                    execute();
for ( Location child : results.getNode(path2) ) {
    ...
}

Of course, this section provided just a hint of the Graph API. The Graph interface is actually quite complete and offers a full-featured approach for reading and updating a graph. For more information, see the Graph JavaDocs.

In this chapter, we covered all the aspects of JBoss DNA repositories, including the connector framework, how DNA's JCR implementation works with connectors, what connectors are available (and how to use them), and how to write your own connector. So now that you know how to set up and use JBoss DNA repositories, the next chapter describes how you can leverage JBoss DNA's JCR implementation.