Basic Persistence using XStream

By Mark Halloran, OCI Senior Software Engineer

November 2009

Introduction

Many tasks in software engineering provide unique opportunities to fail. Persistence is a very good example: we, the application engineers, have to live with our decisions for an extended period of time, and decisions based on initial information may prove incorrect once more information is known.

XStream is a simple library to serialize objects to XML and back again. Using it correctly to stream persistent class data to XML provides a sound and stable base from which persistence can be accomplished.

In this article, I demonstrate some basic tenets to be considered when designing persistence using XStream for an application.

Establish a dichotomy between an application's model and behavior
Protect against refactoring during ongoing development
Incorporate extensibility to your persistence model (versioning)
Writing and reading the persisted data
Consider the application lifecycle when determining what to persist

Along with this discussion, I am providing code fragments to illustrate particular points.

XStream can be obtained from http://xstream.codehaus.org/ and is available under a BSD license. You will also need the open-source XPP3 (XML Pull Parser), which can be obtained from http://www.extreme.indiana.edu/dist/java-repository/xpp3/distributions/

I am using Java2D objects to represent a portion of the application's state.

Establish a dichotomy between an application's model and behavior

Dichotomy (Webster's Dictionary): the division into two especially mutually exclusive or contradictory groups or entities; also: the process or practice of making such a division

The object-oriented concept of encapsulation provides benefits of co-locating attributes and behavior. Persistence requires we separate them, allowing a component of an application to be saved and subsequently restored to a usable state.

Frequently, establishing the correct dichotomy between state and behavior can be a source of problems. It serves one well to pay careful attention to establishing a sound and consistent separation.

In our example, we have defined an abstract class GeneralArea from which we extend a GeneralRect and GeneralCirc. These areas can be added to an assembly (which we'll introduce later):

public abstract class GeneralArea {
    private static final String GENERAL_AREA_VERSION = "generalAreaVersion";
 
    protected static final double DEFAULT_ORIGIN_X = 0.0;
    protected static final double DEFAULT_ORIGIN_Y = 0.0;
    public static final double AREA_EPSILON = 1e-12;
 
    private double originX = DEFAULT_ORIGIN_X;
    private double originY = DEFAULT_ORIGIN_Y;
    private int generalAreaVersion = 1;
    [...]
}
 
public class GeneralRect extends GeneralArea {
    private static final double DEFAULT_WIDTH = 2.0;
    private static final double DEFAULT_HEIGHT = 1.0;
 
    // persisted data.
    public double width = DEFAULT_WIDTH;
    public double height = DEFAULT_HEIGHT;
    [...]
}
 
public class GeneralCirc extends GeneralArea {
    private static final double DEFAULT_DIAMETER = 1.0;
 
    // persisted data.
    public double diameter = DEFAULT_DIAMETER;
    [...]
}

We certainly need to maintain attributes such as the origin of any area, the width and height of a rectangle, and the diameter of a circle. These are the attributes required to define the shape of the area. But if we also wish to draw these areas using Java2D, we'll need an object of type java.awt.geom.Area. We'll add an attribute to our GeneralArea class to hold on to that java.awt.geom.Area.

If you consider the application's lifecycle, you will notice that the area attribute can always be re-generated from the input values of origin and size. As we extend our dynamic model to maintain the created area from our general areas, we will choose to not persist the generated Java2D Area.

import java.awt.geom.Area;
 
public abstract class GeneralArea {
    private static final String GENERAL_AREA_AREA = "area";
    private transient Area area;
 
    protected static final double DEFAULT_ORIGIN_X = 0.0;
    protected static final double DEFAULT_ORIGIN_Y = 0.0;
    public static final double AREA_EPSILON = 1e-12;
 
    private double originX = DEFAULT_ORIGIN_X;
    private double originY = DEFAULT_ORIGIN_Y;
    [...]
}

Notice that the area attribute has been declared as transient. This is one way of indicating to XStream that serialization (persistence for us) is not required.
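To see the effect of transient in isolation, here is a minimal sketch. It uses JDK serialization rather than XStream so that it runs without extra jars; both skip transient fields. The class RectSketch and its roundTrip helper are hypothetical and are not part of the article's source code.

```java
import java.awt.geom.Area;
import java.awt.geom.Rectangle2D;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a persistent shape; not the article's GeneralRect.
class RectSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    // Persisted inputs.
    private final double originX;
    private final double originY;
    private final double width;
    private final double height;

    // Derived state: never written, rebuilt on demand.
    // java.awt.geom.Area is not Serializable, so JDK serialization
    // could not write it even if we wanted to.
    private transient Area area;

    RectSketch(double originX, double originY, double width, double height) {
        this.originX = originX;
        this.originY = originY;
        this.width = width;
        this.height = height;
    }

    // Lazily regenerates the Java2D area from the persisted inputs.
    Area getArea() {
        if (area == null) {
            area = new Area(new Rectangle2D.Double(originX, originY, width, height));
        }
        return area;
    }

    boolean isAreaCached() {
        return area != null;
    }

    // Writes the object to a byte array and reads it back (JDK serialization).
    static RectSketch roundTrip(RectSketch source) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(source);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (RectSketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }
}
```

After a round trip the restored object has no cached area, and the first call to getArea() rebuilds it from the origin and size.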

A second, more explicit approach is to tell XStream directly that a particular attribute (or field) should be omitted, using:

   XStream.omitField(Class definedIn, String fieldName);

I prefer utilizing both approaches to ensure follow-on developers do not mistake which attributes are, and are not, intended targets of persistence.

I provide a method ...

public static void setupXStream(XStream xstream);

... in each of my persistent classes to provide for customization of XStream.

The class method in GeneralArea looks like this:

public static void setupXStream(XStream xstream) {
    [...]
    xstream.omitField(GeneralArea.class, GENERAL_AREA_AREA);
    [...]
 
    GeneralRect.setupXStream(xstream);
    GeneralCirc.setupXStream(xstream);
}

Notice that I also force the setup for derived classes, a simplification I hope you'll excuse for brevity's sake.

We now can distinguish and indicate to XStream the attributes we wish to persist and those we wish to ignore. By doing this, we have limited our exposure from persisting classes over which we do not have direct control. Consider this when choosing how to maintain state, especially when GUI classes are involved; these can become a detriment as an application matures.

Protect against refactoring during ongoing development

As we evolve our designs, we often move classes from one package into another. By default, XStream uses the fully qualified class name to identify persisted objects. This removes ambiguity between persisted classes in different packages that have the same name. However, it also prevents a class from being recognized after it has been refactored into another package.

To address this, XStream provides the following convenience method, which allows a class to be recognized after a move.

   xstream.alias(String nameToAlias, Class classToAlias);

Exercising care in naming is very important to disambiguate persisted objects. Remember to consider ancillary classes that may be included in your class to make sure you've accounted for each of the included classes.

Incorporate Extensibility to Persisted Objects

An application maintaining persistent data will undergo changes as requirements are added, changed, or re-implemented. Providing versioning for each of these objects allows new implementations to accept and modify legacy persisted objects when previous versions of data are restored.

Adding requirements

I will introduce the concept of an assembly. This serves to collect GeneralAreas. Further, we want to be able to add and subtract areas from this assembly; for instance, we can now add a positive rectangle and then subtract a negative circle, yielding a hole in the rectangle.
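The add-then-subtract idea can be tried directly with Java2D before any assembly class exists. The sketch below is only an illustration; the class HoleSketch and the chosen dimensions are assumptions, not taken from the article's source code.

```java
import java.awt.geom.Area;
import java.awt.geom.Ellipse2D;
import java.awt.geom.Rectangle2D;

// Hypothetical helper illustrating a positive rectangle with a negative circle.
class HoleSketch {

    // Builds a 2.0 x 1.0 rectangle at the origin, then removes a circle of
    // diameter 0.5 centered at (1.0, 0.5), leaving a hole in the rectangle.
    static Area rectangleWithHole() {
        Area result = new Area(new Rectangle2D.Double(0.0, 0.0, 2.0, 1.0));
        Area circle = new Area(new Ellipse2D.Double(0.75, 0.25, 0.5, 0.5));
        result.subtract(circle);
        return result;
    }
}
```

A point near a corner of the rectangle is still contained in the result, while the center of the circle is not.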

Providing the assembly

We'll make the GeneralArea responsible for maintaining its negative state. The assembly is responsible only for maintaining the [ordered] collection of GeneralAreas. Here's our new class:

public class Assembly {
 
    // Tags used to facilitate XStream.
    //
    private static final String XSTREAM_ID = "Assembly";
    private static final String ASSEMBLY_VERSION = "assemblyVersion";
    private static final String GENERAL_AREAS = "generalAreas";
    private static final String CHANGE_SUPPORT = "changeSupport";
    private static final String XSTREAM = "xstream";
 
    // NON-PERSISTED data, must be regenerated on read or access.
    //
    private transient XStream xstream;
    private transient Area area = null;
 
    // Persisted attributes.
    //
    protected List<GeneralArea> generalAreas = new ArrayList<GeneralArea>();
    protected int assemblyVersion = 1;
 
    public Assembly() {
        initializeXStream();
    }
 
    public void add(GeneralArea generalArea) {
        area = null;
        generalAreas.add(generalArea);
    }
 
    public Area getArea() {
        if (area == null) {
            area = new Area();
            for (GeneralArea generalArea : generalAreas) {
                if (generalArea.isNegative()) {
                    area.subtract(generalArea.getArea());
                } else {
                    area.add(generalArea.getArea());
                }
            }
        }
        return area;
    }
    [...]
    private void initializeXStream() {
        if (xstream == null) {
            xstream = new XStream(new XppDomDriver());
            setupXStream(xstream);
        }
    }
 
    public static void setupXStream(XStream xstream) {
        xstream.alias(XSTREAM_ID, Assembly.class);
        xstream.useAttributeFor(Assembly.class, ASSEMBLY_VERSION);
        // Both fields are declared in Assembly, so Assembly is the owning class.
        xstream.addImplicitCollection(Assembly.class, GENERAL_AREAS);
 
        xstream.omitField(Assembly.class, XSTREAM);
 
        GeneralArea.setupXStream(xstream);
    }
 
    public String saveToXStream() {
        initializeXStream();
        return xstream.toXML(this);
    }
}

Notice that the area attribute is transient, and I use lazy evaluation to acquire a valid summation of the collected generalAreas. I do this to simplify the difference between a constructed model, a read-in model, or (as you can see in the add method) a modified model.

I'm making the assumption that at any or all times, the overall area may not have been generated. This provides consistent access to the model throughout its lifetime.

Modifying the GeneralArea

Next, we add a boolean isNegative attribute to GeneralArea. Since we have maintained a version attribute that is streamed in and out, now we will bump the version and introduce a new method:

private Object readResolve();

As in JDK serialization, readResolve is called on the instantiated object of a class after its fields have been populated. This gives us a place to assign defaults and to perform version maintenance on the data. We will use it to perform version maintenance:

private Object readResolve() {
    if (generalAreaVersion < GENERAL_AREA_VERSION_2) {
        isNegative = false;
        generalAreaVersion = GENERAL_AREA_VERSION_2;
    }
    return this;
}
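Because readResolve is honored by JDK serialization as well, the upgrade step can be exercised without XStream on the classpath. The following is a minimal sketch under that assumption; the class VersionedAreaSketch, its factory methods, and the VERSION_1 and VERSION_2 constants are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical versioned object; not the article's GeneralArea.
class VersionedAreaSketch implements Serializable {
    private static final long serialVersionUID = 1L;
    static final int VERSION_1 = 1;
    static final int VERSION_2 = 2;

    private int version;
    private boolean negative;

    private VersionedAreaSketch(int version, boolean negative) {
        this.version = version;
        this.negative = negative;
    }

    // Simulates an object as an older release would have written it.
    static VersionedAreaSketch legacy() {
        return new VersionedAreaSketch(VERSION_1, false);
    }

    static VersionedAreaSketch current(boolean negative) {
        return new VersionedAreaSketch(VERSION_2, negative);
    }

    // Called by the serialization machinery after the fields are populated.
    private Object readResolve() {
        if (version < VERSION_2) {
            negative = false;      // default for data that predates the field
            version = VERSION_2;   // the object is now at the current version
        }
        return this;
    }

    int getVersion() {
        return version;
    }

    boolean isNegative() {
        return negative;
    }

    // Writes the object to a byte array and reads it back (JDK serialization).
    static VersionedAreaSketch roundTrip(VersionedAreaSketch source) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(source);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (VersionedAreaSketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }
}
```

A legacy object comes back at the current version, while an object already at the current version keeps its negative flag untouched.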

Writing and Reading the Persisted Data

XStream provides a method for outputting the data of your classes as XML. Note, however, that XStream does not generate the XML header; it produces only self-consistent chunks of your data that can be embedded in a complete XML document.

I do not pollute the object model with the responsibility of providing header data; I allow the user of the XML output to construct the document using the XML output string data describing my persistent structures.
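One way a consumer of the XML string might build the complete document is sketched below, using only the JDK. The class XmlDocumentSketch is hypothetical, and the fragment passed to it in the usage note is an invented placeholder rather than actual XStream output.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical caller-side helper; the object model stays free of header logic.
class XmlDocumentSketch {
    static final String XML_DECLARATION = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>";

    // Prepends the XML declaration to a fragment produced elsewhere.
    static String wrap(String fragment) {
        return XML_DECLARATION + "\n" + fragment;
    }

    // Parses the document to confirm it is well formed and returns the root name.
    static String rootElementName(String document) {
        try {
            Document parsed = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(document)));
            return parsed.getDocumentElement().getNodeName();
        } catch (Exception e) {
            throw new IllegalStateException("document is not well-formed XML", e);
        }
    }
}
```

For example, wrapping a placeholder fragment such as <Assembly assemblyVersion="1"></Assembly> yields a document that starts with the declaration and whose root element is Assembly.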

As the assembly is the highest level of this small architecture, I place the responsibility of encoding our classes at the Assembly level by implementing:

public String saveToXStream() {
    initializeXStream();
    return xstream.toXML(this);
}

After consideration, preparation, and implementation, it becomes that simple to extract your data model to XML.

Consider the application lifecycle when determining what to persist

Let's review some of the decisions I have made during development of this small set of code.

I chose not to persist data that is completely dependent upon more basic data. This simplifies assumptions regarding consistency. What happens if I save a modified but not updated model, when an origin has been edited, but the resulting Java2D Area has not been regenerated? By omitting values I can generate, I remove the issue.

There are also attributes in the included source code that exist only to provide behavior a working application needs, such as property-change support. Such an attribute contributes nothing to the model, only run-time behavior. Thus, I do not persist it. Rather, I use readResolve to reattach listeners for a streamed-in model.
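The pattern of rebuilding behavior-only attributes in readResolve can be sketched with the JDK alone. This example assumes java.beans.PropertyChangeSupport as the behavior-only attribute and uses JDK serialization in place of XStream; the class ObservableOriginSketch is hypothetical and is not the article's implementation.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical model object with one persisted value and one run-time helper.
class ObservableOriginSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    // Persisted model data.
    private double originX;

    // Run-time behavior only; never persisted.
    private transient PropertyChangeSupport changeSupport = new PropertyChangeSupport(this);

    void addPropertyChangeListener(PropertyChangeListener listener) {
        changeSupport.addPropertyChangeListener(listener);
    }

    void setOriginX(double newValue) {
        double oldValue = originX;
        originX = newValue;
        changeSupport.firePropertyChange("originX", oldValue, newValue);
    }

    double getOriginX() {
        return originX;
    }

    // Field initializers do not run on deserialization, so the transient
    // helper is null at this point and must be rebuilt here.
    private Object readResolve() {
        changeSupport = new PropertyChangeSupport(this);
        return this;
    }

    // Writes the object to a byte array and reads it back (JDK serialization).
    static ObservableOriginSketch roundTrip(ObservableOriginSketch source) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(source);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (ObservableOriginSketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }
}
```

In this sketch only the support object is rebuilt; any listeners still have to be registered again by whoever owns the restored model.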

Summary

I hope that this has provided a basic front-to-back understanding of utilizing XStream to provide persistence.

Please utilize the source code to further explore XStream usage.

References

[1] XStream
http://xstream.codehaus.org
[2] Code Examples (including an Eclipse classpath)
jnbNov2009.zip

Software Engineering Tech Trends (SETT) is a regular publication featuring emerging trends in software engineering.