Collaborating with the Java Memory Manager
By Brian Gilstrap, OCI Principal Software Engineer
June 2000
Java developers are all familiar with the Java garbage collector, which eliminates the need to explicitly deallocate objects. However, few know that there is a package in Java 2™ (JDK 1.2 and later) that allows them to coordinate their activities with the garbage collector.
This package, java.lang.ref, provides a number of classes that allow developers some interaction with the garbage collector.
Types of Object References
To understand how these java.lang.ref classes are useful, we must first understand the different types of references.
All developers use object references when programming.
For example, the following line of Java code declares a reference to a String and makes it refer to a string object with contents "hello":
String aHelloString = "hello";
The variable aHelloString is an object reference. In order to distinguish these garden-variety object references from the other kinds of references we will be discussing, Java calls this kind of reference a strong reference. This is because the String object referred to by aHelloString is held onto "strongly," meaning it can't be garbage collected as long as it is referred to in this manner.
In addition to strong references, Java 2 provides three other kinds of reference. They provide increasingly "weaker" references, meaning that each successive type places fewer restrictions on the garbage collector.
These three types of references are:
- soft references (java.lang.ref.SoftReference)
- weak references (java.lang.ref.WeakReference)
- phantom references (java.lang.ref.PhantomReference)
All of these reference types share certain common characteristics:
- They all refer to a single object, called the referent (see Figure 1 below).
- They are created with a constructor that takes the referent as an argument; from then on, the reference object either refers to that referent or, once cleared, to no object at all.
- There is a get() method, which can be used to get back a strong reference to the referent (if that reference hasn't been cleared).
- The garbage collector will clear the reference at its discretion when certain conditions are met (the conditions for each kind of reference are discussed below).
We create these other kinds of references by instantiating one of the reference classes in the java.lang.ref package (SoftReference, WeakReference, or PhantomReference). (There is also a java.lang.ref.Reference class, but it is an abstract base class for the other three and cannot be instantiated.)
Let's look at an example. The following line of code corresponds to Figure 1:
Reference aRef = new SoftReference( new String( "foo" ) );
While we created a SoftReference in the code, we could have created a WeakReference or a PhantomReference and achieved essentially the same state of affairs.
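To make the shared behavior concrete, here is a minimal sketch of creating reference objects and reading them back with get(). It uses generics, which postdate the JDK 1.2 release this article describes, and the class name ReferenceBasics and the helper refersTo are our own inventions for illustration. It also calls clear(), a method of Reference that does by hand what the collector would otherwise do at its discretion.

```java
import java.lang.ref.Reference;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceBasics {
    // Hypothetical helper: true if get() hands back the very object we expect.
    static boolean refersTo(Reference<?> ref, Object expected) {
        return ref.get() == expected;
    }

    public static void main(String[] args) {
        // The local variable is a strong reference, so the referent stays alive.
        String referent = new String("foo");

        Reference<String> soft = new SoftReference<>(referent);
        Reference<String> weak = new WeakReference<>(referent);

        // While the strong reference exists, neither reference will be cleared.
        System.out.println("soft refers to referent: " + refersTo(soft, referent));
        System.out.println("weak refers to referent: " + refersTo(weak, referent));

        // After clearing, get() has nothing left to return.
        weak.clear();
        System.out.println("weak cleared: " + (weak.get() == null));
    }
}
```

A PhantomReference is left out of this sketch because its constructor also takes a reference queue and its get() method behaves differently, as discussed later.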
In looking at Figure 1, we can see that these new kinds of reference objects add overhead in terms of space (there is another Java object on the heap, the reference object), and they also cost us some time (the garbage collector interacts with these references in certain ways, which takes time).
This leads us to a question: What features do these references offer that would make it worth the time and space to use them?
To understand this, we need to look at each kind of reference.
Soft References
Soft references are good for providing caches of objects that can be garbage collected when the Java Virtual Machine (JVM) is running low on memory.
In particular, the JVM guarantees a couple of useful things:
- It will not garbage collect an object as long as there are normal, strong references to it.
- It will not throw an OutOfMemoryError until it has "cleaned up" all softly reachable objects (objects not reachable by strong references but reachable through one or more soft references).
This allows us to use soft references to refer to objects that we could afford to have garbage collected, but that are convenient to have around until memory becomes tight.
Obviously, this means our program has to either be able to live without these objects or be able to recreate them. But we can let the garbage collector decide when to collect these objects, continuing the Java goal of letting the programmer avoid explicit memory management.
There is no specific timeframe after an object "goes soft" (becomes reachable through soft references but not through strong references) before the garbage collector collects the object, but the documentation encourages JVM implementors to prefer collecting older objects and to delay collection as long as is feasible.
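One way to picture such a cache is the sketch below. It is only an illustration under our own assumptions: the class name SoftCache, its get method, and the loader function that recreates a value are all hypothetical, and the code uses generics and java.util.function.Function, which are newer than the JDK 1.2 release discussed here.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical cache whose values may be reclaimed when memory gets tight.
class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> entries = new HashMap<>();
    private final Function<K, V> loader;   // assumed: knows how to recreate a value

    SoftCache(Function<K, V> loader) {
        this.loader = loader;
    }

    synchronized V get(K key) {
        SoftReference<V> ref = entries.get(key);
        // null means "never cached" or "the collector cleared the soft reference".
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = loader.apply(key);                      // recreate the object
            entries.put(key, new SoftReference<>(value));   // cache it softly
        }
        return value;
    }
}
```

A caller might write new SoftCache<String, byte[]>(name -> new byte[1024]) and simply call get("report") wherever the data is needed. Note that this sketch never removes map entries whose references have been cleared; a fuller version could use a reference queue (described later) to purge them.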
Weak References
Weak references are good for providing "canonicalizing mappings." These are mappings from some unique identifier to an object.
For example, imagine we have permanent storage (a file, database, etc.) containing a large number of Employee objects, and we only work with a subset of them at any given time (perhaps there are too many to fit in a running JVM on the machines we have).
Assume that we uniquely identify employees by an employee number. We could use weak references to refer to all employee objects currently in the JVM and provide a Map (or Hashtable) that maps from the employee number to a weak reference to the corresponding Employee object (see Figure 2).
When using this sort of approach, we have a "canonical" mapping between the employee number and the Employee object, meaning there is a one-to-one correspondence between employee numbers as keys in the map and Employee objects that have that employee number.
We would not want to prevent an Employee object from being garbage collected if no other part of the program is using that object, since we could always retrieve the object from permanent storage. In addition, we want to avoid using a soft reference because this puts pressure on the garbage collector (softly reachable objects are typically retained until memory runs low before they are reclaimed). If we use weak references, we give the garbage collector greater freedom to collect an Employee object more quickly.
This approach allows us to work with a subset of a very large collection of objects (all employees) in a Java program running on a machine with much less memory than we would need if we tried to read in all Employee objects. If we didn't have weak references, we would have to work around the garbage collector to determine when to "free" a given Employee object, removing all the benefits of the garbage collector and automatic memory management.
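The employee mapping described above might be sketched as follows. The Employee fields, the EmployeeRegistry class, and the loadFromStorage method are hypothetical stand-ins (the last one simply fabricates an object where a real program would read from the file or database), and the code again relies on generics that are newer than JDK 1.2.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical domain class, identified by its employee number.
class Employee {
    final int number;
    final String name;

    Employee(int number, String name) {
        this.number = number;
        this.name = name;
    }
}

// Canonicalizing mapping: at most one live Employee object per employee number.
class EmployeeRegistry {
    private final Map<Integer, WeakReference<Employee>> byNumber = new HashMap<>();

    // Stand-in for a read from permanent storage (an assumption for this sketch).
    private Employee loadFromStorage(int number) {
        return new Employee(number, "employee-" + number);
    }

    synchronized Employee lookup(int number) {
        WeakReference<Employee> ref = byNumber.get(number);
        // null means "not loaded yet" or "already collected and cleared".
        Employee employee = (ref == null) ? null : ref.get();
        if (employee == null) {
            employee = loadFromStorage(number);
            byNumber.put(number, new WeakReference<>(employee));
        }
        return employee;
    }
}
```

As long as some part of the program holds a strong reference to an Employee, every lookup with that number returns the same object. Once the last strong reference is gone, the collector is free to reclaim it, and the next lookup reloads it.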
Phantom References
Phantom references are a bit different from the previous two kinds.
It turns out that you can never get back the referent of a PhantomReference once you have created it (calling the get() method always returns null). Yet the PhantomReference still holds onto a reference to its referent. Why is this?
Phantom references are designed to allow for pre-mortem cleanup. This means that you can learn about an object that is going to be garbage collected just before it actually gets collected and clean up resources it is using.
This is particularly useful if you have several objects collaborating together, and you need to do the cleanup only when all of the objects are no longer using that shared resource. You can keep track of the shared resource and, when the last object becomes phantomly reachable, do the actual cleanup.
This still leaves one piece of the puzzle missing: How do we know when the object has become phantomly reachable?
This leads us to the one remaining class in the java.lang.ref package: ReferenceQueue.
Reference Queue
When creating a reference object (phantom, weak, or soft), you can specify a reference queue to associate with the reference object. In addition to the behavior already described, the garbage collector will place such reference objects onto the specified ReferenceQueue when it takes action on them.
This allows us to read the references off the queue and know that the garbage collector has done its work with that reference.
In the case of a PhantomReference, we can take note of the references that have become phantom, and when the reference to the last object using the shared resource "goes phantom," we can clean up that shared resource.
We can choose to dedicate a thread to reading entries off a ReferenceQueue, or we can poll it at strategic points in the code.
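The polling approach for a shared resource could look roughly like the sketch below. SharedResourceTracker, register, pollAndCleanUp, and releaseSharedResource are names we made up for illustration, and the "resource" is reduced to a boolean flag. The sketch assumes at least one user is registered before the first poll; with no users at all it would release the resource immediately.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.HashSet;
import java.util.Set;

// Hypothetical tracker: releases a shared resource once every user is gone.
class SharedResourceTracker {
    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // Hold the reference objects strongly so they are not collected themselves.
    private final Set<Reference<?>> users = new HashSet<>();
    private boolean released = false;

    // Track one more object that uses the shared resource.
    synchronized Reference<Object> register(Object user) {
        Reference<Object> ref = new PhantomReference<>(user, queue);
        users.add(ref);
        return ref;
    }

    // Call at strategic points: drains the queue, then cleans up if no users remain.
    synchronized boolean pollAndCleanUp() {
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {   // poll() returns null when empty
            users.remove(ref);
        }
        if (users.isEmpty() && !released) {
            releaseSharedResource();
        }
        return released;
    }

    // Stand-in for the real cleanup (closing a file, freeing a handle, etc.).
    private void releaseSharedResource() {
        released = true;
    }
}
```

A dedicated thread could do the same work by calling the queue's blocking remove() method in a loop instead of poll(). In a real program the collector decides when references reach the queue, so the timing of the cleanup is not predictable; for experiments, calling enqueue() on a reference places it on its queue by hand.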
Summary
The new kinds of references provided by the java.lang.ref package enable developers to build smarter, more efficient programs. And while we can't create fundamentally new kinds of references, we can subclass the concrete reference classes (SoftReference, WeakReference, and PhantomReference) to add features (this is common with PhantomReference as a way to keep track of the shared resource to be cleaned up).
These classes are not used frequently, but when they are needed they are invaluable. With these classes built in as a standard part of Java 2, programmers get a consistent means to build more robust software without resorting to platform/JVM-specific approaches.
Software Engineering Tech Trends (SETT) is a regular publication featuring emerging trends in software engineering.