Getting To Know Berkeley DB Java Edition

By Weiqi Gao, OCI Principal Software Engineer

November 2006

Introduction

Berkeley DB Java Edition is an embedded database written in pure Java. It allows efficient persistence of data into local disks. It provides full ACID transaction semantics for data storage. It uses an in-memory cache to speed up data reads when the working set can fit into memory. And its log-based storage system makes data writes very fast.

Berkeley DB Java Edition also supports Java Transaction API (JTA), J2EE Connector Architecture (JCA) and Java Management Extensions (JMX), making it easily integrated into many Java EE application servers.

The latest release of Berkeley DB Java Edition contains an API known as the Direct Persistence Layer (DPL), which is similar to the Java Persistence API (JPA). This layer is built on top of the lower-level byte array based Berkeley DB Java Edition engine. Berkeley DB Java Edition also contains an implementation of the Java Collections API that is backed by the same engine.

On the administrative side, Berkeley DB Java Edition uses an architecture-neutral file format and supports hot backups.

Licensing and History

The free software Berkeley DB (the C Edition, or the Core Edition, to distinguish it from the Java Edition) has been in existence since 1991. The original authors of Berkeley DB founded Sleepycat Software in 1996 to further develop and support Berkeley DB under both a free software license and a commercial license. Software that redistributes Berkeley DB must either be available as free software itself or use a commercially licensed Berkeley DB.

A JNI-based Java binding for Berkeley DB (Core Edition) was introduced soon after the Java programming language became popular. Sleepycat released a pure Java version of Berkeley DB in 2004.

Oracle Corporation acquired Sleepycat Software in February 2006 to expand its line of embeddable database products.

Oracle has continued both the development and the dual licensing model of Sleepycat's line of database products: Berkeley DB (Core Edition), Berkeley DB XML and Berkeley DB Java Edition.

Oracle Berkeley DB Java Edition 3.1.0 was released in September 2006.

Contrast with Other Kinds of Databases

When thinking about databases, the first thing that comes to a Java developer's mind is probably relational database servers. These servers have a communications layer that allows a remote application to connect to the database process, a SQL compiler that understands and executes SQL queries from the application, and an interactive shell that allows users and administrators to perform ad hoc queries. Berkeley DB Java Edition has none of these.

What it does have is a transaction engine that offers full ACID (Atomicity, Consistency, Isolation and Durability) semantics, an in-memory cache that manages the data using B+trees, and a persistence engine that writes log files to local disks. Everything Berkeley DB Java Edition does happens inside the same JVM instance where the application is hosted. In other words, the database is embedded within the application process as a jar file in the classpath.

These characteristics make Berkeley DB Java Edition uniquely suitable for a wide class of applications that do not require the additional features of a full-fledged relational database server. These applications tend to have a relatively stable and small data schema, a very dynamic but usually bounded data set, and high throughput requirements.

In this article, I will take you through a Java application developer's tour of Oracle Berkeley DB Java Edition, henceforth referred to as JE, the shorthand name used in its documentation.

What's In the Download

JE is available for download from the Oracle Berkeley DB website. The download bundle can be unzipped or untarred into a single directory je-3.1.0 that contains the following files and subdirectories:

[weiqi@gao] $ ls je-3.1.0
ant/              dist/               examples/            LICENSE  test/
build.properties  docs/               FindBugsExclude.xml  README
build.xml         example.properties  lib/                 src/

The src and test directories contain the source code and a set of tests for the JE product. The lib directory contains a single jar file named je-3.1.0.jar. This 1.1MB file is the only jar file needed to use JE. It does not depend on anything but the JDK. The Direct Persistence Layer (DPL) requires Java 5 or later. The rest of JE requires Java 1.4.2 or later. The jar file contains a Main-Class manifest declaration and can be used to invoke administrative utilities.

The docs directory contains a rich set of documentation typical of the Berkeley DB family of products. Aside from the usual release notes, installation instructions and javadocs, four books are included, in both HTML and PDF format:

Getting Started with Berkeley DB Java Edition (120 pages)
Getting Started with Direct Persistence Layer (42 pages)
Getting Started with Transaction Processing (62 pages)
Java Collections Tutorial (95 pages)

The examples directory contains example programs illustrating JE features. The release notes have detailed instructions on how to compile and use the example classes.

The online books are written in a tutorial style and do not assume any prior database programming experience. Together with the javadocs and the examples, they provide a thorough introduction to JE basics, its transaction engine and the two frameworks built on top of JE: the Direct Persistence Layer and the Java Collections API. Because these excellent tutorials are available, I will simply highlight some of the interesting features of JE.

Let's Create A Database

JE's terminology is slightly different from that of the relational database world. To work with JE data, we use Environments. A JE Environment is roughly equivalent to what the relational database world calls a database or a schema, and a JE Database corresponds to a relational database table. However, a JE database has only two columns: key and data. In this regard a JE database is more like a Java TreeMap.

import com.sleepycat.je.Database;
import com.sleepycat.je.DatabaseConfig;
import com.sleepycat.je.DatabaseException;
import com.sleepycat.je.Environment;
import com.sleepycat.je.EnvironmentConfig;
 
import java.io.File;
 
public class CreateDatabase {
    public static void main(String[] args) throws DatabaseException {
        EnvironmentConfig envConfig = new EnvironmentConfig();
        envConfig.setAllowCreate(true);
        envConfig.setTransactional(true);
        Environment env =
                new Environment(new File("./envHome"), envConfig);
 
        DatabaseConfig dbConfig = new DatabaseConfig();
        dbConfig.setAllowCreate(true);
        dbConfig.setTransactional(true);
        dbConfig.setSortedDuplicates(true);
        Database db =
                env.openDatabase(null, "randomBytes", dbConfig);
 
        db.close();
        env.close();
    }
}

This opens a JE environment in the directory ./envHome and a database named randomBytes within the environment. The variable env is called an environment handle and db a database handle. The environment handle is created with a constructor and it in turn is a factory of database handles.

JE supports many configuration properties for the various handles it creates. These properties are passed in to the constructors or factory methods of the handles as a configuration object parameter. The above example shows the usage of EnvironmentConfig and DatabaseConfig objects and later examples will show the usage of CursorConfig, TransactionConfig and SecondaryConfig objects. A null, or a *Config.DEFAULT object can be passed in if the default properties are desired. Properties can also be set in a je.properties file in the environment home directory. An example.properties file is available in the je-3.1.0 directory that lists all the properties, their default values, and whether they can be modified after creation.

The same *Config object can be reused in the creation of subsequent handles. Changes made to config objects will not affect handles already opened.

A subset of the environment configuration properties can be modified at run time through an EnvironmentMutableConfig object obtained from the environment handle.

The environment home directory must exist prior to opening an environment, or a DatabaseException is thrown. If the directory is empty and the allowCreate property is true, an environment will be created. If the directory is empty and allowCreate is false, a DatabaseException is thrown.
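Since the home directory has to exist before the environment is opened, an application may want to create it programmatically rather than rely on a manual mkdir. The following is a minimal sketch using only the JDK; the EnvHome class and its ensure method are hypothetical names, not part of JE. The returned File would then be passed to the Environment constructor as in the example above.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

// Hypothetical helper, not part of JE: prepares a directory that can
// then be handed to the Environment constructor.
class EnvHome {
    static File ensure(File dir) {
        try {
            // A no-op when the directory already exists.
            Files.createDirectories(dir.toPath());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (!dir.canWrite()) {
            throw new IllegalStateException(dir + " is not writable");
        }
        return dir;
    }

    public static void main(String[] args) {
        File home = ensure(new File("./envHome"));
        System.out.println("Environment home ready: " + home.getAbsolutePath());
    }
}
```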

From the above example, one can see that the transactional property can be turned on or off for the environment or for individual databases in the environment. However, if transactions are turned off for the environment, they must be off for all databases that are opened in that environment.

[weiqi@gao] $ java CreateDatabase
Exception in thread "main" com.sleepycat.je.DatabaseException: (JE 3.1.0) The Environment directory /home/weiqi/perm/JNB/berkeleydb/classes/production/berkeleydb/./envHome is not writable, but the Environment was opened for read-write access.
        [Stacktrace ...]
[weiqi@gao] $ mkdir envHome
[weiqi@gao] $ java CreateDatabase
[weiqi@gao] $ ls envHome
00000000.jdb  je.lck

When JE creates an environment, it simply creates a couple of files: a log file 00000000.jdb and a lock file je.lck. It is not obvious which JE databases are available in an environment from looking at the environment home directory. JE appends all persistent activity, such as transactions, insertions, updates and deletions, to the log file. When 00000000.jdb is full (the default size is 10MB), 00000001.jdb is created, and so on, up to ffffffff.jdb. If an environment is opened as read-write, a log cleaning thread is started that monitors log file utilization. When most of the records in a particular log file have been made obsolete by updates or deletions in later log files, the log cleaning thread copies the remaining records to the end of the latest log file and deletes the old log file. The cleaner chooses which log file to clean based on the current space utilization within each log file.
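The log file names follow a simple pattern: a file number written as eight lowercase hexadecimal digits, followed by .jdb. The sketch below reproduces that pattern with plain JDK code. The LogFileNames class is a hypothetical illustration inferred from the file names shown in this article, not something taken from the JE source.

```java
// Hypothetical helper, not part of JE: maps a log file number to the
// file name pattern seen in the environment home directory.
class LogFileNames {
    static String nameOf(long fileNumber) {
        // Eight lowercase hex digits followed by the .jdb suffix.
        return String.format("%08x.jdb", fileNumber);
    }

    public static void main(String[] args) {
        System.out.println(nameOf(0));            // 00000000.jdb
        System.out.println(nameOf(11));           // 0000000b.jdb
        System.out.println(nameOf(0xffffffffL));  // ffffffff.jdb
    }
}
```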

The JE log file format is architecture neutral and can be copied from machine to machine regardless of processor types or operating systems. However this file format is not compatible with that of Berkeley DB Core Edition.

Byte Arrays Are the Key (and the Data)

At the lowest level, JE supports only one data type, the byte array type. It encapsulates byte arrays into the DatabaseEntry class. The following class inserts 100000 1k records into the randomBytes database:

import com.sleepycat.je.Database;
import com.sleepycat.je.DatabaseConfig;
import com.sleepycat.je.DatabaseEntry;
import com.sleepycat.je.DatabaseException;
import com.sleepycat.je.Environment;
import com.sleepycat.je.EnvironmentConfig;
import com.sleepycat.je.Transaction;
 
import java.io.File;
import java.util.Random;
 
public class CreateRandomBytes {
    private static final int NUM_RECORDS = 100000;
 
    public static void main(String[] args) throws DatabaseException {
        EnvironmentConfig envConfig = new EnvironmentConfig();
        envConfig.setAllowCreate(true);
        envConfig.setTransactional(true);
        Environment env =
                new Environment(new File("./envHome"), envConfig);
 
        DatabaseConfig dbConfig = new DatabaseConfig();
        dbConfig.setAllowCreate(true);
        dbConfig.setTransactional(true);
        dbConfig.setSortedDuplicates(true);
        Database db = env.openDatabase(null, "randomBytes", dbConfig);
 
        Transaction txn = env.beginTransaction(null, null);
        byte[] key = new byte[4];
        byte[] data = new byte[1024];
        long startTime = System.currentTimeMillis();
 
        for (int i = 0; i < NUM_RECORDS; i++) {
            fillKey(i, key);
            fillData(data);
            try {
                db.put(txn, new DatabaseEntry(key),
                        new DatabaseEntry(data));
            } catch (DatabaseException e) {
                txn.abort();
                throw e;
            }
        }
        txn.commit();
 
        long endTime = System.currentTimeMillis();
        System.out.println("Inserted " + NUM_RECORDS + " 1k records in " +
                (endTime - startTime) + " milliseconds");
        db.close();
        env.close();
    }
 
    private static void fillKey(int i, byte[] key) {
        key[0] = (byte) i;
        key[1] = (byte) (i >> 8);
        key[2] = (byte) (i >> 16);
        key[3] = (byte) (i >> 24);
    }
 
    private static void fillData(byte[] data) {
        Random random = new Random();
        random.nextBytes(data);
    }
}

We obtain a transaction from the environment by calling the beginTransaction method, passing in two nulls as parameters. The first parameter is the parent transaction and must be set to null as JE does not yet support nested transactions. The second parameter is a TransactionConfig and we pass in null to use the default properties.

[weiqi@gao] $ mkdir envHome
[weiqi@gao] $ java CreateRandomBytes
Inserted 100000 1k records in 8319 milliseconds
[weiqi@gao] $ ls -l envHome
total 112428
-rw-rw-r-- 1 weiqi weiqi 9999708 Oct 22 10:59 00000000.jdb
-rw-rw-r-- 1 weiqi weiqi 9999885 Oct 22 10:59 00000001.jdb
-rw-rw-r-- 1 weiqi weiqi 9998931 Oct 22 10:59 00000002.jdb
-rw-rw-r-- 1 weiqi weiqi 9999622 Oct 22 10:59 00000003.jdb
-rw-rw-r-- 1 weiqi weiqi 9998944 Oct 22 10:59 00000004.jdb
-rw-rw-r-- 1 weiqi weiqi 9999171 Oct 22 10:59 00000005.jdb
-rw-rw-r-- 1 weiqi weiqi 9999133 Oct 22 10:59 00000006.jdb
-rw-rw-r-- 1 weiqi weiqi 9999709 Oct 22 10:59 00000007.jdb
-rw-rw-r-- 1 weiqi weiqi 9999639 Oct 22 10:59 00000008.jdb
-rw-rw-r-- 1 weiqi weiqi 9999297 Oct 22 10:59 00000009.jdb
-rw-rw-r-- 1 weiqi weiqi 9999216 Oct 22 10:59 0000000a.jdb
-rw-rw-r-- 1 weiqi weiqi 4851872 Oct 22 10:59 0000000b.jdb
-rw-rw-r-- 1 weiqi weiqi       0 Oct 22 10:59 je.lck
[weiqi@gao] $ java CreateRandomBytes
Inserted 100000 1k records in 76100 milliseconds
[weiqi@gao] $ ls -l envHome
total 1186436
-rw-rw-r-- 1 weiqi weiqi 9999708 Oct 22 10:59 00000000.jdb
-rw-rw-r-- 1 weiqi weiqi 9999885 Oct 22 10:59 00000001.jdb
......
-rw-rw-r-- 1 weiqi weiqi 9998394 Oct 22 11:28 00000078.jdb
-rw-rw-r-- 1 weiqi weiqi 2255609 Oct 22 11:28 00000079.jdb
-rw-rw-r-- 1 weiqi weiqi       0 Oct 22 11:27 je.lck

The sortedDuplicates property of databases is saved on disk when a database is first created. All subsequent opens of the database must use the same property value. This property controls duplicate data in the database. If this property is false, putting a key-data pair with an existing key will overwrite the old data. If it is set to true, putting a key-data pair with an existing key will cause both the new data and the old data to be saved in the database.
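The two behaviors can be modeled with ordinary collections. The following sketch is not JE code; DuplicatesModel is a hypothetical class that mimics only the put semantics described above, with a TreeSet standing in for the sorted set of data items stored under one key.

```java
import java.util.TreeMap;
import java.util.TreeSet;

// A plain-Java model of the two put behaviors; this is not JE code.
class DuplicatesModel {
    private final boolean sortedDuplicates;
    private final TreeMap<Integer, TreeSet<String>> records = new TreeMap<>();

    DuplicatesModel(boolean sortedDuplicates) {
        this.sortedDuplicates = sortedDuplicates;
    }

    void put(int key, String data) {
        TreeSet<String> values = records.computeIfAbsent(key, k -> new TreeSet<>());
        if (!sortedDuplicates) {
            values.clear();   // overwrite: only the newest data survives
        }
        values.add(data);     // with duplicates, old and new data coexist
    }

    int size() {
        int total = 0;
        for (TreeSet<String> values : records.values()) {
            total += values.size();
        }
        return total;
    }

    public static void main(String[] args) {
        for (boolean dups : new boolean[] {true, false}) {
            DuplicatesModel db = new DuplicatesModel(dups);
            // Two "runs" that write the same three keys with different data.
            for (int run = 1; run <= 2; run++) {
                for (int key = 0; key < 3; key++) {
                    db.put(key, "run" + run + "-data" + key);
                }
            }
            System.out.println("sortedDuplicates=" + dups + " -> " + db.size() + " records");
        }
    }
}
```

In this model the second run doubles the record count when duplicates are allowed and leaves it unchanged when they are not, which is the distinction to keep in mind when running CreateRandomBytes more than once against the same environment.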

Here are the numbers if sortedDuplicates is set to false:

[weiqi@gao] $ java CreateRandomBytes
Inserted 100000 1k records in 8243 milliseconds
[weiqi@gao] $ java CreateRandomBytes
Inserted 100000 1k records in 13532 milliseconds
[weiqi@gao] $ java CreateRandomBytes
Inserted 100000 1k records in 15955 milliseconds

For the remainder of this article, we will use a version of CreateRandomBytes where sortedDuplicates is false.

Turning Anything into a Byte Array (Data Binding)

As illustrated in the last section, JE stores only byte arrays. If you want to store anything other than byte arrays, you have to do the conversion when inserting or retrieving values.

JE provides a Bind API to help with this process. It defines the EntryBinding interface:

interface EntryBinding {
    Object entryToObject(DatabaseEntry entry);
    void objectToEntry(Object object, DatabaseEntry entry);
}

Implementations of this interface are provided in JE to handle Java primitive types, Strings and Serializable objects. An abstract class TupleBinding can be extended to handle other kinds of objects.

Here is what it takes to convert a long and String key-data pair into DatabaseEntry objects (notice that the TupleBinding abstract class is also the factory for concrete bindings for primitive types and String):

import com.sleepycat.bind.EntryBinding;
import com.sleepycat.bind.tuple.TupleBinding;
import com.sleepycat.je.DatabaseEntry;
 
public class PrimitiveBinding {
    public static void main(String[] args) {
        long longKey = 1234567890L;
        String stringData = "This is a test.";
 
         // Manufacture the bindings
        EntryBinding longBinding =
                TupleBinding.getPrimitiveBinding(Long.class);
        EntryBinding stringBinding =
                TupleBinding.getPrimitiveBinding(String.class);
 
         // To entries
        DatabaseEntry key = new DatabaseEntry();
        DatabaseEntry data = new DatabaseEntry();
        longBinding.objectToEntry(longKey, key);
        stringBinding.objectToEntry(stringData, data);
 
         // And back
        long outLongKey =
                (Long) longBinding.entryToObject(key);
        String outStringData =
                (String) stringBinding.entryToObject(data);
        assert longKey == outLongKey;
        assert stringData.equals(outStringData);
    }
}

Here is what it takes to extend TupleBinding to handle a custom data structure. The Book class contains a couple of String fields:

public class Book {
    public String title;
    public String author;
}

The BookBinding extends TupleBinding and uses TupleInput and TupleOutput objects to convert Book objects to byte arrays:

import com.sleepycat.bind.tuple.TupleBinding;
import com.sleepycat.bind.tuple.TupleInput;
import com.sleepycat.bind.tuple.TupleOutput;
 
public class BookBinding extends TupleBinding {
 
    public Object entryToObject(TupleInput input) {
        Book book = new Book();
        book.title = input.readString();
        book.author = input.readString();
        return book;
    }
 
    public void objectToEntry(Object object, TupleOutput output) {
        Book book = (Book) object;
        output.writeString(book.title);
        output.writeString(book.author);
    }
}

One thing to note here is that the order that you read data from input must be exactly the same as the order in which it was written to output. Another thing to note is the lack of type safety in this approach. You can put a Book into a database and get out an object of another type that has the same structure as a Book.
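The ordering requirement is easy to demonstrate without JE. The sketch below uses the JDK's DataOutputStream and DataInputStream as a stand-in for TupleOutput and TupleInput (an assumption made for illustration; the byte formats differ). FieldOrderDemo and its methods are hypothetical names. Reading the fields in the wrong order raises no error here; it silently yields the wrong values, which is the same weakness the paragraph above describes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Stand-in for a tuple binding, built on java.io streams; not JE code.
class FieldOrderDemo {
    // Writes title first, then author.
    static byte[] write(String title, String author) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(title);
            out.writeUTF(author);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    private static String[] readTwoStrings(byte[] raw) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw))) {
            String first = in.readUTF();
            String second = in.readUTF();
            return new String[] {first, second};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Correct: the first string written was the title.
    static String titleReadInWriteOrder(byte[] raw) {
        return readTwoStrings(raw)[0];
    }

    // Mistake: treats the second string as the title. No exception is thrown.
    static String titleReadInSwappedOrder(byte[] raw) {
        return readTwoStrings(raw)[1];
    }

    public static void main(String[] args) {
        byte[] raw = write("Some Title", "Some Author");
        System.out.println("in order: " + titleReadInWriteOrder(raw));
        System.out.println("swapped:  " + titleReadInSwappedOrder(raw));
    }
}
```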

JE also contains a SerialBinding class that works with any Serializable object without having to write a custom TupleBinding. Since the Java serialization format contains repetitive class information, JE uses a revised scheme that puts the class information into a separate database.
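To get a feel for how much class information standard Java serialization carries, the following sketch compares the size of a serialized object with the size of its field values alone. It uses only the JDK; SerialSizeDemo and its nested Book class are hypothetical, and the code does not use or reproduce JE's SerialBinding.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Compares standard serialization with a fields-only encoding; not JE code.
class SerialSizeDemo {
    static class Book implements Serializable {
        private static final long serialVersionUID = 1L;
        final String title;
        final String author;

        Book(String title, String author) {
            this.title = title;
            this.author = author;
        }
    }

    // Size of the object as written by standard Java serialization,
    // including the class descriptor (class name, field names, and so on).
    static int serializedSize(Book book) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(book);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    // Size of just the two field values, with no class information.
    static int fieldsOnlySize(Book book) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(book.title);
            out.writeUTF(book.author);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        Book book = new Book("Some Title", "Some Author");
        System.out.println("serialized:  " + serializedSize(book) + " bytes");
        System.out.println("fields only: " + fieldsOnlySize(book) + " bytes");
    }
}
```

Every independently serialized record repeats the same class description, so storing that description once in a separate place saves space on each record.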

Navigation with Cursors

JE Database objects have a get method that allows you to retrieve the data based on a key. We can read back the random bytes from the randomBytes database using the following program:

import com.sleepycat.je.Database;
import com.sleepycat.je.DatabaseConfig;
import com.sleepycat.je.DatabaseEntry;
import com.sleepycat.je.DatabaseException;
import com.sleepycat.je.Environment;
import com.sleepycat.je.EnvironmentConfig;
import com.sleepycat.je.LockMode;
import com.sleepycat.je.OperationStatus;
import com.sleepycat.je.Transaction;
 
import java.io.File;
 
public class RetrieveRandomBytes {
    private static final int NUM_RECORDS = 100000;
 
    public static void main(String[] args) throws DatabaseException {
        EnvironmentConfig envConfig = new EnvironmentConfig();
        envConfig.setTransactional(true);
        envConfig.setReadOnly(true);
        Environment env =
                new Environment(new File("./envHome"), envConfig);
 
        DatabaseConfig dbConfig = new DatabaseConfig();
        dbConfig.setTransactional(true);
        dbConfig.setSortedDuplicates(false);
        dbConfig.setReadOnly(true);
        Database db = env.openDatabase(null, "randomBytes", dbConfig);
 
        Transaction txn = env.beginTransaction(null, null);
        byte[] keyBytes = new byte[4];
        DatabaseEntry data = new DatabaseEntry();
        long startTime = System.currentTimeMillis();
 
        for (int i = 0; i < NUM_RECORDS; i++) {
            fillKey(i, keyBytes);
            DatabaseEntry key = new DatabaseEntry(keyBytes);
            try {
                if (db.get(txn, key, data, LockMode.DEFAULT) ==
                        OperationStatus.SUCCESS) {
                    byte[] dataBytes = data.getData();
                    assert dataBytes.length == 1024;
                } else {
                    throw new RuntimeException("db.get() failed.");
                }
            } catch (DatabaseException e) {
                txn.abort();
                throw e;
            }
        }
        txn.commit();
 
        long endTime = System.currentTimeMillis();
        System.out.println("Retrieved " + NUM_RECORDS + " 1k records in " +
                (endTime - startTime) + " milliseconds");
 
        db.close();
        env.close();
    }
 
    private static void fillKey(int i, byte[] key) {
        key[0] = (byte) i;
        key[1] = (byte) (i >> 8);
        key[2] = (byte) (i >> 16);
        key[3] = (byte) (i >> 24);
    }
}

We open the environment and database as read-only, reconstruct each key and retrieve the corresponding data from the database.

[weiqi@gao] $ java CreateRandomBytes
Inserted 100000 1k records in 8287 milliseconds
[weiqi@gao] $ java -ea RetrieveRandomBytes
Retrieved 100000 1k records in 5646 milliseconds

To iterate over the data in a database without prior knowledge of the keys, we use the JE Cursor class.

import com.sleepycat.je.Cursor;
import com.sleepycat.je.Database;
import com.sleepycat.je.DatabaseConfig;
import com.sleepycat.je.DatabaseEntry;
import com.sleepycat.je.DatabaseException;
import com.sleepycat.je.Environment;
import com.sleepycat.je.EnvironmentConfig;
import com.sleepycat.je.LockMode;
import com.sleepycat.je.OperationStatus;
import com.sleepycat.je.Transaction;
 
import java.io.File;
 
public class IterateRandomBytes {
    public static void main(String[] args) throws DatabaseException {
        EnvironmentConfig envConfig = new EnvironmentConfig();
        envConfig.setTransactional(true);
        envConfig.setReadOnly(true);
        Environment env =
                new Environment(new File("./envHome"), envConfig);
 
        DatabaseConfig dbConfig = new DatabaseConfig();
        dbConfig.setTransactional(true);
        dbConfig.setSortedDuplicates(false);
        dbConfig.setReadOnly(true);
        Database db = env.openDatabase(null, "randomBytes", dbConfig);
 
        Transaction txn = env.beginTransaction(null, null);
        DatabaseEntry key = new DatabaseEntry();
        DatabaseEntry data = new DatabaseEntry();
        long startTime = System.currentTimeMillis();
        int recordsVisited = 0;
 
        Cursor cursor = db.openCursor(txn, null);
        while (cursor.getNext(key, data, LockMode.DEFAULT) ==
                OperationStatus.SUCCESS) {
            byte[] dataBytes = data.getData();
            assert dataBytes.length == 1024;
            recordsVisited++;
        }
        cursor.close();
        txn.commit();
 
        long endTime = System.currentTimeMillis();
        System.out.println("Retrieved " + recordsVisited + " 1k records in " +
                (endTime - startTime) + " milliseconds");
 
        db.close();
        env.close();
    }
}

Here we call the database handle's openCursor method to get a Cursor. Two parameters are passed in: a Transaction, and a null for CursorConfig to use the default properties. We then use the getNext method to walk through all the records in the database. Executing CreateRandomBytes followed by IterateRandomBytes yields the following results:

[weiqi@gao] $ java CreateRandomBytes
Inserted 100000 1k records in 7887 milliseconds
[weiqi@gao] $ java -ea IterateRandomBytes
Retrieved 100000 1k records in 4761 milliseconds

Aside from the getNext method that moves the cursor to the next record in the database (a newly opened cursor points to one prior to the first record), the Cursor class also has the following cursor position methods:

`getPrev`	moves to the previous record (a newly opened cursor also points to one past the last record)
`getFirst`	moves to the first record
`getLast`	moves to the last record
`getSearchKey`	moves to record with matching key
`getSearchKeyRange`	moves to first record at or just past the given key
`getSearchBoth`	moves to record with matching key and data
`getSearchBothRange`	moves to first record with matching key and data at or just past the given data

The search methods account for duplicate data values with the same key.
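For readers who think in terms of java.util collections, the positioning methods have rough counterparts on a sorted map. The sketch below is an analogy only, not JE code. It assumes keys are ordered by unsigned byte-by-byte comparison, which to my knowledge is JE's default when no custom comparator is configured; CursorAnalogy and its members are hypothetical names. It also shows a consequence of the little-endian key layout used by fillKey in this article: records are visited in byte order, which is not numeric order.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

// A sorted-map analogy for cursor positioning; this is not JE code.
class CursorAnalogy {
    // Unsigned byte-by-byte comparison; on a common prefix the shorter key sorts first.
    static final Comparator<byte[]> UNSIGNED_BYTES = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    };

    // Same little-endian layout as fillKey in the article's examples.
    static byte[] littleEndianKey(int i) {
        return new byte[] {(byte) i, (byte) (i >> 8), (byte) (i >> 16), (byte) (i >> 24)};
    }

    static TreeMap<byte[], Integer> sample() {
        TreeMap<byte[], Integer> db = new TreeMap<>(UNSIGNED_BYTES);
        for (int i : new int[] {1, 2, 256, 257}) {
            db.put(littleEndianKey(i), i);
        }
        return db;
    }

    public static void main(String[] args) {
        TreeMap<byte[], Integer> db = sample();
        // Roughly getFirst and getLast.
        System.out.println("first: " + db.firstEntry().getValue());  // 256
        System.out.println("last:  " + db.lastEntry().getValue());   // 2
        // Roughly getSearchKeyRange: first record at or past the given key.
        Map.Entry<byte[], Integer> hit = db.ceilingEntry(littleEndianKey(512));
        System.out.println("at or past 512: " + hit.getValue());     // 1
        // Roughly repeated getNext calls: visits 256, 1, 257, 2.
        for (Integer value : db.values()) {
            System.out.println("next: " + value);
        }
    }
}
```

If numeric iteration order matters to an application, writing the most significant byte first is one way to make byte order and numeric order agree for non-negative integers.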

You can also insert, update and delete records using the put, putCurrent and delete methods of the Cursor class. More elaborate methods exist to deal with duplicate data situations. Here is a little program that deletes all the random bytes that we created:

import com.sleepycat.je.Cursor;
import com.sleepycat.je.Database;
import com.sleepycat.je.DatabaseConfig;
import com.sleepycat.je.DatabaseEntry;
import com.sleepycat.je.DatabaseException;
import com.sleepycat.je.Environment;
import com.sleepycat.je.EnvironmentConfig;
import com.sleepycat.je.LockMode;
import com.sleepycat.je.OperationStatus;
import com.sleepycat.je.Transaction;
 
import java.io.File;
 
public class DeleteRandomBytes {
    public static void main(String[] args) throws DatabaseException {
        EnvironmentConfig envConfig = new EnvironmentConfig();
        envConfig.setTransactional(true);
        Environment env =
                new Environment(new File("./envHome"), envConfig);
 
        DatabaseConfig dbConfig = new DatabaseConfig();
        dbConfig.setTransactional(true);
        dbConfig.setSortedDuplicates(false);
        Database db = env.openDatabase(null, "randomBytes", dbConfig);
 
        Transaction txn = env.beginTransaction(null, null);
        DatabaseEntry key = new DatabaseEntry();
        DatabaseEntry data = new DatabaseEntry();
        long startTime = System.currentTimeMillis();
        int recordsDeleted = 0;
 
        Cursor cursor = db.openCursor(txn, null);
        while (cursor.getNext(key, data, LockMode.DEFAULT) ==
                OperationStatus.SUCCESS) {
            cursor.delete();
            recordsDeleted++;
        }
        cursor.close();
        txn.commit();
 
        long endTime = System.currentTimeMillis();
        System.out.println("Deleted " + recordsDeleted + " 1k records in " +
                (endTime - startTime) + " milliseconds");
 
        db.close();
        env.cleanLog();
        env.close();
    }
}

We simply delete each record as we iterate through the cursor. We also call the Environment.cleanLog() method right before we close the environment handle. In our scenario this causes some of the log files, which now contain deleted records, to be removed.

Executing CreateRandomBytes followed by DeleteRandomBytes yields the following numbers:

[weiqi@gao] $ java CreateRandomBytes
Inserted 100000 1k records in 8084 milliseconds
[weiqi@gao] $ java DeleteRandomBytes
Deleted 100000 1k records in 5076 milliseconds
[weiqi@gao] $ java DeleteRandomBytes
Deleted 0 1k records in 951 milliseconds

Secondary Databases Are Indexes

The JE cursor allows us to search a database based on the keys of the records. Sometimes an application may wish to search the database using something other than the keys. The way to make this possible in JE is to use a secondary database (an index in relational database terminology).

The data in a secondary database are the keys of its associated primary database, and its keys are generated from the data in the primary database. The primary database handle and the key creator are the two pieces of information needed when a secondary database is opened. A primary database cannot allow duplicate records when a secondary database is used. However, a secondary database should generally allow duplicate records.
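The relationship between the two databases can be modeled with two sorted maps. The sketch below is a conceptual model in plain Java, not JE code: SecondaryIndexModel is a hypothetical class, the key creator is an ordinary Function, and the index maps each generated secondary key to the set of primary keys that produced it. It also shows why the index must be maintained on every primary write.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.function.Function;

// A two-map model of a primary database and its index; this is not JE code.
class SecondaryIndexModel {
    private final TreeMap<Integer, byte[]> primary = new TreeMap<>();
    // Secondary key -> primary keys (duplicates allowed on the secondary side).
    private final TreeMap<Byte, TreeSet<Integer>> secondary = new TreeMap<>();
    private final Function<byte[], Byte> keyCreator;

    SecondaryIndexModel(Function<byte[], Byte> keyCreator) {
        this.keyCreator = keyCreator;
    }

    void put(int primaryKey, byte[] data) {
        byte[] old = primary.put(primaryKey, data);
        if (old != null) {
            // The primary record was overwritten: drop its stale index entry.
            Byte oldKey = keyCreator.apply(old);
            TreeSet<Integer> stale = secondary.get(oldKey);
            stale.remove(Integer.valueOf(primaryKey));
            if (stale.isEmpty()) {
                secondary.remove(oldKey);
            }
        }
        secondary.computeIfAbsent(keyCreator.apply(data), k -> new TreeSet<>())
                .add(primaryKey);
    }

    // Looks up primary data through the index.
    List<byte[]> findBySecondaryKey(byte secondaryKey) {
        List<byte[]> result = new ArrayList<>();
        TreeSet<Integer> primaryKeys = secondary.get(Byte.valueOf(secondaryKey));
        if (primaryKeys != null) {
            for (Integer primaryKey : primaryKeys) {
                result.add(primary.get(primaryKey));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Index on the last byte of a two-byte record.
        SecondaryIndexModel db = new SecondaryIndexModel(data -> data[1]);
        db.put(1, new byte[] {10, 42});
        db.put(2, new byte[] {20, 7});
        db.put(3, new byte[] {30, 42});
        System.out.println("matches for 42: " + db.findBySecondaryKey((byte) 42).size());
    }
}
```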

In the next example, we create a secondary database keyed off of the tenth byte of the byte array data in the primary database and then use that secondary database to find all primary data whose tenth byte is 42.

We first implement a TenthByteKeyCreator:

import com.sleepycat.je.DatabaseEntry;
import com.sleepycat.je.DatabaseException;
import com.sleepycat.je.SecondaryDatabase;
import com.sleepycat.je.SecondaryKeyCreator;
 
public class TenthByteKeyCreator implements SecondaryKeyCreator {
 
    public boolean createSecondaryKey(SecondaryDatabase secondary,
                                      DatabaseEntry key,
                                      DatabaseEntry data,
                                      DatabaseEntry result)
            throws DatabaseException {
        byte[] tenthByte = new byte[1];
        tenthByte[0] = data.getData()[9];
        result.setData(tenthByte);
        return true;
    }
}

The SecondaryKeyCreator interface has only one method createSecondaryKey(). We have access to the secondary database itself, the key-data pair of the primary database record, and a result that is to become the key in the secondary database. We simply grab the tenth byte out of the primary data and stuff it into the result.

import com.sleepycat.je.Database;
import com.sleepycat.je.DatabaseConfig;
import com.sleepycat.je.DatabaseEntry;
import com.sleepycat.je.DatabaseException;
import com.sleepycat.je.Environment;
import com.sleepycat.je.EnvironmentConfig;
import com.sleepycat.je.LockMode;
import com.sleepycat.je.OperationStatus;
import com.sleepycat.je.SecondaryConfig;
import com.sleepycat.je.SecondaryCursor;
import com.sleepycat.je.SecondaryDatabase;
import com.sleepycat.je.Transaction;
 
import java.io.File;
import java.util.Random;
 
public class IndexRandomBytes {
    private static final int NUM_RECORDS = 100000;
 
    public static void main(String[] args) throws DatabaseException {
        EnvironmentConfig envConfig = new EnvironmentConfig();
        envConfig.setAllowCreate(true);
        envConfig.setTransactional(true);
        Environment env =
                new Environment(new File("./envHome"), envConfig);
 
        DatabaseConfig dbConfig = new DatabaseConfig();
        dbConfig.setAllowCreate(true);
        dbConfig.setTransactional(true);
        dbConfig.setSortedDuplicates(false);
        Database db = env.openDatabase(null, "randomBytes", dbConfig);
 
        SecondaryConfig secondaryConfig = new SecondaryConfig();
        secondaryConfig.setAllowCreate(true);
        secondaryConfig.setAllowPopulate(true);
        secondaryConfig.setTransactional(true);
        secondaryConfig.setKeyCreator(new TenthByteKeyCreator());
        secondaryConfig.setSortedDuplicates(true);
        SecondaryDatabase index =
                env.openSecondaryDatabase(null, "tenth-byte",
                        db, secondaryConfig);
 
        populatePrimary(env, db);
 
        searchIndex(env, index);
 
        index.close();
        db.close();
        env.cleanLog();
        env.close();
    }
 
    private static void populatePrimary(Environment env, Database db)
            throws DatabaseException {
        Transaction txn = env.beginTransaction(null, null);
        byte[] key = new byte[4];
        byte[] data = new byte[1024];
        long startTime = System.currentTimeMillis();
 
        for (int i = 0; i < NUM_RECORDS; i++) {
            fillKey(i, key);
            fillData(data);
            try {
                db.put(txn, new DatabaseEntry(key),
                        new DatabaseEntry(data));
            } catch (DatabaseException e) {
                txn.abort();
                throw e;
            }
        }
        txn.commit();
 
        long endTime = System.currentTimeMillis();
        System.out.println("Inserted " + NUM_RECORDS + " 1k records in " +
                (endTime - startTime) + " milliseconds");
    }
 
    private static void searchIndex(Environment env, SecondaryDatabase index)
            throws DatabaseException {
        Transaction txn = env.beginTransaction(null, null);
        SecondaryCursor cursor = index.openSecondaryCursor(txn, null);
        DatabaseEntry key2 = new DatabaseEntry(new byte[] {(byte) 42});
        DatabaseEntry data2 = new DatabaseEntry();
        int recordsFound = 0;
        long startTime = System.currentTimeMillis();
 
        try {
            OperationStatus status =
                    cursor.getSearchKey(key2, data2, LockMode.DEFAULT);
            if (status == OperationStatus.SUCCESS) {
                recordsFound++;
                while (cursor.getNextDup(key2, data2, LockMode.DEFAULT) ==
                        OperationStatus.SUCCESS) {
                    assert data2.getData()[9] == 42;
                    recordsFound++;
                }
            }
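        } catch (DatabaseException e) {
            // Close the cursor before aborting; a transaction should not be
            // aborted while one of its cursors is still open.
            cursor.close();
            txn.abort();
            throw e;
        }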
        cursor.close();
        txn.commit();
 
        long endTime = System.currentTimeMillis();
        System.out.println("Found " + recordsFound + " records whose tenth byte is 42 in " +
                (endTime - startTime) + " milliseconds");
    }
 
    private static void fillKey(int i, byte[] key) {
        key[0] = (byte) i;
        key[1] = (byte) (i >> 8);
        key[2] = (byte) (i >> 16);
        key[3] = (byte) (i >> 24);
    }
 
    private static void fillData(byte[] data) {
        Random random = new Random();
        random.nextBytes(data);
    }
}

For this example we have to use a bigger heap size to avoid excessive garbage collections. Here are the numbers:

[weiqi@gao] $ java -Xmx800M IndexRandomBytes
Inserted 100000 1k records in 11724 milliseconds
Found 380 records whose tenth byte is 42 in 34 milliseconds
[weiqi@gao] $ java -Xmx800M IndexRandomBytes
Inserted 100000 1k records in 28330 milliseconds
Found 402 records whose tenth byte is 42 in 18 milliseconds

From a probability point of view, there should be around 100000/256 = 391 records whose tenth byte is 42. So our results are about right. It is worth pointing out that although the data writes are slowed down, the indexed search is a lot faster than a serial search through a primary database cursor.
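The estimate can be checked with a few lines of plain Java. The sketch below is not part of the original example; the class name TenthByteEstimate and the seed value are hypothetical. It treats the count as a binomial variable with n = 100000 and p = 1/256, and then simulates one run with java.util.Random. Only the first ten bytes of each record are generated, since the rest do not affect the count.

```java
import java.util.Random;

public class TenthByteEstimate {
    static final int NUM_RECORDS = 100000;

    // Expected number of records whose tenth byte equals one given value.
    public static double expectedCount(int n) {
        return n / 256.0;
    }

    // Standard deviation of that count (binomial with p = 1/256).
    public static double standardDeviation(int n) {
        double p = 1.0 / 256.0;
        return Math.sqrt(n * p * (1.0 - p));
    }

    // Simulate one run: draw n random ten-byte prefixes and count how often
    // the byte at index 9 equals 42.
    public static int simulate(int n, long seed) {
        Random random = new Random(seed);
        byte[] prefix = new byte[10];
        int count = 0;
        for (int i = 0; i < n; i++) {
            random.nextBytes(prefix);
            if (prefix[9] == 42) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("expected  = " + expectedCount(NUM_RECORDS));
        System.out.println("std dev   = " + standardDeviation(NUM_RECORDS));
        System.out.println("simulated = " + simulate(NUM_RECORDS, 2006L));
    }
}
```

Under these assumptions the expected count is about 390.6 with a standard deviation of about 19.7, so the observed 380 and 402 both lie within one standard deviation of the mean.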

The Cache, the OS IO Buffer, and the Physical Disk

One of the characteristics of embedded databases is the proximity of the data to the application: it's right there on a local disk. Consequently, a lot more can be done to tune the database with respect to its disk access behavior. This is markedly different from using a relational database server, where the database is always a network call away and there is not a thing the application developer can do about it.

As a non-JE application writes data to the local disk, the data usually pass through three stages. First, the application gathers the data it wants to write out into an application buffer. Then the application buffer is written into a file system buffer. And finally, the operating system syncs the file system buffer onto the physical disk. And then there is the on-board disk hardware write cache, which needs to be disabled if we really want the data to reach the physical disk.

In order for a database to achieve the Durability part of the ACID transaction semantics, it is important that the data be synced to the physical disk. That step is expensive compared to writing to the file system.

JE allows the application developer to choose how far the data goes when a transaction is committed, trading a little bit of durability in the event of an operating system or application failure for throughput gains that are usually dramatic.

In JE terms, a transaction can be committed in one of three ways: commitSync, commitWriteNoSync, and commitNoSync. commitSync will take the data all the way to the disk and takes a long time. commitWriteNoSync takes the data to the operating system level but doesn't wait for all of it to be written to disk. The operating system will make sure it happens at a later time. commitNoSync writes the data to JE's cache, and the data will be written to the operating system when JE's write buffer is filled up or when a subsequent sync transaction happens.
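The three levels line up with the stages of the write path described earlier. The following plain-Java sketch is only an analogy built on java.io; it is not JE code and does not claim to show how JE is implemented. The class DurabilityLevels, its Level enum, and the temporary file name are hypothetical. NO_SYNC leaves the record in an application buffer, WRITE_NO_SYNC flushes it to the operating system, and SYNC additionally asks the operating system to sync it to the device.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

public class DurabilityLevels {
    // Hypothetical names that mirror the three JE commit flavors.
    public enum Level { NO_SYNC, WRITE_NO_SYNC, SYNC }

    // Writes one record and pushes it as far as the level demands.
    // Returns the file length the OS reports right after the "commit",
    // before the stream is closed.
    public static long write(File file, byte[] record, Level level)
            throws IOException {
        FileOutputStream fileOut = new FileOutputStream(file);
        BufferedOutputStream appBuffer =
                new BufferedOutputStream(fileOut, 8192);
        try {
            appBuffer.write(record);        // stage 1: application buffer
            if (level != Level.NO_SYNC) {
                appBuffer.flush();          // stage 2: file system buffer
            }
            if (level == Level.SYNC) {
                fileOut.getFD().sync();     // stage 3: sync to the device
            }
            return file.length();
        } finally {
            appBuffer.close();              // flushes whatever is left
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("durability", ".log");
        file.deleteOnExit();
        byte[] record = new byte[1024];
        for (Level level : Level.values()) {
            System.out.println(level + ": "
                    + write(file, record, level) + " bytes visible");
        }
    }
}
```

With a 1024-byte record and an 8192-byte buffer, the NO_SYNC call reports a length of 0 because the bytes are still sitting in the application buffer, while the other two report 1024. Only the SYNC call has asked the operating system to push the bytes to the device, which is the expensive step.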

How an application commits its transactions can be configured at different times in a JE application. It can be configured environment-wide using EnvironmentConfig, at transaction creation time using TransactionConfig, or at commit time by selecting from the different commit methods.

Wouldn't It Be Nice If...

Using the Bind API, the JE application developer can design the database schema with complex data structures and relationships. However, writing custom bindings and complex queries still requires quite a bit of work.

JE includes two higher level frameworks that ease this burden. The JE Java Collections API allows applications to access JE data through the standard Java Collections API interface. The newer Direct Persistence Layer allows applications to use a mechanism not unlike the Java Persistence API (JPA) to manipulate JE data.

We give a simple example that highlights some of the DPL features.

The DPL uses Java 5 annotations to define @Entity classes with @PrimaryKey and @SecondaryKey markings. An entity is an independent class that has a primary key and is accessed through a primary index. A secondary key can be one-to-one, many-to-one, one-to-many or many-to-many. A secondary key can also be a foreign key that relates one entity to another.

The DPL uses bytecode instrumentation to generate the low level code. The instrumentation can be performed either ahead of time using an Ant task, or at class loading time.

import com.sleepycat.persist.model.Entity;
import com.sleepycat.persist.model.PrimaryKey;
import com.sleepycat.persist.model.Relationship;
import com.sleepycat.persist.model.SecondaryKey;
 
@Entity
public class Order {
    @PrimaryKey
    public int id;
    @SecondaryKey(relate = Relationship.MANY_TO_ONE)
    public int dept;
    public String name;
    public double price;
    public double quantity;
 
    public String toString() {
        StringBuffer buffer = new StringBuffer("Order[");
        buffer.append("id=")
                .append(id)
                .append(",dept=")
                .append(dept)
                .append(",name=")
                .append(name)
                .append(",price=")
                .append(price)
                .append(",quantity=")
                .append(quantity)
                .append("]");
        return buffer.toString();
    }
}

The DPL uses a high-level construct called an EntityStore instead of the low level Databases to manage persistence. The EntityStore is the factory for the generic PrimaryIndex and SecondaryIndex objects. An entity object can be persisted using PrimaryIndex objects and retrieved using PrimaryIndex, SecondaryIndex and EntityCursor objects. The primary and secondary index classes are like DAOs.

import com.sleepycat.je.DatabaseException;
import com.sleepycat.je.Environment;
import com.sleepycat.je.EnvironmentConfig;
import com.sleepycat.je.Transaction;
import com.sleepycat.persist.EntityStore;
import com.sleepycat.persist.PrimaryIndex;
import com.sleepycat.persist.SecondaryIndex;
import com.sleepycat.persist.StoreConfig;
 
import java.io.File;
 
public class PlaceOrder {
    public static void main(String[] args) throws DatabaseException {
        EnvironmentConfig envConfig = new EnvironmentConfig();
        envConfig.setAllowCreate(true);
        envConfig.setTransactional(true);
        Environment env =
               new Environment(new File("./envHome"), envConfig);
 
        StoreConfig storeConfig = new StoreConfig();
        storeConfig.setAllowCreate(true);
        storeConfig.setTransactional(true);
        EntityStore store =
               new EntityStore(env, "OrderStore", storeConfig);
 
        PrimaryIndex<Integer,Order> orderById =
                store.getPrimaryIndex(Integer.class, Order.class);
 
        SecondaryIndex<Integer, Integer, Order> orderByDepartmentId =
                store.getSecondaryIndex(orderById, Integer.class, "dept");
 
        Transaction txn = env.beginTransaction(null, null);

        // Pass the transaction to each put so that all three orders are
        // written as part of the same transaction.
        Order order = new Order();
        order.id = 100;
        order.dept = 3;
        order.name = "Random Order";
        order.price = 10.24;
        order.quantity = 3.0;
        orderById.put(txn, order);

        order = new Order();
        order.id = 101;
        order.dept = 3;
        order.name = "Random Order Again";
        order.price = 20.48;
        order.quantity = 6.0;
        orderById.put(txn, order);

        order = new Order();
        order.id = 102;
        order.dept = 4;
        order.name = "Another Random Order";
        order.price = 40.96;
        order.quantity = 12.0;
        orderById.put(txn, order);

        txn.commit();
 
        store.close();
        env.close();
    }
}

Executing this class' main method puts three orders into the JE environment.

import com.sleepycat.je.DatabaseException;
import com.sleepycat.je.Environment;
import com.sleepycat.je.EnvironmentConfig;
import com.sleepycat.persist.EntityCursor;
import com.sleepycat.persist.EntityStore;
import com.sleepycat.persist.PrimaryIndex;
import com.sleepycat.persist.SecondaryIndex;
import com.sleepycat.persist.StoreConfig;
 
import java.io.File;
 
public class ProcessOrder {
    public static void main(String[] args) throws DatabaseException {
        EnvironmentConfig envConfig = new EnvironmentConfig();
        envConfig.setAllowCreate(true);
        envConfig.setTransactional(true);
        Environment env =
                new Environment(new File("./envHome"), envConfig);
 
        StoreConfig storeConfig = new StoreConfig();
        storeConfig.setAllowCreate(true);
        storeConfig.setTransactional(true);
        EntityStore store =
                new EntityStore(env, "OrderStore", storeConfig);
 
        PrimaryIndex<Integer,Order> orderById =
                store.getPrimaryIndex(Integer.class, Order.class);
 
        SecondaryIndex<Integer, Integer, Order> orderByDepartmentId =
                store.getSecondaryIndex(orderById, Integer.class, "dept");
 
        EntityCursor<Order> orders = orderById.entities();
        inspectOrders(orders);
        orders.close();
 
        orders = orderByDepartmentId.subIndex(3).entities();
        inspectOrders(orders);
        orders.close();
 
        store.close();
        env.close();
    }
 
    private static void inspectOrders(EntityCursor<Order> orders) {
        int orderCount = 0;
 
        for (Order order : orders) {
            System.out.println(order);
            orderCount++;
        }
        System.out.println("Number of orders: " + orderCount);
    }
 
}

Executing this class' main method retrieves and displays the orders in the JE environment. The orders are retrieved twice. First, the primary index is used to get all the orders, which returns three orders. Second, the secondary index is used to get all orders with a department ID of 3, which returns two orders.

[weiqi@gao] $ java PlaceOrder
[weiqi@gao] $ java ProcessOrder
Order[id=100,dept=3,name=Random Order,price=10.24,quantity=3.0]
Order[id=101,dept=3,name=Random Order Again,price=20.48,quantity=6.0]
Order[id=102,dept=4,name=Another Random Order,price=40.96,quantity=12.0]
Number of orders: 3
Order[id=100,dept=3,name=Random Order,price=10.24,quantity=3.0]
Order[id=101,dept=3,name=Random Order Again,price=20.48,quantity=6.0]
Number of orders: 2
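Conceptually, the two retrievals behave like lookups in a pair of sorted maps: one from primary key to entity, and one from secondary key to the set of primary keys that share it. The sketch below models that idea with java.util collections. It illustrates the MANY_TO_ONE relationship only, as a mental model; it says nothing about how JE lays out its data, and the class IndexModel and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

public class IndexModel {
    // "Primary index": order id -> department (standing in for the entity).
    private final Map<Integer, Integer> deptById =
            new TreeMap<Integer, Integer>();
    // "Secondary index": department -> ids of all orders in that department.
    private final Map<Integer, TreeSet<Integer>> idsByDept =
            new TreeMap<Integer, TreeSet<Integer>>();

    // Insert or update an order, keeping the secondary index consistent.
    public void put(int id, int dept) {
        Integer oldDept = deptById.put(id, dept);
        if (oldDept != null) {
            idsByDept.get(oldDept).remove(id);
        }
        TreeSet<Integer> ids = idsByDept.get(dept);
        if (ids == null) {
            ids = new TreeSet<Integer>();
            idsByDept.put(dept, ids);
        }
        ids.add(id);
    }

    // All order ids in primary key order.
    public List<Integer> allIds() {
        return new ArrayList<Integer>(deptById.keySet());
    }

    // Ids of the orders in one department; empty if there are none.
    public List<Integer> idsInDept(int dept) {
        TreeSet<Integer> ids = idsByDept.get(dept);
        return ids == null
                ? new ArrayList<Integer>()
                : new ArrayList<Integer>(ids);
    }

    public static void main(String[] args) {
        IndexModel model = new IndexModel();
        model.put(100, 3);
        model.put(101, 3);
        model.put(102, 4);
        System.out.println("All orders: " + model.allIds());
        System.out.println("Department 3: " + model.idsInDept(3));
    }
}
```

Loaded with the same three orders, the model returns all three ids from the primary map and ids 100 and 101 for department 3, matching the counts shown above.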

Summary

Berkeley DB Java Edition brings the high-performance, small-footprint tradition of Berkeley DB Core Edition into the Java world. By implementing the core engine in pure Java, the JE developers made it more attractive to Java developers.

JE and other Java-based embedded database engines make writing a new class of applications feasible in the Java programming language.

The JE Direct Persistence Layer provides a JPA-like plain Java object persistence mechanism that is powerful and easy-to-use.

References

[1] Berkeley DB Java Edition Home
http://www.oracle.com/database/berkeley-db/index.html
[2] Berkeley DB Java Edition FAQ
http://www.oracle.com/technology/products/berkeley-db/faq/je_faq.html
[3] Berkeley DB Java Edition Forum
http://forums.oracle.com/forums/forum.jspa?forumID=273
[4] Dr. Dobb's Journal article on Berkeley DB Java Edition by Charles Lamb
http://www.ddj.com/184406147