Six JDK Classes You Think You Know

By Weiqi Gao, OCI Principal Software Engineer

December 2010

The Times Are Changing

For over ten years, the OCI Java News Brief series has been bringing noteworthy new developments in the Java space to its loyal readers.

Simply by following this series alone, you've learned eight scripting languages for the JVM, ten web frameworks, seven ways to mock your tests, and countless cool open source libraries to make your job easier!

However, with the progress of time, the Java platform has become more mature and has established itself as the development platform of choice for many application domains. Naturally, the jobs of some Java developers are much more mainstream than cutting-edge now.

With this change of pace of evolution, our focus naturally shifts from the shiny new libraries and tools to something more fundamental—the basic language constructs and the JDK class libraries—because these are the bread and butter of everyday life of the Java developer.

Over the last few years, I have had many moments of puzzlement, not over brand new third-party library classes, but over basic JDK classes. After studying the JDK online documentation for a little while, or for a couple of days, depending on the situation, I would come to a moment of clarification and, sometimes, an a-ha! moment. I have made a habit of sharing these tidbits of insight on my blog under the category of Friday Java Quizzes.

In this article, I will offer some of those quizzes for your holiday season brain-teaser solving pleasure. Remember, the point is not that these classes are that important, nor that it is essential to know this trivia by heart; after all, you can always look it up when you need to. But I do believe that, as professional Java developers, we should make an effort to gain a deeper understanding of the basic properties of the JDK classes. That way, we may become more proficient in our daily work, and the products we build may become more robust.

In the following section, each topic is presented as a quiz followed by an explanation. It may be more fun to give yourself a few minutes to ponder each question before reading the explanation.

Do you know about …?

`java.lang.Double.MIN_VALUE`

Q: What does the following Java program print?

public class Foo {
    public static void main(String[] args) {
        System.out.println(Math.min(Double.MIN_VALUE, 0.0d));
    }
}

A: Unlike Integer, where the MAX_VALUE and MIN_VALUE constants represent the greatest and least possible int values, both MAX_VALUE and MIN_VALUE of the Double class are positive numbers. Double.MIN_VALUE is 2^-1074, the smallest positive nonzero double value.

So the above program will print 0.0.

While we are talking about Doubles, I should also mention another aspect of double calculations that most developers don't pay much attention to: the presence of Infinity, NaN, and -0.0, and the rules that govern arithmetic involving them. For example, the expression 1.0 / 0.0 evaluates to Infinity and will not cause an ArithmeticException to be thrown. Also note that the comparison x == Double.NaN always evaluates to false, even if x itself is a NaN. To test if x is a NaN, one should use the method call Double.isNaN(x).
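
The rules above are easy to check for yourself. The following is a minimal sketch (the class name DoubleEdgeCases is only illustrative) that exercises MIN_VALUE, Infinity, NaN, and -0.0 in one place:

```java
final class DoubleEdgeCases {
    public static void main(String[] args) {
        // MIN_VALUE is the smallest positive double, so 0.0 is the smaller of the two.
        System.out.println(Math.min(Double.MIN_VALUE, 0.0d));   // 0.0
        // The most negative finite double is -Double.MAX_VALUE.
        System.out.println(-Double.MAX_VALUE < 0.0d);           // true

        // Floating-point division by zero yields an infinity, not an exception.
        System.out.println(1.0 / 0.0);                          // Infinity
        System.out.println(1.0 / -0.0);                         // -Infinity

        // NaN is never equal to anything, including itself.
        double nan = 0.0 / 0.0;
        System.out.println(nan == Double.NaN);                  // false
        System.out.println(Double.isNaN(nan));                  // true

        // 0.0 and -0.0 are equal under ==, yet Double.compare tells them apart.
        System.out.println(0.0 == -0.0);                        // true
        System.out.println(Double.compare(0.0, -0.0) > 0);      // true
    }
}
```

Note that the last two lines mean == and Double.compare() (and therefore Double.equals()) disagree about the zeros, which matters when doubles end up as keys in a collection.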

`java.lang.Thread.setDefaultUncaughtExceptionHandler` and
`java.lang.Runtime.addShutdownHook`

Q: What exit status will the shell see for the following Java process? Will it be 99? Can it be anything other than 99?

public class Main {
    public static void main(String[] args) {
        Runtime.getRuntime().addShutdownHook(new Thread(new Runnable() {
            public void run() {
                // Clean up resources: close open files, close databases, etc.
            }
        }));
        Thread.setDefaultUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler() {
            public void uncaughtException(Thread t, Throwable e) {
                if (e instanceof Error) {
                    System.exit(99);
                }
            }
        });
        // The application proper: starting the GUI, listening on ServerSockets, etc.
    }
}

A: The intention of this code is very clear. We have instructed the JVM to do two things:

Perform some clean up routines at system shutdown time, things like closing communication channels, closing the database, etc.
If any thread of the application that does not already have a thread-specific uncaught exception handler encounters a java.lang.Error, the JVM should exit and return a status of 99 to the operating system.

Imagine, if you will, an application that should be up all the time but can't, because there is a slow memory leak that's prone to cause the application to die of java.lang.OutOfMemoryError once in a while. Fixing the leak or leaks is among the long-term goals of the application development team. The above uncaught exception handler is used in concert with a shell script or batch file that restarts the application automatically if the exit status is 99.

The answer of course is that the exit status of the Java process is not necessarily 99. It is not even certain that the process will terminate after the uncaught exception handler kicks in.

From a puzzle solver's perspective, the first thing you can try is to let the shutdown hook do something like

System.exit(42);

This will cause the JVM to block indefinitely. The javadoc for Runtime.exit(int) clearly states that "If this method is invoked after the virtual machine has begun its shutdown sequence then if shutdown hooks are being run this method will block indefinitely." And System.exit() will call Runtime.exit().

Similarly, if the code in the shutdown hook causes another Error to be thrown, then the uncaught exception handler will kick in again, calling System.exit(99) a second time, causing the indefinite block.

The second thing to try is to call

Runtime.getRuntime().halt(42);

from the shutdown hook. This will cause the Java process to exit with a status of 42.

However, in real code, you don't call the exit() or halt() methods willy-nilly. And even if the shutdown hook has only legitimate resource cleanup code, there is still a chance for things to go wrong. If the exception that started the shutdown sequence is an OutOfMemoryError, then chances are high that the cleanup code will experience the same error again, causing an indefinite block.

If the shutdown hook calls a native method, then there is also a chance for the native code to cause the process to terminate in ways that cannot be anticipated from the Java side.
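
One way to hedge against the indefinite block is to arm a watchdog before asking for an orderly exit. The following is only a sketch under stated assumptions, not a recommendation from the JDK documentation: the class name RestartOnError, the grace period, and the injectable exit and halt actions are all hypothetical, and the lambdas require Java 8 or later. It relies on the documented behavior that Runtime.halt() does not wait for running shutdown hooks.

```java
import java.util.function.IntConsumer;

final class RestartOnError {
    static final int RESTART_STATUS = 99;

    /**
     * Builds a handler that requests an orderly exit but arms a watchdog first.
     * exitAction and haltAction are injectable (a choice made for this sketch)
     * so the logic can be exercised without terminating the JVM.
     */
    static Thread.UncaughtExceptionHandler handler(
            final long graceMillis, final IntConsumer exitAction, final IntConsumer haltAction) {
        return (thread, throwable) -> {
            if (!(throwable instanceof Error)) {
                return; // same policy as the quiz code: only Errors end the process
            }
            Thread watchdog = new Thread(() -> {
                try {
                    Thread.sleep(graceMillis);
                } catch (InterruptedException e) {
                    // fall through and halt anyway
                }
                haltAction.accept(RESTART_STATUS); // halt() skips shutdown hooks
            }, "exit-watchdog");
            watchdog.setDaemon(true); // daemon threads keep running during shutdown
            watchdog.start();
            exitAction.accept(RESTART_STATUS); // runs shutdown hooks; may block
        };
    }

    /** Production wiring: System.exit first, Runtime.halt after the grace period. */
    static void install(long graceMillis) {
        Thread.setDefaultUncaughtExceptionHandler(
                handler(graceMillis, System::exit, Runtime.getRuntime()::halt));
    }
}
```

If the shutdown hooks finish within the grace period, the process exits with 99 through the normal path; otherwise the watchdog forces the same status. The sketch is not bulletproof: after an OutOfMemoryError, even starting the watchdog thread may fail.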

`java.util.Properties.load`

Q: Suppose we have the following foo.properties file:

Start Time: Fri Sep 24 11\:02\:45 CDT 2010

After loading this file into a java.util.Properties object, what is the value of the "Start Time" property in the object?

A: Since normal properties files are usually of the format

foo.bar=The value of the property

we don't think too much about the properties file format. However, if you read up on the javadoc of the java.util.Properties.load(Reader) method, you will see a quite detailed specification.

You will learn, for example, that the equals sign ("="), the colon (":"), and white space characters other than a line terminator (that is, the space, tab, and form feed characters) separate the key from the element, and that they can be escaped with the backslash character.

Therefore the answer to the quiz is that there is no "Start Time" property in the object. Instead, there is a "Start" property with a value of "Time: Fri Sep 24 11:02:45 CDT 2010". Had the properties file contained the following:

Start-Time: Fri Sep 24 11\:02\:45 CDT 2010

then we would have a "Start-Time" property with the value "Fri Sep 24 11:02:45 CDT 2010".
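
The parsing rule can be tried without a file by loading from a StringReader. This is a minimal sketch; the class name PropertiesKeys and the parse helper are only illustrative. It also shows a third option that the rule implies: escaping the space in the key with a backslash.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

final class PropertiesKeys {
    /** Parses properties text held in memory instead of in a file. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e); // not expected for a StringReader
        }
        return props;
    }

    public static void main(String[] args) {
        // In a Java string literal, "\\" produces the single backslash the file would contain.
        Properties plain = parse("Start Time: Fri Sep 24 11\\:02\\:45 CDT 2010\n");
        System.out.println(plain.getProperty("Start Time"));   // null
        System.out.println(plain.getProperty("Start"));        // Time: Fri Sep 24 11:02:45 CDT 2010

        // Escaping the space keeps it as part of the key.
        Properties escaped = parse("Start\\ Time: Fri Sep 24 11\\:02\\:45 CDT 2010\n");
        System.out.println(escaped.getProperty("Start Time")); // Fri Sep 24 11:02:45 CDT 2010
    }
}
```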

While we are talking about java.util.Properties.load(), I should also mention that there are two versions of this method: one that takes a Reader, and another that takes an InputStream. A properties file that is loaded via an InputStream is assumed to use the ISO 8859-1 character encoding. This can be markedly different from the system default file encoding represented by the system property file.encoding, which varies with the operating system and its locale settings.
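
To see the difference between the two overloads, the following sketch loads the same UTF-8 bytes both ways. The class name PropertiesEncoding, the load helper, and the sample property are hypothetical, and StandardCharsets requires Java 7 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class PropertiesEncoding {
    // "name=cafe" with an e-acute (U+00E9) at the end, encoded as UTF-8.
    // The e-acute becomes the two bytes 0xC3 0xA9.
    static final byte[] UTF8_BYTES = "name=caf\u00e9".getBytes(StandardCharsets.UTF_8);

    /** Loads the bytes through a UTF-8 Reader or through the raw InputStream. */
    static String load(boolean asUtf8Reader) {
        Properties props = new Properties();
        try {
            if (asUtf8Reader) {
                props.load(new InputStreamReader(
                        new ByteArrayInputStream(UTF8_BYTES), StandardCharsets.UTF_8));
            } else {
                props.load(new ByteArrayInputStream(UTF8_BYTES)); // decoded as ISO 8859-1
            }
        } catch (IOException e) {
            throw new IllegalStateException(e); // not expected for in-memory streams
        }
        return props.getProperty("name");
    }

    public static void main(String[] args) {
        System.out.println(load(true).length());  // 4: the value was decoded as intended
        System.out.println(load(false).length()); // 5: each UTF-8 byte became its own char
    }
}
```

The InputStream version does not fail; it quietly produces a different string, which is why this kind of mismatch tends to be noticed late.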

`java.lang.String.getBytes`

Q: What does the following Java program print?

import java.util.Arrays;
 
public class Main {
    public static void main(String[] args) throws Exception {
        char[] chars = new char[] {'\u0097'};
        String str = new String(chars);
        byte[] bytes = str.getBytes();
        System.out.println(Arrays.toString(bytes));
    }
}

A: What we are doing here looks innocent enough: We started off with a char array with only one char in it, the '\u0097' character. We then made a string out of the char array. We then asked the string for its underlying byte array. Finally, we printed the byte array out.

Since \u0097 is within the 8-bit range, it is reasonable to guess that the str.getBytes() call will return a byte array that contains one element with a value of -105 ((byte) 0x97).

However, that's not what the program prints. As a matter of fact, the output of the program is operating system and locale dependent. On Windows XP with the US locale, the above program prints

[63]

and on a Linux system it prints

[-62, -105]

What's going on here?

To understand the behavior of this program, we need to know about how Unicode characters are represented in Java char values and in Java strings, and what role character encoding plays in String.getBytes().

The Unicode standard is a complicated specification with many intricate details. For our purpose it helps to know that in Unicode every character has a code point that ranges from U+0000 to U+10FFFF. Every character between U+0000 and U+FFFF is represented as one char value whose numeric value is exactly the same as the Unicode code point. Every character between U+10000 and U+10FFFF is represented by two char values whose numeric values are reserved by the Unicode standard to not represent any characters. In our program the single char in the char array represents the Unicode character U+0097.
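
The distinction between char values and code points can be observed directly. In this small sketch, the class name CodePoints and the choice of U+1F600 as a character beyond U+FFFF are only illustrative:

```java
final class CodePoints {
    public static void main(String[] args) {
        String single = "\u0097";                                  // U+0097: one char
        String pair = new String(Character.toChars(0x1F600));      // U+1F600: two chars

        System.out.println(single.length());                       // 1
        System.out.println(pair.length());                         // 2
        System.out.println(pair.codePointCount(0, pair.length())); // 1

        // The two chars come from the ranges reserved for this purpose.
        System.out.println(Character.isHighSurrogate(pair.charAt(0))); // true
        System.out.println(Character.isLowSurrogate(pair.charAt(1)));  // true
        System.out.println(Integer.toHexString(pair.codePointAt(0)));  // 1f600
    }
}
```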

To convert a string to a byte array, Java needs to go through the characters that the string represents and turn each one into a number of bytes and finally put the bytes together. The rule that maps each Unicode character into a byte array is called a character encoding.

Some character encoding schemes map only a small, selected subset of the Unicode characters. For example, the US-ASCII encoding only recognizes the 128 7-bit ASCII characters from U+0000 to U+007F and maps each of these characters into a byte that has the same numeric value, i.e., 0 to 127.

The ISO-8859-1 encoding only recognizes the 256 8-bit characters from U+0000 to U+00FF and maps each of these characters into a byte that has the same numeric value, i.e., 0 to 127 followed by -128 to -1. Since Java byte values are signed, the characters with the eighth bit set are mapped to negative numbers.

The UTF-8 encoding, on the other hand, recognizes all Unicode characters and maps each character to one, two, three, or four bytes, depending on which code point range the character belongs to. It maps our character U+0097 into two bytes: -62 and -105.

The Cp1252 encoding, which is used on Windows, recognizes some, but not all, characters in the U+0000 to U+00FF range, plus some characters outside this range, such as the character U+20AC (the euro sign, €). It maps each of the characters it recognizes into one byte. It does not recognize the character U+0097.

When we call str.getBytes() without specifying a character encoding scheme, the JVM uses the default scheme to do the job. The default encoding scheme is operating system and locale dependent. On Linux, it is typically UTF-8. That explains the output we get from our program on Linux machines. On Windows with a US locale, the default encoding is Cp1252. Since it does not recognize our character U+0097, it encodes it as the byte value 63. That explains the output we get from our program on Windows machines with a US locale. String.getBytes() replaces any character that the encoding does not recognize with the encoding's default replacement, which for the encodings discussed here is the byte 63, the character U+003F (the question mark, ?).
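
The platform dependence disappears once the encoding is named explicitly. The following is a minimal sketch, assuming Java 7 or later for StandardCharsets; the class name ExplicitCharsets and its helpers are only illustrative:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class ExplicitCharsets {
    /** Encodes s with the given charset and renders the bytes like the quiz does. */
    static String encode(String s, Charset charset) {
        return Arrays.toString(s.getBytes(charset));
    }

    /** Number of bytes UTF-8 needs for a single code point. */
    static int utf8Length(int codePoint) {
        return new String(Character.toChars(codePoint)).getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String str = "\u0097";
        // Platform dependent, as in the quiz:
        System.out.println(Charset.defaultCharset() + " " + Arrays.toString(str.getBytes()));
        // Platform independent:
        System.out.println(encode(str, StandardCharsets.UTF_8));      // [-62, -105]
        System.out.println(encode(str, StandardCharsets.ISO_8859_1)); // [-105]
        System.out.println(encode(str, StandardCharsets.US_ASCII));   // [63]
        // windows-1252 is not among the charsets every JVM must provide, so check first.
        if (Charset.isSupported("windows-1252")) {
            System.out.println(encode(str, Charset.forName("windows-1252")));
        }
        // UTF-8 byte counts for U+0041, U+0097, U+20AC, and U+1F600.
        System.out.println(utf8Length(0x41) + " " + utf8Length(0x97) + " "
                + utf8Length(0x20AC) + " " + utf8Length(0x1F600));    // 1 2 3 4
    }
}
```

Passing a Charset to getBytes(), and to the matching String constructor when decoding, is usually the simplest way to keep this quiz out of production code.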

`java.sql.ResultSet.updateRow`

Q: In JDBC, is there a way to update a row in a database table without issuing a direct call to something similar to the following code?

stmt.execute("update table1 set item2 = 1024 where item1 = 1");

A: Most Java developers are familiar with the java.sql.ResultSet interface. In the normal JDBC workflow, you get a Connection from the environment, create a Statement from the Connection, and execute a query using the Statement. The executeQuery call returns a ResultSet, which can be iterated over to get the values of each column of each row in the result set.

However, since JDBC 2.0, the ResultSet interface also provides a set of updater methods that allow the Java program to update the affected columns and rows in the database table.

Assume we have a table named table1 with integer columns item1 and item2 where item1 is the primary key. Assume further that the table contains the following rows:

weiqi=# select * from table1;
 item1 | item2
-------+-------
     1 |  1024
     2 |  2048
     3 |  3072
(3 rows)

The following snippet of Java code can then be used to update this table by multiplying the item2 column by 2 for all rows:

Statement statement = conn.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE,
        ResultSet.CONCUR_UPDATABLE);
ResultSet resultSet = statement.executeQuery("select item1, item2 from table1");
while (resultSet.next()) {
    int item2 = resultSet.getInt(2);
    resultSet.updateInt(2, item2 * 2);
    resultSet.updateRow();
}

After running this code, the table would contain the following rows:

weiqi=# select * from table1;
 item1 | item2
-------+-------
     1 |  2048
     2 |  4096
     3 |  6144
(3 rows)
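
For use outside a snippet, the update loop can be wrapped in a method that also closes its resources. This is only a sketch under stated assumptions: the class and method names are hypothetical, try-with-resources requires Java 7 or later, and the getConcurrency() check reflects the fact that a driver which cannot honor CONCUR_UPDATABLE may hand back a read-only result set rather than fail at creation time.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

final class UpdatableResultSets {
    /** Doubles item2 in every row of table1 and returns the number of rows updated. */
    static int doubleItem2(Connection conn) throws SQLException {
        try (Statement statement = conn.createStatement(
                     ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_UPDATABLE);
             ResultSet resultSet = statement.executeQuery("select item1, item2 from table1")) {
            // Verify that the driver really gave us an updatable result set.
            if (resultSet.getConcurrency() != ResultSet.CONCUR_UPDATABLE) {
                throw new SQLException("driver returned a read-only result set");
            }
            int updated = 0;
            while (resultSet.next()) {
                resultSet.updateInt(2, resultSet.getInt(2) * 2); // change the row in memory
                resultSet.updateRow();                           // write the row to the table
                updated++;
            }
            return updated;
        }
    }
}
```

Whether the changes are committed immediately depends on the auto-commit setting of the Connection passed in.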

Furthermore, the ResultSet interface provides a moveToInsertRow() method that allows the Java program to move the cursor to an insert row. You can update the individual columns of this row to the desired values by calling the updater methods on the ResultSet and then make a call to the insertRow() method to insert the new row into the database table.

The following snippet of Java code inserts a new row into the table:

resultSet.moveToInsertRow();
resultSet.updateInt(1, 4);
resultSet.updateInt(2, 8192);
resultSet.insertRow();
resultSet.moveToCurrentRow();

The last line moves the cursor back to where it was before. After running this code, the table would contain the following rows:

weiqi=# select * from table1;
 item1 | item2
-------+-------
     1 |  2048
     2 |  4096
     3 |  6144
     4 |  8192
(4 rows)

Pretty cool, huh?

Summary

I hope you enjoyed the quiz. And I hope that I have convinced you that the JDK itself is a vast space that may contain many gems and it is worth it to spend some time to get to know parts of the JDK just a little bit better.

By knowing the fundamentals better, we can strive to write more succinct code, to reinvent the wheel less often, to reduce unnecessary dependencies, and to produce lean and efficient applications.

I would like to thank Brian Coyner and Lance Finney for their help in reviewing this article.

References

[1] Java SE 6 Documentation
http://download.oracle.com/javase/6/docs/