Java 8 Project Lambda
By Rad Widmer, OCI Senior Software Engineer
February 2013
Introduction
Java language designer Daniel Smith describes the forthcoming enhancements in Java 8 as "dramatic and necessary". He isn't exaggerating. On the language side, major new features include
- lambdas (closures)
- default methods for interfaces
- enhanced type inference (target typing)
And on the libraries side, there will be major enhancements to the Collections libraries, with the introduction of the Stream framework and related interfaces. All of this will enable us to write better Java code, using a more "fluent", functional, and declarative style, with much less boilerplate code for many common use cases. And the Java language designers have managed to do all this in a way that extends naturally from the existing language. Even so, it will bring a significant paradigm shift in how we program and how application code and the libraries interact. In a nutshell, it will enable applications to focus more on the "what", and let the libraries take care of the "how".
In this article we discuss the new language features coming in Java 8, as well as the most important enhancements to the standard libraries, specifically the new Stream interface. Note that as of this writing (January 2013), the Project Lambda features have not been finalized, and the final version will likely differ somewhat from what is described here.
Lambda Expressions
Lambdas, also known as closures, are simply anonymous functions - i.e., functions with no name. The syntax is
(parameters) -> expression
(parameters) -> {statements…}
A lambda expression has the same elements as a Java method: a list of parameters, a body, and a return value. The types of the parameters may be given explicitly, or they can be inferred from the target type (more on target types later). The body can be either an expression or a block of statements. A lambda can also throw exceptions.
Let's look at some examples.
Runnable r = () -> System.out.println("hello world");
r.run(); // prints "hello world"
Here a lambda expression is assigned to a Runnable and implements the Runnable.run() method. The compiler determines what type the lambda should be from the type of the target (Runnable in this case). This is a simple instance of target typing. Contrast this with how you would have to code the above example in Java 7 using an anonymous inner class. Most of the code here is boilerplate.
Runnable r = new Runnable() {
    @Override
    public void run() {
        System.out.println("hello world");
    }
};
The next example has a single parameter. Note that the parentheses around the parameter list are optional when there is exactly one parameter and its type is inferred.
FileFilter filter = f -> f.getAbsolutePath().endsWith(".txt");
return filter.accept(new File("myfile.txt")); // returns true
The lambda expression is assigned to a FileFilter instance. Again, the compiler knows the type of the lambda based on the type of the target, and it also knows the parameter f is a File because of the signature of the FileFilter.accept(File f) method. It doesn't hurt to add the parameter types in the lambda expression, but it usually is not needed.
In the next example a lambda expression implements a Comparator.
// sort a list by lastName
List<Person> persons = ...;
persons.sort((p1, p2) -> p1.getLastName().compareTo(p2.getLastName()));
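The three snippets above can be pulled together into one small program. This is only a sketch: the class name LambdaBasicsDemo and the nested Person class are assumptions invented for illustration, and the code is written against the Java 8 API as it was eventually released.

```java
import java.io.File;
import java.io.FileFilter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LambdaBasicsDemo {
    // Hypothetical Person class, defined here only to keep the example self-contained.
    static class Person {
        private final String lastName;
        Person(String lastName) { this.lastName = lastName; }
        String getLastName() { return lastName; }
    }

    // The lambda is typed as a FileFilter, so f is inferred to be a File.
    static boolean isTextFile(File file) {
        FileFilter filter = f -> f.getAbsolutePath().endsWith(".txt");
        return filter.accept(file);
    }

    // Sorts a copy of the list with a Comparator lambda and returns the last names in order.
    static List<String> sortedLastNames(List<Person> people) {
        List<Person> copy = new ArrayList<>(people);
        copy.sort((p1, p2) -> p1.getLastName().compareTo(p2.getLastName()));
        List<String> names = new ArrayList<>();
        for (Person p : copy) {
            names.add(p.getLastName());
        }
        return names;
    }

    public static void main(String[] args) {
        Runnable r = () -> System.out.println("hello world");
        r.run();
        System.out.println(isTextFile(new File("myfile.txt")));
        System.out.println(sortedLastNames(Arrays.asList(
                new Person("Strayhorn"), new Person("Ellington"), new Person("Basie"))));
    }
}
```

Each lambda gets its type from the place where it is used: a local variable declaration in the first two cases and a method argument in the third.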
Variable Capture
A lambda expression may reference variables which are visible in the current scope -- this is what makes lambdas closures instead of simply functions. The referenced variables must be either final or "effectively final". An effectively final variable meets the same restrictions as final variables -- it's only assigned to once, but it doesn't have the final keyword.
void doRun(String msg) {
    Runnable r = () -> System.out.println(msg);
    r.run();
}
A reference to this inside a lambda expression refers to the enclosing class instance. The lambda instance itself has no this.
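The capture rule and the meaning of this can be checked with a short experiment. The sketch below is illustrative only; the class CaptureDemo and its method names are made up for this example, and Supplier is used simply as a convenient functional interface that returns a value.

```java
import java.util.function.Supplier;

class CaptureDemo {
    private final String name = "outer instance";

    // msg is never reassigned, so it is effectively final and may be captured.
    static Supplier<String> greeter(String msg) {
        return () -> "hello " + msg;
    }

    // Inside a lambda, 'this' is the enclosing CaptureDemo instance.
    Supplier<String> lambdaThis() {
        return () -> this.name;
    }

    // Inside an anonymous inner class, 'this' is the inner instance,
    // so the enclosing instance must be named explicitly.
    Supplier<String> innerClassThis() {
        return new Supplier<String>() {
            private final String name = "inner instance";
            @Override
            public String get() {
                return this.name + " / " + CaptureDemo.this.name;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(greeter("world").get());
        System.out.println(new CaptureDemo().lambdaThis().get());
        System.out.println(new CaptureDemo().innerClassThis().get());
    }
}
```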
In contrast to anonymous inner classes, lambdas don't introduce another level of scope. Lambdas reside at the same level as the enclosing context. One consequence of this is that the lambda's parameter names can't shadow any local variable names in the enclosing method. The following example would not compile.
String a, b;
List<String> strings = ...;
// compile error -- lambda parameter names shadow local variables
strings.sort((a, b) -> a.compareTo(b));
Functional Interfaces
In the above examples, we saw lambda expressions assigned to common existing JDK interfaces. So what is the type of a lambda expression? The Java language designers chose not to introduce a new "function type" to support lambdas. Instead, they decided to build on what was already in the language by defining a special category of interfaces called Functional Interfaces. A functional interface is any interface that has exactly one abstract method. It is this abstract method which is implemented by the lambda expression, and the lambda expression evaluates to an instance of the functional interface. By this definition, many existing interfaces in the JDK are functional interfaces, and can be implemented by lambda expressions. These include
- java.lang.Runnable
- java.util.Comparator
- java.beans.PropertyChangeListener
- java.awt.event.ActionListener
Java 8 introduces several new functional interfaces, designed specifically to work with enhancements to the Collections and related libraries, but useful in their own right. These include the following from the java.util.functions package. Note that the new JDK8 function APIs are still evolving and the released version will likely differ somewhat from what is described here.
interface | method | purpose |
---|---|---|
Predicate<T> | boolean test(T t); | Returns true if the input matches some criteria. |
Supplier<T> | T get(); | A supplier of objects. The result objects are either created during the invocation of get or by some prior action. |
Block<T> | void accept(T t); | Performs operations on an object. Can change the state of this object or other objects. |
Function<T, R> | R apply(T t); | Maps an input object of type T to an appropriate output object of type R. |
MultiFunction<T, U> | void apply(Collector collector, T element); | Maps a T value to multiple U values. |
BinaryOperator<T> | T operate(T left, T right); | Performs an operation with two operands and returns a result of the same type. |
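To make the table concrete, here is a small sketch that implements each kind of interface with a lambda. It targets the API as finally released rather than the snapshot described here: in the released Java 8 the package is java.util.function, Block shipped under the name Consumer, BinaryOperator's method is apply rather than operate, and MultiFunction was not included. The class name and constants are assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.Supplier;

class FunctionalInterfacesDemo {
    static final Predicate<String> IS_SHORT = s -> s.length() < 80;          // boolean test(T)
    static final Supplier<List<String>> NEW_LIST = () -> new ArrayList<>();  // T get()
    static final Function<String, Integer> LENGTH = s -> s.length();         // R apply(T)
    static final BinaryOperator<Integer> ADD = (left, right) -> left + right; // T apply(T, T)

    // Sums the lengths of the strings accepted by IS_SHORT and hands each
    // accepted string to the caller's Consumer (void accept(T)).
    static int totalLengthOfShortStrings(List<String> input, Consumer<String> onAccepted) {
        int total = 0;
        for (String s : input) {
            if (IS_SHORT.test(s)) {
                onAccepted.accept(s);
                total = ADD.apply(total, LENGTH.apply(s));
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> seen = NEW_LIST.get();
        int total = totalLengthOfShortStrings(Arrays.asList("ab", "cde"), seen::add);
        System.out.println(total + " " + seen);
    }
}
```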
Target Typing
The syntax for lambda expressions doesn't include the functional interface type. So how does the compiler know the type of the lambda? It is determined by the context in which the lambda is used. The context must have a target type, T, and that is the type of the lambda expression. The compiler determines whether the lambda expression is compatible with the target type by checking the following conditions:
- T is a functional interface type
- the lambda expression has the same number and types of parameters as T's function
- the value returned by the lambda expression is compatible with the function's return type
- any exceptions thrown by the lambda are compatible with the function's throws clause
These rules imply that the same lambda expression can have a different type in different contexts.
interface Worker {
void doWork();
}
// Lambda type is Runnable
Runnable r = () -> System.out.println("no params lambda");
// Lambda type is Worker
Worker w = () -> System.out.println("no params lambda");
The target type can be specified with a type cast.
// compile error -- target type is Object
Object r = () -> System.out.println("hello"); // Compile Error
// Casting can be used to specify a target type.
Object r = (Runnable) () -> System.out.println("hello");
Since the compiler "knows" the parameter types, they usually don't need to be specified in the lambda.
// s is type String because the target function's signature is test(String s)
Predicate<String> p = s -> s.length() < 80;
// f is type File because the target function's signature is accept(File f)
FileFilter filter = f -> f.getAbsolutePath().endsWith(".txt");
The lambda must not throw any exceptions that are not declared by the target type.
DateFormat fmt = new SimpleDateFormat("yyyyMMdd");
// compile error -- DateFormat.parse(String s) throws ParseException
Function<String, Date> strToDate = d -> fmt.parse(d);
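There are two common ways to satisfy this rule, sketched below under the assumption that the released java.util.function.Function is the target type. One is to catch the checked exception inside the lambda and rethrow it unchecked; the other is to declare a functional interface whose method lists the exception. The names CheckedExceptionDemo and ParsingFunction are hypothetical and exist only for this illustration.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.function.Function;

class CheckedExceptionDemo {
    // Option 1: Function.apply declares no checked exceptions, so the
    // ParseException is caught in the lambda body and rethrown unchecked.
    static Function<String, Date> parser(String pattern) {
        DateFormat fmt = new SimpleDateFormat(pattern);
        return s -> {
            try {
                return fmt.parse(s);
            } catch (ParseException e) {
                throw new IllegalArgumentException("not a date: " + s, e);
            }
        };
    }

    // Option 2: a hypothetical functional interface whose single abstract
    // method declares the exception, so the lambda may throw it directly.
    interface ParsingFunction<T, R> {
        R apply(T t) throws ParseException;
    }

    static ParsingFunction<String, Date> checkedParser(String pattern) {
        DateFormat fmt = new SimpleDateFormat(pattern);
        return s -> fmt.parse(s);
    }

    public static void main(String[] args) throws ParseException {
        System.out.println(parser("yyyyMMdd").apply("20130207"));
        System.out.println(checkedParser("yyyyMMdd").apply("20130207"));
    }
}
```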
Method References
Frequently a lambda expression will simply make a call to a method, passing along the lambda's parameters. In cases like this, there is a new abbreviated syntax called method reference that omits the parameters. There are actually several types of method references, each with similar syntax and slightly different semantics. The following table summarizes the different types. In the table, C is a class name, m is a method name, v is a variable, and the lambda's parameters are (a, b, c).
type of method called | syntax | invocation |
---|---|---|
static method | C::m | C.m(a, b, c) |
instance method | C::m | a.m(b, c) |
particular instance method | v::m | v.m(a, b, c) |
constructor | C::new | new C(a, b, c) |
Note that when referencing an instance method, the first parameter becomes the receiver in the invocation.
The following code shows examples of pairs of equivalent lambdas implemented first without and then with a method reference.
// invoke a static method
Block<String[]> b1 = s -> Arrays.sort(s);
Block<String[]> b2 = Arrays::sort; // method reference
// invoke an instance method
Predicate<String> p1 = s -> s.isEmpty();
Predicate<String> p2 = String::isEmpty; // method reference
// Comparator calls an instance method
listOfStrings.sort((s1, s2) -> s1.compareTo(s2));
listOfStrings.sort(String::compareTo); // method reference
// use an instance of SimpleDateFormat to convert a Date to a String
SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMdd");
Function<Date, String> dateToStr1 = d -> fmt.format(d);
Function<Date, String> dateToStr2 = fmt::format; // method reference
// Supplier.get() returns a new JPanel instance
Supplier<JPanel> panelMaker1 = () -> new JPanel();
Supplier<JPanel> panelMaker2 = JPanel::new; // method reference
JPanel p = panelMaker2.get(); // assigns a new JPanel to p.
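The four kinds of method reference can be exercised in one compilable unit. The following is a sketch against the released Java 8 API, so Consumer stands in for the snapshot's Block; the class and method names are assumptions invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.Supplier;

class MethodReferenceDemo {
    // Static method, C::m. Equivalent to s -> Arrays.sort(s).
    static String[] sortedCopy(String[] input) {
        String[] copy = input.clone();
        Consumer<String[]> sorter = Arrays::sort;
        sorter.accept(copy);
        return copy;
    }

    // Instance method, C::m. The first parameter becomes the receiver: s -> s.isEmpty().
    static boolean isEmpty(String s) {
        Predicate<String> empty = String::isEmpty;
        return empty.test(s);
    }

    // Particular instance method, v::m. Equivalent to s -> prefix.concat(s).
    static String withPrefix(String prefix, String s) {
        Function<String, String> prepend = prefix::concat;
        return prepend.apply(s);
    }

    // Constructor, C::new. Equivalent to () -> new ArrayList<String>().
    static List<String> newList() {
        Supplier<List<String>> maker = ArrayList::new;
        return maker.get();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sortedCopy(new String[] {"b", "c", "a"})));
        System.out.println(isEmpty(""));
        System.out.println(withPrefix("lambda", "-expression"));
        System.out.println(newList());
    }
}
```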
Default Methods
Up until now, once an interface was added to the JDK, its specification was frozen. Adding new methods to an existing interface would break compatibility with older programs which implemented the interface. The old program would throw a nasty error if the new method was called, since that method was not implemented. This made it difficult to evolve APIs, and led to misplaced methods and "garbage" classes. Consider the Collections.sort(List list) method. It would have made more sense to add a sort method to the List interface, but that couldn't be done. To improve this situation, Java 8 introduces default methods. The primary motivation is to support API evolution, and more generally to allow building better libraries.
A default method is an interface method which includes an implementation. Default methods are used extensively in Java 8, both in new interfaces, and as additions to existing library interfaces. For example, here is the definition of Iterable in JDK8.
public interface Iterable<T> {
    Iterator<T> iterator();
    public default void forEach(Block<? super T> block) {
        for (T t : this) {
            block.accept(t);
        }
    }
}
The forEach method is a default method. Note the use of the new default keyword and the addition of a method implementation. Concrete implementations of Iterable do not need to provide an implementation for the forEach method. If none is provided, then the default implementation is used. If one is provided, then it overrides the default implementation. This means that existing programs will continue to compile and run.
While the primary motivation for adding default methods to Java is to make it easier to evolve APIs, there are other benefits as well, which contribute to the creation of better libraries. Consider the Iterator interface. If you've implemented an Iterator, you probably didn't support the remove method. But it still needed to be implemented, and usually it would just throw an exception:
@Override
public void remove() {
throw new UnsupportedOperationException("remove");
}
In Java 8 the Iterator interface contains a default implementation of remove (it's the same as the above implementation), freeing applications from having to implement their own, except in the rare case when a non-default implementation is required.
Default methods are also used in the new functional interfaces. While functional interfaces must contain only one abstract method, they may contain any number of default methods. These come in handy. For example, the Predicate interface contains the default methods and(), negate(), or(), and xor(). Here's the implementation of negate:
public default Predicate<T> negate() {
return (T t) -> !test(t);
}
These default methods enable new predicates to be composed from existing ones.
Predicate<String> p1 = s -> s.endsWith(".htm");
Predicate<String> p2 = s -> s.endsWith(".html");
Predicate<String> p3 = p1.or(p2);
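The composition idea can be tried directly. This sketch uses the Predicate interface as it appears in the released Java 8, which provides and, or, and negate as default methods (xor did not make it into the final API). The class and constant names are assumptions made up for this example.

```java
import java.util.function.Predicate;

class PredicateCompositionDemo {
    static final Predicate<String> HTM = s -> s.endsWith(".htm");
    static final Predicate<String> HTML = s -> s.endsWith(".html");

    // or(): true when either operand is true.
    static final Predicate<String> IS_WEB_PAGE = HTM.or(HTML);

    // and(): true only when both operands are true.
    static final Predicate<String> IS_VISIBLE_WEB_PAGE =
            IS_WEB_PAGE.and(s -> !s.startsWith("."));

    // negate(): logical complement of the original predicate.
    static final Predicate<String> NOT_WEB_PAGE = IS_WEB_PAGE.negate();

    public static void main(String[] args) {
        System.out.println(IS_WEB_PAGE.test("index.html"));
        System.out.println(IS_VISIBLE_WEB_PAGE.test(".hidden.htm"));
        System.out.println(NOT_WEB_PAGE.test("notes.txt"));
    }
}
```

Each composed predicate is itself a Predicate, so the results can be combined further or passed anywhere a Predicate is expected.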
Default methods are virtual methods and can be overridden. What about multiple inheritance and the "diamond problem"? First, Java has always had multiple inheritance of type; now it also has multiple inheritance of behavior. Second, although interfaces can now include code (behavior), they cannot include state. This simplifies the rules for multiple inheritance when there are diamond inheritance hierarchies. The new rules for inheritance are straightforward and intuitive:
- class methods beat interface methods (even if the class methods are abstract)
- overrider beats overridden
- when a class could potentially inherit more than one default method with the same name, the method must be overridden.
This example shows a case where the class beats the interface.
class A {
public void m() {System.out.println("A's m()");}
}
interface B {
default void m() {System.out.println("B's m()");}
}
class C extends A implements B {
void x() {m();} // prints "A's m()"
}
In the next example, class C must implement method m because otherwise the compiler could not choose between the default methods in interfaces A and B. Class C's implementation of m uses the new super syntax for choosing which default method to execute: Interface.super.method().
interface A {
default void m() {System.out.println("A's m()");}
}
interface B {
default void m() {System.out.println("B's m()");}
}
class C implements A, B {
public void m() {B.super.m();} // prints "B's m()"
}
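All three resolution rules can be verified in a single file. The sketch below uses distinct, made-up type names (Left, Right, Parent, and so on) so that the cases can coexist, and returns strings instead of printing so the outcome of each rule is easy to check.

```java
class DefaultMethodResolutionDemo {
    interface Left {
        default String m() { return "Left's m()"; }
    }

    interface Right {
        default String m() { return "Right's m()"; }
    }

    static class Parent {
        public String m() { return "Parent's m()"; }
    }

    // Rule 1: the class method beats the interface default.
    static class ClassWins extends Parent implements Left { }

    // Rule 2: the overrider (LeftChild) beats the overridden (Left).
    interface LeftChild extends Left {
        @Override
        default String m() { return "LeftChild's m()"; }
    }

    static class OverriderWins implements Left, LeftChild { }

    // Rule 3: two unrelated defaults conflict, so m() must be overridden;
    // Right.super.m() selects which inherited default to run.
    static class MustChoose implements Left, Right {
        @Override
        public String m() { return Right.super.m(); }
    }

    public static void main(String[] args) {
        System.out.println(new ClassWins().m());
        System.out.println(new OverriderWins().m());
        System.out.println(new MustChoose().m());
    }
}
```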
Streams
Two of the primary goals for Java 8 are to modernize the Collections library and make parallelism easier. To see how these goals are being met, the place to start is to look at the new Stream interface. Note that the Stream and related interfaces are not yet finalized as of this writing, so the released version will likely be different than the one described here, which is based on the Jan 7, 2013 binary snapshot.
Some key characteristics of a Stream are:
- A Stream is a potentially infinite series of values which can be iterated over and operated on to produce some result.
- A Stream itself contains no storage for the values. Instead it obtains values from a source such as an Iterator or generator function.
- Streams support a functional style of programming, enabling operations to be chained together without requiring any intermediate variables.
- Streams can be either sequential or parallel, at the discretion of the application.
- The details of how the iteration is performed are controlled by the library.
The Stream interface contains two types of methods:
- Intermediate operations -- Transform the input stream into a new Stream of the same or different type
- Terminal operations -- Perform an operation on the input Stream and return a result which is not a stream
Stream operations are typically chained together in a pipeline of one or more intermediate operations followed by a single terminal operation. A typical pattern is generally some combination of filter, map, and reduce operations.
The main methods of the Stream interface are listed below.
method | return type | operation type | category |
---|---|---|---|
filter(Predicate predicate) | Stream | intermediate/lazy | filter |
map(Function mapper) | Stream | intermediate/lazy | map |
map(IntFunction mapper) | IntStream | intermediate/lazy | map |
mapMulti(MultiFunction mapper) | Stream | intermediate/lazy | map |
uniqueElements() | Stream | intermediate/lazy | filter |
sorted(Comparator comparator) | Stream | intermediate/lazy | filter |
forEach(Block block) | void | terminal/eager | |
tee(Block block) | Stream | intermediate/lazy | |
limit(long sizeLimit) | Stream | intermediate/lazy | filter |
substream(long startIndex) | Stream | intermediate/lazy | filter |
substream(long startIndex, long endIndex) | Stream | intermediate/lazy | filter |
into(A target) | A (where A extends Destination) | terminal/eager | |
toArray() | Object[] | terminal/eager | |
reduce(T zero, BinaryOperator reducer) | T | terminal/eager | reduce |
reduce(BinaryOperator reducer) | Optional | terminal/eager | reduce |
reduce(U zero, BiFunction accumulator, BinaryOperator reducer) | U | terminal/eager | reduce |
accumulate(Accumulator reducer) | R | terminal/eager | reduce |
accumulate(Supplier resultFactory, BiBlock accumulator, BiBlock reducer) | R | terminal/eager | reduce |
accumulateConcurrent(ConcurrentTabulator tabulator) | R | terminal/eager | reduce |
max(Comparator comparator) | Optional | terminal/eager | reduce |
min(Comparator comparator) | Optional | terminal/eager | reduce |
anyMatch(Predicate predicate) | boolean | terminal/eager | reduce |
allMatch(Predicate predicate) | boolean | terminal/eager | reduce |
noneMatch(Predicate predicate) | boolean | terminal/eager | reduce |
findFirst() | Optional | terminal/eager | reduce |
findAny() | Optional | terminal/eager | reduce |
sequential() | Stream | intermediate/lazy | Stream conversion |
parallel() | Stream | intermediate/lazy | Stream conversion |
unordered() | Stream | intermediate/lazy | Stream conversion |
Obtaining a Stream
For Collection classes, two new methods were added to the Collection interface for creating Streams which iterate over the Collection.
- Collection.stream() -- returns a sequential Stream
- Collection.parallel() -- returns a parallel Stream
A Stream can also be created given an Iterator or Supplier function (for infinite Streams). To create a Stream that supports parallel operations, a Spliterator is required. Spliterator is a new interface which provides the ability to decompose (split) an aggregate data structure and to iterate over the elements of the aggregate. To perform parallel operations, a Stream recursively splits the original aggregate into smaller pieces, each with its own Spliterator, until a threshold size is reached beyond which any further splitting would just generate additional cost. The resulting Spliterators can then be iterated over in parallel.
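A single split step can be observed by hand. The sketch below assumes the Spliterator interface as it shipped in the released Java 8 (trySplit and forEachRemaining), which may not match the snapshot's method names; SpliteratorDemo and splitOnce are hypothetical names for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

class SpliteratorDemo {
    // Splits the source's Spliterator once and returns the elements covered
    // by each of the two resulting pieces, in encounter order.
    static List<List<Integer>> splitOnce(List<Integer> source) {
        Spliterator<Integer> rest = source.spliterator();
        // trySplit hands off part of the elements to a new Spliterator,
        // or returns null if this one cannot be split any further.
        Spliterator<Integer> prefix = rest.trySplit();

        List<Integer> first = new ArrayList<>();
        List<Integer> second = new ArrayList<>();
        if (prefix != null) {
            prefix.forEachRemaining(first::add);
        }
        rest.forEachRemaining(second::add);
        return Arrays.asList(first, second);
    }

    public static void main(String[] args) {
        List<Integer> source = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8));
        System.out.println(splitOnce(source));
    }
}
```

A parallel stream applies this step recursively and processes the pieces on different threads; here the two pieces are simply drained one after the other.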
Regardless of whether the Stream is sequential or parallel, it can be used in basically the same way. All the details of iterating in parallel are hidden from the application. Here we will focus on sequential streams.
Let's look at how we can handle some common use cases using Streams.
Filter a collection based on some criteria.
// Given a list of Strings, return a new list with null and empty strings removed
List<String> strings = ...;
List<String> filtered = strings.stream().filter(s -> s != null && s.length() > 0)
.into(new ArrayList<String>());
A stream is obtained from the strings list using the stream() method. The filter method returns a new stream containing only those elements which satisfy some condition (a non-empty string in this example). The into() method is a terminal method which populates a destination collection with the elements of a stream. This shows the typical pattern for stream operations - a series of one or more intermediate operations (filter in this case) which return a new Stream, followed by a single terminal operation (into in this example), which returns something other than a stream (an ArrayList in this example).
To extend the above example, let's convert the strings to uppercase and sort them.
List<String> filtered = strings.stream().filter(s -> s != null && s.length() > 0)
.map(s -> s.toUpperCase()).sorted(String::compareTo).into(new ArrayList<String>());
The map method returns a new Stream with the original strings converted to upper case. The sorted method returns a new stream sorted using the given Comparator. Note the use of the method reference String::compareTo. The sorted call could also be written sorted((s1, s2) -> s1.compareTo(s2)).
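For readers working with the Java 8 release rather than the January 2013 snapshot, the same filter, map, and sort pipeline can be written as follows. This is a sketch under the assumption that the released API is used, where the terminal into() call is replaced by collect() with a Collector; the class and method names are made up for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class StreamPipelineDemo {
    // Drops null and empty strings, upper-cases the rest, and sorts them.
    static List<String> cleanUp(List<String> strings) {
        return strings.stream()
                .filter(s -> s != null && s.length() > 0) // intermediate
                .map(String::toUpperCase)                 // intermediate
                .sorted(String::compareTo)                // intermediate
                .collect(Collectors.toList());            // terminal
    }

    public static void main(String[] args) {
        System.out.println(cleanUp(Arrays.asList("pear", null, "", "apple")));
    }
}
```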
The map method can map its input to a different type, as in the next example, which takes a list of albums and compiles a set of artists.
List<Album> albums = ...;
Set<String> artists = albums.stream().map(a -> a.getArtist()).into(new HashSet<String>());
It's easy to pull multiple values from each element of a Stream and "flatten" them into a new Stream. The next example finds all the tracks by a given composer given a Collection of Albums.
Collection<Album> albums = ...;
// Find all tracks by composer
List<Track> tracks = albums.stream()
.<Track>mapMulti((collector, album) -> collector.yield(album.getTracks()))
.filter(track -> track.getComposer().equals("Billy Strayhorn"))
.into(new ArrayList<Track>());
// Note: the explicit <Track> type argument before the mapMulti call shouldn't really be needed, but the compiler
// complains if it's not included.
The key here is the mapMulti method. As the name implies, it is similar to the map method but yields multiple output values for each input. The type of the lambda expression is MultiFunction, which maps a single T to multiple U's. Its function is apply(MultiFunction.Collector collector, T element), where element is the upstream element, and collector is an extra argument supplied by the Stream framework for collecting multiple values from the element. In this example, each Album has one or more tracks, which are yielded to the collector, resulting in a new flattened stream of tracks. This example works whether Album.getTracks() returns a Collection or an array. The yield method is overloaded for a Collection, array of U, Stream, and a single U element, so values can be yielded singly or in aggregates. The results are flattened, so yielding an array containing [1, 2] is the same as calling yield(1), yield(2). The ability to yield values one at a time is important because it avoids the need to create temporary Collections if the aggregate values are not already in a Collection.
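In the API that was eventually released, mapMulti and MultiFunction were dropped in favor of flatMap, which takes a function returning a Stream for each element. The sketch below shows the same flattening idea with flatMap; the Album and Track classes are hypothetical stand-ins defined only so the example compiles on its own.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.stream.Collectors;

class FlatMapDemo {
    // Hypothetical domain classes for illustration only.
    static class Track {
        private final String title;
        private final String composer;
        Track(String title, String composer) { this.title = title; this.composer = composer; }
        String getTitle() { return title; }
        String getComposer() { return composer; }
    }

    static class Album {
        private final List<Track> tracks;
        Album(Track... tracks) { this.tracks = Arrays.asList(tracks); }
        List<Track> getTracks() { return tracks; }
    }

    // Flattens every album's tracks into one stream, then filters by composer.
    static List<String> titlesBy(Collection<Album> albums, String composer) {
        return albums.stream()
                .flatMap(album -> album.getTracks().stream())
                .filter(track -> track.getComposer().equals(composer))
                .map(Track::getTitle)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Album> albums = Arrays.asList(
                new Album(new Track("Lush Life", "Billy Strayhorn"),
                          new Track("Mood Indigo", "Duke Ellington")),
                new Album(new Track("Take the A Train", "Billy Strayhorn")));
        System.out.println(titlesBy(albums, "Billy Strayhorn"));
    }
}
```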
The reduce methods perform a binary operation on the previous result and the next Stream element to produce a new result, then repeat this process until there are no more elements.
int[] intAry = {3, 5, 9};
// add up the elements in intAry
int sum = Arrays.stream(intAry).reduce(0, Integer::sum);
assertEquals(17, sum);
String[] strings = {"a", "b", "c"};
// same as Strings.join(",", strings)
// skip the separator while the running result is still the empty zero value
String result = Arrays.stream(strings).reduce("", (s1, s2) -> s1.isEmpty() ? s2 : s1 + "," + s2);
assertEquals("a,b,c", result);
The first parameter in the reduce method is the zero parameter. It is used as the first result, thus ensuring there is always a final result, even if the stream contains no elements.
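The role of the zero value is easiest to see by comparing the reduce variants on empty and non-empty input. The following sketch is written against the released Java 8 API; the class and method names are assumptions made up for this example, and Collectors.joining is shown as one ready-made way to join strings with a separator.

```java
import java.util.Arrays;
import java.util.Optional;
import java.util.stream.Collectors;

class ReduceDemo {
    // With a zero value: an empty array simply yields the zero.
    static int sum(int[] values) {
        return Arrays.stream(values).reduce(0, Integer::sum);
    }

    // Without a zero value: the result is an Optional, which is empty
    // when the stream has no elements.
    static Optional<String> longest(String[] values) {
        return Arrays.stream(values).reduce((a, b) -> b.length() > a.length() ? b : a);
    }

    // Joining with a separator via a library Collector.
    static String join(String[] values) {
        return Arrays.stream(values).collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[] {3, 5, 9}));
        System.out.println(sum(new int[0]));
        System.out.println(longest(new String[] {"a", "bbb", "cc"}));
        System.out.println(longest(new String[0]).isPresent());
        System.out.println(join(new String[] {"a", "b", "c"}));
    }
}
```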
To conclude, we show two examples of obtaining Streams from an Iterator and generator functions. The first example creates an Iterator to generate the Fibonacci numbers. The second example does the same thing using the new Supplier interface, which is simpler to use when the sequence is infinite (we ignore integer overflow for these examples). Both examples use methods in the new java.util.stream.Streams class to obtain a Stream given some input source.
@Test
public void streamFromIterator() {
Iterator<Integer> fibo = new Iterator<Integer>() {
private int f1 = 0;
private int f2 = 1;
@Override
public boolean hasNext() {
return true; // infinite iterator
}
@Override
public Integer next() {
int f = f1;
int nextf = f1 + f2;
f1 = f2;
f2 = nextf;
return f;
}
};
Stream<Integer> fiboStream = Streams.stream(Streams.spliteratorUnknownSize(fibo),
StreamOpFlag.NOT_SIZED | StreamOpFlag.IS_ORDERED | StreamOpFlag.IS_SORTED);
List<Integer> fibos10 = fiboStream.limit(10).into(new ArrayList<Integer>());
assertArrayEquals(new Integer[]{0, 1, 1, 2, 3, 5, 8, 13, 21, 34}, fibos10.toArray(new Integer[10]));
}
A simpler approach can be used for infinite streams. Instead of an Iterator, a Supplier interface is implemented. It has a single method, get(), which returns the next value in the sequence.
@Test
public void streamFromSupplier() {
Supplier<Integer> fibo = new Supplier<Integer>() {
private int f1 = 0;
private int f2 = 1;
@Override
public Integer get() {
int f = f1;
int nextf = f1 + f2;
f1 = f2;
f2 = nextf;
return f;
}
};
Stream<Integer> fiboStream = Streams.generate(fibo);
List<Integer> fibos10 = fiboStream.limit(10).into(new ArrayList<Integer>());
assertArrayEquals(new Integer[]{0, 1, 1, 2, 3, 5, 8, 13, 21, 34}, fibos10.toArray(new Integer[10]));
}
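In the released Java 8 the factory methods moved from the Streams class onto Stream itself, so the Supplier-based example can be approximated as shown below. This is a sketch under that assumption, with made-up class and method names. It also shows Stream.iterate as a stateless alternative in which each step carries a pair of consecutive Fibonacci numbers. Integer overflow is ignored, as in the examples above.

```java
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FibonacciStreamDemo {
    // Stateful Supplier, used here with a sequential stream only.
    static List<Integer> viaGenerate(int count) {
        Supplier<Integer> fibo = new Supplier<Integer>() {
            private int f1 = 0;
            private int f2 = 1;
            @Override
            public Integer get() {
                int f = f1;
                int nextf = f1 + f2;
                f1 = f2;
                f2 = nextf;
                return f;
            }
        };
        return Stream.generate(fibo).limit(count).collect(Collectors.toList());
    }

    // Stateless alternative: each element is the pair {f(n), f(n+1)}.
    static List<Integer> viaIterate(int count) {
        return Stream.iterate(new int[] {0, 1}, p -> new int[] {p[1], p[0] + p[1]})
                .limit(count)
                .map(p -> p[0])
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(viaGenerate(10));
        System.out.println(viaIterate(10));
    }
}
```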
Summary
Project Lambda is a big win for Java developers. It will enable us to write better code -- shorter, more declarative, easier to understand and maintain. And the cost in terms of additional language complexity is kept to a minimum. If you want to try out the new lambda features, you can download the latest JDK8 binary snapshot from jdk8.java.net/lambda (see the references below). For IDE support, IntelliJ IDEA version 12 has full support for jdk8.
References
- [1] The main Project Lambda page http://openjdk.java.net/projects/lambda/
- [2] Binary snapshots of OpenJDK with Lambda support, for Windows, Solaris, Linux, and Mac OS X. Includes JDK javadoc download http://jdk8.java.net/lambda/
- [3] IntelliJ IDEA 12 includes full support for the new jdk8 language features http://www.jetbrains.com/idea/download/
- [4] Maurice Naftalin's Lambda FAQ http://www.lambdafaq.org/ is an excellent introduction to lambda expressions. Includes a comprehensive resources page.
- [5] "State of the Lambda" by Brian Goetz: A good overview of the new Lambda features by the chief Java language architect of the Lambda project.
- [6] "State of the Lambda: Libraries Edition" by Brian Goetz. A good overview of the proposed library enhancements, primarily focused on Streams.
- [7] "The Road to Lambda" by Brian Goetz
- "Lambda: A Peek Under the Hood" by Brian Goetz focuses on Lambda implementation details.
- [8] "Jump-Starting Lambda" by Stuart Marks and Mike Duigou.
- [9] "Project Lambda in Java SE 8" by Daniel Smith, Java Language Designer: video and slides (pdf).