Google Collections
By Dan Lewis, OCI Senior Software Engineer
April 2008
Introduction
Google Collections is a free, open source library of Java classes that extend and enhance Java's version of the canonical data structures, collectively called the Java Collections Framework. The goal of the project is to extend the Java Collections Framework and make working with it more convenient.
The most significant extensions of the Java Collections Framework are the new collections interfaces and their supporting implementations: BiMap
, MultiMap
, and MultiSet
. They reside within the com.google.common.collect package.
Google Collections also provides various general-purpose utility classes used by the collections package. They reside within the com.google.common.base package.
Please note that there is a follow-up article now available: Google Collections, Part 2.
Scenarios
The following scenarios demonstrate selected Google Collections features. The full source code is available.
- Create Generic Collections Conveniently
- Do Inverse Lookups with
BiMap
- Check the Frequency with
MultiMap
- Bag Groceries with
MultiSet
- Transform Lists Lazily
Scenario #1: Create Generic Collections Conveniently
Google Collections makes creating parameterized collections of Lists
and Sets
convenient.
Before:
- List<String> l1 = Arrays.asList("Monday", "Tuesday", "Wednesday");
- List<Double> l2 = Arrays.asList(1.0, 2.0, 3.0);
- Set<String> s1 = new HashSet<String>();
- Collections.addAll(s1, "Keys", "Badge", "Wallet");
- Map<String,String> m1 = new HashMap<String, String>();
After:
- List<String> l1 = Lists.newArrayList("Monday", "Tuesday", "Wednesday");
- List<Double> l2 = Lists.newLinkedList(1.0, 2.0, 3.0);
- Set<String> s1 = Sets.newHashSet("Keys", "Badge", "Wallet");
- Map<String,String> m1 = Maps.newHashMap();
Note above that Arrays.asList
is part of the standard JDK. However, it always returns an ArrayList
, while Lists contains several options to choose from.
Observe below how Immutable maps of up to five elements can be constructed and initialized on one line.
- enum Party { Elephant, Donkey }
- ...
- Map<Party,String> partyNames = Maps.immutableMap(Donkey, "Democrat", Elephant, "Republican");
- Map<String,Party> affiliations = Maps.immutableMap("Barack", Donkey, "Hillary", Donkey, "John", Elephant);
- Set<String> candidates = Sets.immutableSortedSet("John", "Barack", "Hillary");
- for (String candidate : candidates) {
- System.out.printf("%s is a %s\n", candidate, partyNames.get(affiliations.get(candidate)));
- }
The output is:
Barack is a Democrat
Hillary is a Democrat
John is a Republican
Note that the output is sorted, thanks to the use of Sets.immutableSortedSet()
.
Scenario #2: Do Inverse Lookups with BiMap
From the BiMap
javadocs:
A bimap (or "bidirectional map") is a map that preserves the uniqueness of its values as well as that of its keys. This constraint enables bimaps to support an "inverse view", which is another bimap containing the same entries as this bimap but with reversed keys and values.
In this example, we'll create a two-way mapping between U.S. paper currency note denominations and the portraits each features.
- BiMap<String,String> denomToPortrait = Maps.newHashBiMap();
- denomToPortrait.put("One", "Washington");
- denomToPortrait.put("Five", "Lincoln");
- denomToPortrait.put("Ten", "Hamilton");
- denomToPortrait.put("Twenty", "Jackson");
- denomToPortrait.put("Fifty", "Grant");
- denomToPortrait.put("Hundred", "Franklin");
As in a standard map we can look up the portrait via the denomination:
- System.out.printf("%s is on the %s\n", denomToPortrait.get("Ten"), "Ten");
- System.out.printf("%s is on the %s\n", denomToPortrait.get("Fifty"), "Fifty");
The output is:
Hamilton is on the Ten
Grant is on the Fifty
Now we can do an inverse lookup of the denomination via the portrait:
- BiMap<String, String> portraitToDenom = denomToPortrait.inverse();
- System.out.printf("%s is on the %s\n", "Jackson", portraitToDenom.get("Jackson"));
- System.out.printf("%s is on the %s\n", "Lincoln", portraitToDenom.get("Lincoln"));
The output is:
Jackson is on the Twenty
Lincoln is on the Five
Scenario #3: Check the Frequency with MultiMap
A multimap is a map that may contain multiple values for the same key. In the following examples this behavior is used to find the frequency of a given number of siblings. In other words, assume Jerry and Jay are the only people in the sample that have 3 siblings. The frequency of 3 siblings would thus be 2.
- Multimap<Integer, String> siblings = Multimaps.newHashMultimap();
- siblings.put(0, "Kenneth");
- siblings.put(1, "Joe");
- siblings.put(2, "John");
- siblings.put(3, "Jerry");
- siblings.put(3, "Jay");
- siblings.put(5, "Janet");
-
- for (int i = 0; i < 6; i++) {
- int freq = siblings.get(i).size();
- System.out.printf("%d siblings frequency %d\n", i, freq);
- }
The output is:
0 siblings frequency 1
1 siblings frequency 1
2 siblings frequency 1
3 siblings frequency 2
4 siblings frequency 0
5 siblings frequency 1
There are three subinterfaces of Multimap
: SetMultimap
, ListMultimap
, and SortedSetMultimap
. SetMultimap
ensures unique key/value pairs, while ListMultimap
does not. ListMultimap
, however, maintains the addition order of multiple values, and allows duplicates. SortedSetMultimap
is like SetMultimap
, but the multiple values for a given key are sorted by natural ordering. The example above utilizes a particular implementation of SetMultimap
called HashMultimap
.
Scenario #4: Bag Groceries with MultiSet
A Multiset
, also referred to as a bag in the javadocs, preserves equality even when order is different like a set. Unlike a set, however, a bag may contain duplicates. Google Collections' Multiset
provides a count method for counting duplicate members.
- HashMultiset<String> mon = Multisets.newHashMultiset();
- mon.add("milk");
- mon.add("bread");
-
- HashMultiset<String> thurs = Multisets.newHashMultiset();
- thurs.add("bread");
- thurs.add("milk");
-
- HashMultiset<String> sat = Multisets.newHashMultiset();
- sat.add("milk");
- sat.add("bread");
- sat.add("bread");
- sat.add("bread");
- sat.add("bread");
-
- System.out.printf("mon = thurs : %s\n",mon.equals(thurs));
- System.out.printf("mon = sat : %s\n",mon.equals(sat));
- System.out.printf("mon we bought %d of %s\n", mon.count("milk"), "milk");
- System.out.printf("sat we bought %d of %s\n", sat.count("bread"), "bread");
The output is:
mon = thurs : true
mon = sat : false
mon we bought 1 of milk
sat we bought 4 of bread
Scenario #5: Transform Lists Lazily
The Lists.transform
applies a function to every item in a given list. The function is not called until an element is accessed. This is important because changes in the given list after the transform will be applied.
The following example creates a transformed list that contains the string lengths of the corresponding elements in the original list:
- Function<String, Integer> strlen = new Function<String, Integer>() {
- @Override
- public Integer apply(String from) {
- Preconditions.checkNotNull(from);
- return from.length();
- }
- };
- List<String> from = Lists.newArrayList("abc", "defg", "hijkl");
- List<Integer> to = Lists.transform(from, strlen);
- for (int i = 0; i < from.size(); i++) {
- System.out.printf("%s has length %d\n", from.get(i), to.get(i));
- }
The output is:
abc has length 3
defg has length 4
hijkl has length 5
Now consider this example that detects palindromes:
- Function<String, Boolean> isPalindrome = new Function<String, Boolean>() {
- @Override
- public Boolean apply(String from) {
- Preconditions.checkNotNull(from);
- return new StringBuilder(from).reverse().toString().equals(from);
- }
- };
- List<String> from = Lists.newArrayList("rotor", "radar", "hannah", "level", "botox");
- List<Boolean> to = Lists.transform(from, isPalindrome);
- for (int i = 0; i < from.size(); i++) {
- System.out.printf("%s is%sa palindrome\n", from.get(i), to.get(i) ? " " : " NOT ");
- }
- // changes in the "from" list are reflected in the "to" list
- System.out.printf("\nnow replace hannah with megan...\n\n");
- from.set(2, "megan");
- for (int i = 0; i < from.size(); i++) {
- System.out.printf("%s is%sa palindrome\n", from.get(i), to.get(i) ? " " : " NOT ");
- }
-
The output is:
rotor is a palindrome
radar is a palindrome
hannah is a palindrome
level is a palindrome
botox is NOT a palindrome
now replace hannah with megan...
rotor is a palindrome
radar is a palindrome
megan is NOT a palindrome
level is a palindrome
botox is NOT a palindrome
Note that "hannah" is replaced with "megan" in the "from" list. We can see that our Palindrome function was applied again because this time the third element is no longer a palindrome. This is what is meant by the term "lazy". The transformation doesn't happen immediately when transform is called, but is applied each time an element is accessed.
Preconditions, used in both transform examples above, is part of the com.google.common.base package. It has static methods useful for validating arguments.
Considerations
Please consider the items listed below before adopting Google Collections. This will help you to avoid bugs and the need for refactoring later.
Tests
There are no published tests for the library. An outstanding issue (#54) on this subject explains that the tests are present and being extricated from other non-open source dependencies at Google.
Java Versions Supported
Google Collections target Java 1.5. (In fact, the IntelliJ IDEA Project files expect to find JDK 1.5.0_06). kevinb9n (Kevin Bourrillion): "We will upgrade to requiring Java 6, but will create a Java 5-compatible branch and will include both forms in our release."
Version
The stated version of Google Collections is 0.5 alpha. The imminent branch for Java 6 and the lack of published tests explain the "alpha" designation.
License
Apache License 2.0: http://www.apache.org/licenses/LICENSE-2.0 The Apache License 2.0 is approved by the OSI via the License Review Process. http://www.opensource.org/licenses/alphabetical
Alternatives
Apache Commons Collections - seeks to build upon the JDK classes by providing new interfaces, implementations and utilities.
Commons-Collections with Generics - A Java 5 generics-enabled version of the popular Jakarta Commons-Collections project.
Releases
There have been two "snapshots" made available on the website:
- Mar 21, 2008 Snapshot Release (release notes)
- Oct 22, 2007 Snapshot Release
Naming Convention for Utility Classes
Look for classes with names ending in 's'. These classes are usually good places to start because they contain factory initialization methods. See Lists
, Maps
, and Multimaps
.
Conclusion
Google Collections extends the Java Collections Framework with useful new interfaces and makes working with generics-enabled collections convenient.
Test drive this important and useful library now, but watch for publication of the promised tests and more information about the Java 5 / Java 6 branch before adopting. Also, be sure to evaluate both the Apache Commons Collections and Commons Collections with Generics libraries.
References
- [1] Google Collections Project Page on Google Code
http://code.google.com/p/google-collections/ - [2] What is the Google Collections Library? Interview on JavaLobby by Geertjan Wielenga with Kevin Bourrillion and Jared Levy
http://www.javalobby.org/articles/google-collections/ - [3] Java Collections Framework from Java 5 documentation
http://java.sun.com/j2se/1.5.0/docs/guide/collections/ - [4] Coding in the Small with Google Collections
http://publicobject.com/2007/09/series-recap-coding-in-small-with.html - [5] BiMap javadocs
http://google-collections.googlecode.com/svn/trunk/javadoc/com/google/common/collect/BiMap.html - [6] Apache Commons Collections
http://commons.apache.org/collections - [7] Commons-Collections with Generics
http://sourceforge.net/projects/collections - [8] Full Source Code for Above Scenarios
april2008-src.zip