Google Collections, Part 2
By Dan Lewis, OCI Senior Software Engineer
February 2009
Introduction
Data Structures are fundamental to computer software. They allow the efficient storage, manipulation, and retrieval of data. Java's implementation of the canonical Data Structures is called Java Collections Framework. Google Collections, an open source contribution from Google extends the Java Collections Framework. Google Collections is not a replacement for the Java Collections Framework, but rather supplements and extends the Java Collections Framework to provide more power and flexibility to the programmer.
This article is preceded by the April 2008 edition of the Google Collections SETT article. This article may be read independently of the previous article, but newcomers to Google Collections are encouraged to read the previous article.
Disclaimer: Google Collections library is currently under development and continues to evolve as of the time of this writing, and some APIs may be subject to change.
Topics
The following topics demonstrate selected Google Collections features and review the Java Collections Framework upon which it is based. The full source code of the examples is available.
- Use the Right Collection
- Take the Inverse View
- Check Types at Runtime
- Enforce Immutability
- Filter Out Unwanted Objects
- Collect User-Defined Types
- Case Study: Find the Right Blood Donor
1. Use the Right Collection
Think outside the List. Java Collections Framework and Google Collections provide many useful data structures beyond the List. Sometimes, a lot of code can be saved by using the correct collection.
Set
Sets do not allow repeated elements and do not preserve order. This example uses the newHashSet(E... elements)
static method of the Sets
class that accepts a varargs array of elements. Note that you don't have to specify the types on the right hand side of the assignment. Arrays or Lists that have uniqueness checks around every add method are good candidates to be refactored to be Sets.
Set<String> dontForget = Sets.newHashSet("Keys", "Badge", "Wallet");
dontForget.add("Keys"); // ignored, set will still only contain one instance of "Keys"
Multiset
Multisets (also known as bags) do allow repetition and do not preserve order. Subclass TreeMultiset keeps elements sorted according to their "natural" sort order (see http://java.sun.com/docs/books/tutorial/collections/interfaces/order.html) or can accept a Comparator. All Multisets know how many occurrences of each object are contained within and provide a count method. See the "Bag Groceries with Multiset" example in the previous Google Collections SETT article.
Multiset<String> groceries = Multisets.newHashMultiset("Apple", "Orange", "Orange", "Banana", "Pear");
System.out.printf("groceries contains %d oranges\n", groceries.count("Orange"));
The output is:
groceries contains 2 oranges
List
Lists allow repetition and preserve order. The following list contains the combination to a combination lock. The newLinkedList(E...elems)
method was deprecated between versions 0.6 and 0.8 of Google Collections because of a conflict with another method of the same name that accepts an Iterator. The newArrayList(E...elems)
method may well be deprecated by 1.0 for the same reason. If this occurs, Collections.addAll()
provides an alternative easy way to create populated lists.
List<Integer> combination = Lists.newArrayList(36, 22, 36);
Map
Maps don't allow duplicate keys, and each key maps to only one value. Multiple keys may refer to the same value. The newHashMap(K, V)
, newHashMap(K, V, K, V)
, and newHashMap(K, V, K, V, K, V)
methods were deprecated between 0.6 and 0.8 and will likely not return.
- Map<String, Integer> numLegs = Maps.newHashMap();
- numLegs.put("Horse", 4);
- numLegs.put("Zebra", 4);
- numLegs.put("Spider", 8);
- numLegs.put("Monkey", 2);
BiMap
BiMaps
require unique keys and values but provide an inverse view. Review the example below, and also see the "Do Inverse Lookups with BiMap
" example in the previous Google Collections SETT article.
- BiMap<String, String> usCoinsFrontBack1998 = Maps.newHashBiMap();
- usCoinsFrontBack1998.put("Abraham Lincoln", "Lincoln Memorial");
- usCoinsFrontBack1998.put("Thomas Jefferson", "Monticello");
- usCoinsFrontBack1998.put("Franklin D. Roosevelt", "Olive branch, Torch, Oak Branch");
- usCoinsFrontBack1998.put("George Washington", "Eagle");
- //usCoinsFrontBack1998.put("Not George Washington", "Eagle"); // java.lang.IllegalArgumentException
-
- String query1 = "Franklin D. Roosevelt";
- System.out.printf("front = %s, back = %s\n", query1, usCoinsFrontBack1998.get(query1));
-
- String query2 = "Monticello";
- System.out.printf("back = %s, front = %s\n", query2,
- usCoinsFrontBack1998.inverse().get(query2)); // now use the inverse lookup capability
The output is:
front = Franklin D. Roosevelt, back = Olive branch, Torch, Oak Branch
back = Monticello, front = Thomas Jefferson
Multimap
Multimap
allows keys to map to multiple values. Pass a key into get()
on a Multimap
and it will return a collection of all of the values associated with that key. a collection. Google provides and implements three subinterfaces: ListMultimap
, SetMultimap
, and SortedSetMultimap
. ListMultimap
allows duplicate key-value pairs and preserves insertion order of values. SetMultimap
disallows multiple key-value pairs. SortedSetMultimap
is a SetMultimap
that sorts the values for each key. The following example uses HashMultimap
, an implementation of SetMultimap
. One use of Multimap
is to check the frequency of a particular key. See the "Check the Frequency with Multimap
" example in the previous Google Collections SETT article.
- Multimap<String, String> pets = Multimaps.newHashMultimap();
- pets.put("John", "Cat");
- pets.put("John", "Dog");
- pets.put("Anne", "Fish");
- pets.put("Anne", "Bird");
- pets.put("Anne", "Bunny");
- pets.put("Anne", "Cat");
- pets.put("Dan", "Cat");
-
- final Collection<String> annePets = pets.get("Anne");
- System.out.printf("Anne's pets: %s\n", Join.join(", ", annePets));
The output is:
Anne's pets: Bunny, Bird, Cat, Fish
2. Take the Inverse View
BiMap
supports an inverse view, but Google Collections makes it easy to invert any Map
or Multimap
. Multimaps.forMap()
converts any Map
to a Multimap
. Once you have a Multimap
, call Multimaps.inverseHashMultimap()
, Multimaps.inverseArrayListMultimap()
or Multimaps.inverseTreeMultimap()
for an inverted view.
- SetMultimap<String, Integer> numLegsMulti = Multimaps.forMap(numLegs);
- HashMultimap<Integer, String> invNumLegsMulti = Multimaps.inverseHashMultimap(numLegsMulti);
- int numLegsQuery = 4;
- System.out.printf("animals with %d legs = %s\n", numLegsQuery,
- Join.join(", ", invNumLegsMulti.get(numLegsQuery)));
The output is:
animals with 4 legs = Horse, Zebra
3. Check Types at Runtime
One criticism of Java generics is that type information is "erased" at runtime. Java Collections Framework as of version 1.5 added the ability to create a "Checked" view of any Collection
, List
, Map
, Set
, SortedMap
, or SortedSet
. Note that the first example provides a compile warning but allows the put()
at runtime.
- Map<String, Integer> map = Maps.newHashMap();
- Map mapRaw = map;
- mapRaw.put("abc", "xyz"); // no RuntimeException
However, by calling Collections.checkedMap()
we add runtime checking of the types. Note the runtime exception thrown on the put:
- Map<String, Integer> map = Maps.newHashMap();
- Map<String, Integer> mapChecked = Collections.checkedMap(map, String.class,
- Integer.class);
- Map mapCheckedRaw = mapChecked;
- mapCheckedRaw.put("abc", "xyz"); // java.lang.ClassCastException
As of 0.8 Google Collections doesn't provide this capability for Multimap
. It would be an easy exercise to add this capability using ForwardedMultimap
. BiMap
and Multiset
are accepted by Collections.checkedMap()
and Collections.checkedCollection()
, respectively.
4. Enforce Immutability
Immutability is the inability to change. The first snippet shows a typical, mutable map:
- Map<String, Integer> map = Maps.newHashMap();
- map.put("abc", 123); // no RuntimeException
Now, we create a map using the ImmutableMap
factory method, and notice that we get a runtime exception if we attempt to add to it.
- Map<String, Integer> map = ImmutableMap.of("abc", 123);
- map.put("def", 456); // java.lang.UnsupportedOperationException
The of()
factory methods allow a max of three elements. Use the provided Builder
to add more elements:
- Map<String, Integer> map = new ImmutableMap.Builder<String, Integer>().put("abc", 123).build();
- map.put("def", 456); // java.lang.UnsupportedOperationException
Sometimes we want to take an existing map and make it immutable:
- Map<String, Integer> map = Maps.newHashMap();
- map.put("abc", 123); // no RuntimeException
-
- Map<String, Integer> immutableMap = ImmutableMap.copyOf(map);
- immutableMap.put("def", 456); // java.lang.UnsupportedOperationException
Java Collections Framework provides this capability on its own without Google Collections:
- Map<String, Integer> map = Maps.newHashMap();
- map.put("abc", 123); // no RuntimeException
-
- Map<String, Integer> immutableMap = Collections.unmodifiableMap(map);
- immutableMap.put("def", 456); // java.lang.UnsupportedOperationException
5. Filter Out Unwanted Objects
Google Collections provides a Constraint interface and some factory methods to apply constraints to existing collections. For the first example, we take a HashSet
and prove that adding null to it is allowed:
- Set<String> set = Sets.newHashSet();
- set.add("A");
- set.add(null);
In this example, we constrain a set to not allow null additions using the built-in notNull()
implementation of Constraint
:
- Set<String> set = Sets.newHashSet();
- Set<String> constrainedSet = Constraints.constrainedSet(set, Constraints.notNull());
- constrainedSet.add("A");
- constrainedSet.add(null); // NullPointerException here
The Constraint
s class provides methods constrainedCollection()
, constrainedList()
,constrainedMultiset()
, constrainedSortedSet()
in addition to constrainedSet()
.
6. Collect User-Defined Types
The examples in this article up to this point collect the built-in Java types, e.g. String, Integer, etc. Java Collections Framework and Google Collections demonstrate increased value when applied to more sophisticated user-defined types, i.e. types the programmer defines to represent abstractions in his/her problem domain.
The prerequisite to collecting a user-defined type is defining high quality equals()
and hashCode()
methods. See items #8 and #9 in Effective Java by Josh Bloch. The following class has high quality equals()
and hashCode()
methods.
- public class Combination {
- private int a;
- private int b;
- private int c;
-
- public Combination(int a, int b, int c) {
- this.a = a;
- this.b = b;
- this.c = c;
- }
-
- @Override public boolean equals(Object o) {
- if (o == this) {
- return true;
- }
- if (!(o instanceof Combination)) {
- return false;
- }
- Combination rhs = (Combination) o;
- return rhs.a == a && rhs.b == b && rhs.c == c;
- }
-
- @Override public int hashCode() {
- int result = 17;
- result = 31 * result + a;
- result = 31 * result + b;
- result = 31 * result + c;
- return result;
- }
- }
Now let's use Google Collections' BiMap
to collect our combinations and map them to our lockers.
- BiMap<String, Combination> locks = Maps.newHashBiMap();
- locks.put("school", new Combination(36, 22, 36));
- locks.put("ymca", new Combination(14, 7, 28));
- locks.put("club", new Combination(3, 18, 6));
Weeks later a combination is found written on a slip of paper in the junk drawer. To which locker does it refer?
BiMap<Combination, String> locksInv = locks.inverse();
System.out.println(locksInv.get(new Combination(36, 22, 36)));
The output is:
school
This works because the Combination
class provides an override for equals()
. If it didn't, the default definition inherited from Object
would be used, and the query into the locksInv
Map
would have returned null
. The default implementation of equals()
within the Object class returns true
only if both objects are the same object:
- public boolean equals(Object obj) {
- return (this == obj);
- }
Also note that because Combination
overrides equals()
, it must also, according to contract, override hashCode()
as well. From the Object.equals()
javadocs:
Note that it is generally necessary to override the hashCode method whenever this method is overridden, so as to maintain the general contract for the hashCode method, which states that equal objects must have equal hash codes.
Ordering
Sorting is another area where user-defined types can leverage JCF and Google Collections. To take advantage, implement the Comparable
interface. As an example, let's consider a time slip. A time slip is a paper that a driver receives after a drag race. It shows his elapsed time for the distance he/she raced. This TimeSlip
class implements the Comparable
interface and overrides equals()
and hashCode()
:
- public class TimeSlip implements Comparable<TimeSlip> {
- String driver;
- int elapsedTime;
-
- public TimeSlip(String driver, int elapsedTime) {
- this.driver = driver;
- this.elapsedTime = elapsedTime;
- }
-
- @Override
- public boolean equals(Object o) {
- if (o == this) {
- return true;
- }
- if (!(o instanceof TimeSlip)) {
- return false;
- }
- TimeSlip rhs = (TimeSlip) o;
- return rhs.driver.equals(driver) && rhs.elapsedTime == elapsedTime;
- }
-
- @Override
- public int hashCode() {
- int result = 17;
- result = 31 * result + driver.hashCode();
- result = 31 * result + elapsedTime;
- return result;
-
- }
-
- public int compareTo(TimeSlip timeSlip) {
- if (timeSlip == this) {
- return 0;
- }
- if (elapsedTime < timeSlip.elapsedTime) {
- return -1;
- } else if (elapsedTime == timeSlip.elapsedTime) {
- return 0;
- } else {
- return +1;
- }
- }
-
- @Override
- public String toString() {
- return String.format("%s/%d", driver, elapsedTime);
- }
- }
Now TimeSlip
instances can be collected with ordering collections. Using Google Collections' TreeMultimap
, lets collect last weekend's drag racing results between Dan, Carey, and Dean. Because we have implemented Comparable
, TreeMultimap
will do the work of automatically sorting the TimeSlips
as they are added. And because it's a Multimap
, TreeMultimap
will also allow multiple values per key.
- Multimap<String, TimeSlip> weekend = Multimaps.newTreeMultimap();
- weekend.put("Friday", new TimeSlip("Carey", 13));
- weekend.put("Friday", new TimeSlip("Dan", 14));
- weekend.put("Sunday", new TimeSlip("Dean", 10));
- weekend.put("Friday", new TimeSlip("Dean", 12));
- weekend.put("Saturday", new TimeSlip("Dan", 13));
- weekend.put("Saturday", new TimeSlip("Dean", 11));
- weekend.put("Sunday", new TimeSlip("Dan", 12));
- weekend.put("Saturday", new TimeSlip("Carey", 12));
- weekend.put("Sunday", new TimeSlip("Carey", 11));
-
- System.out.printf("Friday's Times: %s\n", Join.join(", ", weekend.get("Friday")));
- System.out.printf("Saturday's Times: %s\n", Join.join(", ", weekend.get("Saturday")));
- System.out.printf("Sunday's Times: %s\n", Join.join(", ", weekend.get("Sunday")));
The output is:
Friday's Times: Dean/12, Carey/13, Dan/14
Saturday's Times: Dean/11, Carey/12, Dan/13
Sunday's Times: Dean/10, Carey/11, Dan/12
Observe that the results are in order, fastest to slowest, and grouped by day.
7. Case Study: Find the Right Blood Donor
In this exercise Google Collections and Java Collections Framework are used in concert to explore the problem of matching blood donors to recipients. First, we enumerate the blood types:
- enum BloodType {
- OPOS, APOS, BPOS, ABPOS, ONEG, ANEG, BNEG, ABNEG
- }
Now, we choose Multimap
to store the compatibility matrix of blood types. The first putAll(K key, V... values)
encodes the rule that ONEG can donate to ONEG, OPOS, ANEG, APOS, BNEG, BPOS, ABNEG, and ABPOS. Let's make it immutable because this information will not change during the execution of the program and to reflect the stability of these rules:
- // donor -> recipient red blood cell compatibility (source = http://en.wikipedia.org/wiki/Blood_type)
- Multimap<BloodType, BloodType> donorToRecipients = new ImmutableMultimap.Builder<BloodType, BloodType>().
- putAll(ONEG, ONEG, OPOS, ANEG, APOS, BNEG, BPOS, ABNEG, ABPOS).
- putAll(OPOS, OPOS, APOS, BPOS, ABPOS).
- putAll(ANEG, ANEG, APOS, ABNEG, ABPOS).
- putAll(APOS, APOS, ABPOS).
- putAll(BNEG, BNEG, BPOS, ABNEG, ABPOS).
- putAll(BPOS, BPOS, ABPOS).
- putAll(ABNEG, ABNEG, ABPOS).
- putAll(ABPOS, ABPOS).
- build();
Later in the exercise we'll want to determine the compatible donors for a recipient. So, we invert the Multimap
:
- // recipient -> donor red blood cell compatibility
- Multimap<BloodType, BloodType> recipientToDonor = Multimaps.inverseHashMultimap(donorToRecipients);
Now, let's collect some fictional people and their blood types, and create an inverse of this map as well.
- // cast member -> blood type N -> 1
- Map<String,BloodType> castBloodTypes = new ImmutableMap.Builder<String,BloodType>().
- put("Fred", ONEG).
- put("Wilma", APOS).
- put("Pebbles", ANEG).
- put("Barney", ABPOS).
- put("Betty", OPOS).
- put("Bamm-Bamm", ANEG).
- build();
-
- Multimap<BloodType, String> castBloodTypesInv = Multimaps.inverseHashMultimap(Multimaps.forMap(castBloodTypes));
Barney cuts his finger and needs blood, so let's do some lookups in our maps to find compatible donors from within the cast.
- // Barney cuts his finger and needs blood. Who from the cast can donate?
- String injured = "Barney";
-
- // create an empty set to hold the possible donors for later display
- Set<String> possibleDonors = new HashSet<String>();
-
- // look up the blood type of the injured person
- BloodType injuredType = castBloodTypes.get(injured);
-
- // look up the compatible donor types for the injured person's blood type
- Collection<BloodType> compatibleDonorTypes = recipientToDonor.get(injuredType);
-
- // iterate over each compatible donor bood type
- for (BloodType compatibleDonorType : compatibleDonorTypes) {
-
- // find cast members with the compatible type
- Collection<String> castMembersWithType = castBloodTypesInv.get(compatibleDonorType);
-
- // add all cast members with the compatible type to the set of possible donors
- possibleDonors.addAll(castMembersWithType);
- }
-
- possibleDonors.remove(injured); // the donor shouldn't try to donate to him/herself
- possibleDonors.remove("Pebbles"); // too young to donate
- possibleDonors.remove("Bamm-Bamm"); // too young to donate
-
- System.out.printf("%s can donate to %s\n", Join.join(", ", possibleDonors), injured);
-
The output is:
Betty, Fred, Wilma can donate to Barney
Going further, let's capture the distribution of blood types for a reference population in another map. Let's use this information to choose donor based on the availability of people with the same blood type.
Disclaimer: this probably isn't the best algorithm, and may not even be a good one
- // blood type -> percentage in USA population (source = http://en.wikipedia.org/wiki/Blood_type)
- Map<BloodType, Double> bloodTypeDistributionUSA = new ImmutableMap.Builder<BloodType, Double>()
- .put(OPOS, 37.4)
- .put(APOS, 35.7)
- .put(BPOS, 8.5)
- .put(ABPOS, 3.4)
- .put(ONEG, 6.6)
- .put(ANEG, 6.3)
- .put(BNEG, 1.5)
- .put(ABNEG, 0.6)
- .build();
-
- // create a Function that looks up the percentage in population based on Blood Type
- Function<BloodType, Double> bloodTypePct = Functions.forMap(bloodTypeDistributionUSA);
-
- // create a Function that looks up the Blood Type for a person
- Function<String, BloodType> personToBloodType = Functions.forMap(castBloodTypes);
-
- // create a Function that looks up the percentage in population with the same Blood Type as a given person
- Function<String, Double> personToBloodTypePct = Functions.compose(bloodTypePct, personToBloodType);
-
- // user our composed lookups to create a Comparator that sorts by availability
- Comparator<String> personToBloodTypePctComparator = Comparators.fromFunction(personToBloodTypePct);
-
- // reverse the comparator to prefer blood types with higher availability (descending)
- Comparator<String> personToBloodTypePctComparatorDescend =
- Collections.reverseOrder(personToBloodTypePctComparator);
-
- // create a Set of donors that is sorted according to blood type availability, favoring higher availability
- Set<String> possibleDonorsOptimized = Sets.newTreeSet(personToBloodTypePctComparatorDescend, possibleDonors);
-
- System.out.printf("%s (sorted in order of preference) can donate to %s",
- Join.join(", ", possibleDonorsOptimized), injured);
Note that we use some of the built-in "Functions" that Google collections provides. Functions.forMap()
creates a function that simply does a lookup in a Map
. Functions.compose()
composes two functions, e.g. f(g(x)). Comparators.fromFunction()
creates a Comparator given a Function, and we convert the Set of possible donors to an array to facilitate sorting. Finally, we use Java Collections Framework Collections.reverseOrder()
to flip the sort order.
The output is:
Betty, Wilma, Fred (sorted in order of preference) can donate to Barney
Considerations
Please consider the items listed below before adopting Google Collections. This will help you to avoid bugs and the need for refactoring later.
Tests
There are still no published tests for the library. An outstanding issue (#54) on this subject explains that the tests are present and being extricated from other non-open source dependencies at Google.
Java Versions Supported
Google Collections target Java 1.5. kevinb9n (Kevin Bourrillion): "We will upgrade to requiring Java 6, but will create a Java 5-compatible branch and will include both forms in our release." (1 Nov. 2007) however, "this will probably not happen for a long time" (6 Jan. 2009)
Version
The stated version of Google Collections is 0.8 alpha.
License
Apache License 2.0: http://www.apache.org/licenses/LICENSE-2.0 The Apache License 2.0 is approved by the OSI via the License Review Process. http://www.opensource.org/licenses/alphabetical
Alternatives
Apache Commons Collections - seeks to build upon the JDK classes by providing new interfaces, implementations and utilities.
Commons-Collections with Generics - A Java 5 generics-enabled version of the popular Jakarta Commons-Collections project.
References
- [1] Google Collections Project Page on Google Code
http://code.google.com/p/google-collections/ - [2] Apache Commons Collections
http://commons.apache.org/collections - [3] Commons-Collections with Generics
http://sourceforge.net/projects/collections - [4] Java Collections Framework
http://java.sun.com/javase/6/docs/technotes/guides/collections/index.html - [5] Full Java 5 compatible (Java 6 not required) Source Code for Above Scenarios
sourcecode.zip