Lesson: Custom Collection Implementations
Many programmers will never need to implement their own Collection classes. You can go pretty far using the implementations described in the preceding sections of this chapter. However, someday you might want to write your own implementation. It is fairly easy to do this with the aid of the abstract implementations provided by the Java platform. Before we discuss how to write an implementation, let's discuss why you might want to write one.
Reasons to Write an Implementation
The following list illustrates the sort of custom Collections you might want to implement. It is not intended to be exhaustive:
- Persistent: All of the built-in Collection implementations reside in main memory and vanish when the program exits. If you want a collection that will still be present the next time the program starts, you can implement it by building a veneer over an external database. Such a collection might be concurrently accessible by multiple programs.
- Application-specific: This is a very broad category. One example is an unmodifiable Map containing real-time telemetry data. The keys could represent locations, and the values could be read from sensors at these locations in response to the get operation.
- High-performance, special-purpose: Many data structures take advantage of restricted usage to offer better performance than is possible with general-purpose implementations. For instance, consider a List containing long runs of identical element values. Such lists, which occur frequently in text processing, can be run-length encoded: runs can be represented as a single object containing the repeated element and the number of consecutive repetitions. This example is interesting because it trades off two aspects of performance: It requires less space but more time than an ArrayList.
- High-performance, general-purpose: The Java Collections Framework's designers tried to provide the best general-purpose implementations for each interface, but many, many data structures could have been used, and new ones are invented every day. Maybe you can come up with something faster!
- Enhanced functionality: Suppose you need an efficient bag implementation (also known as a multiset): a Collection that offers constant-time containment checks while allowing duplicate elements. It's reasonably straightforward to implement such a collection atop a HashMap.
- Convenience: You may want additional implementations that offer conveniences beyond those offered by the Java platform. For instance, you may frequently need List instances representing a contiguous range of Integers.
- Adapter: Suppose you are using a legacy API that has its own ad hoc collections' API. You can write an adapter implementation that permits these collections to operate in the Java Collections Framework. An adapter implementation is a thin veneer that wraps objects of one type and makes them behave like objects of another type by translating operations on the latter type into operations on the former.
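To make the convenience case concrete, here is a minimal sketch of a List representing a contiguous range of Integers. The class name IntRange, its constructor arguments, and the half-open range convention are assumptions made for this illustration; they are not part of the Java platform. It relies on AbstractList, one of the abstract implementations that the rest of this lesson covers.

```java
import java.util.AbstractList;
import java.util.List;

// Hypothetical convenience list: the integers from start (inclusive) to end (exclusive).
// No backing array is allocated; each element is computed on demand.
final class IntRange extends AbstractList<Integer> {
    private final int start;
    private final int size;

    IntRange(int start, int end) {
        if (end < start) {
            throw new IllegalArgumentException("end < start");
        }
        this.start = start;
        this.size = end - start; // assumes the range length fits in an int
    }

    @Override
    public Integer get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        }
        return start + index;
    }

    @Override
    public int size() {
        return size;
    }
}

class IntRangeDemo {
    public static void main(String[] args) {
        List<Integer> range = new IntRange(3, 7);
        System.out.println(range);             // [3, 4, 5, 6]
        System.out.println(range.contains(5)); // true
        System.out.println(range.indexOf(6));  // 3
    }
}
```

Because set, add, and remove are not overridden, the range is unmodifiable: those calls throw UnsupportedOperationException.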
How to Write a Custom Implementation
Writing a custom implementation is surprisingly easy. The Java Collections Framework provides abstract implementations designed expressly to facilitate custom implementations. We'll start with the following example of an implementation of Arrays.asList.
    // Requires java.util.AbstractList and java.util.List.
    public static <T> List<T> asList(T[] a) {
        return new MyArrayList<T>(a);
    }

    // Fixed-size list view of an array; writes go through to the array.
    private static class MyArrayList<T> extends AbstractList<T> {
        private final T[] a;

        MyArrayList(T[] array) {
            a = array;
        }

        @Override
        public T get(int index) {
            return a[index];
        }

        @Override
        public T set(int index, T element) {
            T oldValue = a[index];
            a[index] = element;
            return oldValue;
        }

        @Override
        public int size() {
            return a.length;
        }
    }
Believe it or not, this is very close to the implementation that is contained in java.util.Arrays. It's that simple! You provide a constructor and the get, set, and size methods, and AbstractList does all the rest. You get the ListIterator, bulk operations, search operations, hash code computation, comparison, and string representation for free.
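The following sketch exercises some of the inherited behavior. ArrayBackedList is a hypothetical stand-in that defines only a constructor plus get, set, and size; every other call in the demo is supplied by AbstractList or its superclass.

```java
import java.util.AbstractList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

// Hypothetical array-backed list: only get, set, and size are written by hand.
class ArrayBackedList<T> extends AbstractList<T> {
    private final T[] a;

    ArrayBackedList(T[] array) {
        a = array;
    }

    @Override
    public T get(int index) {
        return a[index];
    }

    @Override
    public T set(int index, T element) {
        T oldValue = a[index];
        a[index] = element;
        return oldValue;
    }

    @Override
    public int size() {
        return a.length;
    }
}

class InheritedMethodsDemo {
    public static void main(String[] args) {
        String[] backing = {"b", "a", "c"};
        List<String> list = new ArrayBackedList<>(backing);

        System.out.println(list);                                       // [b, a, c]
        System.out.println(list.indexOf("c"));                          // 2
        System.out.println(list.containsAll(Arrays.asList("a", "b")));  // true
        System.out.println(list.equals(Arrays.asList("b", "a", "c")));  // true

        // The inherited ListIterator writes through set(), and so into the array.
        ListIterator<String> it = list.listIterator();
        while (it.hasNext()) {
            it.set(it.next().toUpperCase());
        }
        System.out.println(Arrays.toString(backing));                   // [B, A, C]

        // add was not overridden, so the list is fixed-size.
        try {
            list.add("d");
        } catch (UnsupportedOperationException e) {
            System.out.println("add is unsupported");
        }
    }
}
```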
Suppose you want to make the implementation a bit faster. The API documentation for abstract implementations describes precisely how each method is implemented, so you'll know which methods to override to get the performance you want. The preceding implementation's performance is fine, but it can be improved a bit. In particular, the toArray method iterates over the List, copying one element at a time. Given the internal representation, it's a lot faster and more sensible just to clone the array.
    @Override
    public Object[] toArray() {
        return (Object[]) a.clone();
    }
With the addition of this override and a few more like it, this implementation is exactly the one found in java.util.Arrays. In the interest of full disclosure, it's a bit tougher to use the other abstract implementations because you will have to write your own iterator, but it's still not that difficult.
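As a rough illustration of writing your own iterator, the sketch below builds a bag (multiset) on AbstractCollection and a HashMap. HashBag and its fields are hypothetical names chosen for this example, and the bag is add-only: removal is deliberately left unsupported to keep the iterator short.

```java
import java.util.AbstractCollection;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical add-only bag: each distinct element maps to its occurrence count.
class HashBag<E> extends AbstractCollection<E> {
    private final Map<E, Integer> counts = new HashMap<>();
    private int size;

    @Override
    public boolean add(E e) {
        counts.merge(e, 1, Integer::sum);
        size++;
        return true;
    }

    @Override
    public boolean contains(Object o) {
        // Hash lookup instead of the inherited linear scan.
        return counts.containsKey(o);
    }

    @Override
    public int size() {
        return size;
    }

    @Override
    public Iterator<E> iterator() {
        final Iterator<Map.Entry<E, Integer>> entries = counts.entrySet().iterator();
        return new Iterator<E>() {
            private E current;
            private int remaining; // copies of current still to be returned

            @Override
            public boolean hasNext() {
                return remaining > 0 || entries.hasNext();
            }

            @Override
            public E next() {
                if (remaining == 0) {
                    // entries.next() throws NoSuchElementException when exhausted.
                    Map.Entry<E, Integer> entry = entries.next();
                    current = entry.getKey();
                    remaining = entry.getValue();
                }
                remaining--;
                return current;
            }
            // remove() is inherited from Iterator and throws UnsupportedOperationException.
        };
    }
}

class HashBagDemo {
    public static void main(String[] args) {
        HashBag<String> bag = new HashBag<>();
        bag.add("x");
        bag.add("y");
        bag.add("x");
        System.out.println(bag.size());        // 3
        System.out.println(bag.contains("x")); // true
        System.out.println(bag);               // both copies of x appear; order follows the HashMap
    }
}
```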
The following list summarizes the abstract implementations:
- AbstractCollection: a Collection that is neither a Set nor a List. At a minimum, you must provide the iterator and the size methods.
- AbstractSet: a Set; use is identical to AbstractCollection.
- AbstractList: a List backed by a random-access data store, such as an array. At a minimum, you must provide the positional access methods (get and, optionally, set, remove, and add) and the size method. The abstract class takes care of listIterator (and iterator).
- AbstractSequentialList: a List backed by a sequential-access data store, such as a linked list. At a minimum, you must provide the listIterator and size methods. The abstract class takes care of the positional access methods. (This is the opposite of AbstractList.)
- AbstractQueue: at a minimum, you must provide the offer, peek, poll, and size methods and an iterator supporting remove.
- AbstractMap: a Map. At a minimum you must provide the entrySet view. This is typically implemented with the AbstractSet class. If the Map is modifiable, you must also provide the put method.
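The AbstractMap entry can be sketched as follows: an unmodifiable map whose entrySet view is an AbstractSet with a hand-written iterator. TelemetryMap, its constructor, and the use of DoubleSupplier to stand in for a sensor are all assumptions made for this illustration. The get and containsKey overrides are optional; they are included so that a lookup reads one sensor rather than scanning the entry set.

```java
import java.util.AbstractMap;
import java.util.AbstractSet;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.DoubleSupplier;

// Hypothetical unmodifiable map: location name -> sensor reading, read on demand.
class TelemetryMap extends AbstractMap<String, Double> {
    private final Map<String, DoubleSupplier> sensors;

    TelemetryMap(Map<String, DoubleSupplier> sensors) {
        this.sensors = new LinkedHashMap<>(sensors); // defensive copy, keeps insertion order
    }

    @Override
    public Set<Map.Entry<String, Double>> entrySet() {
        return new AbstractSet<Map.Entry<String, Double>>() {
            @Override
            public int size() {
                return sensors.size();
            }

            @Override
            public Iterator<Map.Entry<String, Double>> iterator() {
                final Iterator<Map.Entry<String, DoubleSupplier>> it = sensors.entrySet().iterator();
                return new Iterator<Map.Entry<String, Double>>() {
                    @Override
                    public boolean hasNext() {
                        return it.hasNext();
                    }

                    @Override
                    public Map.Entry<String, Double> next() {
                        Map.Entry<String, DoubleSupplier> e = it.next();
                        // The sensor is read at the moment the entry is produced.
                        return new AbstractMap.SimpleImmutableEntry<String, Double>(
                                e.getKey(), e.getValue().getAsDouble());
                    }
                };
            }
        };
    }

    // Optional overrides: read only the requested sensor.
    @Override
    public Double get(Object key) {
        DoubleSupplier sensor = sensors.get(key);
        return sensor == null ? null : sensor.getAsDouble();
    }

    @Override
    public boolean containsKey(Object key) {
        return sensors.containsKey(key);
    }
}

class TelemetryMapDemo {
    public static void main(String[] args) {
        Map<String, DoubleSupplier> sensors = new LinkedHashMap<>();
        sensors.put("roof", () -> 21.5);
        sensors.put("cellar", () -> 12.0);

        Map<String, Double> readings = new TelemetryMap(sensors);
        System.out.println(readings.get("roof")); // 21.5
        System.out.println(readings);             // {roof=21.5, cellar=12.0}
    }
}
```

Because put is not overridden, any attempt to modify the map throws UnsupportedOperationException.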
The process of writing a custom implementation follows:
- Choose the appropriate abstract implementation class from the preceding list.
- Provide implementations for all the abstract methods of the class. If your custom collection is to be modifiable, you will have to override one or more of the concrete methods as well. The API documentation for the abstract implementation class will tell you which methods to override.
- Test and, if necessary, debug the implementation. You now have a working custom collection implementation.
- If you are concerned about performance, read the API documentation of the abstract implementation class for all the methods whose implementations you're inheriting. If any seem too slow, override them. If you override any methods, be sure to measure the performance of the method before and after the override. How much effort you put into tweaking performance should be a function of how much use the implementation will get and how critical to performance its use is. (Often this step is best omitted.)
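The first two steps might look like the following for a modifiable list. NonNullList is a hypothetical example that delegates storage to an ArrayList and rejects null elements. It implements the two abstract methods of AbstractList and overrides the three concrete methods that would otherwise throw UnsupportedOperationException.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical modifiable list that rejects null elements.
class NonNullList<E> extends AbstractList<E> {
    private final List<E> items = new ArrayList<>();

    // The two abstract methods of AbstractList.
    @Override
    public E get(int index) {
        return items.get(index);
    }

    @Override
    public int size() {
        return items.size();
    }

    // Concrete methods that must be overridden to make the list modifiable.
    @Override
    public E set(int index, E element) {
        return items.set(index, Objects.requireNonNull(element));
    }

    @Override
    public void add(int index, E element) {
        items.add(index, Objects.requireNonNull(element));
        modCount++; // lets inherited iterators detect concurrent modification
    }

    @Override
    public E remove(int index) {
        E removed = items.remove(index);
        modCount++;
        return removed;
    }
}

class NonNullListDemo {
    public static void main(String[] args) {
        List<String> list = new NonNullList<>();
        list.add("a");                         // inherited add(E) calls add(int, E)
        list.addAll(Arrays.asList("b", "c"));  // inherited bulk operation
        list.remove("b");                      // inherited remove(Object) uses the iterator
        System.out.println(list);              // [a, c]
    }
}
```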
API Design
In this short but important section, you'll learn a few simple guidelines that will allow your API to interoperate seamlessly with all other APIs that follow these guidelines. In essence, these rules define what it takes to be a good "citizen" in the world of collections.
Parameters
If your API contains a method that requires a collection on input, it is of paramount importance that you declare the relevant parameter type to be one of the collection interface types. Never use an implementation type because this defeats the purpose of an interface-based Collections Framework, which is to allow collections to be manipulated without regard to implementation details.
Further, you should always use the least-specific type that makes sense. For example, don't require a List or a Set if a Collection would do. It's not that you should never require a List or a Set on input; it is correct to do so if a method depends on a property of one of these interfaces. For example, many of the algorithms provided by the Java platform require a List on input because they depend on the fact that lists are ordered. As a general rule, however, the best types to use on input are the most general: Collection and Map.
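A small sketch of the parameter guideline follows. The Stats class and its methods are hypothetical. The sum method only iterates, so it accepts any Collection, while last depends on positional order and therefore asks for a List. The bounded wildcard is an additional choice made for this example so that collections of any Number subtype are accepted.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

class Stats {
    // Only iterates, so the least-specific type, Collection, is enough.
    static double sum(Collection<? extends Number> values) {
        double total = 0;
        for (Number n : values) {
            total += n.doubleValue();
        }
        return total;
    }

    // Depends on element position, so requiring a List is justified.
    // Assumes the list is not empty.
    static <T> T last(List<T> values) {
        return values.get(values.size() - 1);
    }
}

class StatsDemo {
    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 2, 3);
        Set<Double> set = new HashSet<>(Arrays.asList(1.5, 2.5));
        Queue<Long> queue = new ArrayDeque<>(Arrays.asList(10L, 20L));

        System.out.println(Stats.sum(list));  // 6.0
        System.out.println(Stats.sum(set));   // 4.0
        System.out.println(Stats.sum(queue)); // 30.0
        System.out.println(Stats.last(list)); // 3
    }
}
```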
Caution: Never define your own ad hoc collection class and require objects of this class on input. By doing this, you'd lose all the benefits provided by the Java Collections Framework.
Return Values
You can afford to be much more flexible with return values than with input parameters. It's fine to return an object of any type that implements or extends one of the collection interfaces. This can be one of the interfaces or a special-purpose type that extends or implements one of these interfaces.
For example, one could imagine an image-processing package that returned objects of a new class, called ImageList, that implements List. In addition to the List operations, ImageList could support any application-specific operations that seemed desirable. For example, it might provide an indexImage operation that returned an image containing thumbnail images of each graphic in the ImageList. It's critical to note that even if the API furnishes ImageList instances on output, it should accept arbitrary Collection (or perhaps List) instances on input.
In one sense, return values should have the opposite behavior of input parameters: It's best to return the most specific applicable collection interface rather than the most general. For example, if you're sure that you'll always return a SortedMap, you should give the relevant method the return type of SortedMap rather than Map. SortedMap instances are more time-consuming to build than ordinary Map instances and are also more powerful. Given that your module has already invested the time to build a SortedMap, it makes good sense to give the user access to its increased power. Furthermore, the user will be able to pass the returned object to methods that demand a SortedMap, as well as those that accept any Map.
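The return-type guideline can be sketched with a hypothetical word-counting method. WordCounts and count are illustrative names. The method accepts the general Collection on input and declares the specific SortedMap on output, so callers can use the sorted-map operations without a cast.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.SortedMap;
import java.util.TreeMap;

class WordCounts {
    // Declares SortedMap rather than Map because the result is always a TreeMap.
    static SortedMap<String, Integer> count(Collection<String> words) {
        SortedMap<String, Integer> counts = new TreeMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }
}

class WordCountsDemo {
    public static void main(String[] args) {
        SortedMap<String, Integer> counts =
                WordCounts.count(Arrays.asList("pear", "apple", "fig", "apple"));

        System.out.println(counts);              // {apple=2, fig=1, pear=1}
        System.out.println(counts.firstKey());   // apple
        System.out.println(counts.headMap("g")); // {apple=2, fig=1}
    }
}
```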
Legacy APIs
There are currently plenty of APIs out there that define their own ad hoc collection types. While this is unfortunate, it's a fact of life, given that there was no Collections Framework in the first two major releases of the Java platform. Suppose you own one of these APIs; here's what you can do about it.
If possible, retrofit your legacy collection type to implement one of the standard collection interfaces. Then all the collections you return will interoperate smoothly with other collection-based APIs. If this is impossible (for example, because one or more of the preexisting type signatures conflict with the standard collection interfaces), define an adapter class that wraps one of your legacy collection objects, allowing it to function as a standard collection. (Such an adapter class is an example of a custom implementation.)
Retrofit your API with new calls that follow the input guidelines to accept objects of a standard collection interface, if possible. Such calls can coexist with the calls that take the legacy collection type. If this is impossible, provide a constructor or static factory for your legacy type that takes an object of one of the standard interfaces and returns a legacy collection containing the same elements (or mappings). Either of these approaches will allow users to pass arbitrary collections into your API.
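Both fallbacks can be sketched on one invented legacy type. LegacyNameTable, count, and nameAt are hypothetical and do not come from any real library. The asList method is the adapter for output, and the static factory from lets callers pass any standard Collection in.

```java
import java.util.AbstractList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical legacy collection type with its own ad hoc API.
class LegacyNameTable {
    private final String[] names;

    LegacyNameTable(String[] names) {
        this.names = names.clone();
    }

    int count() {
        return names.length;
    }

    String nameAt(int i) {
        return names[i];
    }

    // Static factory: builds a legacy table from any standard Collection.
    static LegacyNameTable from(Collection<String> source) {
        return new LegacyNameTable(source.toArray(new String[0]));
    }

    // Adapter: a read-only List view that translates List calls into legacy calls.
    List<String> asList() {
        return new AbstractList<String>() {
            @Override
            public String get(int index) {
                return nameAt(index);
            }

            @Override
            public int size() {
                return count();
            }
        };
    }
}

class LegacyAdapterDemo {
    public static void main(String[] args) {
        Set<String> input = new TreeSet<>(Arrays.asList("b", "a"));
        LegacyNameTable table = LegacyNameTable.from(input);

        List<String> view = table.asList();
        System.out.println(view);               // [a, b]
        System.out.println(view.contains("b")); // true
    }
}
```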