Saturday 7 July 2012

Item 41: Use overloading judiciously


The following program is a well-intentioned attempt to classify collections according to whether they are sets, lists, or some other kind of collection:

// Broken! - What does this program print?
public class CollectionClassifier {
public static String classify(Set<?> s) {
return "Set";
}
public static String classify(List<?> lst) {
return "List";
}
public static String classify(Collection<?> c) {
return "Unknown Collection";
}
public static void main(String[] args) {
Collection<?>[] collections = {
new HashSet<String>(),
new ArrayList<BigInteger>(),
new HashMap<String, String>().values()
};
for (Collection<?> c : collections)
System.out.println(classify(c));
}
}

You might expect this program to print Set, followed by List and Unknown Collection, but it doesn’t. It prints Unknown Collection three times. Why does this happen? Because the classify method is overloaded, and the choice of which overloading to invoke is made at compile time. For all three iterations of the loop, the compile-time type of the parameter is the same: Collection<?>. The
runtime type is different in each iteration, but this does not affect the choice of overloading. Because the compile-time type of the parameter is Collection<?>, the only applicable overloading is the third one, classify(Collection<?>), and this overloading is invoked in each iteration of the loop.

The behavior of this program is counterintuitive because selection among overloaded methods is static, while selection among overridden methods is dynamic. The correct version of an overridden method is chosen at runtime, based on the runtime type of the object on which the method is invoked.

Consider the following program:

class Wine {
String name() { return "wine"; }
}
class SparklingWine extends Wine {
@Override String name() { return "sparkling wine"; }
}
class Champagne extends SparklingWine {
@Override String name() { return "champagne"; }
}
public class Overriding {
public static void main(String[] args) {
Wine[] wines = {
new Wine(), new SparklingWine(), new Champagne()
};
for (Wine wine : wines)
System.out.println(wine.name());
}
}

The name method is declared in class Wine and overridden in classes SparklingWine and Champagne. As you would expect, this program prints out wine, sparkling wine, and champagne, even though the compile-time type of the instance is Wine in each iteration of the loop.

Assuming a static method is required, the best way to fix the program is to replace all three overloadings of classify with a single method that does an explicit instanceof test:

public static String classify(Collection<?> c) {
return c instanceof Set ? "Set" :
c instanceof List ? "List" : "Unknown Collection";
}

It is bad practice to write code whose behavior is likely to confuse programmers. This is especially true for APIs. If the typical user of an API does not know which of several method overloadings will get invoked for a given set of parameters, use of the API is likely to result in errors. Therefore you should avoid confusing uses of overloading.

Exactly what constitutes a confusing use of overloading is open to some debate. A safe, conservative policy is never to export two overloadings with the same number of parameters. If a method uses varargs, a conservative policy is not to overload it at all, except as described in Item 42.

For constructors, you don’t have the option of using different names: multiple constructors for a class are always overloaded. You do, in many cases, have the option of exporting static factories instead of constructors (Item 1). Also, with constructors you don’t have to worry about interactions between overloading and overriding, because constructors can’t be overridden.

Prior to release 1.5, all primitive types were radically different from all reference types, but this is no longer true in the presence of autoboxing, and it has caused real trouble. Consider the following program:

public class SetList {
public static void main(String[] args) {
Set<Integer> set = new TreeSet<Integer>();
List<Integer> list = new ArrayList<Integer>();
for (int i = -3; i < 3; i++) {
set.add(i);
list.add(i);
}
for (int i = 0; i < 3; i++) {
set.remove(i);
list.remove(i);
}
System.out.println(set + " " + list);
}
}

The program adds the integers from -3 through 2 to a sorted set and to a list, and then makes three identical calls to remove on both the set and the list. If you’re like most people you’d expect the program to remove the non-negative values (0, 1, and 2) from the set and the list, and to print [-3, -2, -1] [-3, -2, -1]. In fact, the program removes the non-negative values from the set and the odd values from the list and prints [-3, -2, -1] [-2, 0, 2]. It is an understatement to call this behavior confusing.

In other words, adding generics and autoboxing to the language damaged the List interface. Luckily, few if any other APIs in the Java libraries were similarly damaged, but this tale makes it clear that it is even more important to overload with care now that autoboxing and generics are part of the language.

Array types and classes other than Object are radically different. Also, array types and interfaces other than Serializable and Cloneable are radically different. Two distinct classes are said to be unrelated if neither class is a descendant of the other [JLS, 5.5]. For example, String and Throwable are unrelated. It is impossible for any object to be an instance of two unrelated classes, so unrelated classes are radically different.

String class has had a contentEquals(StringBuffer) method since release 1.4. In release 1.5, a new interface called CharSequence was added to provide a common interface for StringBuffer, StringBuilder, String, CharBuffer, and other similar types, all of which were retrofitted to implement this interface. At the same time that CharSequence was added to the platform, String was outfitted with an overloading of the contentEquals method that takes a CharSequence.

The standard way to ensure this behavior is to have the more specific overloading forward to the more general:

public boolean contentEquals(StringBuffer sb) {
return contentEquals((CharSequence) sb);
}

While the Java platform libraries largely adhere to the spirit of the advice in this item, there are a number of classes that violate it. For example, the String class exports two overloaded static factory methods, valueOf(char[]) and valueOf( Object), that do completely different things when passed the same object reference. There is no real justification for this, and it should be regarded as an anomaly with the potential for real confusion.

To summarize, just because you can overload methods doesn’t mean you should. You should generally refrain from overloading methods with multiple signatures that have the same number of parameters. In some cases, especially where constructors are involved, it may be impossible to follow this advice. In that case, you should at least avoid situations where the same set of parameters can be passed to different overloadings by the addition of casts. If such a situation cannot be avoided, for example, because you are retrofitting an existing class to implement a new interface, you should ensure that all overloadings behave identically when passed the same parameters. If you fail to do this, programmers will be hard pressed to make effective use of the overloaded method or constructor, and they won’t understand why it doesn’t work.


Reference: Effective Java 2nd Edition by Joshua Bloch