A Summary of Java Generics

luedipiaofeng

Given List<E>, List<String>, and List: List<E> is a generic type and E is its (formal) type parameter; List<String> is a parameterized type and String is its actual type argument; List is the raw type.

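The logical meaning of generics

The original Java type system

Generics brought new types to Java 5 and made the relationships between types more complex. To understand those relationships, it helps to first be clear about the type system Java had before.

First, in defining types: multiple inheritance between classes is not allowed, but a class may implement several interfaces.

Second, in using types: every variable must have a definite type, and it can only refer to objects of that type (or of a subtype). To enforce this rule, every assignment is checked at compile time and every explicit cast is checked at runtime.

Finally, arrays are a special kind of type in Java: if B extends A, then B[] is a subtype of A[]. Whenever an array element is assigned, the runtime checks it and throws ArrayStoreException if the type does not fit, as shown below. Because arrays can be nested (A[], A[][], and so on), the Java type system contains infinitely many types.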

// B extends A

A a = new A();

A[] array = new B[1];

array[0] = a; // ArrayStoreException
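The runtime store check can be observed with standard library types. The following is a minimal sketch; the class and method names are made up for illustration.

```java
class ArrayCovarianceDemo {
    // Returns true if the runtime rejects storing an Integer into a String[].
    static boolean storeIsChecked() {
        Object[] array = new String[1]; // legal: String[] is a subtype of Object[]
        try {
            array[0] = Integer.valueOf(1); // compiles, but the array knows its element type
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(storeIsChecked());
    }
}
```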

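Generics in what I would consider an ideal world

First, given B<T> extends A, we would have B<Object> extends A and B<String> extends B<Object>, and the runtime would type-check every input argument whose type involves the type parameter. This would match how arrays behave in the original type system. Also like arrays, generics add infinitely many usable types, because types such as B<B<String>> and B<B<B<String>>> exist.

Second, when generics are combined with inheritance (leaving interfaces aside) there are three new forms: B<T> extends A, B extends A<String>, and B<T> extends A<T>. The third form implies that B<String> extends A<String>.

Generics in reality

In Java 5, given B<T> extends A, there is in fact no subtype relationship between B<Object> and B<String> (invariant subtyping), unlike arrays (covariant subtyping). I believe the reasons are as follows.

First, the Java 5 compiler supports generics through erasure, so no generic information exists at runtime (see the section on the implementation of generics below). The runtime therefore cannot perform the type check that the hypothetical code below would need (the real compiler rejects its second line), and even if it could, the check would inevitably cost execution speed.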

ArrayList<String> strList = new ArrayList<String>();

ArrayList<Object> objList = strList;

objList.add(new Object()); // runtime could not throw exception

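Second, consider B<T> extends A<T>, which gives B<String> extends A<String>. With covariant subtyping we would also have B<String> extends B<Object>, which amounts to multiple inheritance, and Java does not allow that. Note that although arrays are covariant, they do not lead to multiple inheritance: array types are built into the system and Java does not allow them to be extended.

With invariant subtyping, given A<T>, A<Object> is no longer the supertype of A<String>, A<Integer>, and so on, so there would be no way to declare a variable that can refer to every A<T>. To solve this, Java 5 introduced wildcards: a variable declared as A<?> can refer to an object of any A<T> type. Note that wildcards and inheritance are two different relationships. Inheritance arranges types in a tree, and a variable of type B can only refer to objects whose type lies in the subtree rooted at B. A variable of type A<?> can refer to objects whose type is A<Object> or a node parallel to A<Object> in the type tree. Combining the two, a variable of type A<?> can refer to objects whose type lies in any subtree rooted at A<Object> or at a node parallel to A<Object>.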

// A<T> extends Object, B extends Object, C extends B, D extends B

A<?> a; // instances of A<Object>, A<String>, A<Integer> can be assigned to this variable

B b; // instances of B, C, D can be assigned to this variable
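The same idea can be tried with java.util.List. This is a small sketch with a made-up helper, countNonNull, that accepts a list of any element type through a wildcard parameter.

```java
import java.util.ArrayList;
import java.util.List;

class WildcardAssignDemo {
    // Accepts a list of any element type; elements can only be read as Object.
    static int countNonNull(List<?> list) {
        int n = 0;
        for (Object o : list) {
            if (o != null) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<String>();
        strings.add("a");
        List<Integer> ints = new ArrayList<Integer>();
        ints.add(1);
        ints.add(2);

        // List<Object> objects = strings; // does not compile: generics are invariant
        List<?> any = strings; // ok
        any = ints;            // ok
        System.out.println(countNonNull(strings) + " " + countNonNull(any));
    }
}
```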

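Implementation of generics

Type safety in Java after the introduction of generics

Guaranteeing type safety comes down to ensuring that every variable refers to an object of the correct type. Ideally, I think the following should hold. First, a variable of type A can only refer to objects whose type lies in the subtree rooted at A. Second, a variable of a wildcard type such as A<?> can only refer to objects whose type lies in a subtree rooted at A<Object> or at a node parallel to A<Object>. Finally, every explicit cast must be verified at runtime. The first two are the compiler's job and the last is the JVM's. In practice Java 5 implements only the first two: at runtime a cast is checked against the erased type only, so type arguments are never verified.

Compile-time type safety for generics mainly covers generic classes and generic methods, which are discussed separately below.

Type safety of generic classes

Suppose we have the following classes: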

public class A {};

public class B<T> extends A {

public T obj;

}

public class C<T> extends B<T> {

public void set(T obj) { this.obj = obj; }

public T get() { return obj; }

}

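For an object of type C<String>, the variable types that can refer to it are A, B<String>, C<String>, B<?>, and C<?>. Through a variable of type A, no method or field related to T is reachable, so the original Java type-safety mechanism clearly still applies. For variables of type B<String> or C<String>, the compiler checks every method (set, get) and field (obj) accessed through the variable, and every assignment involving T must satisfy T = String, so type safety holds. For variables of type B<?> or C<?>, in every value read through the variable T is replaced by its bound (see the section on restrictions on type parameters below), while for every value written T is unknown, so nothing can be assigned (except null). Ideally, T in written values would also be replaced by its bound and the runtime would perform the check, but because the runtime has no generic information it cannot, as the following hypothetical code shows: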

C<String> strC = new C<String>();

C<?> c = strC;

// even if the following code passed the compile-time check, the runtime could not throw an exception

c.obj = new Object();

c.set(new Object());

// here an unexpected ClassCastException would occur

String str = strC.obj;

str = strC.get();
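What the compiler actually allows through a wildcard-typed variable can be sketched as follows. Box is a hypothetical stand-in for the class C<T> above; the lines that would be rejected are left as comments.

```java
class WildcardAccessDemo {
    // Hypothetical stand-in for the article's class C<T>.
    static class Box<T> {
        T obj;
        void set(T obj) { this.obj = obj; }
        T get() { return obj; }
    }

    // Reading through Box<?> yields the bound of T, which is Object here.
    static Object read(Box<?> box) {
        Object viaMethod = box.get();
        Object viaField = box.obj;
        return viaMethod == viaField ? viaMethod : null;
    }

    // Writing through Box<?> is rejected by the compiler, except for null.
    static void clear(Box<?> box) {
        // box.set(new Object());  // does not compile
        // box.obj = new Object(); // does not compile
        box.set(null);
    }

    public static void main(String[] args) {
        Box<String> strBox = new Box<String>();
        strBox.set("abc");
        System.out.println(read(strBox));
        clear(strBox);
        System.out.println(strBox.get());
    }
}
```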

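Inside the methods of a generic class, T is treated as its bound or some subtype of its bound. That is, first, a variable of type T can only refer to objects of type T or a subtype of T; second, through a variable of type T only the methods and fields of its bound are accessible. Suppose the following code is the set method of C: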

public void set(T obj) {

Object temp;

temp = obj; // ok

obj = temp; // compile error

obj.toString(); // can access Object’s methods

}

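Type safety of generic methods

Unlike a generic class, a generic method usually does not have its actual type argument specified; the compiler infers it (inference) from the types of the arguments passed to the method. Under GJ's rule, a call is a compile error if no unique smallest type can be found. javac for Java 5 and later is more permissive: it infers the least upper bound of the argument types, which may be an intersection type, so the call below compiles:

public <A> void doublet(A a, A b) {};

…

// rejected under GJ's rule, because String and Integer have both Comparable and Serializable as common supertypes; accepted by javac

doublet("abc", 123);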
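As far as I can tell, javac for Java 5 and later accepts a call with two unrelated argument types by inferring a common supertype, and the type argument can also be given explicitly. The sketch below uses a made-up method, pair, to show both.

```java
import java.util.Arrays;
import java.util.List;

class InferenceDemo {
    // Hypothetical generic method with two parameters of the same type parameter.
    static <T> List<T> pair(T a, T b) {
        return Arrays.asList(a, b);
    }

    public static void main(String[] args) {
        // T is inferred as a common supertype of String and Integer.
        Object first = pair("abc", 123).get(0);
        // T is supplied explicitly, so no inference is needed.
        List<Object> explicit = InferenceDemo.<Object>pair("abc", 123);
        System.out.println(first + " " + explicit.size());
    }
}
```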

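When a wildcard is used together with a generic method, there is the following special case, in which the compiler treats the unknown type behind "?" as T for the duration of the call: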

public <T> List<T> test(List<T> list) { return list; }

…

List<?> wildcardList = new ArrayList<String>();

wildcardList = test(wildcardList);
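This special case is commonly used to write a private helper that gives the unknown type a name. A minimal sketch, with made-up method names:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CaptureDemo {
    // Public API takes a wildcard; it cannot call list.set(...) directly.
    static void swapFirstTwo(List<?> list) {
        swapHelper(list); // the unknown element type is captured as T
    }

    // Inside the helper the element type has a name, so reads and writes type-check.
    private static <T> void swapHelper(List<T> list) {
        T first = list.get(0);
        list.set(0, list.get(1));
        list.set(1, first);
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<String>(Arrays.asList("a", "b"));
        swapFirstTwo(names);
        System.out.println(names);
    }
}
```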

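Finally, the rules for using a type parameter inside a generic method are the same as those described above for the methods of a generic class.

How erasure works

The Java 5 compiler implements generics through erasure. Erasure removes all generic information and inserts explicit casts where needed, so the resulting bytecode is no different from Java 1.4 bytecode.

First, a parameterized type is reduced to its non-parameterized form; for example, List<String> becomes List.

Second, a type parameter is replaced by its bound; for example, T becomes Object (if its upper bound is Object).

Next, where the type of a method's return value or of a field is a type parameter, erasure inserts an explicit cast at the point of use. For example: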

public class A<T> {

public T get() { return null; }

}

…

A<String> a = new A<String>();

String temp = a.get();

// translate to

public class A {

public Object get() { return null; }

}

…

A a = new A();

String temp = (String) a.get();
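Both effects of erasure can be observed at runtime. The sketch below (class and method names are made up) checks that two differently parameterized lists share one runtime class, and that the compiler-inserted cast is where a wrongly typed element is finally detected.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // ArrayList<String> and ArrayList<Integer> have the same runtime class.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> ints = new ArrayList<Integer>();
        return strings.getClass() == ints.getClass();
    }

    // The add through the raw reference is unchecked; the failure shows up at the inserted cast.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean insertedCastFails() {
        List<String> strings = new ArrayList<String>();
        List raw = strings;
        raw.add(Integer.valueOf(1)); // no runtime check here
        try {
            String s = strings.get(0); // compiled as (String) strings.get(0)
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass() + " " + insertedCastFails());
    }
}
```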

Finally, erasure inserts bridge methods where necessary. Consider the following code:

public class A<T> {

private T obj;

public void set(T obj) { this.obj = obj; }

public T get() { return obj; }

}

public class B extends A<String> {

public void set(String obj) {};

public String get() { return null;}

}

…

A<String> a = new B();

a.set("abc");

String temp = a.get();

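Without bridge methods, calls through a would not be dispatched polymorphically: after erasure the signatures of B's methods differ from those of A's, so the JVM does not regard them as overrides. Erasure therefore has to insert the following bridge methods into B: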

public void set(Object obj) { set((String) obj);}

public Object get() { return get(); }

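Note that the bridge method for get would not compile as source code, because Java does not allow overloads that differ only in return type; bridge methods are in fact inserted directly into the bytecode.

Finally, note that bridge methods are inserted only when needed: if B did not override get and set, there would be no bridge methods.

Restrictions caused by the lack of generic information at runtime

1. Access to methods and fields through wildcard-typed variables is restricted (as described above).

2. Explicit casts involving type parameters cannot guarantee type safety, and the compiler issues a warning.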
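The inserted bridge methods are visible through reflection. The sketch below counts them with Method.isBridge(); Holder and StringHolder are hypothetical stand-ins for A<T> and B above.

```java
import java.lang.reflect.Method;

class BridgeDemo {
    // Hypothetical stand-in for the article's A<T>.
    static class Holder<T> {
        private T obj;
        public void set(T obj) { this.obj = obj; }
        public T get() { return obj; }
    }

    // Hypothetical stand-in for the article's B extends A<String>.
    static class StringHolder extends Holder<String> {
        @Override public void set(String obj) { }
        @Override public String get() { return null; }
    }

    // Counts the compiler-generated bridge methods declared directly in a class.
    static int countBridges(Class<?> type) {
        int count = 0;
        for (Method m : type.getDeclaredMethods()) {
            if (m.isBridge()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countBridges(StringHolder.class));
        System.out.println(countBridges(Holder.class));
    }
}
```

With javac, StringHolder should report one bridge for set(Object) and one for the Object-returning get(), while Holder reports none.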

List<?> list = new ArrayList<String>();

List<String> strList = (List<String>) list; // warning

public <T> T test1(Object obj) { return (T) obj; } // warning

public <T> T[] test2(Object[] objs) { return (T[]) objs; } // warning

3. Creating objects of certain types is restricted.

public <T> T test1(T sample) { return new T(); } // compile error

public <T> T[] test2(T sample) { return new T[0]; } // compile error

Note that even with an actual type argument supplied, an array of a parameterized type still cannot be created:

// compile error; the lines after it assume the compiler allowed creating such an array

List<String>[] lists = new List<String>[1];

List<Integer> intList = new ArrayList<Integer>();

intList.add(1);

Object[] objs = lists;

objs[0] = intList; // runtime could not throw an ArrayStoreException for this

String temp = lists[0].get(0); // unexpected ClassCastException

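An object of type T can be created through Class<T>. The trick is that Class<T> both satisfies the compiler's checks and carries enough information for the runtime to create the T object.

public <T> T create(Class<T> c) throws Exception { return c.newInstance(); }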

…

String temp = create(String.class);
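A slightly fuller sketch of the Class<T> approach follows. It uses getDeclaredConstructor().newInstance() and wraps the checked reflection exceptions; the class name and the choice of IllegalStateException are assumptions made for this example.

```java
class FactoryDemo {
    // Creates an instance of T through its no-argument constructor.
    static <T> T create(Class<T> type) {
        try {
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            // Wrapped so callers need no checked-exception handling (a choice made for this sketch).
            throw new IllegalStateException("cannot instantiate " + type.getName(), e);
        }
    }

    public static void main(String[] args) {
        String empty = create(String.class);
        StringBuilder builder = create(StringBuilder.class);
        System.out.println(empty.length() + " " + builder.length());
    }
}
```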

4. Bridge methods have to be inserted (as described above).

5. Use of instanceof is restricted.

List<String> list = new ArrayList<String>();

boolean temp;

temp = list instanceof List<String>; // compile error

temp = list instanceof List<?>; // ok

6. Reflection can undermine type safety.

public class A<T> {

public T obj;

}

public class B {

public A<String> a;

}

…

A<Integer> a = new A<Integer>();

B b = new B();

B.class.getField("a").set(b, a); // everything ok

String temp = b.a.obj; // unexpected error
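A runnable variant of the reflection case is sketched below. Box and Owner are hypothetical stand-ins for A<T> and B; the field is given a value so that the failure at the compiler-inserted cast can actually be observed.

```java
import java.lang.reflect.Field;

class ReflectionPollutionDemo {
    // Hypothetical stand-ins for the article's A<T> and B.
    public static class Box<T> { public T obj; }
    public static class Owner { public Box<String> box; }

    static boolean castFailsAfterReflectiveSet() {
        Box<Integer> intBox = new Box<Integer>();
        intBox.obj = 1;
        Owner owner = new Owner();
        try {
            Field field = Owner.class.getField("box");
            field.set(owner, intBox); // accepted: at runtime the field type is just Box
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        try {
            String s = owner.box.obj; // compiler-inserted (String) cast fails here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(castFailsAfterReflectiveSet());
    }
}
```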

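7. The type parameters of a generic class cannot be used in its static methods or static fields.

Restrictions on type parameters in generics

Using the extends keyword

For a generic class such as A<T>, extends can further restrict the types T may stand for. For example: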

public class A<T extends Number> { … }

…

A<Object> objA; // compile error

A<String> strA; // compile error

A<Number> numA; // ok

A<Integer> intA; // ok

Here extends means that T must be Number or a subclass of Number. A more complex use of extends follows.

public class B<T extends B<T>> {…}

public class C extends B<C> { … }

public class D extends C { … }

…

C c = new C();

D d = new D();

B<C> bc; // ok

bc = c;

bc = d;

B<D> bd; // compile error

B<Object> b; // compile error

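Here B<Object> is clearly illegal. As for B<D>: although D extends C, substituting D into "T extends B<T>" gives "D extends B<D>", which does not hold (D extends B<C>), so B<D> is illegal as well. A similar case is Enum<E extends Enum<E>> in java.lang. This declaration guarantees that, for a class Test that is not an enum type (the compiler ensures that only types created with the enum keyword satisfy the bound on E in Enum), a variable of type Enum<Test> cannot be declared.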
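The same self-referential bound appears in generic methods that work on any enum type. A small sketch, using Enum.valueOf from java.lang and a made-up enum Color:

```java
class SelfBoundDemo {
    // Hypothetical enum used only for this example.
    enum Color { RED, GREEN }

    // E must be an enum type whose constants are of type E itself.
    static <E extends Enum<E>> E parse(Class<E> type, String name) {
        return Enum.valueOf(type, name);
    }

    public static void main(String[] args) {
        Color c = parse(Color.class, "GREEN");
        System.out.println(c);
        // parse(String.class, "x"); // does not compile: String is not an Enum<String>
    }
}
```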

public class E<T, S extends T> { … }

…

E<Number, Number> e1; // ok

E<Number, Integer> e2; // ok

E<Number, String> e3; // compile error

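Here extends constrains the relationship between different type parameters (T and S).

Finally, note that restricting a type parameter with extends in a generic class removes some types altogether (such as E<Number, String> above) rather than merely preventing objects of those types from being created, whereas in a generic method extends serves to reject objects of certain types as arguments. For example: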

public <T extends String> void test(List<T> list) {}

…

test(new ArrayList<String>()); // ok

test(new ArrayList<Object>()); // compile error
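A bound also determines which members are reachable through T inside the method. In this sketch (the method name sum is made up), the bound Number is what makes doubleValue() available.

```java
import java.util.Arrays;
import java.util.List;

class BoundedMethodDemo {
    // T is known to be a Number, so Number's methods can be called on each element.
    static <T extends Number> double sum(List<T> list) {
        double total = 0;
        for (T n : list) {
            total += n.doubleValue();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3)));
        // sum(Arrays.asList("a", "b")); // does not compile: String is not a Number
    }
}
```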

About the wildcard "?"

When the super keyword is not involved, "?" can be understood as an anonymous type parameter; in fact, every such "?" can be rewritten using a named type parameter.

public void test(List<? extends String> list) {}

public <T extends String> void test(List<T> list) {}

"?" can only be used as an actual type argument of a parameterized type. And because "?" is anonymous, the compiler does not assume that two occurrences of A<?> require the same type argument.

public void test(? obj) {} // compile error

public <T> void test1(List<T> list1, List<T> list2) {}

public void test2(List<?> list1, List<?> list2) {}

…

test1(new ArrayList<Object>(), new ArrayList<String>()); // compile error

test2(new ArrayList<Object>(), new ArrayList<String>()); // ok

Using the super keyword

For T super A, super means that T must be A or a supertype of A. Compared with extends, super is subject to more restrictions. Consider the following code:

public <T super Number> void test(T obj) {} // hypothetical: assume this compiled

…

test(new Object()); // this looks reasonable

test(1); // Integer is not a supertype of Number, so the compiler should reject this; but an Integer is also an Object, so why would a method accept an Object yet reject an Integer?

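As this shows, super restricts the type to a given class or its supertypes, while inheritance lets a variable of a supertype accept objects of a subtype. The two conflict, so the use of super is restricted as follows.

First, super can only be used as a constraint inside an actual type argument of a parameterized type, where it does not conflict with inheritance. For example: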

public void test(List<? super Number> list) {}

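Second, super cannot bound a named type parameter; if it could, the problem in the code above would arise. Consequently super can only be used together with "?".

Backward compatibility of generics

Take List<E> as an example. For compatibility with pre-generics code, the raw type List may still be used. Semantically List matches List<Object>, but syntactically it is closer to List<?>, and the compiler permits on List some operations that are forbidden on List<?> (what is a compile error on List<?> is merely a warning on List).

First, for conversions List and List<?> are interchangeable, except that a List can be assigned to any parameterized type with an actual type argument (such as List<String>), whereas a List<?> cannot (without an explicit cast).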
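A lower-bounded wildcard is typically used for a parameter that only receives values. The following sketch (addNumbers is a made-up name) writes Number values into any list able to hold them.

```java
import java.util.ArrayList;
import java.util.List;

class SuperWildcardDemo {
    // Any list whose element type is Number or a supertype of Number can receive Numbers.
    static void addNumbers(List<? super Number> sink) {
        sink.add(Integer.valueOf(1));
        sink.add(Double.valueOf(2.5));
    }

    public static void main(String[] args) {
        List<Number> numbers = new ArrayList<Number>();
        List<Object> objects = new ArrayList<Object>();
        addNumbers(numbers);
        addNumbers(objects);
        // addNumbers(new ArrayList<Integer>()); // does not compile: Integer is not a supertype of Number
        System.out.println(numbers.size() + " " + objects.size());
    }
}
```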

List list;

List<?> wildcardList;

List<String> strList;

list = wildcardList;

list = strList;

wildcardList = list;

wildcardList = strList;

strList = list; // warning

strList = wildcardList; // compile error

Second, the compiler does not allow calling through List<?> any method whose parameters involve E, whereas for List it only gives a warning.

list.add("abc"); // warning

wildcardList.add("abc"); // compile error

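Miscellaneous

About GJ

GJ is an open-source project that added generics to Java. Java 5 generics were modeled on GJ, and GJ's syntax and implementation are essentially the same as those of Java 5 generics, although Java 5 changed some details. Studying GJ helps in understanding Java 5 generics.

Security Implications

Consider the following code: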

public class SecureChannel extends Channel {

public String read() { … }

}

public class A {

public LinkedList<SecureChannel> cs;

…

}

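Because LinkedList<SecureChannel> degrades to LinkedList at runtime, malicious code can easily put other kinds of Channel into cs. GJ suggests type specialization to solve this problem, but because Java 5 generics insert bridge methods only when needed, that approach does not work in Java 5.

Tips for using extends and super in method parameters

Consider the following code: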
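One option that is not discussed above, and is offered here only as a possible mitigation, is the checked wrapper family in java.util.Collections, which restores a runtime element check. A minimal sketch using String elements instead of the Channel classes:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class CheckedListDemo {
    // Returns true if the checked wrapper rejects an element of the wrong type at insertion time.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean rejectsWrongElement() {
        List<String> checked = Collections.checkedList(new ArrayList<String>(), String.class);
        List raw = checked; // simulates code that bypasses the compile-time check
        try {
            raw.add(Integer.valueOf(1));
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsWrongElement());
    }
}
```

The wrapper only guards insertions made through it, so the unwrapped list must not be exposed.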

public class A<T> {

private T obj;

public void set(T obj) { this.obj = obj; }

public T get() { return obj; }

}

…

public class Test<T> {

public void test(A<T> a) { … }

…

}

…

A<Object> a = new A<Object>();

A<Number> numA = new A<Number>();

A<Integer> intA = new A<Integer>();

Test<Number> test = new Test<Number>();

test.test(a); // compile error (1)

test.test(numA); // ok

test.test(intA); // compile error (2)

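If the test method only needs to call set on A<T> (that is, only methods that take T as input; note that this does not cover parameter types such as List<T>), A<? super T> may be a better parameter type than A<T>, and it lets (1) compile. Since only set is called, A<Object>.set(Object) clearly accepts more types than A<Number>.set(Number), so an A<Object> can stand in for an A<Number>. Similarly, if test only needs to call get on A<T>, A<? extends T> may be a better choice, and it lets (2) compile.

Naming conventions for type parameters

Prefer short but meaningful names, ideally a single letter, such as E for element and T for type, and avoid lowercase letters so that type parameters stand out from ordinary class and interface names. When several type parameters are needed, consider neighboring letters such as T and S. If a class already uses a letter as a type parameter, avoid reusing it in the class's generic methods and nested classes.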
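Both wildcards can appear in one signature when a method reads from one argument and writes to another. A minimal sketch with a made-up copy method:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PecsDemo {
    // src is only read from (extends); dst is only written to (super).
    static <T> void copy(List<? extends T> src, List<? super T> dst) {
        for (T item : src) {
            dst.add(item);
        }
    }

    public static void main(String[] args) {
        List<Integer> ints = Arrays.asList(1, 2);
        List<Number> numbers = new ArrayList<Number>();
        List<Object> objects = new ArrayList<Object>();
        copy(ints, numbers); // Integer values into a List<Number>
        copy(ints, objects); // Integer values into a List<Object>
        System.out.println(numbers + " " + objects);
    }
}
```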

Problems caused by inference in generic methods

public interface I {}

public class A implements I {}

public class B{}

public <T> void test(T a, T b) {}

…

test(new A(), new B()); // ok

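Now suppose the code changes so that B also has to implement I. Under GJ's inference rule the call stops compiling, because A and B no longer have a unique smallest common supertype (both Object and I qualify). javac for Java 5 and later still accepts the call, but the type inferred for T silently changes:

public class B implements I {}

…

test(new A(), new B()); // compile error under GJ's rule; accepted by javac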

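Things I still do not understand

For methods like max in java.util.Collections, my experiments suggest that the two declarations below accept the same types, so I do not understand why the first form is used.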

public static <T extends Object & Comparable<? super T>> T max1(Collection<? extends T> coll)

public static <T extends Object & Comparable<? super T>> T max2(Collection<T> coll)

References

GJ: Making the future safe for the past: Adding Genericity to the Java Programming Language

Generics in the Java Programming Language
