Top 10 Questions about Java Collections

Latest recommended article published on 2023-05-09 10:11:13

Abner_Niu

Original article: Top 10 questions about Java Collections

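Below are the most popular questions about Java collections asked on Stack Overflow.

1. When to use LinkedList vs. ArrayList

An ArrayList is backed by an array, so its elements can be accessed directly by index. When the backing array is full, however, a new and larger array has to be allocated and every element copied into it, which costs O(n). Adding or removing an element anywhere but the end also requires shifting the elements after it. These are the main drawbacks of ArrayList.

A LinkedList is a doubly linked list. To reach an element in the middle, it has to walk the list node by node from one end. On the other hand, adding or removing an element is fast once the position has been reached, because only the neighboring links change.

The worst-case time complexities are compared below: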

                     | ArrayList | LinkedList
 --------------------------------------------
 get(index)          |    O(1)   |   O(n)
 add(E)              |    O(n)   |   O(1)
 add(index, E)       |    O(n)   |   O(n)
 remove(index)       |    O(n)   |   O(n)
 Iterator.remove()   |    O(n)   |   O(1)
 ListIterator.add(E) |    O(n)   |   O(1)

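Besides running time, memory usage also deserves attention for very large lists. In a LinkedList every node needs two extra pointers to link to the previous and the next node, whereas an ArrayList only needs a single array of elements.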
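The last two rows of the table are where LinkedList pays off: inserting through a ListIterator that is already at the right position. The sketch below illustrates this; the class and method names (ListChoiceDemo, insertBefore) are made up for the example, and it works on either list type, with only the cost per insertion differing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

final class ListChoiceDemo {
    // Inserts `value` in front of every element equal to `marker` via ListIterator.add().
    // On a LinkedList each insertion only relinks neighbors; on an ArrayList it shifts the tail.
    static void insertBefore(List<Integer> list, int marker, int value) {
        ListIterator<Integer> it = list.listIterator();
        while (it.hasNext()) {
            if (it.next() == marker) {
                it.previous(); // step back so the cursor sits in front of the marker
                it.add(value); // the new element goes in front of the cursor
                it.next();     // move past the marker again
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> linked = new LinkedList<>(Arrays.asList(1, 2, 3));
        List<Integer> array = new ArrayList<>(Arrays.asList(1, 2, 3));
        insertBefore(linked, 2, 99);
        insertBefore(array, 2, 99);
        System.out.println(linked);       // [1, 99, 2, 3]
        System.out.println(array.get(1)); // 99, index access is the ArrayList strength
    }
}
```

If most of the work is get(index), ArrayList is usually the better default; the LinkedList advantage only shows up when insertions and removals happen through an iterator.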

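2. Removing elements while iterating over a collection

The safe way to remove elements from a collection while iterating over it is Iterator.remove():

Iterator<Integer> itr = list.iterator();
while (itr.hasNext()) {
   Integer value = itr.next(); // next() must be called before remove()
   // do something with value
   itr.remove();
}

The most common mistake: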

for(Integer i: list) {
  list.remove(i);
}

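Running the code above throws ConcurrentModificationException. The for-each statement creates an iterator behind the scenes to traverse the list, and the list is then modified directly through list.remove() rather than through that iterator. No second thread is involved: the iterator is fail-fast, and its next() method checks whether the collection has been structurally modified since the iterator was created. The same applies to a list of a custom class, as in the code below, because the enhanced for loop still uses an Iterator. Note that the check only happens in next(): if the removal comes so late in the traversal that hasNext() already returns false (this depends on which element is removed and on the JDK's iterator implementation), the loop simply ends without an exception, which is why such code sometimes appears to work.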

class Apple {
	private String color;

	public Apple(String color) {
		this.color = color;
	}
	
	public String getColor(){
		return color;
	}
}

	
	public static void main(String[] args) throws Exception {
		Apple a1 = new Apple("green");
		Apple a2 = new Apple("black");
		Apple a3 = new Apple("red");
		LinkedList<Apple> list = new LinkedList<Apple>();
		list.add(a1);
		list.add(a2);
		list.add(a3);
		
		System.out.println(list);
		
		for(Apple a : list){
			if(a.getColor().equals("red"))
				list.remove(a);
		}
		
		System.out.println(list);
		
	}
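To make the difference concrete, here is a small sketch that contrasts the two approaches on a list of color names. SafeRemovalDemo and its two methods are hypothetical names chosen for this example; an ArrayList is used and the first element is removed, so that the fail-fast check is certain to trigger.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

final class SafeRemovalDemo {
    // Removes every occurrence of `unwanted` through the iterator itself.
    static void removeAllMatching(List<String> colors, String unwanted) {
        Iterator<String> it = colors.iterator();
        while (it.hasNext()) {
            if (it.next().equals(unwanted)) {
                it.remove(); // keeps the iterator and the list in sync
            }
        }
    }

    // Returns true if removing through the list inside a for-each loop fails fast.
    static boolean forEachRemovalFails(List<String> colors, String unwanted) {
        try {
            for (String color : colors) {
                if (color.equals(unwanted)) {
                    colors.remove(color); // modifies the list behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> first = new ArrayList<>(Arrays.asList("green", "black", "red"));
        System.out.println(forEachRemovalFails(first, "green")); // true

        List<String> second = new ArrayList<>(Arrays.asList("green", "black", "red"));
        removeAllMatching(second, "red");
        System.out.println(second); // [green, black]
    }
}
```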

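3. How to convert a List to int[]?

The easiest way is to use ArrayUtils from Apache Commons Lang.

int[] array = ArrayUtils.toPrimitive(list.toArray(new Integer[0]));

Before Java 8 the JDK offers no shortcut. List.toArray() does not help, because it produces an array of Integer objects rather than an int[]. The correct way is an explicit loop: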

int[] array = new int[list.size()];
for(int i=0; i < list.size(); i++) {
  array[i] = list.get(i);
}
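On Java 8 or later, the stream API offers a more compact alternative to the loop above. This is only a sketch under that assumption; ListToIntArray and toIntArray are illustrative names, not part of any library.

```java
import java.util.Arrays;
import java.util.List;

final class ListToIntArray {
    // Unboxes each Integer into an int; a null element would cause a NullPointerException.
    static int[] toIntArray(List<Integer> list) {
        return list.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        int[] array = toIntArray(Arrays.asList(1, 2, 3));
        System.out.println(Arrays.toString(array)); // [1, 2, 3]
    }
}
```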

-------2015.5.8-----------

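List.toArray() copies the objects in the List into an Object[] and returns it.

Apple[] apples = (Apple[]) arrayList.toArray();

Casting the returned Object[] to another array type as above throws ClassCastException. One workaround is to cast each element instead: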

        Object[] os = list.toArray();
        for(Object apple:os){
            System.out.println(((Apple)apple).getColor());
        }

Alternatively, use the other overload, List.toArray(T[] a):

        Apple[] apples = new Apple[0];
        apples = arrayList.toArray(apples);
        for(Apple apple: apples){
            System.out.println(apple.getColor());
        }

In ArrayList, both toArray calls above end up in Arrays.copyOf:

    public static <T,U> T[] copyOf(U[] original, int newLength, Class<? extends T[]> newType) {
        T[] copy = ((Object)newType == (Object)Object[].class)
            ? (T[]) new Object[newLength]
            : (T[]) Array.newInstance(newType.getComponentType(), newLength);
        System.arraycopy(original, 0, copy, 0,
                         Math.min(original.length, newLength));
        return copy;
    }

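toArray() passes Object[].class as newType, while toArray(T[] a) passes the runtime class of the array a, that is T[]. copyOf() creates and returns an array of type newType.

As for casts in Java: casting up an inheritance hierarchy (upcasting) always works. A downcast succeeds only if the object really is an instance of the target type; otherwise a ClassCastException is thrown at runtime.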
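The difference between the two toArray overloads can be checked with a few lines. The sketch below uses String elements instead of Apple to stay self-contained; ToArrayDemo and its method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ToArrayDemo {
    // Returns true if casting the result of the no-argument toArray() fails.
    static boolean castOfUntypedArrayFails(ArrayList<String> list) {
        try {
            String[] names = (String[]) list.toArray(); // runtime type is Object[]
            return names == null;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // The typed overload creates an array of the requested component type.
    static String[] toTypedArray(List<String> list) {
        return list.toArray(new String[0]);
    }

    public static void main(String[] args) {
        ArrayList<String> list = new ArrayList<>(Arrays.asList("green", "red"));
        System.out.println(castOfUntypedArrayFails(list));        // true
        System.out.println(Arrays.toString(toTypedArray(list)));  // [green, red]
    }
}
```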

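4. How to convert int[] to a List?

This is the reverse of question 3.

The easiest way is again ArrayUtils.

List<Integer> list = Arrays.asList(ArrayUtils.toObject(array));

With the JDK alone, it can be done like this: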

int[] array = {1,2,3,4,5};
List<Integer> list = new ArrayList<Integer>();
for(int i: array) {
  list.add(i);
}
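Assuming Java 8 or later, the same conversion can be sketched with streams. Note that Arrays.asList(array) is not a substitute here: for an int[] it yields a List<int[]> with a single element. IntArrayToList and toList are illustrative names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class IntArrayToList {
    // boxed() turns the IntStream into a Stream<Integer> so it can be collected into a List.
    static List<Integer> toList(int[] array) {
        return Arrays.stream(array).boxed().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        int[] array = {1, 2, 3, 4, 5};
        System.out.println(toList(array)); // [1, 2, 3, 4, 5]
    }
}
```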

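5. What is the best way to filter a Collection?

The easiest way is to use a third-party library such as Guava or Apache Commons Collections, both of which provide a filter() method that selects the elements matching a Predicate.

With the plain JDK it takes more work (the good news is that Java 8 added a Predicate interface). Before that, an Iterator does the job: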

Iterator<Integer> itr = list.iterator();
while(itr.hasNext()) {
   int i = itr.next();
   if (i > 5) { // filter all ints bigger than 5
      itr.remove();
   }
}

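Of course, you can also imitate Guava and define your own Predicate interface, which is what many experienced developers do. Note that the filter() method below removes the elements that match the predicate.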

public interface Predicate<T> {
   boolean test(T o);
}
 
public static <T> void filter(Collection<T> collection, Predicate<T> predicate) {
    if ((collection != null) && (predicate != null)) {
       Iterator<T> itr = collection.iterator();
          while(itr.hasNext()) {
            T obj = itr.next();
            if (predicate.test(obj)) {
               itr.remove();
            }
        }
    }
}

filter(list, new Predicate<Integer>() {
    public boolean test(Integer i) { 
       return i > 5; 
    }
});
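With Java 8 or later the hand-written interface is no longer needed. The following sketch, with made-up names FilterDemo, dropGreaterThan and keepAtMost, shows an in-place variant based on Collection.removeIf() and a non-destructive variant based on streams.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class FilterDemo {
    // In place: drops every element greater than `limit`.
    static void dropGreaterThan(List<Integer> list, int limit) {
        list.removeIf(i -> i > limit);
    }

    // Non-destructive: returns a new list and leaves the input untouched.
    static List<Integer> keepAtMost(List<Integer> list, int limit) {
        return list.stream().filter(i -> i <= limit).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(Arrays.asList(3, 7, 5, 9));
        System.out.println(keepAtMost(list, 5)); // [3, 5]
        dropGreaterThan(list, 5);
        System.out.println(list);                // [3, 5]
    }
}
```

removeIf() needs a modifiable collection, so the stream variant is the safer choice when the input may be a fixed-size or read-only list.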

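6. The easiest way to convert a List to a Set

There are two ways, depending on your needs. The first is to put the List into a HashSet: elements that are equal according to equals() (and therefore have the same hashCode()) are kept only once. This is sufficient in most cases. The second is to use a TreeSet when the elements should be kept in sorted order.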

Set<Integer> set = new HashSet<Integer>(list);

Set<Integer> set = new TreeSet<Integer>(aComparator);
set.addAll(list);
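A short sketch of both variants follows. ListToSetDemo and toSortedSet are hypothetical names, and Comparator.reverseOrder() (Java 8) stands in for the aComparator placeholder used above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class ListToSetDemo {
    // Builds a TreeSet that keeps the elements ordered by the given comparator.
    static Set<Integer> toSortedSet(List<Integer> list, Comparator<Integer> order) {
        Set<Integer> set = new TreeSet<>(order);
        set.addAll(list);
        return set;
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(3, 1, 3, 2);

        Set<Integer> hashSet = new HashSet<>(list); // duplicates dropped, no ordering guarantee
        System.out.println(hashSet.size());         // 3

        Set<Integer> sorted = toSortedSet(list, Comparator.reverseOrder());
        System.out.println(new ArrayList<>(sorted)); // [3, 2, 1]
    }
}
```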

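7. How to remove duplicate elements from an ArrayList

This is similar to the previous question: put the elements of the ArrayList into a Set to drop the duplicates, then put them back into the ArrayList. If sorted order is required, use a TreeSet.

ArrayList<Integer> list = ... // initialize a list with duplicate elements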
Set<Integer> set = new HashSet<Integer>(list);
list.clear();
list.addAll(set);

ArrayList<Integer> list = ... // initialize a list with duplicate elements
Comparator<Integer> comparator = new Comparator<Integer>() {
	@Override
	public int compare(Integer o1, Integer o2) {
		// compareTo avoids the integer overflow that "o1 - o2" can cause
		return o1.compareTo(o2);
	}
};
Set<Integer> set = new TreeSet<Integer>(comparator);
set.addAll(list);
list.clear();
list.addAll(set);
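Neither variant above keeps the elements in their original order: HashSet gives no ordering guarantee and TreeSet sorts. If the order of first occurrence matters, a LinkedHashSet is one option. This is a small sketch with invented names (DedupeDemo, dedupePreservingOrder), not something taken from the original article.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

final class DedupeDemo {
    // LinkedHashSet remembers insertion order, so the first occurrence of each element wins.
    static List<Integer> dedupePreservingOrder(List<Integer> list) {
        return new ArrayList<>(new LinkedHashSet<>(list));
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(3, 1, 3, 2, 1);
        System.out.println(dedupePreservingOrder(list)); // [3, 1, 2]
    }
}
```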

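8. Sorted collections

There are several ways to obtain a sorted collection.

1) Collections.sort() sorts a List in O(n log n) time.

2) PriorityQueue provides an ordered Queue. Unlike Collections.sort(), it maintains its order at all times, but only the head element can be retrieved; elements cannot be accessed by index, so there is nothing like PriorityQueue.get(3).

3) If there are no duplicate elements, a TreeSet is another option. Like PriorityQueue it stays ordered at all times, and in addition it can be read from the lowest element to the highest, but it still offers no access by index.

In short, Collections.sort() gives a one-time sorted List, while PriorityQueue and TreeSet stay ordered at all times at the cost of access by index.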
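The three options can be compared side by side. In the sketch below, SortedCollectionsDemo and drainInOrder are names made up for illustration; everything else is standard java.util API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;
import java.util.TreeSet;

final class SortedCollectionsDemo {
    // PriorityQueue only exposes its head, so reading everything in order means polling.
    static List<Integer> drainInOrder(PriorityQueue<Integer> queue) {
        List<Integer> result = new ArrayList<>();
        while (!queue.isEmpty()) {
            result.add(queue.poll()); // poll() removes and returns the smallest element
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(Arrays.asList(5, 1, 4));
        Collections.sort(list);   // one-time sort; later add() calls are not re-sorted
        System.out.println(list); // [1, 4, 5]

        PriorityQueue<Integer> queue = new PriorityQueue<>(Arrays.asList(5, 1, 4));
        System.out.println(queue.peek());        // 1
        System.out.println(drainInOrder(queue)); // [1, 4, 5]

        TreeSet<Integer> set = new TreeSet<>(Arrays.asList(5, 1, 4));
        System.out.println(set.first() + " " + set.last()); // 1 5
    }
}
```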

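9. Collections.emptyList() vs. a new instance

The same applies to emptyMap() and emptySet().

Both return an empty List, but the list returned by Collections.emptyList() is immutable, so no elements can be added to it. Also, Collections.emptyList() does not have to create a new List instance on every call; it can reuse an existing one.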

--------2015.5.10----------------

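How are Collections.emptyList(), emptySet() and emptyMap() used in practice?

According to material found online, they are useful when a variable should not be null before the real Collection is available: assign it emptyXXX() instead. Because emptyXXX() returns a shared static final instance, this can save a lot of memory when there are many such places.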
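A typical use is returning an empty result instead of null. The sketch below is purely illustrative: EmptyListDemo, findTags and the tagsByUser map are hypothetical names, and isReadOnly only exists to show that the returned list rejects modification.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class EmptyListDemo {
    // Hypothetical lookup that never returns null, so callers can iterate without a null check.
    static List<String> findTags(Map<String, List<String>> tagsByUser, String user) {
        List<String> tags = tagsByUser.get(user);
        return tags != null ? tags : Collections.<String>emptyList();
    }

    // Returns true if the list refuses to accept new elements.
    static boolean isReadOnly(List<String> list) {
        try {
            list.add("x");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> tagsByUser = new HashMap<>();
        tagsByUser.put("alice", Arrays.asList("java", "collections"));

        List<String> missing = findTags(tagsByUser, "bob");
        System.out.println(missing.isEmpty());   // true
        System.out.println(isReadOnly(missing)); // true
    }
}
```

Callers that need to add elements later should create their own ArrayList instead of relying on the shared empty instance.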

10. Collections.copy()

There are two ways to copy a source list into a destination list. One is to use the ArrayList constructor.

 ArrayList<Integer> dstList = new ArrayList<Integer>(srcList);  

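The other way is to use Collections.copy(), as shown below. Note the first line: according to the Collections javadoc, the destination list must be at least as long as the source list. Here "long" means size(), not capacity, so new ArrayList<Integer>(srcList.size()) is not enough, since it creates an empty list that only reserves capacity. The destination is therefore pre-filled with null placeholders.

 ArrayList<Integer> dstList = new ArrayList<Integer>(Collections.<Integer>nCopies(srcList.size(), null));
 Collections.copy(dstList, srcList);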

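Both approaches perform a shallow copy. So what is the difference between them?

First, Collections.copy() never reallocates dstList, even when dstList is too short to hold all the elements of srcList; it simply throws an IndexOutOfBoundsException. Some may ask what the benefit is. One reason is that this guarantees the method runs in linear time. It is also handy when you want to reuse an existing list instead of allocating new memory through the ArrayList constructor.

Second, Collections.copy() only accepts Lists as source and destination, while the ArrayList constructor is more general: it accepts any Collection.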
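The size requirement is easy to get wrong, because an ArrayList created with an initial capacity is still empty. The sketch below shows one destination that works and one that fails; CopyDemo and its methods are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class CopyDemo {
    // Pre-fills the destination with nulls so that its size() matches the source.
    static List<Integer> copyInto(List<Integer> src) {
        List<Integer> dst = new ArrayList<>(Collections.nCopies(src.size(), (Integer) null));
        Collections.copy(dst, src);
        return dst;
    }

    // Returns true if copying into a list that only has capacity, not size, is rejected.
    static boolean copyIntoEmptyFails(List<Integer> src) {
        List<Integer> dst = new ArrayList<>(src.size()); // capacity only; size() is still 0
        try {
            Collections.copy(dst, src);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> src = Arrays.asList(1, 2, 3);
        System.out.println(copyInto(src));           // [1, 2, 3]
        System.out.println(copyIntoEmptyFails(src)); // true
    }
}
```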

-------------------update 2015.3.23---------------

Regarding question 2 above:

Suppose the collection is modified while it is being traversed with an ordinary index-based loop:

    public static void main(String[] args) {

        List<String> list = new LinkedList<String>();
        list.add("a");
        list.add("b");
        list.add("c");

        for(int i=0;i<list.size();i++){
            System.out.println(list.get(i));
            if(i == 0)
                list.remove(i);
        }
    }

Output:

a
c

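After the removal the remaining elements move up: b now sits at index 0, but on the next iteration i is already 1, so b is never printed even though it is still in the collection.

With the enhanced for loop: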

    public static void main(String[] args) {

        List<String> list = new LinkedList<String>();
        list.add("a");
        list.add("b");
        list.add("c");

        for(String s : list){
            System.out.println(s);
            if(s.equals("a"))
                list.remove(s);
        }
    }

Output:

a
Exception in thread "main" java.util.ConcurrentModificationException
	at java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:761)
	at java.util.LinkedList$ListItr.next(LinkedList.java:696)
	at xixixix.Test2.main(Test2.java:21)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)

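When removing elements during traversal, it is best to use Iterator.remove(); otherwise a ConcurrentModificationException may be thrown.

An explanation often repeated online says that an Iterator runs in its own thread and holds a mutex lock. That is not accurate: no extra thread or lock is involved. What actually happens is that the collection keeps a modification counter (modCount in the JDK sources), and the Iterator records its value when it is created. Each call to next() compares the two; if the collection has been structurally modified behind the Iterator's back, the values differ and the Iterator fails fast with java.util.ConcurrentModificationException.
So the collection must not be modified directly while an Iterator is traversing it. Iterator.remove() is allowed because it removes the current element and then resynchronizes the Iterator's recorded counter with the collection.

Java's enhanced for loop (for-each) works on arrays and on any container that implements the Iterable interface, such as a Collection. For an Iterable, the traversal is actually carried out with an Iterator.

Traversing a Collection with an ordinary index-based for loop and removing elements at the same time does not throw ConcurrentModificationException, but as shown above it can silently skip elements.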
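If an index-based loop is preferred anyway, one common way to avoid skipping elements is to walk the list backwards, so that a removal only shifts elements that have already been visited. A minimal sketch, with IndexRemovalDemo and removeAllByIndex as made-up names:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class IndexRemovalDemo {
    // Iterates from the last index down; removing index i never affects indexes below i.
    static void removeAllByIndex(List<String> list, String unwanted) {
        for (int i = list.size() - 1; i >= 0; i--) {
            if (list.get(i).equals(unwanted)) {
                list.remove(i); // remove(int) removes by position
            }
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("a", "a", "b"));
        removeAllByIndex(list, "a");
        System.out.println(list); // [b]
    }
}
```

A forward loop over the same input would remove the first "a", move on to index 1, and miss the second "a" that has just shifted to index 0. On a LinkedList, get(i) and remove(i) are O(n) each, so Iterator.remove() remains the better fit there.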