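Java Collections: Summary and Source Code Analysis
Collections Overview
The collections framework is split into two main hierarchies: Collection and Map.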
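- Set: interface; no duplicates; the interface itself guarantees no iteration order
- HashSet: Set implementation; no duplicates; no guaranteed iteration order
- SortedSet: interface extending Set; elements are kept in sorted order; no duplicates
- TreeSet: SortedSet implementation (through NavigableSet); sorted; no duplicates
- LinkedHashSet: Set implementation (extends HashSet); no duplicates; iterates in insertion order because a linked list runs through its entries
- List: interface; ordered by index; duplicates allowed
- ArrayList: array-backed List implementation; ordered; duplicates allowed; not thread-safe
- Vector: List implementation; ordered; duplicates allowed; thread-safe (synchronized methods)
- LinkedList: linked-list-based List implementation; ordered; duplicates allowed; also usable as a Deque
- Queue: interface; a queue, typically first-in-first-out
- PriorityQueue: Queue implementation; elements leave the queue by priority rather than by insertion order
- Map: interface; defines key-value storage; keys are unique; the interface itself guarantees no ordering
- AbstractMap: abstract class implementing Map; skeletal implementation shared by the general-purpose maps
- HashMap: extends AbstractMap; no guaranteed order; unique keys
- TreeMap: extends AbstractMap and implements NavigableMap; sorted by key; unique keys
- SortedMap: interface extending Map; abstraction for maps sorted by key
- NavigableMap: interface extending SortedMap; adds navigation methods on top of the sorted order; implemented by TreeMap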
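The ordering properties listed above can be checked with a small experiment. This is only a minimal sketch: the class name SetOrderingDemo and the sample strings are illustrative assumptions, not part of the original notes. It feeds the same input into a LinkedHashSet and a TreeSet and returns the resulting iteration order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetOrderingDemo {
    // Adds the same sample input (with one duplicate) to the given set
    // and returns the order in which the set iterates over it.
    static List<String> fill(Set<String> set) {
        set.addAll(Arrays.asList("pear", "apple", "fig", "apple"));
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        // Insertion order is kept, the duplicate "apple" is dropped.
        System.out.println(fill(new LinkedHashSet<>())); // [pear, apple, fig]
        // Natural (alphabetical) order of the elements.
        System.out.println(fill(new TreeSet<>()));       // [apple, fig, pear]
    }
}
```

A HashSet is left out on purpose: its iteration order depends on hash codes and table size, so no particular output should be relied on.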
Typical Source Code Analysis
case1 Why does adding or removing elements of a Collection inside a foreach loop throw java.util.ConcurrentModificationException?
Take the following code as an example. Running it throws java.util.ConcurrentModificationException.
public static void main(String[] args) {
    ArrayList<String> arrayList = new ArrayList<>();
    arrayList.add("111");
    arrayList.add("222");
    arrayList.add("333");
    arrayList.add("444");
    for (String str : arrayList) {
        if ("222".equals(str)) {
            arrayList.remove(str);
        }
    }
}
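- To understand why, a few concepts are needed first. ArrayList has an inner class Itr that implements the Iterator interface. It acts as a lightweight cursor over the enclosing list (it does not copy the data), and every ArrayList traversal, including foreach, goes through it. Its most important field is expectedModCount. It is initialized from the list's modCount (the number of structural modifications made so far, not the list's size) when the iterator is created, and it records the modification count the iterator expects to keep seeing. cursor is the index of the next element to return.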
/**
 * An optimized version of AbstractList.Itr
 */
private class Itr implements Iterator<E> {
    int cursor;       // index of next element to return
    int lastRet = -1; // index of last element returned; -1 if no such
    int expectedModCount = modCount;
}
- The source of arrayList.remove is shown below. Every successful remove call increments modCount by 1.
public boolean remove(Object o) {
    if (o == null) {
        for (int index = 0; index < size; index++)
            if (elementData[index] == null) {
                fastRemove(index);
                return true;
            }
    } else {
        for (int index = 0; index < size; index++)
            if (o.equals(elementData[index])) {
                fastRemove(index);
                return true;
            }
    }
    return false;
}
private void fastRemove(int index) {
    modCount++; // key step: every structural change increments modCount
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    elementData[--size] = null; // clear to let GC do its work
}
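- While iterating, the ArrayList iterator calls next() to fetch each element. The first thing next() does is a check that compares modCount with expectedModCount; if the two differ, it throws ConcurrentModificationException. In the example, the remove call inside the loop incremented modCount while expectedModCount kept its old value, so the following next() call fails.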
public E next() {
    checkForComodification(); // key step: fail fast if the list changed
    int i = cursor;
    if (i >= size)
        throw new NoSuchElementException();
    Object[] elementData = ArrayList.this.elementData;
    if (i >= elementData.length)
        throw new ConcurrentModificationException();
    cursor = i + 1;
    return (E) elementData[lastRet = i];
}
final void checkForComodification() {
    if (modCount != expectedModCount)
        throw new ConcurrentModificationException();
}
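The failure described in case1 can be reproduced in a controlled way. The sketch below is an illustration only: the class ForEachRemoveDemo and the helper throwsCme are hypothetical names introduced here. It runs the same loop and reports whether the iterator detected the modification.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

class ForEachRemoveDemo {
    // Removes `target` inside a foreach loop and reports whether the
    // iterator's modCount check fired on the following next() call.
    static boolean throwsCme(List<String> list, String target) {
        try {
            for (String s : list) {
                if (target.equals(s)) {
                    list.remove(s); // increments modCount, expectedModCount stays behind
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("111", "222", "333", "444"));
        System.out.println(throwsCme(list, "222")); // true
    }
}
```

Catching the exception is done here only to observe it; production code should avoid the modification rather than catch the exception.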
case2 Why does the same operation succeed when it targets the second-to-last element during a foreach traversal?
public static void main(String[] args) {
    ArrayList<String> arrayList = new ArrayList<>();
    arrayList.add("111");
    arrayList.add("222");
    arrayList.add("333");
    arrayList.add("444");
    for (String str : arrayList) {
        if ("333".equals(str)) {
            arrayList.remove(str);
        }
    }
    System.out.println(arrayList);
}
// prints [111, 222, 444], as expected
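- Look at remove again. After the element is removed, the trailing elements are shifted left with an array copy and size is decremented by 1 (the backing array itself keeps its length).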
public boolean remove(Object o) {
    if (o == null) {
        for (int index = 0; index < size; index++)
            if (elementData[index] == null) {
                fastRemove(index);
                return true;
            }
    } else {
        for (int index = 0; index < size; index++)
            if (o.equals(elementData[index])) {
                fastRemove(index);
                return true;
            }
    }
    return false;
}
private void fastRemove(int index) {
    modCount++;
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved); // shifts the trailing elements left
    elementData[--size] = null; // clear to let GC do its work; key step: size drops by 1
}
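- The loop checks for a next element at the start of every iteration. At this point cursor is 3 and size is also 3, so hasNext() returns false. The traversal ends right there, next() is never called again, and no exception is thrown. As a side effect, the last element ("444") is never visited by the loop.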
public boolean hasNext() {
    return cursor != size;
}
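The early exit in case2 is easier to see when the loop records what it visits. The following is a minimal sketch; SecondToLastDemo and visitedWhileRemoving are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SecondToLastDemo {
    // Runs the case2 loop and returns the elements the loop body actually saw.
    static List<String> visitedWhileRemoving(List<String> list, String target) {
        List<String> visited = new ArrayList<>();
        for (String s : list) {
            visited.add(s);
            if (target.equals(s)) {
                list.remove(s); // size now equals cursor, so hasNext() returns false
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("111", "222", "333", "444"));
        System.out.println(visitedWhileRemoving(list, "333")); // [111, 222, 333]
        System.out.println(list);                              // [111, 222, 444]
    }
}
```

The list ends up correct, but "444" was silently skipped, so this pattern works by accident and should not be relied on.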
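case3 Why can elements be removed inside the loop when the same Collection is traversed with an Iterator?
- Iterating with an explicit iterator gives the expected result.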
public static void main(String[] args) {
    ArrayList<String> arrayList = new ArrayList<>();
    arrayList.add("111");
    arrayList.add("222");
    arrayList.add("333");
    arrayList.add("444");
    Iterator<String> iterator = arrayList.iterator();
    while (iterator.hasNext()) {
        String next = iterator.next();
        if ("222".equals(next)) {
            iterator.remove();
        }
    }
    System.out.println(arrayList);
}
// prints [111, 333, 444], as expected
- The key is which remove is called. The iterator's remove is not the method used in the previous two cases; it is defined in the inner class Itr.
public void remove() {
    if (lastRet < 0)
        throw new IllegalStateException();
    checkForComodification();
    try {
        ArrayList.this.remove(lastRet);
        cursor = lastRet;
        lastRet = -1;
        expectedModCount = modCount; // key step: resync with the list's modCount
    } catch (IndexOutOfBoundsException ex) {
        throw new ConcurrentModificationException();
    }
}
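- As shown, the iterator's remove first applies the deletion to the enclosing list, then moves the cursor back to the removed index, and finally resets expectedModCount to modCount. The next check therefore passes and no exception is thrown.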
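When the goal is simply to drop matching elements, Collection.removeIf (available since Java 8) is a shorter alternative to the explicit iterator loop. It is a different technique from the Itr.remove walkthrough above, shown here only as a sketch; the class RemoveIfDemo and the helper withoutMatches are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemoveIfDemo {
    // Returns a copy of `source` with every element equal to `target` removed.
    static List<String> withoutMatches(List<String> source, String target) {
        List<String> copy = new ArrayList<>(source);
        copy.removeIf(target::equals); // removal is handled inside the list, no foreach involved
        return copy;
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("111", "222", "333", "444");
        System.out.println(withoutMatches(list, "222")); // [111, 333, 444]
    }
}
```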
case4 HashSet is backed by a HashMap
// Dummy value to associate with an Object in the backing Map
private static final Object PRESENT = new Object();
/**
 * Constructs a new, empty set; the backing <tt>HashMap</tt> instance has
 * default initial capacity (16) and load factor (0.75).
 */
public HashSet() {
    map = new HashMap<>();
}
public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}
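The add method above relies on Map.put returning the previous value for a key, or null when the key was absent. The sketch below imitates that one line on a plain HashMap to show why a second add of the same element reports false; PutReturnDemo and addLikeHashSet are hypothetical names, not JDK code.

```java
import java.util.HashMap;
import java.util.Map;

class PutReturnDemo {
    // Shared dummy value, mirroring the PRESENT constant in HashSet.
    private static final Object PRESENT = new Object();

    // Same expression as HashSet.add: true only if the key was not present before.
    static boolean addLikeHashSet(Map<String, Object> map, String e) {
        return map.put(e, PRESENT) == null;
    }

    public static void main(String[] args) {
        Map<String, Object> map = new HashMap<>();
        System.out.println(addLikeHashSet(map, "a")); // true
        System.out.println(addLikeHashSet(map, "a")); // false
        System.out.println(map.size());               // 1
    }
}
```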
Collection Utility Classes
Common Collections APIs
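public static void main(String[] args) {
    List<Integer> list = new ArrayList<>();
    list.add(100);
    list.add(54);
    list.add(2);
    list.add(67);
    list.add(999);
    Integer max = Collections.max(list); // largest element of the list
    Integer min = Collections.min(list); // smallest element of the list
    List<Integer> emptyList = Collections.emptyList(); // immutable empty list (type-safe form of EMPTY_LIST)
    Map<String, Integer> emptyMap = Collections.emptyMap(); // immutable empty map (type-safe form of EMPTY_MAP)
    Collections.shuffle(list); // randomly reorders the list in place
    Collections.sort(list); // sorts the list in ascending natural order
    List<Integer> integers = Collections.synchronizedList(list); // synchronized (thread-safe) wrapper around list
}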
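Two of the calls above are easy to misread: shuffle permutes the whole list rather than picking one element, and the empty collections are immutable. A minimal sketch of both points follows; the class CollectionsDemo and its method names are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class CollectionsDemo {
    // Shuffles a copy (same elements, random order) and sorts it again.
    static List<Integer> shuffledThenSorted(List<Integer> input) {
        List<Integer> copy = new ArrayList<>(input);
        Collections.shuffle(copy); // reorders in place, nothing is added or removed
        Collections.sort(copy);    // ascending natural order
        return copy;
    }

    // Reports whether adding to Collections.emptyList() is rejected.
    static boolean emptyListRejectsAdd() {
        List<Integer> empty = Collections.emptyList();
        try {
            empty.add(1);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(shuffledThenSorted(Arrays.asList(100, 54, 2, 67, 999))); // [2, 54, 67, 100, 999]
        System.out.println(emptyListRejectsAdd()); // true
    }
}
```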
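Common Arrays APIs
/**
 * Quickly builds a fixed-size list from a variable number of arguments.
 * Note: the result is not java.util.ArrayList but java.util.Arrays.ArrayList,
 * a private nested class of Arrays that is backed by the argument array.
 * Elements can be read and replaced (get/set), but structural changes such as
 * add or remove throw java.lang.UnsupportedOperationException.
 */
List<String> strings = Arrays.asList("aaa", "bbb");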
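The fixed-size behavior of Arrays.asList can be sketched as follows. AsListDemo and its helper names are hypothetical; wrapping the result in a new ArrayList is one common way to get a fully mutable copy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AsListDemo {
    // set() works on the fixed-size list, add() is rejected.
    static boolean addThrows() {
        List<String> fixed = Arrays.asList("aaa", "bbb");
        fixed.set(0, "zzz"); // replacing an element is allowed
        try {
            fixed.add("ccc"); // changing the size is not
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    // Copies the fixed-size list into a regular, resizable ArrayList.
    static List<String> mutableCopy() {
        List<String> copy = new ArrayList<>(Arrays.asList("aaa", "bbb"));
        copy.add("ccc");
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(addThrows());   // true
        System.out.println(mutableCopy()); // [aaa, bbb, ccc]
    }
}
```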
Java 8 Streams: Convenient Collection Operations
/**
 * StudentInfo class used by the examples that follow
 */
public class StudentInfo {
    private String userId;   // id
    private String userName; // name
    private int age;         // age
    private int score;       // exam score
    // getters, setters, and constructors omitted
}
List to Map
public static void main(String[] args) {
    List<StudentInfo> studentInfos = new ArrayList<>();
    studentInfos.add(new StudentInfo("user_1","小明",25,65));
    studentInfos.add(new StudentInfo("user_2","小红",27,80));
    studentInfos.add(new StudentInfo("user_3","小花",20,90));
    studentInfos.add(new StudentInfo("user_4","小李",18,75));
    studentInfos.add(new StudentInfo("user_5","小明",50,100));
    studentInfos.add(new StudentInfo("user_5","小琴",50,90));
    /**
     * Convert the list to a map keyed by userId.
     * 1) With the two-argument toMap, the key property must be unique; a duplicate throws java.lang.IllegalStateException: Duplicate key
     * 2) The third argument is a merge function applied to the two values that share a key, e.g. (v1, v2) -> v2 keeps the later value
     */
    Map<String, StudentInfo> collect = studentInfos.stream().collect(Collectors.toMap(StudentInfo::getUserId, k -> k, (v1, v2) -> v2));
    // key is userId, value is userName
    Map<String, String> collect1 = studentInfos.stream().collect(Collectors.toMap(StudentInfo::getUserId, StudentInfo::getUserName, (v1, v2) -> v2));
}
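The difference between the two-argument and three-argument forms of Collectors.toMap can be sketched as below. To keep the example self-contained it uses a hypothetical Student class with only two fields and ASCII sample names, as a stand-in for StudentInfo; ToMapDemo and its method names are likewise assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ToMapDemo {
    // Hypothetical stand-in for StudentInfo with only the fields used here.
    static final class Student {
        private final String userId;
        private final String userName;
        Student(String userId, String userName) {
            this.userId = userId;
            this.userName = userName;
        }
        String getUserId() { return userId; }
        String getUserName() { return userName; }
    }

    // Sample data with a deliberately duplicated key ("user_5").
    static final List<Student> STUDENTS = Arrays.asList(
            new Student("user_4", "Li"),
            new Student("user_5", "Ming"),
            new Student("user_5", "Qin"));

    // Without a merge function the duplicate key is rejected.
    static boolean twoArgToMapThrows() {
        try {
            STUDENTS.stream().collect(Collectors.toMap(Student::getUserId, Student::getUserName));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // With a merge function the later value replaces the earlier one.
    static Map<String, String> keepLast() {
        return STUDENTS.stream().collect(
                Collectors.toMap(Student::getUserId, Student::getUserName, (first, second) -> second));
    }

    public static void main(String[] args) {
        System.out.println(twoArgToMapThrows());      // true
        System.out.println(keepLast().get("user_5")); // Qin
    }
}
```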
List to List
public static void main(String[] args) {
    List<StudentInfo> studentInfos = new ArrayList<>();
    studentInfos.add(new StudentInfo("user_1","小明",25,65));
    studentInfos.add(new StudentInfo("user_2","小红",27,80));
    studentInfos.add(new StudentInfo("user_3","小花",20,90));
    studentInfos.add(new StudentInfo("user_4","小李",18,75));
    studentInfos.add(new StudentInfo("user_5","小明",50,100));
    studentInfos.add(new StudentInfo("user_5","小琴",50,90));
    // extract the userId of every object into a list
    List<String> collect = studentInfos.stream().map(StudentInfo::getUserId).collect(Collectors.toList());
    // extract the userId of every object into a set
    Set<String> setCollect = studentInfos.stream().map(StudentInfo::getUserId).collect(Collectors.toSet());
    // extract the userId of every object into a list without duplicates
    List<String> collect1 = studentInfos.stream().map(StudentInfo::getUserId).distinct().collect(Collectors.toList());
}
Grouping and Statistics
public static void main(String[] args) {
    List<StudentInfo> studentInfos = new ArrayList<>();
    studentInfos.add(new StudentInfo("user_1","小明",25,65));
    studentInfos.add(new StudentInfo("user_2","小红",27,80));
    studentInfos.add(new StudentInfo("user_3","小花",20,90));
    studentInfos.add(new StudentInfo("user_4","小李",18,75));
    studentInfos.add(new StudentInfo("user_5","小明",50,100));
    studentInfos.add(new StudentInfo("user_5","小琴",50,90));
    // group by userId: key is userId, value is the list of elements in that group
    Map<String, List<StudentInfo>> collect = studentInfos.stream().collect(Collectors.groupingBy(StudentInfo::getUserId));
    // group by userName: key is userName, value is the list of elements in that group
    Map<String, List<StudentInfo>> collect1 = studentInfos.stream().collect(Collectors.groupingBy(StudentInfo::getUserName));
    // group by userId and compute summary statistics of the scores in each group
    Map<String, IntSummaryStatistics> mapSummary = studentInfos.stream().collect(Collectors.groupingBy(StudentInfo::getUserId, Collectors.summarizingInt(StudentInfo::getScore)));
    IntSummaryStatistics user_5 = mapSummary.get("user_5");
    double average = user_5.getAverage(); // average
    long count = user_5.getCount(); // count
    int max = user_5.getMax(); // maximum
    int min = user_5.getMin(); // minimum
    long sum = user_5.getSum(); // sum
    // group by userId and total the scores: key is userId, value is the group's total score
    Map<String, Integer> collect2 = studentInfos.stream().collect(Collectors.groupingBy(StudentInfo::getUserId, Collectors.summingInt(StudentInfo::getScore)));
    // group by userId and average the ages: key is userId, value is the group's average age
    Map<String, Double> collect3 = studentInfos.stream().collect(Collectors.groupingBy(StudentInfo::getUserId, Collectors.averagingInt(StudentInfo::getAge)));
}
Other Usages
public static void main(String[] args) {
    List<StudentInfo> studentInfos = new ArrayList<>();
    studentInfos.add(new StudentInfo("user_1","小明",25,65));
    studentInfos.add(new StudentInfo("user_2","小红",27,80));
    studentInfos.add(new StudentInfo("user_3","小花",20,90));
    studentInfos.add(new StudentInfo("user_4","小李",18,75));
    studentInfos.add(new StudentInfo("user_5","小明",50,100));
    studentInfos.add(new StudentInfo("user_5","小琴",50,90));
    // take the first 3 elements
    List<StudentInfo> collect = studentInfos.stream().limit(3).collect(Collectors.toList());
    // keep students older than 20 (drops everyone aged 20 or younger)
    List<StudentInfo> collect1 = studentInfos.stream().filter(k -> k.getAge() > 20).collect(Collectors.toList());
    // iterate over the elements
    studentInfos.stream().forEach(k -> System.out.println(k.getUserId()));
    // sort by age in descending order
    studentInfos.stream().sorted(Comparator.comparing(StudentInfo::getAge).reversed()).forEach(k -> System.out.println(k.getAge()));
    // count scores above 80; use parallel streams with care: they run on the common ForkJoinPool,
    // so any shared mutable state touched by the lambdas must be thread-safe
    long count = studentInfos.parallelStream().filter(k -> k.getScore() > 80).count();
}
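The caution about parallel streams mostly concerns lambdas that write to shared, non-thread-safe objects. A minimal sketch of the safer pattern, letting collect build the result instead of adding to an external ArrayList from forEach, is shown below. ParallelStreamDemo and evensParallel are hypothetical names, and plain integers are used instead of StudentInfo to keep the example self-contained.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelStreamDemo {
    // Collects the even numbers in 1..n on a parallel stream.
    // collect(toList()) merges per-thread partial results safely and
    // keeps the encounter order of the ordered source.
    static List<Integer> evensParallel(int n) {
        return IntStream.rangeClosed(1, n)
                .parallel()
                .filter(i -> i % 2 == 0)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(evensParallel(10)); // [2, 4, 6, 8, 10]
    }
}
```

By contrast, calling add on a shared ArrayList inside a parallel forEach can lose elements or throw, because ArrayList is not thread-safe.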