Why Override hashCode? A Look at Duplicate-Check Efficiency in List vs. Set

冷瞳凛

Tags: java

Article link: https://blog.csdn.net/m0_49218779/article/details/107592614

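A few days ago, while working through some basic collections exercises, I ran into a very interesting phenomenon. The original problem was: store all the positive integers from 1 to 100 in a List in random order.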

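The problem is simple and there are many ways to solve it; they all boil down to generating random numbers and then discarding duplicates. But if you restrict yourself to the List's own methods, the List becomes extremely slow once the amount of data grows sharply. I then tried using a Set to help, and the difference was startling. The demo uses n = 100000 rather than 100 so that the gap is measurable.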

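import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

public class Demo10 {
	public static void main(String[] args) {
		// Time the List-only version (elapsed milliseconds)
		long t1 = System.currentTimeMillis();
		method1(100000);
		System.out.println(System.currentTimeMillis() - t1);
		// Time the Set-assisted version (elapsed milliseconds)
		long t2 = System.currentTimeMillis();
		method2(100000);
		System.out.println(System.currentTimeMillis() - t2);
	}

	// Deduplicate with the List alone: every contains() call scans the list
	public static List<Integer> method1(int n) {
		Random rd = new Random();
		List<Integer> list = new ArrayList<Integer>();
		// Keep drawing until all n distinct values (1..n) have been stored
		while (list.size() < n) {
			int index = rd.nextInt(n) + 1;
			if (!list.contains(index)) {
				list.add(index);
			}
		}
		return list;
	}

	// Deduplicate with a Set: the HashSet answers contains(), the List keeps the order
	public static List<Integer> method2(int n) {
		Random rd = new Random();
		List<Integer> list = new ArrayList<Integer>();
		Set<Integer> set = new HashSet<>();
		// Keep drawing until all n distinct values (1..n) have been stored
		while (list.size() < n) {
			int index = rd.nextInt(n) + 1;
			if (!set.contains(index)) {
				set.add(index);
				list.add(index);
			}
		}
		return list;
	}
}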

Both versions appear to call the same contains() method, declared on the top-level Collection interface, so why is the performance so different?

The reason is that the two collections implement the duplicate check differently under the hood.

List and Set also differ in how they decide that an element is a duplicate.

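A List relies only on equals() to decide whether an object is already present, so a lookup has to compare against the elements one by one. For ArrayList, contains() is a linear scan: each call takes time proportional to the list size, and filling the list with n distinct values costs at least on the order of n² comparisons.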
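To make the one-by-one comparison concrete, here is a minimal sketch of what a list lookup amounts to. It is a simplified model, not the actual ArrayList source; the class name LinearContainsSketch and the helper linearContains are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

class LinearContainsSketch {
    // Hypothetical helper: a simplified model of a list's contains().
    // It walks the elements in order and compares each one with equals().
    static <T> boolean linearContains(List<T> list, T target) {
        for (T element : list) {
            if (Objects.equals(element, target)) {
                return true; // found a match, stop scanning
            }
        }
        return false; // scanned every element without a match
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        for (int i = 1; i <= 5; i++) {
            list.add(i);
        }
        System.out.println(linearContains(list, 3));  // a value that is stored
        System.out.println(linearContains(list, 42)); // a value that is not stored
    }
}
```

The worst case is a value that is not in the list, because the loop has to visit every element before giving up. In method1 that miss happens every time a new number is drawn.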

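A HashSet (the Set used here) combines hashCode() with equals(): a lookup first uses the hash code to rule out elements whose hash differs, then runs equals() only on the few elements that remain. With this filtering step the work drops dramatically, and each lookup takes roughly constant time on average.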
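The filtering step can be sketched with a toy hash set that keeps its elements in buckets. This is an illustration under simplifying assumptions (a fixed number of buckets, no resizing, no null elements), not how java.util.HashSet is actually written; TinyHashSetSketch and bucketFor are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class TinyHashSetSketch<T> {
    // Each bucket holds the elements whose hash code maps to that index.
    private final List<List<T>> buckets = new ArrayList<>();

    // Assumption: bucketCount is positive and never changes (no resizing).
    TinyHashSetSketch(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // Step 1: hashCode() selects a single bucket; all other buckets are skipped.
    private List<T> bucketFor(T value) {
        int index = Math.floorMod(value.hashCode(), buckets.size());
        return buckets.get(index);
    }

    // Step 2: equals() is called only on the elements in that one bucket.
    boolean contains(T value) {
        for (T candidate : bucketFor(value)) {
            if (candidate.equals(value)) {
                return true;
            }
        }
        return false;
    }

    // Adds the value unless an equal element is already stored.
    boolean add(T value) {
        if (contains(value)) {
            return false;
        }
        bucketFor(value).add(value);
        return true;
    }

    public static void main(String[] args) {
        TinyHashSetSketch<Integer> set = new TinyHashSetSketch<>(16);
        set.add(7);
        System.out.println(set.contains(7));  // same bucket, equals() matches
        System.out.println(set.contains(23)); // same bucket as 7 (23 mod 16 = 7), equals() rejects it
        System.out.println(set.contains(8));  // different bucket, nothing to compare
    }
}
```

With enough buckets, each one holds only a handful of elements, so the equals() loop stays short no matter how many elements the set contains overall.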

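This is also why you should override hashCode() whenever you override equals(): hash-based collections call equals() only on elements whose hash codes match, so two objects that are equal according to equals() must return the same hash code, or the collection will not recognize them as duplicates.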
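A small experiment shows what goes wrong when only equals() is overridden. The two point classes are hypothetical and exist only for this sketch; the first inherits hashCode() from Object, the second overrides both methods.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class HashCodeContractDemo {
    // Hypothetical class: overrides equals() only, hashCode() comes from Object.
    static class PointEqualsOnly {
        final int x;
        final int y;

        PointEqualsOnly(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof PointEqualsOnly)) {
                return false;
            }
            PointEqualsOnly p = (PointEqualsOnly) o;
            return x == p.x && y == p.y;
        }
    }

    // Hypothetical class: overrides equals() and hashCode() consistently.
    static class PointBoth {
        final int x;
        final int y;

        PointBoth(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof PointBoth)) {
                return false;
            }
            PointBoth p = (PointBoth) o;
            return x == p.x && y == p.y;
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y); // equal fields always give equal hash codes
        }
    }

    public static void main(String[] args) {
        Set<PointEqualsOnly> broken = new HashSet<>();
        broken.add(new PointEqualsOnly(1, 2));
        // The two objects are equal, but their inherited hash codes almost always
        // differ, so the lookup usually never reaches equals() and reports a miss.
        System.out.println(broken.contains(new PointEqualsOnly(1, 2)));

        Set<PointBoth> working = new HashSet<>();
        working.add(new PointBoth(1, 2));
        // Equal fields produce equal hash codes, so the set finds the element.
        System.out.println(working.contains(new PointBoth(1, 2)));
    }
}
```

Integer, the element type in the demo above, already overrides both methods, which is why the HashSet version works without any extra code. The issue only appears with your own classes.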
