An Analysis of HashSet

Article link: https://blog.csdn.net/u011475873/article/details/46458575

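Yesterday I worked through how HashMap works. Today I looked at HashSet, found that it is closely tied to HashMap, and decided to write down my understanding.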

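HashSet implements the Set interface. Like a set in mathematics, a Set cannot contain duplicate elements, and this applies to null as well: it can hold at most one null. HashSet is backed by a hash table; it is in fact implemented on top of a HashMap, as its constructors show: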

    public HashSet() {
        map = new HashMap<>();
    }

    /**
     * Constructs a new set containing the elements in the specified
     * collection.  The <tt>HashMap</tt> is created with default load factor
     * (0.75) and an initial capacity sufficient to contain the elements in
     * the specified collection.
     *
     * @param c the collection whose elements are to be placed into this set
     * @throws NullPointerException if the specified collection is null
     */
    public HashSet(Collection<? extends E> c) {
        map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1, 16));
        addAll(c);
    }

    /**
     * Constructs a new, empty set; the backing <tt>HashMap</tt> instance has
     * the specified initial capacity and the specified load factor.
     *
     * @param      initialCapacity   the initial capacity of the hash map
     * @param      loadFactor        the load factor of the hash map
     * @throws     IllegalArgumentException if the initial capacity is less
     *             than zero, or if the load factor is nonpositive
     */
    public HashSet(int initialCapacity, float loadFactor) {
        map = new HashMap<>(initialCapacity, loadFactor);
    }

    /**
     * Constructs a new, empty set; the backing <tt>HashMap</tt> instance has
     * the specified initial capacity and default load factor (0.75).
     *
     * @param      initialCapacity   the initial capacity of the hash table
     * @throws     IllegalArgumentException if the initial capacity is less
     *             than zero
     */
    public HashSet(int initialCapacity) {
        map = new HashMap<>(initialCapacity);
    }
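
The sizing expression in the collection constructor above can be tried on its own. The following is a minimal sketch, not JDK code: the class name CopyCapacityDemo and the helper initialCapacityFor are hypothetical, and the helper only repeats the expression from the listing to show what it evaluates to for a few sizes.

```java
public class CopyCapacityDemo {
    // Hypothetical helper: repeats the expression from HashSet(Collection) above.
    // Dividing by the default load factor (0.75) and adding 1 asks for a capacity
    // large enough that adding `size` elements should not trigger a resize.
    static int initialCapacityFor(int size) {
        return Math.max((int) (size / .75f) + 1, 16);
    }

    public static void main(String[] args) {
        System.out.println(initialCapacityFor(3));   // prints 16 (the floor of 16 applies)
        System.out.println(initialCapacityFor(12));  // prints 17
        System.out.println(initialCapacityFor(100)); // prints 134
    }
}
```

The value passed to HashMap is only a request; how HashMap turns it into an actual table size is left to HashMap itself.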

These constructors essentially just wrap a HashMap, and many of HashSet's basic methods simply delegate to it:

    public boolean isEmpty() {
        return map.isEmpty();
    }

   
    public boolean contains(Object o) {
        return map.containsKey(o);
    }

   
    public boolean add(E e) {
        return map.put(e, PRESENT)==null;
    }

  
    public boolean remove(Object o) {
        return map.remove(o)==PRESENT;
    }

   
    public void clear() {
        map.clear();
    }

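In HashSet's add method, the element is used directly as the HashMap key, and the value is always the shared constant PRESENT. If an equal element is already present, put keeps the existing key, overwrites the value with the same PRESENT, and returns the old (non-null) value, so add returns false and the set is unchanged. If the element is new, put inserts a new entry into the table and returns null, so add returns true. This is what guarantees that no duplicates are added.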
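
The return values described above can be observed directly. This is a small sketch, assuming a stand-in PRESENT object and a hypothetical helper putIsNew that imitates the body of add; the real PRESENT field is private to HashSet.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class AddReturnDemo {
    // Stand-in for HashSet's private PRESENT constant (illustrative only).
    private static final Object PRESENT = new Object();

    // Hypothetical helper imitating HashSet.add: true only if the key was absent.
    static boolean putIsNew(Map<String, Object> map, String key) {
        return map.put(key, PRESENT) == null;
    }

    public static void main(String[] args) {
        Set<String> set = new HashSet<>();
        System.out.println(set.add("a"));   // prints true: new element
        System.out.println(set.add("a"));   // prints false: already present
        System.out.println(set.add(null));  // prints true: one null is allowed
        System.out.println(set.add(null));  // prints false: second null is a duplicate
        System.out.println(set.size());     // prints 2

        Map<String, Object> map = new HashMap<>();
        System.out.println(putIsNew(map, "a")); // prints true: put returned null
        System.out.println(putIsNew(map, "a")); // prints false: put returned the old value
    }
}
```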

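As with HashMap, when you store objects of your own classes you need to override both hashCode and equals. For HashSet to treat two custom objects as the same element, their hashCode values must be equal and equals must return true.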

For example:

package text3;

import java.util.HashSet;

class Student 
{ 
    private String id;
    private String name;
    public Student(String id, String name)
    { 
        this.id = id; 
        this.name = name; 
    } 
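    // Two students are equal when their ids are equal
    @Override
    public boolean equals(Object o)
    {
        // The null check avoids a NullPointerException on equals(null)
        if (o != null && o.getClass() == Student.class)
        {
            Student n = (Student) o;
            return n.id.equals(id);
        }
        return false;
    }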
	 
    // The hashCode() of a Student is derived from its id
    @Override
    public int hashCode()
    {
        return id.hashCode();
    }

    public String toString() 
    { 
        return "id=" + id + ", name=" + name; 
    } 
}

public class Main
{
    public static void main(String[] args) 
    { 
        HashSet<Student> set = new HashSet<Student>(); 
        Student student1 = new Student("001" , "张三");
        set.add(student1); 
        set.add(new Student("002" , "张三")); 
        set.add(new Student("003" , "李四")); 
        set.add(new Student("003" , "李四")); 
        for(Student stu:set)
        	System.out.println(stu);
        System.out.println(Student.class);
        System.out.println(student1.getClass());
    } 
}

Running the example above prints three Student lines, followed by class text3.Student twice.

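As you can see, four objects were inserted, the duplicate was ignored, and only three were printed. I overrode equals and hashCode. The equals method first checks whether the argument has the same runtime class. The class literal Student.class gives the Class object of a type, and the getClass() method gives the runtime class of an object; the output shows that both are class text3.Student.

This touches on Java's reflection mechanism: for any class, you can discover its fields and methods at run time, and for any object, you can invoke its methods. Reflection mainly provides the following: determining the class of any object at run time; constructing an instance of any class at run time; inspecting the fields and methods of any class at run time; invoking the methods of any object at run time; and generating dynamic proxies.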
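
A few of the reflection features listed above can be sketched with the standard java.lang.reflect API. The Point class and the helper readXReflectively are hypothetical and exist only for this illustration; they are not part of the Student example.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;

public class ReflectionDemo {
    // Hypothetical class used only for this illustration.
    public static class Point {
        private final int x;
        public Point(int x) { this.x = x; }
        public int getX() { return x; }
    }

    // Builds a Point and reads its x value using only reflective calls.
    static int readXReflectively(int value) {
        try {
            Class<?> type = Point.class;                           // class literal
            Constructor<?> ctor = type.getConstructor(int.class);  // look up Point(int)
            Object instance = ctor.newInstance(value);             // construct at run time
            Method getter = type.getMethod("getX");                // look up a method by name
            return (Integer) getter.invoke(instance);              // invoke it at run time
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Point p = new Point(1);
        System.out.println(p.getClass() == Point.class); // prints true
        System.out.println(readXReflectively(42));       // prints 42
    }
}
```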

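Once the runtime classes are found to match, equals goes on to compare the ids. Likewise, hashCode simply returns the hash code of id. When equals returns true and the hash codes are equal, HashSet treats the two objects as the same element.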
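
To see why both conditions matter, here is a sketch with two hypothetical key classes. InconsistentKey compares by id in equals but hands every instance a different hash code, which models what happens when hashCode is not overridden to match equals; ConsistentKey derives both from id, as Student does.

```java
import java.util.HashSet;
import java.util.Set;

public class EqualsHashCodeDemo {
    // Hypothetical key: equals compares ids, but each instance gets its own
    // hash code, so "equal" instances are still stored as separate elements.
    static final class InconsistentKey {
        private static int nextHash = 0;
        private final String id;
        private final int hash = ++nextHash;
        InconsistentKey(String id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof InconsistentKey && ((InconsistentKey) o).id.equals(id);
        }
        @Override public int hashCode() { return hash; }
    }

    // Hypothetical key: equals and hashCode are both derived from id.
    static final class ConsistentKey {
        private final String id;
        ConsistentKey(String id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof ConsistentKey && ((ConsistentKey) o).id.equals(id);
        }
        @Override public int hashCode() { return id.hashCode(); }
    }

    static int sizeWithInconsistentKeys() {
        Set<InconsistentKey> set = new HashSet<>();
        set.add(new InconsistentKey("003"));
        set.add(new InconsistentKey("003"));
        return set.size();
    }

    static int sizeWithConsistentKeys() {
        Set<ConsistentKey> set = new HashSet<>();
        set.add(new ConsistentKey("003"));
        set.add(new ConsistentKey("003"));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeWithInconsistentKeys()); // prints 2: duplicate not detected
        System.out.println(sizeWithConsistentKeys());   // prints 1: duplicate ignored
    }
}
```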

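The benefit of overriding toString here is that println calls toString automatically, so there is no need to format the fields explicitly when printing.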
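
The same implicit call happens in String.valueOf and in string concatenation. A minimal sketch, using a hypothetical Item class and helper names chosen for this illustration:

```java
public class ToStringDemo {
    // Hypothetical class used only for this illustration.
    static final class Item {
        private final String name;
        Item(String name) { this.name = name; }
        @Override public String toString() { return "Item(" + name + ")"; }
    }

    // String.valueOf(Object) calls toString() on a non-null argument.
    static String viaValueOf(String name) {
        return String.valueOf(new Item(name));
    }

    // String concatenation also converts the object with toString().
    static String viaConcat(String name) {
        return "got " + new Item(name);
    }

    public static void main(String[] args) {
        System.out.println(new Item("pen")); // prints Item(pen)
        System.out.println(viaValueOf("pen")); // prints Item(pen)
        System.out.println(viaConcat("pen"));  // prints got Item(pen)
    }
}
```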