Containers from the Source Code: Set

金大大jhz

Article link: https://blog.csdn.net/u013374645/article/details/82504042

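This article takes a close look at how HashSet and TreeSet are implemented, covering their constructors, common API, and internal data structures. HashSet is built on HashMap: it guarantees element uniqueness and supports fast lookup, but does not guarantee order. TreeSet is built on a red-black tree and keeps its elements sorted, which makes it suitable for storing ordered data.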

Preface

The previous articles covered HashMap, LinkedHashMap, and TreeMap. Today we look at the implementations of the Set interface.

1. HashSet

Constructors

    public HashSet() {
        map = new HashMap<>();
    }

    public HashSet(Collection<? extends E> c) {
        map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1, 16));
        addAll(c);
    }

    public HashSet(int initialCapacity, float loadFactor) {
        map = new HashMap<>(initialCapacity, loadFactor);
    }

    public HashSet(int initialCapacity) {
        map = new HashMap<>(initialCapacity);
    }

    HashSet(int initialCapacity, float loadFactor, boolean dummy) {
        map = new LinkedHashMap<>(initialCapacity, loadFactor);
    }

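As you can see, HashSet has five constructors. The first, third, and fourth are self-explanatory. The last one takes an extra dummy parameter; here is the comment from the source: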

   /**
     * Constructs a new, empty linked hash set.  (This package private
     * constructor is only used by LinkedHashSet.) The backing
     * HashMap instance is a LinkedHashMap with the specified initial
     * capacity and the specified load factor.
     *
     * @param      initialCapacity   the initial capacity of the hash map
     * @param      loadFactor        the load factor of the hash map
     * @param      dummy             ignored (distinguishes this
     *             constructor from other int, float constructor.)
     * @throws     IllegalArgumentException if the initial capacity is less
     *             than zero, or if the load factor is nonpositive
     */

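This constructor exists solely for LinkedHashSet. The dummy parameter has no effect at run time; it only gives the constructor a signature distinct from the public (int, float) one so that the overload can be told apart. Note that the backing map it creates is a LinkedHashMap.

Now look closely at the second constructor. Given a collection c, it requests an initial capacity of max((int) (c.size()/.75f) + 1, 16), that is, the larger of c.size()/.75f + 1 and 16. Recall that HashMap defaults to a capacity of 16 and a load factor of 0.75, and that it resizes once the number of entries exceeds the threshold (capacity × 0.75). (c.size()/.75f) + 1 is therefore a capacity just large enough for its threshold to accommodate every element of c, so addAll(c) does not trigger a resize. The 16 acts as a floor: a set built from a small collection starts out no smaller than a default HashSet. In either case HashMap rounds the requested value up to the next power of two, because its table size is always a power of two (that bit-manipulation step was covered in the earlier HashMap article).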
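To make the sizing arithmetic concrete, here is a minimal sketch. The class HashSetSizing and the helper requestedCapacity are hypothetical names introduced only for illustration; the helper simply repeats the expression from the constructor above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class HashSetSizing {
    // Hypothetical helper: repeats the expression used by HashSet(Collection).
    static int requestedCapacity(int size) {
        return Math.max((int) (size / .75f) + 1, 16);
    }

    public static void main(String[] args) {
        // Small collections hit the floor of 16.
        System.out.println(requestedCapacity(3));   // 16
        // Larger ones get size / 0.75 + 1, so the threshold covers every element.
        System.out.println(requestedCapacity(100)); // 134

        // The constructor itself: duplicates in the source collapse into one element.
        List<Integer> source = Arrays.asList(1, 2, 3, 2, 1);
        Set<Integer> set = new HashSet<>(source);
        System.out.println(set.size()); // 3
    }
}
```

For 100 elements the requested capacity is 134, and 134 × 0.75 is just above 100, which is why no resize is needed while the elements are copied in.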

Common API

    public boolean add(E e) {
        return map.put(e, PRESENT)==null;
    }

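Adding an element amounts to putting an entry into the backing HashMap. If the key was already present, so that put merely overwrote its value, add returns false; otherwise it returns true. The value stored with every key is PRESENT, which is simply a static Object. Every element of the HashSet is a key in the backing HashMap, and all of them map to this one shared object, which saves space.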
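The idea can be sketched in a few lines. MiniSet below is a hypothetical, stripped-down class written for this article, not the JDK implementation; it only shows how a shared PRESENT value and the return value of put yield the add semantics.

```java
import java.util.HashMap;

class MiniSetDemo {
    // Hypothetical, simplified set for illustration only (not java.util.HashSet).
    static class MiniSet<E> {
        // One shared dummy value for every key.
        private static final Object PRESENT = new Object();
        private final HashMap<E, Object> map = new HashMap<>();

        boolean add(E e) {
            // put returns the previous value, or null when the key was absent.
            return map.put(e, PRESENT) == null;
        }

        boolean contains(Object o) {
            return map.containsKey(o);
        }

        int size() {
            return map.size();
        }
    }

    public static void main(String[] args) {
        MiniSet<String> set = new MiniSet<>();
        System.out.println(set.add("a"));      // true: new key
        System.out.println(set.add("a"));      // false: key already present
        System.out.println(set.size());        // 1
        System.out.println(set.contains("a")); // true
    }
}
```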

    public boolean remove(Object o) {
        return map.remove(o)==PRESENT;
    }

    public boolean contains(Object o) {
        return map.containsKey(o);
    }

    public void clear() {
        map.clear();
    }

    public Iterator<E> iterator() {
        return map.keySet().iterator();
    }

    public boolean isEmpty() {
        return map.isEmpty();
    }

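As you can see, every operation delegates to HashMap. In essence, HashSet relies on the fact that HashMap keys cannot repeat to guarantee that its elements are unique. It also takes on HashMap's other traits: it permits null, is not thread-safe, and makes no guarantee about iteration order. LinkedHashSet stores its elements in a LinkedHashMap instead, so it preserves insertion order.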
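A short check of these properties, as a sketch. SetPropertiesDemo and insertionOrder are hypothetical names used only here; the collection classes are the standard java.util ones.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class SetPropertiesDemo {
    // Hypothetical helper: returns the elements in LinkedHashSet iteration order.
    static List<String> insertionOrder(String... items) {
        Set<String> linked = new LinkedHashSet<>(Arrays.asList(items));
        return new ArrayList<>(linked);
    }

    public static void main(String[] args) {
        Set<String> hash = new HashSet<>();
        System.out.println(hash.add(null));      // true: null is accepted
        System.out.println(hash.add(null));      // false: but only once
        System.out.println(hash.contains(null)); // true

        // Duplicates are dropped; first-insertion order is kept.
        System.out.println(insertionOrder("pear", "apple", "pear", "fig")); // [pear, apple, fig]
    }
}
```

The iteration order of a plain HashSet over the same strings depends on their hash codes, so code should not rely on it.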

2. TreeSet

Like TreeMap, TreeSet stores its data in a red-black tree, so understanding red-black trees is the key to understanding TreeSet.

Constructors

    TreeSet(NavigableMap<E,Object> m) {
        this.m = m;
    }

    public TreeSet() {
        this(new TreeMap<E,Object>());
    }

    public TreeSet(Comparator<? super E> comparator) {
        this(new TreeMap<>(comparator));
    }

    public TreeSet(Collection<? extends E> c) {
        this();
        addAll(c);
    }

    public TreeSet(SortedSet<E> s) {
        this(s.comparator());
        addAll(s);
    }

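TreeSet's constructors fall roughly into two groups: those that take a comparator and those that do not. The former sort by the supplied comparator; the latter use the elements' natural ordering.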
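The two groups behave as follows in a minimal sketch. TreeSetOrderingDemo and its helper methods are hypothetical names for illustration; Comparator.reverseOrder() stands in for any custom comparator.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

class TreeSetOrderingDemo {
    // No comparator: elements are ordered by their natural ordering.
    static TreeSet<Integer> natural(List<Integer> items) {
        return new TreeSet<>(items);
    }

    // Comparator supplied: elements are ordered by that comparator.
    static TreeSet<Integer> descending(List<Integer> items) {
        Comparator<Integer> reverse = Comparator.reverseOrder();
        TreeSet<Integer> set = new TreeSet<>(reverse);
        set.addAll(items);
        return set;
    }

    public static void main(String[] args) {
        List<Integer> items = Arrays.asList(3, 1, 2);
        System.out.println(natural(items));              // [1, 2, 3]
        System.out.println(descending(items));           // [3, 2, 1]
        // comparator() is null when natural ordering is in use.
        System.out.println(natural(items).comparator()); // null
    }
}
```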

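Common API

Most methods resemble those of HashSet. Both classes work by composition: one delegates to HashMap's methods, the other to TreeMap's. Here we look at the methods that differ from HashSet.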

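    // Time complexity: linear on the fast path below; otherwise elements are inserted one by one
    public boolean addAll(Collection<? extends E> c) {
        /* The fast path is taken only if
         * 1. the backing map holds no elements
         * 2. the argument contains elements
         * 3. the argument is a SortedSet
         * 4. the backing map is a TreeMap
         */
        if (m.size()==0 && c.size() > 0 &&
                c instanceof SortedSet &&
                m instanceof TreeMap) {
            // Casts
            SortedSet<? extends E> set = (SortedSet<? extends E>) c;
            TreeMap<E,Object> map = (TreeMap<E, Object>) m;
            // Comparator of the argument collection c
            Comparator<?> cc = set.comparator();
            // Comparator of the backing TreeMap
            Comparator<? super E> mc = map.comparator();
            // If the two comparators are equivalent, build the tree directly from the sorted input;
            // otherwise fall through to the generic path below
            if (cc==mc || (cc != null && cc.equals(mc))) {
                map.addAllForTreeSet(set, PRESENT);
                return true;
            }
        }
        // Fall back to AbstractCollection.addAll
        return super.addAll(c);
    }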
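Whichever branch runs, the elements end up in the target set; a comparator mismatch only means the generic path is used. The sketch below illustrates this from the caller's side. TreeSetAddAllDemo, copyInto, and descendingOf are hypothetical names for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.SortedSet;
import java.util.TreeSet;

class TreeSetAddAllDemo {
    // Hypothetical helper: adds every element of source to target and returns target.
    static TreeSet<Integer> copyInto(TreeSet<Integer> target, SortedSet<Integer> source) {
        target.addAll(source);
        return target;
    }

    // Hypothetical helper: builds a TreeSet sorted in descending order.
    static TreeSet<Integer> descendingOf(Integer... items) {
        Comparator<Integer> reverse = Comparator.reverseOrder();
        TreeSet<Integer> set = new TreeSet<>(reverse);
        set.addAll(Arrays.asList(items));
        return set;
    }

    public static void main(String[] args) {
        TreeSet<Integer> source = descendingOf(1, 2, 3);
        System.out.println(source); // [3, 2, 1]

        // Different comparators (reverse vs. natural): the elements are still
        // added, and the target re-sorts them by its own ordering.
        System.out.println(copyInto(new TreeSet<Integer>(), source)); // [1, 2, 3]

        // Same comparator instance on both sides and an empty target:
        // this satisfies the fast-path condition in the listing above.
        TreeSet<Integer> sameOrder = new TreeSet<Integer>(source.comparator());
        System.out.println(copyInto(sameOrder, source)); // [3, 2, 1]
    }
}
```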

    // Returns the view bounded by the given range; delegates to NavigableMap.subMap
    public NavigableSet<E> subSet(E fromElement, boolean fromInclusive,
                                  E toElement,   boolean toInclusive) {
        return new TreeSet<>(m.subMap(fromElement, fromInclusive,
                toElement,   toInclusive));
    }

    // Returns a view of the elements < toElement (<= toElement if inclusive is true)
    public NavigableSet<E> headSet(E toElement, boolean inclusive) {
        return new TreeSet<>(m.headMap(toElement, inclusive));
    }

    // Returns a view of the elements > fromElement (>= fromElement if inclusive is true)
    public NavigableSet<E> tailSet(E fromElement, boolean inclusive) {
        return new TreeSet<>(m.tailMap(fromElement, inclusive));
    }

    // A true flag includes the bound; false excludes it.
    // So this method returns the range [fromElement, toElement)
    public SortedSet<E> subSet(E fromElement, E toElement) {
        return subSet(fromElement, true, toElement, false);
    }

    // Returns the range (-∞, toElement)
    public SortedSet<E> headSet(E toElement) {
        return headSet(toElement, false);
    }

    /**
     * @throws ClassCastException {@inheritDoc}
     * @throws NullPointerException if {@code fromElement} is null
     *         and this set uses natural ordering, or its comparator does
     *         not permit null elements
     * @throws IllegalArgumentException {@inheritDoc}
     */
    // Returns the range [fromElement, +∞)
    public SortedSet<E> tailSet(E fromElement) {
        return tailSet(fromElement, true);
    }
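The bounds are easy to mix up, so here is a small sketch of what each range view contains for the set 1..5. RangeViewDemo and oneToFive are hypothetical names for illustration.

```java
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

class RangeViewDemo {
    // Hypothetical helper: a fresh TreeSet containing 1 through 5.
    static TreeSet<Integer> oneToFive() {
        return new TreeSet<>(Arrays.asList(1, 2, 3, 4, 5));
    }

    public static void main(String[] args) {
        TreeSet<Integer> set = oneToFive();
        System.out.println(set.subSet(2, 4));      // [2, 3]       lower bound in, upper bound out
        System.out.println(set.headSet(3));        // [1, 2]       strictly less than 3
        System.out.println(set.tailSet(3));        // [3, 4, 5]    3 itself is included
        System.out.println(set.headSet(3, true));  // [1, 2, 3]
        System.out.println(set.tailSet(3, false)); // [4, 5]

        // The results are views backed by the original set, not copies.
        SortedSet<Integer> head = set.headSet(3);
        set.add(0);
        System.out.println(head); // [0, 1, 2]
    }
}
```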

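    // Returns the comparator of the backing TreeMap (null under natural ordering)
    public Comparator<? super E> comparator() {
        return m.comparator();
    }

    // Returns the first (smallest) key
    public E first() {
        return m.firstKey();
    }

    // Returns the last (largest) key
    public E last() {
        return m.lastKey();
    }

    /*--------NavigableSet API methods---------*/
    // Returns the greatest key strictly less than e, or null if there is none
    public E lower(E e) {
        return m.lowerKey(e);
    }

    // Returns the greatest key <= e, or null if there is none
    public E floor(E e) {
        return m.floorKey(e);
    }

    // Returns the smallest key >= e, or null if there is none
    public E ceiling(E e) {
        return m.ceilingKey(e);
    }

    // Returns the smallest key > e, or null if there is none
    public E higher(E e) {
        return m.higherKey(e);
    }

    // Removes and returns the smallest key; returns null if the backing map is empty
    public E pollFirst() {
        Map.Entry<E,?> e = m.pollFirstEntry();
        return (e == null) ? null : e.getKey();
    }

    // Removes and returns the largest key; returns null if the backing map is empty
    public E pollLast() {
        Map.Entry<E,?> e = m.pollLastEntry();
        return (e == null) ? null : e.getKey();
    }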
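A quick sketch of the navigation methods on the set {10, 20, 30}. NavigationDemo and sample are hypothetical names for illustration.

```java
import java.util.Arrays;
import java.util.TreeSet;

class NavigationDemo {
    // Hypothetical helper: a fresh TreeSet containing 10, 20, 30.
    static TreeSet<Integer> sample() {
        return new TreeSet<>(Arrays.asList(10, 20, 30));
    }

    public static void main(String[] args) {
        TreeSet<Integer> set = sample();
        System.out.println(set.lower(20));   // 10   strictly less than 20
        System.out.println(set.floor(20));   // 20   less than or equal to 20
        System.out.println(set.ceiling(25)); // 30   greater than or equal to 25
        System.out.println(set.higher(30));  // null nothing greater than 30
        System.out.println(set.first() + " " + set.last()); // 10 30

        // pollFirst removes the element it returns.
        System.out.println(set.pollFirst()); // 10
        System.out.println(set);             // [20, 30]

        // On an empty set pollLast returns null instead of throwing.
        System.out.println(new TreeSet<Integer>().pollLast()); // null
    }
}
```

Unlike pollFirst and pollLast, first and last throw NoSuchElementException on an empty set.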

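Summary

1. TreeSet is implemented with a red-black tree. Its elements are kept sorted automatically, and under natural ordering it does not accept null (a custom comparator may choose to permit it).

2. HashSet is implemented with a hash table. Its elements are unordered, and it accepts null, but only one.

3. HashSet relies on the hashCode() and equals() methods of its elements: hashCode() selects the bucket and equals() decides whether two elements are the same. Two String objects with the same content have the same hash code and are equal, so the same content cannot be added twice. Distinct instances of one class can still be stored side by side as long as they are not equal to each other, which is the default behavior when a class does not override equals() and hashCode().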
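The points above can be checked with a small sketch. RawPoint and Point are hypothetical classes made up for this example: the first keeps Object's identity-based equals() and hashCode(), the second overrides both.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

class ElementIdentityDemo {
    // Hypothetical class that keeps Object's identity-based equals/hashCode.
    static class RawPoint {
        final int x, y;
        RawPoint(int x, int y) { this.x = x; this.y = y; }
    }

    // Hypothetical class that defines equality by content.
    static class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }

        @Override
        public int hashCode() { return Objects.hash(x, y); }
    }

    // Returns true if a natural-ordering TreeSet refuses null.
    static boolean treeSetRejectsNull() {
        try {
            new TreeSet<String>().add(null);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        Set<RawPoint> raw = new HashSet<>();
        raw.add(new RawPoint(1, 2));
        raw.add(new RawPoint(1, 2));
        System.out.println(raw.size());    // 2: distinct instances, never equal

        Set<Point> points = new HashSet<>();
        points.add(new Point(1, 2));
        points.add(new Point(1, 2));
        System.out.println(points.size()); // 1: equal by content

        Set<String> strings = new HashSet<>();
        strings.add(new String("abc"));
        strings.add(new String("abc"));
        System.out.println(strings.size()); // 1: String overrides equals/hashCode

        System.out.println(treeSetRejectsNull()); // true
    }
}
```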