Java Collections in Detail with Source Code Analysis (2)

Latest recommended article published on 2023-02-05 18:17:30

栗少

Tags: collections

Original link: https://www.wkliu.top/index.php/archives/23/

The Set interface and common methods

Set interface overview

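Unordered in general (the order in which elements come out need not match the order in which they were added), and there is no index.
Duplicate elements are not allowed, so a Set can contain at most one null.
The Set implementations in the JDK API include HashSet, LinkedHashSet and TreeSet, among others.

Common Set methods

Like List, Set is a sub-interface of Collection, so its common methods are the same as those of Collection.

Traversing a Set

Traversal works the same way as for any Collection, because Set is a sub-interface of Collection.

An iterator can be used
The enhanced for loop can be used
==Index-based traversal is not possible==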

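import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

public class SetMethod {
    public static void main(String[] args) {
//        Demonstrate the Set interface methods using HashSet, an implementation of Set
//        1. A Set implementation cannot store duplicate elements; a HashSet accepts a single null
//        2. The data in a HashSet is unordered (the order out need not match the order in)
//        3. Note: although the iteration order is not the insertion order, it stays the same
//           for the same contents (the specification does not promise any particular order)
        Set set = new HashSet();
        set.add("1");
        set.add("2");
        set.add("3");
        set.add("3");
        set.add("4");
        set.add("100");
        set.add(null);
        set.add(null);
        System.out.println(set);
//        Traversal
//        1. Iterator
        System.out.println("====iterator traversal");
        Iterator iterator = set.iterator();
        while (iterator.hasNext()) {
            Object next =  iterator.next();
            System.out.println(next);
        }
//         2. Enhanced for loop
        System.out.println("====enhanced for traversal");
        for (Object o : set) {
            System.out.println(o);
        }
//        Elements of a Set cannot be fetched by index, so a plain indexed for loop cannot be used
    }
}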

HashSet

HashSet overview

HashSet implements the Set interface.

HashSet is actually backed by a HashMap:

    /**
     * Constructs a new, empty set; the backing <tt>HashMap</tt> instance has
     * default initial capacity (16) and load factor (0.75).
     */
    public HashSet() {
        map = new HashMap<>();
    }

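It can store null, but only one null.
HashSet does not guarantee element order; the position of an element depends on its hash value, which in turn determines the table index.

It cannot contain duplicate elements/objects.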

import java.util.HashSet;

public class HashSet01 {
    public static void main(String[] args) {

        HashSet set = new HashSet();

        System.out.println(set.add("卡卡西")); // true
        System.out.println(set.add("佐助"));  // true
        System.out.println(set.add("鸣人"));  // true
        System.out.println(set.add("佐助"));  // false
        System.out.println(set.add("博人")); //true

        set.remove("佐助");
        System.out.println(set);


//        Start over with a fresh set
        set = new HashSet();
//        A HashSet cannot hold equal elements
        set.add("lucy");  // true
        set.add("lucy");  // false
        set.add(new Dog("tom"));  //true
        set.add(new Dog("tom"));  //true
        System.out.println(set);

//        Classic interview question
        // String: strings with the same characters have the same hash code and equals() returns true (see the String source)
        set.add(new String("wkliu")); // true
        set.add(new String("wkliu")); // false

        System.out.println(set);
    }
}

class Dog{
    private String name;

    public Dog(String name) {
        this.name = name;
    }

    @Override
    public String toString() {
        return "Dog{" +
                "name='" + name + '\'' +
                '}';
    }
}

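HashSet internals

HashSet is backed by HashMap, and HashMap is built on an array + linked lists + red-black trees.

Simulating a simplified version of the underlying structure

Structure: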

@SuppressWarnings({"all"})
public class HashSetStructure {
    public static void main(String[] args) {

//        Simulate the internals of HashSet, i.e. the internals of HashMap

//        1. Create an array whose element type is Node
        Node[] table = new Node[16];
        System.out.println("table="+table);

//        2. Create nodes
        Node john = new Node("john",null);

        table[2] = john;

        Node jack = new Node("jack", null);
//        Link the jack node after john
        john.next = jack;

        Node rose = new Node("rose",null);
//        Link the rose node after jack
        jack.next = rose;

        Node lucy = new Node("lucy", null);
//        Put lucy at index 3 of the table array
        table[3] = lucy;

        System.out.println("table="+table);
    }
}

// A node stores data and can point to the next node, which forms a linked list
class Node{
    Object item; // the stored data
    Node next;   // reference to the next node

    public Node(Object item, Node next) {
        this.item = item;
        this.next = next;
    }
}

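How HashSet add() works internally

HashSet is backed by HashMap.
When an element is added, its hash value is computed first and then converted into an index.
The table that stores the data is looked up to see whether an element is already stored at that index.
If not, the element is added directly.
If so, equals() is called to compare; if the elements are equal the add is abandoned, otherwise the new element is appended to the end of the list.
In Java 8, if the number of elements in a single list exceeds TREEIFY_THRESHOLD (8 by default) and the table size is >= MIN_TREEIFY_CAPACITY (64 by default), the list is treeified (converted to a red-black tree).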
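
The steps above can be sketched in a few lines. This is only a teaching sketch, not the JDK implementation: the class name SimpleHashSetSketch and its nested Node type are made up for illustration, the table is fixed at 16 slots, and resizing and treeification are left out entirely.

```java
import java.util.Objects;

// Hypothetical teaching class; it only mirrors the hash -> index -> equals -> append steps.
class SimpleHashSetSketch {
    // One entry in a bucket: the element, its spread hash, and a link to the next entry.
    private static final class Node {
        final int hash;
        final Object item;
        Node next;

        Node(int hash, Object item) {
            this.hash = hash;
            this.item = item;
        }
    }

    private final Node[] table = new Node[16]; // assumption: fixed size, never resized
    private int size;

    // Same bit-spreading formula as HashMap.hash(): XOR the high 16 bits into the low 16 bits.
    private static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Returns true if the element was added, false if an equal element is already present.
    boolean add(Object item) {
        int hash = spread(item);
        int index = (table.length - 1) & hash;
        Node p = table[index];
        if (p == null) {                 // empty slot: store the element directly
            table[index] = new Node(hash, item);
            size++;
            return true;
        }
        while (true) {                   // occupied slot: walk the list
            if (p.hash == hash && Objects.equals(p.item, item)) {
                return false;            // equal element found, give up
            }
            if (p.next == null) {
                break;                   // reached the tail without a match
            }
            p = p.next;
        }
        p.next = new Node(hash, item);   // append at the end of the list
        size++;
        return true;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        SimpleHashSetSketch set = new SimpleHashSetSketch();
        System.out.println(set.add("java")); // true
        System.out.println(set.add("pp"));   // true
        System.out.println(set.add("java")); // false
        System.out.println(set.size());      // 2
    }
}
```

The real HashMap additionally resizes the table and may convert a long list into a red-black tree, as the source walkthrough shows.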

import java.util.HashSet;

@SuppressWarnings({"all"})
public class HashSetSource {

    public static void main(String[] args) {

        HashSet hashSet = new HashSet();

        hashSet.add("java");
        hashSet.add("pp");
        hashSet.add("java");

        System.out.println("hashSet="+hashSet);
    }
}

First add() call

Creating the HashSet object, constructor:

    /**
     * Constructs a new, empty set; the backing <tt>HashMap</tt> instance has
     * default initial capacity (16) and load factor (0.75).
     */
    public HashSet() {
        map = new HashMap<>();
    }

The add() method is executed:

    /**
     * Adds the specified element to this set if it is not already present.
     * More formally, adds the specified element <tt>e</tt> to this set if
     * this set contains no element <tt>e2</tt> such that
     * <tt>(e==null&nbsp;?&nbsp;e2==null&nbsp;:&nbsp;e.equals(e2))</tt>.
     * If this set already contains the element, the call leaves the set
     * unchanged and returns <tt>false</tt>.
     *
     * @param e element to be added to this set
     * @return <tt>true</tt> if this set did not already contain the specified
     * element
     */
    public boolean add(E e) {    // e:"java"
        return map.put(e, PRESENT)==null;  //e:"java" map:"{}"  (static) PRESENT =  new Object
    }

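The put() method is executed. It calls hash(key) to obtain the hash value for the key (this is not the raw hashCode; the hash is computed by an algorithm that reduces the chance of collisions).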

    /**
     * Associates the specified value with the specified key in this map.
     * If the map previously contained a mapping for the key, the old
     * value is replaced.
     *
     * @param key key with which the specified value is to be associated
     * @param value value to be associated with the specified key
     * @return the previous value associated with <tt>key</tt>, or
     *         <tt>null</tt> if there was no mapping for <tt>key</tt>.
     *         (A <tt>null</tt> return can also indicate that the map
     *         previously associated <tt>null</tt> with <tt>key</tt>.)
     */
    public V put(K key, V value) {   // key:"java"  value:PRESENT/Object@502
        return putVal(hash(key), key, value, false, true); // key:"java"  value:PRESENT/Object@502
    }

The hash(key) method:

    /**
     * Computes key.hashCode() and spreads (XORs) higher bits of hash
     * to lower.  Because the table uses power-of-two masking, sets of
     * hashes that vary only in bits above the current mask will
     * always collide. (Among known examples are sets of Float keys
     * holding consecutive whole numbers in small tables.)  So we
     * apply a transform that spreads the impact of higher bits
     * downward. There is a tradeoff between speed, utility, and
     * quality of bit-spreading. Because many common sets of hashes
     * are already reasonably distributed (so don't benefit from
     * spreading), and because we use trees to handle large sets of
     * collisions in bins, we just XOR some shifted bits in the
     * cheapest possible way to reduce systematic lossage, as well as
     * to incorporate impact of the highest bits that would otherwise
     * never be used in index calculations because of table bounds.
     */
    static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16); // >>> is an unsigned right shift by 16 bits; XORing the result in reduces the chance of hash collisions
    }
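
To make the hash-to-index step concrete, the following sketch applies the same formula to the two strings used in this walkthrough. HashIndexDemo, spread and indexFor are names invented here for illustration; only the two expressions inside them are taken from the HashMap source quoted in this article.

```java
// Hypothetical helper class for illustration only.
class HashIndexDemo {
    // Same formula as HashMap.hash(): XOR the high 16 bits into the low 16 bits.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket index for a table whose length is a power of two, as in putVal(): (n - 1) & hash.
    static int indexFor(Object key, int tableLength) {
        return (tableLength - 1) & spread(key);
    }

    public static void main(String[] args) {
        System.out.println(indexFor("java", 16)); // 3
        System.out.println(indexFor("pp", 16));   // 0
        System.out.println(indexFor(null, 16));   // 0
    }
}
```

With a table of 16 slots, "java" and "pp" land in different buckets, so the second add() in the example does not collide with the first.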

==The putVal() method==

    /**
     * Implements Map.put and related methods.
     *
     * @param hash hash for key
     * @param key the key
     * @param value the value to put
     * @param onlyIfAbsent if true, don't change existing value
     * @param evict if false, the table is in creation mode.
     * @return previous value, or null if none
     */
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {

        // Declare helper variables tab, p, n, i
        Node<K,V>[] tab; Node<K,V> p; int n, i;

        // table is a field of HashMap of type Node[]; it is the array that stores the nodes
        // If table is null or has length 0, call resize(); the first resize allocates 16 slots
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;

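        // (1) Use the hash derived from the key to compute which index of tab the key belongs at,
        //     and assign the node at that index to the helper variable p
        // (2) Check whether p is null
        //     If p is null, no element is stored there yet, so create a Node with key="java", value=PRESENT
        //     and store it in tab: tab[i] = newNode(hash, key, value, null);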
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        // Count of structural modifications
        ++modCount;

        // Check whether a resize is needed; threshold is the resize trigger (12 for the default capacity of 16), and resize() runs once size exceeds it
        if (++size > threshold)
            resize();

        // Empty in HashMap; a hook left for subclasses (such as LinkedHashMap) to run extra logic
        afterNodeInsertion(evict);

        // Return null to indicate that a new element was added
        return null;
    }

Second add() call

The putVal() method

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
    
    // Second add: table is not null, so the body of this if is skipped
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
    
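        // (1) Use the hash derived from the key to compute which index of tab the key belongs at,
        //     and assign the node at that index to the helper variable p
        // (2) Check whether p is null
        //     If p is null, no element is stored there yet, so create a Node with key="pp", value=PRESENT
        //     and store it in tab: tab[i] = newNode(hash, key, value, null);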
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }

Third add() call, with an equal element

The putVal() method

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
    
    // Third add: table is not null, so the body of this if is skipped
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
    
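        // (1) Use the hash derived from the key to compute which index of tab the key belongs at,
        //     and assign the node at that index to the helper variable p
        // (2) Check whether p is null
        //     This time p is not null, so the else branch runs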
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
           // Declare helper variables. Tip: create local (helper) variables only where they are needed
            Node<K,V> e; K k;
            
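            // p.hash == hash: the first node of the list at this index has the same hash as the key being added,
            // and one of the following two conditions holds:
            //   1. (k = p.key) == key: the key being added and the key of the node p points to are the same object
            //   2. (key != null && key.equals(k)): the key being added is not null and equals() reports it equal to the key of the node p points to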
            if (p.hash == hash &&  
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            
            // Check whether p is a red-black tree node
            // If so, call putTreeVal() to insert into the tree
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
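                // The slot at this index already holds a linked list, so walk it with the for loop
                // 1. If the key differs from every element in the list, append it at the end.
                //      Right after appending, check the list length: if the list already held 8 nodes before this append
                //      (binCount >= TREEIFY_THRESHOLD - 1), call treeifyBin() on this bucket.
                //      treeifyBin() first checks the table size: if it is smaller than 64 it resizes the table instead,
                //      and only converts the list to a red-black tree otherwise.
                // 2. If an equal element is found while walking the list, break immediately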
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }

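HashSet resizing and conversion to a red-black tree

HashSet is backed by HashMap. On the first add, the table array is allocated with 16 slots, and the threshold is 16 * loadFactor (0.75) = 12.
Once the number of elements exceeds the threshold of 12, the table is resized to 16 * 2 = 32 and the new threshold becomes 32 * 0.75 = 24, and so on.
In Java 8, if appending a node makes a single list longer than TREEIFY_THRESHOLD (8 by default) and the table size is >= MIN_TREEIFY_CAPACITY (64 by default), the list is treeified (converted to a red-black tree); otherwise the table is resized instead.

Note:

Every element added (whether it lands directly in the table or in a list hanging off the table) adds one node and runs ++size once; a resize is triggered when size > threshold.

The table is not resized only after all 16 slots are occupied; it is resized as soon as the total number of elements exceeds the threshold.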
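
The capacity and threshold arithmetic can be checked with a small calculation. This sketch only reproduces the numbers described above (initial capacity 16, load factor 0.75, doubling on resize); ResizeThresholdDemo and its methods are hypothetical names, and the extra resizes that treeifyBin() can trigger on small tables are deliberately ignored.

```java
// Hypothetical helper class; plain arithmetic, no access to HashMap internals.
class ResizeThresholdDemo {
    static final float LOAD_FACTOR = 0.75f; // default load factor

    // Threshold for a given table capacity: capacity * load factor.
    static int threshold(int capacity) {
        return (int) (capacity * LOAD_FACTOR);
    }

    // Table capacity after adding elementCount elements one by one,
    // starting at 16 and doubling whenever size exceeds the threshold.
    static int capacityFor(int elementCount) {
        int capacity = 16;
        while (elementCount > threshold(capacity)) {
            capacity <<= 1;
        }
        return capacity;
    }

    public static void main(String[] args) {
        for (int capacity = 16; capacity <= 64; capacity <<= 1) {
            System.out.println(capacity + " -> " + threshold(capacity));
        }
        // prints 16 -> 12, 32 -> 24, 64 -> 48 on separate lines
        System.out.println(capacityFor(12)); // 16: the 12th element does not exceed the threshold
        System.out.println(capacityFor(13)); // 32: the 13th element triggers the first doubling
    }
}
```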

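HashSet best practice

Define an Employee class with private fields name and age.

Create 3 Employee objects and put them into a HashSet.
When name and age are both equal, the employees are considered the same and cannot be added to the HashSet twice.

Override hashCode() and equals().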

package set;

import java.util.HashSet;
import java.util.Objects;

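/**
 * Define an Employee class with private fields name and age.
 *
 * 1. Create 3 Employee objects and put them into a HashSet
 * 2. When name and age are both equal, the employees are considered the same and cannot both be added to the HashSet
 */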

public class HashSetExercise {

    public static void main(String[] args) {

        HashSet hashSet = new HashSet();
        hashSet.add(new Employee("ll",18));
        hashSet.add(new Employee("kk",20));
        hashSet.add(new Employee("ll",18));
        System.out.println(hashSet);
    }
}
// The Employee class
class Employee{
    private String name;
    private int age;

    public Employee(String name, int age) {
        this.name = name;
        this.age = age;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }
//    Objects with the same name and age are equal and therefore return the same hash code
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (o == null || getClass() != o.getClass()) {
            return false;
        }
        Employee employee = (Employee) o;
        return age == employee.age && Objects.equals(name, employee.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, age);
    }



    @Override
    public String toString() {
        return "Employee{" +
                "name='" + name + '\'' +
                ", age=" + age +
                '}';
    }
}

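LinkedHashSet

LinkedHashSet overview

LinkedHashSet is a subclass of HashSet.
LinkedHashSet is backed by a LinkedHashMap, which maintains an array plus a doubly linked list.
LinkedHashSet uses an element's hashCode to decide where the element is stored, and also uses a linked list to maintain element order, so elements appear to be stored in insertion order.
LinkedHashSet does not allow duplicate elements.

Note:

A LinkedHashSet maintains a hash table and a doubly linked list (with head and tail references).
Each node has before and after fields, which link the nodes into a doubly linked list.
When an element is added, its hash value is computed first and then its index, which determines the element's position in the table; the new element is then linked into the doubly linked list (if it already exists it is not added, following the same rule as HashSet).
As a result, iterating over a LinkedHashSet visits elements in insertion order.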
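
A short sketch of the insertion-order guarantee, using only the standard LinkedHashSet API. The class name InsertionOrderDemo and the helper insertionOrder are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class for illustration only.
class InsertionOrderDemo {
    // Adds the items to a LinkedHashSet and returns them in iteration order.
    static List<String> insertionOrder(String... items) {
        Set<String> set = new LinkedHashSet<>();
        for (String item : items) {
            set.add(item); // a duplicate is ignored and keeps its original position
        }
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        System.out.println(insertionOrder("tom", "jack", "rose", "jack")); // [tom, jack, rose]
    }
}
```

Swapping LinkedHashSet for HashSet in this sketch would still remove the duplicate, but the printed order would then depend on the hash values rather than on insertion order.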

import java.util.LinkedHashSet;

public class LinkedHashSetSource {

    public static void main(String[] args) {

        LinkedHashSet set = new LinkedHashSet();
        set.add(new String("AA"));
        set.add(456);
        set.add(456);
        set.add(new Customer("刘",1001));
        set.add(123);
        set.add("wkliu");


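        System.out.println("set=" + set);
//        In a LinkedHashSet, iteration order matches insertion order
//        A LinkedHashSet is backed by a LinkedHashMap (a subclass of HashMap)
//        Underlying structure of LinkedHashSet: array + doubly linked list
//        On the first add, the table is allocated with 16 slots; the nodes stored are of type LinkedHashMap$Entry
//        The array type is HashMap$Node[], while the elements stored in it are of type LinkedHashMap$Entry

        /*
           The inheritance relationship is established in a nested class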
          static class Entry<K,V> extends HashMap.Node<K,V> {
                 Entry<K,V> before, after;
                 Entry(int hash, K key, V value, Node<K,V> next) {
                      super(hash, key, value, next);
                 }
            }

         */
    }
}

class Customer{
    private String name;
    private int no;

    public Customer(String name, int no) {
        this.name = name;
        this.no = no;
    }

    @Override
    public String toString() {
        return "Customer{" +
                "name='" + name + '\'' +
                ", no=" + no +
                '}';
    }
}

TreeSet

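Duplicate elements are not allowed, and null cannot be added when natural ordering is used.
Insertion order is not preserved (elements are not output in the order they were added).
Iteration follows the sort order.
The underlying structure is a binary tree (a red-black tree inside TreeMap), and an in-order traversal (left node, root node, right node) produces the result.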
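
Why an in-order traversal yields sorted output can be seen with a plain binary search tree. The sketch below is an assumption-laden simplification: BstInOrderSketch is a made-up class, it stores int values only, and it does no rebalancing, whereas the real TreeMap keeps its tree balanced as a red-black tree.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical teaching class: an unbalanced binary search tree, not the TreeMap implementation.
class BstInOrderSketch {
    private static final class Node {
        final int value;
        Node left, right;

        Node(int value) {
            this.value = value;
        }
    }

    private Node root;

    // Returns true if the value was inserted, false if it was already present.
    boolean add(int value) {
        if (root == null) {
            root = new Node(value);
            return true;
        }
        Node cur = root;
        while (true) {
            if (value == cur.value) {
                return false;                       // duplicate: leave the tree unchanged
            }
            if (value < cur.value) {
                if (cur.left == null) { cur.left = new Node(value); return true; }
                cur = cur.left;
            } else {
                if (cur.right == null) { cur.right = new Node(value); return true; }
                cur = cur.right;
            }
        }
    }

    // In-order traversal: left subtree, then the node itself, then the right subtree.
    List<Integer> inOrder() {
        List<Integer> out = new ArrayList<>();
        walk(root, out);
        return out;
    }

    private static void walk(Node node, List<Integer> out) {
        if (node == null) {
            return;
        }
        walk(node.left, out);
        out.add(node.value);
        walk(node.right, out);
    }

    public static void main(String[] args) {
        BstInOrderSketch tree = new BstInOrderSketch();
        for (int v : new int[] {12, 3, 7, 9, 3}) {
            tree.add(v);
        }
        System.out.println(tree.inOrder()); // [3, 7, 9, 12]
    }
}
```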

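Adding Integer data

Elements are sorted (Integer values in ascending order)
Sorting uses the internal comparator (the element type's own Comparable implementation)
Underlying structure: binary tree
Iteration uses in-order traversal (left node, root node, right node)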

import java.util.TreeSet;

public class TreeSet_Integer {
    public static void main(String[] args) {
        TreeSet<Integer> treesetInt = new TreeSet<>();
        treesetInt.add(12);
        treesetInt.add(3);
        treesetInt.add(7);
        treesetInt.add(9);
        treesetInt.add(3);
        System.out.println(treesetInt.size());
        System.out.println(treesetInt);
    }
}

Adding String data

String supplies the internal comparator itself (it implements Comparable)

import java.util.TreeSet;

public class TreeSet_ {
    public static void main(String[] args) {

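//        1. With the no-arg constructor, a TreeSet uses natural ordering, e.g. a~z
//        2. TreeSet also offers a constructor that accepts a comparator (for example an anonymous inner class)
//              that specifies the sort order

        TreeSet<String> treeSet = new TreeSet<>();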
        treeSet.add("jack");
        treeSet.add("tom");
        treeSet.add("10");
        treeSet.add("sp");
        treeSet.add("1");
        treeSet.add("5");
        treeSet.add("a");
        //treeSet.add(null);  throws NullPointerException

        System.out.println(treeSet);
    }
}

Adding data of a custom type

Create a Student class that implements the internal comparator (Comparable)

package set;

import java.util.TreeSet;

public class TreeSet_Student {

    public static void main(String[] args) {
        TreeSet<Student>  ts = new TreeSet<>();
        ts.add(new Student(10,"alili"));
        ts.add(new Student(8,"blili"));
        ts.add(new Student(9,"clili"));
        ts.add(new Student(3,"blili"));
        ts.add(new Student(7,"blili"));
        ts.add(new Student(1,"clili"));
        ts.add(new Student(3,"elili"));

        System.out.println(ts.size());
        System.out.println(ts);
    }


}


class Student implements Comparable<Student>{
    private int age;
    private String name;

    public Student(int age, String name) {
        this.age = age;
        this.name = name;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    @Override
    public String toString() {
        return "Student{" +
                "age=" + age +
                ", name='" + name + '\'' +
                '}';
    }


    @Override
    public int compareTo(Student o) {
        // Integer.compare avoids the overflow that subtracting two ints can cause
        return Integer.compare(this.getAge(), o.getAge());
    }
}

Using an external comparator

import java.util.Comparator;
import java.util.TreeSet;

public class TreeSet_ {
    public static void main(String[] args) {

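//        1. With the no-arg constructor, a TreeSet uses natural ordering, e.g. a~z
//        2. TreeSet also offers a constructor that accepts a comparator (for example an anonymous inner class)
//              that specifies the sort order

        TreeSet<String> treeSet = new TreeSet<>();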
        treeSet.add("jack");
        treeSet.add("tom");
        treeSet.add("10");
        treeSet.add("sp");
        treeSet.add("1");
        treeSet.add("5");
        treeSet.add("a");
        //treeSet.add(null);  throws NullPointerException

        System.out.println(treeSet);


        TreeSet<String> treeSet2 = new TreeSet<>(new Comparator<String>() {
            @Override
            public int compare(String o1, String o2) {
//                Use String's compareTo() method to compare the two strings
                return o1.compareTo(o2);
            }
        });
        treeSet2.add("jack");
        treeSet2.add("tom");
        treeSet2.add("10");
        treeSet2.add("sp");
        treeSet2.add("1");
        treeSet2.add("5");
        treeSet2.add("a");
        
        System.out.println(treeSet2);

    }
}
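
The comparator in the example above produces the same order as natural ordering, so the effect of an external comparator is easier to see with a different rule. The following sketch sorts by string length instead; LengthComparatorDemo and byLength are hypothetical names, while Comparator.comparingInt and the TreeSet(Comparator) constructor are standard JDK API. It also shows that a TreeSet treats two elements as duplicates whenever the comparator returns 0, regardless of equals().

```java
import java.util.Comparator;
import java.util.TreeSet;

// Hypothetical demo class for illustration only.
class LengthComparatorDemo {
    // Builds a TreeSet ordered by string length only.
    static TreeSet<String> byLength(String... items) {
        TreeSet<String> set = new TreeSet<>(Comparator.comparingInt(String::length));
        for (String item : items) {
            set.add(item); // a string whose length is already present compares as equal and is dropped
        }
        return set;
    }

    public static void main(String[] args) {
        // "sp" collides with "10" (length 2); "5" and "a" collide with "1" (length 1)
        System.out.println(byLength("jack", "tom", "10", "sp", "1", "5", "a")); // [1, 10, tom, jack]
    }
}
```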

Source code

TreeSet is backed by a TreeMap

private transient NavigableMap<E,Object> m;

// No-arg constructor: creates a TreeMap underneath
public TreeSet() {
    this(new TreeMap<E,Object>());
}

// The constructor invoked through this(...)
TreeSet(NavigableMap<E,Object> m) {
    this.m = m;
}

public boolean add(E e) {
    return m.put(e, PRESENT)==null;
}
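
The delegation pattern in this source excerpt (every element becomes a key, and a shared dummy object is the value) can be reproduced in a few lines. This is a minimal sketch under stated assumptions, not the JDK class: MapBackedSetSketch is a made-up name and only add, contains, size and toString are provided.

```java
import java.util.TreeMap;

// Hypothetical teaching class: a tiny sorted set that delegates to a TreeMap.
class MapBackedSetSketch<E> {
    // Shared dummy value stored for every key, in the same spirit as PRESENT in the JDK source.
    private static final Object PRESENT = new Object();

    private final TreeMap<E, Object> map = new TreeMap<>();

    // put() returns the previous value, so null means the key was not present before.
    boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    boolean contains(E e) {
        return map.containsKey(e);
    }

    int size() {
        return map.size();
    }

    @Override
    public String toString() {
        return map.keySet().toString(); // keys come back in sorted order
    }

    public static void main(String[] args) {
        MapBackedSetSketch<Integer> set = new MapBackedSetSketch<>();
        System.out.println(set.add(3)); // true
        System.out.println(set.add(1)); // true
        System.out.println(set.add(3)); // false
        System.out.println(set);        // [1, 3]
    }
}
```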