HashMap Internals and Source Code Analysis

Latest recommended article published on 2024-09-29 18:51:32

TyRed08

Category: Java. Tags: hash algorithms, algorithms, java

Article link: https://blog.csdn.net/qq_41493103/article/details/141426502

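1. Hash

Core idea: a hash function takes input of arbitrary length and maps it to output of fixed length. The mapping rule is the hash algorithm, and the binary string produced from the original data is the hash value.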
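
The "arbitrary length in, fixed length out" idea can be seen with a minimal sketch. It uses SHA-256 from the JDK purely as an illustration (HashMap itself relies on `Object.hashCode()`, which always returns a 32-bit `int`); the class name `FixedLengthHashDemo` and the helper `repeat` are made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class FixedLengthHashDemo {

    /** Returns the SHA-256 digest of the UTF-8 bytes of the input. */
    static byte[] sha256(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            return md.digest(input.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
    }

    /** Builds a string made of the given character repeated count times. */
    static String repeat(char c, int count) {
        StringBuilder sb = new StringBuilder(count);
        for (int i = 0; i < count; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A 1-character input and a 10,000-character input give digests of equal length.
        System.out.println(sha256("a").length);
        System.out.println(sha256(repeat('a', 10_000)).length);
        // String.hashCode() likewise always fits in a single int.
        System.out.println(repeat('a', 10_000).hashCode());
    }
}
```

Both digests are 32 bytes long regardless of the input size, which is the fixed-length property described above.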

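2. Properties of hashing:

1. The original data cannot be derived backwards from the hash value.

2. A small change in the input produces a completely different hash value, while identical inputs always produce the same hash value.

3. The algorithm must be efficient: even long text should be hashed quickly.

4. The probability of collisions should be low.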

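Hashing maps values from the input space into the hash space, and the hash space is far smaller than the input space. By the pigeonhole principle, some different inputs must therefore map to the same output.
Pigeonhole principle: if ten apples on a table are placed into nine drawers, then however they are distributed, at least one drawer ends up holding two or more apples.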
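
A concrete collision is easy to show in Java. The sketch below uses two well-known strings whose `String.hashCode()` values coincide; the class and method names are hypothetical.

```java
final class HashCollisionDemo {

    /** Returns true if a and b are different strings with the same hashCode. */
    static boolean collide(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // 'A' * 31 + 'a' = 65 * 31 + 97 = 2112
        System.out.println("Aa".hashCode()); // 2112
        // 'B' * 31 + 'B' = 66 * 31 + 66 = 2112
        System.out.println("BB".hashCode()); // 2112
        System.out.println(collide("Aa", "BB")); // true
    }
}
```

Because such collisions cannot be avoided, a hash table needs a strategy for storing several keys in the same slot.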

3. HashMap source code

    /**
     * The default initial capacity - MUST be a power of two.
     * This is the initial length of the Node array.
     */
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

    /**
     * The maximum capacity, used if a higher value is implicitly specified
     * by either of the constructors with arguments.
     * MUST be a power of two <= 1<<30.
     */
    static final int MAXIMUM_CAPACITY = 1 << 30;

    /**
     * The load factor used when none specified in constructor.
     * The default value is 0.75.
     */
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    /**
     * The bin count threshold for using a tree rather than list for a
     * bin.  Bins are converted to trees when adding an element to a
     * bin with at least this many nodes. The value must be greater
     * than 2 and should be at least 8 to mesh with assumptions in
     * tree removal about conversion back to plain bins upon
     * shrinkage.
     * <p>
     * Treeify threshold: once a bin's linked list has reached this length,
     * adding another element may convert the bin to a red-black tree
     * (only if the table is at least MIN_TREEIFY_CAPACITY long).
     */
    static final int TREEIFY_THRESHOLD = 8;

    /**
     * The bin count threshold for untreeifying a (split) bin during a
     * resize operation. Should be less than TREEIFY_THRESHOLD, and at
     * most 6 to mesh with shrinkage detection under removal.
     * <p>
     * Untreeify threshold: the size at which a tree bin is downgraded
     * back to a linked list.
     */
    static final int UNTREEIFY_THRESHOLD = 6;

    /**
     * The smallest table capacity for which bins may be treeified.
     * (Otherwise the table is resized if too many nodes in a bin.)
     * Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts
     * between resizing and treeification thresholds.
     * <p>
     * A bin that has reached TREEIFY_THRESHOLD is treeified only when the
     * table (the Node array) has at least this capacity; below it, the
     * table is resized instead.
     */
    static final int MIN_TREEIFY_CAPACITY = 64;
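
The interplay of TREEIFY_THRESHOLD and MIN_TREEIFY_CAPACITY described in the comments above can be sketched as a small decision function. This is a simplified illustration under the rules stated in those comments, not the JDK code path: `BinAction` and `onInsert` are hypothetical names, and the real HashMap makes this decision inside its put logic.

```java
final class TreeifyDecisionSketch {
    static final int TREEIFY_THRESHOLD = 8;
    static final int MIN_TREEIFY_CAPACITY = 64;

    /** What happens to a bin when one more element is added to it. */
    enum BinAction { KEEP_LIST, RESIZE_TABLE, TREEIFY }

    /**
     * nodesAlreadyInBin: number of nodes in the bin before the new element is added.
     * tableLength: current length of the Node array.
     */
    static BinAction onInsert(int nodesAlreadyInBin, int tableLength) {
        if (nodesAlreadyInBin < TREEIFY_THRESHOLD) {
            return BinAction.KEEP_LIST;      // bin is still short: plain linked list
        }
        if (tableLength < MIN_TREEIFY_CAPACITY) {
            return BinAction.RESIZE_TABLE;   // table too small: grow it instead of treeifying
        }
        return BinAction.TREEIFY;            // long bin in a large enough table
    }

    public static void main(String[] args) {
        System.out.println(onInsert(7, 64)); // KEEP_LIST
        System.out.println(onInsert(8, 16)); // RESIZE_TABLE
        System.out.println(onInsert(8, 64)); // TREEIFY
    }
}
```

Resizing first is the cheaper remedy: in a small table a long bin is more likely caused by the table being crowded than by genuinely bad hash codes.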
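
    // Inner class: every key-value mapping is wrapped in a Node.
    // (Constructor and Map.Entry methods are omitted in this excerpt.)
    static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;   // cached hash of the key
        final K key;
        V value;
        Node<K,V> next;   // next node in the same bin (singly linked list)
    }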
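
To show how the `next` pointer is used, here is a minimal sketch of a lookup inside one bin. `SimpleNode` and `find` are hypothetical stand-ins written for this example, and raw `hashCode()` values are stored for brevity, whereas HashMap stores the result of its `hash(key)` function.

```java
import java.util.Objects;

final class NodeChainSketch {

    /** Hypothetical stand-in for HashMap.Node: one entry in a bin's linked list. */
    static final class SimpleNode<K, V> {
        final int hash;
        final K key;
        V value;
        SimpleNode<K, V> next;

        SimpleNode(int hash, K key, V value, SimpleNode<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    /** Walks the chain starting at head; returns the value for key, or null if absent. */
    static <K, V> V find(SimpleNode<K, V> head, int hash, Object key) {
        for (SimpleNode<K, V> e = head; e != null; e = e.next) {
            // Compare the cached hash first (cheap), then the key itself.
            if (e.hash == hash && Objects.equals(key, e.key)) {
                return e.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // "Aa" and "BB" have the same hashCode, so they would land in the same bin.
        SimpleNode<String, Integer> second = new SimpleNode<>("BB".hashCode(), "BB", 2, null);
        SimpleNode<String, Integer> head = new SimpleNode<>("Aa".hashCode(), "Aa", 1, second);
        System.out.println(find(head, "BB".hashCode(), "BB")); // 2
        System.out.println(find(head, "Cc".hashCode(), "Cc")); // null
    }
}
```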

    /**
     * The table, initialized on first use, and resized as
     * necessary. When allocated, length is always a power of two.
     * (We also tolerate length zero in some operations to allow
     * bootstrapping mechanics that are currently not needed.)
     */
    transient HashMap.Node<K, V>[] table;

    /**
     * Holds cached entrySet(). Note that AbstractMap fields are used
     * for keySet() and values().
     */
    transient Set<Map.Entry<K, V>> entrySet;

    /**
     * The number of key-value mappings contained in this map,
     * i.e. the current number of elements.
     */
    transient int size;

    /**
     * The number of times this HashMap has been structurally modified.
     * Structural modifications are those that change the number of mappings in
     * the HashMap or otherwise modify its internal structure (e.g.,
     * rehash).  This field is used to make iterators on Collection-views of
     * the HashMap fail-fast.  (See ConcurrentModificationException).
     */
    transient int modCount;
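
The fail-fast behaviour that modCount enables can be observed from outside the class. The sketch below is a small demonstration under the assumption of a single thread; `FailFastDemo` and `failsFast` are names made up for this example.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

final class FailFastDemo {

    /** Returns true if modifying the map during iteration triggers the fail-fast check. */
    static boolean failsFast() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (String key : map.keySet()) {
                // Structural modification: a new mapping is added, so modCount changes
                // and the iterator notices the mismatch on its next step.
                map.put("new-" + key, 0);
            }
        } catch (ConcurrentModificationException e) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(failsFast()); // true
    }
}
```

Overwriting the value of an existing key is not a structural modification; removing entries while iterating should go through the iterator's own `remove()` method.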

    /**
     * The next size value at which to resize (capacity * load factor).
     * <p>
     * The javadoc description is true upon serialization.
     * Additionally, if the table array has not been allocated, this
     * field holds the initial array capacity, or zero signifying
     * DEFAULT_INITIAL_CAPACITY.
     * <p>
     * Resize threshold: when the number of elements in the hash table
     * exceeds this value, a resize is triggered.
     * threshold = capacity * loadFactor
     */
    int threshold;

    /**
     * The load factor for the hash table.
     */
    final float loadFactor;
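
As a worked example of `threshold = capacity * loadFactor`, the following sketch prints the threshold for a few capacities with the default load factor. `thresholdFor` is a hypothetical helper that ignores the MAXIMUM_CAPACITY edge cases handled by the real resize code.

```java
final class ThresholdSketch {
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    /** Simplified threshold calculation: capacity * loadFactor, truncated to an int. */
    static int thresholdFor(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        // The capacity doubles on each resize, and so does the threshold.
        for (int capacity = 16; capacity <= 128; capacity <<= 1) {
            System.out.println(capacity + " -> " + thresholdFor(capacity, DEFAULT_LOAD_FACTOR));
        }
        // 16 -> 12, 32 -> 24, 64 -> 48, 128 -> 96
    }
}
```

With the defaults, a map resizes once it holds more than 12 mappings, so it helps to pass an initial capacity to the constructor when the expected size is known.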


    /**
     * Returns a power of two size for the given target capacity.
     * That is, the smallest power of two that is greater than or equal
     * to cap (capped at MAXIMUM_CAPACITY).
     */
    static final int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }
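
To see what the bit-smearing does, the method can be copied into a standalone class and tried on a few inputs. The wrapper class `TableSizeForDemo` is made up for this example; the method body is the one shown above.

```java
final class TableSizeForDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    static int tableSizeFor(int cap) {
        int n = cap - 1;   // subtract 1 so an exact power of two maps to itself
        n |= n >>> 1;      // copy the highest set bit into every lower position,
        n |= n >>> 2;      // doubling the width of the run of ones each step
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;     // n is now of the form 0b0...0111...1
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(0));  // 1
        System.out.println(tableSizeFor(1));  // 1
        System.out.println(tableSizeFor(10)); // 16
        System.out.println(tableSizeFor(16)); // 16
        System.out.println(tableSizeFor(17)); // 32
    }
}
```

For cap = 10: n starts at 9 (binary 1001), the shifts turn it into 1111 (15), and n + 1 gives 16.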

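    /**
     * Perturbation function.
     * Purpose: let the high 16 bits of the key's hashCode take part in the
     * bucket-index calculation. The index is taken from the low bits of the
     * hash, so XORing the high 16 bits into the low 16 bits makes the result
     * carry the characteristics of both halves, which spreads keys more
     * evenly and lowers the probability of collisions.
     * A null key always hashes to 0.
     */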
    static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }
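
The effect of the perturbation is easiest to see with hash codes that differ only in their high bits. The sketch below assumes the bucket index is computed as `(n - 1) & hash` for a table of length n (a power of two); `spread` and `indexFor` are hypothetical helper names for this example.

```java
final class HashSpreadDemo {

    /** Same mixing step as HashMap.hash(), applied to a raw hashCode value. */
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    /** Bucket index for a table of length n, where n is a power of two. */
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;
        int h1 = 0x10000; // differs from h2 only above bit 15
        int h2 = 0x20000;

        // Without mixing, only the low 4 bits are used, so both land in bucket 0.
        System.out.println(indexFor(h1, n)); // 0
        System.out.println(indexFor(h2, n)); // 0

        // With mixing, the high bits influence the index and the keys separate.
        System.out.println(indexFor(spread(h1), n)); // 1
        System.out.println(indexFor(spread(h2), n)); // 2
    }
}
```

A single shift and XOR is cheap, which matters because the function runs on every get and put.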