HashMap Internals Explained: From Source Code to Practice

Article link: https://blog.csdn.net/jnimiijabc/article/details/147629314

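Preface

As one of the most important and most frequently used data structures in the Java Collections Framework, HashMap and its internals are core knowledge for every Java developer. This article digs into how HashMap is implemented, compares the optimizations across JDK versions, walks through key source code, and answers common interview questions.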

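I. Evolution of HashMap's Basic Structure

JDK 1.7 implementation: array + linked list

In JDK 1.7 and earlier, HashMap used the classic "array + linked list" structure:

// Entry in JDK 1.7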
static class Entry<K,V> implements Map.Entry<K,V> {
    final K key;
    V value;
    Entry<K,V> next;  // next entry in the same bucket's chain
    int hash;
    
    // constructor and remaining code...
}

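Each array slot is called a "bucket". On a hash collision, the new element is inserted at the head of that bucket's linked list (head insertion).

JDK 1.8 optimization: array + linked list / red-black tree

JDK 1.8 substantially reworked HashMap:

// Node in JDK 1.8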
static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    V value;
    Node<K,V> next;  // the linked-list pointer is still kept
    
    // constructor and remaining code...
}

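// Red-black tree node
static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
    TreeNode<K,V> parent;  // parent node in the red-black tree
    TreeNode<K,V> left;    // left child
    TreeNode<K,V> right;   // right child
    TreeNode<K,V> prev;    // previous node; needed to unlink next upon deletion
    boolean red;           // color flag
    
    // constructor and remaining code...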
}

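Main optimizations:

1. When a bucket's linked list reaches the treeify threshold of 8 (TREEIFY_THRESHOLD) and the table length is at least 64 (MIN_TREEIFY_CAPACITY), the list is converted to a red-black tree; if the table is shorter than 64, the map resizes instead

2. When a tree bin shrinks to 6 nodes or fewer (UNTREEIFY_THRESHOLD), which is checked when the bin is split during a resize, it is converted back to a linked list

3. On a hash collision, new nodes are appended at the tail of the list instead of inserted at the head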
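The difference between head insertion and tail insertion can be sketched on a bare linked list. This is only an illustration: `InsertionOrderSketch`, its `Node` class, and the helper names are hypothetical and are not taken from the JDK source.

```java
import java.util.ArrayList;
import java.util.List;

final class InsertionOrderSketch {
    // Minimal singly linked node standing in for a bucket entry (hypothetical type).
    static final class Node {
        final String key;
        Node next;
        Node(String key, Node next) { this.key = key; this.next = next; }
    }

    // JDK 1.7 style: the new node becomes the head of the chain.
    static Node headInsert(Node head, String key) {
        return new Node(key, head);
    }

    // JDK 1.8 style: walk to the end and append, which keeps insertion order.
    static Node tailInsert(Node head, String key) {
        Node node = new Node(key, null);
        if (head == null) return node;
        Node tail = head;
        while (tail.next != null) tail = tail.next;
        tail.next = node;
        return head;
    }

    // Collects the keys of a chain in traversal order.
    static List<String> keys(Node head) {
        List<String> out = new ArrayList<>();
        for (Node n = head; n != null; n = n.next) out.add(n.key);
        return out;
    }

    public static void main(String[] args) {
        Node byHead = null, byTail = null;
        for (String k : new String[] {"a", "b", "c"}) {
            byHead = headInsert(byHead, k);
            byTail = tailInsert(byTail, k);
        }
        System.out.println(keys(byHead)); // [c, b, a]
        System.out.println(keys(byTail)); // [a, b, c]
    }
}
```

Head insertion reverses the order of a chain every time it is rebuilt, while tail insertion preserves it.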

II. Key Source Code Walkthrough

1. Optimized hash() computation

// hash method in JDK 1.8
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

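Folding the high 16 bits into the low 16 bits with XOR balances speed against collision reduction:

1. The high 16 bits are left unchanged, so none of their information is lost.

2. The low 16 bits now also reflect the high bits, which matters because a small table uses only the low bits of the hash to pick a bucket.

3. It is cheaper than JDK 1.7, which scrambled the hash with four shifts and several XORs; JDK 1.8 needs one shift and one XOR.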
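A small experiment makes the effect of the spreading step concrete. The sketch below assumes a table of length 16 and two made-up hash codes that differ only in their high bits; `HashSpreadSketch`, `spread`, and `indexFor` are illustrative names, not JDK methods.

```java
final class HashSpreadSketch {
    // The same spreading step as the JDK 1.8 hash() above, applied to a raw hashCode.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Bucket index for a table whose length n is a power of two.
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;
        int h1 = 0x10000, h2 = 0x20000; // differ only above bit 15
        // Without spreading, both land in bucket 0.
        System.out.println(indexFor(h1, n) + " " + indexFor(h2, n));                 // 0 0
        // With spreading, the high bits reach the index and the keys separate.
        System.out.println(indexFor(spread(h1), n) + " " + indexFor(spread(h2), n)); // 1 2
    }
}
```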

2. Resize mechanism (resize)

final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    
    // Compute the new capacity and the new threshold
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double the threshold along with the capacity
    }
    // ... remaining initialization elided (it finalizes newCap/newThr, allocates newTab, and assigns table = newTab)
    
    // Move the existing nodes into the new table
    if (oldTab != null) {
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode)
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else { // split the list into low/high halves, preserving relative order
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        next = e.next;
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        else {
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}

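Resize optimizations in JDK 1.8:

1. Each bucket's list is split into a "low" list and a "high" list, so indexes are not recomputed from scratch: node hashes are cached, and only the bit selected by oldCap is tested.

2. After resizing, a node either stays at its original index or moves to original index + old capacity.

3. Tree bins are split the same way (TreeNode.split), and a half that ends up small enough is converted back to a linked list.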
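The "stay or move by oldCap" rule can be checked against the direct computation `hash & (newCap - 1)`. The following is a minimal sketch with made-up hash values; `ResizeSplitSketch` and `newIndex` are hypothetical names used only for this illustration.

```java
final class ResizeSplitSketch {
    // New bucket index after the table doubles, derived from the old index:
    // the single bit selected by oldCap decides between "stay" and "move by oldCap".
    static int newIndex(int hash, int oldCap) {
        int oldIndex = hash & (oldCap - 1);
        return (hash & oldCap) == 0 ? oldIndex : oldIndex + oldCap;
    }

    public static void main(String[] args) {
        int oldCap = 16;
        // All four hashes share bucket 5 while the capacity is 16.
        for (int hash : new int[] {5, 21, 37, 53}) {
            int direct = hash & (2 * oldCap - 1); // index computed from scratch
            System.out.println(hash + " -> " + newIndex(hash, oldCap) + " (direct: " + direct + ")");
        }
    }
}
```

For every hash the two results agree: 5 and 37 stay in bucket 5, while 21 and 53 move to bucket 21 (5 + 16).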

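3. Thread safety

HashMap is not thread-safe. Common problems:

JDK 1.7: concurrent resizing can link a bucket's list into a cycle (head insertion reverses node order during the transfer), after which traversing that bucket loops forever.

JDK 1.8: lost updates (concurrent put calls can overwrite each other's entries).

Solutions:

// Use Collections.synchronizedMap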
Map<String, String> map = Collections.synchronizedMap(new HashMap<>());

// Or use ConcurrentHashMap
Map<String, String> concurrentMap = new ConcurrentHashMap<>();
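As a rough illustration of the ConcurrentHashMap option, the sketch below has several threads increment one shared counter through `merge`, which ConcurrentHashMap performs atomically. The class name, the key "hits", and the thread counts are assumptions chosen for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class ConcurrentCountSketch {
    // Starts `threads` workers that each increment the same key `perThread` times,
    // then returns the final count stored in the map.
    static int countWith(Map<String, Integer> map, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    map.merge("hits", 1, Integer::sum); // atomic on ConcurrentHashMap
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return map.get("hits");
    }

    public static void main(String[] args) {
        System.out.println(countWith(new ConcurrentHashMap<>(), 4, 10_000)); // 40000
    }
}
```

Passing a plain HashMap to the same method would not be safe: increments can be lost, and the result is not predictable.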

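III. Common Interview Questions Explained

Why is a list converted to a red-black tree at a threshold of 8?

1. Statistical basis: with well-distributed hash codes, bucket sizes roughly follow a Poisson distribution (mean about 0.5 at the default load factor of 0.75), so the probability of a bucket holding 8 nodes is tiny (about 0.00000006).

2. Time complexity: a linked-list lookup is O(n), while a red-black tree lookup is O(log n).

3. Space trade-off: a TreeNode takes about twice the space of a regular Node, so the conversion happens only when necessary.

4. Hysteresis: a tree converts back to a list only at 6 nodes or fewer; the gap between 6 and 8 avoids flip-flopping between the two forms.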
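The 0.00000006 figure can be reproduced from the Poisson formula P(k) = e^(-λ) · λ^k / k! with λ = 0.5. The sketch below simply evaluates that formula; the class and method names are illustrative.

```java
final class BinSizeProbabilitySketch {
    // Poisson probability that a bucket holds exactly k nodes, given mean lambda.
    static double poisson(double lambda, int k) {
        double factorial = 1.0;
        for (int i = 2; i <= k; i++) factorial *= i;
        return Math.exp(-lambda) * Math.pow(lambda, k) / factorial;
    }

    public static void main(String[] args) {
        // Print the probability of each bucket size from 0 to 8 for lambda = 0.5.
        for (int k = 0; k <= 8; k++) {
            System.out.printf("%d: %.8f%n", k, poisson(0.5, k));
        }
    }
}
```

The last line printed corresponds to k = 8 and rounds to 0.00000006, matching the value quoted above.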

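Why is the capacity a power of two?

1. `(n - 1) & hash` can replace the modulo operation, which is faster

2. Resizing can use the low/high list split, which only needs to test `(e.hash & oldCap) == 0`

3. `n - 1` is then an all-ones bit mask, so every bucket is reachable and each low bit of the hash contributes to the index; with any other length, the mask would leave some buckets unused and increase collisions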
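The points above can be checked numerically. In the sketch below, `nextPowerOfTwo` is a simplified stand-in for the rounding HashMap applies to a requested capacity (not the JDK's own code), and `maskMatchesMod` compares the bit mask with `Math.floorMod`; both names are hypothetical.

```java
final class PowerOfTwoIndexSketch {
    // Smallest power of two >= cap, valid for 1 <= cap <= 2^30.
    static int nextPowerOfTwo(int cap) {
        int highest = Integer.highestOneBit(cap);
        return highest == cap ? cap : highest << 1;
    }

    // True when masking with (n - 1) gives the same bucket as a mathematical modulo.
    static boolean maskMatchesMod(int hash, int n) {
        return ((n - 1) & hash) == Math.floorMod(hash, n);
    }

    public static void main(String[] args) {
        System.out.println(nextPowerOfTwo(100));     // 128
        System.out.println(maskMatchesMod(-37, 16)); // true: 16 is a power of two
        System.out.println(maskMatchesMod(21, 12));  // false: 12 is not
    }
}
```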

IV. A Hand-Written Simplified HashMap

public class MyHashMap<K, V> {
    private static final int DEFAULT_CAPACITY = 16;
    private static final float DEFAULT_LOAD_FACTOR = 0.75f;
    private static final int TREEIFY_THRESHOLD = 8;
    
    private Node<K, V>[] table;
    private int size;
    private int threshold;
    private float loadFactor;
    
    static class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;
        
        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }
    
    public MyHashMap() {
        this.loadFactor = DEFAULT_LOAD_FACTOR;
        this.threshold = (int)(DEFAULT_CAPACITY * DEFAULT_LOAD_FACTOR);
    }
    
    private int hash(K key) {
        if (key == null) return 0;
        int h = key.hashCode();
        return h ^ (h >>> 16);
    }
    
    public V put(K key, V value) {
        if (table == null || table.length == 0) {
            resize();
        }
        
        int hash = hash(key);
        int index = (table.length - 1) & hash;
        Node<K, V> head = table[index];
        
        // If the key already exists, overwrite its value and return the old one
        for (Node<K, V> node = head; node != null; node = node.next) {
            if (node.hash == hash && 
                (node.key == key || (key != null && key.equals(node.key)))) {
                V oldValue = node.value;
                node.value = value;
                return oldValue;
            }
        }
        
        // Add a new node (head insertion for simplicity; JDK 1.8 appends at the tail)
        addNode(hash, key, value, index);
        return null;
    }
    
    private void addNode(int hash, K key, V value, int index) {
        Node<K, V> head = table[index];
        Node<K, V> newNode = new Node<>(hash, key, value, head);
        table[index] = newNode;
        
        if (++size > threshold) {
            resize();
        }
    }
    
    private void resize() {
        Node<K, V>[] oldTab = table;
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        int newCap = oldCap == 0 ? DEFAULT_CAPACITY : oldCap << 1;
        threshold = (int)(newCap * loadFactor);
        
        @SuppressWarnings("unchecked")
        Node<K, V>[] newTab = (Node<K, V>[])new Node[newCap];
        table = newTab;
        
        if (oldTab != null) {
            for (int i = 0; i < oldCap; i++) {
                Node<K, V> node = oldTab[i];
                if (node != null) {
                    oldTab[i] = null; // help GC
                    if (node.next == null) {
                        newTab[node.hash & (newCap - 1)] = node;
                    } else {
                        // Simplified: re-insert each node at the head of its new bucket;
                        // JDK 1.8 splits the chain into low/high lists instead
                        do {
                            Node<K, V> next = node.next;
                            int newIndex = node.hash & (newCap - 1);
                            node.next = newTab[newIndex];
                            newTab[newIndex] = node;
                            node = next;
                        } while (node != null);
                    }
                }
            }
        }
    }
    
    // Other methods such as get and remove are omitted...
}
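For completeness, here is one way the omitted lookup could work. It is a standalone sketch rather than part of MyHashMap: the `Node` class is repeated so the example compiles on its own, and `BucketLookupSketch`, `sampleTable`, and the sample key are made up for the illustration.

```java
final class BucketLookupSketch {
    // Same shape as the Node in MyHashMap above, repeated so this sketch stands alone.
    static final class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    static int hash(Object key) {
        if (key == null) return 0;
        int h = key.hashCode();
        return h ^ (h >>> 16);
    }

    // Returns the value mapped to key, or null if its bucket has no matching node.
    static <K, V> V get(Node<K, V>[] table, Object key) {
        if (table == null || table.length == 0) return null;
        int hash = hash(key);
        for (Node<K, V> node = table[(table.length - 1) & hash]; node != null; node = node.next) {
            if (node.hash == hash && (node.key == key || (key != null && key.equals(node.key)))) {
                return node.value;
            }
        }
        return null;
    }

    // Builds a 16-slot table holding the single (hypothetical) entry "answer" -> 42.
    static Node<String, Integer>[] sampleTable() {
        @SuppressWarnings("unchecked")
        Node<String, Integer>[] table = (Node<String, Integer>[]) new Node[16];
        int h = hash("answer");
        table[(table.length - 1) & h] = new Node<>(h, "answer", 42, null);
        return table;
    }

    public static void main(String[] args) {
        System.out.println(get(sampleTable(), "answer"));  // 42
        System.out.println(get(sampleTable(), "missing")); // null
    }
}
```

Comparing the cached hash before calling equals() mirrors the check in put above and skips most non-matching nodes cheaply.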

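V. Practical Recommendations

1. Initial capacity: estimate the number of elements up front to avoid repeated resizing

   // Expecting about 100 elements
   Map<String, String> map = new HashMap<>(134);  // 100 / 0.75 ≈ 133.3, so pass 134; HashMap rounds it up to 256 (a capacity of 128 would resize once 96 entries are exceeded)

2. Key design:
Override hashCode() and equals() consistently
Keep keys immutable (ideally use immutable objects as keys)

3. Performance monitoring: watch for hash collisions; the table contents can be inspected via reflection (on recent JDKs with strong encapsulation this requires --add-opens java.base/java.util=ALL-UNNAMED)

4. Concurrency: prefer ConcurrentHashMap or Collections.synchronizedMap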
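The capacity arithmetic from recommendation 1 can be wrapped in a small helper. This is a sketch under the assumption that the load factor stays at its default of 0.75; `InitialCapacitySketch` and `capacityFor` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

final class InitialCapacitySketch {
    // Smallest initial capacity whose threshold (capacity * loadFactor) can hold
    // `expected` entries without a resize. HashMap rounds it up to a power of two itself.
    static int capacityFor(int expected, float loadFactor) {
        return (int) Math.ceil(expected / (double) loadFactor);
    }

    public static void main(String[] args) {
        int capacity = capacityFor(100, 0.75f);
        System.out.println(capacity); // 134
        // HashMap rounds 134 up to a table length of 256 (threshold 192).
        Map<String, String> map = new HashMap<>(capacity);
        map.put("k", "v");
    }
}
```

On JDK 19 and later, `HashMap.newHashMap(numMappings)` performs this calculation for you.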

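Summary

HashMap's evolution reflects the continuous improvement of the Java Collections Framework. Understanding its internals helps in interviews and, more importantly, guides us toward writing more efficient code. From JDK 1.7 to JDK 1.8, HashMap was significantly optimized in its data structure, hash function, and resize mechanism, and the ideas behind these improvements are worth learning from.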