How Java's Map and Set Implementation Classes Work

Latest recommended article published on 2023-02-27 11:35:05

choulie4795

Tags: java, data structures and algorithms

Original link: https://my.oschina.net/u/128964/blog/1835553


1.HashMap

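HashMap holds an array of Node<K,V> called table, which stores every key-value pair added to the map (the source declares it as: transient Node<K,V>[] table;). So at the top level, HashMap's storage structure is an array.

Now let's look at the structure of Node.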

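static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;      // hash of the key, computed once at insertion
        final K key;
        V value;
        Node<K,V> next;      // next node in the same bucket, or null

        Node(int hash, K key, V value, Node<K,V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }

        // ... remaining methods of the class omitted
}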

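(This is only part of the Node source.) From it we can see that Node is a static nested class that forms a singly linked list.

That settles HashMap's basic storage structure. So how are reads and writes handled? Let's look at some more of the source.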

    public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }

    static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }

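The code above adds data to a HashMap; the two methods to understand are hash and putVal. hash computes a hash value for an object, here the map's key: it takes key.hashCode() and XORs its high 16 bits into its low 16 bits (a null key hashes to 0). putVal does the actual insertion. If table is still empty, it first calls resize() to allocate it. It then computes the bucket index as (n - 1) & hash, where n is the table length, and reads that slot of table. If the slot is empty, it creates a Node and stores it there. If the slot is already occupied, it checks whether the key being added matches an existing key (same hash, and either the same reference or equals returns true). If it matches, the value is updated; if not, putVal walks the node chain, comparing against every node, and appends a new node at the tail. If the slot already holds a TreeNode, the insertion goes through putTreeVal instead.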
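The index calculation can be tried outside of HashMap. The following is a minimal sketch: spread copies the body of HashMap.hash shown above, while the class name HashIndexDemo and the helper indexFor are made up for this illustration and do not exist in the JDK.

```java
class HashIndexDemo {
    // Same bit-spreading step as HashMap.hash(): XOR the high 16 bits into the low 16.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Hypothetical helper: bucket index for a table of length n (n must be a power of two).
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16; // HashMap's default table length
        for (String key : new String[] {"a", "q", "hello"}) {
            int hash = spread(key);
            System.out.println(key + " -> hash " + hash + ", index " + indexFor(hash, n));
        }
        // A null key always hashes to 0 and therefore lands in slot 0.
        System.out.println("null -> index " + indexFor(spread(null), n));
    }
}
```

With a table length of 16, "a" (hash 97) and "q" (hash 113) both land at index 1 even though their hash values differ, because only the low four bits take part in the mask.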

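One term needs explaining here: hash collision. As the saying goes, two keys with the same hash value are not necessarily equal under equals, but two keys that are equal under equals count as the same key. HashMap puts entries whose hash values map to the same index into the same slot of table, then uses the hash comparison together with equals to decide whether to update an existing node in the chain or add a new one. Because the index uses only the low bits of the hash, keys with different hash values can also end up in the same slot.

HashMap's storage structure is now clear: a bucket array indexed by hash value, with a linked list in each bucket holding the entries that collide there.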
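A collision is easy to reproduce with two well-known strings, "Aa" and "BB", whose hashCode values are identical. The sketch below only uses the public HashMap API; the class name CollisionDemo is made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionDemo {
    static Map<String, Integer> build() {
        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2); // same hashCode as "Aa" but equals() is false: a new entry is added
        map.put("Aa", 3); // same hashCode and equals() is true: the value is replaced
        return map;
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true, both are 2112
        Map<String, Integer> map = build();
        System.out.println(map.size());    // 2
        System.out.println(map.get("Aa")); // 3
        System.out.println(map.get("BB")); // 2
    }
}
```

Both keys survive in the same bucket, and only the repeated "Aa" overwrites a value.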

One thing to note: the putVal method contains this piece of code

for (int binCount = 0; ; ++binCount) {
    if ((e = p.next) == null) {
        p.next = newNode(hash, key, value, null);
        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
            treeifyBin(tab, hash);
        break;
    }
    if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k))))
        break;
    p = e;
}

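This loop runs when a node is appended to the singly linked list. TREEIFY_THRESHOLD defaults to 8, and when the check if (binCount >= TREEIFY_THRESHOLD - 1) holds, the treeifyBin method is called. Let's look at what that method does in the source.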

    final void treeifyBin(Node<K,V>[] tab, int hash) {
        int n, index; Node<K,V> e;
        if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
            resize();
        else if ((e = tab[index = (n - 1) & hash]) != null) {
            TreeNode<K,V> hd = null, tl = null;
            do {
                TreeNode<K,V> p = replacementTreeNode(e, null);
                if (tl == null)
                    hd = p;
                else {
                    p.prev = tl;
                    tl.next = p;
                }
                tl = p;
            } while ((e = e.next) != null);
            if ((tab[index] = hd) != null)
                hd.treeify(tab);
        }
    }

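This converts a chain of Node objects into TreeNode objects (the famous red-black tree). When collisions make the chain in one bucket longer than 8 nodes, the bucket is treeified into a red-black tree, where a lookup inside the bucket takes O(log n) comparisons instead of the O(n) of a linked list. Note the first branch, though: if the table has fewer than MIN_TREEIFY_CAPACITY (64) slots, treeifyBin calls resize() instead of building a tree. As for why an array of Node can hold TreeNode objects, it is because TreeNode is a subclass of Node, as the source shows.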

static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V>

TreeNode extends the Entry class of LinkedHashMap

static class Entry<K,V> extends HashMap.Node<K,V>

and the Entry class of LinkedHashMap extends Node.
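To see which chain length actually triggers the treeify check, the append loop from putVal can be replayed on a bare linked list. This is only a simulation, assuming every inserted key is distinct and lands in the same bucket; the Link class and the method name are made up for illustration and are not part of HashMap.

```java
class TreeifyThresholdDemo {
    static final int TREEIFY_THRESHOLD = 8; // same value as HashMap's constant

    // Hypothetical stand-in for HashMap.Node; only the link matters here.
    static final class Link {
        Link next;
    }

    // Appends nodes to one bucket with the same loop shape as putVal and returns
    // the chain length at the moment the treeify condition first holds.
    static int chainLengthWhenTreeifyTriggers() {
        Link head = new Link(); // the first node goes straight into the empty slot
        int length = 1;
        while (true) {
            Link p = head;
            for (int binCount = 0; ; ++binCount) {
                if (p.next == null) {
                    p.next = new Link(); // append at the tail
                    length++;
                    if (binCount >= TREEIFY_THRESHOLD - 1)
                        return length;   // putVal would call treeifyBin here
                    break;
                }
                p = p.next;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(chainLengthWhenTreeifyTriggers()); // 9
    }
}
```

The check first succeeds when the ninth node is appended, which matches the description of a chain growing beyond 8 nodes.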

2.LinkedHashMap

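It extends HashMap, and its Entry adds two fields, before and after. The bucket chains still use the singly linked next pointer inherited from HashMap; on top of that, LinkedHashMap threads all of its entries onto a doubly linked list.

Here is the source: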

    Node<K,V> newNode(int hash, K key, V value, Node<K,V> e) {
        LinkedHashMap.Entry<K,V> p =
            new LinkedHashMap.Entry<K,V>(hash, key, value, e);
        linkNodeLast(p);
        return p;
    }

    TreeNode<K,V> newTreeNode(int hash, K key, V value, Node<K,V> next) {
        TreeNode<K,V> p = new TreeNode<K,V>(hash, key, value, next);
        linkNodeLast(p);
        return p;
    }

    private void linkNodeLast(LinkedHashMap.Entry<K,V> p) {
        LinkedHashMap.Entry<K,V> last = tail;
        tail = p;
        if (last == null)
            head = p;
        else {
            p.before = last;
            last.after = p;
        }
    }

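Every node creation goes through linkNodeLast, which maintains before and after and appends the node to the tail of the list, so the insertion order is preserved. The rest is basically the same as HashMap.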
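The effect of that linked list is visible through the public API. A small sketch, assuming the default insertion-order mode of LinkedHashMap; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class InsertionOrderDemo {
    static List<String> keysInIterationOrder() {
        Map<String, Integer> map = new LinkedHashMap<>();
        map.put("banana", 1);
        map.put("apple", 2);
        map.put("cherry", 3);
        map.put("banana", 4); // updating an existing key does not move it in the list
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysInIterationOrder()); // [banana, apple, cherry]
    }
}
```

Updating "banana" creates no new node, so linkNodeLast is not called again and the key keeps its original position.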

3.TreeMap

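TreeMap stores its data in a red-black tree. Its fields are a root node of type Entry<K,V> and a Comparator<? super K> (when no comparator is supplied, the keys' natural ordering through Comparable is used). On every insertion the new key is compared with the nodes on the way down the tree: smaller keys go left, larger keys go right, and an equal key has its value overwritten. Lookup works the same way: the node for which the comparison returns 0 is the one being searched for.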
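The "equal means overwrite" rule depends on the comparison, not on equals. The sketch below uses String.CASE_INSENSITIVE_ORDER as the comparator to make that visible; the class name TreeMapDemo is made up for illustration.

```java
import java.util.TreeMap;

class TreeMapDemo {
    static TreeMap<String, Integer> build() {
        TreeMap<String, Integer> map = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        map.put("pear", 1);
        map.put("Apple", 2);
        map.put("fig", 3);
        map.put("APPLE", 4); // compares as 0 against "Apple": the value is overwritten
        return map;
    }

    public static void main(String[] args) {
        TreeMap<String, Integer> map = build();
        System.out.println(map);              // {Apple=4, fig=3, pear=1}
        System.out.println(map.get("apple")); // 4
    }
}
```

The keys come back in comparator order, and "APPLE" replaces the value stored under "Apple" without adding a second entry.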

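4.Hashtable

This class (spelled Hashtable in the JDK) has largely fallen out of use. Here is a brief summary of how it differs from HashMap.

In storage structure the two are basically the same: a bucket array plus singly linked lists (HashMap additionally treeifies long chains, as described above). They differ in the initial number of buckets and in how the bucket array grows.

Hashtable's default initial capacity is 11, and each expansion grows it from n to 2n+1; HashMap's default initial capacity is 16, and each expansion doubles it.

Hashtable does not allow a null key (or a null value), while HashMap special-cases a null key by treating its hash as 0.

Hashtable uses key.hashCode() directly as the key's hash, without the high-bit XOR step that HashMap applies, and derives the index as (hash & 0x7FFFFFFF) % table.length.

Hashtable's put and get methods are declared synchronized, so the class is thread-safe, and it can still be used when thread safety is a concern.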
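A few of these differences can be checked directly. The sketch below is only an illustration: acceptsNullKey exercises the real classes, while hashtableIndex and nextCapacity are hypothetical helpers that restate the formulas described above rather than calling into Hashtable's private code.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class HashtableDiffDemo {
    // Returns true if the given map accepts a null key.
    static boolean acceptsNullKey(Map<String, String> map) {
        try {
            map.put(null, "v");
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    // Hypothetical helper: index derived from the raw hashCode, sign bit cleared, then modulo.
    static int hashtableIndex(Object key, int tableLength) {
        return (key.hashCode() & 0x7FFFFFFF) % tableLength;
    }

    // Hypothetical helper: growth rule n -> 2n + 1.
    static int nextCapacity(int oldCapacity) {
        return (oldCapacity << 1) + 1;
    }

    public static void main(String[] args) {
        System.out.println(acceptsNullKey(new HashMap<>()));   // true
        System.out.println(acceptsNullKey(new Hashtable<>())); // false
        System.out.println(hashtableIndex("a", 11));           // 9
        System.out.println(nextCapacity(11));                  // 23
    }
}
```

Starting from the default capacity of 11, the growth rule gives 23 and then 47, whereas HashMap's table goes 16, 32, 64.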

5.HashSet

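HashSet has a HashMap field, which is where it stores its data. When an element is added, the element itself is stored as a key of that map, with a shared placeholder object as the value.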
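The idea can be sketched in a few lines. This is a simplified illustration of a set backed by a map, not the JDK's HashSet source; the class name MapBackedSet is made up, and only add, contains and size are shown.

```java
import java.util.HashMap;

class MapBackedSet<E> {
    // Shared dummy value stored for every key; only the keys carry information.
    private static final Object PRESENT = new Object();

    private final HashMap<E, Object> map = new HashMap<>();

    // put returns the previous value, so null means the element was not present before.
    boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    boolean contains(Object o) {
        return map.containsKey(o);
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        MapBackedSet<String> set = new MapBackedSet<>();
        System.out.println(set.add("x")); // true
        System.out.println(set.add("x")); // false, already present
        System.out.println(set.size());   // 1
    }
}
```

Uniqueness of elements falls out of the uniqueness of map keys, so the set needs no duplicate-detection logic of its own.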

6.TreeSet

TreeSet actually stores its data in a TreeMap, again putting each element into the map as a key.

7.LinkedHashSet

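It extends HashSet, and its basic methods are HashSet's methods. The difference is that it is constructed on top of a LinkedHashMap, so iteration follows insertion order.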
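Since each Set inherits its behaviour from the map behind it, the three classes differ mainly in iteration order. A small comparison using only the public API; the class name SetOrderDemo and the helper fill are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetOrderDemo {
    // Adds the same elements (with one duplicate) and returns them in iteration order.
    static List<String> fill(Set<String> set) {
        for (String s : new String[] {"pear", "apple", "fig", "apple"}) {
            set.add(s);
        }
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        System.out.println(fill(new LinkedHashSet<>())); // [pear, apple, fig]
        System.out.println(fill(new TreeSet<>()));       // [apple, fig, pear]
        System.out.println(fill(new HashSet<>()).size()); // 3, order not guaranteed
    }
}
```

LinkedHashSet keeps insertion order, TreeSet sorts by the elements' ordering, and HashSet makes no promise about order; all three drop the duplicate "apple".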

Reposted from: https://my.oschina.net/u/128964/blog/1835553
