HashMap (JDK 1.7 implementation)

alex-zhou96

Category: Java language interview notes

Article link: https://blog.csdn.net/ZHOUJIAN_TANK/article/details/104453117

JDK 1.7: array + linked list

JDK 1.8: array + linked list + red-black tree

How hash collisions are resolved:

Separate chaining (linked lists)

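0. Summary

HashMap holds an internal hash table, the array table. Each element table[i] points to a singly linked list. The hash computed from the key is reduced modulo the table length to get the array index bucketIndex, and the operation is then carried out on the singly linked list that table[bucketIndex] points to.

Compute the hash value

Get the index position bucketIndex

Operate on the singly linked list that table[bucketIndex] points to

Reads and writes use the key's hash to select a single list and never touch any other list. Within that list, hash values are compared first, and equals is called only when they match. This requires that equal objects return the same hashCode, which deserves special attention when the key is a user-defined class.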
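The equals/hashCode requirement can be illustrated with a small key class. This is a minimal sketch: PointKey and lookupDemo are hypothetical names invented for the example, not part of the JDK or the original article.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical key type, used only for illustration.
final class PointKey {
    private final int x;
    private final int y;

    PointKey(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof PointKey)) return false;
        PointKey other = (PointKey) o;
        return x == other.x && y == other.y;
    }

    // Equal keys must return equal hash codes; otherwise the lookup may go
    // to a different bucket and equals is never consulted.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }

    // Stores a value under one instance and reads it back through an
    // equal but distinct instance.
    static String lookupDemo() {
        Map<PointKey, String> map = new HashMap<>();
        map.put(new PointKey(1, 2), "a");
        return map.get(new PointKey(1, 2));
    }

    public static void main(String[] args) {
        System.out.println(lookupDemo());
    }
}
```

If hashCode were left at the Object default, the second PointKey(1, 2) would almost always hash to a different value and the lookup would return null even though equals reports the two keys as equal.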

JDK 1.8 optimizes this design; keep the differences in mind.

HashMap is not thread-safe.

1. Internal structure of HashMap in JDK 7

    // Default number of hash buckets; must be a power of two
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

    // Maximum number of hash buckets
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Default load factor
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    // Empty table
    static final Entry<?,?>[] EMPTY_TABLE = {};

    /**
     * The table, resized as necessary. Length MUST Always be a power of two.
     */
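    // table is an array of Entry, called the hash table or bucket array.
    // Each element points to a singly linked list, and each node of the list holds one key-value pair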
    transient Entry<K,V>[] table = (Entry<K,V>[]) EMPTY_TABLE;

    /**
     * The number of key-value mappings contained in this map.
     */
    // Number of key-value pairs actually stored
    transient int size;

    /**
     * The next size value at which to resize (capacity * load factor).
     * @serial
     */
    // If table == EMPTY_TABLE then this is the initial capacity at which the
    // table will be created when inflated.
    int threshold;

    /**
     * The load factor for the hash table.
     *
     * @serial
     */
    final float loadFactor;

  
    transient int modCount;

    /**
     * The default threshold of map capacity above which alternative hashing is
     * used for String keys. Alternative hashing reduces the incidence of
     * collisions due to weak hash code calculation for String keys.
     * <p/>
     * This value may be overridden by defining the system property
     * {@code jdk.map.althashing.threshold}. A property value of {@code 1}
     * forces alternative hashing to be used at all times whereas
     * {@code -1} value ensures that alternative hashing is never used.
     */
    static final int ALTERNATIVE_HASHING_THRESHOLD_DEFAULT = Integer.MAX_VALUE;

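Entry<K,V>[] table: an array of Entry, called the hash table or bucket array. Each element points to a singly linked list, and each node of that list holds one key-value pair. Entry is a nested class.

threshold: the size at which the table is grown, computed as table.length * loadFactor. When the number of key-value pairs (size) is greater than or equal to threshold, a resize is considered.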

 static class Entry<K,V> implements Map.Entry<K,V> {
        final K key;      // key
        V value;          // value
        Entry<K,V> next;  // next node in the same bucket
        int hash;         // cached hash of the key
        // constructor and methods omitted
 }

2. Constructors

2.1 public HashMap()

// Default capacity 16, load factor 0.75
public HashMap() {
        this(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR);
    }

2.2 public HashMap(int initialCapacity)

public HashMap(int initialCapacity) {
        this(initialCapacity, DEFAULT_LOAD_FACTOR);
    }

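2.3 public HashMap(int initialCapacity, float loadFactor)

3. Storing a key-value pair: put(key, value)

Both key and value may be null.

Compute the hash of the key
Derive the storage position from the hash (modulo)
Insert at the head of the list at that position if the key is not yet present, or update the existing value
Grow table if needed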

public V put(K key, V value) {
        // On the first put, call inflateTable() to allocate the actual table
        if (table == EMPTY_TABLE) {
            inflateTable(threshold);
        }
        // A null key is handled separately by putForNullKey
        if (key == null)
            return putForNullKey(value);
    
        // Non-null key:
        // compute the hash, which applies extra bit operations on top of the key's own hashCode for a more random, even spread
        int hash = hash(key);
        int i = indexFor(hash, table.length); // which slot of table this pair belongs to

        // Walk the list to check whether the key is already present
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            // Compare hash values first; call equals only when they match
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                // Found: overwrite the value in the existing Entry
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }

        // modCount counts structural modifications, as in ArrayList and LinkedList, so iterators can detect them
        modCount++;
        // Key not found: addEntry adds the pair at the given position
        addEntry(hash, key, value, i);
        return null;
    }
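The observable behavior of put can be checked against java.util.HashMap directly. The sketch below only uses the public Map API; PutDemo and run are hypothetical names chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;

final class PutDemo {
    // Exercises the return value of put and the handling of null keys and values.
    static String run() {
        Map<String, String> map = new HashMap<>();
        String first = map.put("k", "v1");   // no previous mapping, so put returns null
        String second = map.put("k", "v2");  // existing key, so put returns the old value
        map.put(null, "forNull");            // a null key is allowed
        map.put("nullValue", null);          // a null value is allowed
        return first + "," + second + "," + map.get(null) + ","
                + map.size() + "," + map.containsKey("nullValue");
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints null,v1,forNull,3,true
    }
}
```

Because a stored value may itself be null, a null result from get or put does not prove that the key was absent; containsKey distinguishes the two cases.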

// inflateTable method
private void inflateTable(int toSize) {
        // Find a power of 2 >= toSize

        // By default capacity is 16, threshold becomes 12, and table is allocated as an Entry array of length 16
        int capacity = roundUpToPowerOf2(toSize);
        // threshold = capacity * loadFactor (capacity * 3/4 with the default load factor)
        threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
        table = new Entry[capacity];
        initHashSeedAsNeeded(capacity);
    }
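The listing calls roundUpToPowerOf2 without showing its body. The sketch below is one way to write such a helper with Integer.highestOneBit, together with the threshold formula from inflateTable. CapacityDemo and thresholdFor are hypothetical names, and the helper body is an assumption rather than a quotation of the JDK source.

```java
final class CapacityDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Smallest power of two that is >= number, capped at MAXIMUM_CAPACITY.
    static int roundUpToPowerOf2(int number) {
        if (number >= MAXIMUM_CAPACITY) {
            return MAXIMUM_CAPACITY;
        }
        // (number - 1) << 1 has its highest set bit exactly at the target power of two
        return number > 1 ? Integer.highestOneBit((number - 1) << 1) : 1;
    }

    // Same formula as inflateTable: capacity * loadFactor, capped.
    static int thresholdFor(int capacity, float loadFactor) {
        return (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }

    public static void main(String[] args) {
        System.out.println(roundUpToPowerOf2(10));    // 16
        System.out.println(roundUpToPowerOf2(17));    // 32
        System.out.println(thresholdFor(16, 0.75f));  // 12
    }
}
```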




// hash method
  final int hash(Object k) {
        int h = hashSeed;
        if (0 != h && k instanceof String) {
            return sun.misc.Hashing.stringHash32((String) k);
        }

        h ^= k.hashCode();

        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }


// indexFor method
// Computes which slot of table a key-value pair belongs to
    static int indexFor(int h, int length) {
        // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
        // length is a power of two in HashMap, so h & (length-1) equals h % length for non-negative h and is never negative
        return h & (length-1);
    }
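The equivalence between the mask and the remainder can be checked with a few sample values. This is a small sketch; IndexForDemo and maskMatchesFloorMod are hypothetical names. It compares against Math.floorMod rather than %, because % returns a negative result for a negative h.

```java
final class IndexForDemo {
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // True when the mask agrees with Math.floorMod for every sampled hash.
    static boolean maskMatchesFloorMod(int length) {
        int[] samples = {0, 1, 15, 16, 17, 12345, -1, -7,
                Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int h : samples) {
            if (indexFor(h, length) != Math.floorMod(h, length)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(indexFor(-7, 16));        // 9
        System.out.println(-7 % 16);                 // -7, not usable as an array index
        System.out.println(maskMatchesFloorMod(16)); // true
        System.out.println(maskMatchesFloorMod(12)); // false: 12 is not a power of two
    }
}
```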



// addEntry method
void addEntry(int hash, K key, V value, int bucketIndex) {
        // If size has reached threshold and the target bucket is already occupied
        if ((size >= threshold) && (null != table[bucketIndex])) {
            // grow table with resize; the growth policy is doubling
            resize(2 * table.length);
            // the table length changed, so recompute the hash and the bucket index
            hash = (null != key) ? hash(key) : 0;

            bucketIndex = indexFor(hash, table.length);
        }

        // With or without a resize, createEntry adds the new entry
        createEntry(hash, key, value, bucketIndex);

    }


// resize method
    void resize(int newCapacity) {
        // Keep a reference to the old table
        Entry[] oldTable = table;
        // Length of the old table
        int oldCapacity = oldTable.length;

        // Already at the maximum capacity: stop growing
        if (oldCapacity == MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return;
        }

        Entry[] newTable = new Entry[newCapacity];

        // transfer moves the existing key-value pairs into the new table
        transfer(newTable, initHashSeedAsNeeded(newCapacity));

        table = newTable;
        // Compute the new threshold
        threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }



// Walks every existing key-value pair, computes its new position, and stores it there
    void transfer(Entry[] newTable, boolean rehash) {
        int newCapacity = newTable.length;
        for (Entry<K,V> e : table) {
            while(null != e) {
                Entry<K,V> next = e.next;
                if (rehash) {
                    e.hash = null == e.key ? 0 : hash(e.key);
                }
                int i = indexFor(e.hash, newCapacity);
                e.next = newTable[i];
                newTable[i] = e;
                e = next;
            }
        }
    }
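Because transfer inserts each node at the head of its new bucket, nodes that land in the same bucket end up in reverse order. The sketch below isolates that loop on a simplified node type. TransferDemo and Node are hypothetical names, and it assumes every node maps to the same new bucket.

```java
import java.util.ArrayList;
import java.util.List;

final class TransferDemo {
    // Simplified stand-in for HashMap.Entry: only a key and a next pointer.
    static final class Node {
        final int key;
        Node next;

        Node(int key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    // Same pointer moves as the inner while loop of transfer,
    // with a single destination bucket.
    static Node moveWithHeadInsertion(Node oldHead) {
        Node newHead = null;
        Node e = oldHead;
        while (e != null) {
            Node next = e.next; // remember the rest of the old list
            e.next = newHead;   // link the node in front of the new list
            newHead = e;
            e = next;
        }
        return newHead;
    }

    static List<Integer> keys(Node head) {
        List<Integer> out = new ArrayList<>();
        for (Node n = head; n != null; n = n.next) {
            out.add(n.key);
        }
        return out;
    }

    static List<Integer> demo() {
        Node list = new Node(1, new Node(2, new Node(3, null)));
        return keys(moveWithHeadInsertion(list));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [3, 2, 1]
    }
}
```

This order reversal is commonly cited as the reason two threads resizing a JDK 1.7 HashMap at the same time can leave a bucket with a cyclic list.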




 void createEntry(int hash, K key, V value, int bucketIndex) {
        // Create a new Entry, insert it at the head of the singly linked list, and increment size; note that JDK 1.8 appends at the tail instead
        Entry<K,V> e = table[bucketIndex];
        table[bucketIndex] = new Entry<>(hash, key, value, e);
        size++;
    }

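4. Getting a value: get(key)

Compute the hash of the key
Use the hash to find the corresponding list in table
Search that list
Compare node by node: a quick hash comparison first, then equals when the hashes match

// HashMap supports a null key; it is stored in table[0] and read back through getForNullKey()
    public V get(Object key) {
        // null key
        if (key == null)
            return getForNullKey();

        // Look up the Entry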
        Entry<K,V> entry = getEntry(key);

        return null == entry ? null : entry.getValue();
    }


final Entry<K,V> getEntry(Object key) {
        if (size == 0) {
            return null;
        }

        // Compute the hash
        int hash = (key == null) ? 0 : hash(key);

        // Use the hash to find the corresponding list in table,
        // then walk the list
        for (Entry<K,V> e = table[indexFor(hash, table.length)];
             e != null;
             e = e.next) {
            Object k;
            // Compare node by node: hash first, then equals when the hashes match
            if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                return e;
        }
        return null;
    }

5. Removing a key-value pair

public V remove(Object key) {
        Entry<K,V> e = removeEntryForKey(key);
        return (e == null ? null : e.value);
    }



// Finds the slot in the array, then walks the list and unlinks the node
    final Entry<K,V> removeEntryForKey(Object key) {
        if (size == 0) {
            return null;
        }
        // Compute the hash
        int hash = (key == null) ? 0 : hash(key);
        // Index of the bucket for this key
        int i = indexFor(hash, table.length);
        // Walk table[i] looking for the node to remove: prev points to the previous node,
        // next to the following node, and e to the current node
        Entry<K,V> prev = table[i];
        Entry<K,V> e = prev;

        while (e != null) {
            Entry<K,V> next = e.next;
            Object k;
            // Found: unlink it
            if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k)))) {
                modCount++;
                size--;
                if (prev == e)
                    table[i] = next;
                else
                    prev.next = next;
                e.recordRemoval(this);
                return e;
            }
            prev = e;
            e = next;
        }

        return e;
    }

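Interview questions

1. Why is the initial array size a power of two?

For example, 16 or 32.

It lets the index be computed with a bitwise AND, which is equivalent to the remainder operation and faster.

2. Why are shift operations applied after hashCode?

h: 0101 0101

15: 0000 1111

With a table of length 16, h & 15 keeps only the low four bits of h. The shifts fold the high bits of hashCode into the low bits, so the high bits also take part in the index and the hash values do not cluster.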
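The effect of the shifts can be measured on hash codes that differ only in their high bits. The sketch below copies the two shift lines from the hash method and takes hashSeed as 0, which is an assumption. HashSpreadDemo, spread, and distinctBuckets are hypothetical names.

```java
import java.util.HashSet;
import java.util.Set;

final class HashSpreadDemo {
    // The two shift-and-xor lines of the JDK 1.7 hash method, with hashSeed taken as 0.
    static int spread(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Counts the distinct buckets hit by the hash codes i << 16 for i in [0, count).
    static int distinctBuckets(int count, int length, boolean useSpread) {
        Set<Integer> buckets = new HashSet<>();
        for (int i = 0; i < count; i++) {
            int h = i << 16; // the low 16 bits are all zero
            buckets.add(indexFor(useSpread ? spread(h) : h, length));
        }
        return buckets.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctBuckets(16, 16, false)); // 1: every key collides
        System.out.println(distinctBuckets(16, 16, true));  // 16: one key per bucket
    }
}
```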

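Methods for resolving hash collisions

The usual methods are open addressing, separate chaining, rehashing, and a common overflow area.

2.1 Open addressing

Starting from the slot where the collision occurred, probe the table in a fixed order until a free slot is found, and store the colliding element there. Open addressing requires the table length to be at least the number of elements to store.
The probing strategies used with open addressing are linear probing, quadratic probing, and double hashing.
A drawback of open addressing is that an element cannot be physically removed on deletion, because that would break later lookups; the slot can only be marked as deleted. It is truly reclaimed only when a later insertion reuses it.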

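2.1.1 Linear probing

Linear probing is the simplest collision strategy in open addressing. Starting from the colliding slot, it checks each following slot in turn, wrapping around to the start of the table after the last slot, until it finds a free slot or has probed every slot.
See the Flash demonstration of this method referenced on CSDN:
http://student.zjzk.cn/course_ware/data_structure/web/flash/cz/kfdzh.swf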
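Linear probing and the deletion marker used by open addressing can be sketched as a small set of int keys. LinearProbingSet and its members are hypothetical names for the example. The table has a fixed capacity and is never resized, which keeps the sketch short.

```java
final class LinearProbingSet {
    private static final int EMPTY = 0;
    private static final int FULL = 1;
    private static final int DELETED = 2; // marker left behind by remove

    private final int[] keys;
    private final int[] state;

    LinearProbingSet(int capacity) {
        keys = new int[capacity];
        state = new int[capacity];
    }

    private int home(int key) {
        return Math.floorMod(key, keys.length);
    }

    // Returns the slot holding key, or -1 if it is absent.
    private int find(int key) {
        int i = home(key);
        for (int probes = 0; probes < keys.length; probes++) {
            if (state[i] == EMPTY) return -1;  // a never-used slot ends the search
            if (state[i] == FULL && keys[i] == key) return i;
            i = (i + 1) % keys.length;         // DELETED slots are skipped, not trusted
        }
        return -1;
    }

    boolean contains(int key) {
        return find(key) >= 0;
    }

    // Inserts key; returns false if it is already present or the table is full.
    boolean add(int key) {
        if (contains(key)) return false;
        int i = home(key);
        for (int probes = 0; probes < keys.length; probes++) {
            if (state[i] != FULL) {            // EMPTY and DELETED slots are reusable
                keys[i] = key;
                state[i] = FULL;
                return true;
            }
            i = (i + 1) % keys.length;
        }
        return false;
    }

    // Marks the slot as DELETED instead of clearing it, so that later keys
    // in the same probe sequence stay reachable.
    boolean remove(int key) {
        int i = find(key);
        if (i < 0) return false;
        state[i] = DELETED;
        return true;
    }
}
```

With capacity 7, the keys 3, 10, and 17 all start at slot 3 and occupy slots 3, 4, and 5. After remove(10), contains(17) still succeeds because the search steps over the marker in slot 4, and a later add(24) reuses that slot.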

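2.1.2 Quadratic probing

On a collision at slot d[i], quadratic probing tries d[i] + 1², d[i] + 2², d[i] + 3², and so on (modulo the table length) until a free slot is found.
In practice quadratic probing cannot reach every remaining slot, but reaching half of them is usually enough. If no free slot is found after probing half the slots, the table is too full and should be rebuilt.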
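The claim that only about half the slots are reachable can be checked by counting. The sketch assumes a prime table length m, which is the usual setting for this result. QuadraticProbeDemo and reachableSlots are hypothetical names.

```java
final class QuadraticProbeDemo {
    // Counts the distinct slots visited by (home + i*i) % m for i = 0 .. m-1.
    // Assumes 0 <= home < m and a small m, so i*i does not overflow.
    static int reachableSlots(int home, int m) {
        boolean[] seen = new boolean[m];
        int count = 0;
        for (int i = 0; i < m; i++) {
            int slot = (home + i * i) % m;
            if (!seen[slot]) {
                seen[slot] = true;
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // For the prime length 13, 7 of the 13 slots are reachable: (13 + 1) / 2.
        System.out.println(reachableSlots(5, 13));
    }
}
```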

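2.1.3 Double hashing

This method uses two hash functions, h1 and h2. h1 is the same as the h used earlier: it takes the key and produces a number between 0 and m-1 as the home address. h2 also takes the key and produces a number between 1 and m-1 that is coprime with m (it shares no common factor with m other than 1), used as the address increment (step) of the probe sequence. For comparison: in linear probing the step is the constant 1; in quadratic probing the step at the i-th probe is 2i-1; in double hashing the step is the value of the second hash function for the same key.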
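A probe sequence under double hashing can be sketched as follows. The choice h2(key) = 1 + key mod (m-1) is a common textbook option and an assumption here, as is the prime table length, which makes every step coprime with m. DoubleHashingDemo and probeSequence are hypothetical names.

```java
final class DoubleHashingDemo {
    // Home address in [0, m-1].
    static int h1(int key, int m) {
        return Math.floorMod(key, m);
    }

    // Step in [1, m-1]; coprime with m whenever m is prime.
    static int h2(int key, int m) {
        return 1 + Math.floorMod(key, m - 1);
    }

    // The m slots probed for key, in order.
    static int[] probeSequence(int key, int m) {
        int[] seq = new int[m];
        int start = h1(key, m);
        int step = h2(key, m);
        for (int i = 0; i < m; i++) {
            seq[i] = (start + i * step) % m;
        }
        return seq;
    }

    public static void main(String[] args) {
        // key 32, m 13: home 6, step 9, so the sequence starts 6, 2, 11, 7, ...
        System.out.println(java.util.Arrays.toString(probeSequence(32, 13)));
    }
}
```

Because the step is coprime with m, the sequence visits every slot exactly once before repeating, which quadratic probing does not guarantee.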

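2.2 Separate chaining

Separate chaining links all elements with the same hash value into a singly linked list of synonyms and stores the head pointer of that list in slot i of the hash table. Lookup, insertion, and deletion then happen mainly within the synonym list. Chaining suits workloads with frequent insertions and deletions.
Take the keys (32, 40, 36, 53, 16, 46, 71, 27, 42, 24, 49, 64), a table length of 13, and the hash function H(key) = key % 13. Chaining gives: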

0
1 -> 40 -> 53 -> 27
2
3 -> 16 -> 42
4
5
6 -> 32 -> 71
7 -> 46
8
9
10 -> 36 -> 49
11 -> 24
12 -> 64
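Note: in Java, HashMap resolves hash collisions with separate chaining. JDK 1.7 stores synonyms purely in singly linked lists, while JDK 1.8 uses a hybrid: a bucket whose list grows longer than 8 nodes is converted to a red-black tree, provided the table has at least 64 buckets (otherwise the table is resized instead).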
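The chaining example with H(key) = key % 13 can be reproduced in a few lines. This is a minimal sketch that appends each key at the tail of its bucket in input order and assumes non-negative keys. ChainingDemo and build are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class ChainingDemo {
    // Builds m buckets and appends each key to bucket key % m.
    static List<List<Integer>> build(int[] keys, int m) {
        List<List<Integer>> table = new ArrayList<>();
        for (int i = 0; i < m; i++) {
            table.add(new ArrayList<>());
        }
        for (int key : keys) {
            table.get(key % m).add(key); // tail insertion keeps input order
        }
        return table;
    }

    public static void main(String[] args) {
        int[] keys = {32, 40, 36, 53, 16, 46, 71, 27, 42, 24, 49, 64};
        List<List<Integer>> table = build(keys, 13);
        for (int i = 0; i < table.size(); i++) {
            System.out.println(i + " " + table.get(i));
        }
    }
}
```

With tail insertion, bucket 1 holds 40, 53, 27 in that order. Head insertion, as in the JDK 1.7 createEntry, would give 27, 53, 40.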

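2.3 Rehashing

Construct several different hash functions:
Hi = RHi(key), i = 1, 2, 3, …, k
When H1 = RH1(key) collides, compute H2 = RH2(key), and so on until no collision occurs. This method is less prone to clustering but costs more computation time.

2.4 Common overflow area

Split the hash table into a primary table and an overflow table. Whenever a collision occurs, the colliding element is placed in the overflow area.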
