HashMap中有一些我们容易忽视的点
1. 关于key的hash和equals
- public V put(K key, V value) {
- if (table == EMPTY_TABLE) {
- inflateTable(threshold);
- }
- if (key == null)
- return putForNullKey(value);
- int hash = hash(key);
- int i = indexFor(hash, table.length);
- for (Entry<K,V> e = table[i]; e != null; e = e.next) {
- Object k;
- if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
- V oldValue = e.value;
- e.value = value;
- e.recordAccess(this);
- return oldValue;
- }
- }
- modCount++;
- addEntry(hash, key, value, i);
- return null;
- }
由上述代码知道,hash值是用来确定bucketIndex,equals是用来和链表上的值比较,因此对于key是自定义的类,强烈建议重写hashCode和equals方法。此处也是容易引起内存泄露的点。
下面摘抄一段JDK里面的注释,
- /**
- * Returns a hash code value for the object. This method is
- * supported for the benefit of hash tables such as those provided by
- * {@link java.util.HashMap}.
- * <p>
- * The general contract of {@code hashCode} is:
- * <ul>
- * <li>Whenever it is invoked on the same object more than once during
- * an execution of a Java application, the {@code hashCode} method
- * must consistently return the same integer, provided no information
- * used in {@code equals} comparisons on the object is modified.
- * This integer need not remain consistent from one execution of an
- * application to another execution of the same application.
- * <li>If two objects are equal according to the {@code equals(Object)}
- * method, then calling the {@code hashCode} method on each of
- * the two objects must produce the same integer result.
- * <li>It is <em>not</em> required that if two objects are unequal
- * according to the {@link java.lang.Object#equals(java.lang.Object)}
- * method, then calling the {@code hashCode} method on each of the
- * two objects must produce distinct integer results. However, the
- * programmer should be aware that producing distinct integer results
- * for unequal objects may improve the performance of hash tables.
- * </ul>
- * <p>
- * As much as is reasonably practical, the hashCode method defined by
- * class {@code Object} does return distinct integers for distinct
- * objects. (This is typically implemented by converting the internal
- * address of the object into an integer, but this implementation
- * technique is not required by the
- * Java<font size="-2"><sup>TM</sup></font> programming language.)
- *
- * @return a hash code value for this object.
- * @see java.lang.Object#equals(java.lang.Object)
- * @see java.lang.System#identityHashCode
- */
- public native int hashCode();
2. rehash的条件
- void addEntry(int hash, K key, V value, int bucketIndex) {
- if ((size >= threshold) && (null != table[bucketIndex])) {
- resize(2 * table.length);
- hash = (null != key) ? hash(key) : 0;
- bucketIndex = indexFor(hash, table.length);
- }
- createEntry(hash, key, value, bucketIndex);
- }
if条件告诉我们rehash的条件要同时满足两个:map中元素个数不小于阀值即容量*负载因子,对应的bucketIndex处有元素。
另外,如下代码作备忘,
- static int indexFor(int h, int length) {
- // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
- return h & (length-1);
- }
3. 可以插入(null, null)吗
- Map<String, String> map = new HashMap<String, String>();
- map.put(null, null);
- System.out.println(map.size()); // 1
- private V putForNullKey(V value) {
- for (Entry<K,V> e = table[0]; e != null; e = e.next) {
- if (e.key == null) {
- V oldValue = e.value;
- e.value = value;
- e.recordAccess(this);
- return oldValue;
- }
- }
- modCount++;
- addEntry(0, null, value, 0); // hash = 0, bucketIndex = 0
- return null;
- }
注意,Hashtable和ConcurrentHashMap进行put时若value为null,将抛出NullPointerException。
4. table默认初始大小 - 16
- public HashMap(int initialCapacity, float loadFactor) {
- // ...
- this.loadFactor = loadFactor; // 0.75
- threshold = initialCapacity; // 16
- init(); // nothing
- }
- public V put(K key, V value) {
- if (table == EMPTY_TABLE) {
- inflateTable(threshold);
- }
- // ...
- }
- private void inflateTable(int toSize) {
- // Find a power of 2 >= toSize
- int capacity = roundUpToPowerOf2(toSize); // 16
- threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
- table = new Entry[capacity];
- initHashSeedAsNeeded(capacity);
- }
- private static int roundUpToPowerOf2(int number) {
- // assert number >= 0 : "number must be non-negative";
- return number >= MAXIMUM_CAPACITY
- ? MAXIMUM_CAPACITY
- : (number > 1) ? Integer.highestOneBit((number - 1) << 1) : 1;
- }
5. 关于HashMap里的hash(Object key)方法
- final int hash(Object k) {
- int h = hashSeed;
- if (0 != h && k instanceof String) {
- return sun.misc.Hashing.stringHash32((String) k);
- }
- h ^= k.hashCode();
- // This function ensures that hashCodes that differ only by
- // constant multiples at each bit position have a bounded
- // number of collisions (approximately 8 at default load factor).
- h ^= (h >>> 20) ^ (h >>> 12);
- return h ^ (h >>> 7) ^ (h >>> 4);
- }
- /**
- * Initialize the hashing mask value. We defer initialization until we
- * really need it.
- */
- final boolean initHashSeedAsNeeded(int capacity) {
- boolean currentAltHashing = hashSeed != 0;
- boolean useAltHashing = sun.misc.VM.isBooted() &&
- (capacity >= Holder.ALTERNATIVE_HASHING_THRESHOLD);
- boolean switching = currentAltHashing ^ useAltHashing;
- if (switching) {
- hashSeed = useAltHashing
- ? sun.misc.Hashing.randomHashSeed(this)
- : 0;
- }
- return switching;
- }
6.从hashmap的put方法的具体逻辑来看,里面充斥着线程不安全因素:
1)map为空时,同时初始化map
2)hash冲突时,同时操作链表
3)同时rehash时造成机器load高
7.元素为什么放在链表的头部:因为单链表,放头部最快
8.LinkedHashMap的实现:继承于HashMap,内部的Entry也继承于HashMap.Entry,增加了属性before,after,另外,重写了get,addEntry,createEntry方法,提供了iterator相关方法。
9.TreeMap的实现,重点看put方法
- public V put(K key, V value) {
- Entry<K,V> t = root;
- if (t == null) {
- compare(key, key); // type (and possibly null) check
- root = new Entry<>(key, value, null);
- size = 1;
- modCount++;
- return null;
- }
- int cmp;
- Entry<K,V> parent;
- // split comparator and comparable paths
- Comparator<? super K> cpr = comparator;
- if (cpr != null) {
- do {
- parent = t;
- cmp = cpr.compare(key, t.key);
- if (cmp < 0)
- t = t.left;
- else if (cmp > 0)
- t = t.right;
- else
- return t.setValue(value);
- } while (t != null);
- }
- else {
- if (key == null)
- throw new NullPointerException();
- Comparable<? super K> k = (Comparable<? super K>) key;
- do {
- parent = t;
- cmp = k.compareTo(t.key);
- if (cmp < 0)
- t = t.left;
- else if (cmp > 0)
- t = t.right;
- else
- return t.setValue(value);
- } while (t != null);
- }
- Entry<K,V> e = new Entry<>(key, value, parent);
- if (cmp < 0)
- parent.left = e;
- else
- parent.right = e;
- fixAfterInsertion(e);
- size++;
- modCount++;
- return null;
- }
10. jdk7 ConcurrentHashMap的get,size,put方法
- public V get(Object key) {
- Segment<K,V> s; // manually integrate access methods to reduce overhead
- HashEntry<K,V>[] tab;
- int h = hash(key);
- long u = (((h >>> segmentShift) & segmentMask) << SSHIFT) + SBASE;
- if ((s = (Segment<K,V>)UNSAFE.getObjectVolatile(segments, u)) != null &&
- (tab = s.table) != null) {
- for (HashEntry<K,V> e = (HashEntry<K,V>) UNSAFE.getObjectVolatile
- (tab, ((long)(((tab.length - 1) & h)) << TSHIFT) + TBASE);
- e != null; e = e.next) {
- K k;
- if ((k = e.key) == key || (e.hash == h && key.equals(k)))
- return e.value;
- }
- }
- return null;
- }
- public int size() {
- // Try a few times to get accurate count. On failure due to
- // continuous async changes in table, resort to locking.
- final Segment<K,V>[] segments = this.segments;
- int size;
- boolean overflow; // true if size overflows 32 bits
- long sum; // sum of modCounts
- long last = 0L; // previous sum
- int retries = -1; // first iteration isn't retry
- try {
- for (;;) {
- if (retries++ == RETRIES_BEFORE_LOCK) {
- for (int j = 0; j < segments.length; ++j)
- ensureSegment(j).lock(); // force creation
- }
- sum = 0L;
- size = 0;
- overflow = false;
- for (int j = 0; j < segments.length; ++j) {
- Segment<K,V> seg = segmentAt(segments, j);
- if (seg != null) {
- sum += seg.modCount;
- int c = seg.count;
- if (c < 0 || (size += c) < 0)
- overflow = true;
- }
- }
- if (sum == last)
- break;
- last = sum;
- }
- } finally {
- if (retries > RETRIES_BEFORE_LOCK) {
- for (int j = 0; j < segments.length; ++j)
- segmentAt(segments, j).unlock();
- }
- }
- return overflow ? Integer.MAX_VALUE : size;
- }
- public V put(K key, V value) {
- Segment<K,V> s;
- if (value == null)
- throw new NullPointerException();
- int hash = hash(key);
- int j = (hash >>> segmentShift) & segmentMask;
- if ((s = (Segment<K,V>)UNSAFE.getObject // nonvolatile; recheck
- (segments, (j << SSHIFT) + SBASE)) == null) // in ensureSegment
- s = ensureSegment(j);
- return s.put(key, hash, value, false);
- }
- public V putIfAbsent(K key, V value) {
- Segment<K,V> s;
- if (value == null)
- throw new NullPointerException();
- int hash = hash(key);
- int j = (hash >>> segmentShift) & segmentMask;
- if ((s = (Segment<K,V>)UNSAFE.getObject
- (segments, (j << SSHIFT) + SBASE)) == null)
- s = ensureSegment(j);
- return s.put(key, hash, value, true);
- }
- final V put(K key, int hash, V value, boolean onlyIfAbsent) {
- HashEntry<K,V> node = tryLock() ? null :
- scanAndLockForPut(key, hash, value);
- V oldValue;
- try {
- HashEntry<K,V>[] tab = table;
- int index = (tab.length - 1) & hash;
- HashEntry<K,V> first = entryAt(tab, index);
- for (HashEntry<K,V> e = first;;) {
- if (e != null) {
- K k;
- if ((k = e.key) == key ||
- (e.hash == hash && key.equals(k))) {
- oldValue = e.value;
- if (!onlyIfAbsent) {
- e.value = value;
- ++modCount;
- }
- break;
- }
- e = e.next;
- }
- else {
- if (node != null)
- node.setNext(first);
- else
- node = new HashEntry<K,V>(hash, key, value, first);
- int c = count + 1;
- if (c > threshold && tab.length < MAXIMUM_CAPACITY)
- rehash(node);
- else
- setEntryAt(tab, index, node);
- ++modCount;
- count = c;
- oldValue = null;
- break;
- }
- }
- } finally {
- unlock();
- }
- return oldValue;
- }
- /**
- * Scans for a node containing given key while trying to
- * acquire lock, creating and returning one if not found. Upon
- * return, guarantees that lock is held. UNlike in most
- * methods, calls to method equals are not screened: Since
- * traversal speed doesn't matter, we might as well help warm
- * up the associated code and accesses as well.
- *
- * @return a new node if key not found, else null
- */
- private HashEntry<K,V> scanAndLockForPut(K key, int hash, V value) {
- HashEntry<K,V> first = entryForHash(this, hash);
- HashEntry<K,V> e = first;
- HashEntry<K,V> node = null;
- int retries = -1; // negative while locating node
- while (!tryLock()) {
- HashEntry<K,V> f; // to recheck first below
- if (retries < 0) {
- if (e == null) {
- if (node == null) // speculatively create node
- node = new HashEntry<K,V>(hash, key, value, null);
- retries = 0;
- }
- else if (key.equals(e.key))
- retries = 0;
- else
- e = e.next;
- }
- else if (++retries > MAX_SCAN_RETRIES) {
- lock();
- break;
- }
- else if ((retries & 1) == 0 &&
- (f = entryForHash(this, hash)) != first) {
- e = first = f; // re-traverse if entry changed
- retries = -1;
- }
- }
- return node;
- }