Collection map

dinghui1928

Tags: data structures and algorithms, java

Original link: http://www.cnblogs.com/gschain/p/11243059.html

How HashMap works, with code

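Backed by an array plus linked lists (since JDK 8, a long chain in one bucket is converted to a red-black tree). It accepts a null key and null values, and it is not thread-safe.

The default initial size is 16. Resize: newsize = oldsize*2, so the size is always a power of two.

A resize applies to the whole Map: every element in the old array has its position recomputed and is placed into the new array.

The resize check runs after an insertion, so a resize can be wasted: if an insertion triggers a resize and nothing is inserted afterwards, the extra space is never used.

A resize is triggered when the number of elements in the Map exceeds 75% of the Entry array length. This keeps chains short and spreads elements more evenly.

Index computation: index = hash & (tab.length - 1)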
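To make the index formula concrete, here is a minimal sketch of how a key ends up in a bucket. The class and method names (BucketIndexDemo, spread, indexFor) are invented for this example. The spreading step mirrors what JDK 8's HashMap.hash() does; other JDK versions mix the bits differently, so treat it as an illustration rather than the exact code of every release.

```java
public class BucketIndexDemo {
    // Spreading step as in JDK 8: XOR the high 16 bits of hashCode() into the low 16 bits.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // index = hash & (length - 1); this only works when length is a power of two.
    static int indexFor(Object key, int length) {
        return spread(key) & (length - 1);
    }

    public static void main(String[] args) {
        int n = 16; // default table length
        System.out.println(indexFor(31, n));   // 15
        System.out.println(indexFor(63, n));   // 15, collides with 31
        System.out.println(indexFor(3104, n)); // 0
        System.out.println(indexFor("aa", n)); // 0, "aa".hashCode() is 3104
    }
}
```

The keys 31, 63 and 3104 are the same ones used in the examples further down, which is why 31 and 63 share a bucket there.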

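The initial size of a HashMap should also take the load factor into account:

Hash collisions: when the hash values of several keys, reduced modulo the array size, land on the same array index, they form an Entry chain. A lookup then has to walk the chain and call equals() on each element.

Load factor: to lower the probability of collisions, a HashMap by default resizes once the number of key-value pairs reaches 75% of the array size. So for an estimated 100 entries, the array size needs to be 100/0.75 = 133.3, rounded up to 134. (HashMap itself rounds a requested capacity up to the next power of two, here 256.)

Trading space for time: to speed up key lookups further, lower the load factor and raise the initial size, which reduces the probability of collisions.

Two important parameters:

Put simply, capacity is the number of buckets, and load factor is the maximum ratio to which the buckets may fill up. If iteration performance matters a lot, do not set the capacity too high or the load factor too low. When the number of entries exceeds capacity*load factor, the bucket array is resized to twice its current size.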
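The sizing arithmetic above can be sketched as follows. CapacityDemo and its three helper methods are hypothetical names for this example; nextPowerOfTwo is a simplified stand-in for the rounding HashMap performs internally, not a copy of the JDK code, and it ignores overflow for very large inputs.

```java
public class CapacityDemo {
    // Smallest capacity whose threshold (capacity * loadFactor) covers expectedSize.
    static int requiredCapacity(int expectedSize, float loadFactor) {
        return (int) Math.ceil(expectedSize / (double) loadFactor);
    }

    // Round up to the next power of two (simplified; assumes n is at most 2^30).
    static int nextPowerOfTwo(int n) {
        return (n <= 1) ? 1 : Integer.highestOneBit(n - 1) << 1;
    }

    // Number of entries above which a table of this capacity is resized.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        int wanted = requiredCapacity(100, 0.75f);
        System.out.println(wanted);                 // 134
        System.out.println(nextPowerOfTwo(wanted)); // 256
        System.out.println(threshold(16, 0.75f));   // 12
        System.out.println(threshold(256, 0.75f));  // 192
    }
}
```

In practice this means new HashMap<>(134) should hold 100 entries without a resize, at the cost of a 256-slot table.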

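Implementation of put

The rough idea of put is:

1. Hash the key's hashCode(), then compute the index;

2. If there is no collision, place the entry directly in the bucket;

3. If there is a collision, append the entry to the bucket's linked list;

4. If collisions make the list too long (TREEIFY_THRESHOLD (8) or more), convert the list to a red-black tree (in JDK 8 this only happens once the table has at least 64 buckets; a smaller table is resized instead);

5. If the key already exists, replace the old value (this keeps keys unique);

6. If the map is too full (size exceeds load factor*current capacity), resize.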

static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
static final float DEFAULT_LOAD_FACTOR = 0.75f;

// In resize(): the default threshold is 0.75 * 16 = 12
newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);

public V put(K key, V value) {
    // Hash the key's hashCode()
    return putVal(hash(key), key, value, false, true);
}

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    // Create the table if it is empty
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    // Compute the index; an empty bucket takes the new node directly
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        // The first node in the bucket already holds this key
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        // The bucket is a tree
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        // The bucket is a linked list
        else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        // Overwrite the value of an existing mapping
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    // Resize once size exceeds load factor * current capacity
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}

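Implementation of get

Once put is understood, get is simple. The rough idea:

1. If the first node in the bucket matches, return it directly;

2. If there are collisions, find the matching entry with key.equals(k):

in a tree, the search is O(log n);

in a linked list, the search is O(n).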

public V get(Object key) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}

final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
        // Direct hit on the first node
        if (first.hash == hash && // always check first node
            ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        // Miss: search the rest of the bucket
        if ((e = first.next) != null) {
            // Search in the tree
            if (first instanceof TreeNode)
                return ((TreeNode<K,V>)first).getTreeNode(hash, key);
            // Search in the linked list
            do {
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null);
        }
    }
    return null;
}

// Integer.hashCode(): the hash code of an Integer is its int value
public static int hashCode(int value) {
    return value;
}

// "aa".hashCode() is 3104, the same hash code as Integer 3104
HashMap hashMap = new HashMap();
hashMap.put("aa", 1);
hashMap.put("aa", 2);    // same key: replaces the value 1
hashMap.put(3104, "dd"); // same hash but not equals() to "aa": a second entry
hashMap.get("aa");
Iterator iter = hashMap.entrySet().iterator();
while (iter.hasNext()) {
    Map.Entry entry = (Map.Entry) iter.next();
    System.out.println(entry.getKey() + ":" + entry.getValue());
}

Output:

aa:2
3104:dd

HashMap hashMap = new HashMap();
hashMap.put(31, 1);      // hash=31   (binary 1 1111),  i=15 in tab
hashMap.put(3104, "dd"); // hash=3104,                  i=0 in tab
hashMap.put(63, 2);      // hash=63   (binary 11 1111), i=15 in tab
hashMap.get(Integer.valueOf(63));
Iterator iter = hashMap.entrySet().iterator();
while (iter.hasNext()) {
    Map.Entry entry = (Map.Entry) iter.next();
    System.out.println(entry.getKey() + ":" + entry.getValue());
}

Output:

3104:dd
31:1
63:2

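How Hashtable works, with code

All of Hashtable's methods are synchronized, which makes it thread-safe. Neither its keys nor its values may be null.

Backed by an array plus linked lists. Thread safety comes from locking the whole Hashtable on each call, which is inefficient; ConcurrentHashMap improves on this.

Initial size is 11. Resize: newsize = oldsize*2+1

Index computation: index = (hash & 0x7FFFFFFF) % tab.length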
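The index formula and growth rule above can be tried out with a small sketch. HashtableIndexDemo and its methods are names made up for this example; only the two formulas come from the text. The mask 0x7FFFFFFF clears the sign bit, because in Java the % operator returns a negative remainder for a negative hash.

```java
public class HashtableIndexDemo {
    // Clear the sign bit so the remainder is never negative, then reduce modulo the table length.
    static int indexFor(int hash, int length) {
        return (hash & 0x7FFFFFFF) % length;
    }

    // Growth rule from the text: newsize = oldsize * 2 + 1.
    static int nextCapacity(int oldCapacity) {
        return oldCapacity * 2 + 1;
    }

    public static void main(String[] args) {
        int n = 11; // default Hashtable size
        System.out.println(indexFor(31, n));    // 9
        System.out.println(indexFor(63, n));    // 8
        System.out.println(indexFor(3104, n));  // 2
        System.out.println(-7 % n);             // -7, not usable as an array index
        System.out.println(indexFor(-7, n) >= 0); // true
        System.out.println(nextCapacity(n));    // 23
    }
}
```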

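Separate chaining in Hashtable

The put method works as follows:

1. Check whether value is null; if it is, throw an exception;

2. Compute the key's hash and, from it, the key's position index in the table array. If table[index] is not empty, walk the chain; on finding an equal key, replace its value and return the old value;

3. Otherwise, insert the new entry at table[index].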

public synchronized V put(K key, V value) {
    // A Hashtable cannot store a null value
    // Make sure the value is not null
    if (value == null) {
        throw new NullPointerException();
    }

    // If the Hashtable already holds a pair for this key,
    // replace the old value with the new one
    // Makes sure the key is not already in the hashtable.
    Entry<?,?> tab[] = table;
    int hash = key.hashCode();
    int index = (hash & 0x7FFFFFFF) % tab.length;
    @SuppressWarnings("unchecked")
    Entry<K,V> entry = (Entry<K,V>)tab[index];
    for(; entry != null ; entry = entry.next) {
        if ((entry.hash == hash) && entry.key.equals(key)) {
            V old = entry.value;
            entry.value = value;
            return old;
        }
    }

    addEntry(hash, key, value, index);
    return null;
}

private void addEntry(int hash, K key, V value, int index) {
    // Reached only when the Hashtable holds no pair for this key
    // (01) Increment the modification count
    modCount++;

    // (02) If the entry count >= threshold (threshold = capacity * load factor),
    //      grow the Hashtable
    Entry<?,?> tab[] = table;
    if (count >= threshold) {
        // Rehash the table if the threshold is exceeded
        rehash();

        tab = table;
        hash = key.hashCode();
        index = (hash & 0x7FFFFFFF) % tab.length;
    }

    // (03) Save the Entry (chain) currently at position index into e
    // Creates the new entry.
    @SuppressWarnings("unchecked")
    Entry<K,V> e = (Entry<K,V>) tab[index];
    // (04) Create the new Entry, store it at position index, and make e its
    //      successor (the new Entry becomes the head of the chain)
    tab[index] = new Entry<>(hash, key, value, e);
    count++; // (05) Increment the entry count
}

public synchronized V get(Object key) {
    Entry<?,?> tab[] = table;
    int hash = key.hashCode();
    int index = (hash & 0x7FFFFFFF) % tab.length;
    for (Entry<?,?> e = tab[index] ; e != null ; e = e.next) {
        if ((e.hash == hash) && e.key.equals(key)) {
            return (V)e.value;
        }
    }
    return null;
}

Hashtable hashtable = new Hashtable();
hashtable.put(31, 1);
hashtable.put(3104, "dd");
hashtable.put(63, 2);
hashtable.get(Integer.valueOf(63));
Iterator iter = hashtable.entrySet().iterator();
while (iter.hasNext()) {
    Map.Entry entry = (Map.Entry) iter.next();
    System.out.println(entry.getKey() + ":" + entry.getValue());
}

Output:

31:1
63:2
3104:dd

LinkedHashMap

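Iterating a HashMap does not follow the order in which entries were put; in other words, it is unordered.

LinkedHashMap guarantees iteration order by maintaining a doubly linked list that runs through all of its entries. The order can be insertion order or access order.

You can think of LinkedHashMap as HashMap plus a linked list: it uses HashMap's structure to store the data, and a doubly linked list (before/after pointers on each entry, not a java.util.LinkedList object) to record the order of the elements.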

Node<K,V> newNode(int hash, K key, V value, Node<K,V> e) {
    LinkedHashMap.Entry<K,V> p =
        new LinkedHashMap.Entry<K,V>(hash, key, value, e);
    linkNodeLast(p);
    return p;
}

// Append p to the tail of the doubly linked list
private void linkNodeLast(LinkedHashMap.Entry<K,V> p) {
    LinkedHashMap.Entry<K,V> last = tail;
    tail = p;
    if (last == null)
        head = p;
    else {
        p.before = last;
        last.after = p;
    }
}

LinkedHashMap linkedHashMap = new LinkedHashMap();
linkedHashMap.put(31, 1);
linkedHashMap.put(3104, "dd");
linkedHashMap.put(63, 2);
linkedHashMap.get(Integer.valueOf(63));
Iterator iter = linkedHashMap.entrySet().iterator();
while (iter.hasNext()) {
    Map.Entry entry = (Map.Entry) iter.next();
    System.out.println(entry.getKey() + ":" + entry.getValue());
}

Output:

31:1
3104:dd
63:2
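The example above shows insertion order. Access order, the other mode mentioned earlier, is selected with the three-argument constructor LinkedHashMap(initialCapacity, loadFactor, accessOrder). A minimal sketch, with AccessOrderDemo and keysAfterRead as made-up names:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AccessOrderDemo {
    // Keys in iteration order after inserting 31, 3104, 63 and then reading key 31.
    static List<Integer> keysAfterRead(boolean accessOrder) {
        Map<Integer, String> map = new LinkedHashMap<>(16, 0.75f, accessOrder);
        map.put(31, "a");
        map.put(3104, "dd");
        map.put(63, "b");
        map.get(31); // in access-order mode this moves 31 to the end of the list
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysAfterRead(false)); // [31, 3104, 63]
        System.out.println(keysAfterRead(true));  // [3104, 63, 31]
    }
}
```

Access order puts the least recently used entry first, which is why this mode is a common starting point for a small LRU cache.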

TreeMap

public class TreeMap<K,V>
    extends AbstractMap<K,V>
    implements NavigableMap<K,V>, Cloneable, java.io.Serializable
{}

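TreeMap is a sorted key-value collection, implemented as a red-black tree.

TreeMap extends AbstractMap, so it is a Map, that is, a key-value collection.

TreeMap implements the NavigableMap interface, which means it supports a family of navigation methods, such as returning the keys in sorted order.

TreeMap implements the Cloneable interface, which means it can be cloned.

TreeMap implements the java.io.Serializable interface, which means it supports serialization.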
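A few of the navigation methods that NavigableMap provides, as a small sketch. TreeMapNavigationDemo and sample() are hypothetical names; the keys are the same ones used in the other examples of this article.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class TreeMapNavigationDemo {
    // Build a small map; the insertion order does not matter, keys are kept sorted.
    static TreeMap<Integer, String> sample() {
        TreeMap<Integer, String> map = new TreeMap<>();
        map.put(31, "a");
        map.put(3104, "dd");
        map.put(63, "b");
        return map;
    }

    // Keys strictly below the given bound, in ascending order.
    static List<Integer> keysBelow(int bound) {
        return new ArrayList<>(sample().headMap(bound).keySet());
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> map = sample();
        System.out.println(map.firstKey());         // 31
        System.out.println(map.lastKey());          // 3104
        System.out.println(map.ceilingKey(40));     // 63, smallest key >= 40
        System.out.println(map.floorKey(40));       // 31, largest key <= 40
        System.out.println(keysBelow(100));         // [31, 63]
        System.out.println(map.descendingKeySet()); // [3104, 63, 31]
    }
}
```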

public V put(K key, V value) {
    Entry<K,V> t = root;
    if (t == null) {
        compare(key, key); // type (and possibly null) check

        root = new Entry<>(key, value, null);
        size = 1;
        modCount++;
        return null;
    }
    int cmp;
    Entry<K,V> parent;
    // split comparator and comparable paths
    Comparator<? super K> cpr = comparator;
    if (cpr != null) {
        do {
            parent = t;
            cmp = cpr.compare(key, t.key);
            if (cmp < 0)
                t = t.left;
            else if (cmp > 0)
                t = t.right;
            else
                return t.setValue(value);
        } while (t != null);
    }
    else {
        if (key == null)
            throw new NullPointerException();
        @SuppressWarnings("unchecked")
        Comparable<? super K> k = (Comparable<? super K>) key;
        do {
            parent = t;
            cmp = k.compareTo(t.key);
            if (cmp < 0)
                t = t.left;
            else if (cmp > 0)
                t = t.right;
            else
                return t.setValue(value);
        } while (t != null);
    }
    Entry<K,V> e = new Entry<>(key, value, parent);
    if (cmp < 0)
        parent.left = e;
    else
        parent.right = e;
    fixAfterInsertion(e);
    size++;
    modCount++;
    return null;
}

TreeMap treeMap = new TreeMap();
treeMap.put(31, 1);
treeMap.put(3104, "dd");
treeMap.put(63, 2);
treeMap.get(Integer.valueOf(63));
Iterator iter = treeMap.entrySet().iterator();
while (iter.hasNext()) {
    Map.Entry entry = (Map.Entry) iter.next();
    System.out.println(entry.getKey() + ":" + entry.getValue());
}

Output:

31:1
63:2
3104:dd

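How ConcurrentHashMap works, with code

Backed by segmented arrays plus linked lists, and thread-safe. (This section describes the JDK 7 implementation. JDK 8 dropped segments in favour of CAS operations and synchronized locking on individual bucket heads.)

The whole Map is divided into N Segments. It offers the same thread safety as Hashtable, but since up to N threads can write at once, throughput can improve by up to N times (16 by default). Reads take no lock: the value field of HashEntry is volatile, so a read still sees the latest value.

Hashtable's synchronized applies to the whole hash table, so each call locks the entire table for a single thread. ConcurrentHashMap lets several modifications run concurrently, and the key to this is lock striping.

Some methods span segments, such as size() and containsValue(). They may need to lock the whole table rather than a single segment, which means locking all segments in order and, when the operation is done, releasing all the locks in order.

Resizing happens inside a segment: it is triggered when the elements in a segment exceed 75% of that segment's Entry array length, and it never resizes the whole Map. The check runs before the insertion, which avoids wasted resizes.

ConcurrentHashMap uses an elegant "segment lock" strategy. Its backbone is an array of Segments.

final Segment<K,V>[] segments;

Segment extends ReentrantLock, so each Segment is itself a reentrant lock. Within a ConcurrentHashMap, a Segment is a sub-hash-table that maintains an array of HashEntry. Under concurrency, operations on data in different Segments do not contend for the same lock.

HashEntry is the smallest logical unit mentioned so far. A ConcurrentHashMap maintains an array of Segments, and each Segment maintains an array of HashEntry.

As a thread-safe and efficient hash table, and especially thanks to segment locking, ConcurrentHashMap performs far better than Hashtable with its table-wide lock.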
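To see the thread safety in action, the sketch below lets several threads update one counter through merge(), which ConcurrentHashMap performs atomically. It assumes Java 8 or later (for merge and lambdas); ConcurrentCountDemo and countConcurrently are names invented for this example, and it exercises only the public API, not the segment internals described above.

```java
import java.util.concurrent.ConcurrentHashMap;

public class ConcurrentCountDemo {
    // Start `threads` workers that each add 1 to the same key `incrementsPerThread` times.
    static int countConcurrently(int threads, int incrementsPerThread) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counts.merge("hits", 1, Integer::sum); // atomic read-modify-write
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait for every worker before reading the result
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counts.get("hits");
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently(4, 10_000)); // 40000
    }
}
```

With a plain HashMap and get-then-put in place of merge, updates could be lost and the final count could fall short.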

ConcurrentHashMap concurrentHashMap = new ConcurrentHashMap();
concurrentHashMap.put(31, 1);
concurrentHashMap.put(3104, "dd");
concurrentHashMap.put(63, 2);
concurrentHashMap.get(Integer.valueOf(63));
Iterator iter = concurrentHashMap.entrySet().iterator();
while (iter.hasNext()) {
    Map.Entry entry = (Map.Entry) iter.next();
    System.out.println(entry.getKey() + ":" + entry.getValue());
}

Output:

3104:dd
31:1
63:2

WeakHashMap

public class WeakHashMap<K,V>
    extends AbstractMap<K,V>
    implements Map<K,V> {

    public V put(K key, V value) {
        Object k = maskNull(key);
        int h = hash(k);
        Entry<K,V>[] tab = getTable();
        int i = indexFor(h, tab.length);

        for (Entry<K,V> e = tab[i]; e != null; e = e.next) {
            if (h == e.hash && eq(k, e.get())) {
                V oldValue = e.value;
                if (value != oldValue)
                    e.value = value;
                return oldValue;
            }
        }

        modCount++;
        Entry<K,V> e = tab[i];
        tab[i] = new Entry<>(k, value, queue, h, e);
        if (++size >= threshold)
            resize(tab.length * 2);
        return null;
    }

    // ... other members omitted
}

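WeakHashMap implements the Map interface and holds its keys through weak references. It is a typical application of weak references and can serve as a simple cache. An entry stays in the map only while its key is still strongly referenced somewhere else; once the key is no longer referenced, the garbage collector can reclaim it and the entry is removed soon afterwards, which helps keep the cache from exhausting memory.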

IdentityHashMap

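IdentityHashMap resolves hash collisions by linear probing.

The hash of a key is computed from System.identityHashCode(key), which depends on the object's identity rather than its contents.

In a HashMap, keys cannot repeat: putting a key that equals() an existing key replaces the value stored under that key.

To let keys with equal contents coexist (two distinct objects, key1 != key2), use IdentityHashMap.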
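The difference between the two maps can be shown with two String objects that are equal but not identical. This is a minimal sketch; IdentityVsEqualsDemo and sizeAfterTwoEqualKeys are made-up names.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

public class IdentityVsEqualsDemo {
    // Put two distinct String objects with the same contents and report the map size.
    static int sizeAfterTwoEqualKeys(Map<String, Integer> map) {
        String k1 = new String("key"); // new String(...) forces two separate objects
        String k2 = new String("key"); // k1.equals(k2) is true, k1 == k2 is false
        map.put(k1, 1);
        map.put(k2, 2);
        return map.size();
    }

    public static void main(String[] args) {
        // HashMap compares keys with equals(): the second put replaces the first.
        System.out.println(sizeAfterTwoEqualKeys(new HashMap<>()));         // 1
        // IdentityHashMap compares keys with ==: both entries are kept.
        System.out.println(sizeAfterTwoEqualKeys(new IdentityHashMap<>())); // 2
    }
}
```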

private static int hash(Object x, int length) {
    int h = System.identityHashCode(x);
    // Multiply by -127, and left-shift to use least bit as part of hash
    return ((h << 1) - (h << 8)) & (length - 1);
}

public V put(K key, V value) {
    final Object k = maskNull(key);

    retryAfterResize: for (;;) {
        final Object[] tab = table;
        final int len = tab.length;
        int i = hash(k, len);

        for (Object item; (item = tab[i]) != null;
             i = nextKeyIndex(i, len)) {
            if (item == k) {
                @SuppressWarnings("unchecked")
                    V oldValue = (V) tab[i + 1];
                tab[i + 1] = value;
                return oldValue;
            }
        }

        final int s = size + 1;
        // Use optimized form of 3 * s.
        // Next capacity is len, 2 * current capacity.
        if (s + (s << 1) > len && resize(len))
            continue retryAfterResize;

        modCount++;
        tab[i] = k;
        tab[i + 1] = value;
        size = s;
        return null;
    }
}

IdentityHashMap identityHashMap = new IdentityHashMap<>();
// new Integer(...) is deprecated, but it is used here on purpose:
// each call creates a distinct object, unlike Integer.valueOf(31)
identityHashMap.put(new Integer(31), 11);
identityHashMap.put(new Integer(31), 22);
Integer a = new Integer(32);
identityHashMap.put(a, 33);
System.out.println(identityHashMap.size());
identityHashMap.get(a);
Iterator iterator = identityHashMap.entrySet().iterator();
while (iterator.hasNext()) {
    Map.Entry entry = (Map.Entry) iterator.next();
    System.out.println(entry.getKey() + ":" + entry.getValue());
}

Output (the order of the three entries depends on identity hash codes and can differ between runs):

3
32:33
31:11
31:22

Reposted from: https://www.cnblogs.com/gschain/p/11243059.html
