HashMap Implementation for Java

HashMap is a dictionary data structure provided by Java. It's a Map-based collection class used to store data in key-value pairs. In this article, we'll create our own HashMap implementation.

The benefit of using this data structure is faster data retrieval. It has data access complexity of O(1) in the best case.

In layman's terms, for each key we get the associated value.

To implement this structure, we need:

  1. A list to store all the keys
  2. A key-value relationship to retrieve a value based on its key

We could keep a single list containing all the key-value pairs, but then every lookup would require searching through the entire list.

The main point of a hash map, however, is to access keys faster, in O(1) time.

This is where hashing comes into play. We can hash the key and map the hash to an array index to retrieve data faster.
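
As a rough sketch, the key's hash code can be mapped to an array index with a modulo operation (the same idea the implementation below uses; indexFor is just an illustrative name):

static int indexFor(Object key, int capacity) {
    // The mask keeps the hash non-negative before wrapping it into the table size.
    return (key.hashCode() & 0xfffffff) % capacity;
}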

Hashing comes with a problem too: collisions. It is always recommended to use a good hash function that reduces the chance of collisions.

Multiple keys can produce the same hash value. For that reason, there is a bucket (container) for each index where all colliding entries are stored.

Let's dive into a basic implementation of our HashMap.

First, we need an array of buckets, a bucket model to store entries, and a wrapper class for our key-value pair.

public class MyKeyValueEntry<K, V> {
    private K key;
    private V value;

    public MyKeyValueEntry(K key, V value) {
        this.key = key;
        this.value = value;
    }

    public K getKey() { return key; }
    public V getValue() { return value; }
    public void setValue(V value) { this.value = value; }

    // hashCode & equals omitted for brevity
}

A bucket to store all the entries that hash to the same index:

import java.util.LinkedList;
import java.util.List;

public class MyMapBucket {
    private final List<MyKeyValueEntry> entries;

    public MyMapBucket() {
        this.entries = new LinkedList<>();
    }

    public List<MyKeyValueEntry> getEntries() {
        return entries;
    }

    public void addEntry(MyKeyValueEntry entry) {
        this.entries.add(entry);
    }

    public void removeEntry(MyKeyValueEntry entry) {
        this.entries.remove(entry);
    }
}

Lastly, the implementation of our HashMap:

import java.util.Objects;

public class MyHashMap<K, V> {
    private static final int CAPACITY = 10;
    private MyMapBucket[] bucket;
    private int size = 0;

    public MyHashMap() {
        this.bucket = new MyMapBucket[CAPACITY];
    }

    // Map a key's hash code to a bucket index; the mask keeps it non-negative.
    private int getHash(K key) {
        return (key.hashCode() & 0xfffffff) % CAPACITY;
    }

    private MyKeyValueEntry getEntry(K key) {
        int hash = getHash(key);
        for (int i = 0; i < bucket[hash].getEntries().size(); i++) {
            MyKeyValueEntry myKeyValueEntry = bucket[hash].getEntries().get(i);
            if (myKeyValueEntry.getKey().equals(key)) {
                return myKeyValueEntry;
            }
        }
        return null;
    }

    public void put(K key, V value) {
        if (containsKey(key)) {
            // Key already present: just update its value.
            MyKeyValueEntry entry = getEntry(key);
            entry.setValue(value);
        } else {
            int hash = getHash(key);
            if (bucket[hash] == null) {
                bucket[hash] = new MyMapBucket();
            }
            bucket[hash].addEntry(new MyKeyValueEntry<>(key, value));
            size++;
        }
    }

    public V get(K key) {
        return containsKey(key) ? (V) getEntry(key).getValue() : null;
    }

    public boolean containsKey(K key) {
        int hash = getHash(key);
        return !(Objects.isNull(bucket[hash]) || Objects.isNull(getEntry(key)));
    }

    public void delete(K key) {
        if (containsKey(key)) {
            int hash = getHash(key);
            bucket[hash].removeEntry(getEntry(key));
            size--;
        }
    }

    public int size() {
        return size;
    }
}
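
A short usage example (the Main class name is just for illustration):

public class Main {
    public static void main(String[] args) {
        MyHashMap<String, Integer> map = new MyHashMap<>();
        map.put("one", 1);
        map.put("two", 2);
        map.put("one", 11);                           // existing key: value is updated
        System.out.println(map.get("one"));           // 11
        System.out.println(map.containsKey("three")); // false
        map.delete("two");
        System.out.println(map.size());               // 1
    }
}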

Put into map:

  1. If the key already exists, update the value for that key.
  2. Otherwise, add a new entry to the appropriate bucket.

Get from map:

  1. Check whether the key exists, and if so return its value.

Contains:

  1. Check whether the bucket for the key's hash is null.
  2. If it is not null, check whether that bucket holds an entry with the key.

Performance:

Lookups take O(1) in the best case and O(n) in the worst case, when all keys land in the same bucket.
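
To see the worst case, imagine a key type whose hashCode always returns the same value (the BadKey class below is a contrived example, not part of the implementation); every entry then lands in the same bucket and get() degenerates into a linear scan of that bucket's list:

class BadKey {
    private final String name;
    BadKey(String name) { this.name = name; }

    @Override
    public int hashCode() { return 42; } // every key collides

    @Override
    public boolean equals(Object o) {
        return o instanceof BadKey && ((BadKey) o).name.equals(this.name);
    }
}
// With BadKey as the key type, MyHashMap.get() walks one bucket's entire list: O(n).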

Java Improvements:

  1. Starting with Java 7 update 6, all of the hashing-based Map implementations (HashMap, Hashtable, LinkedHashMap, WeakHashMap and ConcurrentHashMap) were modified to use an enhanced hashing algorithm for String keys once the capacity of the hash table grows beyond 512 entries. The enhanced implementation uses the murmur3 hashing algorithm along with random hash seeds and index masks. These enhancements mitigate cases where colliding String hash values could result in a performance bottleneck ("Alternative String hashing implementation"; this mechanism was removed again in Java 8 in favor of the change below).

  2. From Java 8, once the number of items in a hash bucket grows beyond a certain threshold, that bucket switches from a linked list of entries to a balanced tree. In the case of frequent hash collisions, this improves worst-case performance from O(n) to O(log n) (JEP 180: Handle Frequent HashMap Collisions with Balanced Trees). A small demonstration follows below.
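
A hedged sketch of why this matters, using the standard java.util.HashMap with a key whose hash code always collides (CollidingKey is a made-up name for this demo); on Java 8+ the overfull bucket is converted to a red-black tree, so lookups stay near O(log n) instead of O(n):

import java.util.HashMap;

// Constant hash code forces all keys into one bucket; Comparable lets the tree order them.
class CollidingKey implements Comparable<CollidingKey> {
    final int id;
    CollidingKey(int id) { this.id = id; }

    @Override public int hashCode() { return 1; }
    @Override public boolean equals(Object o) {
        return o instanceof CollidingKey && ((CollidingKey) o).id == this.id;
    }
    @Override public int compareTo(CollidingKey other) { return Integer.compare(this.id, other.id); }
}

public class TreeifyDemo {
    public static void main(String[] args) {
        HashMap<CollidingKey, Integer> map = new HashMap<>();
        for (int i = 0; i < 10_000; i++) {
            map.put(new CollidingKey(i), i); // all entries share a single bucket
        }
        // On Java 8+, this lookup traverses a balanced tree rather than a 10,000-node list.
        System.out.println(map.get(new CollidingKey(9_999))); // 9999
    }
}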

Java HashMap features:

  • The default initial capacity is 16.

static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;

  • The load factor used when none is specified in the constructor is 0.75.

static final float DEFAULT_LOAD_FACTOR = 0.75f;

  • The bin count threshold for using a tree rather than a list for a bin. Bins are converted to trees when adding an element to a bin with at least this many nodes.

static final int TREEIFY_THRESHOLD = 8;

  • The bin count threshold for untreeifying a (split) bin during a resize operation.

static final int UNTREEIFY_THRESHOLD = 6;

  • An instance of HashMap has two parameters that affect its performance: initial capacity and load factor.

    * The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the hash table is created.

    * The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed (that is, internal data structures are rebuilt) so that it has approximately twice the number of buckets.
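
Both parameters can be supplied when constructing a java.util.HashMap (a small sketch):

import java.util.HashMap;
import java.util.Map;

public class CapacityDemo {
    public static void main(String[] args) {
        // 64 buckets initially; the table is rehashed once size exceeds 64 * 0.5 = 32 entries.
        Map<String, Integer> map = new HashMap<>(64, 0.5f);
        map.put("answer", 42);
        System.out.println(map.get("answer")); // 42
    }
}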

  • As a general rule, the default load factor (0.75) offers a good tradeoff between time and space costs.

    * Higher values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the HashMap class, including get and put).

    * The expected number of entries in the map and its load factor should be taken into account when setting its initial capacity, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operations will ever occur.
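
For example, to hold roughly 1,000 entries with the default load factor and no rehashing, the initial capacity can be chosen as the expected entry count divided by the load factor (sizeFor is a hypothetical helper, not a JDK method):

import java.util.HashMap;
import java.util.Map;

public class InitialCapacityDemo {
    // Smallest capacity such that expectedEntries never exceeds capacity * loadFactor.
    static int sizeFor(int expectedEntries, float loadFactor) {
        return (int) Math.ceil(expectedEntries / loadFactor);
    }

    public static void main(String[] args) {
        int capacity = sizeFor(1_000, 0.75f);               // 1334
        Map<String, Integer> map = new HashMap<>(capacity); // no rehash while size <= 1000
        System.out.println(capacity);
    }
}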

Translated from: https://medium.com/swlh/hashmap-implementation-for-java-90a5f58d4a5b
