Hashing Table (Notes)

RMCYork

Category: Data Structure

Article link: https://blog.csdn.net/ysyyork/article/details/39767077

A hash table has two parts:

1. Hash function

2. Collision resolution

Motivation for hash tables: both memory and time are limited, so hashing trades some extra space for faster lookups.

Hash function: transforms keys into array indices.

1. Positive integers.

We choose the array size M to be prime and, for any positive integer key k, compute the remainder when dividing k by M. (k % M in Java)
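As a small illustration of modular hashing, the sketch below maps a few positive integers to indices. The table size M = 97 and the class name ModularHashDemo are assumptions made for this example. A prime M helps because a non-prime size such as 100 would make the index depend only on the last two decimal digits of the key.

```java
// Sketch of modular hashing for positive integer keys.
// M = 97 is an assumed prime table size, chosen only for illustration.
class ModularHashDemo {
    static final int M = 97;

    // Maps a positive integer key to an index in [0, M-1].
    static int hash(int k) {
        return k % M;
    }

    public static void main(String[] args) {
        int[] keys = {5, 97, 100, 1000};
        for (int k : keys)
            System.out.println(k + " -> " + hash(k));
    }
}
```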

2. Floating-point numbers.

If the keys are real numbers between 0 and 1 (excluding 1), we might just multiply by M and round down to an integer to get an index between 0 and M-1.
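A minimal sketch of this idea follows. It truncates rather than rounds to the nearest integer, so the index can never reach M. The table size M = 97 and the class name FloatHashDemo are assumptions for the example.

```java
// Sketch: hashing a real number in [0, 1) to an index in [0, M-1].
// M = 97 is an assumed table size.
class FloatHashDemo {
    static final int M = 97;

    // x is assumed to satisfy 0 <= x < 1; the cast truncates toward zero.
    static int hash(double x) {
        return (int) (x * M);
    }

    public static void main(String[] args) {
        double[] keys = {0.0, 0.5, 0.999};
        for (double x : keys)
            System.out.println(x + " -> " + hash(x));
    }
}
```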

3. Strings.

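Modular hashing also works for long keys such as strings: we treat the string as a huge integer and take the remainder after each character, which keeps the intermediate value small.

// R is a small prime such as 31; M is the table size.
int hash = 0;
for (int i = 0; i < s.length(); i++)
    hash = (R * hash + s.charAt(i)) % M;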
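To make the loop above concrete, here is a runnable sketch. The values R = 31 and M = 97 and the class name StringHashDemo are assumptions for illustration, not values fixed by these notes.

```java
// Sketch: modular hashing of a string, one character at a time.
// R = 31 and M = 97 are assumed constants for this example.
class StringHashDemo {
    static final int R = 31;
    static final int M = 97;

    static int hash(String s) {
        int hash = 0;
        for (int i = 0; i < s.length(); i++)
            hash = (R * hash + s.charAt(i)) % M;  // stays below M after each step
        return hash;
    }

    public static void main(String[] args) {
        System.out.println(hash("ab"));  // 'a' = 97, 'b' = 98
        System.out.println(hash("ba"));  // same characters, different order
    }
}
```

Because each character is multiplied by a different power of R, "ab" and "ba" land on different indices here even though they contain the same characters.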

4. Compound keys.

If the key type has multiple integer fields, we can typically mix them together in the way just described for String values.
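For example, a date key with three integer fields could be mixed as sketched below. The field names (day, month, year), the constants R = 31 and M = 97, and the class name CompoundKeyDemo are all assumptions for this illustration.

```java
// Sketch: hashing a compound key by mixing its integer fields,
// in the same way characters of a string are mixed.
// R = 31 and M = 97 are assumed constants.
class CompoundKeyDemo {
    static final int R = 31;
    static final int M = 97;

    static int hash(int day, int month, int year) {
        return (((day * R + month) % M) * R + year) % M;
    }

    public static void main(String[] args) {
        System.out.println(hash(5, 10, 2014));
        System.out.println(hash(10, 5, 2014));  // swapped fields give a different index here
    }
}
```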

5. Java conventions.

hashCode() returns a 32-bit integer. The implementation of hashCode() for a data type must be consistent with equals(): if a.equals(b) is true, then a.hashCode() must return the same value as b.hashCode(). Conversely, if the hashCode() values are different, then we know that the objects are not equal. However, if the hashCode() values are the same, the objects may or may not be equal, and we must use equals() to decide which condition holds.
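The sketch below shows one way to keep the two methods consistent: both are computed from the same fields. The Point class is a hypothetical type invented for this example, and java.util.Objects.hash is just one convenient way to combine the fields.

```java
import java.util.Objects;

// Sketch: a user-defined key type whose hashCode() agrees with equals().
// Point is a hypothetical type used only for illustration.
class HashCodeContractDemo {
    static class Point {
        private final int x, y;

        Point(int x, int y) { this.x = x; this.y = y; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }

        // Uses exactly the fields that equals() compares.
        @Override
        public int hashCode() { return Objects.hash(x, y); }
    }

    public static void main(String[] args) {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2);
        System.out.println(a.equals(b));                   // true
        System.out.println(a.hashCode() == b.hashCode());  // true, as the contract requires
    }
}
```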

6. Converting a hashCode() to an array index.

Since our goal is an array index, not a 32-bit integer, we combine hashCode() with modular hashing in our implementations to produce integers between 0 and M - 1, as follows:

// The mask 0x7fffffff clears the sign bit, so the remainder is never negative.
private int hash(Key x)
{    return (x.hashCode() & 0x7fffffff) % M;    }
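The mask matters because hashCode() can be negative, and in Java the % operator keeps the sign of the dividend. The sketch below compares the unmasked and masked computations; M = 97 and the class name IndexFromHashCodeDemo are assumptions for the example.

```java
// Sketch: why the sign bit is masked off before taking the remainder.
// M = 97 is an assumed table size.
class IndexFromHashCodeDemo {
    static final int M = 97;

    // Unsafe variant: may return a negative number.
    static int naiveHash(Object x) {
        return x.hashCode() % M;
    }

    // Masked variant: always in [0, M-1].
    static int hash(Object x) {
        return (x.hashCode() & 0x7fffffff) % M;
    }

    public static void main(String[] args) {
        Integer key = -7;                     // Integer.hashCode() returns the int value itself
        System.out.println(naiveHash(key));   // negative, so unusable as an array index
        System.out.println(hash(key));        // a valid index
    }
}
```

Math.abs() is not a safe substitute for the mask, since Math.abs(Integer.MIN_VALUE) is still negative.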

Collision resolution

1. Separate chaining

Two-step process: hash to find the list that could contain the key, then sequentially search through that list for the key.

public class SeparateChainingHashST<Key, Value>
{
	private int N;  // number of key-value pairs
	private int M;  // hash table size
	private SequentialSearchST<Key, Value>[] st;  // array of ST objects

	public SeparateChainingHashST()
	{    this(997);    }

	@SuppressWarnings("unchecked")
	public SeparateChainingHashST(int M)
	{    // Create M linked lists.
		this.M = M;
		// Java cannot create a generic array directly, so cast a raw array.
		st = (SequentialSearchST<Key, Value>[]) new SequentialSearchST[M];
		for (int i = 0; i < M; i++)
			st[i] = new SequentialSearchST<Key, Value>();
	}

	private int hash(Key key)
	{
		return (key.hashCode() & 0x7fffffff) % M;
	}

	public Value get(Key key)
	{
		return st[hash(key)].get(key);
	}

	public void put(Key key, Value value)
	{
		st[hash(key)].put(key, value);
	}

	public Iterable<Key> keys()
	{
		// TODO: collect the keys from every list (left unfinished in these notes).
		throw new UnsupportedOperationException("keys() not implemented");
	}
}
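The class above relies on SequentialSearchST, which these notes do not show. The following is a hypothetical minimal version, assuming it is an unordered singly linked list with only get() and put(); the real class may differ.

```java
// Sketch of the linked-list symbol table that each chain is assumed to be.
// This is a hypothetical minimal version with only get() and put().
class SequentialSearchST<Key, Value> {
    private Node first;  // head of the linked list

    private class Node {
        Key key;
        Value val;
        Node next;
        Node(Key key, Value val, Node next) {
            this.key = key; this.val = val; this.next = next;
        }
    }

    // Scans the list; returns null when the key is absent.
    public Value get(Key key) {
        for (Node x = first; x != null; x = x.next)
            if (key.equals(x.key)) return x.val;
        return null;
    }

    // Overwrites the value if the key exists, otherwise adds a node at the front.
    public void put(Key key, Value val) {
        for (Node x = first; x != null; x = x.next)
            if (key.equals(x.key)) { x.val = val; return; }
        first = new Node(key, val, first);
    }

    public static void main(String[] args) {
        SequentialSearchST<String, Integer> st = new SequentialSearchST<>();
        st.put("a", 1);
        st.put("b", 2);
        st.put("a", 3);                   // replaces the earlier value for "a"
        System.out.println(st.get("a"));
        System.out.println(st.get("c"));  // absent key
    }
}
```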

2. Linear probing

Open-addressing hashing relies on empty entries in the table to help with collision resolution.

The simplest open-addressing method is called linear probing: when there is a collision (we hash to a table index that is already occupied by a key different from the search key), we just check the next entry in the table (by incrementing the index), wrapping around to the start when we reach the end of the table.
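A short trace may help before reading a full implementation. The sketch below inserts plain non-negative int keys into a table of assumed size M = 7, using -1 as an assumed marker for an empty slot, and returns the slot where each key ends up. It never resizes, so it assumes the table always has at least one empty slot.

```java
import java.util.Arrays;

// Sketch: tracing where linear probing places colliding keys.
// M = 7 and the EMPTY marker are assumptions; keys are non-negative ints.
class LinearProbingTrace {
    static final int M = 7;
    static final int EMPTY = -1;

    // Returns the slot used for key. Assumes at least one slot is empty.
    static int insert(int[] table, int key) {
        int i = key % M;
        while (table[i] != EMPTY && table[i] != key)
            i = (i + 1) % M;  // probe the next slot, wrapping around
        table[i] = key;
        return i;
    }

    public static void main(String[] args) {
        int[] table = new int[M];
        Arrays.fill(table, EMPTY);
        int[] keys = {10, 17, 24, 6, 13};  // 10, 17, 24 all hash to 3; 6 and 13 hash to 6
        for (int k : keys)
            System.out.println(k + " -> slot " + insert(table, k));
    }
}
```

The three keys that hash to 3 occupy slots 3, 4 and 5, and 13 wraps around to slot 0 because slot 6 is already taken.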

public class LinearProbingHashST<Key, Value>
{
	private int N;//number of key-value pairs in the table
	private int M; //size of linear-probing table
	private Key[] keys; //the keys
	private Value[] vals; //the values

	public LinearProbingHashST(int cap)
	{
		this.M = cap;
		keys = (Key[]) new Object[cap];
		vals = (Value[]) new Object[cap];
	}

	private int hash(Key key)
	{
		return (key.hashCode() & 0x7fffffff) % M;
	}

	private void resize(int cap)
	{
		LinearProbingHashST<Key, Value> t;
		t = new LinearProbingHashST<Key, Value>(cap);
		for (int i = 0; i < M; i++)
			if (keys[i] != null)
				t.put(keys[i], vals[i]);
		keys = t.keys;
		vals = t.vals;
		M = t.M;
	}

	public void put(Key key, Value val) 
	{
		if (N >= M/2) resize(2*M); //double M

		int i;
		for (i = hash(key); keys[i] != null; i = (i + 1) % M)
			if (keys[i].equals(key)) { vals[i] = val; return; }
		keys[i] = key;
		vals[i] = val;
		N++;
	}

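	public Value get(Key key)
	{
		for (int i = hash(key); keys[i] != null; i = (i + 1) % M)
			if (keys[i].equals(key))
				return vals[i];
		return null;
	}

	// Needed by delete(); a key is present if get() finds a value for it.
	public boolean contains(Key key)
	{
		return get(key) != null;
	}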

	public void delete(Key key)
	{
		if (!contains(key)) return;
		int i = hash(key);
		while (!key.equals(keys[i]))
			i = (i+1) % M;
		keys[i] = null;
		vals[i] = null;
		i = (i + 1) % M;
		while (keys[i] != null) {
			Key keyToRedo = keys[i];
			Value valToRedo = vals[i];
			keys[i] = null;
			vals[i] = null;
			N--;
			put(keyToRedo, valToRedo);
			i = (i+1) % M;
		}
		N--;
		if (N > 0 && N <= M/8) resize(M/2); // halve M when at most one-eighth full
	}
}
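The delete() method reinserts every key that follows the removed one in the same cluster. The sketch below shows why simply clearing the slot is not enough. It uses plain int keys, an assumed table size M = 7, and -1 as an assumed empty marker.

```java
import java.util.Arrays;

// Sketch: clearing a slot without rehashing the rest of the cluster
// makes later keys in that cluster unreachable.
// M = 7 and the EMPTY marker are assumptions; keys are non-negative ints.
class NaiveDeleteDemo {
    static final int M = 7;
    static final int EMPTY = -1;

    static void insert(int[] t, int key) {
        int i = key % M;
        while (t[i] != EMPTY && t[i] != key)
            i = (i + 1) % M;
        t[i] = key;
    }

    // Search stops at the first empty slot, as in get() above.
    static boolean contains(int[] t, int key) {
        for (int i = key % M; t[i] != EMPTY; i = (i + 1) % M)
            if (t[i] == key) return true;
        return false;
    }

    public static void main(String[] args) {
        int[] t = new int[M];
        Arrays.fill(t, EMPTY);
        insert(t, 10); insert(t, 17); insert(t, 24);  // all hash to 3: slots 3, 4, 5
        t[4] = EMPTY;                                 // naive delete of 17
        System.out.println(contains(t, 24));          // false: the search stops at the hole
    }
}
```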

Memory

Our implementation SeparateChainingHashST uses memory for M references to SequentialSearchST objects plus M SequentialSearchST objects. Each SequentialSearchST object has the usual 16 bytes of object overhead plus one 8-byte reference (first), and there are a total of N node objects, each with 24 bytes of object overhead plus 3 references (key, value, and next).

With array resizing to ensure that the table is between one-eighth and one-half full, the table size M is between 2N and 8N. Since keys and values are stored in two parallel arrays, linear probing uses between 4N and 16N references.

method	space usage for N items (reference types), in bytes
separate chaining	~48N + 32M
linear probing	between ~32N and ~128N

For separate chaining: 8*M (references to the lists) + (16 + 8) * M (the list objects) + (24 + 3 * 8) * N (the nodes) = 32M + 48N bytes.
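The estimates above can be checked with a few lines of arithmetic. The sketch below plugs in N = 1000 items and M = 997 chains, which are example values chosen here, and uses the 8-byte reference size stated in these notes.

```java
// Sketch: evaluating the memory estimates for sample sizes.
// N = 1000 and M = 997 are assumed example values; references are 8 bytes.
class HashMemoryEstimate {
    // M references + M list objects (16 overhead + 8) + N nodes (24 overhead + 3 references).
    static long separateChaining(long n, long m) {
        return 8 * m + (16 + 8) * m + (24 + 3 * 8) * n;
    }

    // Linear probing holds between 4N and 16N references of 8 bytes each.
    static long linearProbingMin(long n) { return 8 * 4 * n; }
    static long linearProbingMax(long n) { return 8 * 16 * n; }

    public static void main(String[] args) {
        long n = 1000, m = 997;
        System.out.println("separate chaining: " + separateChaining(n, m) + " bytes");
        System.out.println("linear probing: " + linearProbingMin(n)
                + " to " + linearProbingMax(n) + " bytes");
    }
}
```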
