Hash Tables (Notes)

A hash table has two parts:

    1. Hash function
    2. Collision resolution

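Motivation for hash tables: memory and time are both limited. With unlimited memory we could use the key itself as an array index; with unlimited time we could search an unordered array sequentially. Hashing strikes a balance between the two.

Hash function: transforms keys into array indices.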

    1. Positive integers. 
        We choose the array size M to be prime and, for any positive integer key k, compute the remainder when dividing k by M. (k % M in Java)  
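
To see why the choice of M matters, here is a small sketch (the class and method names are illustrative, not from the original notes). It hashes the keys 10, 20, ..., 100 into a table of size 10 and a table of size 11 and counts how many distinct indices are used.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative helper, not part of the original notes.
class ModularHashDemo {
    // Modular hashing for a positive integer key k and table size m.
    static int hash(int k, int m) {
        return k % m;
    }

    // Counts how many distinct indices the keys 10, 20, ..., 100 occupy.
    static int distinctIndices(int m) {
        Set<Integer> used = new HashSet<>();
        for (int k = 10; k <= 100; k += 10)
            used.add(hash(k, m));
        return used.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctIndices(10)); // every key collides at index 0
        System.out.println(distinctIndices(11)); // the keys spread over the table
    }
}
```

With M = 10 all ten keys share index 0, while the prime M = 11 gives each key its own index. This is only one hand-picked key set, but it shows how a pattern in the keys can interact badly with a non-prime table size.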

    2. Floating-point numbers. 
        If the keys are real numbers in the range [0, 1), we might just multiply by M and truncate to an integer to get an index between 0 and M-1. (Rounding to the nearest integer could produce the out-of-range index M for keys close to 1.)

    3. Strings.
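        Modular hashing works for long keys such as strings, too: we simply treat them as huge integers. R is the radix; a small prime such as 31 ensures that the bits of all the characters play a role.

int hash = 0;
for (int i = 0; i < s.length(); i++)
    hash = (R * hash + s.charAt(i)) % M;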
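
Wrapped in a method, the loop above can be tried directly. The value R = 31 and the table size 97 used below are assumptions chosen for illustration, and the class name is hypothetical.

```java
// Illustrative sketch, not part of the original notes.
class StringHashDemo {
    static final int R = 31; // assumed multiplier (a small prime)

    // Horner's method: treats s as a base-R number and reduces modulo m at each step.
    static int hash(String s, int m) {
        int hash = 0;
        for (int i = 0; i < s.length(); i++)
            hash = (R * hash + s.charAt(i)) % m;
        return hash;
    }

    public static void main(String[] args) {
        // 'a' is 97 and 'b' is 98, so with m = 97:
        // (31 * 0 + 97) % 97 = 0, then (31 * 0 + 98) % 97 = 1
        System.out.println(hash("ab", 97)); // prints 1
    }
}
```

Taking the remainder at every step keeps the intermediate value below M, so the arithmetic does not overflow for table sizes of this scale.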


    4. Compound keys.
        If the key type has multiple integer fields, we can typically mix them together in the way just described for String values. 
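
As a sketch of this idea, assume a hypothetical date-like key with three non-negative integer fields (day, month, year). The multiplier R = 31 is again an assumption for illustration.

```java
// Illustrative sketch, not part of the original notes.
class DateKeyDemo {
    static final int R = 31; // assumed multiplier (a small prime)

    // Mixes three integer fields the same way the string loop mixes characters.
    // Assumes all fields are non-negative.
    static int hash(int day, int month, int year, int m) {
        int hash = 0;
        hash = (R * hash + day) % m;
        hash = (R * hash + month) % m;
        hash = (R * hash + year) % m;
        return hash;
    }

    public static void main(String[] args) {
        System.out.println(hash(1, 2, 2000, 97)); // prints 16
        System.out.println(hash(2, 1, 2000, 97)); // prints 73
    }
}
```

Because each field is multiplied by a different power of R, swapping day and month changes the result, which would not happen if the fields were simply added together.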

    5. Java conventions. 
        hashCode() returns a 32-bit integer. The implementation of hashCode() for a data type must be consistent with equals(). That is, if a.equals(b) is true, then a.hashCode() must have the same numerical value as b.hashCode(). Conversely, if the hashCode() values are different, then we know that the objects are not equal. However, if the hashCode() values are the same, the objects may or may not be equal, and we must use equals() to decide which condition holds.
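
A minimal sketch of a key type that follows this convention. GridPoint is a hypothetical class made up for illustration; the point is only that hashCode() is computed from exactly the fields that equals() compares.

```java
import java.util.Objects;

// Hypothetical key type used only for illustration.
final class GridPoint {
    private final int x;
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof GridPoint)) return false;
        GridPoint that = (GridPoint) other;
        return x == that.x && y == that.y;
    }

    // Uses the same fields as equals(), so equal points share a hash code.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}
```

Two GridPoint objects built from the same coordinates are equal and return the same hashCode(). Two unequal points may still happen to share a hash code, which is why a table lookup must finish with equals().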

    6. Converting a hashCode() to an array index. 
        Since our goal is an array index, not a 32-bit integer, we combine hashCode() with modular hashing in our implementations to produce integers between 0 and M - 1, as follows:
private int hash(Key x)
{    return (x.hashCode() & 0x7fffffff) % M;    }
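
The mask 0x7fffffff clears the sign bit, so the left operand of % is never negative. The sketch below (hypothetical class and method names) compares it with the tempting alternative Math.abs, which fails for one input: in Java, Math.abs(Integer.MIN_VALUE) is still Integer.MIN_VALUE.

```java
// Illustrative sketch, not part of the original notes.
class IndexDemo {
    // Clears the sign bit, then reduces to the range 0..m-1.
    static int maskedIndex(int hashCode, int m) {
        return (hashCode & 0x7fffffff) % m;
    }

    // Tempting alternative that can return a negative index.
    static int absIndex(int hashCode, int m) {
        return Math.abs(hashCode) % m;
    }

    public static void main(String[] args) {
        int h = Integer.MIN_VALUE;
        System.out.println(maskedIndex(h, 97)); // prints 0
        System.out.println(absIndex(h, 97) < 0); // prints true
    }
}
```

A negative index would throw ArrayIndexOutOfBoundsException when used on the table, so the masked version is the safer choice.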

Collision resolution

1. Separate chaining
    Two-step process: hash to find the list that could contain the key, then sequentially search through that list for the key. 
public class SeparateChainingHashST<Key, Value>
{
	private int N;  //number of key-value pairs
	private int M;  //hash table size
	private SequentialSearchST<Key, Value>[] st;  //array of ST objects

	public SeparateChainingHashST()
	{    this(997);    }

	public SeparateChainingHashST(int M)
	{    // Create M linked lists.
		this.M = M;
		// Java cannot create generic arrays directly, so create a raw array and cast it.
		st = (SequentialSearchST<Key, Value>[]) new SequentialSearchST[M];
		for (int i = 0; i < M; i++)
			st[i] = new SequentialSearchST<Key, Value>();
	}

	private int hash(Key key)
	{
		return (key.hashCode() & 0x7fffffff) % M;
	}

	public Value get(Key key)
	{
		return (Value) st[hash(key)].get(key);
	}

	public void put(Key key, Value value)
	{
		st[hash(key)].put(key, value);
	}

	// TODO: public Iterable<Key> keys() is not implemented in these notes.

}
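
SeparateChainingHashST relies on a SequentialSearchST class that these notes do not show. Below is a minimal sketch of what such a class could look like, assuming an unordered singly linked list of key-value nodes and non-null keys; the class actually used with the code above may differ in its details.

```java
// Minimal sketch; assumes keys are non-null and are compared with equals().
class SequentialSearchST<Key, Value> {
    private Node first; // head of the linked list

    private class Node {
        Key key;
        Value val;
        Node next;

        Node(Key key, Value val, Node next) {
            this.key = key;
            this.val = val;
            this.next = next;
        }
    }

    // Returns the value paired with key, or null if the key is absent.
    public Value get(Key key) {
        for (Node x = first; x != null; x = x.next)
            if (key.equals(x.key)) return x.val;
        return null;
    }

    // Updates the value if the key exists; otherwise adds a new node at the front.
    public void put(Key key, Value val) {
        for (Node x = first; x != null; x = x.next)
            if (key.equals(x.key)) { x.val = val; return; }
        first = new Node(key, val, first);
    }
}
```

Both operations walk the list from the front, so their cost grows with the length of the chain. This is why the hash function should spread the keys evenly over the M lists.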

2. Linear probing
    Open-addressing hashing relies on empty entries in the table to help with collision resolution.
    The simplest open-addressing method is called linear probing:
    when there is a collision (we hash to a table index that is already occupied by a key different from the search key), we just check the next entry in the table (by incrementing the index, wrapping around to the start when we reach the end of the table).
public class LinearProbingHashST<Key, Value>
{
	private int N;//number of key-value pairs in the table
	private int M; //size of linear-probing table
	private Key[] keys; //the keys
	private Value[] vals; //the values

	public LinearProbingHashST(int cap)
	{
		this.M = cap;
		keys = (Key[]) new Object[cap];
		vals = (Value[]) new Object[cap];
	}

	private int hash(Key key)
	{
		return (key.hashCode() & 0x7fffffff) % M;
	}

	private void resize(int cap)
	{
		LinearProbingHashST<Key, Value> t;
		t = new LinearProbingHashST<Key, Value>(cap);
		for (int i = 0; i < M; i++)
			if (keys[i] != null)
				t.put(keys[i], vals[i]);
		keys = t.keys;
		vals = t.vals;
		M = t.M;
	}

	public void put(Key key, Value val) 
	{
		if (N >= M/2) resize(2*M); //double M

		int i;
		for (i = hash(key); keys[i] != null; i = (i + 1) % M)
			if (keys[i].equals(key)) { vals[i] = val; return; }
		keys[i] = key;
		vals[i] = val;
		N++;
	}

	public Value get(Key key)
	{
		for (int i = hash(key); keys[i] != null; i = (i + 1) % M)
			if (keys[i].equals(key))
				return vals[i];
		return null;
	}

	public void delete(Key key)
	{
		if (get(key) == null) return;  // key is not in the table
		int i = hash(key);
		while (!key.equals(keys[i]))
			i = (i+1) % M;
		keys[i] = null;
		vals[i] = null;
		i = (i + 1) % M;
		while (keys[i] != null) {
			Key keyToRedo = keys[i];
			Value valToRedo = vals[i];
			keys[i] = null;
			vals[i] = null;
			N--;
			put(keyToRedo, valToRedo);
			i = (i+1) % M;
		}
		N--;
		if (N > 0 && N == M/8) resize(M/2);  // halve M when the table is one-eighth full
	}
}
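
To make the probe sequence concrete, here is a small self-contained sketch (hypothetical names, non-negative integer keys, no resizing) that inserts keys with colliding hash values into a table of size 7 and returns the index where each one lands.

```java
// Illustrative sketch, not part of the original notes.
class ProbeTraceDemo {
    // Inserts key into table using linear probing and returns the index used.
    // Assumes non-negative keys and at least one empty slot in the table.
    static int insert(Integer[] table, int key) {
        int m = table.length;
        int i = key % m;
        while (table[i] != null && table[i] != key)
            i = (i + 1) % m; // slot holds a different key: try the next one
        table[i] = key;
        return i;
    }

    public static void main(String[] args) {
        Integer[] table = new Integer[7];
        System.out.println(insert(table, 10)); // 10 % 7 = 3, slot is free: prints 3
        System.out.println(insert(table, 17)); // 17 % 7 = 3, occupied: prints 4
        System.out.println(insert(table, 24)); // 24 % 7 = 3, two probes later: prints 5
        System.out.println(insert(table, 13)); // 13 % 7 = 6, slot is free: prints 6
        System.out.println(insert(table, 20)); // 20 % 7 = 6, wraps around: prints 0
    }
}
```

The keys 10, 17 and 24 form a contiguous cluster at indices 3 to 5. Long clusters make later insertions and searches slower, which is why the implementation above keeps the table at most half full.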

Memory
Our implementation SeparateChainingHashST uses memory for M references to SequentialSearchST objects plus M SequentialSearchST objects. Each SequentialSearchST object has the usual 16 bytes of object overhead plus one 8-byte reference (first), and there are a total of N node objects, each with 24 bytes of object overhead plus 3 references (key, value, and next).

With array resizing to ensure that the table is between one-eighth and one-half full, linear probing uses between 4N and 16N references. 

method               space usage for N items (reference types)
separate chaining    ~48N + 32M
linear probing       between ~32N and ~128N

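For separate chaining: 48N + 32M = 8 * M + (16 + 8) * M + (24 + 3 * 8) * N (the array of M references, the M list objects, and the N nodes, respectively). For linear probing, 4N to 16N references at 8 bytes each gives 32N to 128N.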
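
The estimates can be checked with a few lines of arithmetic. The sketch below uses hypothetical method names and the byte counts stated in these notes (8-byte references, 16 or 24 bytes of object overhead), and plugs in N = 1000 with the default table size M = 997.

```java
// Illustrative sketch, not part of the original notes.
class MemoryEstimateDemo {
    // Separate chaining: 8M for the array of references, (16 + 8)M for the
    // M list objects, and (24 + 3 * 8) bytes for each of the N nodes.
    static long separateChainingBytes(long n, long m) {
        return 8 * m + (16 + 8) * m + (24 + 3 * 8) * n;
    }

    // Linear probing: between 4N and 16N references at 8 bytes each.
    static long linearProbingMinBytes(long n) { return 8 * 4 * n; }
    static long linearProbingMaxBytes(long n) { return 8 * 16 * n; }

    public static void main(String[] args) {
        System.out.println(separateChainingBytes(1000, 997)); // prints 79904
        System.out.println(linearProbingMinBytes(1000));      // prints 32000
        System.out.println(linearProbingMaxBytes(1000));      // prints 128000
    }
}
```

These figures count only the table structures themselves, not the memory used by the key and value objects they refer to.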
