【Data Structures】 11. HashTable—Simple Implementation

HashTableInterface interface

// HashTable interface that takes only positive integers.
// No mapping, just keys
public interface HashTableInterface {
    // Return true when the key is found.
    boolean search(int key);

    // Delete the key from the table and return it, or -1 if it is not present.
    int delete(int key);

    // Insert an int key to the table.
    void insert(int key);
}

HashTable class

public class HashTable implements HashTableInterface {
    private static final DataItem DELETED = new DataItem(-1);
    private DataItem[] hashArray;

    // precondition: initialCapacity is a positive int
    public HashTable(int initialCapacity) {
        hashArray = new DataItem[initialCapacity];
    }

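    // static nested class holding a single key
    private static class DataItem {
        private int key;
        DataItem(int k) {
            key = k;
        }
    }

    // hashFunc, search, delete, and insert (shown in the sections below) go here
}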

First things first: Hashing method

// private helper method for hashing a key value
// precondition: key is a positive int
private int hashFunc(int key) {
    return key % hashArray.length;
}

Searching for a key

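@Override
public boolean search(int key) {
    int hashVal = hashFunc(key);
    // Probe each slot at most once, so the loop also ends when no slot is null
    // (for example, when every free slot has become a DELETED marker).
    for (int probes = 0; probes < hashArray.length && hashArray[hashVal] != null; probes++) {
        if (hashArray[hashVal].key == key) {
            return true;  // found
        }
        // step to the next slot, wrapping around
        hashVal = (hashVal + 1) % hashArray.length;
    }
    return false;  // cannot find
}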

Deleting a key

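@Override
public int delete(int key) {
    int hashVal = hashFunc(key);
    // Probe each slot at most once, so the loop also ends when no slot is null.
    for (int probes = 0; probes < hashArray.length && hashArray[hashVal] != null; probes++) {
        if (hashArray[hashVal].key == key) {
            int temp = hashArray[hashVal].key;
            // Leave a DELETED marker instead of null so that later searches
            // keep probing past this slot.
            hashArray[hashVal] = DELETED;
            return temp;
        }
        // step to the next slot, wrapping around
        hashVal = (hashVal + 1) % hashArray.length;
    }
    return -1;  // key is not in the table
}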

Inserting a key

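@Override
// precondition: the table has at least one empty or DELETED slot
public void insert(int key) {
    DataItem item = new DataItem(key);
    int hashVal = hashFunc(key);
    // Skip occupied slots; an empty slot or a DELETED slot can be reused.
    while (hashArray[hashVal] != null && hashArray[hashVal] != DELETED) {
        hashVal++;
        // wrap around
        hashVal = hashVal % hashArray.length;
    }
    hashArray[hashVal] = item;
}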
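
To see why delete leaves a DELETED marker instead of null, here is a minimal, self-contained sketch that condenses the code above into one file. The class name ProbeTable, the helper slotOf, the probe limit, and the "table is full" exception are my own assumptions for illustration and are not part of the original code.

```java
// Hypothetical demo class: a condensed version of the table above.
class ProbeTable {
    private static final class DataItem {
        final int key;
        DataItem(int k) { key = k; }
    }

    // Marker left behind by delete(); keys are assumed positive, so -1 never matches.
    private static final DataItem DELETED = new DataItem(-1);
    private final DataItem[] hashArray;

    ProbeTable(int capacity) { hashArray = new DataItem[capacity]; }

    private int hashFunc(int key) { return key % hashArray.length; }

    // Returns the slot holding key, or -1 if absent.
    // Probes each slot at most once (an added safeguard, see the note above).
    int slotOf(int key) {
        int i = hashFunc(key);
        for (int n = 0; n < hashArray.length && hashArray[i] != null; n++) {
            if (hashArray[i].key == key) return i;
            i = (i + 1) % hashArray.length;  // next slot, wrapping around
        }
        return -1;
    }

    boolean search(int key) { return slotOf(key) >= 0; }

    int delete(int key) {
        int i = slotOf(key);
        if (i < 0) return -1;       // key is not in the table
        hashArray[i] = DELETED;     // keep the probe chain intact
        return key;
    }

    void insert(int key) {
        int i = hashFunc(key);
        for (int n = 0; n < hashArray.length; n++) {
            if (hashArray[i] == null || hashArray[i] == DELETED) {
                hashArray[i] = new DataItem(key);  // reuse empty or deleted slot
                return;
            }
            i = (i + 1) % hashArray.length;
        }
        throw new IllegalStateException("table is full");
    }

    public static void main(String[] args) {
        ProbeTable t = new ProbeTable(7);
        t.insert(10);  // 10 % 7 = 3, goes to slot 3
        t.insert(17);  // 17 % 7 = 3, slot 3 is taken, goes to slot 4
        t.insert(24);  // 24 % 7 = 3, slots 3 and 4 are taken, goes to slot 5
        t.delete(17);  // slot 4 becomes DELETED
        System.out.println(t.search(24));  // still reachable through slot 4
        t.insert(31);  // 31 % 7 = 3, reuses the DELETED slot 4
        System.out.println(t.slotOf(31));
    }
}
```

If delete had set slot 4 back to null, the search for 24 would stop at slot 4 and wrongly report that the key is missing.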

What makes a good hash function?

A good hash function is quick to compute. Thus, a hash function with many multiplications and divisions is NOT a good idea.

The purpose of a hash function is to take a range of key values and transform them into index values so that the keys are spread evenly, as if at random, across all the indices of the hash table.

Key values may be completely random or not so random.

1. Random key values

If the key values are random and positive, we can compute the index with the same simple operation used in the code above.

index = key % hashArray.length;

2. Non-random key values

For example, there is a database that uses car-part numbers as key values.

033-400-03-94-05-0-535

This is interpreted as follows:

Digits 0-2: Supplier number (1 to 999, currently up to 70)

Digits 3-5: Category code (100, 150, 200, 250, up to 850)

Digits 6-7: Month of introduction (1 to 12)

Digits 8-9: Year of introduction (00 to 99)

Digits 10-11: Serial number (1 to 99, never exceeds 100)

Digit 12: Toxic risk flag (0 or 1)

Digits 13-15: Checksum (sum of the other fields)

Based on the interpretations provided, the key value should be 0,334,000,394,050,535 for the particular part number shown above.

However, there is no guarantee that these keys are spread randomly over the range 0 to 9,999,999,999,999,999.

Some work is needed to turn these part numbers into a range of more random numbers.

1. Don't Use Non-Data: The key fields should be squeezed as much as possible. For example, the category code takes only 16 values (100, 150, ..., 850), so it should be remapped to the range 0 to 15. Also, the checksum should be removed because it is derived from the other fields and adds no new information.

2. Use All the Data: Other than the non-data values, we need to use all of the data values. Don't just use the first four digits, etc.

3. Use a Prime Number for the Modulo Base: That is, the table length should be a prime number. For example, if the table length is 50, then every car-part number that is a multiple of 50 hashes to the same index.

4. Use Folding: Another reasonable hash function involves breaking keys into groups of digits and adding the groups.

SSN example: 123-45-6789

If the table length is 1009: Break the number into three groups of three digits. (123 + 456 + 789 = 1368, and 1368 % 1009 = 359)

If the table length is 101: Break the number into four two-digit numbers and one one-digit number. (12 + 34 + 56 + 78 + 9 = 189, and 189 % 101 = 88)

This way you can distribute the numbers better.
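
The folding steps above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the class name Folding and the method foldHash are hypothetical, and the key is passed as a string of digits (with the dashes already removed) so that leading zeros are kept.

```java
// Hypothetical helper illustrating digit folding.
class Folding {
    // Splits digits into groups of groupSize characters from left to right
    // (the last group may be shorter), sums the groups, and reduces the sum
    // modulo tableLength.
    // Assumes digits contains only '0'-'9' and groupSize is between 1 and 9.
    static int foldHash(String digits, int groupSize, int tableLength) {
        int sum = 0;
        for (int start = 0; start < digits.length(); start += groupSize) {
            int end = Math.min(start + groupSize, digits.length());
            sum += Integer.parseInt(digits.substring(start, end));
        }
        return sum % tableLength;
    }

    public static void main(String[] args) {
        // SSN 123-45-6789 with the dashes removed
        System.out.println(foldHash("123456789", 3, 1009));  // 359
        System.out.println(foldHash("123456789", 2, 101));   // 88
    }
}
```

Choosing the group size so that the largest possible sum is close to the table length, as in the two cases above, keeps the folded sum from clustering at the low end of the table.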

The basic idea is to examine your key values carefully and implement your hash function to remove any irregularity in the distribution of the key values.
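
As one possible way to apply the "Don't Use Non-Data" and "Use All the Data" rules to the car-part number, the sketch below remaps the category code to 0..15, drops the checksum, and packs the remaining fields into a single number before taking the modulo. The class name PartKey, its methods, and the mixed-radix packing are my own assumptions for illustration; the original text does not prescribe a specific formula.

```java
// Hypothetical helper; the packing scheme is an illustrative assumption.
class PartKey {
    // Packs the six data fields into one compact key. The checksum is not used.
    static long compress(int supplier, int category, int month,
                         int year, int serial, int toxic) {
        int categoryIndex = (category - 100) / 50;  // 100, 150, ..., 850 -> 0..15
        long key = supplier;                        // 1..999
        key = key * 16 + categoryIndex;             // 16 category values
        key = key * 12 + (month - 1);               // 1..12 -> 0..11
        key = key * 100 + year;                     // 00..99
        key = key * 100 + serial;                   // 1..99
        key = key * 2 + toxic;                      // 0 or 1
        return key;
    }

    // Parses a part number such as "033-400-03-94-05-0-535".
    static long fromPartNumber(String partNumber) {
        String[] f = partNumber.split("-");
        // f[6] is the checksum: derived data, so it is ignored.
        return compress(Integer.parseInt(f[0]), Integer.parseInt(f[1]),
                        Integer.parseInt(f[2]), Integer.parseInt(f[3]),
                        Integer.parseInt(f[4]), Integer.parseInt(f[5]));
    }

    // Maps the compact key to a table index; tableLength should be prime.
    static int hash(long key, int tableLength) {
        return (int) (key % tableLength);
    }

    public static void main(String[] args) {
        long key = fromPartNumber("033-400-03-94-05-0-535");
        System.out.println(key);              // 128218810
        System.out.println(hash(key, 1009));  // an index between 0 and 1008
    }
}
```

The compact key uses every data field but has no unused gaps, so it covers a far smaller range than the 16-digit original.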