Implementing an Abstract Dictionary by Hand (in Java)

小小潘不放弃

Last modified: 2022-09-25 16:53:01

Category: ADT

First published: 2022-09-25 16:36:34

Article link: https://blog.csdn.net/qq_57220516/article/details/127033866

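The abstract framework for the dictionary ADT presented here is my own work and still has plenty of shortcomings. The Java class library has a similar interface, java.util.Map, which is well worth comparing against.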

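Dictionary

The dictionary ADT (abstract data type), also called a map, table, or associative array, stores entries made of two parts: a search key and its corresponding value.

Operations of the dictionary ADT

Thinking it through, an abstract data type generally offers basic operations such as insertion, removal, lookup, retrieving a value, checking for emptiness, and getting the size.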

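Method (K and V are generic types)	Description
+add(key:K, value:V):void	Adds a key-value pair to the dictionary. The return type could also be boolean: true if the addition succeeds, false otherwise
+remove(key:K):V	Removes the entry with the given key and returns the removed value
+getValue(key:K):V	Returns the value associated with the given key
+contains(key:K):boolean	Checks whether the given key exists
+getKeyIterator():Iterator<K>	Returns an iterator over the keys
+getValueIterator():Iterator<V>	Returns an iterator over the values
+isEmpty():boolean	Checks whether the dictionary is empty
+getSize():Integer	Returns the number of entries in the dictionary
+clear():void	Removes all entries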

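How the search key is defined affects the implementation. There are several cases:

Unique: if search keys are unique, adding an existing key can either be rejected or replace the old value with the new one;
Duplicates: if duplicate search keys are allowed, we must decide whether getValue() returns the first matching value or all values for that key, and whether remove() removes the first match or all entries with that key;
Secondary search key: when the search key already exists in the dictionary, a second search key can be introduced on insertion to tell the entries apart;

Here we only discuss the case of unique search keys.

Dictionary ADT framework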

import java.util.Iterator;
public interface DictionaryInteface <K,V>{

   public void add(K key,V value);
   public V remove(K key);
   public V getValue(K key);
   public boolean contains(K key);
   public Iterator<K> getKeyIterator();
   public Iterator<V> getValueIterator();
   public boolean isEmpty();
   public Integer getSize();
   public void clear();
}

java.util already contains the Map interface, which is similar to our dictionary interface here.
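For comparison, the sketch below lines up the operations of the interface above with what I believe are their closest counterparts in java.util.Map, using HashMap. The class name MapComparison and the sample data are made up for illustration. Note that Map.put replaces the value of an existing key, which corresponds to the "replace" policy for unique keys.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class: pairs each dictionary operation with a java.util.Map call.
class MapComparison {
    static Map<String, Integer> sample() {
        Map<String, Integer> ages = new HashMap<>();
        ages.put("Tom", 20);                          // add(key, value)
        ages.put("Amy", 22);
        ages.put("Tom", 21);                          // same key again: the old value is replaced
        return ages;
    }

    static String summary() {
        Map<String, Integer> ages = sample();
        boolean hasAmy = ages.containsKey("Amy");     // contains(key)
        Integer tom = ages.get("Tom");                // getValue(key)
        Integer removed = ages.remove("Amy");         // remove(key), returns the removed value
        int size = ages.size();                       // getSize()
        return hasAmy + " " + tom + " " + removed + " " + size;
    }
}
```

Calling MapComparison.summary() yields "true 21 22 1". The remaining operations map in the same way: keySet().iterator() and values().iterator() for the two iterators, plus isEmpty() and clear().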

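Implementations

A dictionary can be built on an array or on a linked list, and each can be kept sorted or unsorted.

Array-based implementation

Unsorted array

We write a class that implements this interface. Some notes on its methods:

First we need a private nested class Entry with a constructor and get/set methods. Note that a key must not change once it has been added, so there is no setKey() method;
ensureCapacity(): because the implementation is array-based, after adding an entry we must make sure there is still room for the next one; if there is not, the array is enlarged, here to twice its size;
locateIndex(): many later operations need to find an array index, so factoring this out into its own method keeps the class structure clearer;
Constructors: there are two, and the first one calls the second;
add(): since duplicate keys are not allowed, a search is required before adding. If the key exists, its value is updated; otherwise the entry is inserted, after which ensureCapacity() is called to check the array capacity;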

import java.util.Arrays;
import java.util.Iterator;

public class ArrayDictionary<K,V> implements DictionaryInteface<K,V>{

   private Entry<K,V>[] dictionary; // array of dictionary entries
   private int numberOfEntries; // number of entries currently stored
   private final static int DEFAULT_CAPACITY=25; // initial capacity
   private final static int MAX_CAPACITY=10000; // maximum capacity (not enforced in this simple version)

   public ArrayDictionary(){
       this(DEFAULT_CAPACITY);
   }
   public ArrayDictionary(int initialCapacity){
       // a generic array cannot be created directly, so cast a raw Entry array
       @SuppressWarnings("unchecked")
       Entry<K,V>[] tempDic=(Entry<K,V>[]) new Entry[initialCapacity];
       dictionary=tempDic;
       numberOfEntries=0;
   }
   // double the array once it is full, so the next add always has room
   public void ensureCapacity(){
       if(numberOfEntries>=dictionary.length){
           int newLength=2*dictionary.length;
           dictionary=Arrays.copyOf(dictionary,newLength);
       }
   }
   // index of the entry holding key, or numberOfEntries if the key is absent
   public int locateIndex(K key){
       int index=0;
       while(index<numberOfEntries&&!key.equals(dictionary[index].getKey())){
           index++;
       }
       return index;
   }
   @Override
   public void add(K key, V value) {
       int keyindex=locateIndex(key);
       // the key already exists: update its value
       if(keyindex<numberOfEntries){
           dictionary[keyindex].setValue(value);
       }
       // the key does not exist: append a new entry
       else{
           dictionary[keyindex]=new Entry<>(key,value);
           numberOfEntries++;
           ensureCapacity();
       }
   }

   @Override
   public V remove(K key) {
       int keyindex=locateIndex(key);
       if(keyindex>=numberOfEntries){
           return null; // key not found
       }
       V result=dictionary[keyindex].getValue();
       // order does not matter here, so fill the gap with the last entry
       dictionary[keyindex]=dictionary[numberOfEntries-1];
       dictionary[numberOfEntries-1]=null;
       numberOfEntries--;
       return result;
   }

   @Override
   public V getValue(K key) {
       int keyindex=locateIndex(key);
       // return null when the key is absent instead of dereferencing an empty slot
       return keyindex<numberOfEntries?dictionary[keyindex].getValue():null;
   }

   @Override
   public boolean contains(K key) {
       int keyindex=locateIndex(key);
       return keyindex<numberOfEntries;
   }

   @Override
   public Iterator<K> getKeyIterator() {
       return null; // placeholder: iterator not implemented yet
   }

   @Override
   public Iterator<V> getValueIterator() {
       return null; // placeholder: iterator not implemented yet
   }

   @Override
   public boolean isEmpty() {
       return numberOfEntries==0;
   }

   @Override
   public Integer getSize() {
       return numberOfEntries; // number of entries, not the array capacity
   }

   @Override
   public void clear() {
       for(int i=0;i<numberOfEntries;i++){
           dictionary[i]=null;
       }
       numberOfEntries=0;
   }

   // static so that its type parameters do not shadow those of the outer class
   private static class Entry<K,V>{
       private K key;
       private V value;
       // constructor
       private Entry(K searchKey,V dataValue){
           key=searchKey;
           value=dataValue;
       }
       // get the key
       private K getKey(){
           return key;
       }
       // get the value
       private V getValue(){
           return value;
       }
       // set the value (there is deliberately no setKey)
       private void setValue(V dataValue){
           value=dataValue;
       }
   }
}
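
The two iterator methods are left as placeholders in the class. As a rough sketch of how a key iterator over the first numberOfEntries slots of an array might look, here is a simplified stand-in that walks a plain array of keys. The class name KeyArrayIterator and its constructor parameters are hypothetical and not part of the original code.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical helper: iterates over the first `count` keys of an array.
class KeyArrayIterator<K> implements Iterator<K> {
    private final K[] keys;
    private final int count;   // number of valid entries at the front of the array
    private int cursor = 0;    // index of the next key to return

    KeyArrayIterator(K[] keys, int count) {
        this.keys = keys;
        this.count = count;
    }

    @Override
    public boolean hasNext() {
        return cursor < count;
    }

    @Override
    public K next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return keys[cursor++];
    }
}
```

Inside ArrayDictionary the same idea would read dictionary[cursor].getKey() for the key iterator and dictionary[cursor].getValue() for the value iterator.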

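Sorted array

The sorted array differs from the unsorted one in a few methods:

locateIndex(): because the array is sorted, binary search can be used to find the index, reducing the time complexity from linear to O(log n);
add(): unlike the unsorted version, a new entry is not simply appended at the end but placed at its proper position;
makeRoom(): once the insertion position is found, the entries from that position onward are shifted to make room for the new entry; this is what makeRoom() does;
remove(): removing an entry leaves a gap that the following entries must fill; a private method removeRoom() takes care of this;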
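
The binary search mentioned for locateIndex() could look roughly like the following. This is a sketch over a plain sorted array of keys rather than the Entry array, and the class name SortedKeySearch as well as the count parameter are illustrative assumptions. It returns the index of the key if it is present, otherwise the index where the key would have to be inserted.

```java
// Hypothetical helper showing binary search over the first `count` sorted keys.
class SortedKeySearch {
    static <K extends Comparable<? super K>> int locateIndex(K[] keys, int count, K key) {
        int low = 0;
        int high = count;                       // the search range is [low, high)
        while (low < high) {
            int mid = low + (high - low) / 2;   // avoids overflow of (low + high)
            if (keys[mid].compareTo(key) < 0) {
                low = mid + 1;                  // key lies to the right of mid
            } else {
                high = mid;                     // keys[mid] >= key: keep mid in range
            }
        }
        return low;                             // first position whose key is >= key
    }
}
```

Each pass halves the search range, which is where the O(log n) bound comes from. The shifting done by makeRoom() and removeRoom() is still linear, so only lookups benefit fully.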

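import java.util.Arrays;
import java.util.Iterator;

public class SortedArrayDictionary<K extends Comparable<? super K>,V> implements DictionaryInteface<K,V>{

    private Entry<K,V>[] dictionary; // array of dictionary entries, sorted by key
    private int numberOfEntries; // number of entries currently stored
    private final static int DEFAULT_CAPACITY=25; // initial capacity
    private final static int MAX_CAPACITY=10000; // maximum capacity (not enforced in this simple version)

    public SortedArrayDictionary(){
        this(DEFAULT_CAPACITY);
    }
    public SortedArrayDictionary(int initialCapacity){
        // a generic array cannot be created directly, so cast a raw Entry array
        @SuppressWarnings("unchecked")
        Entry<K,V>[] tempDic=(Entry<K,V>[]) new Entry[initialCapacity];
        dictionary=tempDic;
        numberOfEntries=0;
    }
    // double the array once it is full, so the next add always has room
    public void ensureCapacity(){
        if(numberOfEntries>=dictionary.length){
            int newLength=2*dictionary.length;
            dictionary=Arrays.copyOf(dictionary,newLength);
        }
    }
    // index of the first entry whose key is >= key (linear scan; binary search also works here)
    public int locateIndex(K key){
        int index=0;
        while(index<numberOfEntries&&key.compareTo(dictionary[index].getKey())>0){
            index++;
        }
        return index;
    }
    // true if the entry at keyindex exists and holds exactly this key
    private boolean isKeyAt(int keyindex,K key){
        return keyindex<numberOfEntries&&key.equals(dictionary[keyindex].getKey());
    }

    @Override
    public void add(K key, V value) {
        int keyindex=locateIndex(key);
        // the key already exists: update its value
        if(isKeyAt(keyindex,key)){
            dictionary[keyindex].setValue(value);
        }
        // the key does not exist: insert at its sorted position
        else{
            makeRoom(keyindex);
            dictionary[keyindex]=new Entry<>(key,value);
            numberOfEntries++;
            ensureCapacity();
        }
    }
    // shift the entries at keyIndex and beyond one slot to the right
    private void makeRoom(int keyIndex){
        for(int i=numberOfEntries-1;i>=keyIndex;i--){
            dictionary[i+1]=dictionary[i];
        }
    }

    @Override
    public V remove(K key) {
        int keyindex=locateIndex(key);
        if(!isKeyAt(keyindex,key)){
            return null; // key not found
        }
        V result=dictionary[keyindex].getValue();
        removeRoom(keyindex);
        numberOfEntries--;
        return result;
    }
    // shift the entries after keyIndex one slot to the left and clear the last slot
    private void removeRoom(int keyIndex){
        for(int i=keyIndex;i<numberOfEntries-1;i++){
            dictionary[i]=dictionary[i+1];
        }
        dictionary[numberOfEntries-1]=null;
    }

    @Override
    public V getValue(K key) {
        int keyindex=locateIndex(key);
        return isKeyAt(keyindex,key)?dictionary[keyindex].getValue():null;
    }

    @Override
    public boolean contains(K key) {
        return isKeyAt(locateIndex(key),key);
    }

    @Override
    public Iterator<K> getKeyIterator() {
        return null; // placeholder: iterator not implemented yet
    }

    @Override
    public Iterator<V> getValueIterator() {
        return null; // placeholder: iterator not implemented yet
    }

    @Override
    public boolean isEmpty() {
        return numberOfEntries==0;
    }

    @Override
    public Integer getSize() {
        return numberOfEntries; // number of entries, not the array capacity
    }

    @Override
    public void clear() {
        for(int i=0;i<numberOfEntries;i++){
            dictionary[i]=null;
        }
        numberOfEntries=0;
    }

    // static so that its type parameters do not shadow those of the outer class
    private static class Entry<K,V>{
        private K key;
        private V value;
        // constructor
        private Entry(K searchKey,V dataValue){
            key=searchKey;
            value=dataValue;
        }
        // get the key
        private K getKey(){
            return key;
        }
        // get the value
        private V getValue(){
            return value;
        }
        // set the value
        private void setValue(V dataValue){
            value=dataValue;
        }
    }
}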

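Linked-list-based implementation

In the linked implementation, each node (an inner class) holds a key, a value, and a reference to the next node.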

 private class Node{
        private K key;
        private V value;
        private Node next;
        // constructor
        private Node(K searchKey,V dataValue,Node next){
            key=searchKey;
            value=dataValue;
            this.next=next;
        }
        private Node getNext(){
            return next;
        }
        private void setNext(Node next){
            this.next=next;
        }
        // get the key
        private K getKey(){
            return key;
        }
        // get the value
        private V getValue(){
            return value;
        }
        // set the value
        private void setValue(V dataValue){
            value=dataValue;
        }
    }

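Unsorted linked list

For an unsorted linked list, only linking a new node at the head is O(1); every other operation has to traverse the list and takes O(n). Since keys are unique here, add() must first search for the key, so strictly speaking it is O(n) as well. The code is simple, so I will not write it out.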
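
For readers who want to see it anyway, here is one possible sketch of the unsorted linked version, reduced to add, getValue, remove, and getSize. It is a standalone class (hypothetical name UnsortedLinkedSketch) that does not implement DictionaryInteface so that it compiles by itself. New nodes go at the head of the list, and returning null for a missing key is an assumption of this sketch.

```java
// Hypothetical standalone sketch of an unsorted linked dictionary with unique keys.
class UnsortedLinkedSketch<K, V> {
    private Node firstNode;
    private int numberOfEntries;

    // Replaces the value if the key exists (O(n) search), else links a new head node (O(1)).
    public void add(K key, V value) {
        Node node = find(key);
        if (node != null) {
            node.value = value;
        } else {
            firstNode = new Node(key, value, firstNode);
            numberOfEntries++;
        }
    }

    public V getValue(K key) {
        Node node = find(key);
        return node == null ? null : node.value;
    }

    public V remove(K key) {
        Node before = null;
        Node current = firstNode;
        while (current != null && !key.equals(current.key)) {
            before = current;
            current = current.next;
        }
        if (current == null) {
            return null;                      // key not found
        }
        if (before == null) {
            firstNode = current.next;         // removing the first node
        } else {
            before.next = current.next;       // bypass the removed node
        }
        numberOfEntries--;
        return current.value;
    }

    public int getSize() {
        return numberOfEntries;
    }

    // Linear search for the node holding key; null if absent.
    private Node find(K key) {
        Node current = firstNode;
        while (current != null && !key.equals(current.key)) {
            current = current.next;
        }
        return current;
    }

    private class Node {
        private final K key;
        private V value;
        private Node next;

        private Node(K key, V value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }
}
```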

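Sorted linked list

Unlike an array, a sorted linked list cannot reach a position directly by index. To insert or remove, we first locate the preceding node and then adjust the references.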

import java.util.Iterator;

public class SortedLinkedDictionary<K extends Comparable<? super K>,V> implements DictionaryInteface<K,V>{

    private int numberOfEntries=0; // number of nodes
    private Node firstNode; // head of the list, holding the smallest key

    public SortedLinkedDictionary(){
    }

    @Override
    public void add(K key, V value) {
        Node currentNode=firstNode;
        Node beforeNode=null;
        // advance while the search key is greater than the current key
        while(currentNode!=null&&key.compareTo(currentNode.getKey())>0){
            beforeNode=currentNode;
            currentNode=currentNode.getNext();
        }
        // the key already exists in the list: update its value
        if(currentNode!=null&&key.equals(currentNode.getKey())){
            currentNode.setValue(value);
        }
        // the key does not exist: insert a new node
        else {
            Node newNode=new Node(key,value);
            newNode.setNext(currentNode);
            if(beforeNode==null){
                firstNode=newNode; // inserting at the head
            }else{
                beforeNode.setNext(newNode);
            }
            numberOfEntries++;
        }
    }

    @Override
    public V remove(K key) {
        Node currentNode=firstNode;
        Node beforeNode=null;
        while(currentNode!=null&&key.compareTo(currentNode.getKey())>0){
            beforeNode=currentNode;
            currentNode=currentNode.getNext();
        }
        // the key is not in the list
        if(currentNode==null||!key.equals(currentNode.getKey())){
            return null;
        }
        if(beforeNode==null){
            firstNode=currentNode.getNext(); // removing the first node
        }else{
            // removing a middle or last node; for the last node getNext() is null
            beforeNode.setNext(currentNode.getNext());
        }
        numberOfEntries--;
        return currentNode.getValue();
    }

    // node holding key, or null if the key is absent
    private Node findNode(K key){
        Node currentNode=firstNode;
        while(currentNode!=null&&key.compareTo(currentNode.getKey())>0){
            currentNode=currentNode.getNext();
        }
        if(currentNode!=null&&key.equals(currentNode.getKey())){
            return currentNode;
        }
        return null;
    }

    @Override
    public V getValue(K key) {
        Node node=findNode(key);
        return node==null?null:node.getValue();
    }

    @Override
    public boolean contains(K key) {
        return findNode(key)!=null;
    }

    @Override
    public Iterator<K> getKeyIterator() {
        return null; // placeholder: iterator not implemented yet
    }

    @Override
    public Iterator<V> getValueIterator() {
        return null; // placeholder: iterator not implemented yet
    }

    @Override
    public boolean isEmpty() {
        return numberOfEntries==0;
    }

    @Override
    public Integer getSize() {
        return numberOfEntries;
    }

    @Override
    public void clear() {
        firstNode=null; // unreachable nodes are reclaimed by the garbage collector
        numberOfEntries=0;
    }

    private class Node{
        private K key;
        private V value;
        private Node next;
        // constructors
        private Node(K searchKey,V dataValue,Node next){
            key=searchKey;
            value=dataValue;
            this.next=next;
        }
        private Node(K searchKey,V dataValue){
            key=searchKey;
            value=dataValue;
        }
        private Node getNext(){
            return next;
        }
        private void setNext(Node next){
            this.next=next;
        }
        // get the key
        private K getKey(){
            return key;
        }
        // get the value
        private V getValue(){
            return value;
        }
        // set the value
        private void setValue(V dataValue){
            value=dataValue;
        }
    }
}

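Note: the above is only the simplest framework. Many robustness concerns, such as exception handling, are not addressed, so it remains fairly abstract.

Hashing

An array can provide room for dictionary entries, and if the index is known an entry can be accessed directly. Hashing is a technique that determines the index from the search key alone; this index is called the hash index.
Hash function: f(search key) = hash index
If the hash function maps every search key to a unique hash index, the hash function is perfect. But how could God allow perfect things to exist? With almost any hash function, different keys can end up with the same hash index; these are typical hash functions.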
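
As a small illustration of f(search key) = hash index, one common approach in Java is to reduce the key's hashCode() to the table length. The helper name hashIndex and the table length 11 used below are assumptions for this example and do not come from the text above.

```java
// Hypothetical helper: maps a search key to an index in [0, tableLength).
class HashIndexDemo {
    static int hashIndex(Object key, int tableLength) {
        // floorMod keeps the result non-negative even when hashCode() is negative
        return Math.floorMod(key.hashCode(), tableLength);
    }
}
```

With a table of length 11, the Integer keys 15 and 26 both map to index 4, since an Integer's hashCode() is its own value. Two different keys sharing one index is exactly the situation a typical hash function has to cope with.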

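Collisions

What is a collision? As an example: we said hashing lets us go straight from a key to its value, say f(ID card number) = a particular person. If different ID card numbers run through the hash function and come out as the same person, people bumping into each other, that is a collision.
So a good hash function should produce as few collisions as possible and be fast to compute. Resolving collisions is the key to hashing.

Resolving collisions

Open addressing

Put plainly: when a collision is detected, probe through the table until an unused slot is found.

Linear probing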

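 // linear probing: returns the slot holding key, or the first empty slot after index
 // assumes the table always keeps at least one empty slot, otherwise the loop never ends
    public int LinearIndex(K key,int index){
        boolean found=false;
        while(!found&&hashTable[index]!=null){
            if(key.equals(hashTable[index].getKey())){
                found=true;
            }else{
                index=(index+1)%hashTable.length; // step to the next slot, wrapping around
            }
        }
       return index; 
    }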

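Linear probing tends to fill runs of consecutive slots in the hash table. Each such run is called a cluster, and the phenomenon is known as primary clustering.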
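
To see how the probing routine fits into a table and how a cluster forms, here is a minimal self-contained sketch. It stores Integer keys only, uses key mod table length as the hash function, and never resizes or removes; all of these are simplifying assumptions, and the class name LinearProbingSketch is hypothetical.

```java
// Hypothetical sketch of an open-addressing table with linear probing (keys only).
class LinearProbingSketch {
    private final Integer[] hashTable;
    private int size;

    LinearProbingSketch(int capacity) {
        hashTable = new Integer[capacity];
    }

    // Returns the slot holding key, or the first empty slot on its probe path.
    int linearIndex(int key) {
        int index = Math.floorMod(key, hashTable.length);
        while (hashTable[index] != null && hashTable[index] != key) {
            index = (index + 1) % hashTable.length;
        }
        return index;
    }

    // Stores the key if it is new and returns the slot where it lives.
    int add(int key) {
        int index = linearIndex(key);
        if (hashTable[index] == null) {
            // keep one slot empty so that every probe loop is guaranteed to stop
            if (size == hashTable.length - 1) {
                throw new IllegalStateException("table is full");
            }
            hashTable[index] = key;
            size++;
        }
        return index;
    }

    boolean contains(int key) {
        return hashTable[linearIndex(key)] != null;
    }
}
```

In a table of length 11, adding 15, 26, and 37 (all of which hash to 4) fills slots 4, 5, and 6. Adding 5 afterwards, whose own hash index is 5, lands in slot 7 and makes the cluster even longer.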

Quadratic probing

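 // quadratic probing: probes home, home+1, home+4, home+9, ... (mod table length)
 // note: unlike linear probing, this sequence is not guaranteed to reach every slot
    public int QuadraticIndex(K key,int index){
        boolean found=false;
        int home=index; // slot given by the hash function
        int i=1;        // probe number
        while(!found&&hashTable[index]!=null){
            if(key.equals(hashTable[index].getKey())){
                found=true;
            }else{
                index=(home+i*i)%hashTable.length; // offset is measured from the home slot
                i++;
            }
        }
       return index; 
    }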
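
To make the probe pattern concrete, the sketch below lists the slots that quadratic probing visits, computed as (home + i*i) mod table length for i = 0, 1, 2, and so on. The class and method names are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: lists the slots visited by quadratic probing.
class QuadraticProbeDemo {
    static List<Integer> probeSequence(int home, int tableLength, int probes) {
        List<Integer> slots = new ArrayList<>();
        for (int i = 0; i < probes; i++) {
            slots.add((home + i * i) % tableLength); // offsets 0, 1, 4, 9, 16, ...
        }
        return slots;
    }
}
```

For a home slot of 4 in a table of length 11, the first five probes land on 4, 5, 8, 2, and 9, so successive probes spread out across the table instead of filling neighbouring slots.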