Data Structures Study Notes (4): Strings

零号元素

Article link: https://blog.csdn.net/weixin_36904568/article/details/88593537

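I: Basic Concepts

A string is a finite sequence of zero or more characters.

The length n of a string is a finite number. A string with zero characters is called the empty string, and its length is zero.

Adjacent characters in a string stand in a predecessor/successor relationship.

Substring and main string: any run of consecutive characters in a string is a substring of that string, and the string that contains the substring is called the main string. The position of a substring in the main string is the index, within the main string, of the substring's first character.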

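II: Comparing Strings

Strings are compared character by character, from left to right, using the character encodings. The first position at which the two strings differ decides the result.
If all compared characters are equal, that is, one string is a prefix of the other, the lengths are compared: the shorter string is the smaller one, and two strings are equal only when their lengths and all of their characters are the same.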
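
The comparison rule can be sketched in Java as follows. This is only an illustration: the class name `StrCompareDemo` and the method `strCompare` are made up for these notes, and the code compares UTF-16 code units, which is also what Java's own `String.compareTo` does.

```java
final class StrCompareDemo {
    /**
     * Returns a negative number, zero, or a positive number when s is
     * less than, equal to, or greater than t.
     */
    static int strCompare(String s, String t) {
        int common = Math.min(s.length(), t.length());
        for (int i = 0; i < common; i++) {
            char a = s.charAt(i);
            char b = t.charAt(i);
            if (a != b) {
                return a - b;               // the first differing character decides
            }
        }
        // All compared characters are equal: the shorter string is smaller
        return s.length() - t.length();
    }

    public static void main(String[] args) {
        System.out.println(strCompare("happen", "happy") < 0); // 'e' comes before 'y'
        System.out.println(strCompare("hap", "happy") < 0);    // a proper prefix is smaller
        System.out.println(strCompare("happy", "happy") == 0);
    }
}
```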

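III: Abstract Data Type

StrAssign(T, *chars): creates a string T whose value equals the string constant chars.
StrCopy(T, S): string S exists; copies S into T.
ClearString(S): if S exists, clears it so that it becomes the empty string.
StringEmpty(S): returns true if string S is empty, otherwise false.
StrLength(S): returns the number of characters in S, that is, the length of the string.
StrCompare(S, T): returns a value > 0 if S > T, 0 if S = T, and a value < 0 if S < T.
Concat(T, S1, S2): returns in T the new string formed by joining S1 and S2.
SubString(Sub, S, pos, len): string S exists, 1 <= pos <= StrLength(S) and 0 <= len <= StrLength(S) - pos + 1; returns in Sub the substring of S that starts at the pos-th character and has length len.
Index(S, T, pos): strings S and T exist, T is non-empty, and 1 <= pos <= StrLength(S). If the main string S contains a substring equal to T, returns the position of its first occurrence in S from the pos-th character onward; otherwise returns 0.
Replace(S, T, V): strings S, T and V exist and T is non-empty. Replaces with V every non-overlapping substring of S that equals T.
StrInsert(S, pos, T): strings S and T exist and 1 <= pos <= StrLength(S) + 1; inserts T before the pos-th character of S.
StrDelete(S, pos, len): string S exists and 1 <= pos <= StrLength(S) - len + 1; deletes from S the substring that starts at the pos-th character and has length len.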
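
To make the 1-based position conventions concrete, here is a small Java sketch of SubString and Index built on `java.lang.String`. The class `StringAdtDemo` and its method names are hypothetical, and the sketch assumes that the search in Index includes the pos-th character itself.

```java
final class StringAdtDemo {
    /** SubString: the substring of s that starts at the pos-th character (1-based) and has length len. */
    static String subString(String s, int pos, int len) {
        if (pos < 1 || pos > s.length() || len < 0 || len > s.length() - pos + 1) {
            throw new IllegalArgumentException("pos or len out of range");
        }
        return s.substring(pos - 1, pos - 1 + len);
    }

    /** Index: 1-based position of the first occurrence of t in s starting at pos, or 0 if there is none. */
    static int index(String s, String t, int pos) {
        if (t.isEmpty() || pos < 1 || pos > s.length()) {
            throw new IllegalArgumentException("t must be non-empty and 1 <= pos <= length of s");
        }
        int n = s.length();
        int m = t.length();
        // Take every candidate substring of length m and compare it with t
        for (int i = pos; i <= n - m + 1; i++) {
            if (subString(s, i, m).equals(t)) {
                return i;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(subString("hello", 2, 3));        // ell
        System.out.println(index("goodgoogle", "google", 1)); // 5
    }
}
```

Building Index from SubString and a comparison keeps it independent of how the string is stored, at the cost of creating a temporary substring for every candidate position.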

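IV: Storage Structures

1. Sequential storage

(1) Definition

The character sequence of the string is stored in a group of contiguous storage cells. Each string variable is allocated a fixed-length storage area of a predefined size, usually declared as a fixed-length array.

(2) Attributes

The predefined maximum string length
The actual string length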
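
A minimal sketch of such a fixed-capacity string in Java is shown below. The class name `SeqString`, the capacity `MAX_SIZE = 40`, and the choice to reject over-long results with an exception (rather than truncate them) are all assumptions made for illustration.

```java
final class SeqString {
    static final int MAX_SIZE = 40;                   // predefined maximum length (arbitrary value)

    private final char[] data = new char[MAX_SIZE];   // contiguous, fixed-length storage
    private int length;                                // actual length of the string

    SeqString(String chars) {
        if (chars.length() > MAX_SIZE) {
            throw new IllegalArgumentException("string longer than MAX_SIZE");
        }
        chars.getChars(0, chars.length(), data, 0);
        length = chars.length();
    }

    int length() {
        return length;
    }

    /** Returns a new string holding this string followed by other. */
    SeqString concat(SeqString other) {
        if (length + other.length > MAX_SIZE) {
            throw new IllegalStateException("result exceeds MAX_SIZE");
        }
        SeqString result = new SeqString("");
        System.arraycopy(data, 0, result.data, 0, length);
        System.arraycopy(other.data, 0, result.data, length, other.length);
        result.length = length + other.length;
        return result;
    }

    @Override
    public String toString() {
        return new String(data, 0, length);            // only the first length cells are meaningful
    }
}
```

The fixed capacity is what makes operations such as concatenation awkward: the result may simply not fit in the preallocated area.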

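2. Linked storage

(1) Definition

A node may hold a single character or several characters. If the last node is not completely filled, its remaining slots can be padded with a filler character.

(2) Characteristics

Concatenating strings is convenient.
It is less flexible than sequential storage, and its performance is generally worse as well.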
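
The node layout can be sketched as follows. The names `ChunkString`, `CHUNK` and `PAD`, the choice of four characters per node, and the use of '#' as the filler are assumptions for this example only.

```java
final class ChunkString {
    static final int CHUNK = 4;      // characters per node (arbitrary value)
    static final char PAD = '#';     // filler for unused slots in the last node

    private static final class Node {
        final char[] ch = new char[CHUNK];
        Node next;
    }

    private Node head;
    private final int length;        // number of real characters, excluding padding

    ChunkString(String s) {
        Node tail = null;
        for (int i = 0; i < s.length(); i += CHUNK) {
            Node node = new Node();
            for (int k = 0; k < CHUNK; k++) {
                node.ch[k] = (i + k < s.length()) ? s.charAt(i + k) : PAD;
            }
            if (tail == null) {
                head = node;
            } else {
                tail.next = node;
            }
            tail = node;
        }
        length = s.length();
    }

    int nodeCount() {
        int count = 0;
        for (Node n = head; n != null; n = n.next) {
            count++;
        }
        return count;
    }

    /** The stored characters of all nodes, padding included. */
    String raw() {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            sb.append(n.ch);
        }
        return sb.toString();
    }

    @Override
    public String toString() {
        return raw().substring(0, length);   // drop the padding
    }
}
```

For example, `new ChunkString("abcdef")` uses two nodes whose raw contents are `abcd` and `ef##`.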

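3. Trie

(1) Definition

A trie is made of nodes joined by links. Each link is either null or points to another node, and each link corresponds to one character. Every node except the root is pointed to by exactly one parent node, and the string it represents extends the prefix represented by its parent by one character. Each node has R links, where R is the size of the alphabet.

(2) Attributes

The predefined number of characters in the alphabet
Nodes that hold a key's value together with one link per character

(3) Characteristics

For any given set of keys the trie is unique, regardless of the order in which the keys are inserted.
A search or insertion visits at most (key length + 1) nodes.
The cost of a search miss does not depend on the key length: the search stops at the first null link, and for N random keys it examines about log_R N nodes on average.
The space required depends on the key lengths and the alphabet size: a trie with N keys of average length w has between RN and RNw links.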

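/**
 * Trie (R-way search tree) that maps string keys to values.
 * Relies on the Alphabet, StringTreeNode and LinkedQueue classes, which are not shown in these notes.
 * @param <T> type of the values stored with the keys
 */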
public class StringSearchTree<T>{

    private int R;

    private Alphabet alphabet;

    private StringTreeNode<T> root;

    public StringSearchTree() {
        alphabet = Alphabet.EXTENDED_ASCII;
        R = alphabet.R();
        root = new StringTreeNode<>(R);
    }

    public StringSearchTree(Alphabet alphabet) {
        this.alphabet = alphabet;
        R = alphabet.R();
        root = new StringTreeNode<>(R);
    }

    // Returns the value stored for key, or null if the key is absent
    public T get(String key){
        StringTreeNode<T> node = get(key,root,0);
        if (node == null)
            return null;
        return node.getValue();
    }

    // Returns the node reached by following the characters of key, or null if the path breaks off
    private StringTreeNode<T> get(String key,StringTreeNode<T> node,int digit){
        if (node == null)
            return null;
        // All characters of the key have been consumed
        if (digit == key.length())
            return node;
        int index = alphabet.toIndex(key.charAt(digit));
        // Continue in the subtree for the current character
        return get(key,node.getNodes()[index],digit+1);
    }

    // Inserts the key-value pair, overwriting the old value if the key already exists
    public void put(String key,T value){
        root = put(root,key,value,0);
    }

    private StringTreeNode<T> put(StringTreeNode<T> node,String key,T value,int digit){
        // Create missing nodes along the path
        if (node == null)
            node = new StringTreeNode<>(R);
        // The node for the last character stores the value
        if (digit == key.length())
        {
            node.setValue(value);
            return node;
        }
        int index = alphabet.toIndex(key.charAt(digit));
        node.getNodes()[index] = put(node.getNodes()[index],key,value,digit+1);
        return node;
    }

    // Removes the key and prunes nodes that no longer lead to any key
    public void delete(String key){
        root = delete(key,root,0);
    }

    public StringTreeNode<T> delete(String key,StringTreeNode<T> node,int digit){
        // The key is not in the trie
        if (node == null)
            return null;
        // Reached the node for the key: clear its value, then check its links below
        if (digit == key.length())
            node.setValue(null);
        else
        {
            int index = alphabet.toIndex(key.charAt(digit));
            node.getNodes()[index] = delete(key,node.getNodes()[index],digit+1);
        }
        // A node that still holds a value belongs to another key and must stay
        if (node.getValue() != null)
            return node;
        // A node with at least one non-null link must stay as well
        for (int i = 0; i < R; i++) {
            if (node.getNodes()[i] != null)
                return node;
        }
        // No value and no children: remove this node too
        return null;
    }

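    // All keys in the trie
    public Iterable<String> keys(){
        return keysWithPrefix("");
    }

    // All keys that start with the given prefix
    public Iterable<String> keysWithPrefix(String prefix){
        // Results are gathered in a queue
        LinkedQueue<String> queue = new LinkedQueue<>();
        // Start collecting at the node that represents the prefix
        collect(get(prefix,root,0),prefix,queue);
        return queue;
    }

    // All keys that match the pattern, where '.' matches any single character
    public Iterable<String> keysWithMatch(String pattern){
        // Results are gathered in a queue
        LinkedQueue<String> queue = new LinkedQueue<>();
        // Start matching at the root
        collect(root,"",pattern,queue);
        return queue;
    }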

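    // Collects every key in the subtree rooted at node; prefix is the string that node represents
    private void collect(StringTreeNode<T> node,String prefix,LinkedQueue<String> queue){
        // No such node: nothing to collect
        if (node == null)
            return;
        // The node holds a value, so the prefix itself is a key
        if (node.getValue() != null)
            queue.enQueue(prefix);
        // Continue downward through every link
        for (int i = 0; i < R; i++)
        {
            char c = alphabet.toChar(i);
            collect(node.getNodes()[i],prefix+c,queue);
        }
    }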

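    // Collects every key of the same length as pattern that matches it
    private void collect(StringTreeNode<T> node,String prefix,String pattern,LinkedQueue<String> queue){
        // No such node: nothing to collect
        if (node == null)
            return;
        int len = prefix.length();
        // The whole pattern has been matched: the prefix is a result if it is a key
        if (len == pattern.length() && node.getValue() != null)
            queue.enQueue(prefix);
        // Keys longer than the pattern cannot match
        if (len == pattern.length())
            return;
        char x = pattern.charAt(len);
        // Follow the link for the pattern character, or every link for the wildcard
        for (int i = 0; i < R; i++)
        {
            char c = alphabet.toChar(i);
            if (x == c || x == '.')
                // Pass the pattern along so the remaining characters are matched as well
                collect(node.getNodes()[i],prefix+c,pattern,queue);
        }
    }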

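    // Returns the longest key that is a prefix of the query string
    public String longestPrefix(String query){
        int len = search(root,query,0,0);
        return query.substring(0,len);
    }

    private int search(StringTreeNode<T> node,String str,int digit,int len){
        // Fell off the trie: report the longest match found so far
        if (node == null)
            return len;
        // This node ends a key, so the first digit characters of str form a key
        if (node.getValue() != null)
            len = digit;
        if (digit == str.length())
            return len;
        // Map the character through the alphabet rather than using its raw code as an index
        int index = alphabet.toIndex(str.charAt(digit));
        return search(node.getNodes()[index],str,digit+1,len);
    }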

    // Prints every key with its value; %s works for any value type, not only Integer
    public void print(){
        for (String key : keys()) {
            System.out.printf("%s[%s] ",key,get(key));
        }
    }
}

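4. Ternary search trie

(1) Definition

Each node holds one character, one value, and three links, which lead to the keys whose current character is less than, equal to, or greater than the node's character.

(2) Attributes

The predefined alphabet of characters that keys may contain
Nodes that hold a character, a value, and three links

(3) Characteristics

Low space usage: each node has three links instead of one link per alphabet character.
Relatively few comparisons are needed.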

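/**
 * Ternary search trie that maps non-empty string keys to values.
 * Relies on the Alphabet, ThreeStringTreeNode and LinkedQueue classes, which are not shown in these notes.
 * @param <T> type of the values stored with the keys
 */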
public class ThreeStringTree<T> {

    private int R;

    private Alphabet alphabet;

    private ThreeStringTreeNode<T> root;

    public ThreeStringTree() {
        this.alphabet = Alphabet.EXTENDED_ASCII;
        this.R = alphabet.R();
    }

    public ThreeStringTree(Alphabet alphabet) {
        this.alphabet = alphabet;
        this.R = alphabet.R();
    }

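    // Returns the value stored for key, or null if the key is absent
    public T get(String key){
        // An empty key cannot be stored, because every node holds one character
        if (key.isEmpty())
            return null;
        ThreeStringTreeNode<T> node = get(key,root,0);
        if (node == null)
            return null;
        return node.getValue();
    }

    // Returns the node that holds the last character of key, or null if there is none
    public ThreeStringTreeNode<T> get(String key,ThreeStringTreeNode<T> node,int digit){
        // Special case used when collecting keys: an empty key maps to the root
        if (key.length() == 0)
            return root;
        if (node == null)
            return null;
        int index = alphabet.toIndex(key.charAt(digit));
        // Compare with the node's character to choose one of the three links
        if (alphabet.toChar(index) < node.getKey())
            return get(key, node.getLeft(), digit);
        else if (alphabet.toChar(index) > node.getKey())
            return get(key, node.getRight(), digit);
        // Characters match: move on to the next character unless this was the last one
        else if (digit < key.length() - 1)
            return get(key, node.getMid(), digit+1);
        else
            return node;
    }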

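    // Inserts the key-value pair, overwriting the old value if the key already exists
    public void put(String key,T value){
        // Every node stores one character, so an empty key has nowhere to live
        if (key.isEmpty())
            throw new IllegalArgumentException("key must not be empty");
        root = put(key,value,root,0);
    }

    public ThreeStringTreeNode<T> put(String key,T value,ThreeStringTreeNode<T> node,int digit){
        // Create a node for the current character if the path ends here
        if (node == null)
            node =  new ThreeStringTreeNode<T>(key.charAt(digit));
        int index = alphabet.toIndex(key.charAt(digit));
        // Compare with the node's character to choose one of the three links
        if (alphabet.toChar(index) < node.getKey())
            node.setLeft(put(key,value, node.getLeft(), digit));
        else if (alphabet.toChar(index) > node.getKey())
            node.setRight(put(key,value, node.getRight(), digit));
        // Characters match: move on to the next character unless this was the last one
        else if (digit < key.length() - 1)
            node.setMid(put(key,value, node.getMid(), digit+1));
        else
            node.setValue(value);
        return node;
    }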


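    // All keys in the tree
    public Iterable<String> keys(){
        return keysWithPrefix("");
    }

    // All keys that start with the given prefix
    public Iterable<String> keysWithPrefix(String prefix){
        // Results are gathered in a queue
        LinkedQueue<String> queue = new LinkedQueue<>();
        // An empty prefix matches every key, so walk the whole tree
        if (prefix.isEmpty())
        {
            collect(root,"",queue);
            return queue;
        }
        // get() returns the node that holds the last character of the prefix
        ThreeStringTreeNode<T> node = get(prefix,root,0);
        if (node == null)
            return queue;
        // The prefix itself may be a key
        if (node.getValue() != null)
            queue.enQueue(prefix);
        // Longer keys live in the middle subtree; the left and right links
        // of this node lead to keys that do not share the prefix
        collect(node.getMid(),prefix,queue);
        return queue;
    }

    // All keys that match the pattern, where '.' matches any single character
    public Iterable<String> keysWithMatch(String pattern){
        // Results are gathered in a queue
        LinkedQueue<String> queue = new LinkedQueue<>();
        // No key is empty, so an empty pattern matches nothing
        if (pattern.isEmpty())
            return queue;
        // Start matching at the root
        collect(root,"",pattern,queue,0);
        return queue;
    }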

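    // Collects every key reachable from node; prefix does not yet include node's own character
    private void collect(ThreeStringTreeNode<T> node,String prefix,LinkedQueue<String> queue){
        // No such node: nothing to collect
        if (node == null)
            return;
        // Traverse from left to right so the keys come out in sorted order
        collect(node.getLeft(),prefix,queue);
        // The node holds a value, so prefix plus its character is a key
        if (node.getValue() != null)
            queue.enQueue(prefix + node.getKey());
        // Continue downward with the node's character appended
        collect(node.getMid(),prefix+node.getKey(),queue);
        collect(node.getRight(),prefix,queue);
    }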

    // Collects every key of the same length as pattern that matches it; digit is the pattern position being matched
    private void collect(ThreeStringTreeNode<T> node,String prefix,String pattern,LinkedQueue<String> queue,int digit){
        // No such node: nothing to collect
        if (node == null)
            return;
        char x = pattern.charAt(digit);
        // Search from left to right; the wildcard follows all three links
        char key = node.getKey();
        if (x < key || x == '.')
            collect(node.getLeft(),prefix,pattern,queue,digit);
        if (x == key || x == '.')
        {
            // Last pattern character matched: this is a result if the node holds a value
            if (digit == pattern.length() - 1 && node.getValue() != null)
                queue.enQueue(prefix + node.getKey());
            if (digit < pattern.length() - 1)
                collect(node.getMid(),prefix + node.getKey(),pattern,queue,digit + 1);
        }
        if (x > key || x == '.')
            collect(node.getRight(),prefix,pattern,queue,digit);
    }

    // Returns the longest key that is a prefix of the query string
    public String longestPrefix(String query){
        int len = search(root,query,0,0);
        return query.substring(0,len);
    }

    private int search(ThreeStringTreeNode<T> node,String str,int digit,int len){
        // Fell off the tree or consumed the whole string: report the longest match so far
        if (node == null || digit == str.length())
            return len;

        char c = str.charAt(digit);
        char key = node.getKey();
        if (c < key)
            return search(node.getLeft(),str,digit,len);
        else if (c > key)
            return search(node.getRight(),str,digit,len);
        // Only a node whose character matches can end a key that is a prefix of str
        if (node.getValue() != null)
            len = digit + 1;
        return search(node.getMid(),str,digit+1,len);
    }

    // Prints every key with its value; %s works for any value type, not only Integer
    public void print(){
        for (String key : keys()) {
            System.out.printf("%s[%s] ",key,get(key));
        }
        System.out.println();
    }
}

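V: String Matching

1. Naive pattern matching algorithm

Treat every character of the main string S as the possible start of a substring and compare that substring with the pattern T. An outer loop runs over S; for each starting character an inner loop of up to length(T) comparisons runs, until a match is found or the whole main string has been scanned.

Let i be the index into the main string and j the index into the pattern.
Keep looping while both i and j are within the range of their strings.
If the characters at i and j are equal, advance both indices.
Otherwise, move i back to the position just after where the current attempt started, and reset j to the start of the pattern.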
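
The four steps above can be sketched in Java with 0-based indices. The class and method names are made up for illustration, and the sketch assumes `from` is not negative.

```java
final class NaiveMatch {
    /**
     * Returns the 0-based index of the first occurrence of t in s at or
     * after position from, or -1 if t does not occur there.
     */
    static int indexOf(String s, String t, int from) {
        int i = from;   // index into the main string
        int j = 0;      // index into the pattern
        while (i < s.length() && j < t.length()) {
            if (s.charAt(i) == t.charAt(j)) {
                i++;    // characters match: advance both indices
                j++;
            } else {
                i = i - j + 1;  // back to the position after this attempt's start
                j = 0;          // restart the pattern
            }
        }
        // j ran off the end of the pattern exactly when every character matched
        return j == t.length() ? i - j : -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOf("goodgoogle", "google", 0)); // 4
        System.out.println(indexOf("abc", "d", 0));             // -1
    }
}
```

In the worst case, for example searching for `aaab` in a long run of `a`, nearly every attempt compares almost the whole pattern before failing, so the running time grows with length(S) × length(T).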

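2. KMP algorithm

(1): Definition

When the pattern contains repeated characters, some of the comparisons made by the naive algorithm are unnecessary and can be skipped.
i never has to move backward; only j falls back, to the position given by next[j].

(2): Computing the next[j] array

Positions are counted from 1. Let S be the part of the pattern T from position 1 to position j-1.
If S is empty (j = 1), then next[j] = 0.
If no proper prefix of S equals a suffix of S, then next[j] = 1.
If a proper prefix of S equals a suffix of S, then next[j] = (length of the longest such prefix) + 1.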