Algorithm: Trie (Dictionary Tree)

Don't_Touch_Me

Article link: https://blog.csdn.net/assiduous_me/article/details/105509938

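Concept:
A trie, also called a dictionary tree or word-lookup tree, is a tree structure; Baidu Baike describes it as a variant of the hash tree. It is typically used to count, sort, and store large numbers of strings (though it is not limited to strings), so search engines often use it for word-frequency statistics. Its advantage is that it uses the common prefixes of strings to cut lookup time and avoid unnecessary string comparisons: looking up a key of length L visits at most L nodes, no matter how many strings are stored. (Adapted from Baidu Baike)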
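
The word-frequency use case mentioned above can be sketched as follows. This is only an illustration, assuming words made of lowercase a-z; the class `WordCounter` and its methods `add` and `count` are hypothetical names, and the node layout it relies on is the one explained in the design section of this article, with a counter in place of the end flag.

```java
// Hypothetical sketch: each node keeps a counter instead of a boolean end flag.
// Assumes every word consists of lowercase a-z only.
class WordCounter {
    private static class Node {
        final Node[] children = new Node[26];
        int count; // how many times the word ending at this node was added
    }

    private final Node root = new Node();

    // Add one occurrence of the word.
    void add(String word) {
        Node ptr = root;
        for (char c : word.toCharArray()) {
            int i = c - 'a';
            if (ptr.children[i] == null) {
                ptr.children[i] = new Node();
            }
            ptr = ptr.children[i];
        }
        ptr.count++;
    }

    // Return how many times the word was added (0 if it was never added).
    int count(String word) {
        Node ptr = root;
        for (char c : word.toCharArray()) {
            int i = c - 'a';
            if (i < 0 || i >= 26 || ptr.children[i] == null) {
                return 0;
            }
            ptr = ptr.children[i];
        }
        return ptr.count;
    }

    public static void main(String[] args) {
        WordCounter counter = new WordCounter();
        for (String w : "to be or not to be".split(" ")) {
            counter.add(w);
        }
        System.out.println(counter.count("to")); // 2
        System.out.println(counter.count("or")); // 1
        System.out.println(counter.count("t"));  // 0: "t" is only a prefix
    }
}
```

Words that share a prefix, such as "to" and "tone", would share the nodes for that prefix, so each lookup costs one step per character.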

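Illustration:

The figure above shows a trie that stores the strings aa, aaeb, aaec, ad, b, ba, cb, ca, cac.
Blue nodes are end markers: each one marks the end of a stored string.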

Design:
Node definition:
How should a node of the trie above be defined?

class TrieNode {
    TrieNode[] children;
    boolean isEnd;
}

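children holds the references to the current node's child nodes, and isEnd records whether the current node is an end marker.
Note that the TrieNode[] children array has a fixed size: if the stored strings contain only lowercase letters, that size is 26.
Each array index stands for one character; for example, children[0] is the reference to the node for lowercase a.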
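
The index mapping described above can be sketched as follows. This is a minimal illustration, assuming input restricted to lowercase a-z; the names `IndexMapping`, `toIndex`, and `toChar` are hypothetical and are not part of the original code.

```java
// Hypothetical helper illustrating the slot mapping; assumes lowercase a-z only.
class IndexMapping {
    static final int ALPHABET_SIZE = 26;

    // Map a lowercase letter to its slot in the children array: 'a' -> 0, 'z' -> 25.
    static int toIndex(char c) {
        if (c < 'a' || c > 'z') {
            throw new IllegalArgumentException("only lowercase a-z is supported: " + c);
        }
        return c - 'a';
    }

    // Inverse mapping: slot 0 -> 'a', slot 25 -> 'z'.
    static char toChar(int index) {
        if (index < 0 || index >= ALPHABET_SIZE) {
            throw new IllegalArgumentException("index out of range: " + index);
        }
        return (char) (index + 'a');
    }

    public static void main(String[] args) {
        System.out.println(toIndex('a')); // 0
        System.out.println(toIndex('z')); // 25
        System.out.println(toChar(4));    // e
    }
}
```

The range check is an addition for illustration: without it, a character outside a-z produces an index outside the 26-slot array.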

Implementation:

public class Trie {

    private TrieNode root;

    private class TrieNode {
        TrieNode[] children;
        boolean isEnd;
    }

    Trie() {
        root = new TrieNode();
        root.children = new TrieNode[26];
        root.isEnd = false;
    }

    /**
     * Inserts a word into the trie.
     * Assumes the word contains only lowercase letters a-z.
     * @param word the word to insert
     */
    public void insert(String word) {
        int index = 0;
        TrieNode ptr = root;
        while (index < word.length()) {
            int position = word.charAt(index) - 'a';
            if (ptr.children[position] == null) {
                TrieNode node = new TrieNode();
                node.children = new TrieNode[26];
                node.isEnd = false;
                ptr.children[position] = node;
            }
            ptr = ptr.children[position];
            index++;
        }
        ptr.isEnd = true;
    }

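    /**
     * Checks whether the trie contains the given word.
     * Assumes the word contains only lowercase letters a-z.
     * @param word the word to look up
     * @return true if the word exists, false otherwise
     */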
    public boolean search(String word) {
        int index = 0;
        TrieNode ptr = root;
        while (index < word.length()) {
            int position = word.charAt(index) - 'a';
            if (ptr.children[position] == null) {
                return false;
            }
            ptr = ptr.children[position];
            index++;
        }
        return ptr.isEnd;
    }

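    /**
     * Checks whether the trie contains any word that starts with the given prefix.
     * Assumes the prefix contains only lowercase letters a-z.
     * @param prefix the prefix to look up
     * @return true if such a word exists, false otherwise
     */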
    public boolean startsWith(String prefix) {
        int index = 0;
        TrieNode ptr = root;
        while (index < prefix.length()) {
            int position = prefix.charAt(index) - 'a';
            if (ptr.children[position] == null) {
                return false;
            }
            ptr = ptr.children[position];
            index++;
        }
        return true;
    }

}
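
The implementation above relies on a 26-slot array, so it only handles lowercase a-z. When the alphabet is not fixed, one common alternative is to keep the children in a map. The following is a sketch of that idea, not the original author's code; the class `MapTrie` and its private helper `walk` are hypothetical names. It also shows the difference between `search` and `startsWith`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical variant: children are stored in a map, so any char can be a key.
class MapTrie {
    private static class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean isEnd;
    }

    private final Node root = new Node();

    void insert(String word) {
        Node ptr = root;
        for (char c : word.toCharArray()) {
            // Create the child on first use, then descend into it.
            ptr = ptr.children.computeIfAbsent(c, k -> new Node());
        }
        ptr.isEnd = true;
    }

    // Follow the path spelled by s; returns null if the path does not exist.
    private Node walk(String s) {
        Node ptr = root;
        for (char c : s.toCharArray()) {
            ptr = ptr.children.get(c);
            if (ptr == null) {
                return null;
            }
        }
        return ptr;
    }

    // A word is present only if its path exists AND ends on an end marker.
    boolean search(String word) {
        Node node = walk(word);
        return node != null && node.isEnd;
    }

    // A prefix is present as soon as its path exists.
    boolean startsWith(String prefix) {
        return walk(prefix) != null;
    }

    public static void main(String[] args) {
        MapTrie trie = new MapTrie();
        trie.insert("cac");
        trie.insert("Ad-1");
        System.out.println(trie.search("ca"));     // false: "ca" is only a prefix here
        System.out.println(trie.startsWith("ca")); // true
        System.out.println(trie.search("Ad-1"));   // true
    }
}
```

The trade-off is the usual one: the map accepts any character and allocates nothing for unused letters, while the fixed array gives a direct index lookup per character.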

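Collecting all words in the trie (add this method to the Trie class; it needs import java.util.List):

/**
 * Collects every word stored in the trie by depth-first traversal.
 * @param trieNode the current node
 * @param wordList the list that receives the words
 * @param word buffer holding the letters on the path from the root to the current node
 */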
public void getAllWord(TrieNode trieNode, List<String> wordList, StringBuilder word) {
	for (int i = 0; i < 26; i++) {
		if (trieNode.children[i] != null) {
			word.append((char) (i + 'a'));
			if (trieNode.children[i].isEnd) {
				wordList.add(word.toString());
			}
			getAllWord(trieNode.children[i], wordList, word);
			word.deleteCharAt(word.length() - 1);
		}
	}
}
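
Because the traversal above visits the children in index order (a first, z last), the words come out in lexicographic order, which is how a trie can be used for sorting. The following self-contained sketch demonstrates this under the same lowercase a-z assumption; `SortedWalk`, `collect`, and `sortedWords` are hypothetical names. Unlike the method above, it checks the end marker on entering a node, so it would also report an empty string if one had been inserted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: a depth-first walk over children[0..25] yields sorted words.
class SortedWalk {
    static class Node {
        final Node[] children = new Node[26];
        boolean isEnd;
    }

    static void insert(Node root, String word) {
        Node ptr = root;
        for (char c : word.toCharArray()) {
            int i = c - 'a'; // assumes lowercase a-z
            if (ptr.children[i] == null) {
                ptr.children[i] = new Node();
            }
            ptr = ptr.children[i];
        }
        ptr.isEnd = true;
    }

    // Depth-first traversal; path holds the letters from the root to node.
    static void collect(Node node, StringBuilder path, List<String> out) {
        if (node.isEnd) {
            out.add(path.toString());
        }
        for (int i = 0; i < 26; i++) {
            if (node.children[i] != null) {
                path.append((char) (i + 'a'));
                collect(node.children[i], path, out);
                path.deleteCharAt(path.length() - 1); // backtrack
            }
        }
    }

    // Insert the words in the given order and read them back sorted.
    static List<String> sortedWords(String... words) {
        Node root = new Node();
        for (String w : words) {
            insert(root, w);
        }
        List<String> out = new ArrayList<>();
        collect(root, new StringBuilder(), out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sortedWords("cb", "b", "aaec", "ad", "aa"));
        // prints [aa, aaec, ad, b, cb]
    }
}
```

Note that duplicates collapse into one entry, since inserting the same word twice only sets the same end marker again.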

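Finding all words that start with a given prefix (add this method to the Trie class; it needs import java.util.List):

/**
 * Collects every word in the trie that starts with the given prefix.
 * While index is inside the prefix, the method follows the prefix letters;
 * once the prefix is consumed, it traverses the whole subtree.
 * @param trieNode the current node
 * @param wordList the list that receives the words
 * @param word buffer holding the letters on the path from the root to the current node
 * @param prefix the given prefix (lowercase a-z only)
 * @param index current position in the prefix
 */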
public void searchByPrefix(TrieNode trieNode, List<String> wordList, StringBuilder word, String prefix, int index) {
	if (index < prefix.length()) {
		char ch = prefix.charAt(index);
		int position = ch - 'a';
		if (trieNode.children[position] != null) {
			word.append((char) (position + 'a'));
			if (index == prefix.length() - 1 && trieNode.children[position].isEnd) {
				wordList.add(word.toString());
			}
			searchByPrefix(trieNode.children[position], wordList, word, prefix, index + 1);
			word.deleteCharAt(word.length() - 1);
		}
	} else {
		for (int i = 0; i < 26; i++) {
			if (trieNode.children[i] != null) {
				word.append((char) (i + 'a'));
				if (trieNode.children[i].isEnd) {
					wordList.add(word.toString());
				}
				searchByPrefix(trieNode.children[i], wordList, word, prefix, index + 1);
				word.deleteCharAt(word.length() - 1);
			}
		}
	}
}
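
The same result can also be obtained in two separate phases: first walk down to the node where the prefix ends, then collect everything below it. This is an alternative formulation, not the original method; it is a sketch assuming lowercase a-z input, and `PrefixSearch`, `withPrefix`, and `collect` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical two-phase prefix search: locate the prefix node, then traverse its subtree.
class PrefixSearch {
    static class Node {
        final Node[] children = new Node[26];
        boolean isEnd;
    }

    static void insert(Node root, String word) {
        Node ptr = root;
        for (char c : word.toCharArray()) {
            int i = c - 'a'; // assumes lowercase a-z
            if (ptr.children[i] == null) {
                ptr.children[i] = new Node();
            }
            ptr = ptr.children[i];
        }
        ptr.isEnd = true;
    }

    static List<String> withPrefix(Node root, String prefix) {
        // Phase 1: follow the prefix; give up if the path breaks.
        Node ptr = root;
        for (char c : prefix.toCharArray()) {
            int i = c - 'a';
            if (i < 0 || i >= 26 || ptr.children[i] == null) {
                return new ArrayList<>();
            }
            ptr = ptr.children[i];
        }
        // Phase 2: collect every word in the subtree, starting from the prefix itself.
        List<String> out = new ArrayList<>();
        collect(ptr, new StringBuilder(prefix), out);
        return out;
    }

    static void collect(Node node, StringBuilder path, List<String> out) {
        if (node.isEnd) {
            out.add(path.toString());
        }
        for (int i = 0; i < 26; i++) {
            if (node.children[i] != null) {
                path.append((char) (i + 'a'));
                collect(node.children[i], path, out);
                path.deleteCharAt(path.length() - 1); // backtrack
            }
        }
    }

    public static void main(String[] args) {
        Node root = new Node();
        for (String w : new String[] {"aa", "aaeb", "aaec", "ad", "b", "ba", "cb", "ca", "cac"}) {
            insert(root, w);
        }
        System.out.println(withPrefix(root, "aa")); // [aa, aaeb, aaec]
        System.out.println(withPrefix(root, "ca")); // [ca, cac]
        System.out.println(withPrefix(root, "x"));  // []
    }
}
```

Separating the two phases keeps the recursion free of the prefix and index parameters, at the cost of one extra helper.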

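Test (place this main method inside the Trie class, since it reads the private root field; it needs import java.util.LinkedList):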

public static void main(String[] args) {
	Trie trie = new Trie();
	trie.insert("aa");
	trie.insert("aaeb");
	trie.insert("aaec");
	trie.insert("ad");
	trie.insert("b");
	trie.insert("ba");
	trie.insert("cb");
	trie.insert("ca");
	trie.insert("cac");
	System.out.println("All words in the trie:");
	LinkedList<String> wordList = new LinkedList<>();
	trie.getAllWord(trie.root, wordList, new StringBuilder());
	for (String word : wordList) {
		System.out.println(word);
	}
	System.out.println("Words in the trie with the prefix aa:");
	wordList = new LinkedList<>();
	trie.searchByPrefix(trie.root, wordList, new StringBuilder(), "aa", 0);
	for (String word : wordList) {
		System.out.println(word);
	}
}