Trie (Prefix Tree)
Trie overview
- A data structure designed specifically for handling strings.
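Dictionary | Trie |
---|---|
With n entries stored in a tree structure, a lookup takes O(log n) time | The lookup time for an entry does not depend on how many entries the dictionary holds |
With 1 million entries (about $2^{20}$), log n is roughly 20 | The time complexity is O(w), where w is the length of the queried word; most words are shorter than 10 characters |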
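To make the figures in the table concrete, the arithmetic behind "roughly 20" can be worked out as below. The bound on w is the table's own assumption about typical word length, not a measured value.

```latex
\[
\log_2 n = \log_2 10^{6} = 6 \log_2 10 \approx 6 \times 3.32 \approx 19.9 \approx 20,
\qquad 2^{20} = 1\,048\,576 \approx 10^{6}
\]
```

A tree-based dictionary therefore needs about 20 steps per lookup at this size, while a Trie visits one node per character of the queried word, so at most w steps.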
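Node definition in a Trie
Depending on the language and context, 26 pointers per node may be too few or too many (once uppercase letters and other characters are included, 26 is not enough).
Instead, let each node hold a variable number of pointers to its child nodes, using a Map as a dynamic replacement for next[26].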
class Node {
    char c;
    Map<Character, Node> next; // generics need the boxed type Character, not char
}
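Each node must also record whether it marks the end of a word: for example, pan is a prefix of panda, and pan is also a word in its own right.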
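The trade-off between a fixed next[26] array and a Map can be sketched as follows. This is only an illustration: `NodeSketch`, `ArrayNode`, `MapNode`, and `slot` are hypothetical names, and the array variant assumes the alphabet is exactly the lowercase letters 'a' to 'z'.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical names for illustration; not part of the article's code.
class NodeSketch {
    // Fixed-size variant: always allocates 26 slots, valid only for 'a'..'z'.
    static class ArrayNode {
        boolean isWord;                       // true if a word ends at this node
        ArrayNode[] next = new ArrayNode[26];
    }

    // Map-based variant: one entry per character actually seen.
    static class MapNode {
        boolean isWord;
        Map<Character, MapNode> next = new HashMap<>();
    }

    // Slot index for the array variant, or -1 if c is outside 'a'..'z'.
    static int slot(char c) {
        return (c >= 'a' && c <= 'z') ? c - 'a' : -1;
    }

    public static void main(String[] args) {
        System.out.println(slot('p')); // 15
        System.out.println(slot('P')); // -1: uppercase does not fit in 26 slots

        MapNode root = new MapNode();
        root.next.put('P', new MapNode()); // the map accepts any character
        System.out.println(root.next.containsKey('P')); // true
    }
}
```

The array variant gives constant-time child access but wastes space on sparse nodes and rejects characters outside its range; the map variant grows only as needed.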
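Implementing a Trie
Operations provided by the Trie
int getSize() // returns the number of words stored in the Trie
void add(String word) // adds a new word to the Trie
boolean contains(String word) // checks whether word is in the Trie
boolean isPrefix(String prefix) // checks whether any word in the Trie starts with prefix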
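Before looking at a Trie-based implementation, it may help to pin down what these four operations are expected to return. The sketch below does this with a sorted set instead of a Trie; `SetBackedDictionary` is a hypothetical reference class, used only to show the intended behavior, in particular the difference between contains and isPrefix.

```java
import java.util.TreeSet;

// Hypothetical reference class, not the article's Trie: it only illustrates
// what each of the four operations is expected to return.
class SetBackedDictionary {
    private final TreeSet<String> words = new TreeSet<>();

    int getSize() { return words.size(); }

    void add(String word) { words.add(word); }

    boolean contains(String word) { return words.contains(word); }

    // Words sharing a prefix are contiguous in sorted order, so the smallest
    // word >= prefix starts with prefix whenever any word does.
    boolean isPrefix(String prefix) {
        String candidate = words.ceiling(prefix);
        return candidate != null && candidate.startsWith(prefix);
    }

    public static void main(String[] args) {
        SetBackedDictionary d = new SetBackedDictionary();
        d.add("panda");
        System.out.println(d.contains("pan")); // false: "pan" was never added
        System.out.println(d.isPrefix("pan")); // true: "panda" starts with "pan"
        d.add("pan");
        d.add("pan");                           // adding a duplicate changes nothing
        System.out.println(d.contains("pan")); // true
        System.out.println(d.getSize());       // 2
    }
}
```

Each operation here costs O(log n) string comparisons, which is the tree-based dictionary behavior that the Trie is meant to improve on.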
Java implementation
import java.util.TreeMap;
// A multi-way tree
public class Trie {
    private class Node {
        public boolean isWord; // true if a word ends at this node
        public TreeMap<Character, Node> next; // keys are boxed Character objects
        public Node(boolean isWord) {
            this.isWord = isWord;
            next = new TreeMap<>();
        }
        public Node() {
            this(false);
        }
    }
    private Node root;
    private int size;
    public Trie() {
        root = new Node();
        size = 0;
    }
    // Returns the number of words stored in the Trie
    public int getSize() {
        return size;
    }
    // Adds a new word to the Trie
    public void add(String word) {
        Node cur = root;
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            if (cur.next.get(c) == null)
                cur.next.put(c, new Node());
            cur = cur.next.get(c);
        }
        if (!cur.isWord) { // count the word only if it was not already stored
            cur.isWord = true;
            size++;
        }
    }
    // Checks whether word is in the Trie
    public boolean contains(String word) {
        Node cur = root;
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            if (cur.next.get(c) == null)
                return false;
            cur = cur.next.get(c);
        }
        return cur.isWord; // the path exists; it is a word only if one ends at this node
    }
    // Checks whether any word in the Trie starts with prefix
    public boolean isPrefix(String prefix) {
        Node cur = root;
        for (int i = 0; i < prefix.length(); i++) {
            char c = prefix.charAt(i);
            if (cur.next.get(c) == null)
                return false;
            cur = cur.next.get(c);
        }
        return true;
    }
}
Wildcard matching
See LeetCode 211 (Add and Search Word).
Search for patterns such as d..r, where each . matches any single character.
import java.util.TreeMap;
public class WordDictionary {
    private class Node {
        public boolean isWord;
        public TreeMap<Character, Node> next;
        public Node(boolean isWord) {
            this.isWord = isWord;
            next = new TreeMap<>();
        }
        public Node() {
            this(false);
        }
    }
    private Node root;
    /** Initialize your data structure here. */
    public WordDictionary() {
        root = new Node();
    }
    public void addWord(String word) {
        Node cur = root;
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            if (cur.next.get(c) == null)
                cur.next.put(c, new Node());
            cur = cur.next.get(c);
        }
        cur.isWord = true;
    }
    public boolean search(String word) {
        return match(root, word, 0);
    }
    // Matches word[index..] against the subtree rooted at node
    private boolean match(Node node, String word, int index) {
        if (index == word.length())
            return node.isWord;
        char c = word.charAt(index);
        if (c != '.') {
            if (node.next.get(c) == null)
                return false;
            return match(node.next.get(c), word, index + 1);
        }
        else {
            // '.' matches any single character, so try every child
            for (char nextChar : node.next.keySet())
                if (match(node.next.get(nextChar), word, index + 1))
                    return true;
            return false;
        }
    }
}
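To see how the wildcard behaves on concrete inputs, here is a small self-contained sketch. It is a variant, not the class above: `WildcardSketch` is a hypothetical name, it uses a fixed 26-slot array per node instead of a TreeMap, and it assumes words and patterns contain only 'a' to 'z' plus the wildcard '.'.

```java
// Hypothetical array-based variant for illustration; assumes input is
// limited to 'a'..'z' plus the wildcard '.'.
class WildcardSketch {
    private static class Node {
        boolean isWord;
        Node[] next = new Node[26];
    }

    private final Node root = new Node();

    void addWord(String word) {
        Node cur = root;
        for (char c : word.toCharArray()) {
            int i = c - 'a';
            if (cur.next[i] == null)
                cur.next[i] = new Node();
            cur = cur.next[i];
        }
        cur.isWord = true;
    }

    boolean search(String pattern) {
        return match(root, pattern, 0);
    }

    private boolean match(Node node, String pattern, int index) {
        if (index == pattern.length())
            return node.isWord;
        char c = pattern.charAt(index);
        if (c != '.') {
            Node child = node.next[c - 'a'];
            return child != null && match(child, pattern, index + 1);
        }
        // '.' consumes exactly one character: try every existing child.
        for (Node child : node.next)
            if (child != null && match(child, pattern, index + 1))
                return true;
        return false;
    }

    public static void main(String[] args) {
        WildcardSketch dict = new WildcardSketch();
        dict.addWord("deer");
        dict.addWord("door");
        dict.addWord("dealer");
        System.out.println(dict.search("d..r"));   // true: matches deer and door
        System.out.println(dict.search("d...r"));  // false: no five-letter match
        System.out.println(dict.search("d....r")); // true: matches dealer
        System.out.println(dict.search("dee"));    // false: a prefix, not a word
    }
}
```

Because each '.' branches into every child, a pattern made mostly of wildcards can visit a large part of the tree, whereas a pattern with no wildcard follows a single path.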
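Summing values by key prefix
See LeetCode 677 (Map Sum Pairs).
Given a prefix such as ap, and previously inserted keys such as apple and application, find every key that starts with ap and return the sum of their values.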
import java.util.TreeMap;
public class MapSum {
    public class Node {
        private int value; // 0 unless a key ends at this node
        public TreeMap<Character, Node> next;
        // No isWord flag is needed here: inserting a key that already exists
        // simply overwrites its value.
        public Node(int value) {
            this.value = value;
            next = new TreeMap<>();
        }
        public Node() {
            this(0);
        }
    }
    private Node root;
    /** Initialize your data structure here. */
    public MapSum() {
        root = new Node();
    }
    public void insert(String key, int val) {
        Node cur = root;
        for (int i = 0; i < key.length(); i++) {
            char c = key.charAt(i);
            if (cur.next.get(c) == null)
                cur.next.put(c, new Node());
            cur = cur.next.get(c);
        }
        cur.value = val;
    }
    public int sum(String prefix) {
        // First walk down to the node that ends the prefix
        Node cur = root;
        for (int i = 0; i < prefix.length(); i++) {
            char c = prefix.charAt(i);
            if (cur.next.get(c) == null)
                return 0;
            cur = cur.next.get(c);
        }
        // cur is now at the end of prefix; add up the values of every key below it
        return sum(cur);
    }
    // Returns the sum of all values stored in the subtree rooted at node
    private int sum(Node node) {
        int res = node.value;
        for (char c : node.next.keySet())
            res += sum(node.next.get(c));
        return res;
    }
}
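The effect of overwriting a key can be checked with a short self-contained sketch. This is a compact variant written for illustration, not the class above: `PrefixSumSketch` and `subtreeSum` are hypothetical names, and it uses a HashMap with computeIfAbsent in place of the TreeMap.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical compact variant for illustration, not the article's class.
class PrefixSumSketch {
    private static class Node {
        int value; // 0 unless a key ends at this node
        Map<Character, Node> next = new HashMap<>();
    }

    private final Node root = new Node();

    void insert(String key, int val) {
        Node cur = root;
        for (char c : key.toCharArray())
            cur = cur.next.computeIfAbsent(c, k -> new Node());
        cur.value = val; // re-inserting a key overwrites its old value
    }

    int sum(String prefix) {
        Node cur = root;
        for (char c : prefix.toCharArray()) {
            cur = cur.next.get(c);
            if (cur == null)
                return 0; // no key starts with this prefix
        }
        return subtreeSum(cur);
    }

    // Adds up the values of every node at or below the given node.
    private int subtreeSum(Node node) {
        int res = node.value;
        for (Node child : node.next.values())
            res += subtreeSum(child);
        return res;
    }

    public static void main(String[] args) {
        PrefixSumSketch m = new PrefixSumSketch();
        m.insert("apple", 3);
        m.insert("application", 4);
        System.out.println(m.sum("ap")); // 7
        m.insert("apple", 10);           // overwrite, not accumulate
        System.out.println(m.sum("ap")); // 14
        System.out.println(m.sum("b"));  // 0
    }
}
```

Since nodes that do not end a key keep the value 0, the recursive sum can add every node in the subtree without checking whether it ends a key.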