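Fundamentals
Concept
A trie, also called a prefix tree, dictionary tree, word-lookup tree, or key tree, is a tree-shaped data structure, often described as a variant of the hash tree. Characters are stored along the paths of the tree, and each node stores the number of strings that end at that node.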
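Properties:
- The root node holds no character; every node other than the root holds exactly one character.
- Concatenating the characters on the path from the root to a node gives the string that the node represents.
- The children of any one node all hold different characters.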
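The second property can be checked with a small sketch: walk the tree depth-first, append one character per step, and record the accumulated path whenever a node marks the end of a word. The class name PathSpellingDemo, the isEnd flag, and the use of a TreeMap for the children are assumptions made for this illustration only; they are not part of the notes above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class PathSpellingDemo {
    // Hypothetical node layout: children keyed by character, kept in sorted order.
    static class Node {
        Map<Character, Node> children = new TreeMap<>();
        boolean isEnd; // true if an inserted word ends at this node
    }

    static void insert(Node root, String word) {
        Node cur = root;
        for (char c : word.toCharArray()) {
            cur = cur.children.computeIfAbsent(c, k -> new Node());
        }
        cur.isEnd = true;
    }

    // Returns every stored word, rebuilt from the characters along each root-to-node path.
    static List<String> collect(Node root) {
        List<String> out = new ArrayList<>();
        walk(root, new StringBuilder(), out);
        return out;
    }

    private static void walk(Node node, StringBuilder path, List<String> out) {
        if (node.isEnd) {
            out.add(path.toString());
        }
        for (Map.Entry<Character, Node> e : node.children.entrySet()) {
            path.append(e.getKey());              // step down one edge
            walk(e.getValue(), path, out);
            path.deleteCharAt(path.length() - 1); // step back up
        }
    }

    public static void main(String[] args) {
        Node root = new Node();
        for (String w : new String[] {"tea", "ten", "to", "inn"}) {
            insert(root, w);
        }
        System.out.println(collect(root)); // prints [inn, tea, ten, to]
    }
}
```

Because the children are visited in character order, the words come back sorted, which is one way a trie can be used to sort strings.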
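Typical applications
- Checking whether a given string is present (add a field to each node that counts how many strings end at that node).
- Counting the strings that start with a given prefix (add a field to each node that counts how many times the node is passed through).
- Counting and sorting large numbers of strings (though not only strings); search engine systems often use it for word-frequency statistics over text.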
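The first two applications can be sketched with two counters per node. This is a minimal illustration, assuming lowercase a-z input; the class name CountingTrie and the method names countWord and countPrefix are made up for this example.

```java
class CountingTrie {
    private static final class Node {
        int pass; // number of inserted strings whose path goes through this node
        int end;  // number of inserted strings that end exactly at this node
        final Node[] next = new Node[26];
    }

    private final Node root = new Node();

    void insert(String word) {
        Node cur = root;
        cur.pass++; // the root is passed by every inserted string
        for (char c : word.toCharArray()) {
            int i = c - 'a';
            if (cur.next[i] == null) {
                cur.next[i] = new Node();
            }
            cur = cur.next[i];
            cur.pass++;
        }
        cur.end++;
    }

    // Follows s from the root; returns null if the path does not exist.
    private Node find(String s) {
        Node cur = root;
        for (char c : s.toCharArray()) {
            cur = cur.next[c - 'a'];
            if (cur == null) {
                return null;
            }
        }
        return cur;
    }

    // How many times this exact word was inserted.
    int countWord(String word) {
        Node n = find(word);
        return n == null ? 0 : n.end;
    }

    // How many inserted strings start with this prefix.
    int countPrefix(String prefix) {
        Node n = find(prefix);
        return n == null ? 0 : n.pass;
    }
}
```

After inserting "apple", "app", "app", and "apt", countWord("app") gives 2 and countPrefix("ap") gives 4, while countWord("ap") gives 0 because no inserted string ends there.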
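Advantages
- It keeps needless string comparisons to a minimum: a lookup takes time proportional to the length of the key, regardless of how many strings are stored, so in some workloads it can be faster than a hash table.
- Strings with a common prefix share the nodes of that prefix, so a lot of storage is reused; how much space is saved depends on the sample set.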
Core idea
Trade space for time. The common prefixes of the strings are used to cut the cost of a query and so improve efficiency.
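As a rough illustration of prefix sharing, the sketch below counts how many nodes each insertion actually creates. The class name PrefixSharingDemo and the sample words are assumptions chosen for this example.

```java
import java.util.HashMap;
import java.util.Map;

class PrefixSharingDemo {
    static class Node {
        Map<Character, Node> children = new HashMap<>();
    }

    // Inserts word and returns how many new nodes had to be created for it.
    static int insert(Node root, String word) {
        int created = 0;
        Node cur = root;
        for (char c : word.toCharArray()) {
            Node child = cur.children.get(c);
            if (child == null) {
                child = new Node();
                cur.children.put(c, child);
                created++;
            }
            cur = child;
        }
        return created;
    }

    public static void main(String[] args) {
        Node root = new Node();
        int totalChars = 0;
        int totalNodes = 0;
        for (String w : new String[] {"car", "card", "care", "cat"}) {
            totalChars += w.length();
            totalNodes += insert(root, w);
        }
        // 14 characters are stored in 6 nodes because the prefix "ca" is shared.
        System.out.println(totalChars + " characters, " + totalNodes + " nodes");
    }
}
```

Only the first word pays for the shared prefix; each later word adds a single node. Note that each node still carries the overhead of its own child table, so the saving depends on how much the stored strings overlap.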
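Construction approach
If every node holds a lowercase letter, each node can link to at most 26 letters. Each node therefore needs to record the number of strings that pass through it, the number of strings that end at it, and an array of links to its children.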
Practice problems
1. 208. Implement Trie (Prefix Tree)
Implement a trie with insert, search, and startsWith methods.
Example:
Trie trie = new Trie();
trie.insert("apple");
trie.search("apple");   // returns true
trie.search("app");     // returns false
trie.startsWith("app"); // returns true
trie.insert("app");
trie.search("app");     // returns true
Note:
- You may assume that all inputs consist of lowercase letters a-z.
- All inputs are guaranteed to be non-empty strings.
class Trie {
    private static class Node {
        int path;     // number of inserted strings that pass through this node
        int end;      // number of inserted strings that end at this node
        Node[] nexts; // child links, one slot per letter 'a'..'z'

        Node() {
            path = 0;
            end = 0;
            nexts = new Node[26];
        }
    }

    private final Node root;

    /** Initialize your data structure here. */
    public Trie() {
        root = new Node();
    }

    /** Inserts a word into the trie. */
    public void insert(String word) {
        if (word == null)
            return;
        Node cur = root;
        for (char c : word.toCharArray()) {
            int index = c - 'a';
            if (cur.nexts[index] == null)
                cur.nexts[index] = new Node();
            cur = cur.nexts[index];
            cur.path++; // count the word on the node it has just reached
        }
        cur.end++;
    }

    /** Follows s from the root; returns the node reached, or null if the path is missing. */
    private Node find(String s) {
        if (s == null)
            return null;
        Node cur = root;
        for (char c : s.toCharArray()) {
            cur = cur.nexts[c - 'a'];
            if (cur == null)
                return null;
        }
        return cur;
    }

    /** Returns if the word is in the trie. */
    public boolean search(String word) {
        Node last = find(word);
        return last != null && last.end > 0;
    }

    /** Returns if there is any word in the trie that starts with the given prefix. */
    public boolean startsWith(String prefix) {
        return find(prefix) != null;
    }
}
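The three methods the problem asks for never read the path counter. One place such a counter could be useful is deletion, which the problem does not require. The sketch below is an illustration under assumptions: the class DeletableTrie and its delete method are hypothetical, input is lowercase a-z, and path is incremented on every node a word reaches.

```java
class DeletableTrie {
    private static final class Node {
        int path; // number of stored strings that reach this node
        int end;  // number of stored strings that end at this node
        final Node[] nexts = new Node[26];
    }

    private final Node root = new Node();

    void insert(String word) {
        Node cur = root;
        for (char c : word.toCharArray()) {
            int i = c - 'a';
            if (cur.nexts[i] == null) {
                cur.nexts[i] = new Node();
            }
            cur = cur.nexts[i];
            cur.path++;
        }
        cur.end++;
    }

    boolean search(String word) {
        Node cur = root;
        for (char c : word.toCharArray()) {
            cur = cur.nexts[c - 'a'];
            if (cur == null) {
                return false;
            }
        }
        return cur.end > 0;
    }

    // Removes one occurrence of word; returns false if the word is not stored.
    boolean delete(String word) {
        if (!search(word)) {
            return false;
        }
        Node cur = root;
        for (char c : word.toCharArray()) {
            int i = c - 'a';
            Node child = cur.nexts[i];
            if (--child.path == 0) {
                // No other stored string uses this branch, so unlink all of it.
                cur.nexts[i] = null;
                return true;
            }
            cur = child;
        }
        cur.end--;
        return true;
    }
}
```

Checking search first matters: without it, deleting a word that was never inserted would decrement counters belonging to other words.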
2. 211. Add and Search Word - Data structure design
Design a data structure that supports the following two operations:
void addWord(word)
bool search(word)
search(word) can search a literal word or a regular expression string containing only letters a-z or '.'. A '.' means it can represent any one letter.
Example:
addWord("bad")
addWord("dad")
addWord("mad")
search("pad") -> false
search("bad") -> true
search(".ad") -> true
search("b..") -> true
Note:
You may assume that all words consist of lowercase letters a-z.
import java.util.HashMap;
import java.util.Map;

class WordDictionary {
    private final Node root;

    public WordDictionary() {
        root = new Node((char) 0);
    }

    // Build the trie: walk down from the root, creating missing children.
    public void addWord(String word) {
        Node node = root;
        for (char ch : word.toCharArray()) {
            Node child = node.children.get(ch);
            if (child == null) {
                child = new Node(ch);
                node.children.put(ch, child);
            }
            node = child;
        }
        // Mark the end of the word in the tree.
        node.isEnd = true;
    }

    public boolean search(String word) {
        return search(word.toCharArray(), 0, root);
    }

    // Recursive helper: matches arr[pos..] against the subtree rooted at node.
    private boolean search(char[] arr, int pos, Node node) {
        // A missing node means this path does not exist in the tree.
        if (node == null)
            return false;
        // Whole pattern consumed: it matches only if a word ends here.
        if (pos == arr.length)
            return node.isEnd;
        // '.' matches any single letter, so try every child.
        if (arr[pos] == '.') {
            for (Node child : node.children.values()) {
                if (search(arr, pos + 1, child))
                    return true;
            }
            return false;
        }
        // Otherwise follow the child for this exact character.
        return search(arr, pos + 1, node.children.get(arr[pos]));
    }

    private static class Node {
        char ch;
        boolean isEnd; // true if an added word ends at this node
        Map<Character, Node> children;

        Node(char ch) {
            this.ch = ch;
            children = new HashMap<>();
        }
    }
}