The Trie Data Structure
Introduction
The word trie is an infix of the word “retrieval”, because a trie can find a word in a dictionary given only a prefix of that word.
A trie is an efficient data structure for information retrieval. With a trie, the cost of a search can be brought down to its optimal limit: it is proportional to the length of the string being looked up, regardless of how many strings are stored.
It is a multi-way tree structure useful for storing strings over an alphabet. It has been used to store large dictionaries, for example the English word lists in spell-checking programs. However, the penalty of tries is their storage requirement, since every node can hold one pointer per character of the alphabet.
What is a trie?
A trie is a tree-like data structure that stores strings and helps you find the data associated with a string using a prefix of that string.
For example, say you plan on building a dictionary that stores strings along with their meanings. You may be wondering why you can’t simply use a hash table to get the information.
Yes, you can get the information using a hash table, but a hash table can only find data when the string exactly matches one we’ve added. A trie also lets us find strings that share a common prefix, or that are missing a character, and so on, in less time than a hash table would need.
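To make the comparison concrete, here is a minimal sketch of what prefix search costs with a plain Python dict (a hash table). The sample words and meanings, and the helper name keys_with_prefix, are assumptions made up for illustration; they are not from the original article.

```python
# Hypothetical sample data, invented for this illustration.
meanings = {
    "all": "the whole quantity",
    "also": "in addition",
    "algo": "short for algorithm",
    "tree": "a woody plant",
}

# Exact-match lookup is what a hash table is good at.
exact = meanings.get("also")

# A partial key simply misses; the table has no notion of "nearby" keys.
partial = meanings.get("al")

def keys_with_prefix(table, prefix):
    """Prefix search on a hash table has to examine every stored key."""
    return sorted(key for key in table if key.startswith(prefix))

matches = keys_with_prefix(meanings, "al")
```

The scan visits every key no matter how few of them match, whereas a trie only needs to walk down the characters of the prefix and then collect the subtree below it.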
A trie typically looks something like this:
This is an image of a Trie, which stores the words {assoc, algo, all, also, tree, trie}.
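As a rough textual stand-in for the picture, the shape of that trie can be sketched with nested dicts, where each key is one character and shared prefixes share a path. This is only an illustration: it uses a different node representation from the array-based TrieNode built later, and the END marker and build function are hypothetical names.

```python
words = ["assoc", "algo", "all", "also", "tree", "trie"]

END = "$"  # hypothetical marker meaning "a word ends at this node"

def build(words):
    """Build a nested-dict trie: one dict per node, one key per outgoing edge."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            # Reuse the child if the prefix already exists, else create it.
            node = node.setdefault(ch, {})
        node[END] = True
    return root

trie = build(words)
```

Here the root has only two children, 'a' and 't', and the three words starting with "al" branch apart only at their third character.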
How to implement a trie
Let’s implement a trie in Python for storing words with their meanings from an English dictionary.
ALPHABET_SIZE = 26  # For English

class TrieNode:
    def __init__(self):
        self.edges = [None] * ALPHABET_SIZE  # One slot per character; None means no child.
        self.meaning = None  # Meaning of the word.
        self.ends_here = False  # Tells us if a word ends here.
As you can see, edges has length 26, with each index referring to one character of the alphabet: ‘a’ corresponds to index 0, ‘b’ to 1, ‘c’ to 2 … ‘z’ to 25. If the slot for the character you are looking for holds None, the word is not in the trie.
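The character-to-index mapping can be sketched as a small helper. The name edge_index and the explicit range check are assumptions added for illustration; the article's own code computes ord(ch) - ord('a') inline and does not validate its input.

```python
ALPHABET_SIZE = 26  # lowercase English letters only

def edge_index(ch):
    """Map a lowercase letter 'a'..'z' to a slot 0..25 in the edges list."""
    index = ord(ch) - ord('a')
    if not 0 <= index < ALPHABET_SIZE:
        # Uppercase letters, digits and punctuation fall outside the range.
        raise ValueError(f"unsupported character: {ch!r}")
    return index

first = edge_index('a')
third = edge_index('c')
last = edge_index('z')
```

The check matters because a character just below 'a' in ASCII would produce a negative index, which a Python list silently interprets as counting from the end.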
A typical trie should implement at least these three functions:
add_word(word, meaning)
search_word(word)
delete_word(word)
Additionally, one can also add something like:
get_all_words()
get_all_words_with_prefix(prefix)
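The article does not implement the two optional functions, so the following is one possible sketch rather than the author's version. It assumes the same array-based TrieNode, uses an iterative add_word for brevity, and gives get_all_words an extra optional prefix parameter (an assumption) so that the prefix search can reuse it.

```python
ALPHABET_SIZE = 26

class TrieNode:
    def __init__(self):
        self.edges = [None] * ALPHABET_SIZE
        self.meaning = None
        self.ends_here = False

    def add_word(self, word, meaning):
        # Iterative insertion: walk down, creating missing nodes on the way.
        node = self
        for ch in word:
            index = ord(ch) - ord('a')
            if node.edges[index] is None:
                node.edges[index] = TrieNode()
            node = node.edges[index]
        node.ends_here = True
        node.meaning = meaning

    def get_all_words(self, prefix=""):
        """Return every word stored at or below this node, in alphabetical order."""
        words = [prefix] if self.ends_here else []
        for index, child in enumerate(self.edges):
            if child is not None:
                words.extend(child.get_all_words(prefix + chr(ord('a') + index)))
        return words

    def get_all_words_with_prefix(self, prefix):
        # Walk down to the node for the prefix, then collect its subtree.
        node = self
        for ch in prefix:
            node = node.edges[ord(ch) - ord('a')]
            if node is None:
                return []
        return node.get_all_words(prefix)

root = TrieNode()
for w in ["assoc", "algo", "all", "also", "tree", "trie"]:
    root.add_word(w, None)  # meanings omitted in this sketch

everything = root.get_all_words()
starting_with_al = root.get_all_words_with_prefix("al")
```

Because the edges are visited in index order, the results come back sorted without an explicit sort.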
Adding a word to the trie
def add_word(self, word, meaning):
    if len(word) == 0:
        self.ends_here = True  # Because we have reached the end of the word
        self.meaning = meaning  # Adding the meaning to that node
        return
    ch = word[0]  # First character
    # The ASCII value of the first character minus the ASCII value of 'a'
    # (the first character of our alphabet) gives the index of the edge to look up.
    index = ord(ch) - ord('a')
    if self.edges[index] is None:
        # This implies that there's no prefix with this character yet.
        new_node = TrieNode()
        self.edges[index] = new_node
    self.edges[index].add_word(word[1:], meaning)  # Adding the remaining word
Retrieving data
def search_word(self, word):
    if len(word) == 0:
        # True only if a word was actually added that ends at this node.
        return self.ends_here
    ch = word[0]
    index = ord(ch) - ord('a')
    if self.edges[index] is None:
        return False
    else:
        return self.edges[index].search_word(word[1:])
The search_word function tells us whether the word exists in the trie. Since ours is a dictionary, we need to fetch the meaning as well, so let’s declare a function to do that.
def get_meaning(self, word):
    if len(word) == 0:
        if self.ends_here:
            return self.meaning
        else:
            return "Word doesn't exist in the Trie"
    ch = word[0]
    index = ord(ch) - ord('a')
    if self.edges[index] is None:
        return "Word doesn't exist in the Trie"
    else:
        return self.edges[index].get_meaning(word[1:])
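Both lookups walk the same path through the trie, so one way to structure them is around a shared traversal. The sketch below is an alternative arrangement, not the article's code: find_node is a hypothetical helper, get_meaning returns None for a missing word instead of a message string (an assumption, so that a meaning can never be confused with an error message), and the sample meanings are invented.

```python
ALPHABET_SIZE = 26

class TrieNode:
    def __init__(self):
        self.edges = [None] * ALPHABET_SIZE
        self.meaning = None
        self.ends_here = False

    def add_word(self, word, meaning):
        if len(word) == 0:
            self.ends_here = True
            self.meaning = meaning
            return
        index = ord(word[0]) - ord('a')
        if self.edges[index] is None:
            self.edges[index] = TrieNode()
        self.edges[index].add_word(word[1:], meaning)

    def find_node(self, word):
        """Hypothetical helper: return the node reached by spelling out word, or None."""
        node = self
        for ch in word:
            node = node.edges[ord(ch) - ord('a')]
            if node is None:
                return None
        return node

    def search_word(self, word):
        node = self.find_node(word)
        return node is not None and node.ends_here

    def get_meaning(self, word):
        node = self.find_node(word)
        if node is None or not node.ends_here:
            return None  # assumption: None signals "not in the trie"
        return node.meaning

root = TrieNode()
root.add_word("tree", "a woody perennial plant")
root.add_word("trie", "a prefix tree")

found = root.search_word("trie")
prefix_only = root.search_word("tr")  # a stored prefix is not a stored word
```

Note that "tr" is reachable in the trie but is not reported as a word, because ends_here is only set on nodes where add_word finished.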
Deleting data
To delete a word, you just need to set the variable ends_here to False. Doing that doesn’t alter the prefixes, but still removes the word and its meaning from the trie.
def delete_word(self, word):
    if len(word) == 0:
        if self.ends_here:
            self.ends_here = False
            self.meaning = None
            return "Deleted"
        else:
            return "Word doesn't exist in the Trie"
    ch = word[0]
    index = ord(ch) - ord('a')
    if self.edges[index] is None:
        return "Word doesn't exist in the Trie"
    else:
        return self.edges[index].delete_word(word[1:])
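Clearing the flag leaves the now-unused nodes in memory. If storage matters, one possible variant also prunes nodes that no longer lead to any word. This is a sketch under assumptions, not part of the original article: delete_and_prune and is_empty are hypothetical names, the method returns a boolean rather than a message string, and the sample meanings are invented.

```python
ALPHABET_SIZE = 26

class TrieNode:
    def __init__(self):
        self.edges = [None] * ALPHABET_SIZE
        self.meaning = None
        self.ends_here = False

    def add_word(self, word, meaning):
        node = self
        for ch in word:
            index = ord(ch) - ord('a')
            if node.edges[index] is None:
                node.edges[index] = TrieNode()
            node = node.edges[index]
        node.ends_here = True
        node.meaning = meaning

    def is_empty(self):
        """True if no word ends here and no child hangs below."""
        return not self.ends_here and all(edge is None for edge in self.edges)

    def delete_and_prune(self, word):
        """Remove word; return True if it was stored, False otherwise."""
        if len(word) == 0:
            if not self.ends_here:
                return False
            self.ends_here = False
            self.meaning = None
            return True
        index = ord(word[0]) - ord('a')
        child = self.edges[index]
        if child is None:
            return False
        deleted = child.delete_and_prune(word[1:])
        if deleted and child.is_empty():
            # The child no longer leads to any word, so drop the edge to it.
            self.edges[index] = None
        return deleted

root = TrieNode()
root.add_word("all", "every one")
root.add_word("also", "in addition")

removed = root.delete_and_prune("also")
```

After the call, the nodes for the trailing "so" are gone, while the shared prefix "al" and the word "all" are left untouched because they are still in use.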
Translated from: https://www.freecodecamp.org/news/trie-data-structure-implementation/