problem
Design a data structure that supports the following two operations:
void addWord(word)
bool search(word)
search(word) can search a literal word or a regular expression string containing only letters a-z or '.'. A '.' means it can represent any one letter.
For example:
addWord("bad")
addWord("dad")
addWord("mad")
search("pad") -> false
search("bad") -> true
search(".ad") -> true
search("b..") -> true
Note: You may assume that all words consist of lowercase letters a-z.
solution
This problem is an extension of the Trie (prefix tree): it adds wildcard matching to the lookup, so we only need to modify the Trie's search slightly.
# Beats 80% of submissions
def search1(word, d):
    # An empty remainder matches only if a word ends at this node.
    if word == '':
        return '$' in d
    if word[0] != '.':
        if word[0] in d:
            return search1(word[1:], d[word[0]])
        else:
            return False
    else:
        # '.' matches any one letter, so try every child node.
        for v in d.values():
            if v is True:
                # Skip the end-of-word marker '$'
                continue
            if search1(word[1:], v):
                return True
        return False

class WordDictionary(object):
    def __init__(self):
        self.root = {}

    def addWord(self, word):
        p = self.root
        for i in word:
            p = p.setdefault(i, {})
        p['$'] = True

    def search(self, word):
        return search1(word, self.root)
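To check the trie approach against the examples from the problem statement, here is a minimal self-contained sketch. It keeps the same nested-dict layout with '$' as the end marker. The class name `TrieDictionary` and the index-based helper `_match` are illustrative names that do not appear in the original; passing an index instead of slicing the word is an assumption made here to avoid copying the string at each step.

```python
class TrieDictionary(object):
    """Nested-dict trie; '$' marks the end of a word."""

    def __init__(self):
        self.root = {}

    def addWord(self, word):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node['$'] = True

    def search(self, word):
        return self._match(word, 0, self.root)

    def _match(self, word, i, node):
        # Hypothetical helper: match word[i:] starting from `node`.
        if i == len(word):
            return '$' in node
        ch = word[i]
        if ch != '.':
            child = node.get(ch)
            return child is not None and self._match(word, i + 1, child)
        # '.' matches any one letter: try every child except the end marker.
        return any(self._match(word, i + 1, child)
                   for key, child in node.items() if key != '$')


d = TrieDictionary()
for w in ("bad", "dad", "mad"):
    d.addWord(w)
print(d.search("pad"))  # False
print(d.search("bad"))  # True
print(d.search(".ad"))  # True
print(d.search("b.."))  # True
```

The four calls reproduce the expected results listed in the problem statement.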
discussion
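Among the LeetCode submissions I also saw this approach: it stores words of the same length together and compares them one by one in search. On small inputs this approach beats the Trie.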
# Beats 91% of submissions
import collections

class WordDictionary(object):
    def __init__(self):
        """
        Initialize your data structure here.
        """
        # Map word length -> list of words with that length.
        self.word_dict = collections.defaultdict(list)

    def addWord(self, word):
        """
        Adds a word into the data structure.
        :type word: str
        :rtype: void
        """
        if word:
            self.word_dict[len(word)].append(word)

    def search(self, word):
        """
        Returns if the word is in the data structure. A word could contain the dot character '.' to represent any one letter.
        :type word: str
        :rtype: bool
        """
        if not word:
            return False
        if '.' not in word:
            return word in self.word_dict[len(word)]
        for v in self.word_dict[len(word)]:
            for i, ch in enumerate(word):
                if v[i] != ch and ch != '.':
                    break
            else:
                # The inner loop finished without a mismatch.
                return True
        return False
After changing defaultdict(list) to defaultdict(set) (which also means calling add instead of append in addWord), it beat 96% of submissions.
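The set-based change can be sketched roughly as follows. This is one possible version, not the exact submission: the use of `.get` (so that searching never creates empty buckets) and the `any`/`all` form of the wildcard comparison are assumptions added here for brevity.

```python
import collections


class WordDictionary(object):
    def __init__(self):
        # Map word length -> set of words with that length.
        self.word_dict = collections.defaultdict(set)

    def addWord(self, word):
        if word:
            self.word_dict[len(word)].add(word)  # set.add replaces list.append

    def search(self, word):
        if not word:
            return False
        # .get does not trigger the default factory, so no empty sets pile up.
        candidates = self.word_dict.get(len(word), ())
        if '.' not in word:
            # Membership in a set is a hash lookup instead of a list scan.
            return word in candidates
        # Compare the pattern with every stored word of the same length.
        return any(all(p == '.' or p == c for p, c in zip(word, v))
                   for v in candidates)


wd = WordDictionary()
for w in ("bad", "dad", "mad"):
    wd.addWord(w)
print(wd.search("bad"), wd.search(".ad"), wd.search("pad"))  # True True False
```

The gain comes from the literal-word branch: with a set, the `in` test no longer scans every word of that length. A set also drops duplicate words, which shortens the wildcard scan when the same word is added repeatedly.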
summary
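This is a typical application of the Trie, yet the second solution actually outperforms it. The reason is that the Trie cannot check length during lookup, so it performs some wasted searches (this is a matter of turning the way a person would compare two words into something a computer can understand). The second solution exploits the fact that although '.' can match any character, two words of different lengths can never match. The drawbacks of this approach are: 1. it uses more memory when many words share the same prefix, and 2. it repeats the comparison over those shared prefixes.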
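The "wasted searches" point can be made concrete with a toy count. The sketch below is an illustration on made-up data, not a benchmark: the helper names `build_trie` and `count_trie_visits` and the word list are hypothetical. It counts how many trie nodes a wildcard search touches and compares that with the number of same-length candidates the second solution would look at.

```python
import collections


def build_trie(words):
    # Same nested-dict layout as the trie solution; '$' marks a word end.
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
        node['$'] = True
    return root


def count_trie_visits(pattern, node, i=0):
    """Return (matched, nodes_visited) for a wildcard search in the trie."""
    visits = 1
    if i == len(pattern):
        return '$' in node, visits
    ch = pattern[i]
    if ch == '.':
        children = [c for k, c in node.items() if k != '$']
    else:
        children = [node[ch]] if ch in node else []
    for child in children:
        found, v = count_trie_visits(pattern, child, i + 1)
        visits += v
        if found:
            return True, visits
    return False, visits


words = ["a", "b", "c", "d", "xyz"]  # made-up data: mostly short words
root = build_trie(words)
buckets = collections.defaultdict(set)
for w in words:
    buckets[len(w)].add(w)

pattern = "..z"
found, visits = count_trie_visits(pattern, root)
candidates = buckets.get(len(pattern), set())
print(found, visits)    # True 8
print(len(candidates))  # 1
```

Here the trie descends into the nodes for "a", "b", "c" and "d" even though none of them can lead to a three-letter word, while the length bucket holds a single candidate. With many long words sharing prefixes the balance can shift the other way, which is the trade-off described above.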