642 hard
Design a search autocomplete system for a search engine. Users may input a sentence (at least one word and end with a special character ‘#’).
You are given a string array sentences and an integer array times both of length n where sentences[i] is a previously typed sentence and times[i] is the corresponding number of times the sentence was typed. For each input character except ‘#’, return the top 3 historical hot sentences that have the same prefix as the part of the sentence already typed.
Here are the specific rules:
The hot degree for a sentence is defined as the number of times a user typed the exactly same sentence before.
The returned top 3 hot sentences should be sorted by hot degree (The first is the hottest one). If several sentences have the same hot degree, use ASCII-code order (smaller one appears first).
If less than 3 hot sentences exist, return as many as you can.
When the input is a special character, it means the sentence ends, and in this case, you need to return an empty list.
Implement the AutocompleteSystem class:
AutocompleteSystem(String[] sentences, int[] times) Initializes the object with the sentences and times arrays.
List<String> input(char c) This indicates that the user typed the character c.
Returns an empty array [] if c == '#' and stores the inputted sentence in the system.
Returns the top 3 historical hot sentences that have the same prefix as the part of the sentence already typed. If there are fewer than 3 matches, return them all.
Example 1:
Input
[“AutocompleteSystem”, “input”, “input”, “input”, “input”]
[[[“i love you”, “island”, “iroman”, “i love leetcode”], [5, 3, 2, 2]], [“i”], [" “], [“a”], [”#"]]
Output
[null, [“i love you”, “island”, “i love leetcode”], [“i love you”, “i love leetcode”], [], []]
Explanation
AutocompleteSystem obj = new AutocompleteSystem([“i love you”, “island”, “iroman”, “i love leetcode”], [5, 3, 2, 2]);
obj.input(“i”); // return [“i love you”, “island”, “i love leetcode”]. There are four sentences that have prefix “i”. Among them, “ironman” and “i love leetcode” have same hot degree. Since ’ ’ has ASCII code 32 and ‘r’ has ASCII code 114, “i love leetcode” should be in front of “ironman”. Also we only need to output top 3 hot sentences, so “ironman” will be ignored.
obj.input(" "); // return [“i love you”, “i love leetcode”]. There are only two sentences that have prefix “i “.
obj.input(“a”); // return []. There are no sentences that have prefix “i a”.
obj.input(”#”); // return []. The user finished the input, the sentence “i a” should be saved as a historical sentence in system. And the following input will be counted as a new search.
Constraints:
n == sentences.length
n == times.length
1 <= n <= 100
1 <= sentences[i].length <= 100
1 <= times[i] <= 50
c is a lowercase English letter, a hash '#', or space ' '.
Each tested sentence will be a sequence of characters c that end with the character '#'.
Each tested sentence will have a length in the range [1, 200].
The words in each input sentence are separated by single spaces.
At most 5000 calls will be made to input.
解决方案:
1.
class TrieNode:
def __init__(self):
self.children = defaultdict(TrieNode)
self.word_count = defaultdict(int) # counter of words
self.is_word = False # Not required for this problem ; just adding to have full system; useful to implement search
class AutocompleteSystem:
def __init__(self, sentences: List[str], times: List[int]):
self.root = TrieNode()
self.curr = self.root
self.sentence = ''
for i in range(len(sentences)):
self.insert(sentences[i],times[i])
def insert(self, sentence, cnt):
node = self.root
#iterate entire string
for char in sentence:
node = node.children[char] # iterate for each trie node
# counter is also needed; for each level count of each word is tracked
node.word_count[sentence] += cnt # whole word or sentence count - Update all nodes and its children!
node.is_word=True
def input(self, c: str) -> List[str]: # which adds a letter one by one
if c == "#": # lets use '#' as special characater here
# insert entire sentence and count is incremented by 1
self.insert(self.sentence, 1)
# reset the search iterator. oNCE THERE IS #, I CANT GO DOWN ANY MORE, SO RESETTING TO ROOT
self.curr = self.root
self.sentence = '' # whole resetting
return []
else:
self.sentence += c # buffer for how many characaters being inserted; add to buffer
self.curr = self.curr.children[c] # and going one level down
# how are we going to return the result?
results = []
record_heap = []
for st, cnt in self.curr.word_count.items():
# we need heap datastructure because we need top 3
heapq.heappush(record_heap,(-cnt, st)) # because python has Minheap and we need max heap
for i in range(3): # we need top 3
if record_heap:
c,s = heapq.heappop(record_heap)
results.append(s)
return results