September 16: a busy couple of days, but I still try to record what I worked on (prefix tree, trie, DFS pruning)

时俗之俗

Category: Diary. Tags: pruning, python, algorithms

Article link: https://blog.csdn.net/weixin_40857308/article/details/120339205


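Today's daily problem is
Word Search II.
I wanted to see first how many test cases a plain DFS would pass, and it turned out to pass all of them...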

from typing import List


class Solution:
    def findWords(self, board: List[List[str]], words: List[str]) -> List[str]:
        # Try plain DFS first
        wordSet = set(words)
        direction = [(0, 1), (0, -1), (1, 0), (-1, 0)]
        m, n = len(board), len(board[0])
        visited = [[False] * n for _ in range(m)]

        def dfs(word, res, x, y):
            # The problem limits each word to at most 10 characters
            if len(word) > 10:
                return
            if word in wordSet:
                res.append(word)
                wordSet.remove(word)  # report each word only once
            for d in direction:
                nx, ny = x + d[0], y + d[1]
                # Skip cells outside the board
                if nx < 0 or nx >= m or ny < 0 or ny >= n:
                    continue
                if visited[nx][ny]:
                    continue
                visited[nx][ny] = True
                dfs(word + board[nx][ny], res, nx, ny)
                visited[nx][ny] = False

        res = []
        for i in range(m):
            for j in range(n):
                visited[i][j] = True
                dfs(board[i][j], res, i, j)
                visited[i][j] = False
        return res

The thing to watch in this Python DFS is the inputs, especially the boundary checks; apart from that there should be no major problems.
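To make the boundary check concrete, here is a minimal sketch that isolates it in a helper. The name `neighbors` and its signature are my own choice for illustration and do not appear in the solutions in this post.

```python
def neighbors(x, y, m, n):
    """Yield the in-bounds 4-neighbors of cell (x, y) on an m-by-n board."""
    for dx, dy in ((0, 1), (0, -1), (1, 0), (-1, 0)):
        nx, ny = x + dx, y + dy
        # Same test as the DFS, written as a chained comparison
        if 0 <= nx < m and 0 <= ny < n:
            yield nx, ny


print(list(neighbors(0, 0, 2, 3)))  # [(0, 1), (1, 0)]
print(list(neighbors(1, 1, 3, 3)))  # [(1, 2), (1, 0), (2, 1), (0, 1)]
```

A corner cell keeps only two neighbors, while an interior cell keeps all four, so the DFS never indexes outside the board.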
That said, the canonical solution to this problem uses a trie.
This is a good opportunity to review tries:
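Trie
Also known as a word lookup tree or Trie tree, it is a tree structure and a variant of the hash tree. It is typically used to count, sort, and store large numbers of strings (though not only strings), so search engines often use it for word-frequency statistics over text. Its advantage is that it uses the common prefixes of strings to reduce query time and to minimize needless string comparisons, which gives it higher query efficiency than a hash tree (quoted from Baidu Baike).
Each edge of the tree corresponds to exactly one character, and each vertex represents the string spelled by the path from the root to that node (the characters on the edges along the way, concatenated in order). Edges of a trie are sometimes called transitions, and vertices are called states.
See the figure below:
It shows the set {"AAA", "AAG", "T", "TCA", "TG"}.
Vertices can also store extra information. For example, in the figure above, the string spelled by the edge letters from the root to a bold circle is an element of the actual string set. Such a node can be called a word node, and in real code a bool array can record it. In fact, the string represented by any node is a prefix of some string in the set. In particular, the root represents the empty string.
We can see that for any node, the characters on the edges to its children are all distinct. A trie makes good use of the common prefixes of strings and so saves storage space.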
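As a rough check of the space-saving claim, the sketch below builds a trie for the example set and counts its nodes. It uses nested dicts rather than the class-based layout used elsewhere in this post, and the `"$"` marker key and the function names are assumptions made only for this illustration.

```python
def build_trie(words):
    """Build a trie as nested dicts; the key "$" flags a word node."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            # Reuse the child for this character if it already exists
            node = node.setdefault(ch, {})
        node["$"] = True
    return root


def count_nodes(node):
    """Count non-root nodes, one per distinct non-empty prefix."""
    return sum(1 + count_nodes(child) for key, child in node.items() if key != "$")


words = ["AAA", "AAG", "T", "TCA", "TG"]
trie = build_trie(words)
print(count_nodes(trie))           # 8
print(sum(len(w) for w in words))  # 12
```

The five words contain 12 characters in total, but the trie needs only 8 nodes below the root because "AA" and "T" are stored once and shared.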
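If the character set is the lowercase English letters, a trie can also be seen as a 26-ary tree. Inserting or querying a string works as in any tree: find the matching edge and walk down.
My own understanding is to treat it as a variant of prefix sums.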

class Trie:

    def __init__(self):
        """
        Initialize your data structure here.
        """
        self.children = [None]*26
        self.isEnd = False


    def insert(self, word: str) -> None:
        """
        Inserts a word into the trie.
        """
        node=self
        for ch in word:
            ch = ord(ch)-ord('a')
            if not node.children[ch]:
                node.children[ch] = Trie()
            node = node.children[ch]
        node.isEnd=True


    def search(self, word: str) -> bool:
        """
        Returns if the word is in the trie.
        """
        node=self
        for ch in word:
            ch = ord(ch)-ord('a')
            if not node.children[ch]:
                return False
            node = node.children[ch]
        return node.isEnd


    def startsWith(self, prefix: str) -> bool:
        """
        Returns if there is any word in the trie that starts with the given prefix.
        """
        node=self
        for ch in prefix:
            ch = ord(ch)-ord('a')
            if not node.children[ch]:
                return False
            node = node.children[ch]
        return True


# Your Trie object will be instantiated and called as such:
# obj = Trie()
# obj.insert(word)
# param_2 = obj.search(word)
# param_3 = obj.startsWith(prefix)

Next comes the trie-based DFS, to reinforce the memory.
The trie version is about 1/5 faster than the previous one:

from typing import List


class Trie:
    def __init__(self):
        self.children = [None]*26
        self.endWord=''
    def insert(self,word):
        node = self
        for ch in word:
            ch = ord(ch)-ord('a')
            if not node.children[ch]:
                node.children[ch] = Trie()
            node= node.children[ch]
        node.endWord=word
class Solution:
    def findWords(self, board: List[List[str]], words: List[str]) -> List[str]:
        trie = Trie()
        for word in words:
            trie.insert(word)
        m,n = len(board),len(board[0])
        visited = [[False]*n for _ in range(m)]
        direction = [(1,0),(-1,0),(0,-1),(0,1)]
        res=[]
        def dfs(trie,x,y):
            cur = trie
            if cur.endWord:
                res.append(cur.endWord)
                cur.endWord=''
                # Do not return here: a longer word may share this prefix
            for d in direction:
                nx,ny = x+d[0],y+d[1]
                if nx<0 or nx>=m or ny<0 or ny>=n:
                    continue
                if visited[nx][ny]:
                    continue
                visited[nx][ny]=True
                ch = ord(board[nx][ny])-ord('a')
                if cur.children[ch]:
                    dfs(cur.children[ch],nx,ny)
                visited[nx][ny]=False
            return
        for i in range(m):
            for j in range(n):
                visited[i][j]=True
                ch = ord(board[i][j])-ord('a')
                if trie.children[ch]:
                    dfs(trie.children[ch],i,j)
                visited[i][j]=False
        return res

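This took more time than I had expected, mainly for two reasons:
1. I mishandled the initial state: the first input value (the starting cell) was not processed.
2. I did not consider words that are prefixes of other words, and returned immediately after a match.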
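The second mistake can be reproduced on a tiny board. The sketch below is an illustration only, not the code from this post: it uses a nested-dict trie, a hypothetical `stop_on_match` flag, and a `"$"` marker key, and it assumes the board never contains `"$"`.

```python
def find_words(board, words, stop_on_match):
    """Trie-guided DFS; stop_on_match=True reproduces the early-return mistake."""
    END = "$"  # marker key chosen for this sketch: a word ends at this node
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = word

    m, n = len(board), len(board[0])
    found = set()

    def dfs(node, x, y, seen):
        if END in node:
            found.add(node[END])
            if stop_on_match:
                return  # the bug: longer words sharing this prefix are never reached
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if 0 <= nx < m and 0 <= ny < n and (nx, ny) not in seen:
                child = node.get(board[nx][ny])
                if child is not None:
                    dfs(child, nx, ny, seen | {(nx, ny)})

    for i in range(m):
        for j in range(n):
            # The starting cell must consume a trie edge as well (mistake 1)
            start = root.get(board[i][j])
            if start is not None:
                dfs(start, i, j, {(i, j)})
    return sorted(found)


board = [["a", "p", "p", "l", "e"]]
print(find_words(board, ["app", "apple"], stop_on_match=True))   # ['app']
print(find_words(board, ["app", "apple"], stop_on_match=False))  # ['app', 'apple']
```

With the early return, the search stops at "app" and never reaches "apple"; without it, both words are found.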

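Comparing with the official answer, I found a few more optimizations:
1. When the characters are not limited to lowercase letters, the children can be stored in a dictionary.
2. visited can be replaced by modifying board in place.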

from collections import defaultdict
from typing import List


class Trie:
    def __init__(self):
        self.children = defaultdict(Trie)
        self.word = ""

    def insert(self, word):
        cur = self
        for c in word:
            cur = cur.children[c]
        cur.word = word


class Solution:
    def findWords(self, board: List[List[str]], words: List[str]) -> List[str]:
        trie = Trie()
        for word in words:
            trie.insert(word)
		
        def dfs(now, i1, j1):
            if board[i1][j1] not in now.children:
                return

            ch = board[i1][j1]

            nxt = now.children[ch]
            if nxt.word != "":
                ans.append(nxt.word)
                nxt.word = ""

            if nxt.children:
                board[i1][j1] = "#"
                for i2, j2 in [(i1 + 1, j1), (i1 - 1, j1), (i1, j1 + 1), (i1, j1 - 1)]:
                    if 0 <= i2 < m and 0 <= j2 < n:
                        dfs(nxt, i2, j2)
                board[i1][j1] = ch

            if not nxt.children:
                now.children.pop(ch)

        ans = []
        m, n = len(board), len(board[0])

        for i in range(m):
            for j in range(n):
                dfs(trie, i, j)

        return ans

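Author: LeetCode-Solution
Link: https://leetcode-cn.com/problems/word-search-ii/solution/dan-ci-sou-suo-ii-by-leetcode-solution-7494/
Source: LeetCode (力扣)
Copyright belongs to the author. For commercial reprints, contact the author for authorization; for non-commercial reprints, cite the source.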