Peter Norvig has an excellent article on spelling correction that includes the complete source code (21 lines).
[http://norvig.com/spell-correct.html]
The idea is to generate every possible single edit of your word:
hello - helo - deletes
hello - helol - transposes
hello - hallo - replaces
hello - heallo - inserts
alphabet = 'abcdefghijklmnopqrstuvwxyz'

def edits1(word):
    # Every way to split the word into a left part and a right part.
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits for c in alphabet if b]
    inserts = [a + c + b for a, b in splits for c in alphabet]
    return set(deletes + transposes + replaces + inserts)
Now look up each of these edits in your list; the ones that appear in it are your candidate corrections.
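The lookup step could be sketched roughly as follows. This is only an illustration, not code from the article: `known_edits` and `word_list` are hypothetical names, and the small word list stands in for whatever list you actually have. The edit generator is repeated so that the snippet runs on its own.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """Return every string that is one edit away from `word`."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in ALPHABET]
    inserts = [a + c + b for a, b in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)


def known_edits(word, known_words):
    """Keep only the edits of `word` that appear in `known_words` (hypothetical helper)."""
    return edits1(word) & set(known_words)


# Hypothetical word list, standing in for your own list.
word_list = ["hello", "help", "world", "spelling"]

print(sorted(known_edits("helo", word_list)))  # ['hello', 'help']
```

Converting the list to a set makes each lookup a constant-time membership test, which matters because even a short word produces a few hundred edits. If the list is large and queried often, it is probably worth building the set once rather than on every call.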
Peter's article is well written and worth reading in full.