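I used to rely on a TrieTree whenever I did keyword or profanity filtering. Later, while searching around, I came across this article by yeerh: http://www.cnblogs.com/yeerh/archive/2011/10/20/2219035.html . Comparing my own implementation with yeerh's TrieTree, I found that the author adds an end flag to each trie node, which speeds up the search, and it really is better than mine. So I replaced my site's keyword search with yeerh's implementation, ported to Java.
First, the interface, which I haven't fully thought through yet: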
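To make the role of the end flag concrete, here is a minimal sketch of one way to read it. It is not yeerh's code or the final implementation from this post: the class name `EndFlagTrieSketch`, the nested `Node` type and the method `matchLengthAt` are hypothetical names chosen for illustration. The assumption is that `end` marks the node where a stored word finishes, so a scan can stop as soon as it reaches such a node instead of walking further down the trie.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a trie whose nodes carry an "end" flag.
// All names here are assumptions made for illustration only.
class EndFlagTrieSketch {

    // One trie node: children keyed by character, plus the end flag.
    static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean end; // true when the path from the root to this node spells a stored word

        // Return the child for c, creating it if it does not exist yet.
        Node add(char c) {
            return children.computeIfAbsent(c, k -> new Node());
        }
    }

    private final Node root = new Node();

    // Insert a word character by character and mark its last node.
    void addWord(String word) {
        if (word == null || word.isEmpty()) return;
        Node current = root;
        for (int i = 0; i < word.length(); i++) {
            current = current.add(word.charAt(i));
        }
        current.end = true;
    }

    // Length of the shortest stored word that starts at text[start], or -1 if there is none.
    int matchLengthAt(String text, int start) {
        Node current = root;
        for (int i = start; i < text.length(); i++) {
            current = current.children.get(text.charAt(i));
            if (current == null) return -1;          // no stored word continues this way
            if (current.end) return i - start + 1;   // end flag reached: stop immediately
        }
        return -1; // the text ran out in the middle of a word
    }

    // True if any stored word occurs anywhere in the text.
    boolean containsWord(String text) {
        for (int i = 0; i < text.length(); i++) {
            if (matchLengthAt(text, i) > 0) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        EndFlagTrieSketch trie = new EndFlagTrieSketch();
        trie.addWord("bad");
        System.out.println(trie.containsWord("a bad day"));
    }
}
```

In this sketch the flag also distinguishes a stored word from a mere prefix of one: after adding only "badge", the text "bad" does not match, because the node reached by "bad" has `end == false`.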
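public interface Check {
    // Register a word to be filtered.
    public void addWord(String word);
    // Return true if the text contains at least one registered word.
    public boolean hasBadWord(String text);
    // Replace registered words found in the text, using mark as the replacement character.
    public String replaceWith(String text, char mark);
}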
The TrieTree implementation:
import java.util.*;
/**
* User: fafu
* Date: 14-7-25
* Time: 5:37 PM
* A TrieTree-based implementation of Check.
*/
public class TrieCheck implements Check {
    // Root node of the trie; it corresponds to the empty prefix.
    private TrieNode root;

    @Override
    public void addWord(String word) {
        if (word == null || word.length() == 0) return;
        TrieNode current = root;
        // Walk down the trie one character at a time, creating nodes as needed.
        for (int i = 0; i < word.length(); i++) {
            char code = word.charAt(i);
            current = current.add(code);
        }
        // Mark the last node as the end of a complete word.
        current.end = true;
    }

    @Override
    public boolean hasBadWord(String text) {
        // The text is bad as soon as one match is found.
        return getBaddWord(text) != null;
    }

    private IndexWordPair getBaddWord(String text) {
        if (text == null || text.length() == 0) return null;
        List<Character> chlist = new ArrayList<Character>();
        // Try every position in the text as a possible start of a bad word.
        for (int i = 0; i < text.length(); i++) {
            TrieNode current = root;
            int index = i;