</?a[^>]*>[|]
<script[^>]*>.*</script>[|]
</?iframe[^>]*>[|]
<table[^>]*>.*</table>[|]
</?span[^>]*>[|]
<object[^>]*>.*</object>[|]
<embed[^>]*>.*</embed>[|]
<param[^>]*>.*</param>[|]
<style[^>]*>.*</style>[|]
<p [^>]*>[|]<p>
<strong [^>]*>[|]<strong>
<b [^>]*>[|]<b>
\s{2,}[|]
</?div[^>]*>[|]
<!--[^>]*-->[|]
<input[^>]*>[|]
<img[^>]*>[|]
^\s*</[^>]+>[|]
\s+$[|]
^\s+[|]
<p>(\s+|(\&nbsp\;)*)</p>[|]
<[^/>]+>\s*$[|]
Basically, only the p and br tags are kept; everything else is filtered out.
Clean up whitespace
Clean up malformed tags
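
The rules above read as `pattern[|]replacement` pairs, with an empty replacement meaning "delete the match". The following is a minimal Python sketch of how such a list could be applied, not the original tool's implementation. It assumes that `[|]` is the separator, that rules run in order from top to bottom, that matching is case-insensitive, and that lines without a separator are notes. The names `parse_rules`, `apply_rules`, and `SAMPLE_RULES` are hypothetical.

```python
import re

SEPARATOR = "[|]"  # assumed delimiter between pattern and replacement

# A small subset of the rules, copied from the list above.
SAMPLE_RULES = r"""
</?a[^>]*>[|]
<p [^>]*>[|]<p>
</?span[^>]*>[|]
"""


def parse_rules(text):
    """Turn 'pattern[|]replacement' lines into (compiled_regex, replacement) pairs."""
    rules = []
    for line in text.splitlines():
        if SEPARATOR not in line:
            continue  # blank lines and free-text notes carry no rule
        pattern, replacement = line.split(SEPARATOR, 1)
        # Case-insensitive matching is an assumption, not stated in the rule list.
        rules.append((re.compile(pattern, re.IGNORECASE), replacement))
    return rules


def apply_rules(html, rules):
    """Apply every rule in order; an empty replacement deletes the match."""
    for pattern, replacement in rules:
        # The lambda inserts the replacement literally, so backslashes in it
        # are not interpreted as group references.
        html = pattern.sub(lambda match, text=replacement: text, html)
    return html


if __name__ == "__main__":
    rules = parse_rules(SAMPLE_RULES)
    print(apply_rules('<p class="x">Hello <a href="#">world</a> <span>!</span></p>', rules))
```

Order matters in this sketch: the whitespace and malformed-tag rules near the end of the list only make sense after the tag-stripping rules have run. Patterns such as `<script[^>]*>.*</script>` are greedy, so with more than one matching element on a line they would remove everything between the first opening tag and the last closing tag.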