```python
import re

def textparse(bigstring):
    # Note: \W* can also match the empty string
    listOfTokens = re.split(r'\W*', bigstring)
    # Keep lowercased tokens longer than two characters
    return [i.lower() for i in listOfTokens if len(i) > 2]

mySent = 'This book is the best book on python, or M.L. I have ever laid eyes upon'
# mySent = 'this book'
print(textparse(mySent))
```
Result:
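After running this, all that is printed is an empty pair of square brackets (`[]`). The reason is that the regular expression splits the string all the way down to single characters: `\W*` can match an empty string, and since Python 3.7 `re.split` also splits on empty matches, so no token passes the `len(i) > 2` filter.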
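To see why the list comes back empty, it helps to look at the raw tokens before the length filter is applied. The following is a minimal sketch, assuming Python 3.7 or later; the names `sample`, `tokens_star` and `tokens_plus` are illustrative and not part of the original code.

```python
import re

sample = 'This book is the best'

# \W* can match the empty string, so (on Python 3.7+) re.split cuts
# between every pair of adjacent characters.
tokens_star = re.split(r'\W*', sample)

# \W+ requires at least one non-word character, so whole words survive.
tokens_plus = re.split(r'\W+', sample)

print(tokens_star)
print(tokens_plus)  # ['This', 'book', 'is', 'the', 'best']
print([t for t in tokens_star if len(t) > 2])  # []
```

Every token produced by `\W*` is at most one character long, which is why the `len(i) > 2` filter discards all of them.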
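```python
import re

def textparse(bigstring):
    # Raw string avoids the invalid-escape warning for \W;
    # (?:...) is a non-capturing group, so delimiters are not returned
    regex = re.compile(r'(?:\W)')
    listOfTokens = regex.split(bigstring)
    # Keep lowercased tokens longer than two characters
    return [i.lower() for i in listOfTokens if len(i) > 2]

mySent = 'This book is the best book on python, or M.L. I have ever laid eyes upon'
# mySent = 'this book'
print(textparse(mySent))
```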
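After looking this up: if you do not want the delimiters kept in the result, write the group in the non-capturing form `(?:...)`. Note that the empty list goes away because this pattern no longer carries the `*`, so it cannot match an empty string; the non-capturing group only ensures the delimiters themselves are not returned.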
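The effect of the group type on `re.split` can be checked directly. This is a small sketch rather than part of the original post; `text`, `with_delims`, `non_capturing` and `no_group` are illustrative names.

```python
import re

text = 'python, or M.L.'

# A capturing group makes re.split return the delimiters as well.
with_delims = re.split(r'(\W)', text)

# A non-capturing group, or no group at all, leaves them out.
non_capturing = re.split(r'(?:\W)', text)
no_group = re.split(r'\W', text)

print(with_delims)
print(non_capturing)
print(non_capturing == no_group)  # True
```

Because `textparse` later drops every token of length two or less, splitting on `\W+` would give the same final word list while producing fewer empty strings along the way.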
![Image](https://img-blog.csdnimg.cn/20190517005552918.png)
Machine Learning in Action, Chapter 4: Bayesian Spam Filtering