Because of the move from Python 2 to Python 3, this line from Chapter 4 of 《机器学习实战》 (Machine Learning in Action):
emailText = open('email/ham/6.txt').read()
needs to be changed to read the file in binary mode:
emailText = open('email/ham/6.txt','rb').read()
However, the subsequent call
listOfTokens = regEx.split(emailText)
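then raises the following error, because regEx is compiled from a str pattern while emailText is now a bytes object: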
TypeError: cannot use a string pattern on a bytes-like object
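The mismatch can be reproduced without the book's data files. The following is a minimal sketch: the pattern `r'\W+'`, the sample bytes, and the helper `try_split` are assumptions made for illustration and are not taken from the book.

```python
import re

# Hypothetical stand-ins: neither the pattern nor the sample text is from the book.
regEx = re.compile(r'\W+')                    # compiled from a str pattern
emailText = b'Hello, this is a test email.'   # bytes, like open(..., 'rb').read()

def try_split(pattern, text):
    """Return the token list, or the TypeError message if the types do not match."""
    try:
        return pattern.split(text)
    except TypeError as exc:
        return str(exc)

# A str pattern applied to bytes fails; the error message is returned as a string.
result = try_split(regEx, emailText)
print(result)

# After decoding, both sides are str and the split succeeds.
decoded_result = try_split(regEx, emailText.decode('ascii'))
print(decoded_result)
```

The rule in Python 3 is that the pattern and the text must be the same type: both str, or both bytes.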
After searching online, I found that this can be fixed by decoding the bytes back into a str:
emailText = emailText.decode('GBK')
or, in a single step:
emailText = open('email/ham/6.txt','rb').read().decode('GBK')
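For reference, the fix can be wrapped in a small helper. This is only a sketch under stated assumptions: the function names `read_tokens` and `read_tokens_text_mode`, the pattern `r'\W+'`, the temporary sample file, and the `errors='ignore'` argument are all illustrative and not part of the book's code. The second helper shows an alternative that skips the binary read by passing `encoding` to `open` directly.

```python
import os
import re
import tempfile

regEx = re.compile(r'\W+')  # assumed pattern, not the book's exact one

def read_tokens(path, encoding='GBK'):
    """Read the file as bytes, decode it to str, and split it into non-empty tokens."""
    with open(path, 'rb') as f:
        raw = f.read()
    # errors='ignore' silently drops undecodable bytes (an assumption; remove it
    # if you would rather see a UnicodeDecodeError).
    text = raw.decode(encoding, errors='ignore')
    return [tok for tok in regEx.split(text) if tok]

def read_tokens_text_mode(path, encoding='GBK'):
    """Alternative: let open() decode the file by passing the encoding explicitly."""
    with open(path, encoding=encoding, errors='ignore') as f:
        return [tok for tok in regEx.split(f.read()) if tok]

# Demo on a temporary GBK-encoded file (a stand-in for email/ham/6.txt).
with tempfile.NamedTemporaryFile('wb', suffix='.txt', delete=False) as tmp:
    tmp.write('Hi Peter, 你好'.encode('GBK'))
    sample_path = tmp.name

tokens_a = read_tokens(sample_path)
tokens_b = read_tokens_text_mode(sample_path)
os.remove(sample_path)
print(tokens_a)
```

Either way, the encoding has to match how the file was actually saved; GBK is the choice used above, and a different file may need a different codec.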