Note that in Python 2 the file is opened in binary mode:
with open("test.in", 'rb') as f:
whereas in Python 3 it is opened in text mode:
with open("test.in", 'r') as f:
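As a small illustration of why the mode matters in Python 3 (a sketch, not taken from the original notes): text mode 'r' returns str, while binary mode 'rb' returns bytes. The temporary file and its contents are made up for the example.

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained (hypothetical content).
with tempfile.NamedTemporaryFile("wb", suffix=".in", delete=False) as tmp:
    tmp.write(b"hello\n")
    path = tmp.name

with open(path, "r") as f:   # text mode: contents are decoded to str
    text_data = f.read()

with open(path, "rb") as f:  # binary mode: contents stay as raw bytes
    binary_data = f.read()

os.remove(path)

print(type(text_data).__name__, type(binary_data).__name__)
```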
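Also, a .txt file downloaded directly from the web may contain stray characters. It is safest to copy its contents into a newly created .txt file; otherwise reading it may fail with:
UnicodeDecodeError: 'XXX' codec can't decode bytes in position 2-5: illegal multibyte sequence
This happens because the file contains illegal characters. For example, a full-width space often shows up in several different byte forms, such as \xa3\xa0 or \xa4\x57.
These all look like a full-width space, but they are not the "legal" full-width space.
The real full-width space is \xa1\xa1 (its GBK/GB2312 encoding), so decoding the other forms raises an exception.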
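If re-creating the file by hand is not practical, one possible workaround is to let open() replace the bytes it cannot decode. The sketch below assumes the file is meant to be GBK; the helper name read_text_lenient and the sample bytes are hypothetical, chosen only to show a valid full-width space next to an invalid byte sequence.

```python
import os
import tempfile


def read_text_lenient(path, encoding="gbk"):
    """Read a text file, replacing undecodable bytes with U+FFFD."""
    with open(path, "r", encoding=encoding, errors="replace") as f:
        return f.read()


# Hypothetical sample: a valid full-width space (\xa1\xa1) followed by
# the byte 0x80, which is not a valid lead byte for Python's gbk codec.
raw = b"a\xa1\xa1b\x80\x80c"
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(raw)
    path = tmp.name

try:
    raw.decode("gbk")  # strict decoding fails on the bad bytes
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

text = read_text_lenient(path)
os.remove(path)

print(strict_ok)
print(repr(text))
```

Replacing bad bytes loses information, so this is better suited to inspecting a file than to processing data that must stay exact.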
One more thing to note:
dict.iteritems() in Python 2 corresponds to items() in Python 3. Calling iteritems() in Python 3 raises:
AttributeError: 'dict' object has no attribute 'iteritems'
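A minimal sketch of the Python 3 form, using a made-up dictionary: items() returns a view of key/value pairs that can be iterated directly, and the old iteritems() method no longer exists.

```python
# Hypothetical data, used only to illustrate the iteration.
scores = {"alice": 90, "bob": 85}

# Python 3: iterate over key/value pairs with items().
pairs = []
for name, score in scores.items():
    pairs.append((name, score))

# iteritems() was removed from dict in Python 3.
has_iteritems = hasattr(scores, "iteritems")

print(sorted(pairs))
print(has_iteritems)
```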
In Python 2.7 the following code runs successfully:
from sklearn.feature_extraction import DictVectorizer
import csv
from sklearn import preprocessing
from sklearn import tree
from sklearn.externals.six import StringIO

# Python 2: open the CSV file in binary mode
allElectronicsData = open(r'C:\Users\asus\Desktop\Machine Learing\Decision Tree\2.csv', 'rb')
reader = csv.reader(allElectronicsData)
headers = reader.next()  # Python 2 only: the reader has a next() method
print(headers)
Running it in Python 3 produces this error:
Traceback (most recent call last):
  File "F:\Techonolgoy\Python\file\blog\csv_open.py", line 8, in <module>
    header = reader.next()
AttributeError: '_csv.reader' object has no attribute 'next'
The corrected version:
from sklearn.feature_extraction import DictVectorizer
import csv
from sklearn import preprocessing
from sklearn import tree
# Note: sklearn.externals.six was removed in newer scikit-learn releases;
# there, use `from io import StringIO` instead.
from sklearn.externals.six import StringIO

# Changed: the mode used to be 'rb'
allElectronicsData = open(r'C:\Users\asus\Desktop\Machine Learing\Decision Tree\2.csv', 'r')
reader = csv.reader(allElectronicsData)
headers = next(reader)  # Changed: this used to be reader.next()
print(headers)
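The same pattern can be tried without the original CSV file. The sketch below feeds csv.reader from an in-memory string; the column names and rows are invented for illustration and are not the contents of 2.csv.

```python
import csv
import io

# Hypothetical CSV content standing in for the real file.
sample = io.StringIO("age,income,buys\nyouth,high,no\nsenior,low,yes\n")

reader = csv.reader(sample)
headers = next(reader)  # built-in next() replaces the Python 2 reader.next()
rows = list(reader)     # the remaining lines are the data rows

print(headers)
print(rows)
```

When reading from a real file in Python 3, the csv module documentation recommends opening it with newline='' so that embedded line breaks are handled correctly.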
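When calling predict(), the input has to be a 2-D array rather than a flat list, so the list must be converted to an array first.
The difference most likely comes from the scikit-learn version installed rather than from Python 2 versus Python 3 itself. Passing a flat list triggers a warning, so the correct form is:
import numpy as np

list1 = [2, 0]
array = np.array(list1).reshape(1, -1)  # one sample with two features
# clf is the classifier trained earlier
print(clf.predict(array))  # predict expects a 2-D array; a flat list triggers a warning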