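In machine learning, you often need to read data from a txt file. This post covers two ways to do it.
Data contents
The file has four columns: the first three are feature values and the last is the class label.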
40920 8.326976 0.953952 3
14488 7.153469 1.673904 2
26052 1.441871 0.805124 1
75136 13.147394 0.428964 1
38344 1.669788 0.134296 1
72993 10.141740 1.032955 1
35948 6.830792 1.213192 3
42666 13.276369 0.543880 3
67497 8.631577 0.749278 1
35483 12.273169 1.508053 3
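To make the column layout concrete, here is a minimal sketch that parses a single row into its three feature values and its label. It assumes the columns are separated by whitespace (tabs or spaces), and `parse_row` is a hypothetical helper name used only for illustration, not something from the original post.

```python
# First row of the sample data; whitespace-separated columns are an assumption.
sample_line = "40920 8.326976 0.953952 3"

def parse_row(line):
    """Hypothetical helper: split one row into (features, label)."""
    fields = line.strip().split()                      # split on any whitespace
    features = [float(value) for value in fields[:3]]  # first three columns
    label = int(fields[-1])                            # last column is the class label
    return features, label

features, label = parse_row(sample_line)
print(features, label)
```

Both reading methods have to do this same split for every line of the file, either by hand or through a library call.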
Method 1: Reading the file manually
from numpy import *
import operator
from os import listdir

def file2matrix(filename):
    # Read the file once; the with-block closes it automatically
    with open(filename) as fr:
        lines = fr.readlines()
    numberOfLines = len(lines)             # number of lines in the file
    returnMat = zeros((numberOfLines, 3))  # feature matrix to return
    classLabelVector = []                  # class labels to return
    index = 0
    for line in lines:
        line = line.strip()                # drop the trailing newline
        listFromLine = line.split()        # split on any whitespace (tabs or spaces)