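Task 1: Delete every row of the dataset whose information is abnormal.
For example, in the sample above, row 4 has only 3 elements while every other row has 6, so row 4 is an abnormal row and is deleted.
The dataset may also contain other kinds of anomalies.
After all of the data has been processed, write each row to the file `test1`, with its elements separated by commas.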
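By default, `writerow` in the standard-library `csv` module appends '\r\n' as the line terminator when it writes a row.
If the file is opened for writing with `newline=None`, csv first produces 'a\r\nb\r\n' in memory; when that text is written to the file, universal newlines mode translates each '\n' into the platform line separator. On Windows that separator is '\r\n', so the file ends up containing 'a\r\r\nb\r\r\n' and shows a blank line after every row.
If the file is opened for writing with `newline=''`, no translation takes place and the program writes 'a\r\nb\r\n' unchanged.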
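The difference can be checked by looking at the raw bytes on disk. The sketch below is only an illustration and not part of the original solution: the helper name `written_bytes` is hypothetical, and `newline='\r\n'` is used to imitate what the default `newline=None` does on Windows, so the result is the same on every platform.

```python
import csv
import os
import tempfile

def written_bytes(newline):
    """Write two one-column rows with csv.writer and return the raw file bytes."""
    fd, path = tempfile.mkstemp(suffix='.csv')
    os.close(fd)
    try:
        with open(path, 'w', newline=newline) as f:
            writer = csv.writer(f)
            writer.writerow(['a'])
            writer.writerow(['b'])
        with open(path, 'rb') as f:
            return f.read()
    finally:
        os.remove(path)

# newline='' disables translation: csv's own '\r\n' reaches the file as-is.
no_translation = written_bytes('')
# newline='\r\n' translates every '\n' to '\r\n', as Windows does by default,
# so each row ends in '\r\r\n'.
windows_like = written_bytes('\r\n')
print(no_translation)
print(windows_like)
```

This is why files handed to `csv.writer` are normally opened with `newline=''`.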
import csv
# Convert the space-separated raw file into a comma-separated file.
with open('data', 'r') as f, open('Data.csv', 'w') as fnew:
    for row in f:
        fnew.write(row.replace(' ', ','))
# Open the converted data for reading and create the output file.
with open('Data.csv', 'r', newline='') as inf, \
        open('data1.csv', 'w', newline='') as outf:
    filereader = csv.reader(inf)
    filewriter = csv.writer(outf)
    # Keep only rows that have 6 elements and a non-zero third element.
    for row in filereader:
        if len(row) == 6 and int(row[2]) != 0:
            filewriter.writerow(row)
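Since the task says the dataset may contain other anomalies, the filter can be pulled into a small predicate that is easy to extend. The following is a minimal sketch under stated assumptions, not the original solution: the function name `is_valid_row` is hypothetical, the sample rows are made up for illustration, and the check that the last three columns are numeric is an assumption the task text does not state.

```python
import csv
import io

def is_valid_row(row, width=6):
    """Return True if the row passes these illustrative sanity checks."""
    # Wrong number of elements (the anomaly named in the task).
    if len(row) != width:
        return False
    # Empty fields are treated as abnormal (assumption).
    if any(field.strip() == '' for field in row):
        return False
    try:
        third = int(row[2])
        # Assumption: the last three columns are numeric readings.
        for field in row[3:]:
            float(field)
    except ValueError:
        return False
    # A zero third element is abnormal, as in the original filter.
    return third != 0

# Made-up sample: one good row, one zero third field, one short row,
# one row with a non-numeric reading.
sample = io.StringIO(
    "33,Jogging,49105962326000,-0.69,12.68,0.50\n"
    "33,Jogging,0,5.01,11.26,0.95\n"
    "33,Jogging,49106\n"
    "33,Walking,49106222305000,abc,10.88,-0.08\n"
)
kept = [row for row in csv.reader(sample) if is_valid_row(row)]
print(kept)
```

Unlike a bare `int(row[2])`, this version does not raise `ValueError` when the third element is not a number; it simply drops the row.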
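Task 2: Count the rows for each action in the data of file `test1` and print the counts to the screen. Then round each action's count down to a multiple of 100 and write that many rows of the action to the file `test2`, discarding the surplus rows.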
import csv
counts = {}
with open('temp1_1.csv', 'r', newline='') as inf:
    # Count the rows of each action (the second column), e.g. Walking,
    # Jogging, Upstairs, Downstairs, Sitting, Standing.
    for row in csv.reader(inf):
        counts[row[1]] = counts.get(row[1], 0) + 1
    # Print each count, then round it down to a multiple of 100.
    limits = {}
    for move, amount in counts.items():
        print('Movement: %5s\t' % move, end='')
        print('Amount: %d' % amount)
        limits[move] = amount // 100 * 100
    # Move the file pointer back to the start for a second pass.
    inf.seek(0, 0)
    with open('temp2_2.csv', 'w', newline='') as outf:
        filewriter = csv.writer(outf)
        written = {move: 0 for move in limits}
        # Write rows of each action until its limit is reached; drop the rest.
        for row in csv.reader(inf):
            move = row[1]
            if written[move] < limits[move]:
                written[move] += 1
                filewriter.writerow(row)
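The rounding step is plain integer arithmetic: `n // 100 * 100` turns 250 into 200 and 99 into 0. The sketch below shows the same count-then-truncate idea on in-memory rows with `collections.Counter`, keyed by action name instead of by list position. It is an illustration only: the function name `truncate_per_action` and the `step` parameter are hypothetical, the sample rows are made up, and `step=2` is used so that the example stays small.

```python
from collections import Counter

def truncate_per_action(rows, step=100):
    """Keep, for each action, the first (count // step * step) rows."""
    # First pass: how many rows does each action (column 2) have?
    counts = Counter(row[1] for row in rows)
    limits = {action: n // step * step for action, n in counts.items()}
    # Second pass: keep rows until the action's limit is reached.
    written = Counter()
    kept = []
    for row in rows:
        action = row[1]
        if written[action] < limits[action]:
            written[action] += 1
            kept.append(row)
    return counts, kept

rows = [['1', 'Walking']] * 5 + [['1', 'Sitting']] * 3 + [['1', 'Jogging']]
counts, kept = truncate_per_action(rows, step=2)
kept_counts = Counter(row[1] for row in kept)
print(counts)
print(kept_counts)
```

Looking limits up by action name avoids depending on the order in which the actions first appear in the file.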
Task 3: Read the data of file `test2`, take the last 3 columns of each row, and write them to the file `test3` with a space as the separator.
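import csv
with open('text2.csv', 'r', newline='') as f, \
        open('text3.csv', 'w', encoding='utf-8', newline='') as ff:
    rf = csv.reader(f)
    # csv.writer separates fields with commas by default; the task asks for spaces.
    wff = csv.writer(ff, delimiter=' ')
    for row in rf:
        # Columns 4 to 6 are the last three elements of a 6-element row.
        wff.writerow(row[3:6])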
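Note that `csv.writer` separates fields with commas unless told otherwise, while the task asks for spaces. The sketch below shows the `delimiter=' '` argument on an in-memory buffer so the exact output is visible; the two sample rows are made up for illustration.

```python
import csv
import io

# Made-up rows in the 6-column layout used by the earlier tasks.
rows = [
    ['33', 'Jogging', '49105962326000', '-0.69', '12.68', '0.50'],
    ['33', 'Jogging', '49106062271000', '5.01', '11.26', '0.95'],
]
buffer = io.StringIO(newline='')
writer = csv.writer(buffer, delimiter=' ')
for row in rows:
    # row[-3:] takes the last three elements regardless of the row length.
    writer.writerow(row[-3:])
text = buffer.getvalue()
print(repr(text))
```

With a space as the delimiter, any field that itself contains a space would be quoted by the writer; purely numeric fields like these are written as they are.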
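Task 4: Read the data of file `test3`. Each row of data is one group; the elements inside a group are separated by spaces, the groups are separated from one another by commas, and every 20 groups form one line.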
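count = 0
with open('text3', 'r') as fp, open('finally', 'w') as fp_new:
    for row in fp:
        count = count + 1
        # Replace the newline with a comma to join this group to the next one;
        # every 20th group keeps its newline and so ends the output line.
        if count % 20 != 0:
            row = row.replace('\n', ',')
        fp_new.write(row)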
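The loop above leaves a trailing comma on the last line whenever the number of groups is not a multiple of 20. A variant that avoids this is sketched below by slicing the groups into chunks and joining each chunk with `','.join`. It is a sketch rather than the original approach: the function name `join_groups` is hypothetical, the sample lines are made up, and `size=2` keeps the example short.

```python
def join_groups(lines, size=20):
    """Join groups with commas, starting a new output line every `size` groups."""
    # One group per non-empty input line, with its line ending removed.
    groups = [line.strip() for line in lines if line.strip()]
    out = []
    for start in range(0, len(groups), size):
        out.append(','.join(groups[start:start + size]))
    return '\n'.join(out) + '\n' if out else ''

lines = ['1 2 3\n', '4 5 6\n', '7 8 9\n', '1 1 1\n', '2 2 2\n']
result = join_groups(lines, size=2)
print(result, end='')
```

To use it on a file, the open file object can be passed as `lines` and the returned string written to the output file in a single call.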