1. Basic Python
vi 4csv_reader_value_in_set.py
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import csv
import sys

input_file = sys.argv[1]
output_file = sys.argv[2]
important_dates = ['1/20/2014', '1/30/2014']

# In Python 3, open CSV files in text mode with newline='' (not 'rb'/'wb').
with open(input_file, 'r', newline='') as csv_in_file:
    with open(output_file, 'w', newline='') as csv_out_file:
        filereader = csv.reader(csv_in_file)
        filewriter = csv.writer(csv_out_file)
        header = next(filereader)
        filewriter.writerow(header)
        for row_list in filereader:
            a_date = row_list[4]  # the Purchase Date column
            # Keep the row when this cell's value is one of the listed dates.
            if a_date in important_dates:
                filewriter.writerow(row_list)
# Result
[root@mysql51 python_scripts]# python 4csv_reader_value_in_set.py supplier_data.csv 5output_csv.csv
[root@mysql51 python_scripts]# more 5output_csv.csv
Supplier Name,Invoice Number,Part Number,Cost,Purchase Date
Supplier x,001-1001,2341,$500 ,1/20/2014
Supplier x,001-1001,2341,$501 ,1/20/2014
Supplier x,001-1001,5467,$750 ,1/20/2014
Supplier x,001-1001,5467,$750 ,1/20/2014
Supplier y,50-9501,7009,$250 ,1/30/2014
Supplier y,50-9501,7009,$250 ,1/30/2014
# The script keeps a row only when the value in its Purchase Date cell belongs to the list.
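The same membership test can be sketched as a reusable function. This is only an illustration under stated assumptions: the function name filter_rows_by_value is hypothetical, the "Supplier z" row is invented sample data so that one row gets filtered out, and the input is an in-memory string rather than supplier_data.csv. It looks the column up by header name with csv.DictReader and stores the wanted values in a set.

```python
import csv
import io


def filter_rows_by_value(in_stream, out_stream, column, wanted_values):
    """Copy the header plus every row whose `column` value is in `wanted_values`.

    Hypothetical helper for illustration; returns the number of rows kept.
    """
    wanted = set(wanted_values)  # set membership tests are O(1) on average
    reader = csv.DictReader(in_stream)
    writer = csv.DictWriter(
        out_stream, fieldnames=reader.fieldnames, lineterminator="\n"
    )
    writer.writeheader()
    kept = 0
    for row in reader:
        if row[column] in wanted:
            writer.writerow(row)
            kept += 1
    return kept


# In-memory sample; the "Supplier z" row is made up for this example.
sample = io.StringIO(
    "Supplier Name,Invoice Number,Part Number,Cost,Purchase Date\n"
    "Supplier x,001-1001,2341,$500 ,1/20/2014\n"
    "Supplier z,920-4803,3321,$615 ,2/3/2014\n"
    "Supplier y,50-9501,7009,$250 ,1/30/2014\n"
)
result = io.StringIO()
kept_count = filter_rows_by_value(
    sample, result, "Purchase Date", ["1/20/2014", "1/30/2014"]
)
print(kept_count)
print(result.getvalue())
```

Looking the column up by name avoids relying on the hard-coded index 4, and for a two-item list the set makes no practical difference; it only matters when the list of wanted values grows large.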
2. pandas: selecting rows where a value belongs to a set
vi pandas_value_in_set.py
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import sys

import pandas as pd

input_file = sys.argv[1]
output_file = sys.argv[2]

data_frame = pd.read_csv(input_file)
important_dates = ['1/20/2014', '1/30/2014']
# isin() builds a boolean mask; loc keeps the rows where the mask is True.
data_frame_value_in_set = data_frame.loc[data_frame['Purchase Date'].isin(important_dates), :]
data_frame_value_in_set.to_csv(output_file, index=False)
# Result
python C:\Users\4201.HJSC\PycharmProjects\pythonProject\pandas_value_in_set.py \
C:\Users\4201.HJSC\Desktop\Python_exercise\supplier_data.csv \
C:\Users\4201.HJSC\Desktop\Python_exercise\6output_csv.csv
cat 6output_csv.csv
Supplier Name,Invoice Number,Part Number,Cost,Purchase Date
Supplier x,001-1001,2341,$500 ,1/20/2014
Supplier x,001-1001,2341,$501 ,1/20/2014
Supplier x,001-1001,5467,$750 ,1/20/2014
Supplier x,001-1001,5467,$750 ,1/20/2014
Supplier y,50-9501,7009,$250 ,1/30/2014
Supplier y,50-9501,7009,$250 ,1/30/2014
3. Summary
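pandas provides the read_csv and to_csv functions, the loc indexer, and the isin method.
Together they read a CSV file, select rows and columns, test whether each value belongs to a set, and write the result back to a CSV file, which makes this kind of filtering very convenient.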
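To make each step visible, here is a minimal sketch that runs the same pipeline on an in-memory string instead of supplier_data.csv. The "Supplier z" row is invented sample data, and the intermediate name mask is introduced only for illustration.

```python
import io

import pandas as pd

# In-memory sample; the "Supplier z" row is made up for this example.
csv_text = (
    "Supplier Name,Invoice Number,Part Number,Cost,Purchase Date\n"
    "Supplier x,001-1001,2341,$500 ,1/20/2014\n"
    "Supplier z,920-4803,3321,$615 ,2/3/2014\n"
    "Supplier y,50-9501,7009,$250 ,1/30/2014\n"
)

data_frame = pd.read_csv(io.StringIO(csv_text))
important_dates = ['1/20/2014', '1/30/2014']

# Step 1: isin() returns a boolean Series, one True/False per row.
mask = data_frame['Purchase Date'].isin(important_dates)
print(mask.tolist())

# Step 2: loc selects the rows where the mask is True, and all columns.
selected = data_frame.loc[mask, :]

# Step 3: to_csv() without a path returns the CSV text instead of writing a file.
output_text = selected.to_csv(index=False)
print(output_text)
```

Splitting out the mask makes it easy to inspect which rows match before anything is written, and the same mask can be negated with ~mask to keep the rows that are not in the set.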