How to compare 2 CSV files by value in Python and print the differences?

I have 2 CSV files with the same dimensions. In the example below the dimensions are 3*3 (3 comma-separated values and 3 rows), but the real files could be as large as 100*10000.

File1.csv:

    Name, ID, Profession
    Tom, 1, Teacher
    Dick, 2, Actor

File2.csv:

    Name, ID, Profession
    Dick, 2, Actor
    Tom, 1, Police

I want to compare the files element-wise (e.g. Teacher == Police).

It would be great if I could compare the rows using a primary key (ID), in case the rows are not in the same order. I would like output something like this:

Profession of ID = 1 does not match, i.e. Teacher <> Police

ID in the output above is the primary key.

Note: the files may be very large (100 columns * 10000 records).
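For reference, the output described above can be produced roughly as follows. This is a minimal sketch, not a definitive implementation: the helper names `load_by_key` and `describe_mismatches` are made up for illustration, the sample data is inlined with `io.StringIO` instead of being read from disk, and it assumes every ID is unique and that both files fit in memory (100 columns * 10000 records should).

```python
import csv
import io


def load_by_key(lines, key):
    """Index the rows of a CSV source by the value in column `key`."""
    # skipinitialspace drops the blank after each comma ("Tom, 1, Teacher")
    reader = csv.DictReader(lines, skipinitialspace=True)
    return {row[key]: row for row in reader}


def describe_mismatches(source, target, key="ID"):
    """Return one message per differing cell among keys present in both."""
    messages = []
    for key_value, source_row in source.items():
        target_row = target.get(key_value)
        if target_row is None:
            continue  # rows missing from the target are ignored in this sketch
        for column, value in source_row.items():
            if column != key and value != target_row.get(column):
                messages.append(
                    f"{column} of {key} = {key_value} does not match, "
                    f"i.e. {value} <> {target_row.get(column)}"
                )
    return messages


# Inlined copies of File1.csv and File2.csv (an assumption, to keep this runnable)
file1 = io.StringIO("Name, ID, Profession\nTom, 1, Teacher\nDick, 2, Actor\n")
file2 = io.StringIO("Name, ID, Profession\nDick, 2, Actor\nTom, 1, Police\n")

for message in describe_mismatches(load_by_key(file1, "ID"), load_by_key(file2, "ID")):
    print(message)
```

Because the rows are looked up by ID rather than by position, the different row order in the two files does not matter. With real files, an open file object can be passed in place of the `io.StringIO` objects.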

Below is the code I used to get the lists A and B from the 2 CSV files. It is very tedious, and even such long code reads only the first few lines.

    # read the first three lines of File1.csv, one at a time
    source_file = open('File1.csv', 'r')

    file_one_line_1 = source_file.readline()
    file_one_line_1_str = str(file_one_line_1)
    file_one_line_1_str_replace = file_one_line_1_str.replace('\n', '')
    file_one_line_1_list = list(file_one_line_1_str_replace.split(','))

    file_one_line_2 = source_file.readline()
    file_one_line_2_str = str(file_one_line_2)
    file_one_line_2_str_replace = file_one_line_2_str.replace('\n', '')
    file_one_line_2_list = list(file_one_line_2_str_replace.split(','))

    file_one_line_3 = source_file.readline()
    file_one_line_3_str = str(file_one_line_3)
    file_one_line_3_str_replace = file_one_line_3_str.replace('\n', '')
    file_one_line_3_list = list(file_one_line_3_str_replace.split(','))

    A = [file_one_line_1_list, file_one_line_2_list, file_one_line_3_list]
    source_file.close()

    # same again for File2.csv; every line must be read from target_file
    target_file = open('File2.csv', 'r')

    file_two_line_1 = target_file.readline()
    file_two_line_1_str = str(file_two_line_1)
    file_two_line_1_str_replace = file_two_line_1_str.replace('\n', '')
    file_two_line_1_list = list(file_two_line_1_str_replace.split(','))

    file_two_line_2 = target_file.readline()
    file_two_line_2_str = str(file_two_line_2)
    file_two_line_2_str_replace = file_two_line_2_str.replace('\n', '')
    file_two_line_2_list = list(file_two_line_2_str_replace.split(','))

    file_two_line_3 = target_file.readline()
    file_two_line_3_str = str(file_two_line_3)
    file_two_line_3_str_replace = file_two_line_3_str.replace('\n', '')
    file_two_line_3_list = list(file_two_line_3_str_replace.split(','))

    B = [file_two_line_1_list, file_two_line_2_list, file_two_line_3_list]
    target_file.close()

I used the code below, and it works smoothly:

```python
import csv

source_file = 'Book1.csv'
target_file = 'Book2.csv'
primary_key = 'id'

# read source and target files
with open(source_file, 'r', newline='') as f:
    reader = csv.reader(f)
    A = list(reader)

with open(target_file, 'r', newline='') as f:
    reader = csv.reader(f)
    B = list(reader)

# get the index of the primary-key column
column_names = A[0]
column_id = column_names.index(primary_key)

# get the column names without the primary key
values_name = column_names[0:column_id] + column_names[column_id + 1:]

# create a dictionary with keys in column `column_id`
# and values the list of the other column values
A_dict = {a[column_id]: a[0:column_id] + a[column_id + 1:] for a in A}
B_dict = {b[column_id]: b[0:column_id] + b[column_id + 1:] for b in B}

# iterate on the keys and on the other columns and print the differences
for key in A_dict.keys():
    for column in range(len(column_names) - 1):
        if A_dict[key][column] != B_dict[key][column]:
            print(f"{primary_key} = {key}\t{values_name[column]}: {A_dict[key][column]} != {B_dict[key][column]}")
```
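One limitation of the loop above is that it looks up every key of `A_dict` in `B_dict`, so a key that exists only in the source file raises a `KeyError`, and a key that exists only in the target file is never reported. A possible way to separate those cases first is sketched below; `key_differences` is a hypothetical helper and the dictionaries are made-up data in the same shape as `A_dict` and `B_dict`.

```python
def key_differences(a_dict, b_dict):
    """Split primary keys into: only in A, only in B, and present in both."""
    # dict.keys() returns a set-like view, so - and & work directly on it
    only_in_a = sorted(a_dict.keys() - b_dict.keys())
    only_in_b = sorted(b_dict.keys() - a_dict.keys())
    shared = sorted(a_dict.keys() & b_dict.keys())
    return only_in_a, only_in_b, shared


# Hypothetical data: primary key -> values of the other columns
A_dict = {'1': ['Tom', 'Teacher'], '2': ['Dick', 'Actor'], '3': ['Harry', 'Nurse']}
B_dict = {'2': ['Dick', 'Actor'], '1': ['Tom', 'Police'], '4': ['Sally', 'Pilot']}

only_in_a, only_in_b, shared = key_differences(A_dict, B_dict)
print("only in source:", only_in_a)
print("only in target:", only_in_b)
print("safe to compare:", shared)
```

Iterating over `shared` instead of `A_dict.keys()` would then avoid the `KeyError`, while the other two lists can be reported separately.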

Thanks.

Solution

For reading a CSV file and storing its content as nested lists, see https://stackoverflow.com/a/35340988/12669658

For comparing the lists element-wise, refer to your dedicated question: https://stackoverflow.com/a/59633822/12669658
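As an illustration of the nested-list approach (not a copy of the linked answer), the standard `csv` module can read a file of any length in a few lines. In this sketch `read_rows` is a hypothetical helper name, and a throwaway copy of File1.csv is written to a temporary folder only so that the example runs on its own.

```python
import csv
import os
import tempfile


def read_rows(path):
    """Return every row of the CSV file at `path` as a list of cell strings."""
    with open(path, newline="") as f:
        # skipinitialspace=True turns "Tom, 1, Teacher" into ['Tom', '1', 'Teacher']
        return list(csv.reader(f, skipinitialspace=True))


# Write a temporary copy of File1.csv (an assumption, to keep this self-contained)
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "File1.csv")
    with open(path, "w", newline="") as f:
        f.write("Name, ID, Profession\nTom, 1, Teacher\nDick, 2, Actor\n")
    A = read_rows(path)

print(A)
```

The result is one inner list per line of the file, header included, which replaces the repeated `readline`/`replace`/`split` steps from the question.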
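A positional, element-wise comparison of two nested lists might look like the sketch below. It is not taken from the linked answer: `diff_cells` is a hypothetical helper, the lists are the sample data from the question, and it relies on the question's statement that both files have the same dimensions (`zip` silently stops at the shorter input otherwise). Since the rows are in a different order in the two files, the data rows are sorted by the ID column first.

```python
def diff_cells(a, b):
    """Return (row, column, a_value, b_value) for every differing cell."""
    differences = []
    for row_index, (row_a, row_b) in enumerate(zip(a, b)):
        for column_index, (cell_a, cell_b) in enumerate(zip(row_a, row_b)):
            if cell_a != cell_b:
                differences.append((row_index, column_index, cell_a, cell_b))
    return differences


A = [["Name", "ID", "Profession"], ["Tom", "1", "Teacher"], ["Dick", "2", "Actor"]]
B = [["Name", "ID", "Profession"], ["Dick", "2", "Actor"], ["Tom", "1", "Police"]]

# Sort the data rows by the ID column (index 1) so that rows line up;
# the header row stays in first position.
A_sorted = A[:1] + sorted(A[1:], key=lambda row: row[1])
B_sorted = B[:1] + sorted(B[1:], key=lambda row: row[1])

print(diff_cells(A_sorted, B_sorted))  # → [(1, 2, 'Teacher', 'Police')]
```

The column index in each result can be mapped back to a name through the header row, for example `A[0][2]` is `'Profession'`.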
