python compare print differences: How do I compare two CSV files by value in Python and print the differences?

Latest recommended article published on 2023-01-06 17:33:30

weixin_39950083


I have two CSV files with the same dimensions. In the example below the dimensions are 3*3 (3 comma-separated values per row and 3 rows), but the real files could be as large as 100*10000.

File1.csv:

Name, ID, Profession

Tom, 1, Teacher

Dick, 2, Actor

File2.csv:

Name, ID, Profession

Dick, 2, Actor

Tom, 1, Police

I want to compare the files element-wise (e.g. Teacher == Police).

Ideally the rows would be matched on a primary key (ID), in case the two files are not in the same order. I would like output along these lines:

Profession of ID = 1 does not match, i.e Teacher <> Police

ID in the output above is the primary key.
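
One way to produce that kind of message is to index each file by its ID column and compare the remaining columns. The sketch below is a minimal illustration, not a definitive implementation. It assumes the header row has a column literally named `ID`, that IDs are unique within each file, and that the blank after each comma should be ignored. The function name `diff_by_key` is hypothetical.

```python
import csv


def diff_by_key(path_a, path_b, key="ID"):
    """Return one message per cell that differs between two CSV files.

    Rows are matched on the `key` column, so row order does not matter.
    Assumes `key` values are unique within each file (hypothetical helper).
    """

    def load(path):
        # skipinitialspace=True drops the blank after each comma ("Tom, 1").
        with open(path, newline="") as f:
            reader = csv.DictReader(f, skipinitialspace=True)
            return {row[key]: row for row in reader}

    rows_a, rows_b = load(path_a), load(path_b)
    messages = []
    for row_id, row_a in rows_a.items():
        row_b = rows_b.get(row_id)
        if row_b is None:
            # The key exists only in the first file.
            messages.append(f"{key} = {row_id} is missing from {path_b}")
            continue
        for column, value_a in row_a.items():
            if column != key and value_a != row_b.get(column):
                messages.append(
                    f"{column} of {key} = {row_id} does not match, "
                    f"i.e. {value_a} <> {row_b.get(column)}"
                )
    return messages
```

Calling `diff_by_key('File1.csv', 'File2.csv')` and printing each returned string gives one line per mismatching cell. Only one dictionary per file is held in memory, which should be comfortable for 100 columns by 10000 records.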

Note: the files may be very large (100 columns * 10000 records).

Below is the code I used to build the lists A and B from the two CSV files. It is very tedious, and I could only read a couple of lines with such long code.

source_file = open('File1.csv', 'r')

file_one_line_1 = source_file.readline()

file_one_line_1_str = str(file_one_line_1)

file_one_line_1_str_replace = file_one_line_1_str.replace('\n', '')

file_one_line_1_list = list(file_one_line_1_str_replace.split(','))

file_one_line_2 = source_file.readline()

file_one_line_2_str = str(file_one_line_2)

file_one_line_2_str_replace = file_one_line_2_str.replace('\n', '')

file_one_line_2_list = list(file_one_line_2_str_replace.split(','))

file_one_line_3 = source_file.readline()

file_one_line_3_str = str(file_one_line_3)

file_one_line_3_str_replace = file_one_line_3_str.replace('\n', '')

file_one_line_3_list = list(file_one_line_3_str_replace.split(','))

A = [file_one_line_1_list, file_one_line_2_list, file_one_line_3_list]

target_file = open('File2.csv', 'r')

file_two_line_1 = target_file.readline()

file_two_line_1_str = str(file_two_line_1)

file_two_line_1_str_replace = file_two_line_1_str.replace('\n', '')

file_two_line_1_list = list(file_two_line_1_str_replace.split(','))

file_two_line_2 = target_file.readline()

file_two_line_2_str = str(file_two_line_2)

file_two_line_2_str_replace = file_two_line_2_str.replace('\n', '')

file_two_line_2_list = list(file_two_line_2_str_replace.split(','))

file_two_line_3 = target_file.readline()

file_two_line_3_str = str(file_two_line_3)

file_two_line_3_str_replace = file_two_line_3_str.replace('\n', '')

file_two_line_3_list = list(file_two_line_3_str_replace.split(','))

B = [file_two_line_1_list, file_two_line_2_list, file_two_line_3_list]
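
The repeated `readline` calls above can be replaced with a loop, so the same few lines handle 3 rows or 10000. This is a minimal sketch using the standard-library `csv` module. The helper name `read_rows` is hypothetical, and stripping the whitespace around each value is an assumption based on the sample files.

```python
import csv


def read_rows(path):
    """Read a CSV file into a list of rows, each row a list of strings."""
    with open(path, newline="") as f:
        # Strip surrounding blanks so "Tom, 1" yields ["Tom", "1"].
        return [[cell.strip() for cell in row] for row in csv.reader(f)]
```

With this helper, `A = read_rows('File1.csv')` and `B = read_rows('File2.csv')` build the same nested lists, and the `with` block closes each file once it has been read.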

I then used the code below, and it works smoothly:

import csv

source_file = 'Book1.csv'
target_file = 'Book2.csv'
primary_key = 'id'

# read source and target files into nested lists
with open(source_file, 'r', newline='') as f:
    A = list(csv.reader(f))
with open(target_file, 'r', newline='') as f:
    B = list(csv.reader(f))

# get the position of the primary key column
column_names = A[0]
column_id = column_names.index(primary_key)

# get the column names without the primary key
values_name = column_names[0:column_id] + column_names[column_id + 1:]

# map each primary key value to the list of the other column values,
# skipping the header row
A_dict = {a[column_id]: a[0:column_id] + a[column_id + 1:] for a in A[1:]}
B_dict = {b[column_id]: b[0:column_id] + b[column_id + 1:] for b in B[1:]}

# iterate over the keys and the other columns and print the differences
# (assumes every key in the source file also appears in the target file)
for key in A_dict:
    for column in range(len(values_name)):
        if A_dict[key][column] != B_dict[key][column]:
            print(f"{primary_key} = {key}\t{values_name[column]}: {A_dict[key][column]} != {B_dict[key][column]}")

Thanks.

Solution

For reading a CSV file and storing its content as nested lists, see https://stackoverflow.com/a/35340988/12669658

For comparing the lists element-wise, refer to your dedicated question: https://stackoverflow.com/a/59633822/12669658
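
For files whose rows are already in the same order, an element-wise comparison of the two nested lists might look like the sketch below. It assumes both lists have the same shape and that the first row is the header. `diff_cells` is a hypothetical name and is not taken from the linked answers.

```python
def diff_cells(A, B):
    """Yield (row_number, column_name, value_a, value_b) for each mismatch.

    Assumes A and B are nested lists of equal shape whose first row is the
    header, and that rows appear in the same order in both lists.
    """
    header = A[0]
    # Start at 1 so row numbers count data rows, not the header.
    for row_number, (row_a, row_b) in enumerate(zip(A[1:], B[1:]), start=1):
        for name, value_a, value_b in zip(header, row_a, row_b):
            if value_a != value_b:
                yield row_number, name, value_a, value_b
```

Because this is a generator, mismatches are produced one at a time as the caller loops over them. If the row order may differ between the files, matching on the primary key is the safer choice.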
