python将两个excel匹配_希望通过ID将两个Excel文件合并为一个使用Python 2.7的Excel文件...

最新推荐文章于 2024-04-29 10:05:43 发布

weixin_39904809

最新推荐文章于 2024-04-29 10:05:43 发布

阅读量427

点赞数

文章标签： python将两个excel匹配

本文链接：https://blog.csdn.net/weixin_39904809/article/details/111755795

版权

I am new to the Python family and have been trying to solve merge two Excel files for days. I have researched merging endlessly and tried to adapt my code to fit my needs, but it hasn't been working. I was wondering if I could get any help of why my code isn't working. I feel that this could be a common problem to others using Python, so hopefully this will help out others as well. I appreciate any comments!

I have two excel files, 'Chinese Scores3.csv' and 'Chinese Scores4.csv' which I am trying to merge by an ID, which is unique to each company. Other than the company ID, there are no matching columns for each excel file. Also, not all companies are listed on both files. Some are listed both, but others are listed on either one or the other. I would like to attach all the information for a company ID together in one row on an excel sheet. i.e. the first excel file columns are ID, JanSales, FebSales, etc. and the second excel file columns are ID, CreditScore, EMMAScore, etc. The excel file I would like to create has columns: ID, JanSales, FebSales, CreditScore, EMMAScore

all according to company ID.

Is this making sense? It's like using VLOOKUP in excel, but I would like to do this using Python. Anyway, here is my coding, which isn't working. I try manipulating it, but it isn't working. I hope to get feedback!

import sys

import csv

def main(arg):

headers= []

for arg in 'Chinese Scores3.csv':

with open(arg) as f:

curr = 'Chinese Scores3.csv'.reader(f).next()

headers.append(curr)

try:

keys=list( set(keys) & set (curr))

except NameError:

keys = curr

header = list(keys)

for h in headers:

header += [ k for k in h if k not in keys ]

data = {}

for arg in 'Chinese Scores4.csv':

with open(arg) as f:

reader = 'Chinese Scores4.csv'.DictReader(f)

for line in reader:

data_key = tuple([ line[k] for k in keys ])

if not data_key in data: data[data_key] = {}

for k in header:

try:

data[data_key][k] = line[k]

except KeyError:

pass

for key in data.keys():

for col in header:

if key in data and not col in data[key]:

del( data[key] )

print ','.join(header)

for key in sorted(data):

row = [ data[key][col] for col in header ]

print ','.join(row)

if __name__ == '__main__':

sys.exit( main( sys.argv[1:]) )

解决方案

While we could fix your code, I'd strongly recommend looking into the pandas library if you're going to be doing this sort of work instead. It makes life a lot easier, and often borderline trivial.

For example, if we had two csv files (although we could have started straight from Excel files if we wanted):

>>> !cat scores3.csv

ID,JanSales,FebSales

1,100,200

2,200,500

3,300,400

>>> !cat scores4.csv

ID,CreditScore,EMMAScore

2,good,Watson

3,okay,Thompson

4,not-so-good,NA

We could read these into objects called DataFrames (think of them sort of like Excel sheets):

>>> import pandas as pd

>>> s3 = pd.read_csv("scores3.csv")

>>> s4 = pd.read_csv("scores4.csv")

>>> s3

ID JanSales FebSales

0 1 100 200

1 2 200 500

2 3 300 400

>>> s4

ID CreditScore EMMAScore

0 2 good Watson

1 3 okay Thompson

2 4 not-so-good NaN

And then we can merge them on the ID column:

>>> merged = s3.merge(s4, on="ID", how="outer")

>>> merged

ID JanSales FebSales CreditScore EMMAScore

0 1 100 200 NaN NaN

1 2 200 500 good Watson

2 3 300 400 okay Thompson

3 4 NaN NaN not-so-good NaN

After which we could save it to a csv file or to an Excel file:

>>> merged.to_csv("merged.csv")

>>> merged.to_excel("merged.xlsx")

weixin_39904809

关注

0
点赞
踩
1

收藏

觉得还不错? 一键收藏
0
评论
python将两个excel匹配_希望通过ID将两个Excel文件合并为一个使用Python 2.7的Excel文件...

I am new to the Python family and have been trying to solve merge two Excel files for days. I have researched merging endlessly and tried to adapt my code to fit my needs, but it hasn't been working....
复制链接

扫一扫