Python处理excel常用知识

目录

1. DataFrame一次增加多列

2. Python实现分层抽样

3. 把几个DataFrame合并成一个DataFrame

4. 保存多个DataFrame到一个excel里

5. Excel的列宽自动调整

6. 用Office自动打开excel文件 

7. 读取excel文件路径中带中文

8. 空DataFrame一行行加数据

9. 写入Excel单元格中可以多行


1. DataFrame一次增加多列

def add_subtract_series(a, b):
  return pd.Series((a + b, a - b))

df[['sum', 'difference']] = df.apply(
    lambda row: add_subtract_list(row['a'], row['b']), axis=1)

2. Python实现分层抽样

#分层抽样
gbr = data.groupby("area")

gbr.groups

typicalFracDict = {
    1: 0.2, 
    2: 0.4, 
    3: 0.6
}

def typicalSampling(group, typicalFracDict):
    name = group.name
    frac = typicalFracDict[name]
    return group.sample(frac=frac)

result = data.groupby(
    'area', group_keys=False
).apply(typicalSampling, typicalFracDict)

3. 把几个DataFrame合并成一个DataFrame

https://blog.csdn.net/qq_42707449/article/details/81116656

4. 保存多个DataFrame到一个excel里

from pandas import ExcelWriter
def save_xls(list_dfs, xls_path):
    with ExcelWriter(xls_path) as writer:
        for n, df in enumerate(list_dfs):
            df.to_excel(writer,'sheet%s' % n)
        writer.save()

5. Excel的列宽自动调整

# Given a dict of dataframes, for example:
# dfs = {'gadgets': df_gadgets, 'widgets': df_widgets}

writer = pd.ExcelWriter(filename, engine='xlsxwriter')
for sheetname, df in dfs.items():  # loop through `dict` of dataframes
    df.to_excel(writer, sheet_name=sheetname)  # send df to writer
    worksheet = writer.sheets[sheetname]  # pull worksheet object
    for idx, col in enumerate(df):  # loop through all columns
        series = df[col]
        max_len = max((
            series.astype(str).map(len).max(),  # len of largest item
            len(str(series.name))  # len of column name/header
            )) + 1  # adding a little extra space
        worksheet.set_column(idx, idx, max_len)  # set column width
writer.save()

6. 用Office自动打开excel文件 

import subprocess
#Open the workbook in MS Excel 
subprocess.Popen([filename], shell=True)

7. 读取excel文件路径中带中文

with open('文件\\ab.csv', encoding = 'utf-8') as f:
    file = pd.read_csv(f).fillna('')
    if len(file.columns) == 1:
        f.seek(0,0) #当csv以\t分隔,文件指针移回文件头
        file = pd.read_table(f).fillna()
 

8. 空DataFrame一行行加数据

for i in range(len(file)):
    row = file.irow(i)
    result = result.append({'col1':row[0], 'col2':row[1]}, ignore_index = True)

9. 写入Excel单元格中可以多行

def save_xls(list_df, list_sheet_name, xls_path):
    with ExcelWriter(xls_path, engine = 'xlsxwriter') as writer:
        workbook = writer.book
        #设置格式,可以一个单元格内通过 \r\n 换行
        book_format = workbook.add_format({'text_wrap':True})

    for i in range(len(list_df)):
        data = list_df[i]
        data.to_excel(writer, list_sheet_name[i], index = False)
        worksheet = writer.sheets[list_sheet_name[i]]

        for idx, col in enumerate(data):
            series = data[col]
            max_len = max(series.astype(str).apply(lambda x: x.split('\r\n')).map(len).max() + 15, len(str(series.name))*4 )
            worksheet.set_column(idx, idx, max_len, book_format)
    writer.save()

 

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

Anyanyamy

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值