个人笔记:python批量处理excel,实现自动提取目标信息,并输出目标文档

代码使用了pandaseasyguiopenpyxl库。它创建了一个新的Excel文件,并在其中创建了一个名为"Regions"的工作表。然后,它在新工作表中添加了一些新的列标题。

接下来,代码通过弹出对话框选择源文件,使用easygui.fileopenbox()函数实现。获取到源文件后,代码会遍历每个源文件进行处理。

代码首先加载源文件的工作簿,然后根据条件从源工作表中提取数据,并将其复制到新的工作表中。具体而言,代码会复制设备名称、描述和区域列的数据。

region_table_generate:

#import pandas as pd
import easygui  # Import easygui library for displaying file selection dialog
import openpyxl  # Import openpyxl library for Excel file handling
from datetime import datetime  # Import the datetime class from the datetime module

new_workbook = openpyxl.Workbook()  # Create a new workbook object
new_sheet = new_workbook.active  # Get the active sheet of the new workbook
new_sheet.title = 'Regions'  # Set the title of the sheet to 'Regions'

# Add new column headings for the new columns
new_sheet['E4'].value = 'Mask Must Interact'
new_sheet['F4'].value = "Mask Can't Interact"
new_sheet['G4'].value = 'Design Must Interact'
new_sheet['H4'].value = "Design Can't Interact"

#source_files = ['Table_a.xlsx', 'Table_b.xlsx']
source_files = easygui.fileopenbox(msg='Please select source file', title='Select source file', multiple=True)  # Display file selection dialog to choose source file(s)
current_datetime = datetime.now()  # Get current date and time
formatted_datetime = current_datetime.strftime('%Y-%m-%d')  # Format current date and time as string in format 'YYYY-MM-DD'
save_file = easygui.filesavebox(msg='Please select the save file path', title='Select the save file path', default=f'regions_table_{formatted_datetime}.xlsx')  # Display file save dialog to choose save file path, with default file name as 'regions_table_current_date.xlsx'

然后,代码会迭代处理源工作表中的每一行数据,根据条件值判断是否需要相互作用,并将相应的数据字符串写入新的工作表的相应位置。

最后,代码保存新的工作簿为一个Excel文件,并显示一个消息框表示完成。

 

for source_file in source_files:
    source_workbook = openpyxl.load_workbook(source_file)  # Load the workbook of the source file
    
    source_row_index = 5  # Starting row index in the source file
    mask_must_interact_column = 5  # Column index for "Mask Must Interact"
    mask_not_interact_column = 6  # Column index for "Mask Can't Interact"
    design_must_interact_column = 7  # Column index for "Design Must Interact"
    design_not_interact_column = 8  # Column index for "Design Can't Interact"
    
    if 'DP Truth' in source_workbook.sheetnames:  # If the source workbook has a sheet named 'DP Truth'
        source_sheet = source_workbook['DP Truth']  # Get the sheet object
        
        # Copy the device name, description, and region columns from the old sheet to the new sheet
        column_numbers = range(2, 5)  # Column indexes for device name, description, and region
        for column_number in column_numbers:
            column_letter = openpyxl.utils.get_column_letter(column_number)  # Convert column index to letter format
            source_column = source_sheet[column_letter]  # Get the column data from the source sheet
            for i, cell in enumerate(source_column, start=1):
                new_cell = new_sheet.cell(row=i, column=column_number)  # Get the corresponding cell in the new sheet
                new_cell.value = cell.value  # Copy the value from the source cell to the new cell
                
        # Capture the header row from the source sheet.
        header_row_number = 4  # Header row index in the source sheet
        header_row = source_sheet[header_row_number]  # Get the header row data
        
        for row in source_sheet.iter_rows(min_row=5):  # Iterate through each row of data starting from row 5
            must_interact = []  # List to store conditions that must interact
            not_interact = []  # List to store conditions that can't interact
            for column, condition_cell in enumerate(row):  # Iterate through each cell in the row
                header_cell = header_row[column]  # Get the corresponding header cell
                if condition_cell.value == '1':  # If the condition cell value is 1
                    must_interact.append(header_cell.value)  # Add the header to the must interact list
                    must_string = ", ".join(must_interact)  # Join the elements in the list as a string
                    new_sheet.cell(row=source_row_index, column=mask_must_interact_column).value = must_string  # Write the string to the corresponding cell in the new sheet
                elif condition_cell.value == '0':  # If the condition cell value is 0
                    not_interact.append(header_cell.value)  # Add the header to the can't interact list
                    not_string = ", ".join(not_interact)  # Join the elements in the list as a string
                    new_sheet.cell(row=source_row_index, column=mask_not_interact_column).value = not_string  # Write the string to the corresponding cell in the new sheet
            source_row_index += 1  # Increment the source row index
    elif 'Design Truth' in source_workbook.sheetnames:  # If the source workbook has a sheet named 'Design Truth'
        source_sheet = source_workbook['Design Truth']  # Get the sheet object
        
        # Capture the header row from the source sheet.
        header_row_number = 4  # Header row index in the source sheet
        header_row = source_sheet[header_row_number]  # Get the header row data
        
        for row in source_sheet.iter_rows(min_row=5):  # Iterate through each row of data starting from row 5
            must_interact = []  # List to store conditions that must interact
            not_interact = []  # List to store conditions that can't interact
            for column, condition_cell in enumerate(row):  # Iterate through each cell in the row
                header_cell = header_row[column]  # Get the corresponding header cell
                if condition_cell.value == '1':  # If the condition cell value is 1
                    must_interact.append(header_cell.value)  # Add the header to the must interact list
                    must_string = ", ".join(must_interact)  # Join the elements in the list as a string
                    new_sheet.cell(row=source_row_index, column=design_must_interact_column).value = must_string  # Write the string to the corresponding cell in the new sheet
                elif condition_cell.value == '0':  # If the condition cell value is 0
                    not_interact.append(header_cell.value)  # Add the header to the can't interact list
                    not_string = ", ".join(not_interact)  # Join the elements in the list as a string
                    new_sheet.cell(row=source_row_index, column=design_not_interact_column).value = not_string  # Write the string to the corresponding cell in the new sheet
            source_row_index += 1  # Increment the source row index
            
#new_workbook.save('regions table.xlsx')
new_workbook.save(save_file)  # Save the new workbook as an Excel file, the path is determined by the user-selected save file path
easygui.msgbox('finished!')  # Display a messagebox indicating completion

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值