python 查找excel内容所在的单元格_使用Python查找Excel单元格引用

1586010002-jmsa.png

Context: I am writing a program which can pull values from a PDF and put them in the appropriate cell in an Excel file.

Question: I want to write a function which takes a column value (e.g. 2014) and a row value (e.g. 'COGS') as arguments and return the cell reference where those two intersect (e.g. 'C3' for 2014 COGS).

def find_correct_cell(year=2014, item='COGS'):

#do something similar to what the =match function in Excel does

return cell_reference #returns 'C3'

col_num = '=match(2014, A1:E1)'

row_num = '=match("COGS", A1:A5)'

But I want to grab those values without having to arbitrarily write to those random empty cells. Plus, even with this method, when I read those cells (F5 and F6) it reads the formulae in those cells and not the face value of 3.

Any help is appreciated, thanks.

解决方案

There are a surprising number of details you need to get right to manipulate Excel files this way with openpyxl. First, it's worth knowing that the xlsx file contains two representations of each cell - the formula, and the current value of the formula. openpyxl can return either, and if you want values you should specify data_only=True when you open the file. Also, openpyxl is not able to calculate a new value when you change the formula for a cell - only Excel itself can do that. So inserting a MATCH() worksheet function won't solve your problem.

The code below does what you want, mostly in Python. It uses the "A1" reference style, and does some calculations to turn column numbers into column letters. This won't hold up well if you go past column Z. In that case, you may want to switch to numbered references to rows and columns. There's some more info on that here and here. But hopefully this will get you on your way.

Note: This code assumes you are reading a workbook called 'test.xlsx', and that 'COGS' is in a list of items in 'Sheet1!A2:A5' and 2014 is in a list of years in 'Sheet1!B1:E1'.

import openpyxl

def get_xlsx_region(xlsx_file, sheet, region):

""" Return a rectangular region from the specified file.

The data are returned as a list of rows, where each row contains a list

of cell values"""

# 'data_only=True' tells openpyxl to return values instead of formulas

# 'read_only=True' makes openpyxl much faster (fast enough that it

# doesn't hurt to open the file once for each region).

wb = openpyxl.load_workbook(xlsx_file, data_only=True, read_only=True)

reg = wb[sheet][region]

return [[cell.value for cell in row] for row in reg]

# cache the lists of years and items

# get the first (only) row of the 'B1:F1' region

years = get_xlsx_region('test.xlsx', 'Sheet1', 'B1:E1')[0]

# get the first (only) column of the 'A2:A6' region

items = [r[0] for r in get_xlsx_region('test.xlsx', 'Sheet1', 'A2:A5')]

def find_correct_cell(year, item):

# find the indexes for 'COGS' and 2014

year_col = chr(ord('B') + years.index(year)) # only works in A:Z range

item_row = 2 + items.index(item)

cell_reference = year_col + str(item_row)

return cell_reference

print find_correct_cell(year=2014, item='COGS')

# C3

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值