准备工作
首先按照如下代码,创建一个excel命名为test.xlsx
对于workbook以及worksheet的理解和相关操作,请见这篇笔记:【openpyxl】记录01 - Workbook、Worksheet的创建与相关操作
from openpyxl import Workbook
wb = Workbook()
ws1 = wb.create_sheet("Mysheet1")
# or
ws2 = wb.create_sheet("Mysheet2", 0)
# or
ws3 = wb.create_sheet("Mysheet3", -1)
ws = wb.active
一、访问一个单元格(cell)
1. 利用excel的坐标来访问单元格
通过c = ws['A2']
可以返回A2处的单元格,如下面的代码所示,当前A2位置的值为空
用Cell.value
可以查看当前cell的值
c = ws['A2'] # 将返回A2处的单元格
print(c)
print(c.value)
'''
<Cell 'Mysheet2'.A2>
None
'''
我们可以给单元格直接赋值
ws['A2'] = 4
c = ws['A2']
print(c.value) # 4
注意:当创建一个worksheet时,它不包含仍和cell。这些cell是在首次被访问时创建的。
2. 利用行列表示法来访问单元格
# 访问
d = ws.cell(row=3, column=2)
print(d.value) # None
# 赋值方法1
d.value = 10
print(d.value) # 10
# 赋值方法2
d = ws.cell(row=3, column=2, value=12)
print(d.value) # 12
二、访问多个单元格
1. 利用 切片 确定访问单元格的范围
# 利用切片确定单元格的访问范围
cell_range = ws['A1':'C2'] # A1单元格-C2单元格
for cell in cell_range:
print(cell)
'''
(<Cell 'Mysheet2'.A1>, <Cell 'Mysheet2'.B1>, <Cell 'Mysheet2'.C1>)
(<Cell 'Mysheet2'.A2>, <Cell 'Mysheet2'.B2>, <Cell 'Mysheet2'.C2>)
'''
#这里我们需要注意,由于cell是在首次被访问时创建的,所以目前worksheet ws里面只创建了2行3列,因此下面例子里输出了第C列,第C列里面有两个元素,即两行
# 利用切片确定列范围
col_range = ws['C:E'] # 第C列-第D列
for cell in col_range:
print(cell)
'''
(<Cell 'Mysheet2'.C1>, <Cell 'Mysheet2'.C2>)
(<Cell 'Mysheet2'.D1>, <Cell 'Mysheet2'.D2>)
(<Cell 'Mysheet2'.E1>, <Cell 'Mysheet2'.E2>)
'''
# 利用切片确定行范围
row_range = ws[5:10] # 第5行-第10行
for cell in row_range:
print(cell)
'''
(<Cell 'Mysheet2'.A5>, <Cell 'Mysheet2'.B5>, <Cell 'Mysheet2'.C5>, <Cell 'Mysheet2'.D5>)
(<Cell 'Mysheet2'.A6>, <Cell 'Mysheet2'.B6>, <Cell 'Mysheet2'.C6>, <Cell 'Mysheet2'.D6>)
(<Cell 'Mysheet2'.A7>, <Cell 'Mysheet2'.B7>, <Cell 'Mysheet2'.C7>, <Cell 'Mysheet2'.D7>)
(<Cell 'Mysheet2'.A8>, <Cell 'Mysheet2'.B8>, <Cell 'Mysheet2'.C8>, <Cell 'Mysheet2'.D8>)
(<Cell 'Mysheet2'.A9>, <Cell 'Mysheet2'.B9>, <Cell 'Mysheet2'.C9>, <Cell 'Mysheet2'.D9>)
(<Cell 'Mysheet2'.A10>, <Cell 'Mysheet2'.B10>, <Cell 'Mysheet2'.C10>, <Cell 'Mysheet2'.D10>)
'''
2. 利用行列表示法确定访问单元格的范围
Worksheet.iter_rows(min_row, max_col, max_row)
:返回行Worksheet.iter_cols(min_row, max_col, max_row)
:返回列
以上连个方法都是确定最小和最大行序号,以及确定列数
# Worksheet.iter_rows(min_row, max_col, max_row)
for row in ws.iter_rows(min_row=1, max_col=3, max_row=2): # 1-2行,共3列(A1-C2)
for cell in row: # 按行遍历
print(cell)
'''
<Cell 'Mysheet2'.A1>
<Cell 'Mysheet2'.B1>
<Cell 'Mysheet2'.C1>
<Cell 'Mysheet2'.A2>
<Cell 'Mysheet2'.B2>
<Cell 'Mysheet2'.C2>
'''
# Worksheet.iter_cols(min_row, max_col, max_row)
for col in ws.iter_cols(min_row=1, max_col=3, max_row=2): # 1-2行,共3列(A1-C2)
for cell in col: # 按列遍历
print(cell)
'''
<Cell 'Mysheet2'.A1>
<Cell 'Mysheet2'.A2>
<Cell 'Mysheet2'.B1>
<Cell 'Mysheet2'.B2>
<Cell 'Mysheet2'.C1>
<Cell 'Mysheet2'.C2>
'''
3. 循环访问文件的所有行或列
Worksheet.rows
循环访问文件的所有行Worksheet.columns
循环访问文件的所有列
ws.rows
# <generator object Worksheet._cells_by_row at 0x0000017E6D419A80>
print(tuple(ws.rows)) # 只能用元组输出
'''
((<Cell 'Mysheet2'.A1>, <Cell 'Mysheet2'.B1>, <Cell 'Mysheet2'.C1>),
(<Cell 'Mysheet2'.A2>, <Cell 'Mysheet2'.B2>, <Cell 'Mysheet2'.C2>))
'''
print(tuple(ws.columns))
'''
((<Cell 'Mysheet2'.A1>, <Cell 'Mysheet2'.A2>),
(<Cell 'Mysheet2'.B1>, <Cell 'Mysheet2'.B2>),
(<Cell 'Mysheet2'.C1>, <Cell 'Mysheet2'.C2>))
'''
三、仅访问工作表的值
1. Worksheet.values
属性
循环遍历工作表中的所有行,并且只返回单元格的值
for row in ws.values:
for value in row:
print(value)
'''
None
None
None
None
None
None
'''
2. 行列表示法访问工作表的值
Worksheet.iter_rows()
Worksheet.iter_cols()
以上两个函数的参数values_only=True则只返回单元格的值
for row in ws.iter_rows(min_row=1, max_col=3, max_row=2, values_only=True):
print(row)
'''
(None, None, None)
(None, None, None)
'''