Problems encountered when reading images from Excel cells with SheetImageLoader from openpyxl_image_loader

H_Z_01

Last modified 2024-06-12 14:46:59


Tags: excel python

First published 2024-06-03 15:47:09

Article link: https://blog.csdn.net/weixin_46783197/article/details/139320666


1. If the sheet contains an image in a column after Z, loading fails with a "string index out of range" error.

First, look at the library's source code:

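# In your own code the class is imported with:
#   from openpyxl_image_loader import SheetImageLoader
# Excerpt of the library source (it relies on the string module):
import string
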
class SheetImageLoader:
    """Loads all images in a sheet"""
    _images = {}

    def __init__(self, sheet):
        """Loads all sheet images"""
        sheet_images = sheet._images
        for image in sheet_images:
            row = image.anchor._from.row + 1
            col = string.ascii_uppercase[image.anchor._from.col]
            self._images[f'{col}{row}'] = image._data

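The loader collects every image on the sheet and iterates over them. For each image, anchor._from.row gives the row and anchor._from.col gives the column. Both values are ints and both are zero-based.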

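The source adds 1 to row so that rows start at 1, as they do in Excel. col needs no adjustment because string.ascii_uppercase is the string of the 26 uppercase ASCII letters A-Z, and string indexing is also zero-based, so the letter A sits exactly at index 0.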

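This yields the image's row number and the Excel column letter for its column. The column letter and the row number are then concatenated to form the key under which the image is stored in _images, the dictionary declared at class level.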

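This explains why the original code can only extract images in columns A-Z. string.ascii_uppercase holds just 26 letters, so when an image lies beyond column Z its column index exceeds 25 (zero-based), and indexing the string with that value raises "string index out of range".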
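
The limit can be reproduced without Excel at all. The snippet below is a minimal sketch using only the standard library; the helper name `letter_for` is made up for illustration and simply repeats the lookup used in the library source.

```python
import string


def letter_for(col_index):
    """Repeat the original lookup: zero-based column index -> one letter."""
    return string.ascii_uppercase[col_index]


print(letter_for(0))   # first column
print(letter_for(25))  # last column the lookup can handle

try:
    letter_for(26)     # would be column AA in Excel
except IndexError as exc:
    print(f"IndexError: {exc}")
```

Index 26 is the first value that has no single-letter name, which is why any image anchored at column AA or later breaks the loader.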

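So to extract images beyond column Z, the way col is computed has to change. My approach is to take the image's column index and convert it to the corresponding Excel column letters, which in principle removes the column limit. The code: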

from openpyxl.utils import get_column_letter
from openpyxl_image_loader import SheetImageLoader


class SheetImageLoader_xw(SheetImageLoader):
    """Subclass of SheetImageLoader that overrides __init__ so images in any column can be read."""

    def __init__(self, sheet):
        """Loads all sheet images"""
        sheet_images = sheet._images
        for image in sheet_images:
            row = image.anchor._from.row + 1
            # Column letter of the image; get_column_letter expects a 1-based index
            col = get_column_letter(image.anchor._from.col + 1)
            # Original lookup, limited to the 26 uppercase ASCII letters:
            # col = string.ascii_uppercase[image.anchor._from.col]
            self._images[f'{col}{row}'] = image._data
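
To make the conversion itself concrete, here is a rough pure-Python sketch of how a zero-based column index maps to Excel column letters. The function `column_letter` is a hypothetical helper written for this illustration; it is not openpyxl's implementation, and in real code `get_column_letter` should be preferred.

```python
def column_letter(col_index):
    """Convert a zero-based column index to an Excel column name."""
    if col_index < 0:
        raise ValueError("column index must be non-negative")
    n = col_index + 1  # switch to a 1-based number, as get_column_letter expects
    letters = []
    while n > 0:
        # Excel column names are a base-26 system without a zero digit,
        # hence the "n - 1" before each division.
        n, remainder = divmod(n - 1, 26)
        letters.append(chr(ord("A") + remainder))
    return "".join(reversed(letters))


for index in (0, 25, 26, 27, 701, 702):
    print(index, column_letter(index))
```

The `+ 1` in the subclass above plays the same role as the first step here: the anchor's column index starts at 0, while the column number passed to `get_column_letter` starts at 1.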

2. Calling the class's .get() method for an image works only once; a second call for the same image fails with an error.

The workaround is to reload the Excel file and extract the images again.
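
If reloading the workbook is too costly, another option may be to read each image's bytes once and keep them. The sketch below is only an illustration under an assumption this post does not verify: that the second call fails because the image's underlying data can be consumed only once. `OneShotImage` and `CachedImageStore` are hypothetical stand-ins, not part of openpyxl or openpyxl_image_loader.

```python
import io


class OneShotImage:
    """Hypothetical stand-in for an image whose data can be read only once."""

    def __init__(self, payload):
        self._stream = io.BytesIO(payload)

    def _data(self):
        data = self._stream.read()  # raises ValueError once the stream is closed
        self._stream.close()
        return data


class CachedImageStore:
    """Reads each image's bytes a single time and serves them from a cache."""

    def __init__(self, readers):
        # readers: dict mapping a cell name such as "A1" to a zero-argument callable
        self._readers = dict(readers)
        self._cache = {}

    def get_bytes(self, cell):
        if cell not in self._cache:
            self._cache[cell] = self._readers[cell]()
        return self._cache[cell]


image = OneShotImage(b"fake image bytes")
store = CachedImageStore({"A1": image._data})
first = store.get_bytes("A1")
second = store.get_bytes("A1")  # served from the cache, no second read
print(first == second)
```

With cached bytes, a fresh `io.BytesIO(...)` can be created for every consumer, so repeated access no longer depends on the original stream.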

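3. When SheetImageLoader() is used several times, any cell that a later load does not overwrite keeps the data from the earlier load. The reason is that the loader does not clear its cache of already loaded images after it finishes reading a sheet.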

The fix is to reset the cache with _images.clear() every time you need to read again.

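from openpyxl import load_workbook
from openpyxl_image_loader import SheetImageLoader

# path is a placeholder for the location of the .xlsx file
wbb = load_workbook(path)
wss = wbb[wbb.sheetnames[0]]
image_loader = SheetImageLoader_xw(wss)
# ... work with image_loader here, e.g. image_loader.get('A1') ...
# When finished, and before loading the next sheet, clear the shared cache.
# Clearing right after loading would discard the images just read.
image_loader._images.clear()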
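
The stale entries come from `_images` being declared on the class rather than on each instance, as the source excerpt in section 1 shows. The following standard-library sketch reproduces the effect with two made-up classes, `SharedLoader` and `IsolatedLoader`; the second shows one possible alternative to calling clear(), assuming you are free to change the subclass.

```python
class SharedLoader:
    """Mimics the library: the dict lives on the class, so instances share it."""
    _images = {}

    def __init__(self, cells):
        for cell in cells:
            self._images[cell] = f"data for {cell}"


class IsolatedLoader:
    """Variant that gives every instance its own dict."""

    def __init__(self, cells):
        self._images = {}
        for cell in cells:
            self._images[cell] = f"data for {cell}"


SharedLoader(["A1", "B2"])
second_shared = SharedLoader(["A1"])
print(sorted(second_shared._images))    # B2 from the first load is still present

IsolatedLoader(["A1", "B2"])
second_isolated = IsolatedLoader(["A1"])
print(sorted(second_isolated._images))  # only the cells of this load
```

Applied to SheetImageLoader_xw, the same idea would mean assigning `self._images = {}` at the start of `__init__`, so that each loader starts empty and no explicit clear() is needed.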