RuntimeError: stack expects each tensor to be equal size, but got [3, 640, 1064] at entry 0 and [3,

樊素与阿罗778

Tags: python pytorch deep learning

Article link: https://blog.csdn.net/qq_62201633/article/details/142209248

I ran into the error in the title while reproducing DBNet.

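The problem comes from the torch.stack call: torch.stack requires every tensor in the list to have exactly the same shape, and the images collected into the batch had different sizes.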
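A minimal sketch that reproduces the failure outside the training code may help. The second image size below is made up for illustration (the error message in the title is truncated, so the real second shape is unknown).

```python
import torch

# Two hypothetical images: same channel count, different width
a = torch.zeros(3, 640, 1064)
b = torch.zeros(3, 640, 960)  # assumed size, for illustration only

try:
    torch.stack([a, b], 0)
    stack_error = None
except RuntimeError as e:
    # torch.stack refuses to combine tensors of different shapes
    stack_error = str(e)

print(stack_error)

# Stacking works once the shapes agree
same = torch.stack([a, a.clone()], 0)
print(same.shape)
```

Any fix therefore has to make the per-sample shapes identical before the batch is stacked.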

At this point I was training on my own dataset. There are two ways to fix it:

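Unify the size: resize all images to the same size during data preprocessing, or simply convert the whole dataset to one size beforehand.
If you want to keep the original image sizes and still batch images of different sizes, you can add dynamic padding logic. This usually means finding the largest height and width in the batch and padding every image up to that size. The change usually goes in DB\data_loader\__init__.py: find ICDARCollectFN and modify it as follows.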

import numpy as np
import PIL.Image
import torch
from torchvision import transforms
from torchvision.transforms.functional import pad

class ICDARCollectFN:
    def __init__(self, *args, **kwargs):
        pass

    def __call__(self, batch):
        data_dict = {}
        to_tensor_keys = []

        # Collect the samples and track the largest height and width seen
        max_shape = [0, 0]
        for sample in batch:
            for k, v in sample.items():
                if k not in data_dict:
                    data_dict[k] = []
                if isinstance(v, (np.ndarray, torch.Tensor, PIL.Image.Image)):
                    if k not in to_tensor_keys:
                        to_tensor_keys.append(k)
                    if isinstance(v, PIL.Image.Image):
                        v = transforms.ToTensor()(v)  # PIL Image -> Tensor (C, H, W)
                    elif isinstance(v, np.ndarray):
                        # ndarray has no .dim()/.size(i), and torch.stack needs tensors
                        v = torch.from_numpy(v)
                    if v.dim() == 3:
                        max_shape[0] = max(max_shape[0], v.size(1))  # height
                        max_shape[1] = max(max_shape[1], v.size(2))  # width
                data_dict[k].append(v)

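        # Pad the images to the largest size, then stack
        for k in to_tensor_keys:
            if k == 'img':  # assumes 'img' holds the image data
                padded_tensors = []
                for img in data_dict[k]:
                    # torchvision's pad takes (left, top, right, bottom);
                    # pad only on the right and bottom so coordinates stay valid
                    padding = (0, 0, max_shape[1] - img.size(2), max_shape[0] - img.size(1))
                    padded_img = pad(img, padding, fill=0, padding_mode='constant')
                    padded_tensors.append(padded_img)
                data_dict[k] = torch.stack(padded_tensors, 0)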
            else:
                data_dict[k] = torch.stack(data_dict[k], 0)

        return data_dict
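The padding step is easy to get wrong because the two common pad functions order their arguments differently: torchvision.transforms.functional.pad takes (left, top, right, bottom), while torch.nn.functional.pad takes (left, right, top, bottom) for the last two dimensions. Below is a small standalone sketch of the same pad-then-stack idea using only torch. The helper name pad_and_stack is hypothetical and not part of the DBNet code.

```python
import torch
import torch.nn.functional as F


def pad_and_stack(images):
    """Zero-pad (C, H, W) tensors on the right/bottom to a common size, then stack."""
    max_h = max(img.size(1) for img in images)
    max_w = max(img.size(2) for img in images)
    padded = []
    for img in images:
        # F.pad order for the last two dims: (left, right, top, bottom)
        pad_spec = (0, max_w - img.size(2), 0, max_h - img.size(1))
        padded.append(F.pad(img, pad_spec, mode='constant', value=0))
    return torch.stack(padded, 0)


# Two toy "images" of different sizes
batch = pad_and_stack([torch.ones(3, 4, 6), torch.ones(3, 5, 2)])
print(batch.shape)
```

Checking the output shape on a toy batch like this is a quick way to confirm that the height and width padding have not been swapped.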

After that I also switched the dataset config file to DB\config\icdar2015_resnet18_FPN_DBhead_polyLR.yaml, and the algorithm ran successfully!
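For the first option (making every image the same size), a rough sketch is shown here. The function name resize_to and the 640 x 640 target are assumptions chosen for illustration, not values from the DBNet configs. Note that resizing changes the image scale, so any annotation coordinates would need to be scaled by the same factors, which the function returns.

```python
import torch
import torch.nn.functional as F


def resize_to(img, size=(640, 640)):
    """Resize a float (C, H, W) tensor to `size`; also return (scale_y, scale_x)."""
    _, h, w = img.shape
    # interpolate expects a batch dimension, so add and remove one
    out = F.interpolate(img.unsqueeze(0), size=size, mode='bilinear',
                        align_corners=False).squeeze(0)
    return out, (size[0] / h, size[1] / w)


img = torch.rand(3, 640, 1064)
resized, (sy, sx) = resize_to(img)
print(resized.shape, sy, sx)
```

With every sample resized this way, the default torch.stack in the collate function no longer sees mismatched shapes.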