[Running Experiments 07] RuntimeError: Argument #6: Padding size should be less than the corresponding input dimension

旅途中的宽~

Category: Running Experiments · Tags: deep learning

Article link: https://blog.csdn.net/wzk4869/article/details/131857028

While running an experiment recently, part of our code looked like this:

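import torch
import torchvision.transforms as T
from PIL import Image
from sklearn.decomposition import PCA

patch_h = 28
patch_w = 28
feat_dim = 768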

transform = T.Compose([
    T.GaussianBlur(9, sigma=(0.1, 2.0)),
    T.Resize((patch_h * 14, patch_w * 14)),
    T.CenterCrop((patch_h * 14, patch_w * 14)),
    T.ToTensor(),
    T.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])

dinov2_vitb14 = torch.hub.load('', 'dinov2_vitb14',source='local').cuda()

features = torch.zeros(4, patch_h * patch_w, feat_dim)
imgs_tensor = torch.zeros(4, 3, patch_h * 14, patch_w * 14).cuda()

img_path = f'/home/wangzhenkuan/val_cropped/cropped_(25, 140, 39, 143)_obj365_val_000000685822.jpg'
img = Image.open(img_path).convert('RGB')
imgs_tensor[0] = transform(img)[:3]
with torch.no_grad():
    features_dict = dinov2_vitb14.forward_features(imgs_tensor)
    features = features_dict['x_norm_patchtokens']

features = features.reshape(4 * patch_h * patch_w, feat_dim).cpu()
pca = PCA(n_components=3)
pca.fit(features)
pca_features = pca.transform(features)
pca_features[:, 0] = (pca_features[:, 0] - pca_features[:, 0].min()) / (pca_features[:, 0].max() - pca_features[:, 0].min())
new_pca_features = pca_features.flatten()
print(new_pca_features, new_pca_features.shape)

It failed with the following error:

RuntimeError: Argument #6: Padding size should be less than the corresponding input dimension, but got: padding (4, 4) at dimension 2 of input 4

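Judging from the code and the error message, the failure comes from the transform pipeline, and more precisely from T.GaussianBlur(9, ...). It is the first transform, so it runs on the raw image before T.Resize(). A Gaussian blur with kernel size 9 pads each side of the image by 9 // 2 = 4 pixels using reflect padding, and reflect padding requires the padding to be smaller than the dimension being padded.

The input image here is only 14x3 (width 14, height 3). A padding of (4, 4) is fine for the width, but it is not smaller than the height of 3, which is dimension 2 in the (N, C, H, W) layout that the error message refers to.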
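
The constraint can be checked by hand. The sketch below is plain Python and does not call torchvision. The helper name `reflect_padding_ok` is hypothetical, and it assumes that T.GaussianBlur pads each side by kernel_size // 2 with reflect padding, which is consistent with the padding (4, 4) reported in the error message.

```python
def reflect_padding_ok(width, height, kernel_size):
    """Return (ok, padding) for a square blur kernel on a width x height image.

    Assumes reflect padding of kernel_size // 2 on each side, which is valid
    only when the padding is strictly smaller than the dimension it pads.
    """
    padding = kernel_size // 2
    ok = padding < width and padding < height
    return ok, padding


# The cropped image from this post: width 14, height 3, blur kernel size 9.
print(reflect_padding_ok(14, 3, 9))    # (False, 4): padding 4 is not smaller than height 3
# The same image after resizing to 14 x 14.
print(reflect_padding_ok(14, 14, 9))   # (True, 4)
```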

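To fix this, I need to make sure that every side of the image is larger than the blur padding (more than 4 pixels for a kernel size of 9) before T.GaussianBlur() runs. There are two ways to do that: enlarge the input image before it enters the pipeline, or reorder the pipeline so that T.Resize() comes before T.GaussianBlur().

Either way, the target size of T.Resize() is not the problem. Upscaling to (patch_h * 14, patch_w * 14) = (392, 392) works for an input of any size; the constraint applies only to the image that reaches the blur.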
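
As a rough sketch of the first option, the helper below computes the smallest size an image would need before the blur. It is plain Python; `min_safe_size` is a hypothetical name, and it assumes the blur uses reflect padding of kernel_size // 2 per side, so each side must be at least kernel_size // 2 + 1 pixels.

```python
def min_safe_size(width, height, kernel_size):
    """Return a (width, height) whose sides are all larger than the blur padding.

    Assumes reflect padding of kernel_size // 2 per side. Sides that are
    already large enough are left unchanged.
    """
    min_side = kernel_size // 2 + 1
    return max(width, min_side), max(height, min_side)


# The 14 x 3 crop with a kernel size of 9 needs a height of at least 5.
new_size = min_safe_size(14, 3, 9)
print(new_size)  # (14, 5)
```

The result could be passed to PIL's Image.resize() before the transform runs. Any larger size also satisfies the bound, and enlarging only one side changes the aspect ratio of the crop.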

Let's check the size of this image:

from PIL import Image

image_path = "/home/wangzhenkuan/val_cropped/cropped_(25, 140, 39, 143)_obj365_val_000000685822.jpg"
img = Image.open(image_path)
width, height = img.size
print(f"Image size: width = {width}px, height = {height}px")

This prints:

Image size: width = 14px, height = 3px

I tried resizing the image to 14 x 14 right after loading it:

img = Image.open(img_path).convert('RGB').resize((14, 14))

With this change, the code no longer raises the error. The output is:

[ 7.44867728e-01  2.34980489e+00 -2.27559823e-02 ...  3.59724475e-01
  9.42175007e+00  2.56441818e+01] (9408,)
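
The length of 9408 follows from the sizes used in the script. Here is a small arithmetic sketch that uses only constants from the code above (a batch of 4 images, 28 x 28 patches per image, 3 PCA components); the variable names are mine.

```python
patch_h, patch_w = 28, 28   # patches per side, as in the script
patch_size = 14             # pixels per patch, the factor 14 used in T.Resize
batch = 4                   # imgs_tensor holds 4 images
n_components = 3            # PCA(n_components=3)

input_side = patch_h * patch_size       # pixels per side after T.Resize
n_tokens = batch * patch_h * patch_w    # patch tokens across the whole batch
flat_len = n_tokens * n_components      # values left after flatten()

print(input_side, n_tokens, flat_len)  # 392 3136 9408
```

Only imgs_tensor[0] holds a real image in the script. The other three slots stay all zeros, but they still contribute patch tokens to the PCA.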
