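- A big pitfall: when I displayed a three-channel image, it came out looking as if it had been binarized…
Here is the processing code that gave me the problem: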
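rgb_img = cv2.imread(self.rgb_ids[index])  # uint8, H x W x C
rgb_img = torch.from_numpy(rgb_img).permute(2, 0, 1).float()  # float32, C x H x W, values still 0-255
...
s = rgb_img[0].permute(1, 2, 0).numpy()  # back to H x W x C, but dtype is still float32
cv2.imshow("tmd", s)  # float input is scaled by 255 here, so the image looks binarized
cv2.waitKey(25)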
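- The problem lies in OpenCV's display function!
Here are the image types cv2.imshow supports, quoted from its documentation. I think most readers will see the cause right away.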
  - If the image is 8-bit unsigned, it is displayed as is.
  - If the image is 16-bit unsigned or 32-bit integer, the pixels are divided by 256. That is, the value range [0,255*256] is mapped to [0,255].
  - If the image is 32-bit or 64-bit floating-point, the pixel values are multiplied by 255. That is, the value range [0,1] is mapped to [0,255].
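Let me venture an explanation. On input, we convert the image to float because network training needs it, but the display function does not play along: the input now falls under the third case, 32-bit or 64-bit floating-point.
These floats are multiplied by 255, i.e. [0, 1] is mapped to [0, 255]. I find that description not rigorous enough: what if my input is greater than 1 (which is the usual case)? The documentation does not say what the values become after processing. In practice the scaled values appear to be clamped at 255, so with data still in the 0-255 range nearly every non-zero channel saturates, which is why the picture looks binarized.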
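To make the effect concrete, here is a small NumPy-only sketch of the scaling described above. It assumes the display path multiplies by 255 and then clamps to [0, 255]; `emulate_float_display` is a hypothetical helper written for illustration, not an OpenCV function, and the real conversion inside imshow may differ in rounding details.

```python
import numpy as np

def emulate_float_display(img):
    """Approximate the float case: scale by 255, then clamp to [0, 255]."""
    scaled = np.asarray(img, dtype=np.float64) * 255.0
    return np.clip(np.rint(scaled), 0, 255).astype(np.uint8)

# A float image whose values are still in the 0-255 range, as after .float().
pixels = np.array([[0.0, 1.0, 37.0, 128.0, 255.0]], dtype=np.float32)
shown = emulate_float_display(pixels)
print(shown)  # every non-zero input ends up at 255
```

Under this assumption only exact zeros stay dark, so each channel is effectively either off or fully on.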
Solution:
Before displaying, convert the image to "uint8" with numpy.
In one line of code:
a = np.array(a, dtype='uint8')
where a is the input tensor.
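As a hedged sketch, there are two ways to get a float image into a form imshow displays correctly: cast to uint8 (the fix above), or keep float and rescale into [0, 1]. The helper names below are hypothetical. The clip before the cast is an added precaution, since casting out-of-range floats to uint8 in NumPy does not saturate. A CPU torch tensor can be passed in after calling `.numpy()` on it.

```python
import numpy as np

def to_uint8_for_display(img):
    """Clip to [0, 255] and cast to uint8 so the values are displayed as is."""
    arr = np.asarray(img)
    return np.clip(arr, 0, 255).astype(np.uint8)

def to_unit_float_for_display(img):
    """Alternative: stay in float, but rescale 0-255 values into [0, 1]."""
    return np.asarray(img, dtype=np.float32) / 255.0

# One row of three pixels' worth of values in the 0-255 range, stored as float32.
float_img = np.array([[[0.0, 127.6, 255.0]]], dtype=np.float32)
as_uint8 = to_uint8_for_display(float_img)
as_unit = to_unit_float_for_display(float_img)
print(as_uint8.dtype, as_uint8.tolist())
print(as_unit.dtype, float(as_unit.min()), float(as_unit.max()))
```

Note that the uint8 cast truncates fractional values (127.6 becomes 127), which is harmless for display. The float variant is only appropriate when the data really spans 0-255; normalized network inputs would need their own inverse transform first.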
//codingtime
Appendix
Here is a complete minimal example:
import cv2
import torch
import numpy as np
a = cv2.imread("img_set/bg.jpg")
a = cv2.resize(a, (640, 430))
print("before type: ", type(a[0][0][0]))
cv2.imshow("before", a)
a = torch.from_numpy(a).float()
# a = a.numpy()  # use this line instead to reproduce the wrong, washed-out white display
a = np.array(a, dtype='uint8')
print("after type: ", type(a[0][0][0]))
# cv2.rectangle(a, (0, 0), (100, 100), (0, 0, 200), 1)
cv2.imshow("after", a)
# print("mean: ", a.mean())
cv2.waitKey(0)
# 2020.07.10
References and acknowledgements: