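Notes on the image-tiling code I use for semantic segmentation, and on dilated prediction (膨胀预测).
The motivation: during the ZTE (中兴) competition the original images were too large to fit in GPU memory, so I had no choice but to crop each image into tiles. To keep the overhead down, I also adopted the dilated prediction strategy.
First, split the image into patches according to a stride.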
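# Code for splitting an image into patches
# The input is a torch tensor (on the CPU) of shape (C, W, H)
import numpy as np

def Im2Patch(img, win, stride=1):
    k = 0
    endc = img.shape[0]
    endw = img.shape[1]
    endh = img.shape[2]
    # Top-left corner of every win x win window, taken every `stride` pixels
    patch = img[:, 0:endw - win + 0 + 1:stride, 0:endh - win + 0 + 1:stride]
    TotalPatNum = patch.shape[1] * patch.shape[2]
    Y = np.zeros([endc, win * win, TotalPatNum], np.float32)
    # For each offset (i, j) inside the window, gather that pixel from all patches at once
    for i in range(win):
        for j in range(win):
            patch = img[:, i:endw - win + i + 1:stride, j:endh - win + j + 1:stride]
            Y[:, k, :] = np.array(patch[:]).reshape(endc, TotalPatNum)
            k = k + 1
    # Result has shape (C, win, win, TotalPatNum)
    return Y.reshape([endc, win, win, TotalPatNum])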
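The slice `0:endw - win + 1:stride` above determines how many patches come out along each axis. As a quick sanity check, here is a minimal sketch in plain Python (no torch or NumPy needed); `patch_starts` and `patch_count` are hypothetical helper names introduced only for this illustration.

```python
def patch_starts(length, win, stride=1):
    """Start offsets of every full window of size `win` along one axis."""
    if win > length:
        return []
    # Same range as the slice 0 : length - win + 1 : stride used in Im2Patch.
    return list(range(0, length - win + 1, stride))


def patch_count(length, win, stride=1):
    """Closed-form number of full windows along one axis."""
    if win > length:
        return 0
    return (length - win) // stride + 1


starts = patch_starts(10, win=4, stride=3)
print(starts)                  # [0, 3, 6]
print(patch_count(10, 4, 3))   # 3
```

For a 2D image, the total number of patches (`TotalPatNum` in `Im2Patch`) is the product of the counts along the two spatial axes.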
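Tiled prediction: dilated prediction
The large image is split into tiles that are predicted one at a time. Predictions near the edge of a tile tend to be poor because the model sees less context there, so only the central part of each tile's prediction is kept, and these central parts are stitched together into a prediction for the original image.
For the detailed idea, please see the article referenced below.
Reference: 语义分割之膨胀预测 (Dilated prediction for semantic segmentation), 输出是为了学习的博客, CSDN blog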
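Before looking at the full function, it can help to see the tile bookkeeping along a single axis. The sketch below is plain Python and only computes indices; `tile_plan` is a hypothetical helper name, and the use of `(-length) % stride` for the fill (so that an exact multiple of the stride needs no extra tile) is a choice made for this sketch.

```python
def tile_plan(length, crop_size=256, stride=256):
    """Plan dilated-prediction tiles along one axis of `length` pixels.

    Returns (fill, pad, tiles), where each tile is a tuple
    (read_start, read_end, write_start, write_end):
      * read_*  index the padded axis (fill on the far side, margin on both sides)
      * write_* index the output axis before it is cropped back to `length`
    """
    if crop_size < stride:
        raise ValueError("crop_size must be at least stride")
    fill = (-length) % stride          # makes the length a multiple of stride
    pad = (crop_size - stride) // 2    # margin discarded on each side of a tile
    n_tiles = (length + fill) // stride
    tiles = []
    for i in range(n_tiles):
        read_start = i * stride
        tiles.append((read_start, read_start + crop_size,
                      i * stride, (i + 1) * stride))
    return fill, pad, tiles


fill, pad, tiles = tile_plan(600, crop_size=256, stride=192)
print(fill, pad)   # 168 32
for t in tiles:
    print(t)
# (0, 256, 0, 192)
# (192, 448, 192, 384)
# (384, 640, 384, 576)
# (576, 832, 576, 768)
```

Each tile reads `crop_size` pixels but writes only `stride` pixels, so neighbouring reads overlap by `crop_size - stride` while the writes do not overlap at all.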
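import torch
import torch.nn.functional as F

def image_restore(image, model, crop_size=256, stride=256):
    """
    Dilated (tiled) prediction for an input of shape (batch, C, H, W).

    Each crop_size x crop_size tile is predicted, but only its central
    stride x stride region is written to the output.
    Assumes model(x) returns a tensor with the same spatial size as x.
    """
    if crop_size < stride:
        raise ValueError("crop_size must be >= stride")
    width = image.shape[3]
    height = image.shape[2]
    # (-n) % stride is 0 when n is already a multiple of stride,
    # so no extra row or column of tiles is added in that case
    right_fill = (-width) % stride
    bottom_fill = (-height) % stride
    width_path_number = (width + right_fill) // stride     # number of tiles horizontally
    height_path_number = (height + bottom_fill) // stride  # number of tiles vertically
    # first pad: make H and W multiples of stride
    if right_fill or bottom_fill:
        image_first_pad = F.pad(image, (0, right_fill, 0, bottom_fill), mode='replicate')  # (left, right, top, bottom)
    else:
        image_first_pad = image
    # second pad: add the context margin that is discarded after prediction
    if crop_size == stride:
        pad_size = 0
        image_pad = image_first_pad
    else:
        pad_size = (crop_size - stride) // 2
        pad_after = crop_size - stride - pad_size  # equals pad_size when the difference is even
        image_pad = F.pad(image_first_pad, (pad_size, pad_after, pad_size, pad_after), mode='replicate')
    # padding done
    output = None
    for h in range(height_path_number):
        for w in range(width_path_number):
            # sample one crop_size x crop_size tile
            image_sample = image_pad[:, :, (h * stride):(h * stride + crop_size),
                                     (w * stride):(w * stride + crop_size)]
            with torch.no_grad():
                predict = model(image_sample)
            if output is None:
                # Allocate the output from the first prediction, so the model may
                # return a different channel count or dtype than the input
                output = predict.new_zeros((predict.shape[0], predict.shape[1],
                                            height + bottom_fill, width + right_fill))
            # keep only the central stride x stride region of this tile's prediction
            output[:, :, (h * stride):(h * stride + stride),
                   (w * stride):(w * stride + stride)] = predict[:, :, pad_size:pad_size + stride,
                                                                 pad_size:pad_size + stride]
    # crop away the first pad to recover the original size
    return output[:, :, 0:height, 0:width]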
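To see why keeping only the centre of each tile helps, here is a small 1D illustration in plain Python. It is a toy sketch, not the competition code: the "model" is a hypothetical 3-tap moving average with zero padding (`smooth`), and `tiled_predict` is a 1D analogue of the tiling above written only for this example.

```python
def smooth(signal):
    """Toy 'model': 3-tap moving average with zero padding at both ends."""
    padded = [0.0] + list(signal) + [0.0]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3.0
            for i in range(len(signal))]


def tiled_predict(signal, model, crop_size, stride):
    """1D dilated prediction: predict crop_size samples, keep the central stride."""
    n = len(signal)
    fill = (-n) % stride
    pad = (crop_size - stride) // 2
    pad_after = crop_size - stride - pad
    # Replicate padding: `pad` copies of the first sample in front,
    # fill + pad_after copies of the last sample at the end.
    padded = [signal[0]] * pad + list(signal) + [signal[-1]] * (fill + pad_after)
    out = []
    for start in range(0, n + fill, stride):
        pred = model(padded[start:start + crop_size])
        out.extend(pred[pad:pad + stride])   # discard the margins
    return out[:n]


def max_err(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


signal = [float(i % 5) for i in range(20)]
full = smooth(signal)                                             # whole signal at once
no_margin = tiled_predict(signal, smooth, crop_size=4, stride=4)  # plain tiling
with_margin = tiled_predict(signal, smooth, crop_size=6, stride=4)

# Compare away from the two ends of the signal, where replicate padding
# and the model's own zero padding legitimately differ.
print(max_err(full[1:-1], no_margin[1:-1]) > 0)   # True: seams at tile borders
print(max_err(full[1:-1], with_margin[1:-1]))     # 0.0
```

With a margin at least as large as the model's receptive-field radius (1 sample here), the tiled result matches the full prediction everywhere except at the outer boundary. A real segmentation network has a much larger receptive field, so in practice the margin reduces seam artifacts rather than removing them exactly.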