I started learning from the blog post below; these notes summarize what I picked up along the way.
https://blog.csdn.net/qq_34199326/article/details/84206079
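My biggest takeaway from reading the source is how much of it comes down to tensor dimensions: dimensions, dimensions, dimensions. You will see unsqueeze and repeat all over the place.
1. Read an image with OpenCV and convert it to a format the model can process. This is the first function in darknet.py.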
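To make the two calls concrete, here is a minimal sketch of what unsqueeze and repeat do to a tensor's shape. The tensor name and values are made up for illustration and are not taken from the YOLOv3 source.

```python
import torch

# Hypothetical example values, not taken from the YOLOv3 source.
offsets = torch.tensor([0.0, 1.0, 2.0])   # shape: (3,)

# unsqueeze(0) inserts a new axis of size 1 at position 0.
row = offsets.unsqueeze(0)                # shape: (1, 3)

# repeat(2, 1) tiles the tensor twice along axis 0 and once along axis 1.
tiled = row.repeat(2, 1)                  # shape: (2, 3)

print(offsets.shape, row.shape, tiled.shape)
```

Printing the shape after every step, as done here, is a cheap way to follow the dimension changes when reading the real code.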
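import cv2
import torch
from torch.autograd import Variable

def get_test_input():
    img = cv2.imread("dog-cycle-car.png")
    img = cv2.resize(img, (416, 416))  # resize to the input dimension; img is [416, 416, 3]
    print(img.shape)
    # [:, :, ::-1] reverses the third axis (BGR -> RGB); transpose gives [3, 416, 416].
    # .copy() is needed because torch.from_numpy rejects arrays with negative strides.
    img_ = img[:, :, ::-1].transpose((2, 0, 1)).copy()
    img_ = torch.from_numpy(img_).float()  # convert to a float tensor
    img_ = Variable(img_)  # convert to Variable (returns a plain Tensor since PyTorch 0.4)
    return img_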
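The layout changes in this function can be checked without an image file. The sketch below uses a tiny synthetic array in place of the photo and prints the shape at each stage. The last two steps (adding a batch axis and scaling to [0, 1]) do not appear in the function as written above; they are included here as an assumption about what a network usually expects as input.

```python
import numpy as np

# Synthetic 2x2 "image" in OpenCV layout: (height, width, channel), channels in BGR order.
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 0] = 10   # blue
bgr[..., 1] = 20   # green
bgr[..., 2] = 30   # red

rgb = bgr[:, :, ::-1]              # reverse the channel axis: BGR -> RGB
chw = rgb.transpose((2, 0, 1))     # (H, W, C) -> (C, H, W)

# Assumption: add a batch axis and scale pixel values to [0, 1].
batch = chw[np.newaxis, :, :, :] / 255.0
# Make the result a contiguous float32 array so it can be handed to a framework.
batch = np.ascontiguousarray(batch, dtype=np.float32)

print(bgr.shape, chw.shape, batch.shape)
```

After the transpose, channel 0 holds the red values, which confirms the BGR to RGB swap.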
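2. Read the YOLOv3 network from the .cfg file. This is the second function in darknet.py.
Read the .cfg file line by line into a list, then drop comments and empty lines and strip the whitespace on both sides of each line.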
with open(cfgfile, 'r') as file:
    lines = file.read().split('\n')  # store the lines in a list
lines = [x for x in lines if len(x) > 0]  # drop empty lines, keeping only non-empty ones
lines = [x for x in lines if x[0] != '#']  # drop comment lines
lines = [x.rstrip().lstrip() for x in lines]  # strip whitespace from both ends
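To see the effect of these filters on real input, here is a self-contained sketch that works on a string instead of a file. The helper name clean_cfg_lines and the sample text are hypothetical. Note that this variant strips whitespace before filtering, unlike the snippet above, so whitespace-only lines and indented comments are dropped as well.

```python
def clean_cfg_lines(text):
    """Return non-empty, non-comment lines with surrounding whitespace removed."""
    lines = [line.strip() for line in text.split('\n')]            # strip first
    lines = [line for line in lines if line]                       # drop empty lines
    lines = [line for line in lines if not line.startswith('#')]   # drop comments
    return lines


# Hypothetical sample in the style of a Darknet .cfg file.
sample = "\n".join([
    "[net]",
    "# Testing",
    "batch=1",
    "",
    "   # indented comment",
    "  [convolutional]",
    "filters = 32",
])

cleaned = clean_cfg_lines(sample)
print(cleaned)
```

Only the section headers and key=value lines survive, which is what the block parser needs.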
The complete code follows. It is short and effective, though it took me quite a while to understand.
def parse_cfg(cfgfile):
    """
    Takes a configuration file.

    Returns a list of blocks. Each block describes a block in the neural
    network to be built. A block is represented as a dictionary in the list.
    """