Implementing YOLOv3 in PyTorch

pawerd

Tags: python, artificial intelligence, pytorch

Article link: https://blog.csdn.net/weixin_45185432/article/details/111414858

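The YOLO family is a mainstay of object detection. To understand it properly, we need to start from the source code. Below we walk through a PyTorch implementation of YOLOv3.

Building the YOLO network
YOLOv3's backbone, Darknet-53, adapts ideas from ResNet (residual shortcut connections) to obtain a better-performing network.
Configuration file
The official code (written in C) uses a configuration file to build the network: the cfg file describes the architecture block by block. Our first task is to read this structure in PyTorch and turn it into our own modules for the forward and backward passes.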

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

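The excerpt above contains two convolutional layers and one shortcut connection used for residual addition. Below we go through every layer type used in Darknet, one by one.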

1. Convolutional layer

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

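This is an ordinary convolution. filters is the number of output channels, and the activation function is leaky ReLU. batch_normalize=1 acts as a flag that switches batch normalization on; it is not a parameter of BN itself.
A closer look at BN in PyTorch: the BatchNorm2d class takes a few arguments, of which num_features is the number of channels of the feature map. It is needed because the parameters BN learns, γ and β, are defined per channel: each channel gets its own (γ, β) pair. The other arguments matter less here; see the official documentation if you want the details.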
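As a quick check of the per-channel claim, here is a minimal sketch. The channel count 512 and the 13 x 13 dummy input are arbitrary values chosen for illustration.

```python
import torch
import torch.nn as nn

# BatchNorm2d takes the number of channels of its input feature map.
bn = nn.BatchNorm2d(num_features=512)

# gamma is stored as `weight`, beta as `bias`: one value per channel.
print(bn.weight.shape)
print(bn.bias.shape)

# The running statistics used at inference time are also per channel.
print(bn.running_mean.shape, bn.running_var.shape)

# A dummy batch: 2 images, 512 channels, 13 x 13 spatial size.
x = torch.randn(2, 512, 13, 13)
y = bn(x)
print(y.shape)  # batch normalization leaves the shape unchanged
```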
2. Shortcut connection

[shortcut]
from=-3
activation=linear

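It is similar to the structure used in residual networks. from=-3 means that the output of this layer is the output of the previous layer plus the output of the layer three positions back from the shortcut layer (the two outputs must have the same dimensions and are added element-wise).
3. Upsample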

[upsample]
stride=2

Upsamples the feature map of the previous layer by a factor of stride using bilinear upsampling.
4. Route layer

[route]
layers=-4

[route]
layers = -1, 61

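Its layers attribute has either one or two values. With a single value, the layer outputs the feature map of the layer indexed by that value. In the example above the value is -4, so the layer outputs the feature map of the fourth layer counting back from the route layer.

With two values, the layer returns the concatenation of the feature maps indexed by those values. In the example above they are -1 and 61, so the layer outputs the feature map of the previous layer (-1) concatenated with that of layer 61 along the depth (channel) dimension.
5. YOLO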

[yolo]
mask=0,1,2
anchors=10,13,16,30,33,23,30,61,62,45,59,119,116,90,156,198,373,326
classes=80
num=9
jitter=.3
ignore_thresh=.5
truth_thresh=1
random=1

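The 9 anchors listed here were obtained with k-means clustering, as proposed in the paper. YOLOv3 has three prediction branches (this is where its multi-scale detection comes from). mask=0,1,2 identifies one of these branches and tells it to use the anchors at indices 0, 1 and 2.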
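To make the relationship between anchors and mask concrete, the following sketch pairs up the flat anchor list and then applies the mask. It uses only the values from the [yolo] block above; the variable names are chosen for illustration.

```python
# Values copied from the [yolo] block above.
anchors_str = "10,13,16,30,33,23,30,61,62,45,59,119,116,90,156,198,373,326"
mask_str = "0,1,2"

# Flat list of ints -> list of (width, height) pairs.
flat = [int(v) for v in anchors_str.split(",")]
anchors = [(flat[i], flat[i + 1]) for i in range(0, len(flat), 2)]

# The mask picks the anchors this detection branch is responsible for.
mask = [int(m) for m in mask_str.split(",")]
selected = [anchors[m] for m in mask]

print(len(anchors))  # 9
print(selected)      # [(10, 13), (16, 30), (33, 23)]
```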

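There is also a [net] block. It does not describe a layer; it holds information about the network input and the training parameters.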

[net]
# Testing
batch=1
subdivisions=1
# Training
# batch=64
# subdivisions=16
width= 320
height = 320
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

Parsing the configuration file
Before we start, add the necessary imports at the top of darknet.py.

from __future__ import division

import torch 
import torch.nn as nn
import torch.nn.functional as F 
from torch.autograd import Variable
import numpy as np

We define a function parse_cfg that takes the path of the configuration file as input.

def parse_cfg(cfgfile):
    """
    Takes a configuration file.

    Returns a list of blocks. Each block describes a block in the neural
    network to be built. A block is represented as a dictionary in the list.
    """

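The idea is to parse the cfg and store every block as a dictionary. The attributes of a block and their values are stored as key-value pairs in the dictionary. As we parse the cfg, we append these dictionaries (the variable block in the code) to a list called blocks. Our function returns this list, blocks.
We first save the contents of the cfg file as a list of strings. The following code preprocesses that list and then loops over it to build the blocks: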

    with open(cfgfile, 'r') as file:
        lines = file.read().split('\n')            # store the lines in a list
    lines = [x.strip() for x in lines]             # get rid of fringe whitespace
    lines = [x for x in lines if len(x) > 0]       # get rid of the empty lines
    lines = [x for x in lines if x[0] != '#']      # get rid of comments

    # Loop over the preprocessed list to build the blocks
    block = {}
    blocks = []

    for line in lines:
        if line[0] == "[":               # this marks the start of a new block
            if len(block) != 0:          # a non-empty block holds the previous block's values
                blocks.append(block)     # add it to the blocks list
                block = {}               # re-init the block
            block["type"] = line[1:-1].rstrip()
        else:
            key, value = line.split("=", 1)
            block[key.rstrip()] = value.lstrip()
    blocks.append(block)

    return blocks
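To see what the parser produces, here is a small self-contained sketch. parse_cfg_text is a hypothetical helper introduced only for this example: it applies the same parsing rules to a string instead of a file, so it can be run without a cfg file on disk.

```python
def parse_cfg_text(text):
    """Hypothetical helper: same parsing rules as parse_cfg, but on a string."""
    lines = [ln.strip() for ln in text.split("\n")]
    lines = [ln for ln in lines if ln and not ln.startswith("#")]

    blocks, block = [], {}
    for line in lines:
        if line.startswith("["):
            if block:                      # flush the previous block
                blocks.append(block)
                block = {}
            block["type"] = line[1:-1].strip()
        else:
            key, value = line.split("=", 1)
            block[key.strip()] = value.strip()
    blocks.append(block)
    return blocks


sample = """
[net]
# Testing
batch=1
width= 320

[convolutional]
batch_normalize=1
filters=512
activation=leaky

[shortcut]
from=-3
activation=linear
"""

blocks = parse_cfg_text(sample)
for b in blocks:
    print(b)
```

Note that every value stays a string; the code that builds the modules is responsible for casting with int() where needed.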
# Create the building blocks of the network
def create_modules(blocks):
    net_info = blocks[0]     # captures the information about the input and pre-processing
    module_list = nn.ModuleList()
    prev_filters = 3
    output_filters = []

Before iterating over the list of blocks, we define a variable net_info to store the information about the network.

nn.ModuleList

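Our function will return an nn.ModuleList. This class is almost like a normal list containing nn.Module objects. However, when an nn.ModuleList is added as a member of an nn.Module object (that is, when we add modules to our network), the parameters of all the nn.Module objects (modules) inside the nn.ModuleList are also registered as parameters of that nn.Module object (our network, which has the nn.ModuleList as a member).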
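The difference from a plain Python list is easy to observe. The two toy classes below are hypothetical and exist only for this comparison; they hold the same two nn.Linear layers, once in a list and once in an nn.ModuleList.

```python
import torch.nn as nn


class WithPlainList(nn.Module):
    def __init__(self):
        super().__init__()
        # A plain Python list: PyTorch does not look inside it.
        self.layers = [nn.Linear(4, 4), nn.Linear(4, 2)]


class WithModuleList(nn.Module):
    def __init__(self):
        super().__init__()
        # nn.ModuleList registers every contained module.
        self.layers = nn.ModuleList([nn.Linear(4, 4), nn.Linear(4, 2)])


plain_count = len(list(WithPlainList().parameters()))
registered_count = len(list(WithModuleList().parameters()))

print(plain_count)       # 0
print(registered_count)  # 4 (a weight and a bias for each Linear layer)
```

Parameters that are not registered are invisible to optimizers and are not moved by calls such as .cuda(), which is why the network keeps its modules in an nn.ModuleList.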

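When we define a new convolutional layer, we must define the dimensions of its kernel. The height and width of the kernel are given by the cfg file, but its depth is exactly the number of filters (the depth of the feature map) in the previous layer. This means we need to keep track of the number of filters in the layer the convolution is applied to. We use the variable prev_filters for this. It is initialised to 3, because the image has 3 channels corresponding to RGB.

A route layer brings in feature maps (possibly concatenated) from earlier layers. If a convolutional layer comes right after a route layer, its kernels are applied to the feature maps of those earlier layers, precisely the ones the route layer brings in. So we need to keep track of the number of filters not only of the previous layer but of every preceding layer. As we iterate, we append the number of output filters of each block to the list output_filters.

Now the idea is to iterate over the list of blocks and create a PyTorch module for each block as we go.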

    for index, x in enumerate(blocks[1:]):
        module = nn.Sequential()

        # check the type of block
        # create a new module for the block
        # append to module_list

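The nn.Sequential class is used to execute a number of nn.Module objects in sequence. If you look at the cfg, you will see that a block may contain more than one layer. For example, a block of type convolutional has a batch norm layer and a leaky ReLU activation layer in addition to the convolutional layer. We string these layers together with nn.Sequential and its add_module method. For example, this is how we create the convolutional and upsample layers: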

        if x["type"] == "convolutional":
            # Get the info about the layer
            activation = x["activation"]
            try:
                batch_normalize = int(x["batch_normalize"])
                bias = False
            except KeyError:
                # No batch_normalize key: the convolution keeps its own bias
                batch_normalize = 0
                bias = True

            filters = int(x["filters"])
            padding = int(x["pad"])
            kernel_size = int(x["size"])
            stride = int(x["stride"])

            if padding:
                pad = (kernel_size - 1) // 2
            else:
                pad = 0

            # Add the convolutional layer
            conv = nn.Conv2d(prev_filters, filters, kernel_size, stride, pad, bias=bias)
            module.add_module("conv_{0}".format(index), conv)

            # Add the batch norm layer
            if batch_normalize:
                bn = nn.BatchNorm2d(filters)
                module.add_module("batch_norm_{0}".format(index), bn)

            # Check the activation.
            # It is either linear or a leaky ReLU for YOLO
            if activation == "leaky":
                activn = nn.LeakyReLU(0.1, inplace=True)
                module.add_module("leaky_{0}".format(index), activn)

        # If it's an upsampling layer, use bilinear upsampling (nn.Upsample)
        elif x["type"] == "upsample":
            stride = int(x["stride"])
            upsample = nn.Upsample(scale_factor=stride, mode="bilinear")
            module.add_module("upsample_{}".format(index), upsample)
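The padding rule in the convolutional branch, pad = (kernel_size - 1) // 2 when the pad flag is set, keeps the spatial size unchanged at stride 1 and halves it at stride 2. The sketch below checks this with the standard convolution output-size formula; conv_output_size is a hypothetical helper and 416 is just an example input size.

```python
def conv_output_size(in_size, kernel_size, stride, pad_flag):
    """Spatial output size of a convolution, using the padding rule above."""
    pad = (kernel_size - 1) // 2 if pad_flag else 0
    return (in_size + 2 * pad - kernel_size) // stride + 1


print(conv_output_size(416, 3, 1, 1))  # 416: 3x3, stride 1 keeps the size
print(conv_output_size(416, 3, 2, 1))  # 208: 3x3, stride 2 halves it
print(conv_output_size(416, 1, 1, 1))  # 416: a 1x1 kernel gets pad = 0
```

The last line also shows why pad=1 is harmless on the 1x1 convolutions in the cfg: the flag is set, but the computed padding is 0.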

Route layer / shortcut layer

Next, we write the code that creates the route and shortcut layers:

        # If it is a route layer
        elif x["type"] == "route":
            x["layers"] = x["layers"].split(',')
            # Start of a route
            start = int(x["layers"][0])
            # End, if there exists one
            try:
                end = int(x["layers"][1])
            except IndexError:
                end = 0
            # Positive annotation: convert absolute indices to relative ones
            if start > 0:
                start = start - index
            if end > 0:
                end = end - index
            route = EmptyLayer()
            module.add_module("route_{0}".format(index), route)
            if end < 0:
                filters = output_filters[index + start] + output_filters[index + end]
            else:
                filters = output_filters[index + start]

        # A shortcut corresponds to a skip connection
        elif x["type"] == "shortcut":
            shortcut = EmptyLayer()
            module.add_module("shortcut_{}".format(index), shortcut)

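The code for creating the route layer deserves some explanation. First, we extract the value of the layers attribute, cast its entries to integers and store them in a list.

Then we have a new layer called EmptyLayer which, as the name suggests, is just an empty layer.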

route = EmptyLayer()

# It is defined as follows
class EmptyLayer(nn.Module):
    def __init__(self):
        super(EmptyLayer, self).__init__()

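An empty layer may seem puzzling, since it does nothing, whereas the route layer, like any other layer, performs an operation (bringing forward or concatenating earlier layers). In PyTorch, when we define a new layer we subclass nn.Module and write the operation the layer performs in the forward function of that nn.Module object.

To design a layer for the route block, we would have to build an nn.Module object initialised with the values of the layers attribute as its member(s). We could then write the code that concatenates or brings forward the feature maps in its forward function. Finally, we would execute this layer in the forward function of our network.

But since the concatenation code is fairly short and simple (calling torch.cat on the feature maps), designing a layer this way would be unnecessary abstraction that only adds boilerplate. Instead, we can put a dummy layer in place of the proposed route layer and perform the concatenation directly in the forward function of the nn.Module object that represents Darknet. (If this is confusing, I suggest reading up on how the nn.Module class is used in PyTorch.)

A convolutional layer right after a route layer applies its kernels to the (possibly concatenated) feature maps of earlier layers. The following code updates the filters variable to hold the number of filters output by the route layer.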

            if end < 0:
                # If we are concatenating maps
                filters = output_filters[index + start] + output_filters[index + end]
            else:
                filters = output_filters[index + start]

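The shortcut layer also uses an empty layer, because it too performs a very simple operation (addition). There is no need to update the filters variable, since the layer merely adds the feature map of an earlier layer to that of the layer just before it, which leaves the channel count unchanged.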
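The bookkeeping for the route layer can be tried on its own. The sketch below mirrors the index arithmetic used in create_modules; route_filters is a hypothetical helper, and the channel counts in output_filters are made up for illustration.

```python
def route_filters(output_filters, index, layers):
    """Channels produced by a route layer sitting at position `index`.

    Hypothetical helper that mirrors the index arithmetic in create_modules.
    """
    start = layers[0]
    end = layers[1] if len(layers) > 1 else 0
    # Absolute layer numbers are turned into offsets relative to this layer.
    if start > 0:
        start -= index
    if end > 0:
        end -= index
    if end < 0:
        # Two sources: their feature maps are concatenated along the depth.
        return output_filters[index + start] + output_filters[index + end]
    return output_filters[index + start]


# Made-up channel counts for layers 0..5; the route layer would be layer 6.
output_filters = [32, 64, 128, 256, 128, 256]

print(route_filters(output_filters, 6, [-4]))     # 128, taken from layer 2
print(route_filters(output_filters, 6, [-1, 1]))  # 256 + 64 = 320
```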
YOLO layer

        # Yolo is the detection layer
        elif x["type"] == "yolo":
            mask = x["mask"].split(",")
            mask = [int(m) for m in mask]

            anchors = x["anchors"].split(",")
            anchors = [int(a) for a in anchors]
            anchors = [(anchors[i], anchors[i + 1]) for i in range(0, len(anchors), 2)]
            anchors = [anchors[i] for i in mask]

            detection = DetectionLayer(anchors)
            module.add_module("Detection_{}".format(index), detection)

The detection layer is defined as follows:

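class DetectionLayer(nn.Module):
    def __init__(self, anchors):
        super(DetectionLayer, self).__init__()
        self.anchors = anchors

At the end of each loop iteration in create_modules we do some bookkeeping, and once the loop has finished the function returns net_info together with module_list:

        module_list.append(module)
        prev_filters = filters
        output_filters.append(filters)

    return (net_info, module_list)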

Testing the code

You can test the code by adding the following lines at the end of darknet.py and running the file.

blocks = parse_cfg("cfg/yolov3.cfg")
print(create_modules(blocks))

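Implementing the forward pass of the network
Prerequisites

The first two parts of this tutorial;

Basic working knowledge of PyTorch, including how to create custom architectures with nn.Module, nn.Sequential and torch.nn.Parameter;

Working with images in PyTorch.

Defining the network

As pointed out earlier, we use nn.Module to build custom architectures in PyTorch. Here we define a network for the detector. In darknet.py, we add the following class: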

class Darknet(nn.Module):
    def __init__(self, cfgfile):
        super(Darknet, self).__init__()
        self.blocks = parse_cfg(cfgfile)
        self.net_info, self.module_list = create_modules(self.blocks)

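Here we subclass nn.Module and name our class Darknet. We initialise the network with the members blocks, net_info and module_list.

Implementing the forward pass

The forward pass of the network is implemented by overriding the forward method of the nn.Module class.

forward serves two purposes. First, it computes the output. Second, it transforms the output detection feature maps into a form that is easier to process later (for example, once transformed, the detection maps from the different scales can be concatenated, which would otherwise be impossible because their dimensions differ).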

    def forward(self, x, CUDA):
        modules = self.blocks[1:]
        outputs = {}   # we cache the outputs for the route layer
        write = 0      # becomes 1 once the first detection map has been collected

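forward takes three arguments: self, the input x, and CUDA (if true, the GPU is used to accelerate the forward pass).

Here we iterate over self.blocks[1:] rather than self.blocks, because the first element of self.blocks is the net block, which is not part of the forward pass.

Since route and shortcut layers need the output maps of earlier layers, we cache the output feature map of every layer in the dictionary outputs. The keys are the layer indices and the values are the feature maps.

As in create_modules, we now iterate over module_list, which contains the modules of the network. Note that the modules were appended in the same order as they appear in the configuration file. This means we can simply pass the input through each module to get the output.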

        for i, module in enumerate(modules):
            module_type = module["type"]
            if module_type == "convolutional" or module_type == "upsample":
                x = self.module_list[i](x)

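Route layer / shortcut layer

If you look at the code for the route layer, we have to account for two cases (as described in Part 2). When two feature maps have to be concatenated, we use the torch.cat function with the second argument set to 1. This is because we want to concatenate the feature maps along the depth. (In PyTorch, the input and output of a convolutional layer have the format B x C x H x W; depth corresponds to the channel dimension.)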

            elif module_type == "route":
                layers = module["layers"]
                layers = [int(a) for a in layers]

                if layers[0] > 0:
                    layers[0] = layers[0] - i

                if len(layers) == 1:
                    x = outputs[i + layers[0]]

                else:
                    if layers[1] > 0:
                        layers[1] = layers[1] - i

                    map1 = outputs[i + layers[0]]
                    map2 = outputs[i + layers[1]]

                    x = torch.cat((map1, map2), 1)

            elif module_type == "shortcut":
                from_ = int(module["from"])
                x = outputs[i - 1] + outputs[i + from_]
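The concatenation performed in the route branch can be tried in isolation. In this sketch the channel counts (256 and 512) and the 26 x 26 spatial size are arbitrary example values.

```python
import torch

# Two feature maps with the same batch and spatial size but different depth.
map1 = torch.zeros(1, 256, 26, 26)
map2 = torch.ones(1, 512, 26, 26)

# dim=1 is the channel dimension in the B x C x H x W layout.
x = torch.cat((map1, map2), 1)
print(x.shape)  # torch.Size([1, 768, 26, 26])
```

The spatial sizes must match for this to work, which is why the cfg upsamples a feature map before routing it together with an earlier, larger one.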

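YOLO (detection layer)

The output of YOLO is a convolutional feature map that holds the bounding-box attributes along its depth. The attributes of the boxes predicted by a cell are stacked one after another. So if you need to access the second bounding box of the cell at (5,6), you have to index it as map[5,6, (5+C): 2*(5+C)]. This form is very inconvenient for output processing such as thresholding by object confidence, adding grid offsets to the centres, applying anchors, and so on.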
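The slice bounds in that indexing expression follow directly from the number of attributes per box. A small sketch, where box_slice is a hypothetical helper and C = 80 is the class count from the cfg above:

```python
def box_slice(box_index, num_classes):
    """Depth range holding the attributes of the box_index-th box (0-based)."""
    attrs = 5 + num_classes  # x, y, w, h, objectness + class scores
    return box_index * attrs, (box_index + 1) * attrs


C = 80
print(box_slice(0, C))  # (0, 85): first box of the cell
print(box_slice(1, C))  # (85, 170): second box, i.e. (5+C) : 2*(5+C)
print(3 * (5 + C))      # 255: total depth for 3 boxes per cell
```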

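Another problem is that, since detection happens at three scales, the prediction maps have different dimensions. Although the three feature maps differ in size, the output processing applied to them is similar. It would be convenient to perform these operations on a single tensor rather than on three separate ones.
To address these problems, we introduce the function predict_transform.

Transforming the output

The function predict_transform lives in the file util.py, and we will import it when we use it in the forward method of the Darknet class.

Add the imports at the top of util.py: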

from __future__ import division

import torch 
import torch.nn as nn
import torch.nn.functional as F 
from torch.autograd import Variable
import numpy as np
import cv2

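predict_transform takes 5 arguments: prediction (our output), inp_dim (the input image dimension), anchors, num_classes, and an optional CUDA flag.

def predict_transform(prediction, inp_dim, anchors, num_classes, CUDA = True):

predict_transform takes a detection feature map and turns it into a table (one per image in the batch) in which each row holds the attributes of one bounding box. The following code performs this transformation: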

    batch_size = prediction.size(0)
    stride = inp_dim // prediction.size(2)
    grid_size = inp_dim // stride
    bbox_attrs = 5 + num_classes
    num_anchors = len(anchors)

    prediction = prediction.view(batch_size, bbox_attrs*num_anchors, grid_size*grid_size)
    prediction = prediction.transpose(1,2).contiguous()
    prediction = prediction.view(batch_size, grid_size*grid_size*num_anchors, bbox_attrs)
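To convince ourselves that this view / transpose / view sequence really puts one box per row, the sketch below applies it to a random stand-in tensor and compares one row with the corresponding slice of the raw feature map. The sizes (13 x 13 grid, 3 anchors, 80 classes) are example values.

```python
import torch

batch_size, num_anchors, num_classes, grid_size = 1, 3, 80, 13
bbox_attrs = 5 + num_classes

# Stand-in for the raw output of a detection layer: B x (A * attrs) x G x G.
prediction = torch.randn(batch_size, num_anchors * bbox_attrs, grid_size, grid_size)
raw = prediction.clone()

prediction = prediction.view(batch_size, bbox_attrs * num_anchors, grid_size * grid_size)
prediction = prediction.transpose(1, 2).contiguous()
prediction = prediction.view(batch_size, grid_size * grid_size * num_anchors, bbox_attrs)
print(prediction.shape)  # torch.Size([1, 507, 85])

# Row of the second anchor (index 1) of the cell in grid row 5, column 6.
row = (5 * grid_size + 6) * num_anchors + 1
same = torch.equal(prediction[0, row], raw[0, bbox_attrs:2 * bbox_attrs, 5, 6])
print(same)  # True
```

So the rows are ordered cell by cell (row-major over the grid), with the anchors of one cell stored consecutively.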

Apply the sigmoid function to the (x, y) coordinates and the objectness score.

    # Sigmoid the centre_X, centre_Y and object confidence
    prediction[:,:,0] = torch.sigmoid(prediction[:,:,0])
    prediction[:,:,1] = torch.sigmoid(prediction[:,:,1])
    prediction[:,:,4] = torch.sigmoid(prediction[:,:,4])

Add the grid offsets to the centre-coordinate predictions:

    # Add the center offsets
    grid = np.arange(grid_size)
    a, b = np.meshgrid(grid, grid)

    x_offset = torch.FloatTensor(a).view(-1,1)
    y_offset = torch.FloatTensor(b).view(-1,1)

    if CUDA:
        x_offset = x_offset.cuda()
        y_offset = y_offset.cuda()

    x_y_offset = torch.cat((x_offset, y_offset), 1).repeat(1,num_anchors).view(-1,2).unsqueeze(0)

    prediction[:,:,:2] += x_y_offset
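The meshgrid / repeat / view chain is compact but hard to read. The pure-Python sketch below builds the same sequence of (x, y) offsets so that the ordering can be inspected; cell_offsets is a hypothetical stand-in, not part of the implementation, and the tiny 2 x 2 grid is just for readability.

```python
def cell_offsets(grid_size, num_anchors):
    """(x, y) offset for every row of the transformed prediction tensor.

    Pure-Python stand-in for the meshgrid / repeat / view sequence above.
    """
    return [
        (x, y)
        for y in range(grid_size)      # rows of the grid
        for x in range(grid_size)      # columns of the grid
        for _ in range(num_anchors)    # one entry per anchor in the cell
    ]


offsets = cell_offsets(grid_size=2, num_anchors=3)
print(len(offsets))  # 12
print(offsets[:4])   # [(0, 0), (0, 0), (0, 0), (1, 0)]
```

Each cell's offset appears num_anchors times in a row, matching the row layout of the transformed prediction tensor (cell by cell, anchors consecutive).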

Apply the anchors to the dimensions of the bounding box:

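    # Anchors are given in input-image pixels; express them in grid-cell units
    anchors = [(a[0] / stride, a[1] / stride) for a in anchors]

    # Log-space transform of the height and the width
    anchors = torch.FloatTensor(anchors)

    if CUDA:
        anchors = anchors.cuda()

    anchors = anchors.repeat(grid_size*grid_size, 1).unsqueeze(0)
    prediction[:,:,2:4] = torch.exp(prediction[:,:,2:4])*anchors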

Apply the sigmoid activation to the class scores:

    prediction[:,:,5: 5 + num_classes] = torch.sigmoid(prediction[:,:, 5 : 5 + num_classes])
    # Rescale to input-image size: box attributes are relative to the feature map
    # (e.g. 13 x 13), so for a 416 x 416 input we multiply by 32, i.e. by stride.
    prediction[:,:,:4] *= stride

    return prediction
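Taken together, the steps of predict_transform amount to the box decoding used by YOLOv3, written out below. The symbols are introduced here for illustration: t_x, t_y, t_w, t_h are the raw network outputs for one box, (c_x, c_y) is the offset of its grid cell, p_w and p_h are the anchor dimensions expressed in grid-cell units (the anchor size in pixels divided by the stride), σ is the sigmoid function and s is the stride.

```latex
\begin{aligned}
b_x &= \bigl(\sigma(t_x) + c_x\bigr)\cdot s \\
b_y &= \bigl(\sigma(t_y) + c_y\bigr)\cdot s \\
b_w &= p_w\, e^{t_w}\cdot s \\
b_h &= p_h\, e^{t_h}\cdot s
\end{aligned}
```

The objectness score and every class score are passed through σ as well; they are not rescaled by s.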

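Detection layer revisited

Now that the output tensors have been transformed, we can concatenate the detection maps of the three scales into one big tensor. Note that this was not possible before the transformation, because feature maps with different spatial dimensions cannot be concatenated. After the transformation, each output tensor is a table with bounding boxes as its rows, so concatenation is straightforward.

One obstacle is that we cannot initialise an empty tensor and then concatenate a non-empty tensor (of a different shape) to it. So we delay the initialisation of the collector (the tensor that holds the detections) until we get the first detection map, and concatenate subsequent detection maps to it.

Note the line write = 0 just before the loop in forward. The write flag indicates whether we have encountered the first detection or not. If write is 0, the collector has not been initialised yet. If it is 1, the collector has been initialised and we only need to concatenate the new detection map to it.

Now that we have the predict_transform function, we can write the code that handles the detection feature maps in forward.

At the top of darknet.py, add the following import:

from util import *

Then, in the forward function, define: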

            elif module_type == 'yolo':

                anchors = self.module_list[i][0].anchors
                # Get the input dimensions
                inp_dim = int(self.net_info["height"])

                # Get the number of classes
                num_classes = int(module["classes"])

                # Transform
                x = x.data
                x = predict_transform(x, inp_dim, anchors, num_classes, CUDA)
                if not write:   # if no collector has been initialised
                    detections = x
                    write = 1

                else:
                    detections = torch.cat((detections, x), 1)

            outputs[i] = x

Now, simply return the detections.

        return detections

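Testing the forward pass

The function below creates a dummy input that we can pass through our network. Before writing it, save this image to your working directory with the following command: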

wget https://github.com/ayooshkathuria/pytorch-yolo-v3/raw/master/dog-cycle-car.png

You can also download the image directly: https://github.com/ayooshkathuria/pytorch-yolo-v3/raw/master/dog-cycle-car.png

Now define the following function at the top of darknet.py:

def get_test_input():
    img = cv2.imread("dog-cycle-car.png")
    img = cv2.resize(img, (416, 416))                 # resize to the input dimension
    img_ = img[:, :, ::-1].transpose((2, 0, 1))       # BGR -> RGB | H x W x C -> C x H x W
    img_ = img_[np.newaxis, :, :, :] / 255.0          # add a batch dimension at 0 | normalise
    img_ = torch.from_numpy(img_).float()             # convert to float
    img_ = Variable(img_)                             # convert to Variable
    return img_

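Then we run the following code:

model = Darknet("cfg/yolov3.cfg")
inp = get_test_input()
pred = model(inp, False)   # second argument is the CUDA flag; False runs on the CPU
print(pred)

You will see output like the following: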

( 0 ,.,.) = 
16.0962 17.0541 91.5104 ... 0.4336 0.4692 0.5279
15.1363 15.2568 166.0840 ... 0.5561 0.5414 0.5318
14.4763 18.5405 409.4371 ... 0.5908 0.5353 0.4979
 ⋱ ... 
411.2625 412.0660 9.0127 ... 0.5054 0.4662 0.5043
412.1762 412.4936 16.0449 ... 0.4815 0.4979 0.4582
412.1629 411.4338 34.9027 ... 0.4306 0.5462 0.4138
[torch.FloatTensor of size 1x10647x85]

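The shape of this tensor is 1×10647×85. The first dimension is the batch size, which is 1 because we used a single image. For each image in the batch we have a 10647×85 table, each row of which represents one bounding box (4 box attributes, 1 objectness score and 80 class scores).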
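The number 10647 can be derived by hand. The sketch below assumes a 416 x 416 input, 3 anchors per scale, and the strides 32, 16 and 8 of the three detection scales in the standard YOLOv3 configuration; these strides are an assumption stated here, since the text above only mentions the factor 32.

```python
input_size = 416
num_anchors_per_scale = 3
strides = [32, 16, 8]  # assumed downsampling factor of each detection scale

grid_sizes = [input_size // s for s in strides]
boxes_per_scale = [g * g * num_anchors_per_scale for g in grid_sizes]

print(grid_sizes)            # [13, 26, 52]
print(boxes_per_scale)       # [507, 2028, 8112]
print(sum(boxes_per_scale))  # 10647
```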

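At this point our network has random weights and will not produce correct output. We need to load a weights file into the network, and we will use the official one.

Downloading the pretrained weights

Download the weights file into the detector directory. We can fetch it from the command line: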

wget https://pjreddie.com/media/files/yolov3.weights

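It can also be downloaded from this address: https://pjreddie.com/media/files/yolov3.weights

The official weights file is a binary file that stores the weights of the network serially.

We must read the weights with great care, because they are stored as plain floats, with nothing to tell us which layer they belong to. If we get it wrong, the weights will very likely be loaded into the wrong places and the model will be useless. So we have to understand how the weights are stored.

First, the weights belong to only two types of layers: batch norm layers and convolutional layers. The weights of these layers are stored in exactly the same order as the layers appear in the configuration file. So if a convolutional block is followed by a shortcut block, which in turn is followed by another convolutional block, you can expect the file to contain the weights of the first convolutional block, followed by those of the second.

When a batch norm layer appears in a convolutional block, the convolution has no biases. When there is no batch norm layer, the bias "weights" of the convolution have to be read from the file.

Loading the weights

We write a function that loads the weights; it is a member function of the Darknet class. Besides self, it takes one argument, the path of the weights file.

    def load_weights(self, weightfile):

The first 160 bits (20 bytes) of the weights file store 5 int32 values that make up the header of the file.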

        # Open the weights file
        fp = open(weightfile, "rb")

        # The first 5 values are header information
        # 1. Major version number
        # 2. Minor version number
        # 3. Subversion number
        # 4, 5. Images seen by the network (during training)
        header = np.fromfile(fp, dtype=np.int32, count=5)
        self.header = torch.from_numpy(header)
        self.seen = self.header[3]

The remaining bits hold the weights, in the order described above. The weights are stored as float32 (32-bit floats). We load the rest of the file into an np.ndarray.

        weights = np.fromfile(fp, dtype=np.float32)
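The file layout (a 20-byte header followed by raw float32 values) can be illustrated without the real weights file. The sketch below builds a tiny fake file in memory with the standard struct module and reads it back. All numbers are made up, and little-endian byte order is an assumption (np.fromfile uses the machine's native order, which is little-endian on common x86 and ARM systems).

```python
import struct

# Build a fake 20-byte header: five little-endian int32 values.
# The numbers are made up for illustration only.
fake_header = struct.pack("<5i", 0, 2, 0, 32013312, 0)

# Pretend two float32 weights follow the header.
fake_file = fake_header + struct.pack("<2f", 0.5, -1.25)

# Read it back: 5 ints first, then everything else as floats.
header = struct.unpack("<5i", fake_file[:20])
weights = struct.unpack("<2f", fake_file[20:])

print(len(fake_header) * 8)  # 160 bits
print(header)                # (0, 2, 0, 32013312, 0)
print(weights)               # (0.5, -1.25)
```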

Now we iterate over the weights array and load the weights into the modules of our network.

        ptr = 0
        for i in range(len(self.module_list)):
            module_type = self.blocks[i + 1]["type"]

            # If module_type is convolutional, load weights
            # Otherwise ignore.

Inside the loop, we first check whether the convolutional block has batch_normalize set. Based on that, we load the weights.

            if module_type == "convolutional":
                model = self.module_list[i]
                try:
                    batch_normalize = int(self.blocks[i + 1]["batch_normalize"])
                except KeyError:
                    batch_normalize = 0

                conv = model[0]

We keep a variable called ptr to track our position in the weights array. If batch_normalize is true, we load the weights as follows:

                if batch_normalize:
                    bn = model[1]

                    # Get the number of weights of the batch norm layer
                    num_bn_biases = bn.bias.numel()

                    # Load the weights
                    bn_biases = torch.from_numpy(weights[ptr:ptr + num_bn_biases])
                    ptr += num_bn_biases

                    bn_weights = torch.from_numpy(weights[ptr: ptr + num_bn_biases])
                    ptr += num_bn_biases

                    bn_running_mean = torch.from_numpy(weights[ptr: ptr + num_bn_biases])
                    ptr += num_bn_biases

                    bn_running_var = torch.from_numpy(weights[ptr: ptr + num_bn_biases])
                    ptr += num_bn_biases

                    # Cast the loaded weights into the dims of the model weights
                    bn_biases = bn_biases.view_as(bn.bias.data)
                    bn_weights = bn_weights.view_as(bn.weight.data)
                    bn_running_mean = bn_running_mean.view_as(bn.running_mean)
                    bn_running_var = bn_running_var.view_as(bn.running_var)

                    # Copy the data to the model
                    bn.bias.data.copy_(bn_biases)
                    bn.weight.data.copy_(bn_weights)
                    bn.running_mean.copy_(bn_running_mean)
                    bn.running_var.copy_(bn_running_var)

If batch_normalize is not true, we only need to load the biases of the convolutional layer.

                else:
                    # Number of biases
                    num_biases = conv.bias.numel()

                    # Load the weights
                    conv_biases = torch.from_numpy(weights[ptr: ptr + num_biases])
                    ptr = ptr + num_biases

                    # Reshape the loaded weights according to the dims of the model weights
                    conv_biases = conv_biases.view_as(conv.bias.data)

                    # Finally copy the data
                    conv.bias.data.copy_(conv_biases)

Finally, we load the weights of the convolutional layer.

                # Let us load the weights for the convolutional layer
                num_weights = conv.weight.numel()

                # Do the same as above for the weights
                conv_weights = torch.from_numpy(weights[ptr:ptr + num_weights])
                ptr = ptr + num_weights

                conv_weights = conv_weights.view_as(conv.weight.data)
                conv.weight.data.copy_(conv_weights)
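Because nothing in the file marks where one layer ends, ptr must advance by exactly the right amount for each block. The following sketch computes how many float32 values one convolutional block consumes, following the read order used above; conv_block_floats is a hypothetical helper, and the layer sizes in the two calls are example values.

```python
def conv_block_floats(in_channels, filters, kernel_size, batch_normalize):
    """Number of float32 values one convolutional block occupies in the file."""
    conv_weights = filters * in_channels * kernel_size * kernel_size
    if batch_normalize:
        # BN bias, BN weight, running mean, running variance: one each per filter.
        return 4 * filters + conv_weights
    # No batch norm: only the convolution bias precedes the weights.
    return filters + conv_weights


# 3 -> 32 channels, 3x3 kernel, with batch norm
print(conv_block_floats(3, 32, 3, True))       # 992
# 1024 -> 255 channels, 1x1 kernel, without batch norm
print(conv_block_floats(1024, 255, 1, False))  # 261375
```

Summing this quantity over all convolutional blocks should give the length of the weights array; comparing that sum with ptr after the loop is a simple sanity check on the loader.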

That is all for this function. You can now load the weights into a Darknet object by calling load_weights on it.

model = Darknet("cfg/yolov3.cfg")
model.load_weights("yolov3.weights")

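With the model built and the weights loaded, we can finally start detecting objects. In a later part we will cover how to produce the final detections using an objectness confidence threshold and non-maximum suppression.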
