The theory behind deep-learning training is gradient descent, and its derivation is involved. Historically it originates in the backpropagation (BP) algorithm that Hinton and colleagues popularised in 1986; the practical details of gradient-based training later evolved with convolutional networks and with the switch to the ReLU activation function. For the theory of training deep networks by gradient descent, the key reference is the 1986 paper by Rumelhart, Hinton and Williams, "Learning internal representations by error propagation".
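The core of backpropagation is just the chain rule applied layer by layer; with ReLU, the gradient is additionally gated to the units whose pre-activation was positive. A minimal hand-worked sketch (the toy two-layer network and random data here are my own illustration, not from the paper):

```python
import numpy as np

# Toy network: x -> W1 -> ReLU -> W2 -> scalar output, squared-error loss.
rng = np.random.default_rng(0)
x = rng.normal(size=(3,))          # input
W1 = rng.normal(size=(4, 3))       # first-layer weights
W2 = rng.normal(size=(1, 4))       # second-layer weights
y_true = 1.0

# Forward pass
z1 = W1 @ x                        # pre-activations
h1 = np.maximum(z1, 0.0)           # ReLU
y = (W2 @ h1)[0]                   # scalar prediction
loss = 0.5 * (y - y_true) ** 2

# Backward pass: chain rule, one layer at a time
dy = y - y_true                    # dL/dy
dW2 = dy * h1[None, :]             # dL/dW2
dh1 = dy * W2[0]                   # dL/dh1
dz1 = dh1 * (z1 > 0)               # ReLU gate: gradient flows only where z1 > 0
dW1 = dz1[:, None] * x[None, :]    # dL/dW1

# Sanity check against a finite difference on one weight
eps = 1e-6
W1p = W1.copy()
W1p[0, 0] += eps
loss_p = 0.5 * ((W2 @ np.maximum(W1p @ x, 0.0))[0] - y_true) ** 2
assert abs((loss_p - loss) / eps - dW1[0, 0]) < 1e-4
```

The finite-difference check at the end is the standard way to verify a hand-derived gradient.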
One of the most important things the major frameworks (TensorFlow, PyTorch, etc.) solve for you is training. Training boils down to this: take the loss function, compute its partial derivative with respect to every trainable parameter, follow the direction of gradient descent, and update the parameters. The differentiation is complicated and the number of parameters is enormous. With a framework, however, all you need to write is the loss function; you then hand it to an optimizer, and the framework automatically carries out the optimisation, updating the parameters by gradient descent.
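To see what the framework is automating, here is a hand-rolled sketch of that loop for least-squares linear regression in plain NumPy (the data and names are illustrative; a real framework derives the gradient expression for you instead of us writing it by hand):

```python
import numpy as np

# Synthetic regression problem: recover true_w from (X, y).
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w

w = np.zeros(3)        # trainable parameters, initialised at zero
lr = 0.1               # learning rate
for _ in range(500):
    pred = X @ w
    # Gradient of loss = mean((pred - y)^2) / 2 with respect to w
    grad = X.T @ (pred - y) / len(y)
    w -= lr * grad     # gradient-descent update

print(w)               # should be close to true_w
```

The two lines inside the loop, computing the gradient and stepping against it, are exactly what `optimizer.minimize(loss)` hides from you.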
The core of the TensorFlow-based Faster R-CNN training code is as follows:
```python
# RPN
# classification loss
rpn_cls_score = tf.reshape(self.net.get_output('rpn_cls_score_reshape'), [-1, 2])
rpn_label = tf.reshape(self.net.get_output('rpn-data')[0], [-1])
# keep only anchors with a valid label (-1 marks "don't care")
rpn_cls_score = tf.reshape(tf.gather(rpn_cls_score, tf.where(tf.not_equal(rpn_label, -1))), [-1, 2])
rpn_label = tf.reshape(tf.gather(rpn_label, tf.where(tf.not_equal(rpn_label, -1))), [-1])
rpn_cross_entropy = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(logits=rpn_cls_score, labels=rpn_label))

# bounding box regression L1 loss
rpn_bbox_pred = self.net.get_output('rpn_bbox_pred')
rpn_bbox_targets = tf.transpose(self.net.get_output('rpn-data')[1], [0, 2, 3, 1])
rpn_bbox_inside_weights = tf.transpose(self.net.get_output('rpn-data')[2], [0, 2, 3, 1])
rpn_bbox_outside_weights = tf.transpose(self.net.get_output('rpn-data')[3], [0, 2, 3, 1])
rpn_smooth_l1 = self._modified_smooth_l1(3.0, rpn_bbox_pred, rpn_bbox_targets,
                                         rpn_bbox_inside_weights, rpn_bbox_outside_weights)
rpn_loss_box = tf.reduce_mean(tf.reduce_sum(rpn_smooth_l1, reduction_indices=[1, 2, 3]))

# R-CNN
# classification loss
cls_score = self.net.get_output('cls_score')
label = tf.reshape(self.net.get_output('roi-data')[1], [-1])
cross_entropy = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(logits=cls_score, labels=label))

# bounding box regression L1 loss
bbox_pred = self.net.get_output('bbox_pred')
bbox_targets = self.net.get_output('roi-data')[2]
bbox_inside_weights = self.net.get_output('roi-data')[3]
bbox_outside_weights = self.net.get_output('roi-data')[4]
smooth_l1 = self._modified_smooth_l1(1.0, bbox_pred, bbox_targets,
                                     bbox_inside_weights, bbox_outside_weights)
loss_box = tf.reduce_mean(tf.reduce_sum(smooth_l1, reduction_indices=[1]))

# final loss
loss = cross_entropy + loss_box + rpn_cross_entropy + rpn_loss_box

# optimizer and learning rate
global_step = tf.Variable(0, trainable=False)
lr = tf.train.exponential_decay(cfg.TRAIN.LEARNING_RATE, global_step,
                                cfg.TRAIN.STEPSIZE, 0.1, staircase=True)
momentum = cfg.TRAIN.MOMENTUM
train_op = tf.train.MomentumOptimizer(lr, momentum).minimize(loss, global_step=global_step)
```
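The helper `_modified_smooth_l1` is not shown above. Assuming it follows the published Faster R-CNN formulation (the sigma argument of 3.0 vs 1.0 matches that paper's convention), the core of it is the sigma-parameterised smooth L1; the actual helper additionally multiplies by the inside/outside weights. A NumPy sketch of just the loss function:

```python
import numpy as np

def smooth_l1(x, sigma=1.0):
    """Sigma-parameterised smooth L1 loss, elementwise.

    Quadratic for |x| < 1/sigma^2, linear beyond that point, with the two
    pieces matching in value and slope at the boundary. Larger sigma shrinks
    the quadratic region.
    """
    s2 = sigma ** 2
    abs_x = np.abs(x)
    return np.where(abs_x < 1.0 / s2,
                    0.5 * s2 * x ** 2,   # quadratic region
                    abs_x - 0.5 / s2)    # linear region

# With sigma=1 this is the classic Huber loss with delta=1:
# smooth_l1(0.5) gives 0.125 (quadratic), smooth_l1(2.0) gives 1.5 (linear).
print(smooth_l1(np.array([0.5, 2.0])))
```

The quadratic region makes the loss differentiable at zero, while the linear tail keeps large box-regression errors from dominating the gradient the way a squared loss would.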
This computes the four component losses and sums them into the total loss:
loss = cross_entropy + loss_box + rpn_cross_entropy + rpn_loss_box
The loss is then used to optimise the network and update its parameters:
train_op = tf.train.MomentumOptimizer(lr, momentum).minimize(loss, global_step=global_step)
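For intuition, the update rule that `MomentumOptimizer` applies can be sketched in NumPy. TensorFlow's convention is that an accumulator folds in the raw gradient each step and the variable then moves by the learning rate times the accumulator; the quadratic objective below is just a toy stand-in for the detection loss:

```python
import numpy as np

# Minimise f(w) = 0.5 * ||w||^2 with momentum SGD (TF-style update).
w = np.array([5.0, -3.0])
accum = np.zeros_like(w)
lr, momentum = 0.1, 0.9

for _ in range(300):
    grad = w                         # df/dw for f = 0.5 * ||w||^2
    accum = momentum * accum + grad  # velocity accumulates past gradients
    w = w - lr * accum               # parameter update

print(w)                             # should be close to the minimiser [0, 0]
```

The accumulated velocity smooths out noisy per-step gradients and speeds progress along directions where successive gradients agree, which is why momentum is the default choice in this training script.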