Safe Operation: Annotation Files and Model Training

Design Approach

The starting point was to train a model. The open-source datasets we found online are 300W_LP, AFLW2000, and BIWI, so the first step was a scheme for annotating facial landmarks based on these data files. We had already tried the traditional route (Zhou and I annotated images with ImageMe earlier), but there are several thousand photos. If we build our own annotation files, marking the landmarks themselves is fine, but producing the .mat files is not, so we have to rely on annotations made by others, especially the pose annotations (how many degrees the head is rotated). Unfortunately we do not know what angles to annotate, and the .mat files consist mostly of pose data (I will not attach screenshots of them here). In short, after some thought, I decided to work around the lack of glasses and mask annotations in the existing data by drawing glasses and masks directly onto the face images.
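A minimal sketch of this idea (not from the project code), assuming a face bounding box is available from the existing annotations or from a detector; the input file name, the box coordinates, and the proportions used for the mask and glasses are all illustrative:

import cv2

def draw_mask_and_glasses(img, bbox):
    # bbox = (x1, y1, x2, y2) of the face; returns a copy of the image with a
    # synthetic mask over the lower part of the face and simple glasses over the eyes.
    x1, y1, x2, y2 = bbox
    out = img.copy()
    w, h = x2 - x1, y2 - y1

    # Mask: filled light-grey rectangle covering roughly the lower 45% of the face.
    mask_top = int(y1 + 0.55 * h)
    cv2.rectangle(out, (x1 + int(0.05 * w), mask_top),
                  (x2 - int(0.05 * w), y2), (200, 200, 200), thickness=-1)

    # Glasses: two circles plus a bridge around the assumed eye line (~40% down the face).
    eye_y = int(y1 + 0.40 * h)
    r = int(0.16 * w)
    left_eye = (int(x1 + 0.30 * w), eye_y)
    right_eye = (int(x1 + 0.70 * w), eye_y)
    cv2.circle(out, left_eye, r, (0, 0, 0), thickness=3)
    cv2.circle(out, right_eye, r, (0, 0, 0), thickness=3)
    cv2.line(out, (left_eye[0] + r, eye_y), (right_eye[0] - r, eye_y), (0, 0, 0), 3)
    return out

if __name__ == '__main__':
    img = cv2.imread('face.jpg')  # hypothetical input image
    assert img is not None, 'face.jpg not found'
    augmented = draw_mask_and_glasses(img, (80, 60, 240, 260))  # illustrative face box
    cv2.imwrite('face_mask_glasses.jpg', augmented)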
The work proceeds along two tracks: two machines train on 300W_LP and AFLW2000 respectively, and the original open-source project already provides some of the robust (pre-trained) files for these two.
As a later upgrade we decided to study the depth-based BIWI model. Running train_hopenet.py showed that the source dataset is missing the dockerface-frame_00001_rgb related .txt files, i.e. the annotation files with five values per detected face; after looking into it, these files are produced by the dockerface software, which needs to be downloaded.

Difficulties Encountered

1. Five training runs failed, i.e. the generated .pkl or .pth files had no effect.
2. BIWI requires GPU support with CUDA 8, so Ubuntu had to be reinstalled and downgraded to 16.04 to go with CUDA 8.
3. When the GPU could not be used, changing the --gpu 0 argument to the --cpu flag solved the problem of running the annotation script.

Major Breakthroughs

1. Discovered that .pkl and .pth files can be converted into each other, much like the relationship between jpg and png (see the sketch after this list).
2. All the trained models are based on the Hopenet network; the backbone in every case is the ResNet-50 model, i.e. https://download.pytorch.org/models/resnet50-19c8e357.pth.
3. During training, adjusting the alpha parameter and watching the training curve made it possible to train the model successfully.
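A minimal sketch of point 1, assuming a snapshot exists at the hypothetical path below: both extensions are produced by torch.save(), so "converting" one to the other is simply loading and re-saving.

import torch

# .pkl and .pth snapshots are both written with torch.save(), so the extension is
# only a naming convention; load under one name and save under the other.
state_dict = torch.load('output/snapshots/_epoch_1.pkl', map_location='cpu')
torch.save(state_dict, 'output/snapshots/_epoch_1.pth')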

Training on BIWI

The BIWI loader builds its bounding-box annotation path as bbox_path = os.path.join(self.data_dir, y_train_list[0] + '/dockerface-' + y_train_list[-1] + '_rgb' + self.annot_ext),
so these annotation files have to be generated first with https://github.com/natanielruiz/dockerface.
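For reference, here is a small sketch (not part of the original project) of reading one of those per-frame annotation files, assuming each line follows the format written by the detection script further below: the image name followed by x_min, y_min, x_max, y_max and a confidence score. The example path is hypothetical.

import os

def read_dockerface_bbox(txt_path):
    # Each line: "<image name> x_min y_min x_max y_max confidence";
    # return the box with the highest confidence, or None if the file is empty.
    best = None
    with open(txt_path) as f:
        for line in f:
            parts = line.split()
            if len(parts) < 6:
                continue
            x1, y1, x2, y2, conf = map(float, parts[-5:])
            if best is None or conf > best[4]:
                best = (x1, y1, x2, y2, conf)
    return None if best is None else best[:4]

# Hypothetical path following the bbox_path naming convention above.
print(read_dockerface_bbox(os.path.join('01', 'dockerface-frame_00001_rgb.txt')))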

from __future__ import division
import _init_paths
from fast_rcnn.config import cfg
from fast_rcnn.test import im_detect
from fast_rcnn.nms_wrapper import nms
from utils.timer import Timer
import matplotlib.pyplot as plt
import numpy as np
import scipy.io as sio
import caffe, os, sys, cv2
import argparse

# Dockerface network
NETS = {'vgg16': ('VGG16',
          './caffemodel')}

def parse_args():
  """Parse input arguments."""
  parser = argparse.ArgumentParser(description='Face Detection using Faster R-CNN')
  parser.add_argument('--gpu', dest='gpu_id', help='GPU device id to use [0]',
            default=0, type=int)
  parser.add_argument('--cpu', dest='cpu_mode',
            help='Use CPU mode (overrides --gpu)',
            action='store_true')
  parser.add_argument('--net', dest='demo_net', help='Network to use [vgg16]',
            choices=NETS.keys(), default='vgg16')
  parser.add_argument('--image', dest='image_path', help='Path of image')
  parser.add_argument('--output_string', dest='output_string', help='String appended to output file')
  parser.add_argument('--conf_thresh', dest='conf_thresh', help='Confidence threshold for the detections, float from 0 to 1', default=0.85, type=float)

  args = parser.parse_args()

  return args

if __name__ == '__main__':
  cfg.TEST.HAS_RPN = True  # Use RPN for proposals
  # cfg.TEST.BBOX_REG = False

  args = parse_args()

  # Default model-zoo paths (immediately overridden by the dockerface paths below).
  prototxt = os.path.join(cfg.MODELS_DIR, NETS[args.demo_net][0],
              'faster_rcnn_alt_opt', 'faster_rcnn_test.pt')
  caffemodel = os.path.join(cfg.DATA_DIR, 'faster_rcnn_models',
                NETS[args.demo_net][1])

  # Face-detection prototxt and caffemodel actually used by dockerface.
  prototxt = 'models/face/VGG16/faster_rcnn_end2end/test.prototxt'
  caffemodel = NETS[args.demo_net][1]

  if not os.path.isfile(caffemodel):
    raise IOError(('{:s} not found.\nDid you run ./data/script/'
             'fetch_faster_rcnn_models.sh?').format(caffemodel))

  if args.cpu_mode:
    caffe.set_mode_cpu()
  else:
    caffe.set_mode_gpu()
    caffe.set_device(args.gpu_id)
    cfg.GPU_ID = args.gpu_id
  net = caffe.Net(prototxt, caffemodel, caffe.TEST)

  print ('\n\nLoaded network {:s}'.format(caffemodel))

  # LOAD DATA FROM IMAGE
  out_dir = 'output/images/'

  if not os.path.exists(out_dir):
    os.makedirs(out_dir)

  dets_file_name = os.path.join(out_dir, 'image-output-%s.txt' % args.output_string)
  fid = open(dets_file_name, 'w')

  CONF_THRESH = args.conf_thresh
  NMS_THRESH = 0.15

  print (args.image_path)
  if not os.path.exists(args.image_path):
    print ('Image does not exist.')
    sys.exit(1)

  img = cv2.imread(args.image_path)

  # img is BGR cv2 image.
  # # Detect all object classes and regress object bounds
  scores, boxes = im_detect(net, img)

  # In this two-class detector index 0 is the background class and index 1 is the
  # face class, so only the class-1 boxes and scores are used below.
  cls_ind = 1
  cls_boxes = boxes[:, 4*cls_ind:4*(cls_ind + 1)]
  cls_scores = scores[:, cls_ind]
  dets = np.hstack((cls_boxes,
            cls_scores[:, np.newaxis])).astype(np.float32)
  keep = nms(dets, NMS_THRESH)
  dets = dets[keep, :]

  keep = np.where(dets[:, 4] > CONF_THRESH)
  dets = dets[keep]

  # dets are the upper left and lower right coordinates of bbox
  # dets[:, 0] = x_ul, dets[:, 1] = y_ul
  # dets[:, 2] = x_lr, dets[:, 3] = y_lr

  if (dets.shape[0] != 0):
      for j in range(dets.shape[0]):
        # Write file_name bbox_coords
        fid.write(args.image_path.split('/')[-1] + ' %f %f %f %f %f\n' % (dets[j, 0], dets[j, 1], dets[j, 2], dets[j, 3], dets[j, 4]))
        # Draw bbox
        cv2.rectangle(img,(int(dets[j, 0]), int(dets[j, 1])),(int(dets[j, 2]), int(dets[j, 3])),(0,255,0),3)

  cv2.imwrite('output/images/output-%s.png' % args.output_string, img)
  print ('Detected faces in image: ' + args.image_path)
  print ('Done with detection.')

  fid.close()

Running the container

sudo nvidia-docker run -it -v $PWD/data:/opt/py-faster-rcnn/edata -v $PWD/output/video:/opt/py-faster-rcnn/output/video -v $PWD/output/images:/opt/py-faster-rcnn/output/images natanielruiz/dockerface:latest

To enter the container: if it has exited, start it first; if it is still running, attach to it directly.

sudo docker ps -a
sudo docker start CONTAINER_ID
sudo docker attach CONTAINER_ID

The detection script above does the following:
loads the Faster R-CNN face-detection network;
parses the command-line arguments (GPU device id, network, input image path, and so on);
loads the input image and runs face detection with the Faster R-CNN model;
applies non-maximum suppression and a confidence threshold to the detected boxes;
writes the remaining face bounding boxes to an output text file and draws them on the image;
saves the annotated image and prints a completion message.

Training on AFLW2000

import argparse
import os
import sys

import torch
import torch.backends.cudnn as cudnn
import torch.nn as nn
import torch.utils.model_zoo as model_zoo
import torchvision
from torch.autograd import Variable
from torch.utils.data import DataLoader
from torchvision import transforms

import datasets
import hopenet


def parse_args():
    """Parse input arguments."""
    parser = argparse.ArgumentParser(description='Head pose estimation using the Hopenet network.')
    parser.add_argument('--gpu', dest='gpu_id', help='GPU device id to use [0]',
            default=0, type=int)
    parser.add_argument('--num_epochs', dest='num_epochs', help='Maximum number of training epochs.',
          default=10, type=int)
    parser.add_argument('--batch_size', dest='batch_size', help='Batch size.',
          default=16, type=int)
    parser.add_argument('--lr', dest='lr', help='Base learning rate.',
          default=0.0001, type=float)
    parser.add_argument('--dataset', dest='dataset', help='Dataset type.', default='AFLW2000', type=str)
    parser.add_argument('--data_dir', dest='data_dir', help='Directory path for data.',
          default=r'C:\Demos\face_detection1\hopenet_data\AFLW2000', type=str)
    parser.add_argument('--filename_list', dest='filename_list', help='Path to text file containing relative paths for every example.',
          default=r'C:\Demos\face_detection1\hopenet_data\files.txt', type=str)
    parser.add_argument('--output_string', dest='output_string', help='String appended to output snapshots.', default = '', type=str)
    parser.add_argument('--alpha', dest='alpha', help='Regression loss coefficient.',
          default=1, type=float)
    parser.add_argument('--snapshot', dest='snapshot', help='Path of model snapshot.',
          default='', type=str)

    args = parser.parse_args()
    return args

def get_ignored_params(model):
    # Generator function that yields ignored params.
    b = [model.conv1, model.bn1, model.fc_finetune]
    for i in range(len(b)):
        for module_name, module in b[i].named_modules():
            if 'bn' in module_name:
                module.eval()
            for name, param in module.named_parameters():
                yield param

def get_non_ignored_params(model):
    # Generator function that yields params that will be optimized.
    b = [model.layer1, model.layer2, model.layer3, model.layer4]
    for i in range(len(b)):
        for module_name, module in b[i].named_modules():
            if 'bn' in module_name:
                module.eval()
            for name, param in module.named_parameters():
                yield param

def get_fc_params(model):
    # Generator function that yields fc layer params.
    b = [model.fc_yaw, model.fc_pitch, model.fc_roll]
    for i in range(len(b)):
        for module_name, module in b[i].named_modules():
            for name, param in module.named_parameters():
                yield param

def load_filtered_state_dict(model, snapshot):
    # By user apaszke from discuss.pytorch.org

    model_dict = model.state_dict()
    snapshot = {k: v for k, v in snapshot.items() if k in model_dict}
    model_dict.update(snapshot)
    model.load_state_dict(model_dict)

if __name__ == '__main__':
    args = parse_args()

    cudnn.enabled = True
    num_epochs = args.num_epochs
    batch_size = args.batch_size
    gpu = args.gpu_id

    if not os.path.exists('output/snapshots'):
        os.makedirs('output/snapshots')

    # ResNet50 structure
    model = hopenet.Hopenet(torchvision.models.resnet.Bottleneck, [3, 4, 6, 3], 66)

    if args.snapshot == '':
        load_filtered_state_dict(model, model_zoo.load_url('https://download.pytorch.org/models/resnet50-19c8e357.pth'))
    else:
        saved_state_dict = torch.load(args.snapshot)
        model.load_state_dict(saved_state_dict)

    print ('Loading data.')

    transformations = transforms.Compose([transforms.Resize(240),
    transforms.RandomCrop(224), transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])])

    if args.dataset == 'Pose_300W_LP':
        pose_dataset = datasets.Pose_300W_LP(args.data_dir, args.filename_list, transformations)
    elif args.dataset == 'Pose_300W_LP_random_ds':
        pose_dataset = datasets.Pose_300W_LP_random_ds(args.data_dir, args.filename_list, transformations)
    elif args.dataset == 'Synhead':
        pose_dataset = datasets.Synhead(args.data_dir, args.filename_list, transformations)
    elif args.dataset == 'AFLW2000':
        pose_dataset = datasets.AFLW2000(args.data_dir, args.filename_list, transformations)
    elif args.dataset == 'BIWI':
        pose_dataset = datasets.BIWI(args.data_dir, args.filename_list, transformations)
    elif args.dataset == 'AFLW':
        pose_dataset = datasets.AFLW(args.data_dir, args.filename_list, transformations)
    elif args.dataset == 'AFLW_aug':
        pose_dataset = datasets.AFLW_aug(args.data_dir, args.filename_list, transformations)
    elif args.dataset == 'AFW':
        pose_dataset = datasets.AFW(args.data_dir, args.filename_list, transformations)
    else:
        print ('Error: not a valid dataset name')
        sys.exit()

    train_loader = torch.utils.data.DataLoader(dataset=pose_dataset,
                                               batch_size=batch_size,
                                               shuffle=True,
                                               num_workers=2)

    model.cuda(gpu)
    criterion = nn.CrossEntropyLoss().cuda(gpu)
    reg_criterion = nn.MSELoss().cuda(gpu)
    # Regression loss coefficient
    alpha = args.alpha

    softmax = nn.Softmax(dim=1).cuda(gpu)
    idx_tensor = [idx for idx in range(66)]
    idx_tensor = Variable(torch.FloatTensor(idx_tensor)).cuda(gpu)

    optimizer = torch.optim.Adam([{'params': get_ignored_params(model), 'lr': 0},
                                  {'params': get_non_ignored_params(model), 'lr': args.lr},
                                  {'params': get_fc_params(model), 'lr': args.lr * 5}],
                                   lr = args.lr)

    print ('Ready to train network.')
    for epoch in range(num_epochs):
        for i, (images, labels, cont_labels, name) in enumerate(train_loader):
            images = Variable(images).cuda(gpu)

            # Binned labels
            label_yaw = Variable(labels[:,0]).cuda(gpu)
            label_pitch = Variable(labels[:,1]).cuda(gpu)
            label_roll = Variable(labels[:,2]).cuda(gpu)

            # Continuous labels
            label_yaw_cont = Variable(cont_labels[:,0]).cuda(gpu)
            label_pitch_cont = Variable(cont_labels[:,1]).cuda(gpu)
            label_roll_cont = Variable(cont_labels[:,2]).cuda(gpu)

            # Forward pass
            yaw, pitch, roll = model(images)

            # Cross entropy loss
            loss_yaw = criterion(yaw, label_yaw)
            loss_pitch = criterion(pitch, label_pitch)
            loss_roll = criterion(roll, label_roll)
            print("name:", str(name))
            # MSE loss
            yaw_predicted = softmax(yaw)
            pitch_predicted = softmax(pitch)
            roll_predicted = softmax(roll)

            # Expected value over the 66 bins (each 3 degrees wide, covering [-99, 99])
            # turns the classification output into a continuous angle prediction.
            yaw_predicted = torch.sum(yaw_predicted * idx_tensor, 1) * 3 - 99
            pitch_predicted = torch.sum(pitch_predicted * idx_tensor, 1) * 3 - 99
            roll_predicted = torch.sum(roll_predicted * idx_tensor, 1) * 3 - 99

            loss_reg_yaw = reg_criterion(yaw_predicted, label_yaw_cont)
            loss_reg_pitch = reg_criterion(pitch_predicted, label_pitch_cont)
            loss_reg_roll = reg_criterion(roll_predicted, label_roll_cont)

            # Total loss per angle: cross-entropy over the bins plus the alpha-weighted MSE regression loss
            loss_yaw += alpha * loss_reg_yaw
            loss_pitch += alpha * loss_reg_pitch
            loss_roll += alpha * loss_reg_roll

            loss_seq = [loss_yaw, loss_pitch, loss_roll]
            grad_seq = [torch.tensor(1.0).cuda(gpu) for _ in range(len(loss_seq))]
            optimizer.zero_grad()
            torch.autograd.backward(loss_seq, grad_seq)
            optimizer.step()

            if (i+1) % 100 == 0:
                print(i)
                # print ('Epoch [%d/%d], Iter [%d/%d] Losses: Yaw %.4f, Pitch %.4f, Roll %.4f'
                #        %(epoch+1, num_epochs, i+1, len(pose_dataset)//batch_size, loss_yaw.data[0], loss_pitch.data[0], loss_roll.data[0]))
        # torch.cuda.empty_cache()
        # Save models at numbered epochs.
        if epoch % 1 == 0 and epoch < num_epochs:
            print ('Taking snapshot...')
            torch.save(model.state_dict(),
            'output/snapshots/' + args.output_string + '_epoch_'+ str(epoch+1) + '.pkl')

What is used as the reference model

By default the ImageNet ResNet-50 weights at https://download.pytorch.org/models/resnet50-19c8e357.pth are loaded; in practice the reference we start from is the model trained on 300W_LP, supplied through the --snapshot argument:

parser.add_argument('--snapshot', dest='snapshot', help='Path of model snapshot.',
          default='', type=str)
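As a quick check (a sketch, not part of the project), a snapshot's state_dict can be inspected to tell whether it is a Hopenet/300W_LP checkpoint or the plain ImageNet ResNet-50 file: only the former contains the fc_yaw/fc_pitch/fc_roll heads. The file name below is hypothetical.

import torch

state_dict = torch.load('hopenet_snapshot.pkl', map_location='cpu')  # hypothetical snapshot file
# Hopenet snapshots contain the three pose heads; the ImageNet ResNet-50 file does not.
has_pose_heads = any(key.startswith('fc_yaw') for key in state_dict.keys())
print('Hopenet head-pose snapshot' if has_pose_heads else 'plain ResNet-50 backbone')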

Installing Ubuntu 16.04

Restart the machine, press F12 to enter the installation menu, and follow the steps in order.
