Machine Learning: Face Detection with Caffe (Part 11)

lidashent

Article link: https://blog.csdn.net/lidashent/article/details/121573842

Preface

Goal: given an image, draw a bounding box around each face in it.

Implementing face detection

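1·Data format

1·1 Preparing face images

Start from face and non-face images that have already been sorted into separate classes.

Label format:
aaa.jpg x1,y1 x2,y2
The two coordinate pairs are opposite corners of one annotated face box; the model has to learn the features of the region inside it.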
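As a rough illustration of the label format above, the sketch below parses one annotation line. It assumes integer pixel coordinates, whitespace between the three fields, and file names without spaces; `FaceLabel` and `parse_label_line` are hypothetical names, not part of any dataset toolkit.

```python
from typing import NamedTuple


class FaceLabel(NamedTuple):
    filename: str
    x1: int
    y1: int
    x2: int
    y2: int


def parse_label_line(line: str) -> FaceLabel:
    """Parse one annotation line of the form 'aaa.jpg x1,y1 x2,y2'."""
    filename, corner1, corner2 = line.split()
    x1, y1 = (int(v) for v in corner1.split(","))
    x2, y2 = (int(v) for v in corner2.split(","))
    # Normalise so (x1, y1) is the top-left corner and (x2, y2) the bottom-right.
    return FaceLabel(filename, min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))


print(parse_label_line("aaa.jpg 30,40 130,160"))
```

Normalising the corner order up front means later code can compute widths and heights without worrying about which corner was listed first.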

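Do we need to prepare the data ourselves?
For a project of our own, yes. For learning purposes, others have already done this work, and their annotated datasets can be used directly.
1. Benchmarks. A benchmark is the industry reference for a task and usually bundles a dataset, papers, source code, baselines, and results. Papers record experiments that have already been run, and the datasets that accompany them can be downloaded. Most come from research groups abroad and are released for academic use only, with commercial use prohibited, so request access from a university email address; a request from a non-.edu address will probably be rejected.
2. Forums, where people discuss and share datasets.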

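Suppose we now have 10,000 face images, each with an annotated face box. Crop random squares from each image and measure how much each crop overlaps the face box using IoU (intersection over union: the area of the overlap divided by the area of the union).
1. Crops of the annotated face box are face samples. A random crop whose IoU with the face box is greater than 0.8 also counts as a face.
2. Crops whose IoU with the face box is less than 0.3 are non-face samples.
This yields 15,000 face samples and 15,000 non-face samples, a substantial expansion of the dataset.
Partially occluded faces are common in real life, so including these imperfect crops also makes the detector more robust.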
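The cropping rule above can be sketched as follows. This is a minimal illustration, not the author's actual preprocessing script: boxes are assumed to be `(x1, y1, x2, y2)` tuples, crops that fall between the two thresholds are assumed to be discarded, and `random_square_crop` with its `min_side` parameter is a hypothetical helper.

```python
import random


def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_crop(crop, face_box, pos_thresh=0.8, neg_thresh=0.3):
    """Return 'face', 'non-face', or None for an ambiguous crop."""
    overlap = iou(crop, face_box)
    if overlap > pos_thresh:
        return "face"
    if overlap < neg_thresh:
        return "non-face"
    return None  # assumption: crops between the thresholds are dropped


def random_square_crop(img_w, img_h, rng, min_side=24):
    """Pick a random square (x1, y1, x2, y2) that lies inside the image."""
    side = rng.randint(min_side, min(img_w, img_h))
    x1 = rng.randint(0, img_w - side)
    y1 = rng.randint(0, img_h - side)
    return (x1, y1, x1 + side, y1 + side)


rng = random.Random(0)
face = (100, 60, 200, 160)
labels = [label_crop(random_square_crop(300, 200, rng), face) for _ in range(1000)]
print({name: labels.count(name) for name in ("face", "non-face", None)})
```

Purely random crops rarely reach an IoU above 0.8, so in practice positive crops are usually sampled as small perturbations of the annotated box rather than from the whole image.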

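The original annotations are made by hand, however, and are not always accurate: faces get missed or mislabeled. As a result some faces end up in the non-face set, which is very damaging.

In the training set this noise can be tolerated.
That covers image preparation.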

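2·Generating the LMDB files

This step is explained in detail in an earlier post. After running it you will have a training LMDB and a validation LMDB.
https://blog.csdn.net/lidashent/article/details/121464092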

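3·Editing the network definition file

AlexNet can be used as-is. Only the last fully connected layer needs to change: set its num_output to the number of classes you need.
The architecture is general-purpose and works fine for face detection.
train.prototxt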

#############################  DATA Layer  #############################
name: "face_train_val"
layer {
  top: "data"
  top: "label"
  name: "data"
  type: "Data"
  data_param {
    source: "C:/Users/Administrator.DESKTOP-KMH7HN6/Downloads/li_test_net/my_face_detect/data_set/face_train_lmdb"
    backend:LMDB
    batch_size: 64
  }
  transform_param {
# Mean subtraction (disabled here): in practice it made little difference, although it does center the data evenly around zero
     #mean_file: "C:/Users/Administrator.DESKTOP-KMH7HN6/Downloads/li_test_net/my_face_detect/data_set/imagenet_mean.binaryproto"
# Random mirroring, which effectively doubles the data
     mirror: true
  }
  include: { phase: TRAIN }
}

layer {
  top: "data"
  top: "label"
  name: "data"
  type: "Data"
  data_param {
    source: "C:/Users/Administrator.DESKTOP-KMH7HN6/Downloads/li_test_net/my_face_detect/data_set/face_val_lmdb"
    backend:LMDB
    batch_size: 64
  }
  transform_param {
    #mean_file: "C:/Users/Administrator.DESKTOP-KMH7HN6/Downloads/li_test_net/my_face_detect/data_set/imagenet_mean.binaryproto"
    mirror: true
  }
  include: { 
    phase: TEST 
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
# Learning-rate multiplier relative to the base learning rate; 1x here
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
# Initialization of the weights (w) and biases (b)
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "conv1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "norm1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    pad: 2
    kernel_size: 5
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "conv2"
  top: "norm2"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "norm2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "conv4"
  top: "conv4"
}
layer {
  name: "conv5"
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
}
layer {
  name: "relu5"
  type: "ReLU"
  bottom: "conv5"
  top: "conv5"
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5"
  top: "pool5"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "gaussian"
      std: 0.005
    }
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "gaussian"
      std: 0.005
    }
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
}
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param {
    dropout_ratio: 0.5
  }
}
# Final fully connected layer
layer {
  name: "fc8-expr"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8-expr"
  param {
    lr_mult: 10
    decay_mult: 1
  }
  param {
    lr_mult: 20
    decay_mult: 0
  }
# Face detection is a binary classification (face / non-face), so num_output is 2
  inner_product_param {
    num_output: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc8-expr"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8-expr"
  bottom: "label"
  top: "loss"
}
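
To see how the layers above shrink the input, the following sketch traces the spatial size through the network. It assumes a 227x227 input (the window size used later in this post) and uses Caffe's rounding rules: convolution rounds down, pooling rounds up. `LAYERS`, `output_size`, and `trace` are names made up for this illustration.

```python
import math

# (name, kind, kernel, stride, pad) for the layers that change spatial size.
LAYERS = [
    ("conv1", "conv", 11, 4, 0),
    ("pool1", "pool", 3, 2, 0),
    ("conv2", "conv", 5, 1, 2),
    ("pool2", "pool", 3, 2, 0),
    ("conv3", "conv", 3, 1, 1),
    ("conv4", "conv", 3, 1, 1),
    ("conv5", "conv", 3, 1, 1),
    ("pool5", "pool", 3, 2, 0),
]


def output_size(size, kind, kernel, stride, pad):
    span = size + 2 * pad - kernel
    if kind == "conv":
        return span // stride + 1        # convolution rounds down
    return math.ceil(span / stride) + 1  # pooling rounds up


def trace(input_size):
    """Return per-layer output sizes and the product of all strides."""
    sizes, size, total_stride = {}, input_size, 1
    for name, kind, kernel, stride, pad in LAYERS:
        size = output_size(size, kind, kernel, stride, pad)
        total_stride *= stride
        sizes[name] = size
    return sizes, total_stride


sizes, total_stride = trace(227)
print(sizes)
print("fc6 inputs:", 256 * sizes["pool5"] ** 2, "effective stride:", total_stride)
```

The two derived numbers matter later: fc6 sees a 256x6x6 volume, and one step on the final feature map corresponds to 32 pixels in the input image.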

Solver file

net: "C:/Users/Administrator.DESKTOP-KMH7HN6/Downloads/li_test_net/my_face_detect/data_set/train.prototxt"
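# Number of batches per test pass. Raise it if your machine can handle it; ideally test_iter x batch_size covers the whole validation set
test_iter: 100
# Run a test pass every this many training iterations
test_interval: 500
# lr for fine-tuning should be lower than when starting from scratch
# Base learning rate. Each layer's effective rate is this value times its lr_mult; the values set in this file are hyperparameters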
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
# stepsize should also be lower, as we're closer to being done
stepsize: 20000
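# Print training status every this many iterations
display: 100
# Maximum number of iterations
max_iter: 100000
momentum: 0.9
weight_decay: 0.0005
# Save a snapshot of the model every this many iterations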
snapshot: 10000
snapshot_prefix: "C:/Users/Administrator.DESKTOP-KMH7HN6/Downloads/li_test_net/my_face_detect/data_set/model"
# uncomment the following to default to CPU mode solving
# solver_mode: CPU
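
The learning-rate settings above can be checked with a few lines of arithmetic. The sketch below reproduces Caffe's "step" policy, where the rate is `base_lr * gamma ** floor(iter / stepsize)`, and the per-layer scaling by `lr_mult`. The function names are hypothetical; the default values are copied from the solver file.

```python
def step_lr(iteration, base_lr=0.001, gamma=0.1, stepsize=20000):
    """Global learning rate under the "step" policy."""
    return base_lr * gamma ** (iteration // stepsize)


def layer_lr(iteration, lr_mult):
    """Effective rate for one parameter blob: global rate times its lr_mult."""
    return step_lr(iteration) * lr_mult


for it in (0, 20000, 40000, 99999):
    # lr_mult=10 is the weight multiplier of the fc8-expr layer.
    print(it, step_lr(it), layer_lr(it, 10))

# Images seen in one test pass: test_iter * batch_size.
print("images per test pass:", 100 * 64)
```

With these settings the rate drops by 10x every 20,000 iterations, so by the end of the 100,000 iterations it has been reduced four times.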

4·Training the network

I added the directory containing caffe.exe to the system PATH so that I do not have to type the full path every time.

caffe.exe  train --solver=C:\Users\Administrator.DESKTOP-KMH7HN6\Downloads\li_test_net\my_face_detect\data_set\solver.prototxt

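Wait for training to finish.

Factors that affect training speed:
1. Model size: a network with hundreds of layers trains far more slowly than one with only a few.
2. Input size: 32x32 inputs train much faster than 227x227 inputs; the gap can be on the order of 100x (the pixel count alone differs by about 50x).
At this point the model that decides whether an image contains a face has been trained.
That is not the end goal, though: we still need to know where the face is.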

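5·Locating faces

Sliding window
Take a fixed-size window and slide it across the image; when it lands on a face, mark that position.

Faces come in different sizes, though, so how can the window still frame a face at any scale?
One option is to grow the window step by step: try 27x27, and if nothing is found keep enlarging it until, say, 227x227 finds the face.
In practice it is the image that gets rescaled instead: build a series of scaled copies of the image from large to small, slide the fixed window over each copy, and map any hit back to the original image size.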
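
The image pyramid can be sketched as a list of scale factors. The function below mirrors the loop used in the detection script later in this post (start at up to 2x, shrink by a factor of about 0.7937, which is 2 to the power -1/3, until the shorter side drops below the 227-pixel window); `pyramid_scales` and its parameter names are made up for this illustration.

```python
def pyramid_scales(height, width, window=227, factor=0.793700526,
                   max_scale=2.0, max_side=4000):
    """Scale factors at which the image is still at least one window tall and wide."""
    largest = min(max_scale, max_side / max(height, width))
    scales = []
    scale = largest
    min_side = largest * min(height, width)
    while min_side >= window:
        scales.append(scale)
        scale *= factor
        min_side *= factor
    return scales


scales = pyramid_scales(480, 640)
print(len(scales), [round(s, 3) for s in scales])

# A box found on a scaled copy maps back to the original by dividing by the scale.
box_on_scaled = (64, 32, 290, 258)
print(tuple(v / scales[0] for v in box_on_scaled))
```

Because the factor is the cube root of one half, every three pyramid levels halve the image, which gives reasonably fine coverage of face sizes.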

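There is a catch. To classify the content of a window, the network ends in fully connected layers whose weights are tied to a fixed input size, so it cannot accept images that keep changing size.
The fully connected layers are therefore converted into convolution layers, which makes the network fully convolutional.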
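
Why this conversion is legitimate can be shown on a toy example. A fully connected layer that reads a flattened C x K x K patch computes the same numbers as a convolution whose kernels are the same weights reshaped to (C, K, K); on a larger input the convolution simply repeats that computation at every position. The sketch uses tiny made-up sizes (C=2, K=2, 3 outputs) in plain Python; for the network above the corresponding fc6 reshape would be from 4096 x 9216 to 4096 x 256 x 6 x 6.

```python
import random


def fc_forward(weights, flat_input):
    """Fully connected layer: one dot product per output unit."""
    return [sum(w * x for w, x in zip(row, flat_input)) for row in weights]


def reshape_fc_to_conv(weights, channels, k):
    """View each FC row of length channels*k*k as a (channels, k, k) kernel."""
    return [[[row[(c * k + i) * k:(c * k + i) * k + k] for i in range(k)]
             for c in range(channels)] for row in weights]


def conv_forward(kernels, image):
    """Valid convolution (stride 1, no padding) over a (C, H, W) image."""
    channels, k = len(image), len(kernels[0][0])
    out_h, out_w = len(image[0]) - k + 1, len(image[0][0]) - k + 1
    return [[[sum(kern[c][i][j] * image[c][y + i][x + j]
                  for c in range(channels) for i in range(k) for j in range(k))
              for x in range(out_w)] for y in range(out_h)] for kern in kernels]


rng = random.Random(0)
C, K, N = 2, 2, 3
weights = [[rng.uniform(-1, 1) for _ in range(C * K * K)] for _ in range(N)]
kernels = reshape_fc_to_conv(weights, C, K)

# On a patch of exactly K x K, the convolution output is 1 x 1 and equals the FC output.
patch = [[[rng.uniform(-1, 1) for _ in range(K)] for _ in range(K)] for _ in range(C)]
flat = [v for channel in patch for row in channel for v in row]
fc_out = fc_forward(weights, flat)
conv_out = conv_forward(kernels, patch)
print(fc_out, [m[0][0] for m in conv_out])

# On a larger input the same weights yield a whole map of scores.
big = [[[rng.uniform(-1, 1) for _ in range(5)] for _ in range(4)] for _ in range(C)]
heat = conv_forward(kernels, big)
print(len(heat), len(heat[0]), len(heat[0][0]))
```

The weights are unchanged by the conversion, so no retraining is needed; only the layer type and blob shapes differ.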

So is this step training or testing?
Testing. Detection is just inference; the face classification model itself was already trained above.

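Overall pipeline:
1. Convert the model to a fully convolutional network.
2. Build several scaled copies of the image.
3. Run a forward pass on each copy to get a feature map, i.e. a matrix of face probabilities per scale.
4. Find the face cells in each feature map, record the box coordinates, and map them back to true coordinates in the original image by undoing the scaling.
5. Apply NMS (non-maximum suppression) to remove heavily overlapping boxes: several boxes may mark the same face, so keep only the one with the highest probability.
The result is shown in the figure.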
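
Steps 4 and 5 of the pipeline above can be sketched in plain Python. This is an illustration under stated assumptions, not the script that follows: the stride of 32, window of 227, and probability threshold of 0.95 match the values used in this post, while the greedy NMS with an IoU threshold of 0.3 is the textbook variant. The script below instead groups boxes with `cv2.groupRectangles` and averages each group.

```python
def boxes_from_heatmap(heatmap, scale, stride=32, window=227, threshold=0.95):
    """Map heat-map cells above threshold to boxes in original-image pixels."""
    boxes = []
    for row, line in enumerate(heatmap):
        for col, prob in enumerate(line):
            if prob >= threshold:
                x1, y1 = col * stride, row * stride
                boxes.append((x1 / scale, y1 / scale,
                              (x1 + window - 1) / scale,
                              (y1 + window - 1) / scale, prob))
    return boxes


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2, ...)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, iou_thresh=0.3):
    """Greedy NMS: keep the best box, drop the boxes that overlap it too much."""
    remaining = sorted(boxes, key=lambda b: b[4], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        remaining = [b for b in remaining if iou(best, b) <= iou_thresh]
    return kept


heatmap = [[0.10, 0.99],
           [0.97, 0.20]]  # toy 2x2 probability map for one pyramid level
candidates = boxes_from_heatmap(heatmap, scale=0.5)
print(candidates)
print(nms(candidates))
```

Dividing by the scale is what lets boxes from different pyramid levels be compared in one coordinate system before suppression.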

import numpy as np
import matplotlib.pyplot as plt
# import Image
import sys
import os
from math import pow
from PIL import Image, ImageDraw, ImageFont
import cv2
import math
import random

# caffe_root = '/home/matt/Documents/caffe/'
#
# sys.path.insert(0, caffe_root + 'python')
os.environ['GLOG_minloglevel'] = '2'
import caffe

# caffe.set_device(0)
caffe.set_mode_cpu()


class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y


class Rect(object):
    def __init__(self, p1, p2):
        '''Store the top, bottom, left and right values for points
               p1 and p2 are the (corners) in either order
        '''
        self.left = min(p1.x, p2.x)
        self.right = max(p1.x, p2.x)
        self.bottom = min(p1.y, p2.y)
        self.top = max(p1.y, p2.y)

    def __str__(self):
        return "Rect[%d, %d, %d, %d]" % (self.left, self.top, self.right, self.bottom)


def calculateDistance(x1, y1, x2, y2):
    dist = math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)
    return dist


def range_overlap(a_min, a_max, b_min, b_max):
    '''Neither range is completely greater than the other
    '''
    return (a_min <= b_max) and (b_min <= a_max)


def rect_overlaps(r1, r2):
    return range_overlap(r1.left, r1.right, r2.left, r2.right) and range_overlap(r1.bottom, r1.top, r2.bottom, r2.top)


def rect_merge(r1, r2, mergeThresh):
    # centralPt1 = Point((r1.left + r1.right)/2,(r1.top + r1.bottom)/2)
    # centralPt2 = Point((r2.left + r2.right)/2,(r2.top + r2.bottom)/2)
    if rect_overlaps(r1, r2):
        # dist = calculateDistance((r1.left + r1.right)/2, (r1.top + r1.bottom)/2, (r2.left + r2.right)/2, (r2.top + r2.bottom)/2)
        SI = abs(min(r1.right, r2.right) - max(r1.left, r2.left)) * abs(max(r1.bottom, r2.bottom) - min(r1.top, r2.top))
        SA = abs(r1.right - r1.left) * abs(r1.bottom - r1.top)
        SB = abs(r2.right - r2.left) * abs(r2.bottom - r2.top)
        S = SA + SB - SI
        ratio = float(SI) / float(S)
        if ratio > mergeThresh:
            return 1
    return 0


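# Heat map -> candidate boxes
def generateBoundingBox(featureMap, scale):
    boundingBox = []
    # Effective stride of the network: the product of the strides of all layers,
    # 4 (conv1) * 2 (pool1) * 2 (pool2) * 2 (pool5) = 32
    stride = 32
    # Size of the detection window
    cellSize = 227
    # 227 x 227 cell, stride=32
    # Return the coordinates of every candidate box together with its face probability
    for (x, y), prob in np.ndenumerate(featureMap):
        if (prob >= 0.95):
            print(prob)
            # Map from feature-map coordinates back to coordinates in the original image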
            boundingBox.append(
                [float(stride * y) / scale, float(x * stride) / scale, float(stride * y + cellSize - 1) / scale,
                 float(stride * x + cellSize - 1) / scale, prob])
    # sort by prob, from max to min.
    # boxes = np.array(boundingBox)
    return boundingBox


def nms_average(boxes, groupThresh=2, overlapThresh=0.2):
    rects = []
    temp_boxes = []
    weightslist = []
    new_rects = []
    # print 'boxes: ', boxes
    for i in range(len(boxes)):
        if boxes[i][4] > 0.2:
            # cv2.groupRectangles expects integer (x, y, w, h) rectangles
            rects.append([int(boxes[i, 0]), int(boxes[i, 1]), int(boxes[i, 2] - boxes[i, 0]), int(boxes[i, 3] - boxes[i, 1])])
    # print 'rects: ', rects
    # for i in range(len(rects)):
    #     rects.append(rects[i])

    rects, weights = cv2.groupRectangles(rects, groupThresh, overlapThresh)
    #######################test#########
    rectangles = []
    for i in range(len(rects)):
        # A______
        # |      |
        # -------B

        #                       A                                       B
        testRect = Rect(Point(rects[i, 0], rects[i, 1]), Point(rects[i, 0] + rects[i, 2], rects[i, 1] + rects[i, 3]))
        rectangles.append(testRect)
    clusters = []
    for rect in rectangles:
        matched = 0
        for cluster in clusters:
            if (rect_merge(rect, cluster, 0.2)):
                matched = 1
                cluster.left = (cluster.left + rect.left) / 2
                cluster.right = (cluster.right + rect.right) / 2
                cluster.top = (cluster.top + rect.top) / 2
                cluster.bottom = (cluster.bottom + rect.bottom) / 2

        if (not matched):
            clusters.append(rect)
    # print "Clusters:"
    # for c in clusters:
    #     print c
    ###################################
    result_boxes = []
    for i in range(len(clusters)):
        # result_boxes.append([rects[i,0], rects[i,1], rects[i,0]+rects[i,2], rects[i,1]+rects[i,3], 1])
        result_boxes.append([clusters[i].left, clusters[i].bottom, clusters[i].right, clusters[i].top, 1])
    # print 'result_boxes: ', result_boxes
    return result_boxes


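# Face detection
def face_detection(imgFile):
    # deploy_full_conv.prototxt has the same structure as the training network but no data layers.
    # Arguments: network definition, trained weights, and the phase (caffe.TEST for inference).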
    net_full_conv = caffe.Net(r'C:\Users\Administrator.DESKTOP-KMH7HN6\Downloads\li_test_net\my_face_detect\data_set\deploy_full_conv.prototxt',
                              r'C:\Users\Administrator.DESKTOP-KMH7HN6\Downloads\li_test_net\my_face_detect\data_set\alexnet_iter_50000_full_conv.caffemodel',
                              caffe.TEST)
    randNum = random.randint(1, 10000)

    # Scale factors of the image pyramid
    scales = []
    factor = 0.793700526

    # img = Image.open(imgFile.strip())
    # img = img.convert('RGB')

    img = cv2.imread(imgFile)
    print(img.shape)
    # Build the image pyramid: the same image rescaled to several sizes
    largest = min(2, 4000 / max(img.shape[0:2]))
    scale = largest
    minD = largest * min(img.shape[0:2])
    # The shorter side must stay at least 227 pixels
    while minD >= 227:
        scales.append(scale)
        scale *= factor
        minD *= factor

    total_boxes = []

    for scale in scales:
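        # Resize the image, keeping the aspect ratio. cv2.resize expects (width, height),
        # while img.shape is (height, width, channels).
        scale_img = cv2.resize(img, (int(img.shape[1] * scale), int(img.shape[0] * scale)))
        # Save the scaled image to disk
        scaled_path = r'C:\Users\Administrator.DESKTOP-KMH7HN6\Downloads\li_test_net\my_face_detect\data_set\myscale.jpg'
        cv2.imwrite(scaled_path, scale_img)
        # scale_img.save("tmp{0}.jpg".format(randNum))
        # load input and configure preprocessing
        # im = caffe.io.load_image("tmp{0}.jpg".format(randNum))
        # Load the scaled image that was just saved (RGB, float values in [0, 1])
        im = caffe.io.load_image(scaled_path)
        # Reshape the input blob to (batch, channels, height, width) of the scaled image
        net_full_conv.blobs['data'].reshape(1, 3, scale_img.shape[0], scale_img.shape[1])
        # Build the transformer that converts the image into the data layer's layout
        transformer = caffe.io.Transformer({'data': net_full_conv.blobs['data'].data.shape})
        # Subtract the per-channel mean (note: mean_file is commented out in train.prototxt above)
        transformer.set_mean('data', np.load(
            r"C:\Users\Administrator.DESKTOP-KMH7HN6\Downloads\Compressed\caffer_data\caffe-windows\python\caffe\imagenet\ilsvrc_2012_mean.npy").mean(
            1).mean(1))
        # Move channels first: (H, W, C) -> (C, H, W)
        transformer.set_transpose('data', (2, 0, 1))
        # Swap the channel order from RGB to BGR
        transformer.set_channel_swap('data', (2, 1, 0))
        # load_image returns values in [0, 1]; rescale to [0, 255]. If training had scaled pixels by 1/255, this would have to match
        transformer.set_raw_scale('data', 255.0)
        # The input is now ready
        # make classification map by forward and print prediction indices at each location
        # One forward pass gives, for every sliding-window position, the probability that it contains a face
        out = net_full_conv.forward_all(data=np.asarray([transformer.preprocess('data', im)]))
        # out['prob'][0, 1]: batch item 0, channel 1, which this script treats as the face probability
        print(out['prob'][0, 1].shape)
        # print out['prob'][0].argmax(axis=0)
        # Heat map: every cell of the final output map is a probability
        # Get the face boxes, already mapped back to original-image coordinates, with their probabilities
        boxes = generateBoundingBox(out['prob'][0, 1], scale)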
        # plt.subplot(1, 2, 1)
        # plt.imshow(transformer.deprocess('data', net_full_conv.blobs['data'].data[0]))
        # plt.subplot(1, 2, 2)
        # plt.imshow(out['prob'][0,1])
        # plt.show()
        # print boxes
        # There may be more than one face, and a single face may also get many boxes
        if (boxes):
            total_boxes.extend(boxes)

            # boxes_nms = np.array(total_boxes)
            # true_boxes = nms(boxes_nms, overlapThresh=0.3)
            # #display the nmx bounding box in  image.
            # draw = ImageDraw.Draw(scale_img)
            # for box in true_boxes:
            #     draw.rectangle((box[0], box[1], box[2], box[3]) )
            # scale_img.show()

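    # nms
    print(total_boxes)
    # Filter the boxes: when one face has many boxes, keep a single one.
    # Non-maximum suppression: boxes that overlap heavily are treated as the same face
    boxes_nms = np.array(total_boxes)
    # Group overlapping boxes (groupThresh=1, overlapThresh=0.2) and average each group
    true_boxes = nms_average(boxes_nms, 1, 0.2)
    if true_boxes:
        # Draw every remaining box on the original image
        for (x1, y1, x2, y2) in (box[:-1] for box in true_boxes):
            cv2.rectangle(img, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0))
        win = cv2.namedWindow('test win', flags=0)

        cv2.imshow('test win', img)

        cv2.waitKey(0)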
    #                 x1 = int(max(1, x1-(x2-x1)/6))


#                 y1 = int(max(1, y1-(y2-y1)/3))
#                 x2 = int(min(img.size[0], x2+(x2-x1)/6))
#                 cvimg = cv2.imread(imgFile)
#                 if cvimg == None:
#                     continue
#                 cvimg = cvimg[y1:y2, x1: x2]
#                 cvimg = cv2.resize(cvimg, (256,256))
#                 outputPath = os.path.join(imgPath+'-c', folder, str(count)+'.jpg')
#                 cv2.imwrite(outputPath, cvimg)
#                 count += 1

if __name__ == "__main__":
    imgFile = r'C:\Users\Administrator.DESKTOP-KMH7HN6\Downloads\li_test_net\my_face_detect\data_set\123.jpg'

    face_detection(imgFile)

The problem is just as obvious: every scale needs its own forward pass, and the model is large, so detection is very slow, at roughly 10 s per frame.

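6·Detection speed optimization

One way to speed this up:
1. Scan the image with a very small window first to find the rough face regions, then move to progressively larger windows to lock onto and confirm each face:
24x24, 48x48, 127x127, ...
Starting directly with a 227x227 window is very expensive.
2. The box found for a face is not always the best-fitting region, so a calibration network can be used to correct it.
The calibration network turns one crop into 45 variants: 5 scale changes (sn), 3 horizontal offsets (xn), and 3 vertical offsets (yn), i.e. 5x3x3.
It serves the scanning windows from step 1 by fine-tuning the face position each window reports.
With this approach detection can reach about 25 frames per second.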
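
The 45 calibration patterns can be enumerated as a Cartesian product. The text above fixes only the counts (5 x 3 x 3); the numeric values of `SCALES`, `X_OFFSETS`, and `Y_OFFSETS` below, and the way `apply_pattern` shifts and resizes a box, are assumptions chosen for illustration.

```python
from itertools import product

# Illustrative values only: 5 scale changes, 3 horizontal and 3 vertical offsets.
SCALES = (0.83, 0.91, 1.0, 1.10, 1.21)
X_OFFSETS = (-0.17, 0.0, 0.17)
Y_OFFSETS = (-0.17, 0.0, 0.17)

PATTERNS = list(product(SCALES, X_OFFSETS, Y_OFFSETS))


def apply_pattern(box, pattern):
    """Adjust an (x, y, w, h) box by one (sn, xn, yn) pattern (assumed convention)."""
    x, y, w, h = box
    sn, xn, yn = pattern
    return (x - xn * w / sn, y - yn * h / sn, w / sn, h / sn)


print(len(PATTERNS))
print(apply_pattern((10, 20, 100, 100), PATTERNS[0]))
```

A calibration network would be trained to predict which of these patterns was applied to a crop, so that the inverse adjustment can be applied to the detected window.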

7·Accuracy optimization

In theory a deeper network gives better accuracy.
AlexNet is weaker than VGGNet, whose architecture is deeper.

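8·Data augmentation strategy

Augmenting image data with translations, rotations, and similar transforms can expand 1,000 images to 30,000.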
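
Going from 1,000 to 30,000 images means about 30 variants per image. The sketch below shows one hypothetical way to reach that count using only shifts and mirroring (5 horizontal shifts x 3 vertical shifts x 2 mirror states), with images represented as plain lists of rows; a real pipeline would use an image library and would likely add rotations as well.

```python
from itertools import product


def shift(image, dx, dy, fill=0):
    """Translate a 2-D image (list of rows) by (dx, dy), padding with fill."""
    h, w = len(image), len(image[0])
    return [[image[y - dy][x - dx] if 0 <= y - dy < h and 0 <= x - dx < w else fill
             for x in range(w)] for y in range(h)]


def mirror(image):
    """Flip the image left to right."""
    return [row[::-1] for row in image]


def augment(image, x_shifts=(-2, -1, 0, 1, 2), y_shifts=(-1, 0, 1)):
    """Return every shift/mirror combination of the image."""
    variants = []
    for dx, dy, flip in product(x_shifts, y_shifts, (False, True)):
        moved = shift(image, dx, dy)
        variants.append(mirror(moved) if flip else moved)
    return variants


img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
print(len(augment(img)))
print(shift(img, 1, 0))
```

When the images carry face boxes, the same shift or mirror has to be applied to the box coordinates so that labels stay aligned with the pixels.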

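9·Overfitting

If the training loss keeps falling while the test loss stays very high, the model has overfit.
Lowering the learning rate can help.

For example, if overfitting sets in around iteration 50,000, use the snapshot from iteration 50,000 as the initial weights, lower the learning rate, train again, and check the new results.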
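
Choosing the snapshot to restart from can be sketched as picking the saved iteration with the lowest test loss. The loss values below are invented purely for illustration, and `best_snapshot` and `lowered_lr` are hypothetical helpers; the snapshot interval of 10,000 matches the solver file above.

```python
def best_snapshot(test_losses, snapshot_every=10000):
    """Return the snapshot iteration with the lowest test loss.

    test_losses maps iteration -> test loss; only multiples of
    snapshot_every are assumed to have a snapshot on disk.
    """
    saved = {it: loss for it, loss in test_losses.items()
             if it > 0 and it % snapshot_every == 0}
    return min(saved, key=saved.get)


def lowered_lr(base_lr, factor=0.1):
    """Learning rate to use for the restarted run."""
    return base_lr * factor


# Made-up test losses: falling until 50,000, rising afterwards.
toy_losses = {10000: 0.40, 20000: 0.31, 30000: 0.26, 40000: 0.23,
              50000: 0.22, 60000: 0.27, 70000: 0.35}
restart_from = best_snapshot(toy_losses)
print(restart_from, lowered_lr(0.001))
```

To restart, the chosen `.caffemodel` can be passed to `caffe train` through its `--weights` argument together with a solver file whose `base_lr` has been reduced.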
