RCNN Face Detection

In 1966, Marvin Minsky asked his student Gerald Jay Sussman to spend a summer connecting a camera to a computer and getting the computer to describe what it saw. This is where object detection began.

In 2014, Ross Girshick (RBG) proposed the RCNN model, creatively replacing the sliding-window strategy with the Selective Search method and using a CNN to extract image features. It became the first solution genuinely usable in industry and set off a wave of research in object detection.

Preface

Following the ideas of the RCNN paper, this post studies a concrete face detection problem: given an RGB image, find the location of the face region. Unlike the original setting, face detection is simpler, so I made small changes to the CNN architecture and reduced the number of layers. Also, since I work with the LFW face dataset and only detect a single face per image, I dropped the SVM stage and the non-maximum suppression step at the end of the RCNN pipeline, and used an averaging strategy instead to make the detected region more robust.

1. Dataset Preparation

I downloaded the LFW face dataset from the web and took nearly 5k face images from it.

The first difficulty: I had no annotations for the face region locations. I ran MATLAB's Viola-Jones detector over the LFW face images and kept the detected face regions, which gave me 4k standard face crops for later training.

The second difficulty: for the binary face / non-face classification I had no negative samples. I downloaded 3k landscape images from the web to serve as negatives when training the classifier.

The third difficulty: in the RCNN model, it is the regions segmented by Selective Search that finally get classified, but the regions the algorithm actually produces may be somewhat larger or somewhat smaller than the full face. If the network is trained only on standard face crops, it will end up labeling many regions that really are faces as 0, and detection fails.

Following the pre-training and fine-tuning idea in RCNN, I adopted the following strategy.

First, run Selective Search over the entire LFW face dataset and store all segmented regions in a single folder; this produced roughly 100k region images.

Hand-pick about 500 face regions and about 500 non-face regions, and train a very simple network as a binary classifier. With so little labeled data, a simple network was the only way to avoid overfitting.

Use the trained simple network to classify another batch of region images; based on the results, add the correctly classified face regions to the positive set and the misclassified samples to the negative set, retrain, and gradually increase the network's complexity. Repeating this process 3 times finally gave me 3674 face region images and 4440 non-face region images.
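A minimal sketch of one such round, assuming a trained Keras classifier `model` in the style of section 2 below; the directory names and threshold here are hypothetical:

import os
import shutil
import cv2
import numpy as np


# One bootstrapping round: classify a folder of unlabeled Selective Search
# crops and sort them into the positive / negative training folders.
# (In practice I still checked the results by hand before retraining.)
def bootstrap_round(model, unlabeled_dir, pos_dir, neg_dir, threshold=0.8):
    for name in os.listdir(unlabeled_dir):
        img = cv2.imread(os.path.join(unlabeled_dir, name))
        if img is None:
            continue
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        x = cv2.resize(gray, (96, 96), interpolation=cv2.INTER_AREA) / 255
        pre = model.predict(x[np.newaxis, :, :, np.newaxis])
        target = pos_dir if pre[0, 0] > threshold else neg_dir
        shutil.copy(os.path.join(unlabeled_dir, name), os.path.join(target, name))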

At this point I had 3674 + 4000 = 7674 face images and 4440 + 3000 = 7440 non-face images. I resized them all to (96, 96). Every image now carries a class label, which is enough to train a fairly good network.

train_x : (12000, 96, 96)
train_y : (12000, 2)
test_x : (3114, 96, 96)
test_y : (3114, 2)

2. Network Architecture

Since all I need to handle is binary classification of face vs. non-face images, the network does not have to be complex; deep architectures with dozens of layers like VGG were ruled out first.

I tried a LeNet-style architecture, but the results were not good. The likely reason is that I need to separate half-face regions from full-face regions, i.e. the network must also label incomplete, partial faces as 0. When the network is shallow and the model simple, it cannot extract deep features of that kind.

In the end I made some small adjustments on top of the AlexNet architecture, and the results were more or less acceptable.

    np.random.seed(1)
    model = Sequential()
    model.add(Conv2D(16, (5, 5), input_shape=(96, 96, 1), strides=1))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=2))
    model.add(Activation('relu'))

    model.add(Conv2D(32, (5, 5), strides=1))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=2))
    model.add(Activation('relu'))

    model.add(Conv2D(64, (3, 3), strides=1))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=2))
    model.add(Activation('relu'))

    model.add(Conv2D(128, (3, 3), strides=1))
    model.add(Activation('relu'))

    model.add(Conv2D(256, (3, 3), strides=1))
    model.add(Activation('relu'))

    model.add(Flatten())

    model.add(Dense(4096))
    model.add(Activation('relu'))

    model.add(Dense(200))
    model.add(Activation('relu'))

    model.add(Dense(2))
    model.add(Activation('softmax'))

    model.summary()

3. Region Detection

Once the binary classifier is trained, face detection becomes simple. Feed in a single face image, run Selective Search on it to obtain many candidate regions, classify each candidate with the network, and record every region classified as 1: these are the detected face regions.

At this point there may be many detected face bounding boxes, all roughly satisfactory, and we want to select the best candidate as the final detection. The RCNN paper handles this by training an SVM on the features the CNN extracts and filtering with non-maximum suppression.
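For reference, here is a minimal sketch of standard greedy IoU-based non-maximum suppression (the textbook formulation, not code from the paper, and not what I ended up using):

import numpy as np


# Greedy non-maximum suppression: keep the highest-scoring box, discard
# every remaining box whose IoU with it exceeds the threshold, repeat.
def nms(boxes, scores, iou_threshold=0.3):
    boxes = np.asarray(boxes, dtype=np.float64)
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.maximum(0, x2 - x1) * np.maximum(0, y2 - y1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou <= iou_threshold]
    return keep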

My first thought was to simply take the region with the largest softmax output as the final face detection, but in practice I hit a snag. The softmax output should be a value between 0 and 1 that can serve as a probability measure, but my network's outputs were all 1, so a non-maximum-suppression-style ranking had nothing to work with.

I then adopted the following approach:

The regions Selective Search segments out are not 100% perfect: they sometimes lean toward the left side of the face and sometimes toward the right. On top of that, the trained classifier is not 100% perfect either: it sometimes labels a larger region containing the whole face as 1, and sometimes labels a partial face region as 1.

Borrowing the idea of the mean from statistics: since all these detected regions are roughly satisfactory, yet each has some flaw, why not average the position coordinates of all of them? Regions leaning left cancel against regions leaning right, oversized boxes cancel against undersized ones, and the final result should come out close to ideal.
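The averaging itself is just a coordinate-wise mean over the accepted boxes; a toy example with made-up coordinates (the real loop appears in the main script in part 6):

import numpy as np

# three accepted boxes (x1, y1, x2, y2), made-up coordinates
bounding_box = np.array([[30, 40, 120, 150],
                         [38, 36, 130, 144],
                         [26, 44, 114, 156]])

# coordinate-wise mean: left-leaning and right-leaning boxes cancel out
a1, b1, a2, b2 = np.mean(bounding_box, axis=0).astype(int)
print(a1, b1, a2, b2)   # 31 40 121 150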

4. Model Performance

I sampled 100 images for face detection and got a detection accuracy of about 82%, which I consider roughly satisfactory. I also took a few photos of myself for testing, and the results were decent. When the face is neither too large nor too small in the frame and the background is not cluttered, the model is usable in real-world scenes.

As for why other people's face detection results are so good (nearly 100%), I haven't figured it out; my own reproductions always come out just barely adequate. If I later find something that improves the results, I will come back and adjust.

5. Some Deeper Thoughts on the Results

Ques1: What are the highlights of the RCNN model?

(1) Traditional object detection used the sliding-window strategy, and for a long time nobody came up with a better idea. The author used region clustering to improve on sliding windows from the angle of region continuity.

The breakthrough in the idea: the region texture and color of the objects we want to detect are generally continuous, while the sliding-window strategy forcibly splits apart many regions that should never be separated, which is an enormous waste of computation. The Selective Search algorithm improves on exactly this point and avoids much of that fragmentation.

(2) The author proposed using a CNN to extract features from each proposal region.

(3) The paper also brought a viewpoint: when you lack large amounts of labeled data, a practical approach is transfer learning for neural networks: take a network pretrained on a large dataset, then fine-tune it on your small, task-specific dataset.
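A minimal Keras sketch of this recipe, using VGG16's ImageNet weights as the assumed pretrained backbone (an illustration of the viewpoint, not something I did in this project):

from keras.applications import VGG16
from keras.models import Model
from keras.layers import Dense, Flatten

# Load a backbone pretrained on ImageNet, freeze its weights, and
# fine-tune only a small new classification head on the target data.
base = VGG16(weights='imagenet', include_top=False, input_shape=(96, 96, 3))
for layer in base.layers:
    layer.trainable = False

x = Flatten()(base.output)
x = Dense(256, activation='relu')(x)
out = Dense(2, activation='softmax')(x)

model = Model(inputs=base.input, outputs=out)
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])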

Ques2: The 3 key points you must get right for an RCNN model to deliver results.

(1) When Selective Search generates candidate regions, make absolutely sure the segmentation produces what we want, i.e. that some candidate regions actually contain the target object. If the segmentation is poor, try tuning the hyperparameter values of the felzenszwalb function; a parameter sweep is sketched after this list.

(2) The trained binary classifier must have high accuracy and must be robust. The network has to classify all irrelevant background regions as 0, classify many incomplete, partial target regions as 0 as well, and classify true target regions as 1 with high confidence. If classification is poor, first improve dataset quality: remove hard-to-judge samples from the positive and negative sets and enlarge the training set. Second, deepen the network so that it can reliably separate partial target regions from complete ones.

(3) The network ends up outputting multiple detected regions, and you must keep the bounding box with the best detection. One option is to fit a Logistic Regression model on the features the CNN extracts to obtain proper probability scores (the second sketch below); another is to fine-adjust with bounding-box regression.
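For item (1), you can sweep skimage's felzenszwalb hyperparameters and inspect the segment counts before settling on values; the numbers below are just starting points and 'face.jpg' is a placeholder test image:

import cv2
import skimage.segmentation

img = cv2.imread('face.jpg')   # placeholder test image

# Larger scale -> fewer, bigger segments; larger min_size merges away
# tiny fragments; sigma controls the amount of pre-smoothing.
for scale in (10, 50, 100):
    for min_size in (50, 100, 200):
        mask = skimage.segmentation.felzenszwalb(
            img, scale=scale, sigma=0.9, min_size=min_size)
        print('scale =', scale, 'min_size =', min_size,
              'segments =', mask.max() + 1)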
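For item (3), a sketch of the Logistic Regression option, assuming the trained model from section 2 and arrays `train_x`, `labels` prepared along the lines of the part 6 source code:

import numpy as np
from keras.models import Model, load_model
from sklearn.linear_model import LogisticRegression

# Reuse the trained network as a feature extractor: model.layers[-3] is
# the relu activation after Dense(200), a 200-d feature vector per crop.
model = load_model('first_model.h5')
feature_net = Model(inputs=model.input, outputs=model.layers[-3].output)

# train_x: (N, 96, 96, 1) normalized crops; labels: (N,) with 1 = face
features = feature_net.predict(train_x)
clf = LogisticRegression(max_iter=1000)
clf.fit(features, labels)

# usable probability scores for ranking candidate boxes
probs = clf.predict_proba(features)[:, 1]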

Ques3: One question puzzled me for a long time. Even YOLO, the most popular detector today, cannot reach 100% detection accuracy, so how can detection be applied in real scenarios? Take vehicle detection in an unmanned workshop: if a single vehicle goes undetected, the resulting loss could be immeasurable.

In industry, object detection mostly runs on video streams, which amounts to a statistical product-of-probabilities strategy. Even if my model's per-frame detection accuracy is only 95%, a video stream has 20 frames per second, and as long as at least 1 of those 20 frames detects the target object, I open the gate. Treating frames as independent, the probability that all 20 frames miss is 0.05^20 (on the order of 10^-26), so the system succeeds with probability 1 - 0.05^20 and the failure rate drops dramatically.

Ques4: Shortcomings of my own RCNN face detection project.

(1) Face detection does not run in real time. Currently it takes about 0.5s to process one image, so the model cannot work on a video stream directly.
(2) The three main stages, Selective Search candidate generation, CNN binary classification, and the averaging strategy for picking the final region, are each only roughly satisfactory; none achieves a truly clean result.

For example, sometimes the regions output by Selective Search do not contain the face region I want at all. Sometimes the CNN actually classifies some irrelevant background as 1. And sometimes certain detected regions are so far off that they drag down the averaging strategy.

This is when I finally understood why end-to-end models are so highly praised: with too many intermediate steps, each losing a little accuracy, the overall result ends up a muddle.

6. Source Code

Selective Search candidate region generation:

import cv2
import numpy as np
import skimage.segmentation
import random
import skimage.feature


# Selective Search algorithm

# step 1: calculate the first fel_segment region
# step 2: calculate the neighbour couple
# step 3: calculate the similarity dictionary
# step 4: merge regions and calculate the second merged region
# step 5: obtain the target candidate region through the second region


def intersect(a, b):
    # two regions count as neighbours when a corner of b falls strictly inside a
    if (a["min_x"] < b["min_x"] < a["max_x"] and a["min_y"] < b["min_y"] < a["max_y"]) or \
            (a["min_x"] < b["max_x"] < a["max_x"] and a["min_y"] < b["max_y"] < a["max_y"]) or \
            (a["min_x"] < b["min_x"] < a["max_x"] and a["min_y"] < b["max_y"] < a["max_y"]) or \
            (a["min_x"] < b["max_x"] < a["max_x"] and a["min_y"] < b["min_y"] < a["max_y"]):
        return True
    return False


def calc_similarity(r1, r2, size):

    sim1 = 0
    sim2 = 0
    for a, b in zip(r1["hist_c"], r2["hist_c"]):
        sim1 = sim1 + min(a, b)
    for a, b in zip(r1["hist_t"], r2["hist_t"]):
        sim2 = sim2 + min(a, b)
    sim3 = 1.0 - (r1["size"] + r2["size"]) / size
    rect_size = (max(r1["max_x"], r2["max_x"]) - min(r1["min_x"], r2["min_x"])) * \
                (max(r1["max_y"], r2["max_y"]) - min(r1["min_y"], r2["min_y"]))
    sim4 = 1.0 - (rect_size - r1["size"] - r2["size"]) / size
    similarity = sim1 + sim2 + sim3 + sim4

    return similarity


def merge_region(r1, r2, t):
    new_size = r1["size"] + r2["size"]
    r_new = {
        "min_x": min(r1["min_x"], r2["min_x"]),
        "min_y": min(r1["min_y"], r2["min_y"]),
        "max_x": max(r1["max_x"], r2["max_x"]),
        "max_y": max(r1["max_y"], r2["max_y"]),
        "size": new_size,
        "hist_c": (
            r1["hist_c"] * r1["size"] + r2["hist_c"] * r2["size"]) / new_size,
        "hist_t": (
            r1["hist_t"] * r1["size"] + r2["hist_t"] * r2["size"]) / new_size,
        "labels": t
    }
    return r_new


# Step 1: Calculate the different categories segmented by felzenszwalb algorithm

def first_calc_fel_category(image, scale, sigma, min_size):

    fel_mask = skimage.segmentation.felzenszwalb(image, scale=scale, sigma=sigma, min_size=min_size)
    print('The picture has been segmented into categories 0 -', np.max(fel_mask))   # e.g. 0-694 categories

    gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)    # (250, 250)
    texture_img = skimage.feature.local_binary_pattern(gray_image, 8, 1.0)    # (250, 250)

    # fel_img = np.zeros((fel_mask.shape[0], fel_mask.shape[1], 3))
    # for i in range(np.max(fel_mask)):
    #     a = random.randint(0, 255)
    #     b = random.randint(0, 255)
    #     c = random.randint(0, 255)
    #     for j in range(fel_mask.shape[0]):
    #         for k in range(fel_mask.shape[1]):
    #             if fel_mask[j, k] == i:
    #                 fel_img[j, k, 0] = a
    #                 fel_img[j, k, 1] = b
    #                 fel_img[j, k, 2] = c
    #
    # cv2.namedWindow("image")
    # cv2.imshow('image', fel_img/255)
    # cv2.waitKey(0)
    # cv2.imwrite('felzenszwalb_img.jpg', fel_img)

    img_append = np.zeros((fel_mask.shape[0], fel_mask.shape[1], 4))  # (250, 250, 4)
    img_append[:, :, 0:3] = image
    img_append[:, :, 3] = fel_mask

    region = {}

    # calc the min_x, min_y, max_x, max_y, label of every category
    for y, i in enumerate(img_append):
        for x, (r, g, b, l) in enumerate(i):
            if l not in region:
                region[l] = {"min_x": 0xffff, "min_y": 0xffff, "max_x": 0, "max_y": 0, "labels": l}
            if region[l]["min_x"] > x:
                region[l]["min_x"] = x
            if region[l]["min_y"] > y:
                region[l]["min_y"] = y
            if region[l]["max_x"] < x:
                region[l]["max_x"] = x
            if region[l]["max_y"] < y:
                region[l]["max_y"] = y

    for k, v in list(region.items()):

        # calc the size feature in every category
        masked_color = image[:, :, :][img_append[:, :, 3] == k]
        region[k]["size"] = len(masked_color)

        # calc the color feature in every category
        color_bin = 6
        color_hist = np.array([])

        for colour_channel in (0, 1, 2):
            c = masked_color[:, colour_channel]
            color_hist = np.concatenate([color_hist] + [np.histogram(c, color_bin, (0.0, 255.0))[0]])

        color_hist = color_hist / sum(color_hist)
        region[k]["hist_c"] = color_hist

        # calc the texture feature in every category
        texture_bin = 10
        masked_texture = texture_img[:, :][img_append[:, :, 3] == k]
        texture_hist = np.histogram(masked_texture, texture_bin, (0.0, 255.0))[0]
        texture_hist = texture_hist / sum(texture_hist)
        region[k]["hist_t"] = texture_hist

    return region


# Step 2: Calculate the neighbour couple in the first fel_segment region

def calc_neighbour_couple(region):
    r = list(region.items())
    couples = []

    for cur, a in enumerate(r[:-1]):
        for b in r[cur + 1:]:
            if intersect(a[1], b[1]):
                couples.append((a, b))

    return couples


# Step 3: Calculate the sim_dictionary in the neighbour couple

def calc_sim_dictionary(couple, total_size):

    sim_dictionary = {}

    for (ai, ar), (bi, br) in couple:
        sim_dictionary[(ai, bi)] = calc_similarity(ar, br, total_size)

    return sim_dictionary


# step 4: merge the small regions and calculate the second merged region

def second_calc_merge_category(sim_dictionary, region,  total_size):

    while sim_dictionary != {}:
        i, j = sorted(sim_dictionary.items(), key=lambda i: i[1])[-1][0]
        t = max(region.keys()) + 1.0

        region[t] = merge_region(region[i], region[j], t)
        key_to_delete = []
        for k, v in list(sim_dictionary.items()):
            if (i in k) or (j in k):
                key_to_delete.append(k)
        for k in key_to_delete:
            del sim_dictionary[k]

        for k in [a for a in key_to_delete if a != (i, j)]:
            n = k[1] if k[0] in (i, j) else k[0]
            sim_dictionary[(t, n)] = calc_similarity(region[t], region[n], total_size)

    return region


# step 5: obtain the target candidate regions through the second region

def calc_candidate_box(second_region, total_size):
    category = []
    for k, r in list(second_region.items()):
        category.append({'rect': (r['min_x'], r['min_y'], r['max_x'], r['max_y']), 'size': r['size']})

    candidate_box = set()
    for r in category:
        if r['rect'] in candidate_box:
            continue

        x1, y1, x2, y2 = r['rect']

        if (x2-x1)*(y2-y1) > total_size / 3:
            continue

        if (x2-x1)*(y2-y1) < total_size / 16:
            continue

        if (x2-x1) == 0 or (y2-y1) == 0:
            continue

        if (y2-y1) / (x2-x1) > 1.5 or (x2-x1) / (y2-y1) > 1.5:
            continue

        candidate_box.add(r['rect'])

    return candidate_box

Building the training dataset:

import cv2
import numpy as np
import random
import os

# face_image: [1, 0]
# background_image: [0, 1]


def read_data():

    # 3674 + 4440 : train data from selective search algorithm
    # 4000 + 3000 : train data from online download dataset

    n = 15114
    data_x = np.zeros((n, 96, 96))
    data_y = np.zeros((n, 2))

    filename = os.listdir("/home/archer/CODE/PF/selective_train_face")
    filename.sort()
    i = 0
    for name in filename:
        face_image = cv2.imread("/home/archer/CODE/PF/selective_train_face/" + name)
        face_gray_image = cv2.cvtColor(face_image, cv2.COLOR_BGR2GRAY)
        face_resize_image = cv2.resize(face_gray_image, (96, 96), interpolation=cv2.INTER_AREA)

        data_x[i, :, :] = face_resize_image / 255
        data_y[i, :] = np.array([1, 0])
        i = i + 1

    print('the selective_train_face images have been loaded : ', i)

    for k in range(4000):
        face_image = cv2.imread("/home/archer/CODE/PF/download_train_face/" + str(k+1) + '.jpg')
        face_gray_image = cv2.cvtColor(face_image, cv2.COLOR_BGR2GRAY)
        face_resize_image = cv2.resize(face_gray_image, (96, 96), interpolation=cv2.INTER_AREA)

        data_x[i, :, :] = face_resize_image / 255
        data_y[i, :] = np.array([1, 0])
        i = i + 1

    print('the download_train_face images have been loaded : ', i)

    filename = os.listdir("/home/archer/CODE/PF/selective_train_background")
    filename.sort()
    for name in filename:
        background_image = cv2.imread("/home/archer/CODE/PF/selective_train_background/" + name)
        background_gray_image = cv2.cvtColor(background_image, cv2.COLOR_BGR2GRAY)
        background_resize_image = cv2.resize(background_gray_image, (96, 96), interpolation=cv2.INTER_AREA)

        data_x[i, :, :] = background_resize_image / 255
        data_y[i, :] = np.array([0, 1])
        i = i + 1

    print('the selective_train_background images have been loaded : ', i)

    for k in range(3000):
        background_image = cv2.imread("/home/archer/CODE/PF/download_train_background/" + str(k+1) + '.jpg')
        background_gray_image = cv2.cvtColor(background_image, cv2.COLOR_BGR2GRAY)
        background_resize_image = cv2.resize(background_gray_image, (96, 96), interpolation=cv2.INTER_AREA)

        data_x[i, :, :] = background_resize_image / 255
        data_y[i, :] = np.array([0, 1])
        i = i + 1

    print('the download_train_background images have been loaded : ', i)

    return data_x, data_y


def make_train_data():

    # train number : n1 , test number : n2
    n1 = 12000
    n2 = 3114
    n = n1 + n2

    data_x, data_y = read_data()
    random_index = np.arange(0, n, 1)
    random.shuffle(random_index)

    train_x = np.zeros((n1, 96, 96))
    train_y = np.zeros((n1, 2))
    test_x = np.zeros((n2, 96, 96))
    test_y = np.zeros((n2, 2))

    for i in range(n1):
        index = random_index[i]
        train_x[i, :, :] = data_x[index, :, :]
        train_y[i, :] = data_y[index, :]

    for i in range(n2):
        index = random_index[n1 + i]
        test_x[i, :, :] = data_x[index, :, :]
        test_y[i, :] = data_y[index, :]

    return train_x, train_y, test_x, test_y


Network architecture:

import numpy as np
from keras.models import Sequential, Model
from keras.layers import Dense, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.models import load_model
import matplotlib.pyplot as plt


def create_network():
    np.random.seed(1)
    model = Sequential()
    model.add(Conv2D(16, (5, 5), input_shape=(96, 96, 1), strides=1))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=2))
    model.add(Activation('relu'))

    model.add(Conv2D(32, (5, 5), strides=1))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=2))
    model.add(Activation('relu'))

    model.add(Conv2D(64, (3, 3), strides=1))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=2))
    model.add(Activation('relu'))

    model.add(Conv2D(128, (3, 3), strides=1))
    model.add(Activation('relu'))

    model.add(Conv2D(256, (3, 3), strides=1))
    model.add(Activation('relu'))

    model.add(Flatten())

    model.add(Dense(4096))
    model.add(Activation('relu'))

    model.add(Dense(200))
    model.add(Activation('relu'))

    model.add(Dense(2))
    model.add(Activation('softmax'))

    model.summary()
    return model


# batch generator: reduce the consumption of computer memory
def generator(train_x, train_y, batch_size):

    while 1:
        row = np.random.randint(0, len(train_x), size=batch_size)
        x = train_x[row]
        y = train_y[row]
        yield x, y


# create model and train and save
def train_network(train_x, train_y, test_x, test_y, epoch, batch_size):
    train_x = train_x[:, :, :, np.newaxis]
    test_x = test_x[:, :, :, np.newaxis]

    model = create_network()
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

    history = model.fit_generator(generator(train_x, train_y, batch_size), epochs=epoch,
                        steps_per_epoch=len(train_x) // batch_size)

    model.save('first_model.h5')

    calculate_test_accuracy(test_x, test_y, 'first_model.h5')


# Load the partially trained model and continue training and save
def load_network_then_train(train_x, train_y, test_x, test_y, epoch, batch_size, input_name, output_name):
    train_x = train_x[:, :, :, np.newaxis]
    test_x = test_x[:, :, :, np.newaxis]

    model = load_model(input_name)
    history = model.fit_generator(generator(train_x, train_y, batch_size),
                                  epochs=epoch, steps_per_epoch=len(train_x) // batch_size)

    model.save(output_name)

    calculate_test_accuracy(test_x, test_y, output_name)


# plot the loss and the accuracy
def show_plot(history):
    # list all data in history
    print(history.history.keys())

    plt.plot(history.history['loss'])
    plt.title('model loss')
    plt.ylabel('loss')
    plt.xlabel('epoch')
    plt.savefig('loss1.jpg')
    plt.show()

    plt.plot(history.history['accuracy'])
    plt.title('model accuracy')
    plt.ylabel('accuracy')
    plt.xlabel('epoch')
    plt.savefig('accuracy1.jpg')
    plt.show()


# calculate the accuracy in test set
def calculate_test_accuracy(test_x, test_y, output_name):

    model = load_model(output_name)
    test_result = model.predict(test_x)

    accuracy_number = 0

    for i in range(len(test_x)):
        if np.argmax(test_result[i, :]) == 0 and test_y[i, 0] == 1:
            accuracy_number = accuracy_number + 1
        if np.argmax(test_result[i, :]) == 1 and test_y[i, 0] == 0:
            accuracy_number = accuracy_number + 1

    print('The accuracy in test set is :')
    print(accuracy_number/len(test_x))


Main function:

import get_data as gt
import network as nt
import numpy as np
from keras.models import load_model
import os
import cv2
import selective_search as ss


if __name__ == "__main__":
    train_x, train_y, test_x, test_y = gt.make_train_data()
    # nt.train_network(train_x, train_y, test_x, test_y, epoch=1,  batch_size=32)

    model = load_model('best_model.h5')
    nt.calculate_test_accuracy(test_x[:, :, :, np.newaxis], test_y, 'best_model.h5')

    filename = os.listdir("/home/archer/CODE/PF/demo1")
    filename.sort()

    for name in filename:
        img = cv2.imread("/home/archer/CODE/PF/demo1/" + name)
        img = np.float32(img)

        total_size = img.shape[0] * img.shape[1]
        first_region = ss.first_calc_fel_category(img, scale=10, sigma=0.9, min_size=100)
        neighbour_couple = ss.calc_neighbour_couple(first_region)
        sim_dictionary = ss.calc_sim_dictionary(neighbour_couple, total_size)
        second_region = ss.second_calc_merge_category(sim_dictionary, first_region, total_size)
        candidate_box = ss.calc_candidate_box(second_region, total_size)

        a1, b1, a2, b2 = [0, 0, 0, 0]

        # record the well detected region
        bounding_box = []

        for (x1, y1, x2, y2) in candidate_box:
            select_img1 = img[y1:y2, x1:x2]

            select_img2 = cv2.cvtColor(select_img1, cv2.COLOR_BGR2GRAY)
            select_img3 = cv2.resize(select_img2, (96, 96), interpolation=cv2.INTER_AREA)
            # normalize to match the training data, which was divided by 255
            select_img4 = select_img3[np.newaxis, :, :, np.newaxis] / 255
            pre = model.predict(select_img4)

            if pre[0, 0] > 0.8:
                bounding_box.append([x1, y1, x2, y2])

        # calculate the average, and ensure the stability
        if len(bounding_box) > 0:
            bounding_box = np.array(bounding_box)
            print(bounding_box)

            a1 = int(np.mean(bounding_box[:, 0]))
            b1 = int(np.mean(bounding_box[:, 1]))
            a2 = int(np.mean(bounding_box[:, 2]))
            b2 = int(np.mean(bounding_box[:, 3]))

            print(a1, b1, a2, b2, '***************************')
            cv2.rectangle(img, (a1, b1), (a2, b2), (0, 0, 255), 2)

        cv2.imwrite("/home/archer/CODE/PF/demo_detection/" + name, img)


7. Project Link

If the code does not run for you, or you want to use the trained model directly, you can download it from the project link:
https://blog.csdn.net/Twilight737
