![a9e32a948423089a3c510011280ae2cf.png](https://i-blog.csdnimg.cn/blog_migrate/36abefe38b38080955943d8e142030ca.jpeg)
Preface
Faster R-CNN is a landmark Two-Stage deep-learning object detection algorithm, and the ideas it introduced still appear in many of today's networks. For a theoretical walkthrough of Faster R-CNN, see the post below.
馨意: 深度学习目标检测Faster R-CNN论文解读 (zhuanlan.zhihu.com)
Faster R-CNN Code
Here we use bubbliiiing's implementation:
bubbliiiing/faster-rcnn-keras (github.com)
For a walkthrough of the code, see the author's Bilibili video and CSDN write-up:
https://www.bilibili.com/video/BV1U7411T72r?p=11
睿智的目标检测18--Keras搭建Faster-RCNN目标检测平台 (blog.csdn.net)
Pedestrian Video Data: Download and Preprocessing
Official download
Video data: bbenfold_headpose/Datasets/TownCentreXVID.avi
Annotation data: bbenfold_headpose/Datasets/TownCentre-groundtruth.top
Baidu Netdisk
Link: https://pan.baidu.com/s/1P2OrgUuGYBqDmwAMEAqQbw
Extraction code: 4ms9
Data description
The dataset consists of one video, TownCentreXVID.avi, and one label file, TownCentre-groundtruth.top. The video is 5 minutes long at 25 frames per second (1920×1080), for a total of 7500 frames. TownCentre-groundtruth.top contains pedestrian locations for the first 4500 frames; each line is organized as follows:
- personNumber - unique identifier of the person
- frameNumber - frame index (counted from 0)
- headValid - 1 if the head region is valid, 0 otherwise
- bodyValid - 1 if the body region is valid, 0 otherwise
- headLeft, headTop, headRight, headBottom - head bounding box (in pixels)
- bodyLeft, bodyTop, bodyRight, bodyBottom - body bounding box (in pixels)
For pedestrian detection we mainly need the frameNumber and the body bounding-box fields above.
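As a quick check of the field layout, a single annotation line can be split into these fields. This is a minimal sketch; the sample line below is made up for illustration, not copied from the dataset:

```python
# Split one TownCentre-groundtruth.top line into its named fields.
# The sample line is illustrative only, not taken from the real file.
line = "0, 0, 1, 1, 434.0, 122.0, 459.0, 153.0, 419.0, 111.0, 476.0, 261.0"
fields = [s.strip() for s in line.split(",")]
personNumber, frameNumber = int(fields[0]), int(fields[1])
headValid, bodyValid = int(fields[2]), int(fields[3])
head = [float(v) for v in fields[4:8]]   # headLeft, headTop, headRight, headBottom
body = [float(v) for v in fields[8:12]]  # bodyLeft, bodyTop, bodyRight, bodyBottom
print(frameNumber, body)
```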
Preprocessing
The raw dataset is a video, but our training code consumes images. Before training, use OpenCV's VideoCapture to extract frames from TownCentreXVID.avi, halve their resolution, and save them to JPEGImages and JPEGImages_test. Reference code:
```python
import cv2
import os

# Use OpenCV's VideoCapture to extract frames from TownCentreXVID.avi,
# saving them to JPEGImages (train) and JPEGImages_test (test)
def video2im(video_name, train_path='JPEGImages', test_path='JPEGImages_test', factor=2):
    os.makedirs(train_path, exist_ok=True)
    os.makedirs(test_path, exist_ok=True)
    frame = 0
    cap = cv2.VideoCapture(video_name)
    length = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    print('Total Frame Count:', length)
    while True:
        check, img = cap.read()
        if check:
            # The labeled frames go to the training folder, the rest to the test folder
            if frame <= 4500:
                path = train_path
            else:
                path = test_path
            img = cv2.resize(img, (1920 // factor, 1080 // factor))
            cv2.imwrite(os.path.join(path, str(frame) + ".jpg"), img)
            frame += 1
            print('Processed:', frame)
        else:
            break
    cap.release()

video2im("TownCentreXVID.avi")
```
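The split and resize rules can be sanity-checked without the video itself. The helper below merely mirrors the arithmetic in video2im; frame_destination is a hypothetical name of ours, not part of the repo:

```python
# Mirror video2im's split rule and output size for a given frame index
# (pure arithmetic; frame_destination is a hypothetical helper, not repo code)
def frame_destination(frame, factor=2, cutoff=4500):
    folder = 'JPEGImages' if frame <= cutoff else 'JPEGImages_test'
    return folder, str(frame) + '.jpg', (1920 // factor, 1080 // factor)

print(frame_destination(0))
print(frame_destination(5000))
```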
According to the training code, we need to generate a training txt. Each line holds an image path followed by boxes in xmin,ymin,xmax,ymax,class order; since we only detect pedestrians, the class is always 0:

```
# image path xmin,ymin,xmax,ymax,class xmin,ymin,xmax,ymax,class ...
JPEGImages/0.jpg 141,144,184,245,0 142,114,181,207,0 358,109,392,203,0 395,116,429,212,0 437,195,477,313,0 327,496,394,692,0 802,299,857,445,0 806,444,873,637,0 894,86,937,178,0 821,55,855,137,0 728,13,754,82,0 679,4,703,70,0 444,35,470,109,0
JPEGImages/1.jpg 139,145,181,247,0 144,114,184,206,0
...
```
The code to generate this txt:

```python
def read_top(file):
    with open(file, 'r') as fp:
        lines = fp.readlines()
    lastframe = -1
    train_file = open('train.txt', 'w')
    for line in lines:
        line = line.strip().split(",")
        frameNumber = int(line[1])
        # Halve the body-box coordinates to match the half-size frames
        xmin = int(abs(float(line[-4]) - 2) / 2)
        ymin = int(abs(float(line[-3]) - 2) / 2)
        xmax = int(abs(float(line[-2]) - 2) / 2)
        ymax = int(abs(float(line[-1]) - 2) / 2)
        if lastframe != frameNumber:
            # A new frame: start a new line with the image path
            if lastframe != -1:
                train_file.write('\n')
            train_file.write("JPEGImages/%s.jpg" % frameNumber)
            lastframe = frameNumber
        # Append this person's box; the class is always 0 (person)
        train_file.write(' %s,%s,%s,%s,0' % (xmin, ymin, xmax, ymax))
    train_file.close()

read_top(r"TownCentre-groundtruth.top")
```
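To sanity-check the generated file, each line can be split back into an image path and a list of boxes. A minimal sketch; the sample line below is illustrative, not taken from the real train.txt:

```python
# Split one train.txt line back into (image path, list of boxes);
# the sample line is illustrative, not from the real file
line = "JPEGImages/0.jpg 141,144,184,245,0 142,114,181,207,0"
parts = line.split()
img_path = parts[0]
boxes = [tuple(int(v) for v in box.split(',')) for box in parts[1:]]
print(img_path, boxes)
```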
Training Code Changes
Since our only class is the pedestrian (person), set NUM_CLASSES to 2 (person + background). Setting EPOCH to 20 is enough; 100 takes too long. Change annotation_path to the txt we just generated. Then we can happily start training.

```python
NUM_CLASSES = 2
EPOCH = 20
annotation_path = r"MyDataset_TownCentre/train.txt"
```
Model Prediction
In frcnn.py, change model_path in _defaults to the lowest-loss h5 model we trained, and classes_path to a txt file whose only content is person.
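The classes file can be generated in one line. The filename person_classes.txt here is an assumed example; use whatever path you point classes_path at:

```python
# Write a classes file whose only entry is person
# ('person_classes.txt' is an assumed example filename)
with open('person_classes.txt', 'w') as f:
    f.write('person\n')
print(open('person_classes.txt').read().split())
```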
The original predict.py asks for an image path interactively:

```python
from frcnn import FRCNN
from PIL import Image

frcnn = FRCNN()
# Interactive prediction: type an image path at the prompt
while True:
    img = input('Input image filename:')
    try:
        image = Image.open(img)
    except:
        print('Open Error! Try again!')
        continue
    else:
        r_image = frcnn.detect_image(image)
        r_image.show()
        r_image.save('result1.jpg', quality=95)
frcnn.close_session()
```
Our modified single-file prediction:

```python
from frcnn import FRCNN
from PIL import Image

frcnn = FRCNN()
# Single-file prediction
img = 'img/street.jpg'
image = Image.open(img)
r_image = frcnn.detect_image(image)
r_image.show()
r_image.save('result2.jpg', quality=95)
frcnn.close_session()
```
Our modified batch prediction:

```python
from frcnn import FRCNN
from PIL import Image
import os

frcnn = FRCNN()
# Batch prediction: run detection on every image in ReadPath
ReadPath = r"JPEGImages_test"
SavePath = r"JPEGImages_result"
os.makedirs(SavePath, exist_ok=True)
for name in os.listdir(ReadPath):
    image = Image.open(os.path.join(ReadPath, name))
    r_image = frcnn.detect_image(image)
    # r_image.show()
    r_image.save(os.path.join(SavePath, name), quality=95)
frcnn.close_session()
```
![903ac70272394a72c91c42bbd5a4bb16.png](https://i-blog.csdnimg.cn/blog_migrate/73ce1858380f26d768f0050414d475ad.jpeg)
![2e40d7295391980761a40ba68261fae5.png](https://i-blog.csdnimg.cn/blog_migrate/656e64409f563a55f9dc9efb99238d5e.jpeg)
Finally, we convert the predicted image frames back into a video:

```python
import os
import cv2
import numpy as np

path = r"JPEGImages_result"
filelist = [f for f in os.listdir(path) if f.endswith('.jpg')]
# os.listdir order is arbitrary; sort by the numeric frame index in the filename
filelist.sort(key=lambda name: int(os.path.splitext(name)[0]))
fps = 25            # the source video runs at 25 frames per second
size = (960, 540)   # size of the frames being written
video = cv2.VideoWriter("result.avi", cv2.VideoWriter_fourcc('I', '4', '2', '0'), fps, size)
for item in filelist:
    img_name = os.path.join(path, item)
    print(img_name)
    img = cv2.imread(img_name)
    if img is None:
        # Fall back to imdecode for paths with non-ASCII characters
        img = cv2.imdecode(np.fromfile(img_name, dtype=np.uint8), -1)
    video.write(img)
video.release()
cv2.destroyAllWindows()
print('end')
```
References
https://www.bilibili.com/video/BV1U7411T72r?p=11
睿智的目标检测18--Keras搭建Faster-RCNN目标检测平台 (blog.csdn.net)