OpenCV Object Tracking

Passing command-line arguments in Python

How to use add_argument()

argparse is a built-in Python module for parsing command-line options and arguments. You define the arguments your program needs, and argparse parses them out of sys.argv and automatically generates help and usage messages.

A simple example

  • Create an ArgumentParser() object
  • Call add_argument() to add the arguments you need
  • Call parse_args() to parse them
# a small example
import argparse

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-n", "--name", required=True,
                help="name of the user")
args = vars(ap.parse_args())

# display a friendly message to the user
print("hello {}, nice to meet you!".format(args["name"]))

The ap = argparse.ArgumentParser() line creates an ArgumentParser object named ap.

The add_argument() call:

This script accepts a single command-line argument, --name. Either -n or --name works on the command line. Because required is set to True, the argument cannot be omitted: if it is missing, argparse exits with an error.

help: a short description that is shown in the generated usage message.

usage: running xxx.py --help or xxx.py -h prints that help text.

The vars(ap.parse_args()) line:

vars() converts the parsed command-line arguments (a Namespace object) into a dictionary.

key: the name of the command-line argument
value: the value supplied for that argument

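For a quick sanity check, and assuming the snippet above is saved as hello.py (a file name chosen here for illustration), the parser behaves roughly like this:

# USAGE (hypothetical file name hello.py)
# python hello.py --name Adrian    -> prints "hello Adrian, nice to meet you!"
# python hello.py -n Adrian        -> same result, using the short flag
# python hello.py                  -> argparse exits with an error because --name is required
# python hello.py --help           -> prints the auto-generated help text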
Project structure

  • input/ : input videos used for object tracking

  • output/ : processed videos in which the tracked object is marked with a bounding box

  • mobilenet_ssd/ : the Caffe CNN model files

Implementing the dlib object tracker

Import the required packages
from imutils.video import FPS
import numpy as np
import argparse
import imutils
import dlib
import cv2
Parse the command-line arguments
### parse the command-line arguments ###
ap = argparse.ArgumentParser()

ap.add_argument("-p", "--prototxt", required=True,
    help="path to Caffe 'deploy' prototxt file")

ap.add_argument("-m", "--model", required=True,
    help="path to Caffe pre-trained model")

ap.add_argument("-v", "--video", required=True,
    help="path to input video file")

ap.add_argument("-l", "--label", required=True,
    help="class label we are interested in detecting + tracking")

ap.add_argument("-o", "--output", type=str,
    help="path to optional output video file")

ap.add_argument("-c", "--confidence", type=float, default=0.2,
    help="minimum probability to filter weak detections")

args = vars(ap.parse_args())
  • --prototxt: path to the Caffe deploy prototxt file.

    e.g.: -p ./mobilenet_ssd/MobileNetSSD_deploy.prototxt

  • --model: path to the Caffe pre-trained model.

    e.g.: -m ./mobilenet_ssd/MobileNetSSD_deploy.caffemodel

  • --video: path to the input video file (webcams are not supported).

    e.g.: -v ./input/xxx.mp4

  • --label: the class label of the object we want to detect and track; the full list of supported classes is given below.

    e.g.: -l person

Two optional arguments:

  • --output: path where the object-tracking result video should be saved (see the sketch after this list for how the script behaves when it is omitted).

    e.g.: -o ./output/object_result.avi

  • --confidence: defaults to 0.2.

    This is the minimum probability threshold, used to filter out weak detections from the Caffe object detector.
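Because --output has neither required=True nor a default value, args["output"] is simply None when the flag is omitted; the main loop relies on that later when it decides whether to create a video writer. A minimal sketch of that behavior:

# parse_args() leaves optional arguments with no default set to None
args = vars(ap.parse_args())
if args["output"] is not None:
    print("[INFO] will write the tracking result to", args["output"])
else:
    print("[INFO] no --output given, skipping the video writer")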

Initialize the class label list

These are the classes the MobileNet SSD model was trained to detect.

We use the pre-trained MobileNet SSD model to detect the object in a single frame, then hand the object's location to dlib's correlation tracker so it can be followed through the remaining frames of the video.

The model supports 20 object classes (plus one background class).

Note: if you use a different Caffe model, you will need to redefine the class list below (if you use the model included in the download, no changes are needed).

# initialize the list of class labels MobileNet SSD was trained to
# detect
CLASSES= ["background","aeroplane","bicycle","bird","boat",
    "bottle","bus","car","cat","chair","cow","diningtable",
    "dog","horse","motorbike","person","pottedplant","sheep",
    "sofa","train","tvmonitor"]
# load our serialized model from disk
print("[INFO] loading model...")
net= cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])
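The script assumes the --label you pass is one of the 21 names above. A small guard like the following (not part of the original script) makes that assumption explicit:

# sanity check (not in the original script): make sure the requested
# label is something MobileNet SSD can actually detect
if args["label"] not in CLASSES:
    raise ValueError("unknown label '{}'; choose one of: {}".format(
        args["label"], ", ".join(CLASSES[1:])))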
Initialize the video stream

Before looping over the frames, the model has to be loaded into memory (the last line above). Loading it only requires the paths to the prototxt and model files, both of which were supplied as command-line arguments.

Next we initialize the video stream, the dlib correlation tracker, the video writer object, and the class label to be detected.

In the last line we instantiate the FPS throughput estimator so we can report the frame rate at the end.

# initialize the video stream, dlib correlation tracker, output video
# writer, and predicted class label
print("[INFO] starting video stream...")
vs = cv2.VideoCapture(args["video"])
tracker = None
writer = None
label = ""
# start the frames per second throughput estimator
fps = FPS().start()
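imutils.video.FPS is essentially a stopwatch wrapped around the frame loop. Conceptually it behaves like this minimal sketch (a simplified stand-in, not the actual imutils source):

import datetime

class SimpleFPS:
    # minimal FPS estimator: record a start and an end time and count
    # how many frames were processed in between
    def __init__(self):
        self._start = None
        self._end = None
        self._frames = 0

    def start(self):
        self._start = datetime.datetime.now()
        return self

    def update(self):
        self._frames += 1

    def stop(self):
        self._end = datetime.datetime.now()

    def elapsed(self):
        return (self._end - self._start).total_seconds()

    def fps(self):
        return self._frames / self.elapsed()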
Loop over the video frames

Each frame is resized so it can be processed faster.

The color channels are converted from BGR (the order OpenCV stores) to RGB, because dlib expects RGB.

If an output video path was given on the command line, we instantiate a video writer here.

# loop over frames from the video file stream
while True:
	# grab the next frame from the video file
	(grabbed, frame) = vs.read()
	# check to see if we have reached the end of the video file
	if frame is None:
		break
	# resize the frame for faster processing and then convert the frame from BGR to RGB ordering (dlib needs RGB ordering)
	frame = imutils.resize(frame, width=600)
	rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
	# if we are supposed to be writing a video to disk, initialize the writer
	if args["output"] is not None and writer is None:
		fourcc = cv2.VideoWriter_fourcc(*"MJPG")
		writer = cv2.VideoWriter(args["output"], fourcc, 30,
			(frame.shape[1], frame.shape[0]), True)
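Note that the writer's frame rate is hard-coded to 30 here; if the source video runs at a different rate, the saved output will play back too fast or too slow. One optional tweak (not in the original code) is to reuse the source video's frame rate, which OpenCV exposes via CAP_PROP_FPS:

# optional tweak: reuse the source video's frame rate instead of a
# hard-coded 30, falling back to 30 when OpenCV reports 0
src_fps = vs.get(cv2.CAP_PROP_FPS)
if src_fps <= 0:
    src_fps = 30
writer = cv2.VideoWriter(args["output"], fourcc, src_fps,
    (frame.shape[1], frame.shape[0]), True)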
Detect an object to seed the tracker (when tracker is None)

If tracker is None, we first need to detect an object in the input frame.

We grab the frame and convert it to a blob. (Despite the name, this "blob" is not a connected region of same-colored pixels; in cv2.dnn it is simply the preprocessed 4-D array that the network takes as input.)

We then pass the blob through the network to obtain the detections and predictions.

    # if our correlation object tracker is None we first need to apply an object detector to seed the tracker with something to actually track
    if tracker is None:
        # grab the frame dimensions and convert the frame to a blob
        # (scale factor 0.007843 = 1/127.5 and mean 127.5 match the
        # preprocessing this MobileNet SSD model expects)
        (h, w) = frame.shape[:2]
        blob = cv2.dnn.blobFromImage(frame, 0.007843, (w, h), 127.5)
        # pass the blob through the network and obtain the detections and predictions
        net.setInput(blob)
        detections = net.forward()
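net.forward() returns the detections in the standard SSD output layout: an array of shape (1, 1, N, 7), where each of the N rows holds [batch_id, class_id, confidence, startX, startY, endX, endY] with the box coordinates normalized to the range [0, 1]. That is why the code below reads detections[0, 0, i, 1] for the class, [..., 2] for the confidence, and [..., 3:7] for the box. A quick way to inspect it:

# inspect the raw SSD output (shape is typically (1, 1, N, 7))
print("detections shape:", detections.shape)
for det in detections[0, 0]:
    class_id, confidence = int(det[1]), det[2]
    print(CLASSES[class_id], "confidence: {:.2f}".format(confidence))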
Handle the detections

If the detector finds at least one object, we grab the one with the highest probability.

This post only demonstrates single-object tracking with dlib, so we simply take the most confident detection.

Future examples will demonstrate how to detect and extract specific objects.

We then grab the confidence and class label associated with that object.

        # ensure at least one detection is made
        if len(detections) > 0:
            # find the index of the detection with the largest
            # probability -- out of convenience we are only going
            # to track the first object we find with the largest
            # probability; future examples will demonstrate how to
            # detect and extract *specific* objects
            i = np.argmax(detections[0, 0, :, 2])
            # grab the probability associated with the object along with its class label
            conf = detections[0, 0, i, 2]
            label = CLASSES[int(detections[0, 0, i, 1])]
Filter out weak detections

First we make sure we actually have the right kind of object: the detection's confidence must exceed the threshold, and its label must match the --label argument. This example uses person (or cat), so you can see the filtering at work.

We then compute the (x, y)-coordinates of the object's bounding box.

Finally we build a dlib rectangle from those coordinates, start the correlation tracker, and draw the box and label on the frame.

        # filter out weak detections by requiring a minimum confidence
        if conf > args["confidence"] and label == args["label"]:
            # compute the (x, y)-coordinates of the bounding box for the object
            box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
            (startX, startY, endX, endY) = box.astype("int")
            # construct a dlib rectangle object from the bounding box coordinates and then start the dlib correlation tracker
            tracker = dlib.correlation_tracker()
            rect = dlib.rectangle(startX, startY, endX, endY)
            tracker.start_track(rgb, rect)
            # draw the bounding box and text for the object
            cv2.rectangle(frame, (startX, startY), (endX, endY),
                          (0, 255, 0), 2)
            cv2.putText(frame, label, (startX, startY - 15),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.45, (0, 255, 0), 2)
Handle the case where the tracker is already initialized

We update the tracker (the update method is where the heavy lifting happens) and grab the tracked object's position from it.

In a real application you could act on that position (for example, a robot could steer toward the tracked object); in this post we simply draw the bounding box around the target and annotate it.

    # otherwise, we've already performed detection so let's track
    # the object
    else:
        # update the tracker and grab the position of the tracked
        # object
        tracker.update(rgb)
        pos = tracker.get_position()
        # unpack the position object
        startX = int(pos.left())
        startY = int(pos.top())
        endX = int(pos.right())
        endY = int(pos.bottom())
        # draw the bounding box from the correlation object tracker
        cv2.rectangle(frame, (startX, startY), (endX, endY),
                      (0, 255, 0), 2)
        cv2.putText(frame, label, (startX, startY - 15),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.45, (0, 255, 0), 2)
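One detail worth knowing (not used in this script): dlib's correlation_tracker.update() returns a confidence value (the peak-to-side-lobe ratio), and a sharp drop usually means the tracker has lost the object. A rough sketch of using it to fall back to re-detection, with an arbitrarily chosen threshold:

# hypothetical variant of the else-branch: watch the tracker's
# confidence and force re-detection when it collapses
psr = tracker.update(rgb)
if psr < 7.0:  # 7.0 is an arbitrary illustrative threshold
    tracker = None  # the detection branch will run again on the next frame
else:
    pos = tracker.get_position()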
Finish the loop

Write the frame to the output video if one was requested.

Display the processed frame (with the box and annotation) on screen.

Update the FPS counter.

    # check to see if we should write the frame to disk
    if writer is not None:
        writer.write(frame)
    # show the output frame
    cv2.imshow("Frame", frame)
    key = cv2.waitKey(1) & 0xFF
    # if the `q` key was pressed, break from the loop
    if key == ord("s"):
        break
    # update the FPS counter
    fps.update()

Cleanup

Print the FPS throughput information and release the pointers.

# stop the timer and display FPS information
fps.stop()
print("[INFO] elapsed time: {:.2f}".format(fps.elapsed()))
print("[INFO] approx. FPS: {:.2f}".format(fps.fps()))
# check to see if we need to release the video writer pointer
if writer is not None:
    writer.release()
# do a bit of cleanup
cv2.destroyAllWindows()
vs.release()

Complete code and usage

# USAGE
# python track_object.py --prototxt mobilenet_ssd/MobileNetSSD_deploy.prototxt \
#	--model mobilenet_ssd/MobileNetSSD_deploy.caffemodel --video input/race.mp4 \
#	--label person --output output/race_output.avi

# import the necessary packages
from imutils.video import FPS
import numpy as np
import argparse
import imutils
import dlib
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-p", "--prototxt", required=True,
	help="path to Caffe 'deploy' prototxt file")
ap.add_argument("-m", "--model", required=True,
	help="path to Caffe pre-trained model")
ap.add_argument("-v", "--video", required=True,
	help="path to input video file")
ap.add_argument("-l", "--label", required=True,
	help="class label we are interested in detecting + tracking")
ap.add_argument("-o", "--output", type=str,
	help="path to optional output video file")
ap.add_argument("-c", "--confidence", type=float, default=0.2,
	help="minimum probability to filter weak detections")
args = vars(ap.parse_args())

# initialize the list of class labels MobileNet SSD was trained to
# detect
CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat",
	"bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
	"dog", "horse", "motorbike", "person", "pottedplant", "sheep",
	"sofa", "train", "tvmonitor"]

# load our serialized model from disk
print("[INFO] loading model...")
net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])

# initialize the video stream, dlib correlation tracker, output video
# writer, and predicted class label
print("[INFO] starting video stream...")
vs = cv2.VideoCapture(args["video"])
tracker = None
writer = None
label = ""

# start the frames per second throughput estimator
fps = FPS().start()

# loop over frames from the video file stream
while True:
	# grab the next frame from the video file
	(grabbed, frame) = vs.read()

	# check to see if we have reached the end of the video file
	if frame is None:
		break

	# resize the frame for faster processing and then convert the
	# frame from BGR to RGB ordering (dlib needs RGB ordering)
	frame = imutils.resize(frame, width=600)
	rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

	# if we are supposed to be writing a video to disk, initialize
	# the writer
	if args["output"] is not None and writer is None:
		fourcc = cv2.VideoWriter_fourcc(*"MJPG")
		writer = cv2.VideoWriter(args["output"], fourcc, 30,
			(frame.shape[1], frame.shape[0]), True)

	# if our correlation object tracker is None we first need to
	# apply an object detector to seed the tracker with something
	# to actually track
	if tracker is None:
		# grab the frame dimensions and convert the frame to a blob
		(h, w) = frame.shape[:2]
		blob = cv2.dnn.blobFromImage(frame, 0.007843, (w, h), 127.5)

		# pass the blob through the network and obtain the detections
		# and predictions
		net.setInput(blob)
		detections = net.forward()

		# ensure at least one detection is made
		if len(detections) > 0:
			# find the index of the detection with the largest
			# probability -- out of convenience we are only going
			# to track the first object we find with the largest
			# probability; future examples will demonstrate how to
			# detect and extract *specific* objects
			i = np.argmax(detections[0, 0, :, 2])

			# grab the probability associated with the object along
			# with its class label
			conf = detections[0, 0, i, 2]
			label = CLASSES[int(detections[0, 0, i, 1])]

			# filter out weak detections by requiring a minimum
			# confidence
			if conf > args["confidence"] and label == args["label"]:
				# compute the (x, y)-coordinates of the bounding box
				# for the object
				box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
				(startX, startY, endX, endY) = box.astype("int")

				# construct a dlib rectangle object from the bounding
				# box coordinates and then start the dlib correlation
				# tracker
				tracker = dlib.correlation_tracker()
				rect = dlib.rectangle(startX, startY, endX, endY)
				tracker.start_track(rgb, rect)

				# draw the bounding box and text for the object
				cv2.rectangle(frame, (startX, startY), (endX, endY),
					(0, 255, 0), 2)
				cv2.putText(frame, label, (startX, startY - 15),
					cv2.FONT_HERSHEY_SIMPLEX, 0.45, (0, 255, 0), 2)

	# otherwise, we've already performed detection so let's track
	# the object
	else:
		# update the tracker and grab the position of the tracked
		# object
		tracker.update(rgb)
		pos = tracker.get_position()

		# unpack the position object
		startX = int(pos.left())
		startY = int(pos.top())
		endX = int(pos.right())
		endY = int(pos.bottom())

		# draw the bounding box from the correlation object tracker
		cv2.rectangle(frame, (startX, startY), (endX, endY),
			(0, 255, 0), 2)
		cv2.putText(frame, label, (startX, startY - 15),
			cv2.FONT_HERSHEY_SIMPLEX, 0.45, (0, 255, 0), 2)

	# check to see if we should write the frame to disk
	if writer is not None:
		writer.write(frame)

	# show the output frame
	cv2.imshow("Frame", frame)
	key = cv2.waitKey(1) & 0xFF

	# if the `q` key was pressed, break from the loop
	if key == ord("q"):
		break

	# update the FPS counter
	fps.update()

# stop the timer and display FPS information
fps.stop()
print("[INFO] elapsed time: {:.2f}".format(fps.elapsed()))
print("[INFO] approx. FPS: {:.2f}".format(fps.fps()))

# check to see if we need to release the video writer pointer
if writer is not None:
	writer.release()

# do a bit of cleanup
cv2.destroyAllWindows()
vs.release()

Source: https://www.pyimagesearch.com/2018/10/22/object-tracking-with-dlib/
The complete code and resources can also be downloaded there; this post is a tidied-up walkthrough of that tutorial.
