An Introduction to the detectObjectsFromVideo Method of ImageAI's CustomVideoObjectDetection

When I first saw this method I was overwhelmed: so many parameters, and with my poor English, the long English docstring below them made it even worse. I worked through it bit by bit, because I wanted to run real-time detection on a camera video stream.
Here are my notes. First, the method signature:

def detectObjectsFromVideo(self, input_file_path="", camera_input=None, output_file_path="", frames_per_second=20,
                           frame_detection_interval=1, minimum_percentage_probability=50, log_progress=False,
                           display_percentage_probability=True, display_object_name=True, save_detected_video=True,
                           per_frame_function=None, per_second_function=None, per_minute_function=None,
                           video_complete_function=None, return_detected_frame=False, detection_timeout = None):

The English docstring

If your English is good, you will find the method's docstring quite thorough. Below I give my translation and understanding of it. First, the docstring itself:

 """

    'detectObjectsFromVideo()' function is used to detect objects observable in the given video path or a camera input:
        * input_file_path , which is the file path to the input video. It is required only if 'camera_input' is not set
        * camera_input , allows you to parse in camera input for live video detections
        * output_file_path , which is the path to the output video. It is required only if 'save_detected_video' is not set to False
        * frames_per_second , which is the number of frames to be used in the output video
        * frame_detection_interval (optional, 1 by default)  , which is the intervals of frames that will be detected.
        * minimum_percentage_probability (optional, 50 by default) , option to set the minimum percentage probability for nominating a detected object for output.
        * log_progress (optional) , which states if the progress of the frame processed is to be logged to console
        * display_percentage_probability (optional), can be used to hide or show probability scores on the detected video frames
        * display_object_name (optional), can be used to show or hide object names on the detected video frames
        * save_save_detected_video (optional, True by default), can be set to or not to save the detected video
        * per_frame_function (optional), this parameter allows you to parse in a function you will want to execute after each frame of the video is detected. If this parameter is set to a function, after every video  frame is detected, the function will be executed with the following values parsed into it:
            -- position number of the frame
            -- an array of dictinaries, with each dictinary corresponding to each object detected. Each dictionary contains 'name', 'percentage_probability' and 'box_points'
            -- a dictionary with with keys being the name of each unique objects and value are the number of instances of the object present
            -- If return_detected_frame is set to True, the numpy array of the detected frame will be parsed as the fourth value into the function

        * per_second_function (optional), this parameter allows you to parse in a function you will want to execute after each second of the video is detected. If this parameter is set to a function, after every second of a video is detected, the function will be executed with the following values parsed into it:
            -- position number of the second
            -- an array of dictionaries whose keys are position number of each frame present in the last second , and the value for each key is the array for each frame that contains the dictionaries for each object detected in the frame
            -- an array of dictionaries, with each dictionary corresponding to each frame in the past second, and the keys of each dictionary are the name of the number of unique objects detected in each frame, and the key values are the number of instances of the objects found in the frame
            -- a dictionary with its keys being the name of each unique object detected throughout the past second, and the key values are the average number of instances of the object found in all the frames contained in the past second
            -- If return_detected_frame is set to True, the numpy array of the detected frame will be parsed
                                                                as the fifth value into the function

        * per_minute_function (optional), this parameter allows you to parse in a function you will want to execute after each minute of the video is detected. If this parameter is set to a function, after every minute of a video is detected, the function will be executed with the following values parsed into it:
            -- position number of the minute
            -- an array of dictionaries whose keys are position number of each frame present in the last minute , and the value for each key is the array for each frame that contains the dictionaries for each object detected in the frame

            -- an array of dictionaries, with each dictionary corresponding to each frame in the past minute, and the keys of each dictionary are the name of the number of unique objects detected in each frame, and the key values are the number of instances of the objects found in the frame

            -- a dictionary with its keys being the name of each unique object detected throughout the past minute, and the key values are the average number of instances of the object found in all the frames contained in the past minute

            -- If return_detected_frame is set to True, the numpy array of the detected frame will be parsed as the fifth value into the function

        * video_complete_function (optional), this parameter allows you to parse in a function you will want to execute after all of the video frames have been detected. If this parameter is set to a function, after all of frames of a video is detected, the function will be executed with the following values parsed into it:
            -- an array of dictionaries whose keys are position number of each frame present in the entire video , and the value for each key is the array for each frame that contains the dictionaries for each object detected in the frame
            -- an array of dictionaries, with each dictionary corresponding to each frame in the entire video, and the keys of each dictionary are the name of the number of unique objects detected in each frame, and the key values are the number of instances of the objects found in the frame
            -- a dictionary with its keys being the name of each unique object detected throughout the entire video, and the key values are the average number of instances of the object found in all the frames contained in the entire video

        * return_detected_frame (optionally, False by default), option to obtain the return the last detected video frame into the per_per_frame_function, per_per_second_function or per_per_minute_function

        * detection_timeout (optionally, None by default), option to state the number of seconds of a video that should be detected after which the detection function stop processing the video

    :param input_file_path:
    :param camera_input:
    :param output_file_path:
    :param frames_per_second:
    :param frame_detection_interval:
    :param minimum_percentage_probability:
    :param log_progress:
    :param display_percentage_probability:
    :param display_object_name:
    :param save_detected_video:
    :param per_frame_function:
    :param per_second_function:
    :param per_minute_function:
    :param video_complete_function:
    :param return_detected_frame:
    :param detection_timeout:
    :return output_video_filepath:
    :return counting:
    :return output_objects_array:
    :return output_objects_count:
    :return detected_copy:
    :return this_second_output_object_array:
    :return this_second_counting_array:
    :return this_second_counting:
    :return this_minute_output_object_array:
    :return this_minute_counting_array:
    :return this_minute_counting:
    :return this_video_output_object_array:
    :return this_video_counting_array:
    :return this_video_counting:
    """

Translated parameter descriptions:

Method name
detectObjectsFromVideo: detects objects in a given video file or a camera input.
Parameters
input_file_path: the file path of the input video. Required only when 'camera_input' is not set.
camera_input: the live video stream you want to run detection on (e.g. a cv2.VideoCapture object).
output_file_path: the path of the output video. Required only when 'save_detected_video' is not set to False.
frames_per_second: the frame rate used for the output video.
frame_detection_interval (optional, 1 by default): detection is run once every this many frames.
minimum_percentage_probability (optional, 50 by default): the minimum percentage probability a detection must reach to be kept in the output.
log_progress (optional): whether the progress of frame processing is logged to the console.
display_percentage_probability (optional): show or hide probability scores on the detected video frames.
display_object_name (optional): show or hide object names on the detected video frames.
save_detected_video (optional, True by default; misspelled "save_save_detected_video" in the docstring): whether to save the detected video.
per_frame_function (optional): a callback function to execute after each frame of the video is detected. If set, after every frame is detected the function is called with the following values:

  • the position number of the frame
  • an array of dictionaries, one per detected object; each dictionary contains 'name', 'percentage_probability' and 'box_points'
  • a dictionary whose keys are the names of the unique objects and whose values are the number of instances of each in the frame
  • if return_detected_frame is set to True, the numpy array of the detected frame is passed as the fourth argument

(per_second_function, per_minute_function, and video_complete_function follow the same pattern and are omitted here.)

return_detected_frame (optional, False by default): whether to pass the last detected video frame into per_frame_function, per_second_function, or per_minute_function.
detection_timeout (optional, None by default): the number of seconds of video to detect, after which detection stops processing the video.
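The callback signatures implied by the descriptions above can be sketched as follows. The function names (on_frame, on_second, on_video_end) are illustrative and not part of ImageAI; only the argument order and meaning are taken from the docstring.

```python
# Hedged sketch of the callback shapes described by the docstring.
# Names here are assumptions; ImageAI only requires callables with this arity.

def on_frame(frame_number, objects_array, objects_count, detected_frame=None):
    """Called after each detected frame.
    objects_array: list of dicts, each with 'name', 'percentage_probability', 'box_points'.
    objects_count: dict mapping each unique object name to its instance count.
    detected_frame: numpy array, passed only when return_detected_frame=True."""
    print(frame_number, objects_count)

def on_second(second_number, frame_arrays, frame_counts, second_counts, detected_frame=None):
    """Called once per second of video; second_counts holds per-object averages
    over the frames of that second."""
    print(second_number, second_counts)

def on_video_end(frame_arrays, frame_counts, video_counts):
    """Called after the whole video; video_counts holds per-object averages
    over all frames."""
    print(video_counts)

# Demonstration with dummy data shaped like ImageAI's output:
on_frame(1, [{"name": "car", "percentage_probability": 91.2, "box_points": [0, 0, 10, 10]}], {"car": 1})
```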

Example Code

My call example (the code is incomplete; it only shows the calling flow):

import os
import cv2
from imageai.Detection.Custom import CustomVideoObjectDetection

execution_path = os.getcwd()

def run_camera_test_net():
    user, pwd, ip, channel = "admin", "123456789", "10.13.10.64", 1
    cap_path = "rtsp://%s:%s@%s/h264/ch%s/main/av_stream" % (user, pwd, ip, channel)  # HIKVISION, old 2015 firmware
    # Open the HIKVISION video stream
    cap = cv2.VideoCapture(cap_path)
    # Hand the video stream to the detection routine
    detect_from_camera_image(cap)

# Run detection on the camera stream
def detect_from_camera_image(cap):
    detector = CustomVideoObjectDetection()
    detector.setModelTypeAsYOLOv3()
    detector.setModelPath(detection_model_path="test.h5")  # model file not provided here
    detector.setJsonPath(configuration_json="test.json")   # configuration file not provided here
    detector.loadModel()
    detected_video_path = detector.detectObjectsFromVideo(camera_input=cap,
                                                          save_detected_video=True,
                                                          frame_detection_interval=1,
                                                          minimum_percentage_probability=50,
                                                          output_file_path=os.path.join(execution_path, "camera-detected"),
                                                          frames_per_second=20,
                                                          per_frame_function=show_video,
                                                          return_detected_frame=True,
                                                          log_progress=True)

# Callback invoked after each detected frame; used here to display frames in real time.
def show_video(counting, output_objects_array, output_objects_count, detected_frame):
    print(counting)  # position number of the frame
    print(output_objects_array)  # array of dictionaries, one per detected object, each with 'name', 'percentage_probability' and 'box_points'
    print(output_objects_count)  # dictionary mapping each unique object name to its instance count
    cv2.imshow("frame", detected_frame)  # numpy array of the detected frame, passed because return_detected_frame=True
    cv2.waitKey(1)

if __name__ == '__main__':
    run_camera_test_net()
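The HIKVISION-style RTSP URL built in the example above can be factored into a small helper. This helper is hypothetical (not part of ImageAI or OpenCV), and the URL path pattern varies across camera vendors and firmware versions, so treat the format as an assumption for that specific device.

```python
# Hypothetical helper: build the HIKVISION-style RTSP URL from the example.
# The "/h264/ch<N>/main/av_stream" path is firmware-specific (old 2015 models).
def build_rtsp_url(user, pwd, ip, channel):
    return "rtsp://%s:%s@%s/h264/ch%s/main/av_stream" % (user, pwd, ip, channel)

print(build_rtsp_url("admin", "123456789", "10.13.10.64", 1))
# → rtsp://admin:123456789@10.13.10.64/h264/ch1/main/av_stream
```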

Method Source

The full source of the method is as follows:

def detectObjectsFromVideo(self, input_file_path="", camera_input=None, output_file_path="", frames_per_second=20,
                           frame_detection_interval=1, minimum_percentage_probability=50, log_progress=False,
                           display_percentage_probability=True, display_object_name=True, save_detected_video=True,
                           per_frame_function=None, per_second_function=None, per_minute_function=None,
                           video_complete_function=None, return_detected_frame=False, detection_timeout = None):




    """

    'detectObjectsFromVideo()' function is used to detect objects observable in the given video path or a camera input:
        * input_file_path , which is the file path to the input video. It is required only if 'camera_input' is not set
        * camera_input , allows you to parse in camera input for live video detections
        * output_file_path , which is the path to the output video. It is required only if 'save_detected_video' is not set to False
        * frames_per_second , which is the number of frames to be used in the output video
        * frame_detection_interval (optional, 1 by default)  , which is the intervals of frames that will be detected.
        * minimum_percentage_probability (optional, 50 by default) , option to set the minimum percentage probability for nominating a detected object for output.
        * log_progress (optional) , which states if the progress of the frame processed is to be logged to console
        * display_percentage_probability (optional), can be used to hide or show probability scores on the detected video frames
        * display_object_name (optional), can be used to show or hide object names on the detected video frames
        * save_save_detected_video (optional, True by default), can be set to or not to save the detected video
        * per_frame_function (optional), this parameter allows you to parse in a function you will want to execute after each frame of the video is detected. If this parameter is set to a function, after every video  frame is detected, the function will be executed with the following values parsed into it:
            -- position number of the frame
            -- an array of dictinaries, with each dictinary corresponding to each object detected. Each dictionary contains 'name', 'percentage_probability' and 'box_points'
            -- a dictionary with with keys being the name of each unique objects and value are the number of instances of the object present
            -- If return_detected_frame is set to True, the numpy array of the detected frame will be parsed as the fourth value into the function

        * per_second_function (optional), this parameter allows you to parse in a function you will want to execute after each second of the video is detected. If this parameter is set to a function, after every second of a video is detected, the function will be executed with the following values parsed into it:
            -- position number of the second
            -- an array of dictionaries whose keys are position number of each frame present in the last second , and the value for each key is the array for each frame that contains the dictionaries for each object detected in the frame
            -- an array of dictionaries, with each dictionary corresponding to each frame in the past second, and the keys of each dictionary are the name of the number of unique objects detected in each frame, and the key values are the number of instances of the objects found in the frame
            -- a dictionary with its keys being the name of each unique object detected throughout the past second, and the key values are the average number of instances of the object found in all the frames contained in the past second
            -- If return_detected_frame is set to True, the numpy array of the detected frame will be parsed
                                                                as the fifth value into the function

        * per_minute_function (optional), this parameter allows you to parse in a function you will want to execute after each minute of the video is detected. If this parameter is set to a function, after every minute of a video is detected, the function will be executed with the following values parsed into it:
            -- position number of the minute
            -- an array of dictionaries whose keys are position number of each frame present in the last minute , and the value for each key is the array for each frame that contains the dictionaries for each object detected in the frame

            -- an array of dictionaries, with each dictionary corresponding to each frame in the past minute, and the keys of each dictionary are the name of the number of unique objects detected in each frame, and the key values are the number of instances of the objects found in the frame

            -- a dictionary with its keys being the name of each unique object detected throughout the past minute, and the key values are the average number of instances of the object found in all the frames contained in the past minute

            -- If return_detected_frame is set to True, the numpy array of the detected frame will be parsed as the fifth value into the function

        * video_complete_function (optional), this parameter allows you to parse in a function you will want to execute after all of the video frames have been detected. If this parameter is set to a function, after all of frames of a video is detected, the function will be executed with the following values parsed into it:
            -- an array of dictionaries whose keys are position number of each frame present in the entire video , and the value for each key is the array for each frame that contains the dictionaries for each object detected in the frame
            -- an array of dictionaries, with each dictionary corresponding to each frame in the entire video, and the keys of each dictionary are the name of the number of unique objects detected in each frame, and the key values are the number of instances of the objects found in the frame
            -- a dictionary with its keys being the name of each unique object detected throughout the entire video, and the key values are the average number of instances of the object found in all the frames contained in the entire video

        * return_detected_frame (optionally, False by default), option to obtain the return the last detected video frame into the per_per_frame_function, per_per_second_function or per_per_minute_function

        * detection_timeout (optionally, None by default), option to state the number of seconds of a video that should be detected after which the detection function stop processing the video

    :param input_file_path:
    :param camera_input:
    :param output_file_path:
    :param frames_per_second:
    :param frame_detection_interval:
    :param minimum_percentage_probability:
    :param log_progress:
    :param display_percentage_probability:
    :param display_object_name:
    :param save_detected_video:
    :param per_frame_function:
    :param per_second_function:
    :param per_minute_function:
    :param video_complete_function:
    :param return_detected_frame:
    :param detection_timeout:
    :return output_video_filepath:
    :return counting:
    :return output_objects_array:
    :return output_objects_count:
    :return detected_copy:
    :return this_second_output_object_array:
    :return this_second_counting_array:
    :return this_second_counting:
    :return this_minute_output_object_array:
    :return this_minute_counting_array:
    :return this_minute_counting:
    :return this_video_output_object_array:
    :return this_video_counting_array:
    :return this_video_counting:
    """

    output_frames_dict = {}
    output_frames_count_dict = {}

    input_video = cv2.VideoCapture(input_file_path)
    if (camera_input != None):
        input_video = camera_input

    output_video_filepath = output_file_path + '.avi'

    frame_width = int(input_video.get(3))
    frame_height = int(input_video.get(4))
    output_video = cv2.VideoWriter(output_video_filepath, cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'),
                                   frames_per_second,
                                   (frame_width, frame_height))

    counting = 0
    predicted_numbers = None
    scores = None
    detections = None


    detection_timeout_count = 0
    video_frames_count = 0


    if(self.__model_type == "yolov3"):



        while (input_video.isOpened()):
            ret, frame = input_video.read()

            if (ret == True):

                detected_frame = frame.copy()

                video_frames_count += 1
                if (detection_timeout != None):
                    if ((video_frames_count % frames_per_second) == 0):
                        detection_timeout_count += 1

                    if (detection_timeout_count >= detection_timeout):
                        break

                output_objects_array = []

                counting += 1

                if (log_progress == True):
                    print("Processing Frame : ", str(counting))



                check_frame_interval = counting % frame_detection_interval

                if (counting == 1 or check_frame_interval == 0):
                    try:
                        detected_frame, output_objects_array = self.__detector.detectObjectsFromImage(
                            input_image=frame, input_type="array", output_type="array",
                            minimum_percentage_probability=minimum_percentage_probability,
                            display_percentage_probability=display_percentage_probability,
                            display_object_name=display_object_name)
                    except:
                        None


                output_frames_dict[counting] = output_objects_array

                output_objects_count = {}
                for eachItem in output_objects_array:
                    eachItemName = eachItem["name"]
                    try:
                        output_objects_count[eachItemName] = output_objects_count[eachItemName] + 1
                    except:
                        output_objects_count[eachItemName] = 1

                output_frames_count_dict[counting] = output_objects_count


                if (save_detected_video == True):
                    output_video.write(detected_frame)

                if (counting == 1 or check_frame_interval == 0):
                    if (per_frame_function != None):
                        if (return_detected_frame == True):
                            per_frame_function(counting, output_objects_array, output_objects_count,
                                               detected_frame)
                        elif (return_detected_frame == False):
                            per_frame_function(counting, output_objects_array, output_objects_count)

                if (per_second_function != None):
                    if (counting != 1 and (counting % frames_per_second) == 0):

                        this_second_output_object_array = []
                        this_second_counting_array = []
                        this_second_counting = {}

                        for aa in range(counting):
                            if (aa >= (counting - frames_per_second)):
                                this_second_output_object_array.append(output_frames_dict[aa + 1])
                                this_second_counting_array.append(output_frames_count_dict[aa + 1])

                        for eachCountingDict in this_second_counting_array:
                            for eachItem in eachCountingDict:
                                try:
                                    this_second_counting[eachItem] = this_second_counting[eachItem] + \
                                                                     eachCountingDict[eachItem]
                                except:
                                    this_second_counting[eachItem] = eachCountingDict[eachItem]

                        for eachCountingItem in this_second_counting:
                            this_second_counting[eachCountingItem] = int(this_second_counting[eachCountingItem] / frames_per_second)

                        if (return_detected_frame == True):
                            per_second_function(int(counting / frames_per_second),
                                                this_second_output_object_array, this_second_counting_array,
                                                this_second_counting, detected_frame)

                        elif (return_detected_frame == False):
                            per_second_function(int(counting / frames_per_second),
                                                this_second_output_object_array, this_second_counting_array,
                                                this_second_counting)

                if (per_minute_function != None):

                    if (counting != 1 and (counting % (frames_per_second * 60)) == 0):

                        this_minute_output_object_array = []
                        this_minute_counting_array = []
                        this_minute_counting = {}

                        for aa in range(counting):
                            if (aa >= (counting - (frames_per_second * 60))):
                                this_minute_output_object_array.append(output_frames_dict[aa + 1])
                                this_minute_counting_array.append(output_frames_count_dict[aa + 1])

                        for eachCountingDict in this_minute_counting_array:
                            for eachItem in eachCountingDict:
                                try:
                                    this_minute_counting[eachItem] = this_minute_counting[eachItem] + \
                                                                     eachCountingDict[eachItem]
                                except:
                                    this_minute_counting[eachItem] = eachCountingDict[eachItem]

                        for eachCountingItem in this_minute_counting:
                            this_minute_counting[eachCountingItem] = int(this_minute_counting[eachCountingItem] / (frames_per_second * 60))

                        if (return_detected_frame == True):
                            per_minute_function(int(counting / (frames_per_second * 60)),
                                                this_minute_output_object_array, this_minute_counting_array,
                                                this_minute_counting, detected_frame)

                        elif (return_detected_frame == False):
                            per_minute_function(int(counting / (frames_per_second * 60)),
                                                this_minute_output_object_array, this_minute_counting_array,
                                                this_minute_counting)


            else:
                break

        if (video_complete_function != None):

            this_video_output_object_array = []
            this_video_counting_array = []
            this_video_counting = {}

            for aa in range(counting):
                this_video_output_object_array.append(output_frames_dict[aa + 1])
                this_video_counting_array.append(output_frames_count_dict[aa + 1])

            for eachCountingDict in this_video_counting_array:
                for eachItem in eachCountingDict:
                    try:
                        this_video_counting[eachItem] = this_video_counting[eachItem] + \
                                                        eachCountingDict[eachItem]
                    except:
                        this_video_counting[eachItem] = eachCountingDict[eachItem]

            for eachCountingItem in this_video_counting:
                this_video_counting[eachCountingItem] = this_video_counting[
                                                            eachCountingItem] / counting

            video_complete_function(this_video_output_object_array, this_video_counting_array,
                                    this_video_counting)

        input_video.release()
        output_video.release()

        if (save_detected_video == True):
            return output_video_filepath
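The per-second aggregation loop in the source above can be condensed into a small standalone function for clarity. This is a re-implementation for illustration only, not ImageAI's own API: it sums each object's per-frame counts over the last frames_per_second frames, then integer-divides to produce the averages that are passed to per_second_function.

```python
# Minimal re-implementation of the per-second count aggregation above.
def aggregate_counts(per_frame_counts, frames_per_second):
    totals = {}
    # Sum per-object counts over the most recent frames_per_second frames
    for frame_counts in per_frame_counts[-frames_per_second:]:
        for name, count in frame_counts.items():
            totals[name] = totals.get(name, 0) + count
    # ImageAI uses int() truncation, so averages are rounded toward zero
    return {name: int(total / frames_per_second) for name, total in totals.items()}

frame_counts = [{"person": 2}, {"person": 1, "car": 1}, {"person": 3}, {}]
print(aggregate_counts(frame_counts, 4))  # → {'person': 1, 'car': 0}
```

Note the truncation: an object seen once in a 20-fps second averages int(1/20) = 0 and effectively disappears from the per-second summary, which is worth knowing when tuning frames_per_second.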