Video Shot Transition Detection and Image Blur Detection

NCU_wander

Category: Image fundamentals (图像相关基础知识)

Article link: https://blog.csdn.net/NCU_wander/article/details/107362583

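I have recently been reading about shot transition detection in video, and it struck me how similar it is to the curb detection work I did in graduate school, namely the approach that Chen Dong (陈东) named the sliding-window method back then. Revisiting shot transition detection felt very familiar.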

1. Shot transition detection

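Traditional shot-cut detection relies on changes in image features: extract certain features from consecutive video frames, and if those features change abruptly, a shot cut has occurred. The blog post 镜头分割：像素域方法综述 (Shot segmentation: a survey of pixel-domain methods) covers quite a few feature choices, including histograms and edge-detection operators. Below, a single feature is used as an example to introduce the sliding-window method for shot-cut detection.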

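The idea first: detect shot boundaries from the differences between the grayscale histograms of frames; the larger a frame's difference from its predecessor, the more likely it is a shot boundary frame. Because a histogram ignores where pixels are located, this reduces spurious differences caused by camera motion or moving objects in the picture, and so improves boundary detection accuracy. Two points need attention:
1. Two adjacent shot boundaries must be separated by at least a threshold number of frames; a candidate that follows the previous boundary too closely is not accepted as a new shot.
2. A detected boundary frame's difference from the previous frame should be the largest frame difference within the shot, and it should also be a multiple of the mean frame difference of the current shot (for example, more than 5 times the mean).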
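To make the "histogram difference" concrete, here is a minimal sketch using NumPy. The function names `gray_histogram` and `histogram_difference` are made up for illustration, and the L1 distance between normalized histograms is only one possible choice of difference measure; the original post does not say which one it uses.

```python
import numpy as np


def gray_histogram(frame: np.ndarray, bins: int = 256) -> np.ndarray:
    """Return the normalized grayscale histogram of an 8-bit frame."""
    hist, _ = np.histogram(frame, bins=bins, range=(0, 256))
    return hist / max(frame.size, 1)


def histogram_difference(prev: np.ndarray, curr: np.ndarray) -> float:
    """L1 distance between the normalized histograms of two frames (0 to 2)."""
    return float(np.abs(gray_histogram(prev) - gray_histogram(curr)).sum())


# Two synthetic 8-bit "frames": a dark one and a bright one (assumed test data).
dark = np.full((4, 4), 10, dtype=np.uint8)
bright = np.full((4, 4), 200, dtype=np.uint8)
print(histogram_difference(dark, dark))    # 0.0: identical frames
print(histogram_difference(dark, bright))  # 2.0: no histogram overlap at all
```

Normalizing by the pixel count keeps the score between 0 and 2 regardless of resolution, which makes a fixed multiple-of-the-mean rule easier to reason about.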

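The sliding-window procedure is as follows:
1. Create a window, define the number of frames it holds, and fill it with that many frames; each decision is made on the frames in the window.
2. Find the frame in the window with the largest difference and mark it as the candidate boundary frame M for further checks.
3. Take the previous boundary frame P and check whether the number of frames between P and M exceeds the minimum shot length threshold. If it does, go to the next step; otherwise discard M, clear the window, and repeat steps 1-3.
4. Check whether M's difference reaches a threshold multiple of the mean difference from P to M (excluding M's own difference); if it does, accept M as a shot boundary.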
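The four steps can be sketched in plain Python roughly as follows. This is an illustration under assumptions, not the author's code: `detect_shot_boundaries` and its parameters are hypothetical names, windows are taken as consecutive non-overlapping chunks, the start of the video stands in for P before any boundary is found, and `diffs[i]` is assumed to hold the difference between frame `i` and frame `i - 1`.

```python
from typing import List, Sequence


def detect_shot_boundaries(diffs: Sequence[float], window_size: int = 10,
                           min_shot_len: int = 5, ratio: float = 5.0) -> List[int]:
    """Return indices of frames judged to start a new shot.

    diffs[i] is the difference between frame i and frame i - 1 (diffs[0] is unused).
    """
    boundaries: List[int] = []
    prev = 0  # P: the previous boundary frame (the video start at first)
    for start in range(1, len(diffs), window_size):
        # Step 1: take the next window of frames.
        window = range(start, min(start + window_size, len(diffs)))
        # Step 2: the frame with the largest difference is the candidate M.
        m = max(window, key=lambda i: diffs[i])
        # Step 3: discard M if the shot from P to M would be too short.
        if m - prev < min_shot_len:
            continue
        # Step 4: compare M with the mean difference between P and M, excluding M.
        between = diffs[prev + 1:m]
        if not between:
            continue
        mean = sum(between) / len(between)
        if diffs[m] > ratio * mean:
            boundaries.append(m)
            prev = m
    return boundaries


# Synthetic differences: mostly small, with spikes at frames 7 and 18.
diffs = [1.0] * 24
diffs[7] = 20.0
diffs[18] = 30.0
print(detect_shot_boundaries(diffs))  # [7, 18]
```

With `ratio=5.0` the second note above (a boundary must exceed five times the mean difference) is enforced in step 4, and `min_shot_len` enforces the first.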

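That is the main idea of the sliding-window method: slide the window along the video and pick out shot boundaries. Beyond this, follow-up refinements are needed, such as discarding all-black frames and dealing with blurry frames after the shots have been selected.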
Related posts: 视频镜头分割与关键帧提取 (Video shot segmentation and keyframe extraction); 视频镜头分割 (Video shot segmentation)

If you only need to try shot-cut detection quickly, you can use scenedetect (PySceneDetect), a mature project already available on GitHub.

pip install scenedetect
scenedetect -i liziqi.flv -o ./scenecut -s liziqi.stats.csv list-scenes detect-content save-images -n 1 -o ./scenecut/images export-html

The results look reasonably good in practice: for a 10,000-frame video it detected 87 shot cuts.

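Besides the command-line entry point, you can download the source code from GitHub and, using api_test.py and scene_manager.py as a starting point, write your own HSV-based content detection of shot transitions, then run blur detection on every frame of each shot and save the three sharpest images. This part is not hard to implement.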
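The "keep the three sharpest frames of each shot" step might look like the sketch below. It deliberately does not use the PySceneDetect API: `sharpest_frames_per_shot` is a hypothetical helper that assumes you already have one sharpness score per frame (for example the Laplacian variance from section 2) and a list of boundary frame indices.

```python
from typing import Dict, List, Sequence


def sharpest_frames_per_shot(scores: Sequence[float], boundaries: Sequence[int],
                             top_k: int = 3) -> Dict[int, List[int]]:
    """Map each shot's first frame index to the indices of its top_k sharpest frames."""
    starts = sorted({0, *boundaries})        # every shot start, including frame 0
    ends = starts[1:] + [len(scores)]        # each shot ends where the next begins
    selected: Dict[int, List[int]] = {}
    for start, end in zip(starts, ends):
        frames = range(start, end)
        # Higher score means sharper, so sort descending and keep the first top_k.
        best = sorted(frames, key=lambda i: scores[i], reverse=True)[:top_k]
        selected[start] = sorted(best)
    return selected


# Eight frames with made-up sharpness scores and one boundary at frame 4.
scores = [5.0, 9.0, 1.0, 7.0, 3.0, 8.0, 2.0, 6.0]
print(sharpest_frames_per_shot(scores, boundaries=[4]))
# {0: [0, 1, 3], 4: [4, 5, 7]}
```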

2. Image sharpness detection

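Traditional image sharpness measures come in many forms, but the core idea is the same: compute edge gradients to measure how sharp the edges in an image are. A sharp image has crisp edges, whereas applying a gradient operator to a blurry image does not yield high values.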

Take the Laplace operator as an example:

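The Laplace operator is an isotropic, second-order differential operator. It is suitable when only the position of an edge matters and the gray-level differences of the surrounding pixels do not. The Laplace operator responds more strongly to isolated pixels than to edges or lines, so it is only suitable for noise-free images. When an image contains noise, it should be low-pass filtered before edges are detected with the Laplace operator. For this reason, common edge segmentation algorithms combine the Laplace operator with a smoothing operator into a new template.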
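For reference, the standard definition of the Laplacian and its usual 4-neighbour discretization are given below. The symbols $f$, $x$, $y$ and $h$ are introduced here for illustration and do not appear in the original post.

```latex
\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}

\nabla^2 f(x, y) \approx f(x+1, y) + f(x-1, y) + f(x, y+1) + f(x, y-1) - 4 f(x, y)

K = \begin{bmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{bmatrix}
```

This also shows why noise is a problem: a single isolated pixel that is $h$ brighter than a flat background gives a response of $-4h$ at that pixel, while a pixel next to a straight step edge of the same height $h$ gives a response of only $\pm h$.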

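Image sharpening increases gray-level contrast so that a blurry image looks clearer. Blur is essentially the result of an averaging or integration operation on the image, so the inverse kind of operation can be applied: differentiation, for example, brings out image detail and makes the image clearer.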

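Because the Laplacian is a differential operator, applying it enhances regions where the gray level changes abruptly and attenuates regions where it varies slowly. Sharpening can therefore apply the Laplace operator to the original image to produce an image that describes the abrupt gray-level changes, and then combine that Laplacian image with the original to produce the sharpened image. This simple method gives the Laplacian sharpening effect while preserving background information: because the original image is combined with the Laplacian result, the gray values of the image are retained and contrast is increased where the gray level changes abruptly. The final result brings out small details while preserving the background. The drawback is a double response at some edges.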
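A small NumPy sketch of this kind of sharpening is shown below. It assumes the 4-neighbour Laplacian with a negative centre coefficient, in which case "combining" means subtracting the Laplacian from the original; the helper names `laplacian` and `sharpen`, the `strength` parameter, and the edge-replicated border are all choices made for this illustration.

```python
import numpy as np


def laplacian(image: np.ndarray) -> np.ndarray:
    """4-neighbour Laplacian (centre weight -4) with edge-replicated borders."""
    padded = np.pad(image.astype(np.float64), 1, mode="edge")
    return (padded[:-2, 1:-1] + padded[2:, 1:-1]
            + padded[1:-1, :-2] + padded[1:-1, 2:]
            - 4.0 * padded[1:-1, 1:-1])


def sharpen(image: np.ndarray, strength: float = 1.0) -> np.ndarray:
    """Subtract the Laplacian from the image and clip back to 8 bits."""
    sharpened = image.astype(np.float64) - strength * laplacian(image)
    return np.clip(sharpened, 0, 255).astype(np.uint8)


# A vertical step edge from 100 to 150, repeated over four rows.
row = np.array([100, 100, 100, 150, 150, 150], dtype=np.uint8)
image = np.tile(row, (4, 1))
print(sharpen(image)[0].tolist())  # [100, 100, 50, 200, 150, 150]
```

The flat regions are unchanged, while the dark side of the edge undershoots to 50 and the bright side overshoots to 200. That is the contrast gain at gray-level jumps, and the undershoot/overshoot pair is also where the double response at edges comes from.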

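Using the Laplacian to judge sharpness depends heavily on the threshold, which must be set sensibly for the task at hand. Because the threshold is directly tied to image size, it is best to resize all images to a common size before applying the threshold.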

#!/usr/bin/env python
# -*- coding: utf-8 -*-

'''
Blur Detection works using the total variance of the laplacian of an image, this provides a quick and accurate method for scoring how blurry an image is.
# run on a single image
python process.py -i input_image.png

# run on a directory of images
python process.py -i input_directory/ 

The saved json file has information on how blurry an image is, the higher the value, the less blurry the image.

see more details in https://github.com/WillBrennan/BlurDetection2
'''

import sys
import argparse
import logging
import pathlib
import json

import cv2
import numpy

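def fix_image_size(image: numpy.ndarray, expected_pixels: float = 2E6):
    # fx and fy each scale one axis, so take the square root to reach the target pixel count
    ratio = numpy.sqrt(expected_pixels / (image.shape[0] * image.shape[1]))
    return cv2.resize(image, (0, 0), fx=ratio, fy=ratio)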


def estimate_blur(image: numpy.ndarray, threshold: float = 100):
    if image.ndim == 3:
        image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # variance of the Laplacian response: a low variance means few sharp edges
    blur_map = cv2.Laplacian(image, cv2.CV_64F)
    score = float(numpy.var(blur_map))
    return blur_map, score, bool(score < threshold)


def pretty_blur_map(blur_map: numpy.ndarray, sigma: int = 5, min_abs: float = 0.5):
    abs_image = numpy.abs(blur_map).astype(numpy.float32)
    abs_image[abs_image < min_abs] = min_abs

    abs_image = numpy.log(abs_image)
    # cv2.blur returns a new array, so the result has to be kept
    abs_image = cv2.blur(abs_image, (sigma, sigma))
    return cv2.medianBlur(abs_image, sigma)


def parse_args():
    parser = argparse.ArgumentParser(description='run blur detection on a single image')
    parser.add_argument('-i', '--images', type=str, nargs='+', required=True, help='directory of images')
    parser.add_argument('-s', '--save-path', type=str, default=None, help='path to save output')

    parser.add_argument('-t', '--threshold', type=float, default=100.0, help='blurry threshold')
    parser.add_argument('-f', '--variable-size', action='store_true', help='keep the original image size (skip size normalization)')

    parser.add_argument('-v', '--verbose', action='store_true', help='set logging level to debug')
    parser.add_argument('-d', '--display', action='store_true', help='display images')

    return parser.parse_args()


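def find_images(image_paths, img_extensions=('.jpg', '.png', '.jpeg')):
    # build a new list instead of mutating a shared default argument on every call
    img_extensions = list(img_extensions) + [i.upper() for i in img_extensions]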

    for path in image_paths:
        path = pathlib.Path(path)

        if path.is_file():
            if path.suffix not in img_extensions:
                logging.info(f'{path.suffix} is not an image extension! skipping {path}')
                continue
            else:
                yield path

        if path.is_dir():
            for img_ext in img_extensions:
                yield from path.rglob(f'*{img_ext}')


if __name__ == '__main__':
    assert sys.version_info >= (3, 6), sys.version_info
    args = parse_args()

    level = logging.DEBUG if args.verbose else logging.INFO
    logging.basicConfig(level=level)

    fix_size = not args.variable_size
    logging.info(f'fix_size: {fix_size}')

    if args.save_path is not None:
        save_path = pathlib.Path(args.save_path)
        assert save_path.suffix == '.json', save_path.suffix
    else:
        save_path = None

    results = []

    for image_path in find_images(args.images):
        image = cv2.imread(str(image_path))
        if image is None:
            logging.warning(f'warning! failed to read image from {image_path}; skipping!')
            continue

        logging.info(f'processing {image_path}')

        if fix_size:
            image = fix_image_size(image)
        else:
            logging.warning('not normalizing image size for consistent scoring!')

        blur_map, score, blurry = estimate_blur(image, threshold=args.threshold)

        logging.info(f'image_path: {image_path} score: {score} blurry: {blurry}')
        results.append({'input_path': str(image_path), 'score': score, 'blurry': blurry})

        if args.display:
            cv2.imshow('input', image)
            cv2.imshow('result', pretty_blur_map(blur_map))

            if cv2.waitKey(0) == ord('q'):
                logging.info('exiting...')
                sys.exit()

    if save_path is not None:
        logging.info(f'saving json to {save_path}')

        with open(save_path, 'w') as result_file:
            data = {'images': args.images, 'threshold': args.threshold, 'fix_size': fix_size, 'results': results}
            json.dump(data, result_file, indent=4)

3. Sorting out overly dark images

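First, consider the mean gray level and the gray-level variance of an image. The mean mainly reflects the image's base brightness, while the variance reflects how much high-frequency content it has. If the mean is too high or too low, the image is probably overexposed or underexposed. If an image looks very uniform, its variance is small; conversely, if it looks very vivid, its variance is large.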

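Take a computer screen as an example: turning up the overall brightness raises the mean while the variance stays the same; changing the contrast changes the variance considerably while the mean stays roughly the same.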
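The screen analogy can be checked with a few numbers. In this sketch, brightness is modelled as adding a constant and contrast as scaling each pixel's deviation from the mean; both models, the helper names, and the sample values are assumptions for illustration, and clipping to the 0-255 range is ignored.

```python
import numpy as np


def adjust_brightness(pixels: np.ndarray, offset: float) -> np.ndarray:
    """Add a constant to every pixel (no clipping, to keep the arithmetic visible)."""
    return pixels + offset


def adjust_contrast(pixels: np.ndarray, gain: float) -> np.ndarray:
    """Scale every pixel's deviation from the mean by `gain`."""
    mean = pixels.mean()
    return (pixels - mean) * gain + mean


pixels = np.array([100.0, 120.0, 140.0, 160.0])
for name, values in [("original", pixels),
                     ("brighter", adjust_brightness(pixels, 30.0)),
                     ("more contrast", adjust_contrast(pixels, 1.5))]:
    print(name, float(values.mean()), float(values.var()))
# original 130.0 500.0
# brighter 160.0 500.0
# more contrast 130.0 1125.0
```

Adding 30 moves the mean from 130 to 160 and leaves the variance at 500; a gain of 1.5 keeps the mean at 130 and multiplies the variance by 1.5 squared, giving 1125.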

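Another effective way to tell bright images from dark ones is a histogram: count the pixels that fall in the low-value and high-value regions of the histogram, and use their percentages to decide whether the image was overexposed or underexposed.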
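A histogram-based check along these lines could be sketched as follows. The function name `classify_exposure` and the cut-offs (`low=50`, `high=205`, `fraction=0.7`) are placeholder values chosen for this example, not numbers from the original post; they would need tuning for a real data set.

```python
import numpy as np


def classify_exposure(gray: np.ndarray, low: int = 50, high: int = 205,
                      fraction: float = 0.7) -> str:
    """Label an 8-bit grayscale image as 'underexposed', 'overexposed', or 'normal'."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    total = max(gray.size, 1)
    dark_share = hist[:low].sum() / total          # share of pixels with value < low
    bright_share = hist[high + 1:].sum() / total   # share of pixels with value > high
    if dark_share >= fraction:
        return "underexposed"
    if bright_share >= fraction:
        return "overexposed"
    return "normal"


# Synthetic single-tone images standing in for real photos.
print(classify_exposure(np.full((8, 8), 20, dtype=np.uint8)))   # underexposed
print(classify_exposure(np.full((8, 8), 240, dtype=np.uint8)))  # overexposed
print(classify_exposure(np.full((8, 8), 128, dtype=np.uint8)))  # normal
```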