Extracting video key frames and their timestamps

天地一扁舟

Category: Video processing. Tags: ffmpeg, video processing, video key frame extraction, key frame timestamp recording

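The original article is at: http://www.videoproductionslondon.com/blog/scene-change-detection-during-encoding-key-frame-extraction-code

It mainly explains how to extract a video's key frames together with the timestamp of each key frame. With these you can build the effect seen on the major video sites: clicking the progress bar below the video shows a key frame image from around that point in time.

The original author did impressive work, and I genuinely admire it.

Code excerpt: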

 
ffmpeg -vf select="eq(pict_type\,PICT_TYPE_I)" -i yourvideo.mp4 -vsync 2 -s 73x41 -f image2 thumbnails-%02d.jpeg \
-loglevel debug 2>&1 | grep "pict_type:I -> select:1" | cut -d " " -f 6 - > keyframe-timecodes.txt

[select @ 0000000001A88BE0] n:0 pts:0 t:0.000000 pos:1953 interlace_type:P key:0 pict_type:I -> select:1.000000
[select @ 0000000001A88BE0] n:1 pts:40000 t:0.040000 pos:4202 interlace_type:P key:0 pict_type:P -> select:0.000000

t:0.000000
t:1.360000

<a onclick="jwplayer().seek(0); return false" href="#"><img src="thumbnails-01.jpeg"></a>
<a onclick="jwplayer().seek(1.36); return false" href="#"><img src="thumbnails-02.jpeg"></a>

Scene change detection during encoding and key frame extraction code

Posted by: Alex on January 29th, 2012

Tags:
- video production london
- ffmpeg

In a previous post we explained how to generate thumbnails for each scene change using Edit Decision Lists from your non-linear editor of choice. Sometimes, though, you won't have an EDL along with your video assets (for instance, when you are using off-airs or videos edited by somebody else). In those situations we can still create a thumbnail for each scene change thanks to how frames are structured in digital compression.

According to Wikipedia, a GOP structure specifies the order in which intra- and inter-frames are arranged. A Group Of Pictures can contain the following frame types:

I-frame (intra coded picture) - reference picture, which represents a full image and which is independent of other picture types. Each GOP begins with this type of picture.
P-frame (predictive coded picture) - contains motion-compensated difference information from the preceding I- or P-frame.
B-frame (bidirectionally predictive coded picture) - contains difference information from the preceding and following I- or P-frame within a GOP.

The GOP structure is often referred to by two numbers, for example M=3, N=12, which corresponds to the pattern IBBPBBPBBPBBI. The first number gives the distance between two anchor frames (I or P), and the second gives the distance between two I-frames (the GOP length).
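As a small illustration of the M/N notation, the sketch below builds the display-order frame pattern for one GOP and appends the I-frame that opens the next one. The function name gop_pattern is made up for this example, and it assumes a fixed (non-adaptive) GOP in which every M-th frame is an anchor frame.

```python
def gop_pattern(m, n):
    """Return the frame types of one GOP plus the I-frame that starts the next.

    m: distance between two anchor frames (I or P)
    n: distance between two I-frames (GOP length)
    """
    frames = []
    for position in range(n):
        if position == 0:
            frames.append("I")   # every GOP begins with an I-frame
        elif position % m == 0:
            frames.append("P")   # anchor frame every m frames
        else:
            frames.append("B")   # frames between anchors
    return "".join(frames) + "I"


print(gop_pattern(3, 12))  # IBBPBBPBBPBBI
```

With M=1 there are no B-frames at all, so gop_pattern(1, 4) gives IPPPI.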

To use key frames as a scene change detection method, we need to encode the video with a flexible GOP structure, setting minimum (min-keyint) and maximum (keyint) key frame intervals.

For example, a minimum setting equal to the frame rate of the video prevents the encoded video from having two consecutive key frames within a second of each other. Note that if your video has scenes shorter than a second, you won't be able to detect all scene changes unless you reduce this setting.

Similarly, a maximum setting ensures that a key frame is inserted at least once every X frames. A recommended setting is 10 times the frame rate, which equates to at most 10 seconds of video between key frames. We can set this to infinite so that non-scenecut key frames are never inserted, although this might cause problems when seeking (if you skip to a part of the video without a key frame, there won't be any video until the next key frame is reached).

In addition, we need to define the threshold for scenecut detection. The encoder calculates a metric for every frame to estimate how different it is from the previous frame. If the value is lower than the threshold, a scenecut is detected.
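The rules of thumb above can be turned into numbers with a few lines of Python. This is only a sketch: the helper name keyframe_settings is hypothetical, the option names keyint, min-keyint and scenecut assume an x264-style encoder, the key=value:key=value layout is one common way such options are passed, and the scenecut value of 40 is just a placeholder threshold to tune for your material.

```python
def keyframe_settings(fps, max_seconds=10, scenecut=40):
    """Derive key frame interval settings from the frame rate.

    fps:         frame rate of the video, in whole frames per second
    max_seconds: longest allowed gap between key frames, in seconds
    scenecut:    scene change threshold (placeholder value, tune as needed)
    """
    min_keyint = fps              # at most one key frame per second
    keyint = fps * max_seconds    # at least one key frame every max_seconds
    return "keyint=%d:min-keyint=%d:scenecut=%d" % (keyint, min_keyint, scenecut)


print(keyframe_settings(25))  # keyint=250:min-keyint=25:scenecut=40
```

For material with scenes shorter than a second, min_keyint would have to be lowered by hand, as noted above.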

Once we have encoded the video, we can run ffmpeg (I used the 0.8-win64-static build) with the following parameters:

ffmpeg -vf select="eq(pict_type\,PICT_TYPE_I)" -i yourvideo.mp4 -vsync 2 -s 73x41 -f image2 thumbnails-%02d.jpeg \
-loglevel debug 2>&1 | grep "pict_type:I -> select:1" | cut -d " " -f 6 - > keyframe-timecodes.txt

What follows -vf in an ffmpeg command line is a filtergraph description. The select filter selects which frames to pass to the output. The expression compares the variable “pict_type” with the constant “PICT_TYPE_I”, so only I-frames (key frames) are passed to the output.

-vsync 2 prevents ffmpeg from generating more than one copy of each key frame.

-f image2 writes video frames to image files. The output filenames are specified by a pattern, which can be used to produce sequentially numbered series of files. The pattern may contain the string "%d" or "%0Nd".
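To see which file names a pattern such as thumbnails-%02d.jpeg produces, Python's printf-style % operator can stand in for ffmpeg's own pattern expansion. This is only an approximation of what ffmpeg does: the helper thumbnail_names is hypothetical, and numbering is assumed to start at 1, as the thumbnails-01.jpeg name used later in this post suggests.

```python
def thumbnail_names(pattern, count, start=1):
    """Expand a printf-style pattern such as "thumbnails-%02d.jpeg".

    %02d pads the number with zeros to at least two digits; larger
    numbers simply use more digits.
    """
    return [pattern % number for number in range(start, start + count)]


print(thumbnail_names("thumbnails-%02d.jpeg", 3))
```

If you expect more than 99 key frames, a wider pattern such as %03d keeps the files in order when they are sorted by name.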

With only -loglevel debug 2> keyframe-timecodes.txt, the file receives the raw debug log, with lines such as:

[select @ 0000000001A88BE0] n:0 pts:0 t:0.000000 pos:1953 interlace_type:P key:0 pict_type:I -> select:1.000000
[select @ 0000000001A88BE0] n:1 pts:40000 t:0.040000 pos:4202 interlace_type:P key:0 pict_type:P -> select:0.000000

I use “2>&1 | grep "pict_type:I -> select:1" | cut -d " " -f 6 -” to output something more readable, keeping only the timestamp field of each selected key frame:

t:0.000000
t:1.360000
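If grep and cut are not available (on a plain Windows command prompt, for instance), the same filtering can be sketched in Python. This assumes the debug lines look exactly like the two samples quoted above; other ffmpeg versions may format them differently. The names SELECTED_I_FRAME and keyframe_times are made up for this example.

```python
import re

# Matches a select-filter debug line for an I-frame that was passed through,
# capturing the value of the "t:" field (presentation time in seconds).
SELECTED_I_FRAME = re.compile(r"\bt:(\d+(?:\.\d+)?)\s.*pict_type:I -> select:1")


def keyframe_times(log_lines):
    """Return the timestamps (in seconds) of the selected I-frames."""
    times = []
    for line in log_lines:
        match = SELECTED_I_FRAME.search(line)
        if match:
            times.append(float(match.group(1)))
    return times


# The two sample lines quoted in the article.
SAMPLE_LOG = [
    "[select @ 0000000001A88BE0] n:0 pts:0 t:0.000000 pos:1953 "
    "interlace_type:P key:0 pict_type:I -> select:1.000000",
    "[select @ 0000000001A88BE0] n:1 pts:40000 t:0.040000 pos:4202 "
    "interlace_type:P key:0 pict_type:P -> select:0.000000",
]

print(keyframe_times(SAMPLE_LOG))  # only the first line is a selected I-frame
```

Unlike the cut version, this returns numbers rather than "t:..." strings, which is convenient if the values are processed further.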

Finally, I can convert “keyframe-timecodes.txt” into a chapter navigation list and use the thumbnails to navigate the video:

<a onclick="jwplayer().seek(0); return false" href="#"><img src="thumbnails-01.jpeg"></a>
<a onclick="jwplayer().seek(1.36); return false" href="#"><img src="thumbnails-02.jpeg"></a>
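The conversion from “keyframe-timecodes.txt” to these links can be automated. The following is a minimal sketch, not the script used for this post: the function name chapter_links is hypothetical, and it assumes that the n-th line of the timecode file belongs to the n-th thumbnail and that thumbnails are numbered from 1.

```python
def chapter_links(timecode_lines, name_pattern="thumbnails-%02d.jpeg"):
    """Build one <a><img></a> line per "t:<seconds>" entry."""
    links = []
    for line in timecode_lines:
        line = line.strip()
        if not line.startswith("t:"):
            continue  # skip blank or unexpected lines
        seconds_text = line[2:]
        float(seconds_text)  # raises ValueError if the value is not a number
        if "." in seconds_text:
            # "1.360000" -> "1.36", "0.000000" -> "0"
            seconds_text = seconds_text.rstrip("0").rstrip(".")
        thumbnail = name_pattern % (len(links) + 1)  # files are numbered from 1
        links.append(
            '<a onclick="jwplayer().seek(%s); return false" href="#">'
            '<img src="%s"></a>' % (seconds_text, thumbnail)
        )
    return links


for link in chapter_links(["t:0.000000", "t:1.360000"]):
    print(link)
```

In practice the input would come from the file itself, for example chapter_links(open("keyframe-timecodes.txt")).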