Learning FFmpeg (version 4.2.2): playing video with OpenCV (version 4.2.0)

清风弥天

Categories: ffmpeg, opencv

Article link: https://blog.csdn.net/d137578736/article/details/104424859

FFmpeg resources and basic development environment setup

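The FFmpeg version studied here is 4.2.2.
1. Main tutorial this post follows: https://www.cnblogs.com/leisure_chn/category/1351812.html
2. FFmpeg SDK download: https://ffmpeg.zeranoe.com/builds/
3. FFmpeg official site: http://ffmpeg.org/
4. Introduction to basic audio/video terminology: https://www.jianshu.com/p/e9e6484b1f89
5. FFmpeg environment setup: https://blog.csdn.net/qq_41051855/article/details/78665620 https://blog.csdn.net/yao_hou/article/details/80553660
6. Adding the ffmpeg command-line tools to the PATH environment variable: https://jingyan.baidu.com/article/a3a3f81124c5e08da2eb8a29.html
7. Introductory tutorial for the ffmpeg command line: http://www.ruanyifeng.com/blog/2020/01/ffmpeg.html
8. Chinese-language tutorial by Lei Xiaohua (雷神): https://blog.csdn.net/leixiaohua1020/article/details/15811977
9. Tutorial series by another blogger: https://blog.csdn.net/yao_hou/category_9275800.html
10. Official-site tutorial: http://ffmpeg.tv/
11. Linking problems when using the FFmpeg SDK libraries: https://blog.csdn.net/huyu107/article/details/51980029

Put the DLL files into the directory of the program that uses them.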

12. Lei Xiaohua's summary of how to study audio and video: https://blog.csdn.net/leixiaohua1020/article/details/18893769

13. Summary and notes on the data structures and functions in the FFmpeg SDK:

AVFormatContext

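Describes the composition and basic information of a media file or media stream (it carries many bitstream parameters and is declared in avformat.h). It is the container-format context and the top-level structure that ties everything together; it holds the information related to the file's container format.

When developing with FFmpeg, AVFormatContext is used from start to finish, and many functions take it as a parameter. It is the structure behind FFmpeg's demuxing of containers such as FLV, MP4, RMVB, and AVI. The main fields are described below (for the decoding case):

Main fields:

struct AVInputFormat *iformat: the container format of the input

AVIOContext *pb: the I/O context that buffers the input data

unsigned int nb_streams: the number of audio/video streams (the number of AVStream entries in the input)

AVStream **streams: the audio/video streams (the array of AVStream pointers of the input)

char filename[1024]: the file name (deprecated in FFmpeg 4.x in favor of char *url)

int64_t duration: the duration of the input in AV_TIME_BASE units, that is, microseconds

int64_t bit_rate: the total bit rate of the input, in bit/s

AVDictionary *metadata: metadata

Reference: https://blog.csdn.net/leixiaohua1020/article/details/14214705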
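
As a small illustration of the duration field: it counts AV_TIME_BASE units (1,000,000 per second in FFmpeg), so turning it into a readable time is plain integer arithmetic. The sketch below deliberately does not link against FFmpeg. `kTimeBase` and `format_duration` are hypothetical names made up for this example, and the argument stands in for a value such as `p_fmt_ctx->duration`.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Stand-in for FFmpeg's AV_TIME_BASE (ticks per second); hypothetical name.
constexpr int64_t kTimeBase = 1000000;

// Formats a duration given in microseconds as "HH:MM:SS.mmm".
// A negative value (FFmpeg marks an unknown duration with a negative
// sentinel, AV_NOPTS_VALUE) is reported as "N/A".
std::string format_duration(int64_t duration_us)
{
    if (duration_us < 0)
        return "N/A";
    const int64_t total_ms = duration_us / (kTimeBase / 1000);
    const int64_t ms = total_ms % 1000;
    const int64_t total_s = total_ms / 1000;
    const int64_t s = total_s % 60;
    const int64_t m = (total_s / 60) % 60;
    const int64_t h = total_s / 3600;
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%02lld:%02lld:%02lld.%03lld",
                  static_cast<long long>(h), static_cast<long long>(m),
                  static_cast<long long>(s), static_cast<long long>(ms));
    return std::string(buf);
}
```

In a real program the call would look like `format_duration(p_fmt_ctx->duration)`, made after avformat_find_stream_info() has run.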

AVInputFormat

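There is one such structure per container format (for example FLV, MKV, MP4, AVI).

long_name: the long, descriptive name of the container format

extensions: the file extensions associated with the container format

raw_codec_id: for raw demuxers, the codec ID of the data they carry (AVInputFormat has no general format-ID field)

A set of function pointers that implement the handling of the container format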

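AVCodecContext

The codec context: it describes the state of an encoder or decoder and holds the parameters the codec needs (declared in avcodec.h). It stores the information related to video (audio) encoding and decoding. Many fields of AVCodecContext are used only when encoding, not when decoding.

Fields:

codec: the AVCodec used by this context

width, height: image width and height (video only)

pix_fmt: pixel format (video only)

sample_rate: sample rate (audio only)

channels: number of channels (audio only)

sample_fmt: sample format (audio only)

Reference: https://blog.csdn.net/leixiaohua1020/article/details/14214859

The functions that allocate and free an AVCodecContext are:

AVCodecContext *avcodec_alloc_context3(const AVCodec *codec)

Allocates a codec context and sets its fields to default values

void avcodec_free_context(AVCodecContext **avctx)

Frees the codec context and everything associated with it

Reference: https://blog.csdn.net/king1425/article/details/70622258 (a brief look at the FFmpeg source, part 4: avcodec_find_encoder(), avcodec_open2(), avcodec_close())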

AVStream

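Describes one media stream (the structure that stores video/audio stream information, declared in avformat.h). Each video (audio) stream in a file has one such structure.

Main fields:

AVCodecContext *codec        // deprecated; use the codecpar structure instead

AVRational time_base         // the time base: the unit, in seconds, in which this stream's timestamps are counted

int64_t duration             // total duration of the stream, in time_base units; this value is not reliable

AVRational avg_frame_rate    // average frame rate

AVCodecParameters *codecpar  // the audio/video parameters of the stream. Important: the width, height, sample rate, codec ID, and similar values are read from here.

codecpar carries most of the information the decoder needs. The call below copies it from the AVCodecParameters into an AVCodecContext:

avcodec_parameters_to_context(codec_ctx, stream->codecpar);

In AVStream, AVCodecParameters replaces the former AVCodecContext member.

References: https://blog.csdn.net/luotuo44/article/details/54981809

https://blog.51cto.com/fengyuzaitu/2059121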
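
To make time_base and avg_frame_rate concrete: both are rationals (numerator/denominator). A timestamp is an integer count of time_base units, so seconds = pts * num / den, and the frame rate is simply num / den. FFmpeg itself offers av_q2d() for the rational-to-double step. The sketch below is standalone: `Rational`, `to_double`, `pts_to_seconds`, and `frames_per_second` are hypothetical names that only mirror the idea and are not FFmpeg API.

```cpp
#include <cstdint>

// Hypothetical stand-in for FFmpeg's AVRational (numerator / denominator).
struct Rational {
    int num;
    int den;
};

// Converts a rational to double, the same idea as FFmpeg's av_q2d().
// Precondition: r.den != 0.
double to_double(Rational r)
{
    return static_cast<double>(r.num) / r.den;
}

// Converts a timestamp counted in time_base units into seconds.
// Precondition: time_base.den != 0.
double pts_to_seconds(int64_t pts, Rational time_base)
{
    return static_cast<double>(pts) * time_base.num / time_base.den;
}

// Frames per second from a stream's avg_frame_rate.
// Returns 0.0 when the rate is unknown (denominator of 0).
double frames_per_second(Rational avg_frame_rate)
{
    if (avg_frame_rate.den == 0)
        return 0.0;
    return to_double(avg_frame_rate);
}
```

For example, with the 1/90000 time base common in MPEG-TS, a pts of 180000 corresponds to 2 seconds, and an avg_frame_rate of 30000/1001 is the familiar 29.97 fps.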

AVCodecParameters

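enum AVMediaType codec_type;   // media type: tells whether this stream is audio or video

enum AVCodecID codec_id;       // codec ID: the coding format of the stream, e.g. H.264, MPEG-4, MJPEG

uint32_t codec_tag;            // rarely needed

int format;                    // for video, the pixel format (YUV420, YUV422, ...); for audio, the sample format

int width; int height;         // image width and height (video only)

uint64_t channel_layout;       // audio only; the default value is usually fine

int channels;                  // number of channels (audio only)

int sample_rate;               // sample rate (audio only)

int frame_size;                // size of one audio frame (audio only)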

AVCodec

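There is one such structure per video (audio) codec, for example the H.264 decoder.

name: short name of the codec

long_name: long, descriptive name of the codec

type: media type the codec handles (video, audio, ...)

id: codec ID

A set of function pointers that implement encoding and decoding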

AVFrame

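Stores one frame of decoded pixel (or sample) data.

data: the decoded image pixel data (or audio sample data), one pointer per plane.

linesize: for video, the size in bytes of one row of each plane, which may include padding beyond the visible width; for audio, the size in bytes of each plane's buffer.

width, height: image width and height (video only).

key_frame: whether the frame is a key frame (video only).

pict_type: the picture type (video only), for example I, P, or B.

Reference: https://www.cnblogs.com/leisure_chn/p/10404502.html

How AVFrame memory is allocated: https://blog.csdn.net/xionglifei2014/article/details/90693048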
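
One practical consequence of linesize: a row in memory can be longer than the visible pixels because of alignment padding, so a plane cannot always be copied with a single memcpy of width * height bytes. The standalone sketch below copies row by row into a tightly packed buffer. `copy_plane_packed` is a hypothetical helper, and its `src` and `linesize` arguments play the role of `AVFrame::data[i]` and `AVFrame::linesize[i]`.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copies one image plane whose rows start `linesize` bytes apart into a
// tightly packed buffer with exactly `row_bytes` bytes per row, dropping
// any padding at the end of each source row.
std::vector<uint8_t> copy_plane_packed(const uint8_t* src, int linesize,
                                       int row_bytes, int height)
{
    std::vector<uint8_t> dst(static_cast<std::size_t>(row_bytes) * height);
    for (int y = 0; y < height; ++y) {
        // Row y of the source begins y * linesize bytes after the plane start.
        std::memcpy(dst.data() + static_cast<std::size_t>(y) * row_bytes,
                    src + static_cast<std::ptrdiff_t>(y) * linesize,
                    static_cast<std::size_t>(row_bytes));
    }
    return dst;
}
```

For a packed BGRA frame, row_bytes would be width * 4. When a buffer is laid out with an alignment of 1, linesize equals row_bytes and the copy degenerates to one contiguous block.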

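Commonly used API functions

1. avformat_open_input

int avformat_open_input(AVFormatContext **ic_ptr, const char *filename, AVInputFormat *fmt, AVDictionary **options);

Purpose: opens a file or URL and initializes the underlying byte-stream input module; parses the header of the media file or stream, creates an AVFormatContext and fills in its key fields, and creates an AVStream for each elementary stream.

Parameters:

ic_ptr: receives the AVFormatContext that avformat_open_input constructs internally (pass a pointer to a NULL pointer to let the function allocate it).

filename: the file name or URL to open.

fmt: explicitly specifies the input format; if NULL, the format is detected automatically.

options: additional options passed in.

Note: by parsing the header and other auxiliary data, this function can collect a lot of information about the file, its streams, and their codecs. However, the information any container provides is limited, different authoring tools fill in the header differently, and those tools sometimes introduce errors. There is therefore no guarantee that everything needed is available after this call, and that is when the next function, avformat_find_stream_info, comes in.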

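2. avformat_find_stream_info

int avformat_find_stream_info(AVFormatContext *ic, AVDictionary **options);

Purpose: obtains the codec parameters that are still missing. What is needed for each media stream is its media type and codec ID, two enumerations defined in libavutil/avutil.h and libavcodec/avcodec.h:

enum AVMediaType {
    AVMEDIA_TYPE_UNKNOWN = -1,
    AVMEDIA_TYPE_VIDEO,
    AVMEDIA_TYPE_AUDIO,
    AVMEDIA_TYPE_DATA,
    AVMEDIA_TYPE_SUBTITLE,
    AVMEDIA_TYPE_ATTACHMENT,
    AVMEDIA_TYPE_NB
};

enum AVCodecID {
    AV_CODEC_ID_NONE,
    AV_CODEC_ID_MPEG1VIDEO,
    AV_CODEC_ID_MPEG2VIDEO,
    AV_CODEC_ID_H261,
    AV_CODEC_ID_H263,
    ...
    AV_CODEC_ID_H264,
    ...
};

If the stream carries complete header information, the media type and codec ID are already available after avformat_open_input; otherwise they have to come from avformat_find_stream_info. In addition, values such as the time base, sample rate, channel count, sample bit depth, and frame length for audio, and the image size and color space for video, may also have to be obtained through avformat_find_stream_info.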

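3. av_read_frame

int av_read_frame(AVFormatContext *s, AVPacket *pkt);

Purpose: reads media data from the file or stream and stores it in the AVPacket pkt. For audio, pkt holds an integer number of frames if every frame has a fixed size, and exactly one frame if frames have a variable size. For video, pkt holds one frame. Note: before calling this function again, release the resources held by pkt with av_packet_unref (older tutorials use av_free_packet, which is deprecated).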

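4. av_seek_frame

int av_seek_frame(AVFormatContext *s, int stream_index, int64_t timestamp, int flags);

Purpose: provides random access to a media file by moving its read position; typically used for a player's fast-forward and rewind.

Parameters:

s: the AVFormatContext pointer obtained from avformat_open_input.

stream_index: the media stream to seek in.

timestamp: the target timestamp, in units of that stream's time_base (in AV_TIME_BASE units when stream_index is -1).

flags: the seek mode, for example AVSEEK_FLAG_BACKWARD.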
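
Because the timestamp passed to av_seek_frame is counted in the selected stream's time_base rather than in seconds, a position in seconds has to be converted first: ts = seconds * den / num. Real code would normally use FFmpeg's av_rescale_q() for this. The standalone sketch below only shows the arithmetic; `TimeBase` and `seconds_to_timestamp` are hypothetical names and not part of FFmpeg.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for FFmpeg's AVRational (numerator / denominator).
struct TimeBase {
    int num;
    int den;
};

// Converts a position in seconds into a timestamp counted in `tb` units,
// rounded to the nearest tick: ts = seconds * den / num.
// Precondition: tb.num != 0.
int64_t seconds_to_timestamp(double seconds, TimeBase tb)
{
    return static_cast<int64_t>(std::llround(seconds * tb.den / tb.num));
}
```

With a 1/90000 time base, seeking to the 2-second mark means passing 180000 as the timestamp.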

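5. avformat_close_input

void avformat_close_input(AVFormatContext **s);

Purpose: closes the media file, releases its resources, closes the underlying I/O, and sets *s to NULL. It replaces av_close_input_file, which older tutorials use and which no longer exists in FFmpeg 4.2.2.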

6. avcodec_find_decoder

AVCodec *avcodec_find_decoder(enum AVCodecID id);

AVCodec *avcodec_find_decoder_by_name(const char *name);

Purpose: looks up a decoder by codec ID or by decoder name and returns the matching AVCodec, or NULL if none is found.

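7. avcodec_open2

int avcodec_open2(AVCodecContext *avctx, const AVCodec *codec, AVDictionary **options);

Purpose: initializes the AVCodecContext to use the given AVCodec. Before calling it, allocate the AVCodecContext with avcodec_alloc_context3 and copy the stream's parameters into it with avcodec_parameters_to_context. Older tutorials call avcodec_open and take the AVCodecContext directly from the stream; both approaches are obsolete in FFmpeg 4.2.2.

The AVCodec itself is obtained with avcodec_find_decoder.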

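8. avcodec_decode_video2

int avcodec_decode_video2(AVCodecContext *avctx, AVFrame *picture, int *got_picture_ptr, AVPacket *avpkt);

Purpose: decodes a video frame. This function is deprecated in FFmpeg 4.2.2; new code should use avcodec_send_packet and avcodec_receive_frame, as the player code in this post does.

Parameters:

avctx: the decoder context.

picture: the output frame.

got_picture_ptr: set to nonzero if a decoded frame was produced.

avpkt: the input packet.

9. avcodec_decode_audio4

int avcodec_decode_audio4(AVCodecContext *avctx, AVFrame *frame, int *got_frame_ptr, AVPacket *avpkt);

Purpose: decodes an audio frame. The input is in the AVPacket, the output goes to frame, and got_frame_ptr tells whether any data was produced. This function is also deprecated in favor of avcodec_send_packet and avcodec_receive_frame.

Parameters:

avctx: the decoder context.

frame: the output frame.

got_frame_ptr: set to nonzero if a decoded frame was produced.

avpkt: the input packet.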

10. avcodec_close

int avcodec_close(AVCodecContext *avctx);

Purpose: closes the codec and frees the resources allocated by avcodec_open2.

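Problems encountered in FFmpeg development (1): avpicture_fill and memory management of AVFrame::data

https://blog.csdn.net/maybeall/article/details/78921144

https://blog.csdn.net/u013539952/article/details/80002434 cv::Mat constructors

https://blog.csdn.net/slj2017/article/details/80819753 Understanding the rows and columns of a cv::Mat

The code that decodes a video and plays it with OpenCV is as follows: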

/*****************************************************************
 * A simple ffmpeg player.
 *
 * reference:
 *   1. https://blog.csdn.net/leixiaohua1020/article/details/38868499
 *   2. http://dranger.com/ffmpeg/ffmpegtutorial_all.html#tutorial01.html
 *   3. http://dranger.com/ffmpeg/ffmpegtutorial_all.html#tutorial02.html
 ******************************************************************/

#if 1
#include <stdio.h>
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
// FFmpeg is a C library, so its headers need C linkage when included from C++.
// Forward slashes in include paths work on Windows as well as on other platforms.
extern "C"
{
#include <stdlib.h>
#include <string.h>
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libswscale/swscale.h>
#include <libavutil/pixfmt.h>
#include <libavutil/imgutils.h>
}

int main(int argc, char *argv[])
{
	// Initializing these to NULL prevents segfaults!
	// Container-format context: the top-level structure describing the media file or stream
	AVFormatContext*    p_fmt_ctx = NULL;
	// Codec context: holds the parameters and state needed to decode the video (audio) stream
	AVCodecContext*     p_codec_ctx = NULL;

	// Stream parameters, used to fill in the AVCodecContext
	AVCodecParameters*  p_codec_par = NULL;

	// One AVCodec exists per video (audio) codec, e.g. the H.264 decoder
	AVCodec*            p_codec = NULL;
	AVFrame*            p_frm_raw = NULL;        // frame: raw frame decoded from a packet
	AVFrame*            p_frm_rgb = NULL;        // frame: the raw frame after color conversion
	AVPacket*           p_packet = NULL;         // packet: a chunk of data read from the stream

	// Used for image conversion, e.g. pixel format conversion
	struct SwsContext*  sws_ctx = NULL;
	int                 buf_size;
	uint8_t*            buffer = NULL;
	int                 width;                   // output width (divide by 4 to downscale)
	int                 height;                  // output height (divide by 4 to downscale)
	unsigned int        i;
	int                 v_idx;
	int                 ret;
	int                 rc = -1;                 // exit code; set to 0 once playback finishes

	// av_register_all() registered all muxers/demuxers in older versions.
	// It is deprecated in FFmpeg 4.x, so it is simply no longer called.
	// av_register_all();

	// A1. Open the video file: read the header and store the format information in p_fmt_ctx
	ret = avformat_open_input(&p_fmt_ctx, "test.mp4", NULL, NULL);
	if (ret != 0)
	{
		printf("avformat_open_input() failed\n");
		return -1;
	}

	// A2. Find stream information: read and probe some data, then fill in p_fmt_ctx->streams
	//     p_fmt_ctx->streams is an array of pointers with p_fmt_ctx->nb_streams entries
	ret = avformat_find_stream_info(p_fmt_ctx, NULL);
	if (ret < 0)
	{
		printf("avformat_find_stream_info() failed\n");
		goto FreeAssets;
	}

	// Print information about the file to stderr.
	// av_dump_format() is a debugging aid that shows what p_fmt_ctx->streams contains;
	// it is useful here because avformat_find_stream_info() has already filled the streams in.
	av_dump_format(p_fmt_ctx, 0, "test.mp4", 0);

	// A3. Find the first video stream
	v_idx = -1;
	for (i = 0; i < p_fmt_ctx->nb_streams; i++)
	{
		if (p_fmt_ctx->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_VIDEO)
		{
			v_idx = (int)i;
			printf("Find a video stream, index %d\n", v_idx);
			break;
		}
	}
	if (v_idx == -1)
	{
		printf("Can't find a video stream\n");
		goto FreeAssets;
	}

	// A5. Build the decoder context (AVCodecContext) for the video stream

	// A5.1 Get the stream's codec parameters (AVCodecParameters)
	p_codec_par = p_fmt_ctx->streams[v_idx]->codecpar;
	// A5.2 Find the decoder
	p_codec = avcodec_find_decoder(p_codec_par->codec_id);
	if (p_codec == NULL)
	{
		printf("Can't find codec!\n");
		goto FreeAssets;
	}
	// A5.3 Build the AVCodecContext
	// A5.3.1 Allocate the structure and set its members to the defaults for p_codec
	p_codec_ctx = avcodec_alloc_context3(p_codec);
	if (p_codec_ctx == NULL)
	{
		printf("avcodec_alloc_context3() failed\n");
		goto FreeAssets;
	}

	// A5.3.2 Copy the stream parameters: p_codec_par ==> p_codec_ctx
	ret = avcodec_parameters_to_context(p_codec_ctx, p_codec_par);
	if (ret < 0)
	{
		printf("avcodec_parameters_to_context() failed %d\n", ret);
		goto FreeAssets;
	}

	// A5.3.3 Open the decoder; initialization of p_codec_ctx is now complete.
	// Both steps are needed before decoding: avcodec_parameters_to_context, then avcodec_open2
	ret = avcodec_open2(p_codec_ctx, p_codec, NULL);
	if (ret < 0)
	{
		printf("avcodec_open2() failed %d\n", ret);
		goto FreeAssets;
	}

	// A6. Allocate the AVFrames
	// A6.1 Allocate the AVFrame structures; this does not allocate the data buffers (AVFrame.data[])
	p_frm_raw = av_frame_alloc();
	p_frm_rgb = av_frame_alloc();
	if (!p_frm_raw || !p_frm_rgb)
	{
		goto FreeAssets;
	}

	// sws_getContext() fails when p_codec_ctx->pix_fmt is AV_PIX_FMT_NONE, so bail out early
	if (p_codec_ctx->pix_fmt == AV_PIX_FMT_NONE)
	{
		goto FreeAssets;
	}

	width = p_codec_ctx->width;    // use p_codec_ctx->width / 4 to downscale
	height = p_codec_ctx->height;  // use p_codec_ctx->height / 4 to downscale

	// A6.2 Manually allocate the buffer behind p_frm_rgb->data[], the destination of sws_scale()
	//     p_frm_raw's data buffer is allocated by the decoder in avcodec_receive_frame(),
	//     so it needs no manual allocation; nothing allocates p_frm_rgb's buffer, so do it here
	buf_size = av_image_get_buffer_size(AV_PIX_FMT_BGRA,
		width,
		height,
		1
	);
	if (buf_size < 0)
	{
		goto FreeAssets;
	}
	// buffer is the pixel data buffer of p_frm_rgb
	buffer = (uint8_t *)av_malloc(buf_size);
	if (buffer == NULL)
	{
		goto FreeAssets;
	}
	// Point p_frm_rgb->data and p_frm_rgb->linesize at buffer using the given layout
	av_image_fill_arrays(p_frm_rgb->data,           // dst data[]
		p_frm_rgb->linesize,       // dst linesize[]
		buffer,                    // src buffer
		AV_PIX_FMT_BGRA,           // pixel format
		width,                     // width
		height,                    // height
		1                          // align
	);


	// A7. Initialize the SWS context used for the image conversion below
	//     The destination format (6th argument) is AV_PIX_FMT_BGRA: packed B, G, R, A bytes,
	//     which matches a 4-channel CV_8UC4 cv::Mat, so the converted buffer can be
	//     wrapped in a cv::Mat and shown without another copy
	sws_ctx = sws_getContext(p_codec_ctx->width,    // src width
		p_codec_ctx->height,   // src height
		p_codec_ctx->pix_fmt,  // src format
		width,                 // dst width
		height,                // dst height
		AV_PIX_FMT_BGRA,       // dst format
		SWS_BICUBIC,           // flags
		NULL,                  // src filter
		NULL,                  // dst filter
		NULL                   // param
	);
	if (sws_ctx == NULL)
	{
		printf("sws_getContext() failed\n");
		goto FreeAssets;
	}

	// av_packet_alloc() returns a properly initialized packet, unlike a bare av_malloc()
	p_packet = av_packet_alloc();
	if (p_packet == NULL)
	{
		goto FreeAssets;
	}
	// A8. Read one packet at a time from the video file
	//     A packet may hold video, audio, or other data. The decoder decodes only video or
	//     audio frames; other data is not discarded by the demuxer, so that the decoder
	//     gets as much information as possible
	//     For video, one packet contains exactly one frame
	//     For audio, a packet contains an integer number of frames if the frame size is fixed,
	//                and exactly one frame if the frame size is variable

	cv::namedWindow("bgra", cv::WINDOW_NORMAL);
	while (av_read_frame(p_fmt_ctx, p_packet) == 0)
	{
		if (p_packet->stream_index == v_idx)  // handle video packets only
		{
			// A9. Video decoding: packet ==> frame
			// A9.1 Feed the decoder; audio packets were already filtered out by the check above
			ret = avcodec_send_packet(p_codec_ctx, p_packet);
			if (ret != 0)
			{
				printf("avcodec_send_packet() failed %d\n", ret);
				goto FreeAssets;
			}
			// A9.2 Receive every frame the decoder has ready. AVERROR(EAGAIN) is not an error:
			//      it means the decoder needs more packets before it can output a frame
			while ((ret = avcodec_receive_frame(p_codec_ctx, p_frm_raw)) == 0)
			{
				// A10. Image conversion: p_frm_raw->data ==> p_frm_rgb->data
				// A contiguous region of the source image is processed and written to the
				// matching region of the destination; the region must consist of consecutive rows
				// plane: planar YUV has three planes (Y, U, V); packed BGRA has a single plane
				// slice: a run of consecutive rows, ordered top to bottom or bottom to top
				// stride/pitch: bytes per row, Stride = BytesPerPixel * Width + Padding (mind alignment)
				// AVFrame.data[]: each element points to the corresponding plane
				// AVFrame.linesize[]: each element is the bytes per row of the corresponding plane
				sws_scale(sws_ctx,                            // sws context
					(const uint8_t *const *)p_frm_raw->data,  // src slice
					p_frm_raw->linesize,                      // src stride
					0,                                        // src slice y
					p_codec_ctx->height,                      // src slice height
					p_frm_rgb->data,                          // dst planes
					p_frm_rgb->linesize                       // dst strides
				);

				// rows is the image height, cols is the image width.
				// The Mat wraps the buffer without copying; linesize[0] is passed as the row step
				cv::Mat bgra_img(height, width, CV_8UC4, p_frm_rgb->data[0], p_frm_rgb->linesize[0]);
				cv::imshow("bgra", bgra_img);
				cv::waitKey(33);  // about 30 frames per second, regardless of the real frame rate
			}
			if (ret != AVERROR(EAGAIN) && ret != AVERROR_EOF)
			{
				printf("avcodec_receive_frame() failed %d\n", ret);
				goto FreeAssets;
			}
		}
		av_packet_unref(p_packet);
	}
	rc = 0;


FreeAssets:
	if (p_packet != NULL)
		av_packet_free(&p_packet);
	if (sws_ctx != NULL)
		sws_freeContext(sws_ctx);
	if (buffer != NULL)
		av_free(buffer);
	if (p_frm_rgb != NULL)
		av_frame_free(&p_frm_rgb);
	if (p_frm_raw != NULL)
		av_frame_free(&p_frm_raw);
	if (p_codec_ctx != NULL)
		avcodec_free_context(&p_codec_ctx);
	if (p_fmt_ctx != NULL)
		avformat_close_input(&p_fmt_ctx);

	return rc;
}
#endif
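
The avcodec_send_packet / avcodec_receive_frame pair used by the player behaves like a small state machine. A receive call can answer "need more input" (AVERROR(EAGAIN)) before any frame is ready, and frames still buffered at the end of the stream come out only after the decoder is flushed with a NULL packet. To show just that control flow without linking FFmpeg, here is a toy model. `ToyDecoder`, `kOk`, `kAgain`, `kEof`, and `decode_all` are all hypothetical names; they imitate the return-code behavior only and are not how libavcodec is implemented.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical return codes imitating 0, AVERROR(EAGAIN) and AVERROR_EOF.
enum Status { kOk = 0, kAgain = -1, kEof = -2 };

// Toy decoder that holds back `delay` packets before emitting frames,
// roughly like a codec that has to reorder B-frames.
class ToyDecoder {
public:
    explicit ToyDecoder(std::size_t delay) : delay_(delay) {}

    // A null packet switches the decoder into draining (flush) mode.
    Status send(const int* packet)
    {
        if (packet == nullptr)
            draining_ = true;
        else
            queue_.push_back(*packet);
        return kOk;
    }

    // Emits a frame when enough input is buffered, or when draining.
    Status receive(int* frame)
    {
        if (queue_.size() > delay_ || (draining_ && !queue_.empty())) {
            *frame = queue_.front();
            queue_.pop_front();
            return kOk;
        }
        return draining_ ? kEof : kAgain;
    }

private:
    std::size_t delay_;
    bool draining_ = false;
    std::deque<int> queue_;
};

// Sends every packet, collects all frames that are ready after each send,
// then flushes the decoder and collects the frames it was still holding.
std::vector<int> decode_all(const std::vector<int>& packets, std::size_t delay)
{
    ToyDecoder dec(delay);
    std::vector<int> frames;
    int frame = 0;
    for (const int& pkt : packets) {
        dec.send(&pkt);
        while (dec.receive(&frame) == kOk)   // kAgain only means "send more input"
            frames.push_back(frame);
    }
    dec.send(nullptr);                        // flush: there is no more input
    while (dec.receive(&frame) == kOk)        // the loop ends when kEof comes back
        frames.push_back(frame);
    return frames;
}
```

With a delay of 2 and five packets, only three frames come out while packets are being sent; the last two appear after the flush. A loop that calls receive once per packet and never flushes would lose them, which is the reason for the inner while loop and the final drain.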