Before muxing into MP4, carefully read muxing.c from the official FFmpeg examples; it muxes generated test data into an MP4 file.
Also read the following articles to understand FFmpeg time bases, timestamps, and the principles of audio/video synchronization:
<Compute PTS and DTS correctly to sync audio and video ffmpeg C++>
<Analysis of ffplay's audio/video synchronization (ffmpeg 2.3)>
<Understanding pts, dts and time_base in ffmpeg>
<Fixing audio/video desync when transcoding with the FFMPEG SDK (repost)>
<ffmpeg time_base, FFmpeg timestamp notes, FFmpeg timestamp FAQ>
Understanding these basics will save you many detours. In short, playback is usually driven by either the audio or the video timeline as the master clock. With audio as the master, for example, the player renders audio frames in order (live streaming differs here: a live player may drop audio/video frames against the current wall clock to preserve sync), computes each audio frame's duration, and compares it against the video frames' timestamps and durations, so that at any moment the correct audio and video data are presented.
As the earlier capture and encoding articles showed, PTS and DTS were deliberately set on every AVFrame and AVPacket, and an AVPacket carrying both was finally handed to the muxer; in effect, the encoder's timestamps are what drive the final audio/video synchronization. I suspect there is a more elegant approach, such as computing the timestamps from each audio/video frame at mux time; pointers from more experienced developers are welcome.
A key function; it is worth reading its source:
/**
* Write a packet to an output media file ensuring correct interleaving.
*
* This function will buffer the packets internally as needed to make sure the
* packets in the output file are properly interleaved in the order of
* increasing dts. Callers doing their own interleaving should call
* av_write_frame() instead of this function.
*
* Using this function instead of av_write_frame() can give muxers advance
* knowledge of future packets, improving e.g. the behaviour of the mp4
* muxer for VFR content in fragmenting mode.
*
* @param s media file handle
* @param pkt The packet containing the data to be written.
* <br>
* If the packet is reference-counted, this function will take
* ownership of this reference and unreference it later when it sees
* fit.
* The caller must not access the data through this reference after
* this function returns. If the packet is not reference-counted,
* libavformat will make a copy.
* <br>
* This parameter can be NULL (at any time, not just at the end), to
* flush the interleaving queues.
* <br>
* Packet's @ref AVPacket.stream_index "stream_index" field must be
* set to the index of the corresponding stream in @ref
* AVFormatContext.streams "s->streams".
* <br>
* The timestamps (@ref AVPacket.pts "pts", @ref AVPacket.dts "dts")
* must be set to correct values in the stream's timebase (unless the
* output format is flagged with the AVFMT_NOTIMESTAMPS flag, then
* they can be set to AV_NOPTS_VALUE).
* The dts for subsequent packets in one stream must be strictly
* increasing (unless the output format is flagged with the
* AVFMT_TS_NONSTRICT, then they merely have to be nondecreasing).
* @ref AVPacket.duration "duration" should also be set if known.
*
* @return 0 on success, a negative AVERROR on error. Libavformat will always
* take care of freeing the packet, even if this function fails.
*
* @see av_write_frame(), AVFormatContext.max_interleave_delta
*/
int av_interleaved_write_frame(AVFormatContext *s, AVPacket *pkt);
Make sure calls to this function are thread-safe. Reading the source shows that it pushes packets onto an internally maintained queue and then performs I/O. This project encodes and muxes audio and video on separate threads, and calling this function without a lock caused baffling heap-corruption crashes.
Define a helper struct (suggested)
typedef struct {
    // common
    AVStream *st;                     // output stream
    AVBitStreamFilterContext *filter; // bitstream filter (h264_mp4toannexb / aac_adtstoasc)
    uint64_t pre_pts;                 // first pts seen, used as the zero point
    MUX_SETTING_T setting;            // mux settings (copied in add_*_stream)
    // video
    encoder_264 *v_enc;               // video encoder
    record_desktop *v_src;            // video source
    sws_helper *v_sws;                // video scaler / pixel-format converter
    // audio
    int a_nb;                         // number of audio sources
    encoder_aac *a_enc;               // audio encoder
    record_audio **a_src;             // audio sources
    resample_pcm **a_rs;              // per-source resamplers
    filter_amix *a_filter;            // audio mixing filter (type name assumed); its time base is used in write_audio
} MUX_STREAM;
This struct holds the audio/video capturers, encoders, converters, and other objects needed for the duration of the recording.
Create the AVFormatContext from the output file name or extension
int muxer_mp4::alloc_oc(const char * output_file, const MUX_SETTING_T & setting)
{
    _output_file = std::string(output_file);

    int error = AE_NO;
    int ret = 0;

    do {
        // with format and format name both NULL, the container is guessed
        // from the file extension (.mp4)
        ret = avformat_alloc_output_context2(&_fmt_ctx, NULL, NULL, output_file);
        if (ret < 0 || !_fmt_ctx) {
            error = AE_FFMPEG_ALLOC_CONTEXT_FAILED;
            break;
        }

        _fmt = _fmt_ctx->oformat;
    } while (0);

    return error;
}
Add the video stream and set the proper codec parameters, bit rate, etc.
int muxer_mp4::add_video_stream(const MUX_SETTING_T & setting, record_desktop * source_desktop)
{
    int error = AE_NO;
    int ret = 0;

    _v_stream = new MUX_STREAM();
    memset(_v_stream, 0, sizeof(MUX_STREAM));

    _v_stream->v_src = source_desktop;
    _v_stream->pre_pts = -1;

    _v_stream->v_src->registe_cb(
        std::bind(&muxer_mp4::on_desktop_data, this, std::placeholders::_1),
        std::bind(&muxer_mp4::on_desktop_error, this, std::placeholders::_1)
    );

    RECORD_DESKTOP_RECT v_rect = _v_stream->v_src->get_rect();

    do {
        _v_stream->v_enc = new encoder_264();
        error = _v_stream->v_enc->init(setting.v_width, setting.v_height, setting.v_frame_rate, setting.v_bit_rate, setting.v_qb);
        if (error != AE_NO)
            break;

        _v_stream->v_enc->registe_cb(
            std::bind(&muxer_mp4::on_enc_264_data, this, std::placeholders::_1),
            std::bind(&muxer_mp4::on_enc_264_error, this, std::placeholders::_1)
        );

        _v_stream->v_sws = new sws_helper();
        error = _v_stream->v_sws->init(
            _v_stream->v_src->get_pixel_fmt(),
            v_rect.right - v_rect.left,
            v_rect.bottom - v_rect.top,
            AV_PIX_FMT_YUV420P,
            setting.v_width,
            setting.v_height
        );
        if (error != AE_NO)
            break;

        AVCodec *codec = avcodec_find_encoder(_fmt->video_codec);
        if (!codec) {
            error = AE_FFMPEG_FIND_ENCODER_FAILED;
            break;
        }

        AVStream *st = avformat_new_stream(_fmt_ctx, codec);
        if (!st) {
            error = AE_FFMPEG_NEW_STREAM_FAILED;
            break;
        }

        st->codec->codec_id = AV_CODEC_ID_H264;
        st->codec->bit_rate_tolerance = setting.v_bit_rate;
        st->codec->codec_type = AVMEDIA_TYPE_VIDEO;
        st->codec->time_base.den = setting.v_frame_rate;
        st->codec->time_base.num = 1;
        st->codec->pix_fmt = AV_PIX_FMT_YUV420P;
        st->codec->coded_width = setting.v_width;
        st->codec->coded_height = setting.v_height;
        st->codec->width = setting.v_width;
        st->codec->height = setting.v_height;
        st->codec->max_b_frames = 0; // no B frames
        st->time_base = { 1, 90000 }; // 90 kHz, the conventional MPEG video time base
        st->avg_frame_rate = av_inv_q(st->codec->time_base);

        if (_fmt_ctx->oformat->flags & AVFMT_GLOBALHEADER) {
            // without the global header most players cannot play the file;
            // the extradata is written out by avformat_write_header
            st->codec->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
            st->codec->extradata_size = _v_stream->v_enc->get_extradata_size();
            st->codec->extradata = (uint8_t*)av_memdup(_v_stream->v_enc->get_extradata(), _v_stream->v_enc->get_extradata_size());
        }

        _v_stream->st = st;
        _v_stream->setting = setting;
        _v_stream->filter = av_bitstream_filter_init("h264_mp4toannexb");
    } while (0);

    return error;
}
Note the extradata setup: it prepares the data written into the MP4 file header. The header carries decoding parameters, duration, and so on, which a player uses to load the correct decoder. For H.264 this means the SPS/PPS; for AAC it is the AudioSpecificConfig (the aac_adtstoasc filter converts ADTS framing into it). Without this step, most players cannot play the resulting MP4 file directly.
Add the audio stream and set the proper codec parameters, bit rate, etc.
int muxer_mp4::add_audio_stream(const MUX_SETTING_T & setting, record_audio ** source_audios, const int source_audios_nb)
{
    int error = AE_NO;
    int ret = 0;

    _a_stream = new MUX_STREAM();
    memset(_a_stream, 0, sizeof(MUX_STREAM));

    _a_stream->a_nb = source_audios_nb;
    _a_stream->a_rs = new resample_pcm*[_a_stream->a_nb];
    _a_stream->a_src = new record_audio*[_a_stream->a_nb];
    _a_stream->pre_pts = -1;

    do {
        _a_stream->a_enc = new encoder_aac();
        error = _a_stream->a_enc->init(
            setting.a_nb_channel,
            setting.a_sample_rate,
            setting.a_sample_fmt,
            setting.a_bit_rate
        );
        if (error != AE_NO)
            break;

        _a_stream->a_enc->registe_cb(
            std::bind(&muxer_mp4::on_enc_aac_data, this, std::placeholders::_1),
            std::bind(&muxer_mp4::on_enc_aac_error, this, std::placeholders::_1)
        );

        for (int i = 0; i < _a_stream->a_nb; i++) {
            _a_stream->a_src[i] = source_audios[i];

            _a_stream->a_src[i]->registe_cb(
                std::bind(&muxer_mp4::on_audio_data, this, std::placeholders::_1, std::placeholders::_2),
                std::bind(&muxer_mp4::on_audio_error, this, std::placeholders::_1, std::placeholders::_2),
                i
            );

            SAMPLE_SETTING src_setting = {
                _a_stream->a_enc->get_nb_samples(),
                av_get_default_channel_layout(_a_stream->a_src[i]->get_channel_num()),
                _a_stream->a_src[i]->get_channel_num(),
                _a_stream->a_src[i]->get_fmt(),
                _a_stream->a_src[i]->get_sample_rate()
            };
            SAMPLE_SETTING dst_setting = {
                _a_stream->a_enc->get_nb_samples(),
                av_get_default_channel_layout(setting.a_nb_channel),
                setting.a_nb_channel,
                setting.a_sample_fmt,
                setting.a_sample_rate
            };
            // src_setting/dst_setting presumably initialize the per-source
            // resampler _a_stream->a_rs[i]; that setup is omitted from the
            // original listing
        }

        AVCodec *codec = avcodec_find_encoder(_fmt->audio_codec);
        if (!codec) {
            error = AE_FFMPEG_FIND_ENCODER_FAILED;
            break;
        }

        AVStream *st = avformat_new_stream(_fmt_ctx, codec);
        if (!st) {
            error = AE_FFMPEG_NEW_STREAM_FAILED;
            break;
        }

        st->time_base = { 1, setting.a_sample_rate };
        st->codec->bit_rate = setting.a_bit_rate;
        st->codec->channels = setting.a_nb_channel;
        st->codec->sample_rate = setting.a_sample_rate;
        st->codec->sample_fmt = setting.a_sample_fmt;
        st->codec->time_base = { 1, setting.a_sample_rate };
        st->codec->channel_layout = av_get_default_channel_layout(setting.a_nb_channel);

        if (_fmt_ctx->oformat->flags & AVFMT_GLOBALHEADER) {
            // without the global header most players cannot play the file
            st->codec->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
            st->codec->extradata_size = _a_stream->a_enc->get_extradata_size();
            st->codec->extradata = (uint8_t*)av_memdup(_a_stream->a_enc->get_extradata(), _a_stream->a_enc->get_extradata_size());
        }

        _a_stream->st = st;
        _a_stream->setting = setting;
        _a_stream->filter = av_bitstream_filter_init("aac_adtstoasc");
    } while (0);

    return error;
}
As you can see, adding the audio stream sets extradata in the same way; the value is easy to obtain once the encoder has been initialized.
Open the output file and write the header
int muxer_mp4::open_output(const char * output_file, const MUX_SETTING_T & setting)
{
    int error = AE_NO;
    int ret = 0;

    do {
        if (!(_fmt->flags & AVFMT_NOFILE)) {
            ret = avio_open(&_fmt_ctx->pb, output_file, AVIO_FLAG_WRITE);
            if (ret < 0) {
                error = AE_FFMPEG_OPEN_IO_FAILED;
                break;
            }
        }

        ret = avformat_write_header(_fmt_ctx, NULL);
        if (ret < 0) {
            error = AE_FFMPEG_WRITE_HEADER_FAILED;
            break;
        }
    } while (0);

    return error;
}
Write video data
int muxer_mp4::write_video(AVPacket *packet)
{
    // must lock here: av_interleaved_write_frame pushes the packet onto an
    // internal queue and is not thread safe
    std::lock_guard<std::mutex> lock(_mutex);

    if (_paused) return AE_NO;

    packet->stream_index = _v_stream->st->index;

    if (_v_stream->pre_pts == (uint64_t)-1) {
        _v_stream->pre_pts = packet->pts;
    }

    // rebase to zero, then rescale from the source time base to the stream time base
    packet->pts = packet->pts - _v_stream->pre_pts;
    packet->pts = av_rescale_q_rnd(packet->pts, _v_stream->v_src->get_time_base(), _v_stream->st->time_base, (AVRounding)(AV_ROUND_NEAR_INF | AV_ROUND_PASS_MINMAX));
    packet->dts = packet->pts; // no B frames, so dts can simply equal pts

    //al_debug("V:%lld", packet->pts);

#if 0
    // optional debugging: dump the raw H.264 stream to a file
    static FILE *fp = NULL;
    if (fp == NULL) {
        fp = fopen("..\\..\\save.264", "wb+");
        // write sps/pps first
        fwrite(_v_stream->v_enc->get_extradata(), 1, _v_stream->v_enc->get_extradata_size(), fp);
    }
    fwrite(packet->data, 1, packet->size, fp);
    fflush(fp);
#endif

    av_assert0(packet->data != NULL);

    // no need to unref the packet: libavformat takes ownership, even on failure
    return av_interleaved_write_frame(_fmt_ctx, packet);
}
Before calling av_interleaved_write_frame, the video AVPacket needs some processing. Here we only offset the PTS by the first frame's PTS and rescale it from the source time base into the video stream's time base, so that correct timestamps are written to the file, and we set DTS equal to PTS (safe because the encoder produces no B frames).
Write audio data
int muxer_mp4::write_audio(AVPacket *packet)
{
    std::lock_guard<std::mutex> lock(_mutex);

    if (_paused) return AE_NO;

    packet->stream_index = _a_stream->st->index;

    if (_a_stream->pre_pts == (uint64_t)-1) {
        _a_stream->pre_pts = packet->pts;
    }

    // rebase to zero, then rescale via AV_TIME_BASE into the stream time base
    packet->pts = packet->pts - _a_stream->pre_pts;
    packet->pts = av_rescale_q(packet->pts, _a_stream->a_filter->get_time_base(), { 1, AV_TIME_BASE });
    packet->pts = av_rescale_q_rnd(packet->pts, { 1, AV_TIME_BASE }, _a_stream->st->time_base, (AVRounding)(AV_ROUND_NEAR_INF | AV_ROUND_PASS_MINMAX));
    packet->dts = packet->pts; // no B frames, so dts can simply equal pts

    //al_debug("A:%lld %lld", packet->pts, packet->dts);

    av_assert0(packet->data != NULL);

    // no need to unref the packet: libavformat takes ownership, even on failure
    return av_interleaved_write_frame(_fmt_ctx, packet);
}
Writing audio works the same as writing video, except the timestamp is rescaled in two steps: first from the audio filter's time base into AV_TIME_BASE (1/1000000), then from AV_TIME_BASE into the audio stream's time base. Pay attention to which time base each conversion starts from.
Write the MP4 trailer
av_write_trailer(_fmt_ctx); // the trailer is mandatory; without it the MP4 cannot be played
This step is as important as writing the header: if the process crashes during recording before the trailer is written, players cannot play the file.
That is why a later article will add muxing to MKV (which tolerates a missing trailer), along with a transcoding feature.
Release resources and close the file
cleanup_video();
cleanup_audio();

if (_fmt && !(_fmt->flags & AVFMT_NOFILE))
    avio_closep(&_fmt_ctx->pb);

if (_fmt_ctx) {
    avformat_free_context(_fmt_ctx);
}
When I first built this there were almost no articles to learn from. The earlier parts went quickly, but audio/video synchronization took about a week of fiddling; after asking many experienced developers and adding my own understanding, I arrived at the current version. One of them put it well: FFmpeg is just a tool, and for screen recording it is enough to make sure the audio and video timestamps reflect the real capture times.
I have always wanted to work on audio/video, and pointers from experts are appreciated.
Please point out anything incorrect; this series only hopes to bring some inspiration and discussion.
Anyone interested is also welcome to help improve and refine this screen recorder on GitHub.