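FFmpeg is an open-source, cross-platform solution for audio and video streams. It provides a complete toolchain for recording, converting, and streaming audio and video, and it includes libavcodec, an advanced audio/video codec library. The libraries expose many APIs, but some problems, such as audio/video synchronization, are still left for the application to solve.
Some of the programs produced by building ffmpeg:
ffplay: an actual media player with a graphical window, comparable to vlc or mplayer.
ffmpeg: a command-line tool that builds on the FFmpeg library APIs, plus some extra logic, to provide functions such as transcoding.
ffserver: a streaming server that can unicast or multicast streams.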
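The general procedure for processing audio and video:
1. Open the video stream from the video file (demuxing).
2. Read packets from the video stream and decode them into a frame (decoding).
3. If the frame is not yet complete, go back to step 2.
4. Operate on the frame.
5. Go back to step 2.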
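The five steps above can be sketched as a plain C loop. This is only an illustration of the control flow: MockPacket, MockDecoder, mock_decode and process_video are hypothetical stand-ins invented for this sketch, not FFmpeg types or functions, and the "file" is just an in-memory array of packets.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; these are NOT FFmpeg types. */
typedef struct {
    int stream_index;   /* which stream the packet belongs to */
    int ends_frame;     /* nonzero if this packet completes a frame */
} MockPacket;

typedef struct {
    int packets_in_frame;   /* packets accumulated for the frame being built */
} MockDecoder;

/* Step 2: feed one packet to the decoder.
 * Returns nonzero once a complete frame is available. */
static int mock_decode(MockDecoder *dec, const MockPacket *pkt)
{
    dec->packets_in_frame++;
    if (!pkt->ends_frame)
        return 0;
    dec->packets_in_frame = 0;   /* start collecting the next frame */
    return 1;
}

/* Runs the read/decode loop over an in-memory list of packets and
 * returns the number of complete frames handled for one stream. */
static int process_video(const MockPacket *pkts, size_t n, int video_stream)
{
    MockDecoder dec = { 0 };
    int frames = 0;

    for (size_t i = 0; i < n; i++) {            /* steps 2 and 5: read the next packet */
        if (pkts[i].stream_index != video_stream)
            continue;                            /* step 1: keep only the chosen stream */
        if (!mock_decode(&dec, &pkts[i]))
            continue;                            /* step 3: frame incomplete, read more */
        frames++;                                /* step 4: operate on the frame */
    }
    return frames;
}
```

Calling process_video with an array of MockPacket values and the index of the stream of interest returns how many frames were completed; a trailing packet that never finishes a frame is simply not counted.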
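I. Opening the file and obtaining the video streams (the steps below use ffmpeg 0.8)
<1> Call av_register_all() to register all file formats and codecs. It only needs to be called once, so the start of main() is the natural place for it.
<2> Call av_open_input_file to open the video file. This function reads the file header and stores the information in an AVFormatContext. It is declared in avformat.h; its implementation is as follows: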
- int av_open_input_file(AVFormatContext **ic_ptr, const char *filename,
-                        AVInputFormat *fmt,
-                        int buf_size,
-                        AVFormatParameters *ap)
- {
-     int err;
-     AVDictionary *opts = convert_format_parameters(ap);
-     if (!ap || !ap->prealloced_context)
-         *ic_ptr = NULL;
-     err = avformat_open_input(ic_ptr, filename, fmt, &opts);
-     av_dict_free(&opts);
-     return err;
- }
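The function is implemented in utils.c. If the last three arguments are NULL or 0, libavformat detects the format and its parameters automatically. The function ultimately delegates to avformat_open_input: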
- int avformat_open_input(AVFormatContext **ps, const char *filename, AVInputFormat *fmt, AVDictionary **options)
- {
-     return avformat_open_input_header(ps, filename, fmt, options, NULL);
- }
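avformat_open_input_header reads the header of the video file and stores the information in the AVFormatContext.
<3> Using the header information, obtain the audio and video stream information by calling av_find_stream_info (declared in avformat.h):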
- int av_find_stream_info(AVFormatContext *ic)
- {
- int i, count, ret, read_size, j;
- AVStream *st;
- AVPacket pkt1, *pkt;
- int64_t old_offset = avio_tell(ic->pb);
- for(i=0;i<ic->nb_streams;i++) {
- AVCodec *codec;
- st = ic->streams[i];
- /* st->codec is an AVCodecContext, which holds the information about
- the codec used by this stream */
- if (st->codec->codec_id == CODEC_ID_AAC) {
- st->codec->sample_rate = 0;
- st->codec->frame_size = 0;
- st->codec->channels = 0;
- }
- if (st->codec->codec_type == AVMEDIA_TYPE_VIDEO ||
- st->codec->codec_type == AVMEDIA_TYPE_SUBTITLE) {
- /* if(!st->time_base.num)
- st->time_base= */
- if(!st->codec->time_base.num)
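- /* time_base is an AVRational, a numerator/denominator struct. It gives the
- unit of time for timestamps; for fixed-frame-rate content it is 1/frame rate.
- Many codecs use non-integer frame rates, e.g. NTSC at 30000/1001 = 29.97 fps */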
- st->codec->time_base= st->time_base;
- }
- //only for the split stuff
- if (!st->parser && !(ic->flags & AVFMT_FLAG_NOPARSE)) {
- st->parser = av_parser_init(st->codec->codec_id);
- if(st->need_parsing == AVSTREAM_PARSE_HEADERS && st->parser){
- st->parser->flags |= PARSER_FLAG_COMPLETE_FRAMES;
- }
- }
- assert(!st->codec->codec);
- /* find the matching decoder */
- codec = avcodec_find_decoder(st->codec->codec_id);
- /* Force decoding of at least one frame of codec data
- * this makes sure the codec initializes the channel configuration
- * and does not trust the values from the container.
- */
- if (codec && codec->capabilities & CODEC_CAP_CHANNEL_CONF)
- st->codec->channels = 0;
- /* Ensure that subtitle_header is properly set. */
- if (st->codec->codec_type == AVMEDIA_TYPE_SUBTITLE
- && codec && !st->codec->codec)
- // open the codec
- avcodec_open(st->codec, codec);
- //try to just open decoders, in case this is enough to get parameters
- if(!has_codec_parameters(st->codec)){
- if (codec && !st->codec->codec)
- avcodec_open(st->codec, codec);
- }
- }
- for (i=0; i<ic->nb_streams; i++) {
- ic->streams[i]->info->last_dts = AV_NOPTS_VALUE;
- }
- count = 0;
- read_size = 0;
- for(;;) {
- if(url_interrupt_cb()){
- ret= AVERROR_EXIT;
- av_log(ic, AV_LOG_DEBUG, "interrupted\n");
- break;
- }
- /* check if one codec still needs to be handled */
- for(i=0;i<ic->nb_streams;i++) {
- int fps_analyze_framecount = 20;
- st = ic->streams[i];
- if (!has_codec_parameters(st->codec))
- break;
- /* if the timebase is coarse (like the usual millisecond precision
- of mkv), we need to analyze more frames to reliably arrive at
- the correct fps */
- if (av_q2d(st->time_base) > 0.0005)
- fps_analyze_framecount *= 2;
- if (ic->fps_probe_size >= 0)
- fps_analyze_framecount = ic->fps_probe_size;
- /* variable fps and no guess at the real fps */
- if( tb_unreliable(st->codec) && !(st->r_frame_rate.num && st->avg_frame_rate.num)
- && st->info->duration_count < fps_analyze_framecount
- && st->codec->codec_type == AVMEDIA_TYPE_VIDEO)
- break;
- if(st->parser && st->parser->parser->split && !st->codec->extradata)
- break;
- if(st->first_dts == AV_NOPTS_VALUE)
- break;
- }
- if (i == ic->nb_streams) {
- /* NOTE: if the format has no header, then we need to read
- some packets to get most of the streams, so we cannot
- stop here */
- if (!(ic->ctx_flags & AVFMTCTX_NOHEADER)) {
- /* if we found the info for all the codecs, we can stop */
- ret = count;
- av_log(ic, AV_LOG_DEBUG, "All info found\n");
- break;
- }
- }
- /* we did not get all the codec info, but we read too much data */
- if (read_size >= ic->probesize) {
- ret = count;
- av_log(ic, AV_LOG_DEBUG, "Probe buffer size limit %d reached\n", ic->probesize);
- break;
- }
- /* NOTE: a new stream can be added there if no header in file
- (AVFMTCTX_NOHEADER) */
- ret = av_read_frame_internal(ic, &pkt1);
- if (ret < 0 && ret != AVERROR(EAGAIN)) {
- /* EOF or error */
- ret = -1; /* we could not have all the codec parameters before EOF */
- for(i=0;i<ic->nb_streams;i++) {
- st = ic->streams[i];
- if (!has_codec_parameters(st->codec)){
- char buf[256];
- avcodec_string(buf, sizeof(buf), st->codec, 0);
- av_log(ic, AV_LOG_WARNING, "Could not find codec parameters (%s)\n", buf);
- } else {
- ret = 0;
- }
- }
- break;
- }
- if (ret == AVERROR(EAGAIN))
- continue;
- pkt= add_to_pktbuf(&ic->packet_buffer, &pkt1, &ic->packet_buffer_end);
- if ((ret = av_dup_packet(pkt)) < 0)
- goto find_stream_info_err;
- read_size += pkt->size;
- st = ic->streams[pkt->stream_index];
- if (st->codec_info_nb_frames>1) {
- int64_t t;
- if (st->time_base.den > 0 && (t=av_rescale_q(st->info->codec_info_duration, st->time_base, AV_TIME_BASE_Q)) >= ic->max_analyze_duration) {
- av_log(ic, AV_LOG_WARNING, "max_analyze_duration %d reached at %"PRId64"\n", ic->max_analyze_duration, t);
- break;
- }
- st->info->codec_info_duration += pkt->duration;
- }
- {
- int64_t last = st->info->last_dts;
- int64_t duration= pkt->dts - last;
- if(pkt->dts != AV_NOPTS_VALUE && last != AV_NOPTS_VALUE && duration>0){
- double dur= duration * av_q2d(st->time_base);
- // if(st->codec->codec_type == AVMEDIA_TYPE_VIDEO)
- // av_log(NULL, AV_LOG_ERROR, "%f\n", dur);
- if (st->info->duration_count < 2)
- memset(st->info->duration_error, 0, sizeof(st->info->duration_error));
- for (i=1; i<FF_ARRAY_ELEMS(st->info->duration_error); i++) {
- int framerate= get_std_framerate(i);
- int ticks= lrintf(dur*framerate/(1001*12));
- double error= dur - ticks*1001*12/(double)framerate;
- st->info->duration_error[i] += error*error;
- }
- st->info->duration_count++;
- // ignore the first 4 values, they might have some random jitter
- if (st->info->duration_count > 3)
- st->info->duration_gcd = av_gcd(st->info->duration_gcd, duration);
- }
- if (last == AV_NOPTS_VALUE || st->info->duration_count <= 1)
- st->info->last_dts = pkt->dts;
- }
- if(st->parser && st->parser->parser->split && !st->codec->extradata){
- int i= st->parser->parser->split(st->codec, pkt->data, pkt->size);
- if(i){
- st->codec->extradata_size= i;
- st->codec->extradata= av_malloc(st->codec->extradata_size + FF_INPUT_BUFFER_PADDING_SIZE);
- memcpy(st->codec->extradata, pkt->data, st->codec->extradata_size);
- memset(st->codec->extradata + i, 0, FF_INPUT_BUFFER_PADDING_SIZE);
- }
- }
- /* if still no information, we try to open the codec and to
- decompress the frame. We try to avoid that in most cases as
- it takes longer and uses more memory. For MPEG-4, we need to
- decompress for QuickTime. */
- if (!has_codec_parameters(st->codec) || !has_decode_delay_been_guessed(st))
- try_decode_frame(st, pkt);
- st->codec_info_nb_frames++;
- count++;
- }
- // close codecs which were opened in try_decode_frame()
- for(i=0;i<ic->nb_streams;i++) {
- st = ic->streams[i];
- if(st->codec->codec)
- avcodec_close(st->codec);
- }
- for(i=0;i<ic->nb_streams;i++) {
- st = ic->streams[i];
- if (st->codec_info_nb_frames>2 && !st->avg_frame_rate.num && st->info->codec_info_duration)
- av_reduce(&st->avg_frame_rate.num, &st->avg_frame_rate.den,
- (st->codec_info_nb_frames-2)*(int64_t)st->time_base.den,
- st->info->codec_info_duration*(int64_t)st->time_base.num, 60000);
- if (st->codec->codec_type == AVMEDIA_TYPE_VIDEO) {
- if(st->codec->codec_id == CODEC_ID_RAWVIDEO && !st->codec->codec_tag && !st->codec->bits_per_coded_sample){
- uint32_t tag= avcodec_pix_fmt_to_codec_tag(st->codec->pix_fmt);
- if(ff_find_pix_fmt(ff_raw_pix_fmt_tags, tag) == st->codec->pix_fmt)
- st->codec->codec_tag= tag;
- }
- // the check for tb_unreliable() is not completely correct, since this is not about handling
- // a unreliable/inexact time base, but a time base that is finer than necessary, as e.g.
- // ipmovie.c produces.
- if (tb_unreliable(st->codec) && st->info->duration_count > 15 && st->info->duration_gcd > FFMAX(1, st->time_base.den/(500LL*st->time_base.num)) && !st->r_frame_rate.num)
- av_reduce(&st->r_frame_rate.num, &st->r_frame_rate.den, st->time_base.den, st->time_base.num * st->info->duration_gcd, INT_MAX);
- if (st->info->duration_count && !st->r_frame_rate.num
- && tb_unreliable(st->codec) /*&&
- //FIXME we should not special-case MPEG-2, but this needs testing with non-MPEG-2 ...
- st->time_base.num*duration_sum[i]/st->info->duration_count*101LL > st->time_base.den*/){
- int num = 0;
- double best_error= 2*av_q2d(st->time_base);
- best_error = best_error*best_error*st->info->duration_count*1000*12*30;
- for (j=1; j<FF_ARRAY_ELEMS(st->info->duration_error); j++) {
- double error = st->info->duration_error[j] * get_std_framerate(j);
- // if(st->codec->codec_type == AVMEDIA_TYPE_VIDEO)
- // av_log(NULL, AV_LOG_ERROR, "%f %f\n", get_std_framerate(j) / 12.0/1001, error);
- if(error < best_error){
- best_error= error;
- num = get_std_framerate(j);
- }
- }
- // do not increase frame rate by more than 1 % in order to match a standard rate.
- if (num && (!st->r_frame_rate.num || (double)num/(12*1001) < 1.01 * av_q2d(st->r_frame_rate)))
- av_reduce(&st->r_frame_rate.num, &st->r_frame_rate.den, num, 12*1001, INT_MAX);
- }
- if (!st->r_frame_rate.num){
- if( st->codec->time_base.den * (int64_t)st->time_base.num
- <= st->codec->time_base.num * st->codec->ticks_per_frame * (int64_t)st->time_base.den){
- st->r_frame_rate.num = st->codec->time_base.den;
- st->r_frame_rate.den = st->codec->time_base.num * st->codec->ticks_per_frame;
- }else{
- st->r_frame_rate.num = st->time_base.den;
- st->r_frame_rate.den = st->time_base.num;
- }
- }
- }else if(st->codec->codec_type == AVMEDIA_TYPE_AUDIO) {
- if(!st->codec->bits_per_coded_sample)
- st->codec->bits_per_coded_sample= av_get_bits_per_sample(st->codec->codec_id);
- // set stream disposition based on audio service type
- switch (st->codec->audio_service_type) {
- case AV_AUDIO_SERVICE_TYPE_EFFECTS:
- st->disposition = AV_DISPOSITION_CLEAN_EFFECTS; break;
- case AV_AUDIO_SERVICE_TYPE_VISUALLY_IMPAIRED:
- st->disposition = AV_DISPOSITION_VISUAL_IMPAIRED; break;
- case AV_AUDIO_SERVICE_TYPE_HEARING_IMPAIRED:
- st->disposition = AV_DISPOSITION_HEARING_IMPAIRED; break;
- case AV_AUDIO_SERVICE_TYPE_COMMENTARY:
- st->disposition = AV_DISPOSITION_COMMENT; break;
- case AV_AUDIO_SERVICE_TYPE_KARAOKE:
- st->disposition = AV_DISPOSITION_KARAOKE; break;
- }
- }
- }
- av_estimate_timings(ic, old_offset);
- compute_chapters_end(ic);
- #if 0
- /* correct DTS for B-frame streams with no timestamps */
- for(i=0;i<ic->nb_streams;i++) {
- st = ic->streams[i];
- if (st->codec->codec_type == AVMEDIA_TYPE_VIDEO) {
- if(b-frames){
- ppktl = &ic->packet_buffer;
- while(ppkt1){
- if(ppkt1->stream_index != i)
- continue;
- if(ppkt1->pkt->dts < 0)
- break;
- if(ppkt1->pkt->pts != AV_NOPTS_VALUE)
- break;
- ppkt1->pkt->dts -= delta;
- ppkt1= ppkt1->next;
- }
- if(ppkt1)
- continue;
- st->cur_dts -= delta;
- }
- }
- }
- #endif
- find_stream_info_err:
- for (i=0; i < ic->nb_streams; i++)
- av_freep(&ic->streams[i]->info);
- return ret;
- }
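Near the end of the listing, r_frame_rate is derived by calling av_reduce on st->time_base.den and st->time_base.num * duration_gcd. The arithmetic behind that step can be sketched without FFmpeg headers. Rational, gcd64, reduce, frame_rate_from_gcd and q2d below are hypothetical stand-ins written for this sketch (Rational only mirrors the num/den layout of AVRational, and reduce omits the upper bound that av_reduce takes).

```c
#include <stdint.h>

/* Hypothetical stand-in mirroring the num/den layout of AVRational. */
typedef struct {
    int64_t num;
    int64_t den;
} Rational;

/* Greatest common divisor by Euclid's algorithm. */
static int64_t gcd64(int64_t a, int64_t b)
{
    while (b) {
        int64_t t = a % b;
        a = b;
        b = t;
    }
    return a < 0 ? -a : a;
}

/* Reduce num/den to lowest terms (simplified: no maximum-value limit). */
static Rational reduce(int64_t num, int64_t den)
{
    int64_t g = gcd64(num, den);
    Rational r = { num, den };
    if (g) {
        r.num /= g;
        r.den /= g;
    }
    return r;
}

/* Frame rate implied by a time base and the gcd of the observed DTS deltas:
 * rate = time_base.den / (time_base.num * duration_gcd). */
static Rational frame_rate_from_gcd(Rational time_base, int64_t duration_gcd)
{
    return reduce(time_base.den, time_base.num * duration_gcd);
}

/* Convert a rational to a double, in the spirit of av_q2d. */
static double q2d(Rational r)
{
    return (double)r.num / (double)r.den;
}
```

For example, with a time base of 1/90000 and DTS deltas whose gcd is 3003, frame_rate_from_gcd yields 30000/1001, which is the NTSC rate of roughly 29.97 fps.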
II. Reading packets and decoding them into frames
<1> Allocate memory for the destination frame with avcodec_alloc_frame() (declared in avcodec.h):
- AVFrame *avcodec_alloc_frame(void);
- /* ffmpeg/libavcodec/utils.c */
- AVFrame *avcodec_alloc_frame(void){
-     AVFrame *pic= av_malloc(sizeof(AVFrame));
-     if(pic==NULL) return NULL;
-     avcodec_get_frame_defaults(pic);
-     return pic;
- }
<2> Read the video stream packet by packet and decode the packets into frames. The main function is av_read_frame(). Note that av_read_packet is deprecated, as documented in ffmpeg 0.8. The prototype of av_read_frame is:
- /**
- * Return the next frame of a stream.
- * This function returns what is stored in the file, and does not validate
- * that what is there are valid frames for the decoder. It will split what is
- * stored in the file into frames and return one for each call. It will not
- * omit invalid data between valid frames so as to give the decoder the maximum
- * information possible for decoding.
- *
- * The returned packet is valid
- * until the next av_read_frame() or until av_close_input_file() and
- * must be freed with av_free_packet. For video, the packet contains
- * exactly one frame. For audio, it contains an integer number of
- * frames if each frame has a known fixed size (e.g. PCM or ADPCM
- * data). If the audio frames have a variable size (e.g. MPEG audio),
- * then it contains one frame.
- *
- * pkt->pts, pkt->dts and pkt->duration are always set to correct
- * values in AVStream.time_base units (and guessed if the format cannot
- * provide them). pkt->pts can be AV_NOPTS_VALUE if the video format
- * has B-frames, so it is better to rely on pkt->dts if you do not
- * decompress the payload.
- *
- * @return 0 if OK, < 0 on error or end of file
- */
- int av_read_frame(AVFormatContext *s, AVPacket *pkt);
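The comment above notes that pkt->pts may be AV_NOPTS_VALUE and that timestamps are expressed in AVStream.time_base units. One common way to cope with this, sketched here under assumptions, is to fall back to dts when pts is unset and then scale by the time base. NOPTS_VALUE, TimeBase, best_timestamp and timestamp_seconds are hypothetical names for this sketch; NOPTS_VALUE is given the same value FFmpeg uses for AV_NOPTS_VALUE so that the code builds without FFmpeg headers.

```c
#include <stdint.h>

/* Same value as FFmpeg's AV_NOPTS_VALUE, redefined under a hypothetical
 * name so this sketch compiles without the FFmpeg headers. */
#define NOPTS_VALUE INT64_MIN

/* Hypothetical stand-in for AVRational used as a stream time base. */
typedef struct {
    int num;
    int den;
} TimeBase;

/* Prefer pts; fall back to dts when pts is unset. */
static int64_t best_timestamp(int64_t pts, int64_t dts)
{
    return pts != NOPTS_VALUE ? pts : dts;
}

/* Convert a timestamp in time-base units to seconds.
 * Returns -1.0 when the timestamp is unset or the time base is invalid. */
static double timestamp_seconds(int64_t ts, TimeBase tb)
{
    if (ts == NOPTS_VALUE || tb.den == 0)
        return -1.0;
    return (double)ts * tb.num / tb.den;
}
```

With a 1/90000 time base, a timestamp of 180000 corresponds to 2 seconds. In real code the two inputs would come from pkt->pts and pkt->dts, and the time base from the stream the packet belongs to.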
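av_read_frame is normally called in a while loop. Each call reads one packet and stores it in an AVPacket struct; avcodec_decode_video2() then decodes the packet into a frame. The older avcodec_decode_video() is deprecated, as noted in ffmpeg/doc/APIChanges: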
- 2009-04-07 - r18351 - lavc 52.23.0 - avcodec_decode_video/audio/subtitle
- The old decoding functions are deprecated, all new code should use the
- new functions avcodec_decode_video2(), avcodec_decode_audio3() and
- avcodec_decode_subtitle2(). These new functions take an AVPacket *pkt
- argument instead of a const uint8_t *buf / int buf_size pair.
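Decoding a single packet may not yield a complete frame. avcodec_decode_video2 therefore reports completion through its got_picture output flag, which is set to a nonzero value once a whole frame has been decoded. When the flag is set, the frame is ready and can be processed as needed.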