encode_audio.c / encode_video.c walkthrough

hjjdebug

Last modified 2023-12-14 11:25:50

First published 2022-01-29 18:06:28

Article link: https://blog.csdn.net/hejinjing_tom_com/article/details/122746197

----------------------------------------
author:hjjdebug
date:2022-01-29
----------------------------------------

encode_audio.c walkthrough
----------------------------------------
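The goal is to build an audio bitstream.

The process has three steps:

1. Set up the codec used for encoding: decide the channel layout, sample_fmt, sample rate, and so on.
2. Build a frame and prepare the audio data.
3. Encode the frame and write the result to the output file.

Key point: this example introduces the concepts of frame, packet, and codec. They are worth taking the time to absorb and understand.

With them you can produce a compressed audio or video file.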

Here is what ffprobe reports for the finished file: ffprobe audio
Input #0, mp3, from 'audio':
Duration: 00:00:05.22, start: 0.000000, bitrate: 64 kb/s
Stream #0:0: Audio: mp2, 44100 Hz, stereo, fltp, 64 kb/s

Start by finding the encoder and allocating a context:
    codec = avcodec_find_encoder(AV_CODEC_ID_MP2);
    c = avcodec_alloc_context3(codec);
A series of fields is then assigned on the context before it is opened:
    c->sample_fmt = AV_SAMPLE_FMT_S16;
    c->sample_rate    = select_sample_rate(codec);
    c->channel_layout = select_channel_layout(codec);
    c->channels       = av_get_channel_layout_nb_channels(c->channel_layout);

check_sample_fmt(codec, c->sample_fmt) shows that a codec can support more than one sample_fmt.
select_sample_rate shows that a codec can support several sample rates.
select_channel_layout shows that a codec can support several channel layouts.
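
To illustrate the idea behind select_sample_rate, here is a minimal sketch that picks the closest match from a zero-terminated list of supported rates. It deliberately avoids any libavcodec types: pick_sample_rate and the plain int array are hypothetical stand-ins, and the rule "closest to the target" is an assumption about what such a helper would do.

```c
#include <stdlib.h>

/* Pick the entry of a zero-terminated list that is closest to `target`.
 * Returns `target` itself when the list is NULL or empty, i.e. when the
 * codec places no restriction on the sample rate.
 * Hypothetical helper, not part of libavcodec. */
static int pick_sample_rate(const int *supported, int target)
{
    int best = 0;

    if (!supported)
        return target;
    for (const int *p = supported; *p; p++) {
        if (!best || abs(target - *p) < abs(target - best))
            best = *p;
    }
    return best ? best : target;
}
```

With an illustrative list such as {44100, 48000, 32000, 0}, a target of 44100 returns 44100 and a target of 47000 returns 48000.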

    if (avcodec_open2(c, codec, NULL) < 0)
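Next, allocate one packet and one frame.
Assign the frame's fields:
    frame->nb_samples     = c->frame_size;     // number of samples per frame
    frame->format         = c->sample_fmt;     // sample format
    frame->channel_layout = c->channel_layout; // channel layout
    ret = av_frame_get_buffer(frame, 0);       // align = 0: the alignment is chosen automatically
    samples = (uint16_t*)frame->data[0];
Then fill samples with data.
That completes the manual setup of the frame. The frame is then encoded as follows:
    encode()
    ret = avcodec_send_frame(ctx, frame);
    ret = avcodec_receive_packet(ctx, pkt);
Write to the file; output is just a plain file pointer:
    fwrite(pkt->data, 1, pkt->size, output);
So the encoded audio stream appears to be nothing more than many packets written one after another.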
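
The "fill samples with data" step depends on how the buffer is laid out. For a packed format such as AV_SAMPLE_FMT_S16, all channels share data[0] and are interleaved sample by sample. The sketch below shows that layout on a plain array; fill_square_s16 is a hypothetical helper, the square wave is only a stand-in for real audio, and the buffer is typed int16_t here because S16 samples are signed.

```c
#include <stdint.h>

/* Fill an interleaved S16 buffer laid out as L0 R0 L1 R1 ...
 * Each sample period holds `channels` values, one per channel.
 * A square wave stands in for real audio; every channel gets the same value.
 * Hypothetical helper for illustration only. */
static void fill_square_s16(int16_t *samples, int nb_samples, int channels,
                            int half_period, int16_t amplitude)
{
    for (int j = 0; j < nb_samples; j++) {
        int16_t v = ((j / half_period) % 2 == 0) ? amplitude
                                                 : (int16_t)-amplitude;
        for (int k = 0; k < channels; k++)
            samples[channels * j + k] = v;
    }
}
```

The index expression channels * j + k is the part that carries over to a real frame: sample j of channel k sits at that offset in the packed buffer.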

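One question: what role does the audio bit rate parameter actually play?
Once the sample rate, sample format, and channel layout are fixed, the uncompressed bit rate is determined;
dividing it by the compression ratio gives the compressed bit rate.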
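
As a worked example of that arithmetic, using the parameters from the ffprobe output (44100 Hz, stereo, 16-bit samples, 64 kb/s): the uncompressed rate is 44100 * 16 * 2 = 1411200 bit/s, so the compression ratio is about 22. The function names below are hypothetical.

```c
#include <stdint.h>

/* Uncompressed PCM bit rate in bits per second. */
static int64_t pcm_bit_rate(int sample_rate, int bits_per_sample, int channels)
{
    return (int64_t)sample_rate * bits_per_sample * channels;
}

/* Compression ratio = uncompressed bit rate / encoded bit rate. */
static double compression_ratio(int64_t pcm_bps, int64_t encoded_bps)
{
    return (double)pcm_bps / (double)encoded_bps;
}
```

In practice the direction is usually reversed: you request a bit rate from the encoder, and the compression ratio follows from it.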

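The bit rate cannot be set arbitrarily, though; it depends on what the encoder supports. For mp2, when I set 6K it reported that the value was not supported.
Looking at the code, this is the table of values (in kb/s) it can choose from:
// {0, 32, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 384 },
If you do not set bit_rate, it defaults to 384K. The minimum is 32K; with anything lower the codec cannot be opened and avcodec_open2() returns failure!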
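
A quick way to check a candidate value before calling avcodec_open2() is to look it up in that table. This is only a sketch: the array is copied from the table quoted above, mp2_kbps_listed is a hypothetical helper, and whether the encoder requires an exact match or applies some other rule is an assumption here.

```c
#include <stddef.h>

/* Bit rates in kb/s, copied from the table quoted above (index 0 is unused). */
static const int mp2_kbps[] = {
    0, 32, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 384
};

/* Hypothetical helper: returns 1 when `kbps` appears in the table, else 0. */
static int mp2_kbps_listed(int kbps)
{
    size_t n = sizeof(mp2_kbps) / sizeof(mp2_kbps[0]);

    for (size_t i = 1; i < n; i++)   /* skip the leading 0 entry */
        if (mp2_kbps[i] == kbps)
            return 1;
    return 0;
}
```

For example, 64 is listed while 6 is not, which matches the failure described above.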

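With a little logging added, we can watch the encoded file being written.
The frame sequence number is frame->pts, which is made to increase from 0: printf("Send frame %3"PRId64"\n", frame->pts);
The packet pts comes back with an offset (it starts at -481), presumably because the encoder subtracts its start-up delay from the timestamps; it also increases steadily.
printf("Write packet %3"PRId64" (size=%5d)\n", pkt->pts, pkt->size);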

./encode_audio audio
Send frame   0
Write packet -481 (size= 208)
Send frame   1
Write packet -480 (size= 209)
Send frame   2
Write packet -479 (size= 209)
Send frame   3
Write packet -478 (size= 209)
Send frame   4
Write packet -477 (size= 209)
Send frame   5
Write packet -476 (size= 209)
Send frame   6
Write packet -475 (size= 209)
Send frame   7
Write packet -474 (size= 209)
Send frame   8
Write packet -473 (size= 209)
Send frame   9
Write packet -472 (size= 209)
Send frame 10
Write packet -471 (size= 209)
Send frame 11
Write packet -470 (size= 209)
Send frame 12
Write packet -469 (size= 209)
Send frame 13
Write packet -468 (size= 209)
Send frame 14

.....
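
The packet sizes in the log above (208 bytes once, then 209) are consistent with the usual frame-size formula for MPEG-1 Layer II, where each frame carries 1152 samples: 144 * bit_rate / sample_rate bytes, plus one byte when the padding bit is set. A minimal sketch, with the 64 kb/s and 44100 Hz values taken from the ffprobe output and layer2_frame_bytes as a hypothetical name:

```c
/* Bytes in one MPEG-1 Layer II frame (1152 samples):
 * 144 * bit_rate / sample_rate, plus one byte when the padding bit is set.
 * Hypothetical helper for illustration only. */
static int layer2_frame_bytes(int bit_rate, int sample_rate, int padding)
{
    return 144 * bit_rate / sample_rate + (padding ? 1 : 0);
}
```

At 64000 bit/s and 44100 Hz the exact value is about 208.98 bytes, so the encoder has to pad most frames to 209 bytes to hold the average rate.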

encode_video.c walkthrough
----------------------------------------
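Each demo program produces its own kind of output, which makes one curious to understand it.

The build process is the same three steps described above.
Again, start by finding the codec and allocating a codec_ctx:
    codec = avcodec_find_encoder_by_name(codec_name);
    c = avcodec_alloc_context3(codec);
Initialize the codec_ctx, then open it: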
    /* put sample parameters */
    c->bit_rate = 400000;
    /* resolution must be a multiple of two */
    c->width = 352;
    c->height = 288;
    /* frames per second */
    c->time_base = (AVRational){1, 25};
    c->framerate = (AVRational){25, 1};
    c->gop_size = 10; // emit one I-frame every 10 frames
    c->max_b_frames = 1;
    c->pix_fmt = AV_PIX_FMT_YUV420P;
    ret = avcodec_open2(c, codec, NULL); // opening the codec spawns several threads

Create a frame and initialize it by hand:

    frame = av_frame_alloc();
    frame->format = c->pix_fmt;
    frame->width = c->width;
    frame->height = c->height;
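// Getting the buffer initializes the 8 frame->data pointers and the 8 frame->linesize entries, although libx264 only uses 3 of them
    ret = av_frame_get_buffer(frame, 0); // allocates memory for the frame so data can be written to it
After that, a number of frames are filled, encoded into packets, and written to the file.
Each frame is built as described above and filled row by row:
Y values go into the Y plane, and Cb and Cr each go into their own plane, following the YUV 4:2:0 layout.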
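
The row-by-row fill can be sketched without libavutil by modelling just the three planes. In the sketch below, struct yuv420p is a hypothetical stand-in for the data and linesize fields of a frame, and the gradient formulas are arbitrary test-pattern values chosen for illustration. The point is the indexing: each row starts at y * linesize, and the chroma planes are half the width and half the height of the luma plane.

```c
#include <stdint.h>

/* Hypothetical stand-in for the three planes of a YUV420P picture. */
struct yuv420p {
    uint8_t *data[3];   /* Y, Cb, Cr */
    int linesize[3];    /* bytes per row, may exceed the visible width */
    int width, height;
};

/* Fill frame number i with a moving gradient, one row at a time. */
static void fill_yuv420p(struct yuv420p *f, int i)
{
    /* Y plane: full resolution */
    for (int y = 0; y < f->height; y++)
        for (int x = 0; x < f->width; x++)
            f->data[0][y * f->linesize[0] + x] = (uint8_t)(x + y + i * 3);

    /* Cb and Cr planes: half resolution in both directions */
    for (int y = 0; y < f->height / 2; y++)
        for (int x = 0; x < f->width / 2; x++) {
            f->data[1][y * f->linesize[1] + x] = (uint8_t)(128 + y + i * 2);
            f->data[2][y * f->linesize[2] + x] = (uint8_t)(64 + x + i * 5);
        }
}
```

Using linesize rather than width for the row stride matters because the allocator may pad each row for alignment.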

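Compression really is this simple: send_frame, receive_packet, then write the packet data straight out.

With libx264, more than 20 frames are actually sent before a single packet comes back (this may depend on the machine or system; for example, the packets received on ubuntu18 and ubuntu20 are not necessarily identical).

Only after that do frames go in and packets come out, and the relationship is not one-to-one.

When the encoder is flushed (frame is NULL), packets arrive in a continuous run. The packet pts values do not increase in order but are reordered, while dts does increase in order.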
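
The send/receive behaviour described here can be modelled with a toy encoder that holds frames back before emitting anything. This is only a sketch and not libavcodec: every toy_ name is hypothetical, TOY_EAGAIN and TOY_EOF stand in for AVERROR(EAGAIN) and AVERROR_EOF, and the fixed delay of 3 frames is an arbitrary assumption standing in for the encoder's internal buffering.

```c
#include <stddef.h>

#define TOY_EAGAIN   (-1) /* needs more input before it can output */
#define TOY_EOF      (-2) /* fully drained after a flush */
#define TOY_DELAY    3    /* frames held back before the first packet */
#define TOY_CAPACITY 64

/* Toy model only: a FIFO that withholds output until it holds more than
 * TOY_DELAY frames, or until it has been flushed. */
struct toy_encoder {
    long queue[TOY_CAPACITY];
    int  head, tail;   /* valid entries are queue[head .. tail-1] */
    int  flushing;
};

/* frame_pts == NULL plays the role of the NULL frame that starts a flush. */
static void toy_send_frame(struct toy_encoder *e, const long *frame_pts)
{
    if (!frame_pts)
        e->flushing = 1;
    else if (e->tail < TOY_CAPACITY)
        e->queue[e->tail++] = *frame_pts;
}

static int toy_receive_packet(struct toy_encoder *e, long *pkt_pts)
{
    int queued = e->tail - e->head;

    if (queued == 0)
        return e->flushing ? TOY_EOF : TOY_EAGAIN;
    if (!e->flushing && queued <= TOY_DELAY)
        return TOY_EAGAIN;
    *pkt_pts = e->queue[e->head++];
    return 0;
}

/* Send one frame (or NULL to flush), then drain whatever is ready.
 * Returns the number of packets received during this call. */
static int toy_encode(struct toy_encoder *e, const long *frame_pts)
{
    long pts;
    int n = 0;

    toy_send_frame(e, frame_pts);
    while (toy_receive_packet(e, &pts) == 0)
        n++;
    return n;
}
```

Sending frames 0 to 2 yields no packets, each later frame yields one, and the final call with NULL drains the three frames still queued, which mirrors the shape of the libx264 log.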

$ ./encode_video video libx264
[libx264 @ 0x5635564bd040] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2 AVX512
[libx264 @ 0x5635564bd040] profile High, level 1.3
Send frame pts: 0
Send frame pts: 1
Send frame pts: 2
Send frame pts: 3
Send frame pts: 4
Send frame pts: 5
Send frame pts: 6
Send frame pts: 7
Send frame pts: 8
Send frame pts: 9
Send frame pts: 10
Send frame pts: 11
Send frame pts: 12
Send frame pts: 13
Send frame pts: 14
Send frame pts: 15
Send frame pts: 16
Send frame pts: 17
Send frame pts: 18
Send frame pts: 19
Send frame pts: 20
Send frame pts: 21
Write packet pts: 0, dts: -1, (size= 2069)
Send frame pts: 22
Write packet pts: 2, dts: 0, (size= 672)
Send frame pts: 23
Write packet pts: 1, dts: 1, (size= 133)
Send frame pts: 24
Write packet pts: 4, dts: 2, (size= 766)
Write packet pts: 3, dts: 3, (size= 193)
Write packet pts: 6, dts: 4, (size= 681)
Write packet pts: 5, dts: 5, (size= 515)
Write packet pts: 8, dts: 6, (size= 737)
Write packet pts: 7, dts: 7, (size= 437)
Write packet pts: 9, dts: 8, (size= 506)
Write packet pts: 10, dts: 9, (size= 2240)
Write packet pts: 12, dts: 10, (size= 1015)
Write packet pts: 11, dts: 11, (size= 704)
Write packet pts: 14, dts: 12, (size= 912)
Write packet pts: 13, dts: 13, (size= 481)
Write packet pts: 16, dts: 14, (size= 936)
Write packet pts: 15, dts: 15, (size= 652)
Write packet pts: 18, dts: 16, (size= 1261)
Write packet pts: 17, dts: 17, (size= 583)
Write packet pts: 19, dts: 18, (size= 611)
Write packet pts: 20, dts: 19, (size= 2361)
Write packet pts: 22, dts: 20, (size= 1119)
Write packet pts: 21, dts: 21, (size= 758)
Write packet pts: 24, dts: 22, (size= 708)
Write packet pts: 23, dts: 23, (size= 691)
[libx264 @ 0x5635564bd040] frame I:3     Avg QP:25.32 size: 2223
[libx264 @ 0x5635564bd040] frame P:12    Avg QP:24.19 size:   827
[libx264 @ 0x5635564bd040] frame B:10    Avg QP:28.24 size:   515
[libx264 @ 0x5635564bd040] consecutive B-frames: 20.0% 80.0%
[libx264 @ 0x5635564bd040] mb I I16..4: 78.4% 11.4% 10.2%
[libx264 @ 0x5635564bd040] mb P I16..4: 76.7% 0.9% 0.2% P16..4: 20.9% 0.7% 0.4% 0.0% 0.0%    skip: 0.3%
[libx264 @ 0x5635564bd040] mb B I16..4: 0.0% 0.0% 0.0% B16..8: 12.0% 0.3% 0.0% direct:11.1% skip:76.7% L0:24.9% L1:38.3% BI:36.8%
[libx264 @ 0x5635564bd040] final ratefactor: 15.78
[libx264 @ 0x5635564bd040] 8x8 transform intra:3.7% inter:19.2%
[libx264 @ 0x5635564bd040] direct mvs spatial:0.0% temporal:100.0%
[libx264 @ 0x5635564bd040] coded y,uvDC,uvAC intra: 4.8% 31.9% 4.2% inter: 1.3% 35.8% 8.8%
[libx264 @ 0x5635564bd040] i16 v,h,dc,p: 0% 0% 0% 100%
[libx264 @ 0x5635564bd040] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 3% 32% 16% 48% 0% 0% 0% 0% 0%
[libx264 @ 0x5635564bd040] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 10% 7% 14% 53% 1% 5% 2% 7% 1%
[libx264 @ 0x5635564bd040] i8c dc,h,v,p: 1% 6% 4% 88%
[libx264 @ 0x5635564bd040] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 @ 0x5635564bd040] ref P L0: 97.3% 0.7% 1.5% 0.3% 0.1% 0.1%
[libx264 @ 0x5635564bd040] ref B L0: 63.5% 31.8% 4.7%
[libx264 @ 0x5635564bd040] kb/s:173.93
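
The pts/dts pattern in the log above follows from B-frames. With max_b_frames = 1, a B-frame refers to the P-frame that comes after it in display order, so that P-frame has to be written first: packets come out as pts 0, 2, 1, 4, 3 while dts runs -1, 0, 1, 2, 3. Two properties can be checked mechanically: dts increases by one per packet, and dts never exceeds pts, since a picture must be decoded before it is shown. A small sketch, with struct ts and timestamps_consistent as hypothetical names:

```c
#include <stddef.h>

struct ts { long pts, dts; };

/* Returns 1 when dts increases by exactly 1 per packet and never
 * exceeds pts; returns 0 otherwise. Hypothetical helper. */
static int timestamps_consistent(const struct ts *p, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (p[i].dts > p[i].pts)
            return 0;
        if (i > 0 && p[i].dts != p[i - 1].dts + 1)
            return 0;
    }
    return 1;
}
```

Feeding it the first packets from the log, {0,-1}, {2,0}, {1,1}, {4,2}, {3,3}, {6,4}, passes both checks.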
