----------------------------------------
author:hjjdebug
date:2022-0129
----------------------------------------
encode_audio.c walkthrough
----------------------------------------
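We want to build an audio bitstream.
The process has three steps:
1. Set up the encoding codec: choose the channel layout, sample_fmt, sample rate, and so on.
2. Allocate a frame and fill it with audio data.
3. Encode the frame and write the result to the output file.
Key point: this example introduces the concepts of frame, packet, and codec. They are worth working through until they make sense.
With them you can produce compressed audio or video files.
Below is what ffprobe reports for the finished file (ffprobe audio):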
Input #0, mp3, from 'audio':
Duration: 00:00:05.22, start: 0.000000, bitrate: 64 kb/s
Stream #0:0: Audio: mp2, 44100 Hz, stereo, fltp, 64 kb/s
Start by finding the encoder and allocating a context:
codec = avcodec_find_encoder(AV_CODEC_ID_MP2);
c = avcodec_alloc_context3(codec);
A series of fields is assigned on the context, and then the context is opened.
c->sample_fmt = AV_SAMPLE_FMT_S16;
c->sample_rate = select_sample_rate(codec);
c->channel_layout = select_channel_layout(codec);
c->channels = av_get_channel_layout_nb_channels(c->channel_layout);
check_sample_fmt(codec, c->sample_fmt) shows that a codec can support more than one sample_fmt.
select_sample_rate shows that a codec can support several sample rates.
select_channel_layout shows that a codec can support several channel layouts.
if (avcodec_open2(c, codec, NULL) < 0)
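Next, allocate one packet and one frame.
Assign the frame fields:
frame->nb_samples = c->frame_size; // number of samples per channel
frame->format = c->sample_fmt; // sample format
frame->channel_layout = c->channel_layout; // channel layout
ret = av_frame_get_buffer(frame, 0); // align 0 means: choose the alignment automatically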
samples = (uint16_t*)frame->data[0];
Then fill samples with data.
That completes the manual setup of the frame. The frame is then encoded as follows.
encode()
ret = avcodec_send_frame(ctx, frame);
ret = avcodec_receive_packet(ctx, pkt);
Writing the file: output is just a plain FILE pointer.
fwrite(pkt->data, 1, pkt->size, output);
So the encoded audio stream appears to be nothing more than many pkt payloads written one after another.
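One question: what role does the audio bit rate parameter actually play?
Once the sample rate, sample format, and channel layout are fixed, the uncompressed bit rate is fixed as well;
dividing it by the compression ratio gives the compressed bit rate.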
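The bit rate cannot be set arbitrarily; it depends on what the codec supports. For mp2, when I set 6K the encoder reported it as unsupported.
I checked the code, and this is the table of values it can choose from (in kb/s):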
// {0, 32, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 384 },
When bit_rate is not set, it defaults to 384K. It cannot be lower than 32K; otherwise the codec cannot be opened and avcodec_open2() returns an error.
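With a little logging added, we can watch how the encoded file gets written.
The frame sequence number is frame->pts, which increments from 0: printf("Send frame %3"PRId64"\n", frame->pts);
The packet pts comes back with an offset (it starts at -481), but it increments in the same way. The offset most likely reflects the encoder delay (initial padding) that the mp2 encoder subtracts from the input pts.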
printf("Write packet %3"PRId64" (size=%5d)\n", pkt->pts, pkt->size);
./encode_audio audio
Send frame 0
Write packet -481 (size= 208)
Send frame 1
Write packet -480 (size= 209)
Send frame 2
Write packet -479 (size= 209)
Send frame 3
Write packet -478 (size= 209)
Send frame 4
Write packet -477 (size= 209)
Send frame 5
Write packet -476 (size= 209)
Send frame 6
Write packet -475 (size= 209)
Send frame 7
Write packet -474 (size= 209)
Send frame 8
Write packet -473 (size= 209)
Send frame 9
Write packet -472 (size= 209)
Send frame 10
Write packet -471 (size= 209)
Send frame 11
Write packet -470 (size= 209)
Send frame 12
Write packet -469 (size= 209)
Send frame 13
Write packet -468 (size= 209)
Send frame 14
.....
encode_video.c walkthrough
----------------------------------------
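Each demo program produces different output, which makes one want to understand it.
The build process is the same three steps described above.
Again, start by finding the codec and allocating the codec_ctx: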
codec = avcodec_find_encoder_by_name(codec_name);
c = avcodec_alloc_context3(codec);
Initialize the codec_ctx and open it:
/* put sample parameters */
c->bit_rate = 400000;
/* resolution must be a multiple of two */
c->width = 352;
c->height = 288;
/* frames per second */
c->time_base = (AVRational){1, 25};
c->framerate = (AVRational){25, 1};
c->gop_size = 10; // emit one I-frame every 10 frames
c->max_b_frames = 1;
c->pix_fmt = AV_PIX_FMT_YUV420P;
ret = avcodec_open2(c, codec, NULL); // opening spawns several threads
Allocate a frame and initialize it by hand:
frame = av_frame_alloc();
frame->format = c->pix_fmt;
frame->width = c->width;
frame->height = c->height;
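// Getting the buffer initializes the 8 frame->data pointers and the 8 frame->linesize values, although libx264 only uses 3 of them
ret = av_frame_get_buffer(frame, 0); // allocates memory for the frame; data can now be written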
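After that, a number of frames are filled, encoded into packets, and written to the file.
Each frame is built as described above and filled row by row:
Y goes into the Y plane, and Cb and Cr each go into their own plane, following the YUV 4:2:0 layout.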
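Compression really is this simple: send_frame, receive_packet, then write the packet data directly.
In practice libx264 accepts more than 20 frames before it returns a single packet (this may depend on the machine or system; for example, the packets received on ubuntu18 and ubuntu20 are not necessarily identical).
Only after that do sent frames and received packets start to alternate, and the relationship is not one to one.
When the encoder is flushed (frame is NULL), packets arrive back to back. The packet pts values do not increase in order but are reordered, while the dts values do increase in order.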
$ ./encode_video video libx264
[libx264 @ 0x5635564bd040] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2 AVX512
[libx264 @ 0x5635564bd040] profile High, level 1.3
Send frame pts: 0
Send frame pts: 1
Send frame pts: 2
Send frame pts: 3
Send frame pts: 4
Send frame pts: 5
Send frame pts: 6
Send frame pts: 7
Send frame pts: 8
Send frame pts: 9
Send frame pts: 10
Send frame pts: 11
Send frame pts: 12
Send frame pts: 13
Send frame pts: 14
Send frame pts: 15
Send frame pts: 16
Send frame pts: 17
Send frame pts: 18
Send frame pts: 19
Send frame pts: 20
Send frame pts: 21
Write packet pts: 0, dts: -1, (size= 2069)
Send frame pts: 22
Write packet pts: 2, dts: 0, (size= 672)
Send frame pts: 23
Write packet pts: 1, dts: 1, (size= 133)
Send frame pts: 24
Write packet pts: 4, dts: 2, (size= 766)
Write packet pts: 3, dts: 3, (size= 193)
Write packet pts: 6, dts: 4, (size= 681)
Write packet pts: 5, dts: 5, (size= 515)
Write packet pts: 8, dts: 6, (size= 737)
Write packet pts: 7, dts: 7, (size= 437)
Write packet pts: 9, dts: 8, (size= 506)
Write packet pts: 10, dts: 9, (size= 2240)
Write packet pts: 12, dts: 10, (size= 1015)
Write packet pts: 11, dts: 11, (size= 704)
Write packet pts: 14, dts: 12, (size= 912)
Write packet pts: 13, dts: 13, (size= 481)
Write packet pts: 16, dts: 14, (size= 936)
Write packet pts: 15, dts: 15, (size= 652)
Write packet pts: 18, dts: 16, (size= 1261)
Write packet pts: 17, dts: 17, (size= 583)
Write packet pts: 19, dts: 18, (size= 611)
Write packet pts: 20, dts: 19, (size= 2361)
Write packet pts: 22, dts: 20, (size= 1119)
Write packet pts: 21, dts: 21, (size= 758)
Write packet pts: 24, dts: 22, (size= 708)
Write packet pts: 23, dts: 23, (size= 691)
[libx264 @ 0x5635564bd040] frame I:3 Avg QP:25.32 size: 2223
[libx264 @ 0x5635564bd040] frame P:12 Avg QP:24.19 size: 827
[libx264 @ 0x5635564bd040] frame B:10 Avg QP:28.24 size: 515
[libx264 @ 0x5635564bd040] consecutive B-frames: 20.0% 80.0%
[libx264 @ 0x5635564bd040] mb I I16..4: 78.4% 11.4% 10.2%
[libx264 @ 0x5635564bd040] mb P I16..4: 76.7% 0.9% 0.2% P16..4: 20.9% 0.7% 0.4% 0.0% 0.0% skip: 0.3%
[libx264 @ 0x5635564bd040] mb B I16..4: 0.0% 0.0% 0.0% B16..8: 12.0% 0.3% 0.0% direct:11.1% skip:76.7% L0:24.9% L1:38.3% BI:36.8%
[libx264 @ 0x5635564bd040] final ratefactor: 15.78
[libx264 @ 0x5635564bd040] 8x8 transform intra:3.7% inter:19.2%
[libx264 @ 0x5635564bd040] direct mvs spatial:0.0% temporal:100.0%
[libx264 @ 0x5635564bd040] coded y,uvDC,uvAC intra: 4.8% 31.9% 4.2% inter: 1.3% 35.8% 8.8%
[libx264 @ 0x5635564bd040] i16 v,h,dc,p: 0% 0% 0% 100%
[libx264 @ 0x5635564bd040] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 3% 32% 16% 48% 0% 0% 0% 0% 0%
[libx264 @ 0x5635564bd040] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 10% 7% 14% 53% 1% 5% 2% 7% 1%
[libx264 @ 0x5635564bd040] i8c dc,h,v,p: 1% 6% 4% 88%
[libx264 @ 0x5635564bd040] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 @ 0x5635564bd040] ref P L0: 97.3% 0.7% 1.5% 0.3% 0.1% 0.1%
[libx264 @ 0x5635564bd040] ref B L0: 63.5% 31.8% 4.7%
[libx264 @ 0x5635564bd040] kb/s:173.93