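[FFMPEG] Adjusting Audio Volume

I. Preface

The audio and video files we use personally come from many different sources, so their volume varies from file to file. On internet music platforms such as NetEase Cloud Music and QQ Music, by contrast, almost every track plays at the same volume, probably because they apply volume normalization.

II. Main text

The unit commonly used to measure audio volume is the decibel (dB). The levels that ffmpeg reports are relative to digital full scale, so 0 dB is the maximum and quieter audio has negative values.

1. Checking the audio level in decibels

1.1. Command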

ffmpeg -i 11025.mp3 -filter_complex volumedetect -c:v copy -f null /dev/null

1.2. Command output

[Parsed_volumedetect_0 @ 0x55ef0a332740] n_samples: 5551838
[Parsed_volumedetect_0 @ 0x55ef0a332740] mean_volume: -17.5 dB
[Parsed_volumedetect_0 @ 0x55ef0a332740] max_volume: 0.0 dB
[Parsed_volumedetect_0 @ 0x55ef0a332740] histogram_0db: 92
[Parsed_volumedetect_0 @ 0x55ef0a332740] histogram_1db: 427
[Parsed_volumedetect_0 @ 0x55ef0a332740] histogram_2db: 1213
[Parsed_volumedetect_0 @ 0x55ef0a332740] histogram_3db: 3159
[Parsed_volumedetect_0 @ 0x55ef0a332740] histogram_4db: 7153

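1.3. Interpreting the result

The peak level (max_volume) is 0.0 dB and the mean level (mean_volume) is -17.5 dB. A max_volume of 0.0 dB means the loudest samples already reach full scale, so this file has no headroom left for a boost.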
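To use these two values in a script, the volumedetect lines can be parsed and turned into a headroom figure. The following is a minimal Python sketch, not part of ffmpeg: the helper names `parse_volumedetect` and `headroom_db` are hypothetical, and it assumes the log text has already been captured (ffmpeg writes it to stderr).

```python
import re

# Matches lines such as "[Parsed_volumedetect_0 @ 0x...] mean_volume: -17.5 dB".
_LEVEL_RE = re.compile(r"(mean_volume|max_volume):\s*(-?\d+(?:\.\d+)?)\s*dB")


def parse_volumedetect(log_text):
    """Return the mean_volume and max_volume values found in the log text."""
    return {name: float(value) for name, value in _LEVEL_RE.findall(log_text)}


def headroom_db(levels):
    """Gain in dB that can be added before the peak reaches 0 dB (full scale)."""
    return 0.0 - levels["max_volume"]


# Lines copied from the command output shown earlier.
sample_log = """\
[Parsed_volumedetect_0 @ 0x55ef0a332740] n_samples: 5551838
[Parsed_volumedetect_0 @ 0x55ef0a332740] mean_volume: -17.5 dB
[Parsed_volumedetect_0 @ 0x55ef0a332740] max_volume: 0.0 dB
"""

levels = parse_volumedetect(sample_log)
print(levels)
print(headroom_db(levels))
```

A file whose max_volume is, say, -3.2 dB would report 3.2 dB of headroom, which is the largest boost that keeps the peaks at or below full scale.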

2. Adjusting the volume

2.1. Scaling by a multiple of the current volume

<1> Halve the current volume:

ffmpeg  -i input.mp3 -filter:a "volume=0.5" output.mp3

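<2> Double the current volume. This is a fairly blunt approach and can distort the audio, because boosted peaks that exceed full scale are clipped: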

ffmpeg  -i input.mp3 -filter:a "volume=2" output.mp3 
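The distortion comes from clipping: once a scaled sample exceeds the range of the sample format, it is clamped to the limit and the waveform is flattened. A small Python sketch of the idea, assuming signed 16-bit integer samples (the function name `apply_gain_s16` is made up for illustration and is not an ffmpeg API):

```python
INT16_MIN, INT16_MAX = -32768, 32767


def apply_gain_s16(samples, factor):
    """Scale 16-bit samples by `factor`, clamping to the int16 range.

    Returns the scaled samples and how many of them were clipped.
    """
    scaled = []
    clipped = 0
    for sample in samples:
        value = int(round(sample * factor))
        if value > INT16_MAX:
            value, clipped = INT16_MAX, clipped + 1
        elif value < INT16_MIN:
            value, clipped = INT16_MIN, clipped + 1
        scaled.append(value)
    return scaled, clipped


quiet = [1000, -2000, 3000]      # doubling stays inside the int16 range
loud = [1000, -20000, 30000]     # doubling pushes two samples past the limits
print(apply_gain_s16(quiet, 2))
print(apply_gain_s16(loud, 2))
```

The quiet samples double cleanly, while two of the loud samples are pinned at the int16 limits, which is what is heard as distortion.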

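2.2. Adjusting by a decibel value

The multiplier-based approach above can distort the audio, whereas adjusting by a decibel value makes it easier to apply a small, controlled change that keeps the original character of the sound. Both forms drive the same volume filter (a factor of 2 is roughly +6 dB), so a large decibel boost can still clip when the peaks are already near 0 dB.

<1> Raise the volume by 5 decibels (dB):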

ffmpeg  -i input.mp3 -filter:a "volume=5dB" output.mp3 

<2> Lower the volume by 5 decibels (dB):

ffmpeg  -i input.mp3 -filter:a "volume=-5dB" output.mp3 
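The multiplier form and the decibel form of the volume option are related by the usual amplitude formula, dB = 20 * log10(factor). The Python sketch below converts between the two; the function names are hypothetical helpers for illustration, not ffmpeg APIs.

```python
import math


def factor_to_db(factor):
    """Convert a linear amplitude factor (e.g. 2.0) to decibels."""
    return 20.0 * math.log10(factor)


def db_to_factor(db):
    """Convert a gain in decibels (e.g. 5.0) to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


for factor in (0.5, 2.0):
    print(f"volume={factor} is about {factor_to_db(factor):+.2f} dB")
for db in (5.0, -5.0):
    print(f"volume={db:+.0f}dB is a factor of about {db_to_factor(db):.3f}")
```

Doubling the volume is therefore about +6.02 dB, and the +5 dB adjustment corresponds to a factor of about 1.778, slightly gentler than doubling.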

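3. Volume normalization

ffmpeg can normalize loudness, that is, tame the peaks and lift the quiet parts so that the volume is even across the whole file. In the command below, the loudnorm filter's i option sets the target integrated loudness (here -14 LUFS) and tp sets the maximum true peak (here 0.0 dBTP).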

ffmpeg -i input.mp3 -filter:a "loudnorm=i=-14:tp=0.0" output.mp3 
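When this command is driven from a script, it helps to validate the targets before launching ffmpeg. The sketch below only builds the argument list and does not run anything; `build_loudnorm_command` is a hypothetical helper, and the ranges it checks (integrated loudness from -70 to -5, true peak from -9 to 0) are what I understand the loudnorm documentation to specify, so verify them against your ffmpeg version.

```python
def build_loudnorm_command(src, dst, integrated=-14.0, true_peak=0.0):
    """Build the ffmpeg argument list for a single-pass loudnorm run."""
    # Range limits are assumptions based on the loudnorm filter documentation.
    if not -70.0 <= integrated <= -5.0:
        raise ValueError("integrated loudness target must be within [-70, -5] LUFS")
    if not -9.0 <= true_peak <= 0.0:
        raise ValueError("true peak limit must be within [-9, 0] dBTP")
    audio_filter = f"loudnorm=i={integrated}:tp={true_peak}"
    return ["ffmpeg", "-i", src, "-filter:a", audio_filter, dst]


cmd = build_loudnorm_command("input.mp3", "output.mp3")
print(" ".join(cmd))
```

The returned list can be handed to `subprocess.run(cmd, check=True)`; passing a list rather than a single string avoids shell quoting problems with file names.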

4. Notes on problems encountered when calling the FFmpeg API from code

4.1. The error "Changing audio frame properties on the fly is not supported"

[in @ 0x7f06f00b0b40] Changing audio frame properties on the fly is not supported.
[in @ 0x7f06f00b0b40] filter context - fmt: s32p r: 8000 layout: 3 ch: 2, incoming frame - fmt: s32p r: 48000 layout: 3 ch: 2 pts_time: NOPTS

This error appears when the audio parameters change during transcoding (sample_rate, sample_fmt, channel_layout, channels, and so on); in my case the sample rate was being changed from 8 kHz to 48 kHz. The second log line shows that the filter context was configured with "fmt: s32p r: 8000 layout: 3 ch: 2", while the frames actually arriving have "fmt: s32p r: 48000 layout: 3 ch: 2 pts_time: NOPTS". In other words, the sample rate (r) configured for the filter context is wrong.
The FFmpeg example transcoding.c sets up the filter context as follows:

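snprintf(args, sizeof(args),
        "time_base=%d/%d:sample_rate=%d:sample_fmt=%s:channel_layout=0x%"PRIx64,
        dec_ctx->time_base.num, dec_ctx->time_base.den, dec_ctx->sample_rate,
        av_get_sample_fmt_name(dec_ctx->sample_fmt),
        dec_ctx->channel_layout);

What does this mean? The example configures the filter context with the decoder's parameters. Since the frames reaching the filter in my case carry the 48 kHz rate, changing dec_ctx->sample_rate to enc_ctx->sample_rate resolves the error; the other parameters are handled in the same way.
[Figure: filter information]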
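The diagnosis can also be reasoned about outside of C. The following Python sketch mirrors the argument string from the transcoding.c snippet above and compares the configured parameters with those of an incoming frame. `AudioParams`, `abuffer_args`, and `mismatched_fields` are hypothetical names, and using 1/sample_rate as the time base is an assumption made for the example.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class AudioParams:
    sample_fmt: str       # e.g. "s32p"
    sample_rate: int      # e.g. 48000
    channel_layout: int   # bit mask, 0x3 is stereo
    channels: int


def abuffer_args(p):
    """Format abuffer source arguments (time base assumed to be 1/sample_rate)."""
    return (f"time_base=1/{p.sample_rate}:sample_rate={p.sample_rate}:"
            f"sample_fmt={p.sample_fmt}:channel_layout=0x{p.channel_layout:x}")


def mismatched_fields(filter_ctx, frame):
    """Names of the parameters that differ between filter context and frame."""
    return [f.name for f in fields(AudioParams)
            if getattr(filter_ctx, f.name) != getattr(frame, f.name)]


# Values taken from the log: the filter was set up for 8000 Hz, the frame is 48000 Hz.
configured = AudioParams("s32p", 8000, 3, 2)
incoming = AudioParams("s32p", 48000, 3, 2)
print(mismatched_fields(configured, incoming))
print(abuffer_args(incoming))
```

Only sample_rate differs for the logged values, and building the argument string from the parameters of the frames that will actually be fed in removes the mismatch.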

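4.2. The error "more samples than frame size (avcodec_encode_audio2)"

I have run into this problem twice: <1> when transcoding MP3 audio to AAC; <2> when using the loudnorm filter. The message means that the frame handed to the encoder holds more samples than the encoder's frame size allows.

4.2.1. Error when transcoding MP3 to AAC

See the FFmpeg example transcode_aac.c. It should be easy to extract from it the code that adds a FIFO to handle this, so I will not go into detail here.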
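The idea behind the FIFO is simple: decoded frames are appended to a queue, and the encoder is only ever handed exactly frame_size samples at a time. Below is a conceptual Python sketch of that buffering, not the transcode_aac.c code itself. The generator name `rechunk` is hypothetical, and the sizes 1152 (a typical MP3 frame) and 1024 (a typical AAC frame) are assumptions used for the demonstration.

```python
from collections import deque


def rechunk(frames, frame_size):
    """Yield chunks of exactly `frame_size` samples from frames of any size.

    Whatever is left when the input ends is yielded as a final, shorter chunk.
    """
    fifo = deque()
    for frame in frames:
        fifo.extend(frame)                      # write decoded samples to the FIFO
        while len(fifo) >= frame_size:          # enough buffered for the encoder
            yield [fifo.popleft() for _ in range(frame_size)]
    if fifo:                                    # flush the remainder at end of stream
        yield list(fifo)


# Three decoder frames of 1152 samples each, numbered so the order can be checked.
decoded = [list(range(i * 1152, (i + 1) * 1152)) for i in range(3)]
chunks = list(rechunk(decoded, 1024))
print([len(chunk) for chunk in chunks])
```

Every chunk except the last has the size the encoder expects, and no samples are dropped or reordered along the way.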

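4.2.2. When using the loudnorm filter

This can be solved by calling av_buffersink_set_frame_size to reset the frame size. ffmpeg.c makes this call on its output buffer sink when the encoder does not support variable frame sizes; after consulting that source, I wrote the following code: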

	/* ms_debug and ms_strncmps_neq are project-local helpers, not FFmpeg functions (definitions not shown). */
	AVFilterGraph *graph = (*filter_ctx)[i].graph;
	if (graph && AVMEDIA_TYPE_AUDIO == (*filter_ctx)[i].codec_type) {
		ms_debug("nb_filters:%d", graph->nb_filters);
		int filters_index = 0;
		for (filters_index = 0; filters_index < graph->nb_filters; filters_index++) {
			AVFilterContext *filters = graph->filters[filters_index];
			ms_debug("filters:%s", filters->name);
			if (ms_strncmps_neq(filters->name, "in") && ms_strncmps_neq(filters->name, "out")) {
				/* Only encoders without AV_CODEC_CAP_VARIABLE_FRAME_SIZE need a fixed frame size. */
				if (!(enc_ctx->codec->capabilities & AV_CODEC_CAP_VARIABLE_FRAME_SIZE)) {
					av_buffersink_set_frame_size(filters, enc_ctx->frame_size);
				}
			}
		}
	}
