Audio Resampling with FFmpeg

破浪征程

Last modified 2023-09-22 12:28:28

First published 2023-05-20 12:19:35

Article link: https://blog.csdn.net/yunxiaobaobei/article/details/130779928


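This article describes how to use the SwrContext structure from the ffmpeg libraries to resample audio: setting the input and output parameters, initializing the context, creating the input and output buffers, reading PCM data, resampling, and playing the result with SDL2. The key code shows resampling from stereo to mono, and the full source is provided.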


Prerequisites

1. ffmpeg 4.4

2. SDL2

3. A piece of raw PCM audio data

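Resampling workflow

1. Set the input and output audio parameters.

2. Initialize the SwrContext from those parameters.

3. Create an input buffer: allocate space according to the input audio parameters (sample rate, channel count, sample bit depth), fill it with default data, and use it to hold the input audio.

4. Create an output buffer: allocate space according to the output audio parameters (sample rate, channel count, sample bit depth), fill it with default data, and use it to hold the resampled data.

5. Read the PCM data; each read is the size of the input buffer.

6. Resample with swr_convert.

7. Copy the output buffer into the SDL2 audio callback buffer for playback, or write it straight to a file and test it with ffplay. It can also be wrapped in a frame and sent to an audio encoder (such as AAC) to be encoded and saved.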
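As a rough illustration of steps 3 and 4, the size of a packed (interleaved) PCM buffer is the number of samples per channel times the channel count times the bytes per sample. The sketch below does only this arithmetic in plain C++, without FFmpeg. The helper name `pcm_buffer_bytes` and the constants are made up for this example, and the result ignores any alignment padding that the FFmpeg allocation functions may add.

```cpp
#include <cstddef>

// Hypothetical helper (not an FFmpeg API): bytes needed for one packed PCM frame.
// nb_samples       - samples per channel in the frame
// channels         - number of interleaved channels
// bytes_per_sample - 4 for 32-bit float (FLT), 2 for signed 16-bit (S16)
constexpr std::size_t pcm_buffer_bytes(std::size_t nb_samples,
                                       std::size_t channels,
                                       std::size_t bytes_per_sample)
{
    return nb_samples * channels * bytes_per_sample;
}

// Input frame used in this article: 1024 samples, stereo, 32-bit float.
constexpr std::size_t kInputBytes = pcm_buffer_bytes(1024, 2, 4);

// Output frame used in this article: 1024 samples, mono, signed 16-bit.
constexpr std::size_t kOutputBytes = pcm_buffer_bytes(1024, 1, 2);
```

With these parameters the input frame is four times the size of the output frame, which is why the two buffers are allocated separately from their own parameters.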

Key code

Setting the resampling parameters and initializing the SwrContext structure


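struct SwrContext* swr_ctx;

swr_ctx = swr_alloc_set_opts(nullptr,
		AV_CH_LAYOUT_MONO, // output channel layout
		AV_SAMPLE_FMT_S16, // output sample format
		44100, // output sample rate
		AV_CH_LAYOUT_STEREO,  // input channel layout
		AV_SAMPLE_FMT_FLT,  // input sample format
		44100, // input sample rate
		0, nullptr); 

	// swr_init() returns a negative error code on failure
	if (!swr_ctx || swr_init(swr_ctx) < 0) {
		fprintf(stderr, "Could not initialize the resampling context\n");
	}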

Creating the input/output buffers

	// Input data buffer
	uint8_t** pcm_buffer;
	int src_linesize;
	int src_nb_channels = av_get_channel_layout_nb_channels(AV_CH_LAYOUT_STEREO);
	int ret = av_samples_alloc_array_and_samples(&pcm_buffer, &src_linesize, src_nb_channels, frame_nb_samples, AV_SAMPLE_FMT_FLT, 0);
	if (ret < 0) {
		fprintf(stderr, "Could not allocate source samples\n");	
	}

	// Output data buffer
	uint8_t** out_buffer;
	int dst_linesize;
	int dst_nb_channels = av_get_channel_layout_nb_channels(AV_CH_LAYOUT_MONO);
	ret = av_samples_alloc_array_and_samples(&out_buffer, &dst_linesize, dst_nb_channels, frame_nb_samples, AV_SAMPLE_FMT_S16, 0);
	if (ret < 0) {
		fprintf(stderr, "Could not allocate destination samples\n");
	}

Reading the file and resampling

readcount = fread((char *)pcm_buffer[0], 1, src_linesize, fp);

data_count += readcount;
printf("   Now Playing %10d KBytes data.  %d \n", data_count / 1024, readcount);
		
swr_convert(swr_ctx, out_buffer, frame_nb_samples, (const uint8_t**)pcm_buffer, frame_nb_samples);
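To make the conversion more concrete, the sketch below shows roughly what a stereo float to mono S16 conversion has to do for each sample pair: mix the two channels, scale to the 16-bit range, and clip. It is a simplified illustration in plain C++, not libswresample's actual algorithm. The equal-weight mix, the rounding rule, and the function name `stereo_flt_to_mono_s16` are all assumptions made for this example.

```cpp
#include <cstdint>

// Hypothetical helper: convert one stereo float sample pair (nominal range
// [-1.0, 1.0]) into a single signed 16-bit mono sample.
// Assumptions: equal-weight downmix, round to nearest, hard clipping.
constexpr std::int16_t stereo_flt_to_mono_s16(float left, float right)
{
    const float mixed = (left + right) * 0.5f;   // equal-weight downmix
    const float scaled = mixed * 32768.0f;       // map [-1, 1] onto the int16 range
    if (scaled >= 32767.0f) return 32767;        // clip positive overflow
    if (scaled <= -32768.0f) return -32768;      // clip negative overflow
    // Round to nearest by adding +/-0.5 before the truncating cast.
    return static_cast<std::int16_t>(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}
```

In the real program this work is done inside swr_convert for a whole frame at a time; the sketch is only meant to show why the output needs one 2-byte value for every 8 bytes of input.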

Full source

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "SDL.h"

extern "C"
{
	#include <libavutil/opt.h>
	#include <libavutil/channel_layout.h>
	#include <libavutil/samplefmt.h>
	#include <libswresample/swresample.h>
}

static  Uint32  audio_len;
static  Uint8* audio_pos;
int frame_nb_samples = 1024; // samples per channel in one frame
struct SwrContext* swr_ctx;


void  fill_audio_pcm(void* udata, Uint8* stream, int len) 
{
	SDL_memset(stream, 0, len);

	if (audio_len == 0)
		return;
	len = (len > audio_len ? audio_len : len);

	SDL_MixAudio(stream, audio_pos, len, SDL_MIX_MAXVOLUME);
	audio_pos += len;
	audio_len -= len;
}

int main(int argc, char* argv[])
{
	// Combine the subsystem flags with bitwise OR; logical || would collapse them to 1
	if (SDL_Init(SDL_INIT_AUDIO | SDL_INIT_TIMER))
	{
		printf("SDL init error\n");
		return -1;
	}

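	swr_ctx = swr_alloc_set_opts(nullptr,
		AV_CH_LAYOUT_MONO, // output channel layout
		AV_SAMPLE_FMT_S16, // output sample format
		44100, // output sample rate
		AV_CH_LAYOUT_STEREO,  // input channel layout
		AV_SAMPLE_FMT_FLT,  // input sample format
		44100, // input sample rate
		0, nullptr); 

	// swr_init() returns a negative error code on failure
	if (!swr_ctx || swr_init(swr_ctx) < 0) {
		fprintf(stderr, "Could not initialize the resampling context\n");
		return -1;
	}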

	//SDL_AudioSpec: must describe the resampled data (44100 Hz, mono, signed 16-bit), not the input file
	SDL_AudioSpec wanted_spec;
	wanted_spec.freq = 44100;
	wanted_spec.format = AUDIO_S16SYS;
	wanted_spec.channels = 1;
	wanted_spec.silence = 0;
	wanted_spec.samples = frame_nb_samples;
	wanted_spec.callback = fill_audio_pcm;
	wanted_spec.userdata = nullptr;

	if (SDL_OpenAudio(&wanted_spec, NULL) < 0) {
		printf("can't open audio.\n");
		return -1;
	}
	//Play
	SDL_PauseAudio(0);
	
	// Input data buffer
	uint8_t** pcm_buffer;
	int src_linesize;
	int src_nb_channels = av_get_channel_layout_nb_channels(AV_CH_LAYOUT_STEREO);
	int ret = av_samples_alloc_array_and_samples(&pcm_buffer, &src_linesize, src_nb_channels, frame_nb_samples, AV_SAMPLE_FMT_FLT, 0);
	if (ret < 0) {
		fprintf(stderr, "Could not allocate source samples\n");	
		return -1;
	}

	// Output data buffer
	uint8_t** out_buffer;
	int dst_linesize;
	int dst_nb_channels = av_get_channel_layout_nb_channels(AV_CH_LAYOUT_MONO);
	ret = av_samples_alloc_array_and_samples(&out_buffer, &dst_linesize, dst_nb_channels, frame_nb_samples, AV_SAMPLE_FMT_S16, 0);
	if (ret < 0) {
		fprintf(stderr, "Could not allocate destination samples\n");
		return -1;
	}

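	FILE* fp = nullptr;
	fopen_s(&fp, "D:/工程/音视频分析/source/f32le.pcm", "rb");
	if (fp == NULL) {
		printf("cannot open this file\n");
		return -1;
	}
	int readcount = -1;
	int data_count = 0;
	// Bytes occupied by one sample across all input channels
	const int src_frame_bytes = src_nb_channels * av_get_bytes_per_sample(AV_SAMPLE_FMT_FLT);
	// Loop on the fread() result rather than feof(), so an empty final read is never processed
	while ((readcount = (int)fread((char *)pcm_buffer[0], 1, src_linesize, fp)) > 0)
	{
		data_count += readcount;
		printf("   Now Playing %10d KBytes data.  %d \n", data_count / 1024, readcount);

		// The last read may be short, so convert only the samples actually read
		int in_samples = readcount / src_frame_bytes;
		int out_samples = swr_convert(swr_ctx, out_buffer, frame_nb_samples, (const uint8_t**)pcm_buffer, in_samples);
		if (out_samples < 0)
			break;

		//Set audio buffer (PCM data); set the position before the length the callback checks
		audio_pos = (Uint8*)out_buffer[0];
		audio_len = out_samples * dst_nb_channels * av_get_bytes_per_sample(AV_SAMPLE_FMT_S16);

		// Wait until the callback has consumed this frame
		while (audio_len > 0)
			SDL_Delay(1);
		
	}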

	if (pcm_buffer)
		av_freep(&pcm_buffer[0]);
	av_freep(&pcm_buffer);

	if (out_buffer)
		av_freep(&out_buffer[0]);
	av_freep(&out_buffer);

	swr_free(&swr_ctx);

	fclose(fp);
	SDL_Quit();

	return 0;
}

Revised version

Performing the resampling requires the number of output samples, which has to be computed. Two functions are involved:

int64_t delay = swr_get_delay(swrContext , avFrame->sample_rate);

int64_t av_rescale_rnd(int64_t a, int64_t b, int64_t c, enum AVRounding rnd) av_const;

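swr_get_delay

Gets the delay between feeding in the next input sample and getting the corresponding output sample, i.e. the data buffered inside the resampler, expressed either as a playback duration or as a number of samples (one or the other, depending on base).

Parameters

① struct SwrContext *s: pointer to the audio resampling context;

② int64_t base: the unit of the returned delay. Pass 1 or 1000 to get the delay in seconds or milliseconds; pass a sample rate to get the delay as a number of samples at that rate.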
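The effect of the base argument can be pictured as a unit conversion: a backlog of buffered input samples is re-expressed in units of 1/base seconds. The sketch below shows that arithmetic only. The helper `backlog_in_base` is hypothetical and does not call FFmpeg, and the truncating division is an assumption, so the real swr_get_delay may round differently.

```cpp
#include <cstdint>

// Hypothetical helper (not an FFmpeg API): express a backlog of buffered input
// samples in units of 1/base seconds.
//   base = 1000         -> milliseconds
//   base = input rate   -> input samples
//   base = output rate  -> output samples
constexpr std::int64_t backlog_in_base(std::int64_t buffered_in_samples,
                                       std::int64_t in_rate,
                                       std::int64_t base)
{
    return buffered_in_samples * base / in_rate;   // integer division, rounds toward zero
}
```

For example, 441 buffered samples at 44,100 Hz correspond to 10 ms with base 1000 and to 480 samples with base 48000, while base 1 truncates the 0.01 s backlog to 0. That loss of precision is one reason to pass a sample rate as the base when the result feeds a sample-count calculation.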

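How it works

During conversion, FFmpeg may not consume a whole frame of data in one call. For example, if 20 samples are fed in, usually all 20 are processed, but sometimes only 19 are, and the remaining one stays in the internal buffer. If this backlog grows too large, it causes significant audio delay and can even exhaust memory. So each time audio is processed, the samples left over from the previous call should be included in the current one to keep the delay from building up.

Calling swr_get_delay() returns the number of audio samples currently buffered, or the equivalent playback delay.

What swr_get_delay() reports is how long the next input sample A must wait before it is played; that delay equals the playback time of the buffered data. Draining a little of the backlog on every call therefore keeps the audio delay low.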

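av_rescale_rnd

This function takes the input sample count, the output sample rate, and the input sample rate, and performs the calculation described in this section without overflowing. The formula is a * b / c.

Parameters

① int64_t a: the number of input audio samples;

② int64_t b: the output audio sample rate;

③ int64_t c: the input audio sample rate;

④ enum AVRounding rnd: how the fractional result is converted to an integer, e.g. round to nearest, round up, or round down.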

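How it works

Resampling requires an output sample count. What is known is the input sample count, the output sample rate, and the input sample rate, from which the output sample count has to be computed:

$$\text{playback duration} = \frac{\text{input sample count}}{\text{input sample rate}}$$

$$\text{output sample count} = \text{playback duration} \times \text{output sample rate}$$

$$\text{output sample count} = \frac{\text{input sample count}}{\text{input sample rate}} \times \text{output sample rate}$$

The intermediate values in this calculation can get large, because a sample rate is multiplied by a sample count. For example, 100,000 samples at 44,100 Hz gives a product of 4,410,000,000, which exceeds the range of a 32-bit signed integer (maximum 2,147,483,647). To avoid this overflow, FFmpeg provides the dedicated function av_rescale_rnd() for the calculation.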
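The arithmetic can be checked with a small stand-alone sketch. It uses a 64-bit intermediate product and rounds up, which corresponds to a * b / c with upward rounding for non-negative inputs. The helper `rescale_round_up` is a made-up name and is not how FFmpeg implements av_rescale_rnd, which additionally guards against the 64-bit product itself overflowing.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not FFmpeg's implementation): compute ceil(a * b / c)
// for non-negative a, b and positive c, using a 64-bit intermediate product.
constexpr std::int64_t rescale_round_up(std::int64_t a, std::int64_t b, std::int64_t c)
{
    return (a * b + c - 1) / c;
}

// The product from the text: 100,000 samples at 44,100 Hz.
constexpr std::int64_t kProduct = std::int64_t{100000} * 44100;

// True because the product is larger than the biggest signed 32-bit value.
constexpr bool kOverflowsInt32 =
    kProduct > std::numeric_limits<std::int32_t>::max();
```

As a usage example under the same assumptions, converting a 1024-sample frame from 44,100 Hz to 48,000 Hz gives `rescale_round_up(1024, 48000, 44100)`, which is 1115 output samples; with equal input and output rates, as in this article, the count stays 1024.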

Conclusion

dst_nb_samples = av_rescale_rnd(swr_get_delay(swr_ctx, src_rate) + src_nb_samples, dst_rate, src_rate, AV_ROUND_UP) ;
if (dst_nb_samples > max_dst_nb_samples)
{
	av_freep(&out_buffer[0]);
	ret = av_samples_alloc(out_buffer, &dst_linesize, dst_nb_channels, dst_nb_samples, dst_sample_fmt, 1);

	if (ret < 0)
		break;

	max_dst_nb_samples = dst_nb_samples;
}

// Resample; the return value is the number of samples written per channel
int res = swr_convert(swr_ctx, out_buffer, dst_nb_samples, (const uint8_t**)pcm_buffer, src_nb_samples);
if (res < 0)
	break;

// Size in bytes of the samples actually produced
int dst_bufsize = av_samples_get_buffer_size(&dst_linesize, dst_nb_channels, res, dst_sample_fmt, 1);

//Set audio buffer (PCM data); set the position before the length the callback checks
audio_pos = (Uint8*)out_buffer[0];
audio_len = dst_bufsize;
