Decoding audio with ffmpeg: tutorial03 (a personal analysis)

allen_young_yang


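The audio decoding part of the tutorial uses the SDL_AudioSpec struct, which the SDL header declares and documents as follows: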

    /**
     * When filling in the desired audio spec structure,
     * - 'desired->freq' should be the desired audio frequency in samples-per-second.
     * - 'desired->format' should be the desired audio format.
     * - 'desired->samples' is the desired size of the audio buffer, in samples.
     *     This number should be a power of two, and may be adjusted by the audio
     *     driver to a value more suitable for the hardware. Good values seem to
     *     range between 512 and 8096 inclusive, depending on the application and
     *     CPU speed. Smaller values yield faster response time, but can lead
     *     to underflow if the application is doing heavy processing and cannot
     *     fill the audio buffer in time. A stereo sample consists of both right
     *     and left channels in LR ordering.
     *     Note that the number of samples is directly related to time by the
     *     following formula: ms = (samples*1000)/freq
     * - 'desired->size' is the size in bytes of the audio buffer, and is
     *     calculated by SDL_OpenAudio().
     * - 'desired->silence' is the value used to set the buffer to silence,
     *     and is calculated by SDL_OpenAudio().
     * - 'desired->callback' should be set to a function that will be called
     *     when the audio device is ready for more data. It is passed a pointer
     *     to the audio buffer, and the length in bytes of the audio buffer.
     *     This function usually runs in a separate thread, and so you should
     *     protect data structures that it accesses by calling SDL_LockAudio()
     *     and SDL_UnlockAudio() in your code.
     * - 'desired->userdata' is passed as the first parameter to your callback
     *     function.
     * @note The calculated values in this structure are calculated by SDL_OpenAudio()
     */
    typedef struct SDL_AudioSpec {
        int freq;        /**< DSP frequency -- samples per second */
        Uint16 format;   /**< Audio data format */
        Uint8 channels;  /**< Number of channels: 1 mono, 2 stereo */
        Uint8 silence;   /**< Audio buffer silence value (calculated) */
        Uint16 samples;  /**< Audio buffer size in samples (power of 2) */
        Uint16 padding;  /**< Necessary for some compile environments */
        Uint32 size;     /**< Audio buffer size in bytes (calculated) */
        /**
         * This function is called when the audio device needs more data.
         * @param[out] stream A pointer to the audio data buffer
         * @param[in] len The length of the audio buffer in bytes.
         * Once the callback returns, the buffer will no longer be valid.
         * Stereo samples are stored in a LRLRLR ordering.
         */
        void (SDLCALL *callback)(void *userdata, Uint8 *stream, int len);
        void *userdata;
    } SDL_AudioSpec;

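callback is a user-defined function that supplies the audio; SDL calls it whenever the audio device needs more data. userdata is passed to callback as its first argument. It is assigned when the SDL_AudioSpec variable is initialized, and in this tutorial it is the AVCodecContext of the audio stream.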

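The comment states that size is the size of the audio buffer in bytes and that it is calculated when SDL_OpenAudio() is called. Tracing the program shows that size is equal to len, the third parameter of audio_callback.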
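As a rough illustration of where a size of 4096 can come from, the sketch below multiplies samples by channels by bytes per sample, and also evaluates the ms = (samples*1000)/freq formula from the header comment. The helper names audio_buffer_bytes and audio_buffer_ms are hypothetical, and the inputs used in the note after the block (1024 samples, 2 channels, 16-bit samples, 44100 Hz) are assumptions; the article does not state which values were requested.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bytes in one audio buffer =
 * samples per channel * channels * bytes per sample. */
uint32_t audio_buffer_bytes(uint16_t samples, uint8_t channels,
                            uint8_t bytes_per_sample)
{
    return (uint32_t)samples * channels * bytes_per_sample;
}

/* Hypothetical helper: buffer duration in whole milliseconds,
 * following the header comment: ms = (samples * 1000) / freq. */
uint32_t audio_buffer_ms(uint16_t samples, int freq)
{
    assert(freq > 0);
    return ((uint32_t)samples * 1000u) / (uint32_t)freq;
}
```

Under those assumed inputs, audio_buffer_bytes(1024, 2, 2) gives 4096, which matches the len seen in the trace, and audio_buffer_ms(1024, 44100) gives 23, so each callback would cover about 23 ms of sound.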

To follow what happens, add printf calls to the program that print the key variables:

    void audio_callback(void *userdata, Uint8 *stream, int len) {
        AVCodecContext *aCodecCtx = (AVCodecContext *)userdata;
        int len1, audio_size;

        /* These persist between calls, so leftover decoded data is not lost. */
        static uint8_t audio_buf[(AVCODEC_MAX_AUDIO_FRAME_SIZE * 3) / 2];
        static unsigned int audio_buf_size = 0;
        static unsigned int audio_buf_index = 0;

        printf("audio callback 1 len=%d\n", len);

        while (len > 0) {
            if (audio_buf_index >= audio_buf_size) {
                /* We have already sent all our data; get more */
                audio_size = audio_decode_frame(aCodecCtx, audio_buf, sizeof(audio_buf));
                if (audio_size < 0) {
                    /* If error, output silence */
                    audio_buf_size = 1024; // arbitrary?
                    memset(audio_buf, 0, audio_buf_size);
                } else {
                    audio_buf_size = audio_size;
                }
                audio_buf_index = 0;
                printf("audio callback 2 (audio_buf_size,audio_buf_index) = (%u,%u)\n",
                       audio_buf_size, audio_buf_index);
            }
            /* Copy as much as is available, but no more than SDL asked for. */
            len1 = audio_buf_size - audio_buf_index;
            if (len1 > len)
                len1 = len;
            memcpy(stream, (uint8_t *)audio_buf + audio_buf_index, len1);
            len -= len1;
            stream += len1;
            audio_buf_index += len1;
            printf("audio callback 3 (len1,len,audio_buf_index) = (%d,%d,%u)\n",
                   len1, len, audio_buf_index);
        }
    }

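len is the size of the audio buffer in bytes, that is, the amount of decoded audio data the callback has to supply on this call.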

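According to material found online: SDL calls audio_callback over and over; each call decodes data and appends it to stream, and once SDL considers that stream holds enough data to play, it plays it. The third parameter, len, is the number of bytes audio_callback is allowed to write to stream, in other words the size of the write buffer handed to audio_callback for that call.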

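Suppose len=4096 and the decoded block in audio_buf is 4608 bytes. A single call to audio_callback cannot write all of audio_buf to stream, so the block is split across two calls. The first call writes the first 4096 bytes of audio_buf to stream. On the second call a fresh 4096-byte write buffer is provided, and the remaining 512 bytes are written to it first. That leaves 3584 bytes of the write buffer unfilled; because len is still greater than 0, the loop decodes the next frame within the same call and fills those 3584 bytes from it.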
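The bookkeeping in this example can be replayed without any audio at all. The following is a minimal sketch that moves only the counters, assuming every decode yields exactly frame_size bytes; DrainState, CopyStep and simulate_callback are hypothetical names introduced for illustration and are not part of SDL or ffmpeg.

```c
#include <assert.h>
#include <stddef.h>

/* State kept between callbacks, like the static variables in audio_callback. */
typedef struct {
    unsigned int buf_size;   /* bytes in the current decoded frame */
    unsigned int buf_index;  /* bytes of that frame already copied out */
} DrainState;

/* One copy step, i.e. the values one "audio callback 3" line reports. */
typedef struct {
    int len1;                /* bytes copied in this step */
    int len;                 /* bytes still requested after this step */
    unsigned int buf_index;  /* read position in the frame after this step */
} CopyStep;

/*
 * Simulate one callback that must supply `len` bytes when every decode
 * yields `frame_size` bytes. Returns the number of copy steps stored in
 * `steps` (never more than max_steps).
 */
size_t simulate_callback(DrainState *st, int len, unsigned int frame_size,
                         CopyStep *steps, size_t max_steps)
{
    size_t n = 0;
    assert(frame_size > 0);
    while (len > 0 && n < max_steps) {
        if (st->buf_index >= st->buf_size) {
            /* Stand-in for audio_decode_frame(): a fresh frame arrives. */
            st->buf_size = frame_size;
            st->buf_index = 0;
        }
        int len1 = (int)(st->buf_size - st->buf_index);
        if (len1 > len)
            len1 = len;
        len -= len1;
        st->buf_index += (unsigned int)len1;
        steps[n].len1 = len1;
        steps[n].len = len;
        steps[n].buf_index = st->buf_index;
        n++;
    }
    return n;
}
```

Starting from a zeroed DrainState, a first call with len=4096 and frame_size=4608 produces one step, (4096, 0, 4096). A second call produces two steps, (512, 3584, 4608) and then (3584, 0, 3584), which is the split described in the example.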

Tracing the program confirms this:

    [NULL @ 010B3A60]Invalid and inefficient vfw-avi packed B frames detected
    video stream
    Compiler did not align stack variables. Libavcodec has been miscompiled
    and may be very slow or crash. This is not a bug in libavcodec,
    but in the compiler. You may try recompiling using gcc >= 4.2.
    Do not report crashes to FFmpeg developers.
    [mpeg4 @ 010B3A60]Invalid and inefficient vfw-avi packed B frames detected
    audio callback 1 len=4096
    audio callback 2 (audio_buf_size, audio_buf_index) = (4608,0)
    audio callback 3 (len1, len, audio_buf_index) = (4096,0,4096)
    audio callback 1 len=4096
    audio callback 3 (len1, len, audio_buf_index) = (512,3584,4608)
    audio callback 2 (audio_buf_size, audio_buf_index) = (4608,0)
    audio callback 3 (len1, len, audio_buf_index) = (3584,0,3584)
    audio callback 1 len=4096
    audio callback 3 (len1, len, audio_buf_index) = (1024,3072,4608)
    audio callback 2 (audio_buf_size, audio_buf_index) = (4608,0)
    audio callback 3 (len1, len, audio_buf_index) = (3072,0,3072)
    audio callback 1 len=4096
    audio callback 3 (len1, len, audio_buf_index) = (1536,2560,4608)
    audio callback 2 (audio_buf_size, audio_buf_index) = (4608,0)
    audio callback 3 (len1, len, audio_buf_index) = (2560,0,2560)

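Here audio_buf_size is the amount of raw audio data produced by one decode, and it is 4608 bytes every time. stream is the audio output; its size is size, which equals len, namely 4096 bytes. The 4608 decoded bytes do not fit into stream in one go, so they have to be written in several pieces.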

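The "audio callback 2" lines above are printed right after each decode, and every decode yields 4608 bytes of audio. The first callback writes 4096 bytes (len1) to stream, leaving 4608-4096=512 bytes in audio_buf. The second callback first writes those 512 bytes, then decodes another 4608 bytes and writes 4096-512=3584 of them, leaving 4608-3584=1024 bytes, and so on.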
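The growing remainder (512, 1024, 1536, ...) follows directly from the two sizes, so it can be computed without stepping through the loop. The sketch below is one way to express it, assuming every callback requests exactly len bytes and every decode yields exactly frame bytes; leftover_after is a hypothetical helper, not something from the tutorial.

```c
#include <assert.h>

/*
 * Bytes of the current decoded frame still unread after n callbacks,
 * assuming each callback requests `len` bytes, each decode yields
 * `frame` bytes, and a frame is decoded only when the previous one
 * has been fully consumed.
 */
unsigned int leftover_after(unsigned int n, unsigned int len, unsigned int frame)
{
    assert(frame > 0);
    unsigned long long consumed = (unsigned long long)n * len;
    unsigned int used = (unsigned int)(consumed % frame);  /* read from last frame */
    return used == 0 ? 0 : frame - used;
}
```

With len=4096 and frame=4608 this gives 512, 1024, 1536 and 2048 after the first four callbacks, in line with the trace. Under the same assumptions the remainder would reach 4096 after the eighth callback, so the ninth callback would need no decode at all: 9 × 4096 = 8 × 4608 = 36864 bytes.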

With this analysis, the way audio_callback decodes audio and writes it into the output buffer should now be clear.
