WAV Audio File Format Analysis

I. Audio file

/usr/share/sounds/deepin/stereo/desktop-login.wav

II. File information

syli@syli-PC:~/work/repo/Demo/pa$ soxi desktop-login.wav 

Input File     : 'desktop-login.wav'
Channels       : 2
Sample Rate    : 44100
Precision      : 16-bit
Duration       : 00:00:07.00 = 308700 samples = 525 CDDA sectors
File Size      : 1.23M
Bit Rate       : 1.41M
Sample Encoding: 16-bit Signed Integer PCM

syli@syli-PC:~/work/repo/Demo/pa$ ls -al desktop-login.wav 
-rw-r--r-- 1 root root 1234878 Jun 14 14:53 desktop-login.wav

syli@syli-PC:~/work/repo/Demo/pa$ file desktop-login.wav 
desktop-login.wav: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, stereo 44100 Hz

III. Analysis and verification

1. Audio duration

duration = samples / (sample rate)
7 s = 308700 samples / 44100

2. File size

1 sample frame = (bits per sample) * Channels / 8 (bits per byte)
1 sample frame = 16 (bit depth) * 2 / 8 = 4 bytes

size = (1 sample frame size) * samples
size = 4 * 308700 = 1,234,800 bytes

Total file size = 1,234,878 bytes

Non-audio (header and trailer) size = 1,234,878 - 1,234,800 = 78 bytes

3. Bit rate

Bit Rate = (Sample Rate) * (bits per sample frame)              (b/s)
         = (Sample Rate) * (bits per sample) * Channels         (b/s)
1.41M    = 44100 * 16 * 2 / 1000 / 1000 = 1.4112                (Mb/s)

4. Header data

View the hexadecimal data:

hexdump -C desktop-login.wav

16-bit stereo example:

syli@syli-PC:~/work/repo/Demo/pa$ head hex.txt 
00000000  52 49 46 46 b6 d7 12 00  57 41 56 45 66 6d 74 20  |RIFF....WAVEfmt |
00000010  10 00 00 00 01 00 02 00  44 ac 00 00 10 b1 02 00  |........D.......|
00000020  04 00 10 00 64 61 74 61  70 d7 12 00 03 00 09 00  |....datap.......|
00000030  01 00 05 00 06 00 08 00  05 00 08 00 01 00 02 00  |................|
00000040  06 00 07 00 05 00 05 00  02 00 03 00 05 00 07 00  |................|
00000050  03 00 04 00 03 00 04 00  05 00 06 00 01 00 02 00  |................|
Bytes | Sample value | Description
1-4 | "RIFF" | Marks the file as a RIFF file. Characters are each 1 byte long. Fixed value 0x52494646, identifying the RIFF format.
5-8 | File size (integer) | Size of the overall file minus 8 bytes, in bytes (32-bit integer). Typically, you'd fill this in after creation. This is the chunk size: the number of bytes from the next address to the end of the file, i.e. the total file size - 8. Everything from 0x08 to the end of the file is the content of the chunk whose ID is "RIFF", which contains the "fmt " and "data" sub-chunks. Here 0x0012d7b6 = 1,234,870 = total file size - 8.
9-12 | "WAVE" | File type header. For our purposes, it always equals "WAVE". This is the form type: the four letters that mark the WAV file format.
13-16 | "fmt " | Format chunk marker (0x666D7420). Note the trailing space (0x20), not a null byte.
17-20 | 16 | Length of the format data that follows (the fmt SubChunk Size), i.e. bytes 21-36.
21-22 | 1 | Type of format (Audio Format), 2-byte integer. 1 means uncompressed PCM.
23-24 | 2 | Number of channels, 2-byte integer. Here: 2.
25-28 | 44100 | Sample rate, 32-bit integer. Common values are 44100 (CD) and 48000 (DAT). Sample rate = number of samples per second, in Hz. Here 0xAC44 = 44100.
29-32 | 176400 | Byte rate: bytes of data per second = (Sample Rate * BitsPerSample * Channels) / 8. Here 0x0002B110 = 176400.
33-34 | 4 | Bytes needed per sample frame (block align) = (BitsPerSample * Channels) / 8.
35-36 | 16 | Bits per sample (bit depth of a single sample); commonly 8, 16, 24, or 32.
37-40 | "data" | "data" chunk header. Marks the beginning of the data section; 0x64 61 74 61 is the string "data".
41-44 | File size (data) | Size of the data section (SubChunk Size). Here 0x0012d770 = 1,234,800.

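If the fmt SubChunk Size equals 0x10 (16), the fmt chunk carries no extra information; when the "data" sub-chunk follows it directly, as in this file, the WAV header is 44 bytes long. If it equals 0x12 (18), extra information is present and the header is longer than 44 bytes.

When the WAV header contains extra information, the fmt SubChunk Size is 18 and the fmt chunk is followed by another sub-chunk that holds custom additional information; the "data" sub-chunk only comes after that.

  1. Check whether the fmt chunk size is 18.
  2. If it is 18, the fmt chunk data occupies 0x14-0x25, so the additional sub-chunk starts at 0x26: its ID is at 0x26-0x29 and its size at 0x2A-0x2D.
  3. Call the sub-chunk size obtained in step 2 N. Assuming the "data" sub-chunk follows directly, the PCM audio starts at 0x2E + N + 8.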

5. PCM data

pcm size = (bytes per sample frame) * samples
         = ((bits per sample) * Channels / 8) * samples
         = 16 * 2 / 8 * 308700
         = 1,234,800 bytes


6. End-of-file layout


77176 0012d770  ff ff 00 00 01 00 02 00  01 00 01 00 ff ff 00 00  |................|
77177 0012d780  00 00 01 00 00 00 ff ff  ff ff ff ff 02 00 02 00  |................|
77178 0012d790  01 00 ff ff ff ff 00 00  00 00 00 00 4c 49 53 54  |............LIST|
77179 0012d7a0  1a 00 00 00 49 4e 46 4f  49 53 46 54 0e 00 00 00  |....INFOISFT....|
77180 0012d7b0  4c 61 76 66 35 36 2e 34  30 2e 31 30 31 00        |Lavf56.40.101.|
77181 0012d7be

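Computing the file size:

77180 * 16 - 2 = 1,234,878 (77180 lines of 16 bytes in hex.txt, the last of which holds only 14 bytes; the final offset 0x0012d7be gives the same value)
PCM audio data size = 1,234,878 - 44 (header) - 34 (trailer) = 1,234,800

Lavf56.40.101: shows that this audio file was encoded with ffmpeg; lavf stands for libavformat, a component of ffmpeg, and the digits that follow are its version number.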

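IV. Basic audio concepts

PCM (Pulse Code Modulation): a method for digitally representing sampled analog signals. It is the standard form of digital audio in computers, compact discs, digital telephony, and other digital audio applications. In a PCM stream, the amplitude of the analog signal is sampled at uniform intervals, and each sample is quantized to the nearest value within a range of digital steps.

channel: the number of channels; common layouts are mono, stereo, and surround.

sample: one sampling. The sample bit count usually refers to the number of bits in one sample on a single channel (common values are 8/16/24/32 bits).

rate: the sample rate, i.e. the number of samplings per second, counted in frames (Hz).

frame: one frame is the sample bits of all channels for a single sampling, i.e. frame = channels * (sample bit).

Interleaved: a way of laying out audio data. In interleaved mode, data is stored as consecutive frames: the left-channel and right-channel samples of frame 1 are recorded first (assuming stereo), then frame 2 begins. In non-interleaved mode, the left-channel samples of all frames in a period are recorded first, followed by the right-channel samples, so data is stored as consecutive channels. Interleaved mode is used in most cases.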

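period: whenever the hardware buffer has room for period size frames, the hardware raises an interrupt to tell the ALSA driver to write more data to the hardware.

Period size: the number of frames handled per hardware interrupt. Reads and writes on an audio device are measured in frames.

buffer size: the size of the data buffer, made up of several periods. buffer size = period size * periods, where periods is the number of hardware interrupts needed to process one full buffer.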

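xrun: when a sound card period elapses, an interrupt is raised to tell the ALSA driver that data must be written or read. However, ALSA reads and writes only happen when the user calls writei or readi (snd_pcm_writei / snd_pcm_readi); the driver does not buffer data on its own. If the application does not call them in time, the result is an overrun (during capture, the buffer fills up before the data is read out) or an underrun (data is needed for playback but none has been written). The two are collectively called xrun.

softvol: Softvol is an Advanced Linux Sound Architecture (ALSA) plugin that adds software-based volume control to the ALSA mixer (alsamixer). This is useful when the sound card has no hardware volume control. The softvol plugin is built into ALSA and does not need to be installed separately. Another use case for soft volume is when the hardware volume control cannot amplify the sound beyond a certain threshold, which leaves some audio files too quiet. In that case a software amplifier can be created to raise the volume level at the cost of some quality.

UCM: the ALSA Use Case Manager describes how to set up the mixer for a particular use case (such as "play audio" or "phone call"). It also describes how to modify the mixer state to route audio to certain outputs and inputs, and how to control those devices.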

Frame calculation example:

Here is an alternative example for the above discussion.

Say we want to work with a stereo, 16-bit, 44.1 KHz stream, one-way (meaning, either in playback or in capture direction). Then we have:

'stereo' = number of channels: 2
1 analog sample is represented with 16 bits = 2 bytes
1 frame represents 1 analog sample from all channels; here we have 2 channels, and so:
1 frame = (num_channels) * (1 sample in bytes) = (2 channels) * (2 bytes (16 bits) per sample) = 4 bytes (32 bits)
To sustain 2x 44.1 KHz analog rate - the system must be capable of data transfer rate, in Bytes/sec:
Bps_rate = (num_channels) * (1 sample in bytes) * (analog_rate) = (1 frame) * (analog_rate) = ( 2 channels ) * (2 bytes/sample) * (44100 samples/sec) = 2*2*44100 = 176400 Bytes/sec


V. Minimal playback demo

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <alsa/asoundlib.h>
 
#define MESSAGE(format, ...)  printf("[%s][%s][%d]: " format "\n", __FILE__, __FUNCTION__, __LINE__, ##__VA_ARGS__)

static snd_output_t *log;
static unsigned buffer_time = 0;
static unsigned period_time = 0;
static int start_delay = 0;
static int stop_delay = 0;

void dump_hw_params(snd_pcm_t *handle, snd_pcm_hw_params_t *params, snd_output_t *log)
{
    fprintf(stderr, "Params of device \"%s\":\n",
                    snd_pcm_name(handle));
    fprintf(stderr, "--------------------\n");
    snd_pcm_hw_params_dump(params, log);
    fprintf(stderr, "--------------------\n");
}

snd_pcm_t* device_create(void)
{
    int ret = -1;                       // return value
    int n;
    const char *hw_name = "default";    // sound card device name
    int direction = 0;
    unsigned int channel = 2;
    unsigned int sample_rate = 44100;   // set_rate_near() expects unsigned int *
    snd_pcm_uframes_t chunk_size = 1024;
    snd_pcm_uframes_t buffer_size = 0;
    snd_pcm_t *handle;                  // PCM device handle
    snd_pcm_hw_params_t *hw_params = NULL;  // hardware info and PCM stream configuration
    snd_pcm_sw_params_t *swparams;
    snd_pcm_uframes_t start_threshold, stop_threshold;

    /* step 1: open the PCM device; 0 as the last argument means standard (blocking) mode */
    ret = snd_pcm_open(&handle, hw_name, SND_PCM_STREAM_PLAYBACK, 0);
    if (ret < 0) {
        /* ALSA returns negative error codes, so report them with snd_strerror() */
        fprintf(stderr, "snd_pcm_open: %s\n", snd_strerror(ret));
        return NULL;
    }
    MESSAGE();

    /* step 2: allocate the snd_pcm_hw_params_t structure */
    ret = snd_pcm_hw_params_malloc(&hw_params);
    if (ret < 0) {
        fprintf(stderr, "snd_pcm_hw_params_malloc: %s\n", snd_strerror(ret));
        goto failed;
    }
    MESSAGE();

    /* step 3: initialize hw_params with the full configuration space */
    ret = snd_pcm_hw_params_any(handle, hw_params);
    if (ret < 0) {
        fprintf(stderr, "snd_pcm_hw_params_any: %s\n", snd_strerror(ret));
        goto failed;
    }
    MESSAGE();

    /* step 4: set the access type (interleaved, for snd_pcm_readi/snd_pcm_writei) */
    ret = snd_pcm_hw_params_set_access(handle, hw_params, SND_PCM_ACCESS_RW_INTERLEAVED);
    if (ret < 0) {
        fprintf(stderr, "snd_pcm_hw_params_set_access: %s\n", snd_strerror(ret));
        goto failed;
    }
    MESSAGE();

    /* step 5: set the sample format to SND_PCM_FORMAT_S16_LE */
    ret = snd_pcm_hw_params_set_format(handle, hw_params, SND_PCM_FORMAT_S16_LE);
    if (ret < 0) {
        fprintf(stderr, "snd_pcm_hw_params_set_format: %s\n", snd_strerror(ret));
        goto failed;
    }
    MESSAGE();

    /* step 6: set the sample rate; if the hardware does not support it, the nearest rate is used */
    ret = snd_pcm_hw_params_set_rate_near(handle, hw_params, &sample_rate, &direction);
    if (ret < 0) {
        fprintf(stderr, "snd_pcm_hw_params_set_rate_near: %s\n", snd_strerror(ret));
        goto failed;
    }
    MESSAGE();

    /* step 7: set the number of channels */
    ret = snd_pcm_hw_params_set_channels(handle, hw_params, channel);
    if (ret < 0) {
        fprintf(stderr, "snd_pcm_hw_params_set_channels: %s\n", snd_strerror(ret));
        goto failed;
    }
    MESSAGE();

    /* get the maximum buffer time (microseconds) and cap it at 500 ms */
    ret = snd_pcm_hw_params_get_buffer_time_max(hw_params, &buffer_time, 0);
    if (ret < 0) {
        fprintf(stderr, "snd_pcm_hw_params_get_buffer_time_max: %s\n", snd_strerror(ret));
        goto failed;
    }
    MESSAGE("buffer_time:%u", buffer_time);
    if (buffer_time > 500000)
        buffer_time = 500000;

    /* calc period time: four periods per buffer */
    if (buffer_time > 0)
        period_time = buffer_time / 4;

    MESSAGE("period time:%u", period_time);
    /* set period time */
    if (period_time > 0)
        ret = snd_pcm_hw_params_set_period_time_near(handle, hw_params, &period_time, 0);
    if (ret < 0) {
        fprintf(stderr, "snd_pcm_hw_params_set_period_time_near: %s\n", snd_strerror(ret));
        goto failed;
    }
    MESSAGE("period time:%u", period_time);

    MESSAGE("buffer time:%u", buffer_time);
    /* set buffer time */
    if (buffer_time > 0)
        ret = snd_pcm_hw_params_set_buffer_time_near(handle, hw_params, &buffer_time, 0);
    if (ret < 0) {
        fprintf(stderr, "snd_pcm_hw_params_set_buffer_time_near: %s\n", snd_strerror(ret));
        goto failed;
    }
    MESSAGE("buffer time:%u", buffer_time);

    /* step 8: apply hw_params to the device */
    ret = snd_pcm_hw_params(handle, hw_params);
    if (ret < 0) {
        fprintf(stderr, "snd_pcm_hw_params: %s\n", snd_strerror(ret));
        goto failed;
    }
    MESSAGE();

    /* for debug info */
    dump_hw_params(handle, hw_params, log);

#if 0
    /* soft params */
    snd_pcm_hw_params_get_period_size(hw_params, &chunk_size, 0);
    snd_pcm_hw_params_get_buffer_size(hw_params, &buffer_size);

    snd_pcm_sw_params_alloca(&swparams);

    snd_pcm_sw_params_current(handle, swparams);

    n = chunk_size;
    ret = snd_pcm_sw_params_set_avail_min(handle, swparams, n);

    n = buffer_size;
    start_threshold = n + (double) sample_rate * start_delay / 1000000;
    if (start_threshold < 1)
            start_threshold = 1;
    if (start_threshold > n)
            start_threshold = n;

    ret = snd_pcm_sw_params_set_start_threshold(handle, swparams, start_threshold);
    
    stop_threshold = buffer_size + (double) sample_rate * stop_delay / 1000000;
    ret = snd_pcm_sw_params_set_stop_threshold(handle, swparams, stop_threshold);

    ret = snd_pcm_sw_params(handle, swparams);
    
    /* for debug info */
    snd_pcm_sw_params_dump(swparams, log); 
#endif

    /* the configuration has been applied, so the params structure can be released */
    snd_pcm_hw_params_free(hw_params);
    return handle;
failed:
    if (hw_params != NULL)
        snd_pcm_hw_params_free(hw_params);
    snd_pcm_close(handle);
    return NULL;
}

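void device_play(snd_pcm_t *pcm_handle, FILE *fp)
{
    const size_t frame_bytes = 4;       /* 2 channels * 16 bits / 8 */
    const size_t size = 5512;           /* bytes read per iteration (1378 frames) */
    char *buffer;

    buffer = (char *) malloc(size);
    if (buffer == NULL) {
        perror("malloc");
        return;
    }
    MESSAGE("size=%zu", size);

    /* Note: the file is streamed as-is, so the WAV header and trailer bytes
     * are written to the device along with the PCM data. */
    while (1)
    {
        size_t nread = fread(buffer, 1, size, fp);
        snd_pcm_uframes_t frames = nread / frame_bytes;  /* whole frames actually read */
        char *p = buffer;

        if (frames == 0)
        {
            fprintf(stderr, "end of file on input\n");
            break;
        }

        /* step 9: write the audio data to the PCM device */
        while (frames > 0)
        {
            /* keep the call result in its own variable; the original
             * `ret = snd_pcm_writei(...) < 0` stored the comparison instead */
            snd_pcm_sframes_t ret = snd_pcm_writei(pcm_handle, p, frames);
            if (ret == -EPIPE)
            {
                /* EPIPE means underrun */
                fprintf(stderr, "underrun occurred\n");
                /* re-prepare the device so it is ready again, then retry the write */
                snd_pcm_prepare(pcm_handle);
                MESSAGE();
            }
            else if (ret < 0)
            {
                fprintf(stderr, "error from writei: %s\n", snd_strerror((int) ret));
                free(buffer);
                return;
            }
            else
            {
                /* a short write is possible: skip the frames already consumed */
                p += (size_t) ret * frame_bytes;
                frames -= (snd_pcm_uframes_t) ret;
            }
        }
    }
    free(buffer);
    MESSAGE();
}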


void device_destroy(snd_pcm_t *pcm_handle)
{
    /* step 10: play out the remaining frames, then close the PCM device handle */
    snd_pcm_drain(pcm_handle);
    snd_pcm_close(pcm_handle);
    MESSAGE();
}

int main(int argc, char *argv[])
{
    FILE *fp;
    snd_pcm_t *handle;          // PCM device handle (declared in pcm.h)
 
    if (argc != 2) {
        printf("error: play [music name]\n");
        return -1;
    }

    fp = fopen(argv[1], "rb");
    if (fp == NULL) {
        perror("fopen");
        return -1;
    }
 
    snd_output_stdio_attach(&log, stderr, 0);

    /* device_create() returns NULL on failure, so only play on success */
    handle = device_create();
    if (handle != NULL) {
        device_play(handle, fp);
        device_destroy(handle);
    }

    snd_output_close(log);
    fclose(fp);

    return handle != NULL ? 0 : -1;
}

Appendix 1: Reference links

WAV header format description

https://docs.fileformat.com/audio/wav/

https://juejin.cn/post/6844904051964903431
