Linux Audio Driver: WAV File Format Analysis

Overview

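The WAV file format is a subset of Microsoft's RIFF specification, which is used to store multimedia files. A WAV (RIFF) file is made up of several chunks: the RIFF WAVE chunk, the Format chunk, the Fact chunk (optional), and the Data chunk. Each is described below.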

RIFF Chunk

Based on the RIFF format, the RIFF chunk can be abstracted into the following structure:
struct RIFF_CHUNK
{
    char ChunkID[4]; //'R','I','F','F'
    unsigned int ChunkSize;
    char Format[4];  //'W','A','V','E'
};
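ChunkSize is the size of the whole WAV file minus the 8 bytes taken up by ChunkID and ChunkSize themselves, i.e. wav_file_size = ChunkSize + 8. Like every multi-byte field in a WAV file, it is stored in little-endian byte order.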
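
As a rough illustration of the relationship above, the following C sketch checks the first 12 bytes of a file and derives the file size from ChunkSize. The helper names read_le32 and wav_file_size are made up for this example. The bytes are decoded one at a time so that the result does not depend on the host's endianness or on struct padding.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode a 32-bit little-endian value from four raw bytes. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Inspect the first 12 bytes of a file (hypothetical helper).
 * Returns the total file size (ChunkSize + 8) if the bytes form a
 * RIFF/WAVE header, or 0 otherwise.
 */
static uint64_t wav_file_size(const unsigned char *hdr)
{
    assert(hdr != NULL);
    if (memcmp(hdr, "RIFF", 4) != 0 || memcmp(hdr + 8, "WAVE", 4) != 0)
        return 0;
    return (uint64_t)read_le32(hdr + 4) + 8;
}
```

For instance, a header whose ChunkSize field holds 36 yields 44, which is the size of a WAV file that contains a header but no audio data.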

Format Chunk

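The Format chunk describes the format of the audio data. Its layout can be abstracted into the following structure:
struct FORMAT_CHUNK
{
    char FmtID[4]; //'f','m','t',' '
    unsigned int FmtSize;
    unsigned short FmtTag;
    unsigned short FmtChannels;
    unsigned int SampleRate;
    unsigned int ByteRate;
    unsigned short BlockAlign;
    unsigned short BitsPerSample;
};
.FmtSize:  Size in bytes of the part of the Format chunk that follows this field, usually 16 or 18. 16 means there are no extension bytes, which is the usual case for PCM audio.
.FmtTag:   Encoding format. 1 means PCM, which is the usual value when FmtSize is 16.
.FmtChannels:  Number of channels. 1 is mono, 2 is stereo.
.SampleRate:  Sampling rate in Hz. If this concept is unfamiliar, see the article "Linux Audio Driver: The Sound Capture Process" (Linux音频驱动-声音采集过程).
.ByteRate: Number of bytes needed per second of audio, equal to SampleRate * FmtChannels * BitsPerSample / 8.
.BlockAlign: Data block alignment unit, i.e. the size in bytes of one sample across all channels, equal to FmtChannels * BitsPerSample / 8.
.BitsPerSample:  Number of bits per sample (bit depth).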
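
The two derived fields can be computed from the other three. The sketch below shows one way to do so in C. The function names are illustrative, and it assumes uncompressed PCM with a bit depth that is a multiple of 8.

```c
#include <assert.h>
#include <stdint.h>

/* Block alignment: bytes per sample frame (one sample for every channel). */
static uint16_t wav_block_align(uint16_t channels, uint16_t bits_per_sample)
{
    /* Assumption of this sketch: whole bytes per sample. */
    assert(bits_per_sample % 8 == 0);
    return (uint16_t)(channels * (bits_per_sample / 8));
}

/* Byte rate: bytes consumed per second of audio. */
static uint32_t wav_byte_rate(uint32_t sample_rate, uint16_t channels,
                              uint16_t bits_per_sample)
{
    return sample_rate * wav_block_align(channels, bits_per_sample);
}
```

For 16-bit stereo audio at 44100 Hz this gives a block alignment of 4 bytes and a byte rate of 176400 bytes per second.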

Data Chunk

The Data chunk holds the raw sound data together with its size. Its header can be abstracted into the following structure:
struct DATA_CHUNK
{
    char DataID[4]; //'d','a','t','a'
    unsigned int DataSize;
};
DataSize is the size in bytes of the raw audio data that follows this header.
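
Because optional chunks such as the Fact chunk may sit before the Data chunk, its offset is not fixed. A minimal sketch of how a reader might locate a chunk in a WAV image held in memory is shown below. The name find_chunk and its return convention are assumptions of this example, and on platforms with a 32-bit size_t an oversized chunk size would need an extra overflow check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode a 32-bit little-endian value from four raw bytes. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Search for the chunk with the given four-character ID (hypothetical helper).
 * On success, returns the offset of the chunk's payload and stores the
 * payload size in *size. Returns 0 if the chunk is not found.
 */
static size_t find_chunk(const unsigned char *buf, size_t len,
                         const char *id, uint32_t *size)
{
    size_t pos = 12; /* skip "RIFF", ChunkSize and "WAVE" */

    assert(buf != NULL && id != NULL && size != NULL);
    while (pos + 8 <= len) {
        uint32_t chunk_size = read_le32(buf + pos + 4);

        if (memcmp(buf + pos, id, 4) == 0) {
            *size = chunk_size;
            return pos + 8;
        }
        /* RIFF chunks are word-aligned: an odd-sized payload is
         * followed by one pad byte. */
        pos += 8 + (size_t)chunk_size + (chunk_size & 1);
    }
    return 0;
}
```

Calling find_chunk(buf, len, "data", &size) on a file with a 16-byte Format chunk and no other chunks returns 44, the offset at which the raw samples start.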

Worked Example

1.  Download a WAV audio file and use mediainfo to display its details.
root@test:~$ mediainfo ~/Download/test.wav 
General
Complete name                            : /home/test/Download/test.wav
Format                                   : Wave
File size                                : 44.2 MiB
Duration                                 : 4mn 22s
Overall bit rate mode                    : Constant
Overall bit rate                         : 1 411 Kbps

Audio
ID                                       : 0
Format                                   : PCM
Format settings, Endianness              : Little
Codec ID                                 : 1
Duration                                 : 4mn 22s
Bit rate mode                            : Constant
Bit rate                                 : 1 411.2 Kbps
Channel(s)                               : 2 channels
Sampling rate                            : 44.1 KHz
Bit depth                                : 16 bits
Stream size                              : 44.2 MiB (100%)
2.  Open the file in vim in hexadecimal mode.
0000000: 5249 4646 741d c302 5741 5645 666d 7420  RIFFt...WAVEfmt 
0000010: 1000 0000 0100 0200 44ac 0000 10b1 0200  ........D.......
0000020: 0400 1000 6461 7461 501d c302 0100 0000  ....dataP.......
0000030: ffff 0000 0000 0000 0000 0100 0000 ffff  ................
0000040: 0000 0100 0000 ffff 0000 0000 ffff 0100  ................
0000050: 0200 ffff fdff 0100 0300 ffff ffff 0200  ................
0000060: 0000 feff 0100 0200 ffff feff 0100 0200  ................
0000070: ffff ffff 0100 0000 ffff 0100 0000 ffff  ................
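3.   Analyze the data above. All multi-byte fields are little-endian, so their bytes are read in reverse order.
" 52 49 46 46"     The ASCII characters "RIFF".
" 74 1d c3 02"     ChunkSize: 0x02c31d74 = 46341492. The whole WAV file is therefore 46341492 + 8 = 46341500 bytes, or about 44.2 MiB, which matches the file size reported by mediainfo.
" 57 41 56 45"     The ASCII characters "WAVE".
" 66 6d 74 20"     The ASCII characters "fmt " (the fourth byte, 0x20, is a space).
" 10 00 00 00"     FmtSize: 0x10 = 16, the size of the rest of the Format chunk. 16 is the usual value for PCM.
" 01 00"               FmtTag: 1, meaning PCM encoding.
" 02 00"               FmtChannels: 2 channels, matching the mediainfo output.
" 44 ac 00 00"     SampleRate: 0xac44 = 44100 Hz = 44.1 kHz.
" 10 b1 02 00"     ByteRate: 0x2b110 = 176400 bytes per second (176400 * 8 = 1411.2 kbps, the bit rate shown by mediainfo). This value gives the duration of the audio: 46341500 / 176400 = 262.7 s, i.e. 4 min 22 s.
" 04 00"               Block alignment unit: 4 bytes (2 channels * 16 bits / 8).
" 10 00"               BitsPerSample: 0x10 = 16.
" 64 61 74 61"     The ASCII characters "data".
" 50 1d c3 02"     DataSize: 0x02c31d50 = 46341456 bytes of raw audio data, which equals wav_size - 44 = 46341500 - 44, where 44 is the size of the header.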

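The manual decoding above can be cross-checked with a short C sketch. It assumes the canonical 44-byte layout seen in this file (a 16-byte Format chunk followed directly by the Data chunk). The struct wav_info and the function parse_canonical_header are names invented for this example, and the byte array is copied from the hex dump.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decoded fields of a canonical 44-byte WAV header (illustrative names). */
struct wav_info {
    uint32_t riff_size;       /* ChunkSize of the RIFF chunk */
    uint16_t format_tag;      /* 1 = PCM */
    uint16_t channels;
    uint32_t sample_rate;     /* Hz */
    uint32_t byte_rate;       /* bytes per second */
    uint16_t block_align;     /* bytes per sample frame */
    uint16_t bits_per_sample;
    uint32_t data_size;       /* bytes of raw audio */
};

static uint16_t read_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the header does not have the assumed layout. */
static int parse_canonical_header(const unsigned char *h, struct wav_info *out)
{
    assert(h != NULL && out != NULL);
    if (memcmp(h, "RIFF", 4) != 0 || memcmp(h + 8, "WAVE", 4) != 0 ||
        memcmp(h + 12, "fmt ", 4) != 0 || read_le32(h + 16) != 16 ||
        memcmp(h + 36, "data", 4) != 0)
        return -1;
    out->riff_size       = read_le32(h + 4);
    out->format_tag      = read_le16(h + 20);
    out->channels        = read_le16(h + 22);
    out->sample_rate     = read_le32(h + 24);
    out->byte_rate       = read_le32(h + 28);
    out->block_align     = read_le16(h + 32);
    out->bits_per_sample = read_le16(h + 34);
    out->data_size       = read_le32(h + 40);
    return 0;
}

/* First 44 bytes of test.wav, copied from the hex dump above. */
static const unsigned char test_wav_header[44] = {
    0x52, 0x49, 0x46, 0x46, 0x74, 0x1d, 0xc3, 0x02,
    0x57, 0x41, 0x56, 0x45, 0x66, 0x6d, 0x74, 0x20,
    0x10, 0x00, 0x00, 0x00, 0x01, 0x00, 0x02, 0x00,
    0x44, 0xac, 0x00, 0x00, 0x10, 0xb1, 0x02, 0x00,
    0x04, 0x00, 0x10, 0x00, 0x64, 0x61, 0x74, 0x61,
    0x50, 0x1d, 0xc3, 0x02
};
```

Fed these bytes, the sketch yields the same values as the manual analysis, and data_size / byte_rate comes to roughly 262 seconds, i.e. 4 min 22 s.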



