# WAV文件格式解析

http://www.codeguru.com/cpp/g-m/multimedia/audio/article.php/c8935/PCM-Audio-and-Wave-Files.htm#page-1

http://www.codeguru.com/dbfiles/get_file/WaveFun_Src.zip?id=8935&lbl=WAVEFUN_SRC_ZIP

WAVE文件支持很多不同的比特率、采样率、多声道音频。WAVE是PC机上存储PCM音频最流行的文件格式，基本上可以等同于原始数字音频。

WAVE文件为了与IFF保持一致，数据采用“chunk”来存储。因此，如果想要在WAVE文件中补充一些新的信息，只需要在在新chunk中添加信息，而不需要改变整个文件。这也是设计IFF最初的目的。

WAVE文件是很多不同的chunk集合，但是对于一个基本的WAVE文件而言，以下三种chunk是必不可少的。

1. 文件头

RIFF/WAV文件标识段

声音数据格式说明段

2. 数据体：

RIFF头chunk表示如下：

1.	struct RIFF_HEADER
2.	{
3.	   TCHAR szRiffID[4];        // 'R','I','F','F'
4.	      DWORD dwRiffSize;
5.
6.	   TCHAR szRiffFormat[4];    // 'W','A','V','E'
7.	};


1.	struct WAVE_FORMAT
2.	{
3.	   WORD wFormatTag;
4.	   WORD wChannels;
5.	   DWORD dwSamplesPerSec;
6.	   DWORD dwAvgBytesPerSec;
7.	   WORD wBlockAlign;
8.	   WORD wBitsPerSample;
9.	};
10.	struct FMT_BLOCK
11.	{
12.	   TCHAR szFmtID[4];    // 'f','m','t',' ' please note the
13.	                        // space character at the fourth location.
14.	      DWORD dwFmtSize;
15.	      WAVE_FORMAT wavFormat;
16.	};


1.	struct DATA_BLOCK
2.	{
3.	   TCHAR szDataID[4];    // 'd','a','t','a'
4.	   DWORD dwDataSize;
5.	};


1.	struct NOTE_CHUNK
2.	{
3.	   TCHAR ID[4];    // 'note'
4.	   long chunkSize;
5.	   long dwIdentifier;
6.	   TCHAR dwText[];
7.	};


The WAVE FileFormat

The WAVE File Format supports a variety of bitresolutions, sample rates, and channels of audio. I would say that this is themost popular format for storing PCM audio on the PC and has become synonymouswith the term "raw digital audio."

The WAVE file format is based on Microsoft's version of theElectronic Arts Interchange File Format method for storing data. In keepingwith the dictums of IFF,data in a Wave file is stored in many different "chunks."So, if a vendor wants to store additional information in a Wave file, he justadds info to new chunks instead of trying to tweak the base file format or comeup with his own proprietary file format. That is the primary goal of the IFF.

As mentioned earlier, a WAVE file is a collection of a numberof different types of chunks. But, there are threechunks that are required tobe present in a valid wave file:

1.   'RIFF', 'WAVE' chunk

2.   "fmt" chunk

3.   'data' chunk

All otherchunks are optional. The Riff wave chunk is the identifier chunkthat tells us that this is a wave file. The "fmt" chunk containsimportant parameters describing the waveform, such as its sample rate, bits per sample,and so forth. The Data chunk contains the actual waveform data.

An application that uses a WAVE file must be able to read the threerequired chunks,although it can ignore the optional chunks.But, all applications that perform a copy operation on wave files should copy all of the chunksin the WAVE.

The Riffchunk is always the first chunk. The fmt chunk should be present before thedata chunk. Apart from this, there are no restrictions upon the order of thechunks within a WAVE file.

Here is an example of the layout for a minimal WAVE file. Itconsists of a single WAVE containing the three required chunks.

While interpreting WAVE files, the unit of measurement usedis a "sample."Literally, it is what it says. A sample represents data captured during asingle sampling cycle. So, if you are sampling at 44 KHz, you will have 44 Ksamples. Each sample could be represented as 8 bits, 16 bits, 24 bits, or 32bits. (There is no restriction on how many bits you use for a sample exceptthat it has to be a multiple of 8.) To some extent, the more the number of bitsin a sample, the better the quality of the audio.

One annoying detail to note is that 8-bit samples arerepresented as "unsigned"values whereas 16-bit and higher are represented by "signed"values. I don't know why this discrepancy exists; that's just the way it is.

The data bits for each sample should be left-justified and padded with0s. For example, consider the case of a 10-bit sample (assamples must be multiples of 8, we need to represent it as 16 bits). The 10bits should be left-justified so that they become bits 6 to 15 inclusive, andbits 0 to 5 should be set to zero.

The analogy I have provided is for mono audio, meaning that you have just one"channel." When you deal with stereo audio, 3Daudio, and so forth, you are in effect dealing with multiplechannels, meaning you have multiple samples describing theaudio in any given moment in time. For example, for stereo audio, at any givenpoint in time you need to know what the audio signal was for the left channel as well as the right channel.So, you will have to read and write two samples at a time.

Say you sample at 44 KHz for stereoaudio; then effectively, you will have 44 K * 2 samples. If you are using 16bits per sample, then given the duration of audio, you can calculate the totalsize of the wave file as:

Size in bytes = sampling rate * numberof channels * (bits per sample / 8) * duration in seconds

When youare dealing with such multi-channelsounds, single sample points from each channel are interleaved. Instead of storing all of the samplepoints for the left channel first, and then storing all of the sample pointsfor the right channel next, you "interleave"the two channels' samples together. You would store the first sample of theleft channel. Then, you would store the first sample of the right channel, andso on.

When adevice needs to reproduce the stored stereo audio (or any multi-channel audio),it will process the left and right channels (or however many channels thereare) simultaneously. This collective piece of information is called a sample frame.

So far,you have covered the very basics of PCM audio and how it is represented in awave file. It is time to take a look at some code and see how you can use C++to manage wave files. Start by laying out the structures for the differentchunks of a wave file.

Thefirst chunk is the riff header chunk and can be represented as follows. You usea TCHAR that is defined as a normal ASCII char or as a wide character dependingupon whether the UNICODE directive has been set on your compiler.

2. {
3.    TCHAR szRiffID[4];       // 'R','I','F','F'
4.       DWORD dwRiffSize;
5.
6.    TCHAR szRiffFormat[4];   // 'W','A','V','E'
7. };

I guessit is self explanatory. The second chunk is the fmt chunk. It describes theproperties of the wave file, such as bits per sample, number of channels, andthe like. You can use a helper structure to neatly represent the chunk as:

1. struct WAVE_FORMAT
2. {
3.    WORD wFormatTag;
4.    WORD wChannels;
5.    DWORD dwSamplesPerSec;
6.    DWORD dwAvgBytesPerSec;
7.    WORD wBlockAlign;
8.    WORD wBitsPerSample;
9. };
10. struct FMT_BLOCK
11. {
12.    TCHAR szFmtID[4];   // 'f','m','t',' ' please note the
13.                        // space character at the fourth location.
14.       DWORD dwFmtSize;
15.       WAVE_FORMAT wavFormat;
16. };

1.
2. struct DATA_BLOCK
3. {
4.    TCHAR szDataID[4];   // 'd','a','t','a'
5.    DWORD dwDataSize;
6. };

That's it. That's all you need todescribe a wave form. Of course, there a lot of optional chunks that you canhave (they should be before the data block and after the fmt block). Just as anexample, here is an optional chunk that you could use:

1. struct NOTE_CHUNK
2. {
3.    TCHAR ID[4];   // 'note'
4.   long chunkSize;
5.   long dwIdentifier;
6.    TCHAR dwText[];
7. };

• 广告
• 抄袭
• 版权
• 政治
• 色情
• 无意义
• 其他

120