John McGowan's AVI Overview: Programming and Other Technical Topics (excerpt)



WAVE

The Microsoft Windows audio (sound) input/output system, commonly referred to as Wave or WAVE, predates Video for Windows, which is wrapped around WAVE in various ways. The audio tracks in AVI files are simply waveform audio (WAV) data used by the WAVE system. Video for Windows parses the AVI file, extracts the WAV data, and pipes it to the WAVE system. Video for Windows itself handles the video track, if present. Traditionally, audio input and output devices such as Sound Blaster cards have a WAVE audio input/output driver to play WAV (waveform audio) files. The simplest waveform audio files consist of a header followed by Pulse Code Modulation (PCM) sound data, usually uncompressed 8 or 16 bit sound samples. WAVE also provides a mechanism for audio codecs. See elsewhere in the AVI Overview for further information on audio codecs and audio compression.

RIFF Files

RIFF files are built from

(1) RIFF Form Header

    'RIFF' (4 byte file size) 'xxxx' (data)

    where 'xxxx' identifies the specialization (or form) of RIFF, 'AVI ' for AVI files,
    and where the data is the rest of the file.

    The data is comprised of chunks and lists. Chunks and lists are defined immediately below.

(2) A Chunk

    (4 byte identifier) (4 byte chunk size) (data)

    The 4 byte identifier is a human readable sequence of four characters such as 'JUNK' or 'idx1'.

(3) A List

    'LIST' (4 byte list size) (4 byte list identifier) (data)

    where the 4 byte identifier is a human readable sequence of four characters such as 'rec ' or 'movi',
    and where the data is comprised of LISTS or CHUNKS.
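The chunk layout described above can be sketched in C as a small header parser. This is only an illustrative sketch, not code from the AVI Overview: the names `RiffChunkHeader`, `riff_read_le32`, `riff_read_chunk_header`, and `riff_next_chunk_offset` are hypothetical. It relies on two RIFF conventions that the text above does not spell out: the 4 byte sizes are stored little-endian, and chunk data is padded to an even number of bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A parsed 8 byte RIFF chunk header (hypothetical helper type). */
typedef struct {
    char     id[5];  /* four-character identifier, NUL-terminated for printing */
    uint32_t size;   /* size of the chunk data in bytes, header excluded */
} RiffChunkHeader;

/* Read a 32 bit little-endian value from a byte buffer. */
uint32_t riff_read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parse the header at buf. Returns 0 on success, -1 if fewer than 8 bytes remain. */
int riff_read_chunk_header(const unsigned char *buf, size_t len, RiffChunkHeader *out)
{
    if (len < 8)
        return -1;
    memcpy(out->id, buf, 4);
    out->id[4] = '\0';
    out->size = riff_read_le32(buf + 4);
    return 0;
}

/* Distance from the start of this chunk to the start of the next one:
   8 header bytes, the data, and one pad byte when the data size is odd. */
uint64_t riff_next_chunk_offset(const RiffChunkHeader *h)
{
    return 8u + (uint64_t)h->size + (h->size & 1u);
}
```

A reader would typically call `riff_read_chunk_header` at the current position, act on the identifier, then advance by `riff_next_chunk_offset`. A 'LIST' is read the same way, except that the first 4 bytes of its data are the list identifier.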

AVI File Format

AVI is a specialization or "form" of RIFF, described below:

    'RIFF' (4 byte file length) 'AVI '    // file header (a RIFF form)

    'LIST' (4 byte list length) 'hdrl'    // list of headers for AVI file

The 'hdrl' list contains:

    'avih' (4 byte chunk size) (data)     // the AVI header (a chunk)

    'strl' lists of stream headers for each stream (audio, video, etc.) in the AVI file.

An AVI file can contain zero or one video stream and zero, one, or many audio streams. For an AVI file with one video and one audio stream:

    'LIST' (4 byte list length) 'strl'    // video stream list (a list)

The video 'strl' list contains:

    'strh' (4 byte chunk size) (data)     // video stream header (a chunk)
    'strf' (4 byte chunk size) (data)     // video stream format (a chunk)

    'LIST' (4 byte list length) 'strl'    // audio stream list (a list)

The audio 'strl' list contains:

    'strh' (4 byte chunk size) (data)     // audio stream header (a chunk)
    'strf' (4 byte chunk size) (data)     // audio stream format (a chunk)

    'JUNK' (4 byte chunk size) (data - usually all zeros)    // an OPTIONAL junk chunk to align on 2K byte boundary

    'LIST' (4 byte list length) 'movi'    // list of movie data (a list)

The 'movi' list contains the actual audio and video data. This 'movi' list contains one or more ...

    'LIST' (4 byte list length) 'rec '    // list of movie records (a list)
    '##wb' (4 byte chunk size) (data)     // sound data (a chunk)
    '##dc' (4 byte chunk size) (data)     // video data (a chunk)
    '##db' (4 byte chunk size) (data)     // video data (a chunk)

A 'rec ' list (a record) contains the audio and video data for a single frame:

    '##wb' (4 byte chunk size) (data)     // sound data (a chunk)
    '##dc' (4 byte chunk size) (data)     // video data (a chunk)
    '##db' (4 byte chunk size) (data)     // video data (a chunk)

The 'rec ' list need not be used for AVI files with only audio or only video data. I have seen video-only uncompressed AVI files that did not use the 'rec ' list, only '00db' chunks. The 'rec ' list is used for AVI files with interleaved audio and video streams.
The 'rec ' list may also be used for AVI files with only video.

The ## in '##dc' refers to the stream number. For example, video data chunks belonging to stream 0 would use the identifier '00dc'. A chunk of video data contains a single video frame.

Alexander Grigoriev writes ...

    John, ##dc chunk was intended to keep compressed data, whereas ##db chunk nad(sic) to be used for uncompressed DIBs (device independent bitmap), but actually they both can contain compressed data. For example, Microsoft VidCap (more precisely, video capture window class) writes MJPEG compressed data in ##db chunks, whereas Adobe Premiere writes frames compressed with the same MJPEG codec as ##dc chunks.

----End of Alexander

The ##wb chunks contain the audio data.

The audio and video chunks in an AVI file do not contain time stamps or frame counts. The data is ordered sequentially in time as it appears in the AVI file. A player application should display the video frames at the frame rate indicated in the headers, and should play the audio at the audio sample rate indicated in the headers. Usually, the streams are all assumed to start at time zero, since there are no explicit time stamps in the AVI file.

The lack of time stamps is a weakness of the original AVI file format. The OpenDML AVI Extensions add new chunks with time stamps. Microsoft's ASF (Advanced or Active Streaming Format), which Microsoft claims will replace AVI, has time stamp "objects".

In principle, a video chunk contains a single frame of video. By design, the video chunk should be interleaved with an audio chunk containing the audio associated with that video frame. The data consists of pairs of video and audio chunks. These pairs may be encapsulated in a 'rec ' list. Not all AVI files obey this simple scheme. There are even AVI files with all the video followed by all of the audio; this is not the way an AVI file should be made.
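The '##wb', '##dc', and '##db' naming scheme described above can be decoded with a few lines of C. The sketch below is illustrative only: `AviChunkKind` and `avi_parse_stream_chunk_id` are hypothetical names, and it assumes the two leading characters are decimal digits, which the text does not state explicitly. As Alexander Grigoriev's note points out, the 'dc' versus 'db' distinction is nominal, so a real parser should consult the stream format rather than trust the suffix.

```c
#include <ctype.h>
#include <string.h>

/* Nominal meaning of the two-character suffix of a 'movi' data chunk. */
typedef enum {
    AVI_CHUNK_AUDIO,     /* '##wb': sound data */
    AVI_CHUNK_VIDEO_DC,  /* '##dc': video data, nominally compressed */
    AVI_CHUNK_VIDEO_DB   /* '##db': video data, nominally an uncompressed DIB */
} AviChunkKind;

/* Split a 4-character chunk identifier such as "00dc" into a stream number
   and a kind. Returns 0 on success, -1 if the identifier does not follow
   the ##wb / ##dc / ##db pattern (for example 'idx1' or 'JUNK').
   Assumption: the stream number is written as two decimal digits. */
int avi_parse_stream_chunk_id(const char *id, int *stream, AviChunkKind *kind)
{
    if (!isdigit((unsigned char)id[0]) || !isdigit((unsigned char)id[1]))
        return -1;

    if (memcmp(id + 2, "wb", 2) == 0)
        *kind = AVI_CHUNK_AUDIO;
    else if (memcmp(id + 2, "dc", 2) == 0)
        *kind = AVI_CHUNK_VIDEO_DC;
    else if (memcmp(id + 2, "db", 2) == 0)
        *kind = AVI_CHUNK_VIDEO_DB;
    else
        return -1;

    *stream = (id[0] - '0') * 10 + (id[1] - '0');
    return 0;
}
```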
The 'movi' list may be followed by:

    'idx1' (4 byte chunk size) (index data)    // an optional index into movie (a chunk)

The optional index contains a table of byte offsets to each chunk within the 'movi' list. The 'idx1' index supports rapid seeking to frames within the video file.

The 'avih' (AVI Header) chunk contains the following information:

    Total Frames (for example, 1500 frames in an AVI)
    Streams (for example, 2 for audio and video together)
    InitialFrames
    MaxBytes
    BufferSize
    Microseconds Per Frame
    Frames Per Second (for example, 15 fps)
    Size (for example 320x240 pixels)
    Flags

The 'strh' (Stream Header) chunk contains the following information:

    Stream Type (for example, 'vids' for video, 'auds' for audio)
    Stream Handler (for example, 'cvid' for Cinepak)
    Samples Per Second (for example 15 frames per second for video)
    Priority
    InitialFrames
    Start
    Length (for example, 1500 frames for video)
    Length (sec) (for example 100 seconds for video)
    Flags
    BufferSize
    Quality
    SampleSize

For video, the 'strf' (Stream Format) chunk contains the following information:

    Size (for example 320x240 pixels)
    Bit Depth (for example 24 bit color)
    Colors Used (for example 236 for palettized color)
    Compression (for example 'cvid' for Cinepak)

For audio, the 'strf' (Stream Format) chunk contains the following information:

    wFormatTag (for example, WAVE_FORMAT_PCM)
    Number of Channels (for example 2 for stereo sound)
    Samples Per Second (for example 11025)
    Average Bytes Per Second (for example 11025 for 8 bit mono sound sampled at 11025 Hz)
    nBlockAlign
    Bits Per Sample (for example 8 or 16 bits)

Each 'rec ' list contains the sound data and video data for a single frame, in the sound data chunk and the video data chunk respectively.

Other chunks are allowed within the AVI file. For example, I have seen info lists such as

    'LIST' (4 byte list size) 'INFO' (chunks with information on video)

Chunks that are not part of the AVI standard are simply ignored by the AVI parser.
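Because the data chunks carry no time stamps, a player has to derive timing from the header fields listed above. The following C sketch shows the arithmetic, assuming the stream starts at time zero and runs at a constant frame rate. The function names are hypothetical and not part of any Microsoft API.

```c
#include <stdint.h>

/* Presentation time of frame `frame_index` in microseconds, assuming the
   stream starts at time zero and every frame lasts microsec_per_frame. */
uint64_t avi_frame_time_us(uint32_t frame_index, uint32_t microsec_per_frame)
{
    return (uint64_t)frame_index * microsec_per_frame;
}

/* Frame rate implied by the Microseconds Per Frame value (0 if unset). */
double avi_frames_per_second(uint32_t microsec_per_frame)
{
    return microsec_per_frame ? 1e6 / (double)microsec_per_frame : 0.0;
}

/* Total running time in seconds for a stream of total_frames frames. */
double avi_duration_seconds(uint32_t total_frames, uint32_t microsec_per_frame)
{
    return (double)total_frames * (double)microsec_per_frame / 1e6;
}
```

With the example values used above, 15 frames per second corresponds to roughly 66667 microseconds per frame, and 1500 frames at that rate last about 100 seconds, which matches the Length and Length (sec) examples in the stream header.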
AVI can be and has been extended by adding lists and chunks not in the standard. The 'INFO' list is a registered global form type (across all RIFF files) to store information that helps identify the contents of a chunk.

The sound data is typically 8 or 16 bit PCM, stereo or mono, sampled at 11, 22, or 44.1 kHz. Traditionally, the sound has been uncompressed Windows PCM. With the advent of the World Wide Web and the severe bandwidth limitations of the Internet, there has been increasing use of audio codecs. The wFormatTag field in the audio 'strf' (Stream Format) chunk identifies the audio format and codec.
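For uncompressed PCM, the nBlockAlign and Average Bytes Per Second values in the audio 'strf' chunk follow directly from the channel count, sample rate, and bits per sample. The C sketch below shows that relationship. The `PcmParams` type and the function names are hypothetical, and the formulas apply to plain PCM only, not to compressed formats selected through wFormatTag.

```c
#include <stdint.h>

/* The three independent parameters of an uncompressed PCM stream. */
typedef struct {
    uint16_t channels;         /* 1 for mono, 2 for stereo */
    uint32_t samples_per_sec;  /* for example 11025, 22050, or 44100 */
    uint16_t bits_per_sample;  /* 8 or 16 */
} PcmParams;

/* Bytes occupied by one sample frame (one sample for every channel). */
uint16_t pcm_block_align(const PcmParams *p)
{
    return (uint16_t)(p->channels * (p->bits_per_sample / 8));
}

/* Average data rate in bytes per second for uncompressed PCM. */
uint32_t pcm_avg_bytes_per_sec(const PcmParams *p)
{
    return p->samples_per_sec * pcm_block_align(p);
}
```

For 8 bit mono sound at 11025 samples per second this gives a block alignment of 1 byte and 11025 bytes per second. For 16 bit stereo at 44.1 kHz it gives 4 bytes and 176400 bytes per second.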

AVI and Windows Bitmaps (DDB, DIB, ...)

Microsoft Windows represents bitmapped images internally and in files as Device Dependent Bitmaps (DDB), Device Independent Bitmaps (DIB), and DIB Sections. Uncompressed 'DIB ' AVI files represent video frames as DIB's. Various multimedia API's that work with AVI use Windows bitmapped images.

Prior to Windows 3.0, Windows relied on Device Dependent Bitmaps for bitmapped images. A DDB is stored in a format understood by the device driver for a particular video card. As the name suggests, DDB's are not generally portable. The structure of a DDB is:

    typedef struct tagBITMAP {  // bm
        LONG   bmType;        /* always zero */
        LONG   bmWidth;       /* width in pixels */
        LONG   bmHeight;      /* height in pixels */
        LONG   bmWidthBytes;  /* bytes per line of data */
        WORD   bmPlanes;      /* number of color planes */
        WORD   bmBitsPixel;   /* bits per pixel */
        LPVOID bmBits;        /* pointer to the bitmap pixel data */
    } BITMAP;

Usually the pixel data immediately follows the BITMAP header.

    (BITMAP header)(Pixel Data)

The HBITMAP handles used by GDI are handles to Device Dependent Bitmaps. The GDI functions BitBlt and StretchBlt actually operate on Device Dependent Bitmaps.

With Windows 3.0, Microsoft introduced the Device Independent Bitmap or DIB, the reigning workhorse of bitmapped images under Windows. The DIB provided a device independent way to represent bitmapped images, both monochrome and color.

Windows retains DDB's despite the introduction of the DIB. For example, to use a DIB, you might call:

    hBitmap = CreateDIBitmap(...)

CreateDIBitmap creates a DDB from a DIB, returning the GDI HBITMAP handle of the DDB for further GDI calls. At a low level, Windows and GDI are still using DDB's.

DIB files have a standard header that identifies the format, size, and color palette (if applicable) of the bitmapped image. The header is a BITMAPINFO structure.
    typedef struct tagBITMAPINFO {
        BITMAPINFOHEADER bmiHeader;
        RGBQUAD          bmiColors[1];
    } BITMAPINFO;

The BITMAPINFOHEADER is a structure of the form:

    typedef struct tagBITMAPINFOHEADER {  // bmih
        DWORD biSize;
        LONG  biWidth;
        LONG  biHeight;
        WORD  biPlanes;
        WORD  biBitCount;
        DWORD biCompression;  /* a DIB can be compressed using run length encoding */
        DWORD biSizeImage;
        LONG  biXPelsPerMeter;
        LONG  biYPelsPerMeter;
        DWORD biClrUsed;
        DWORD biClrImportant;
    } BITMAPINFOHEADER;

bmiColors[1] is the first entry in an optional color palette or color table of RGBQUAD data structures. True color (24 bit RGB) images do not need a color table. 4 and 8 bit color images use a color table.

    typedef struct tagRGBQUAD {  // rgbq
        BYTE rgbBlue;
        BYTE rgbGreen;
        BYTE rgbRed;
        BYTE rgbReserved;  /* always zero */
    } RGBQUAD;

A DIB consists of

    (BITMAPINFOHEADER)(optional color table of RGBQUAD's)(data for the bitmapped image)

A Windows .BMP file is a DIB stored in a disk file. .BMP files prepend a BITMAPFILEHEADER to the DIB data structure.

    typedef struct tagBITMAPFILEHEADER {  // bmfh
        WORD  bfType;       /* always 'BM' */
        DWORD bfSize;       /* size of bitmap file in bytes */
        WORD  bfReserved1;  /* always 0 */
        WORD  bfReserved2;  /* always 0 */
        DWORD bfOffBits;    /* offset to data for bitmap */
    } BITMAPFILEHEADER;

Structure of Data in a .BMP File

    (BITMAPFILEHEADER)(BITMAPINFOHEADER)(RGBQUAD color table)(Pixel Data)

The Win32 API documentation from Microsoft provides extensive information on the data structures in a DIB.

In Windows 95 and Windows NT, Microsoft added the DIBSection to provide a more efficient way to use DIB's in programs. The DIBSection was originally introduced in Windows NT to reduce the number of memory copies during blitting (display) of a DIB.
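The DIB and .BMP layouts above determine where the pixel data begins and how large it is. The C sketch below works through that arithmetic for an uncompressed DIB. It is a sketch under stated assumptions rather than a full BMP reader: the function names are hypothetical, the image is assumed to be uncompressed (no run length encoding), and it relies on facts not spelled out in the text above, namely that each row of DIB pixel data is padded to a multiple of 4 bytes, that the on-disk BITMAPFILEHEADER is 14 bytes, and that the BITMAPINFOHEADER is 40 bytes.

```c
#include <stdint.h>

/* Bytes per row of an uncompressed DIB: rows are padded to a 4 byte boundary. */
uint32_t dib_row_stride(uint32_t width, uint16_t bit_count)
{
    return ((width * bit_count + 31u) / 32u) * 4u;
}

/* Number of RGBQUAD entries in the color table. A biClrUsed of zero means
   the full table for palettized depths (8 bits or fewer) and no table
   for deeper formats such as 24 bit RGB. */
uint32_t dib_color_table_entries(uint16_t bit_count, uint32_t clr_used)
{
    if (clr_used != 0)
        return clr_used;
    return bit_count <= 8 ? (1u << bit_count) : 0u;
}

/* Size in bytes of the pixel data for an uncompressed DIB of `rows` rows. */
uint32_t dib_image_size(uint32_t width, uint32_t rows, uint16_t bit_count)
{
    return dib_row_stride(width, bit_count) * rows;
}

/* Expected bfOffBits for a .BMP file: file header (14 bytes), info header
   (40 bytes), then 4 bytes per color table entry. */
uint32_t bmp_pixel_data_offset(uint16_t bit_count, uint32_t clr_used)
{
    return 14u + 40u + 4u * dib_color_table_entries(bit_count, clr_used);
}
```

For a 320x240 frame in 24 bit color, each row is 960 bytes and the frame occupies 230400 bytes, which is also the size of one uncompressed video frame of that format in an AVI file.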