COOK audio parsing in the FFmpeg rmvb demuxer

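 These collected notes on the COOK codec are helpful for understanding how the ffmpeg rmvb demuxer parses audio packets.

 Relevant code: the audio-parsing part of the function ff_rm_parse_packet in /libavformat/rmdec.c.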


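First, my own understanding:

      Each audio frame is one sub packet.

      Multiple sub packets make up one logical unit, a packet.

      sub_packet_h packets make up one 'scrambling unit'.

      What read_packet ultimately reads and hands to the decoder is the sub packet.

      To obtain a sub packet, a formula is used to determine the position of each frame, which is then read from that position.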

Original article: http://www.rockbox.org/wiki/CookCodec

From the rm file header, the following parameters are of interest to an audio decoder:

  • avg_packet_size
  • sub_packet_size
  • sub_packet_h
  • block_align

The mystery of frames, packets and sub-packets

In cook, a packet is a logical unit for storing audio frames. One packet typically contains multiple mixed frames, which rm calls sub_packets.

For almost any rm audio file, block_align == avg_packet_size, which is also synonymous with frame_size in the rm header. The 'regular' audio frame, that is, an audio buffer that could be sent to a decoder, is called a sub_packet. In this context, rm's frame_size is the size of one logical unit of multiple frames, and sub_packet_size is the size of a regular audio frame.

block_align

As stated in the previous paragraph, in an rm file the value of block_align is equal to frame_size or avg_packet_size, which is the size of one unit of packed frames. That is not exactly the case for cook, however. For cook, block_align == sub_packet_size, which is the size of an actual audio frame. This has to be done manually, though: an rm header just provides the values of the parameters, and a parser has to handle the rest. This means that the parser checks whether a file contains cook audio and, if so, assigns the value of sub_packet_size to block_align.
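The override described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: struct rm_audio_info, its field names, and rm_fixup_block_align are hypothetical names invented for this example, not ffmpeg or Rockbox API, and the codec is assumed to be identified by the four characters "cook".

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the header values discussed above. */
struct rm_audio_info {
    char     fourcc[4];        /* codec tag, assumed to be "cook" for cook audio */
    uint32_t frame_size;       /* size of one packet (unit of packed frames) */
    uint32_t sub_packet_size;  /* size of one real audio frame */
    uint32_t sub_packet_h;     /* packets per scrambling unit */
    uint32_t block_align;      /* value handed to the decoder */
};

/* Default: block_align is the packet size. For cook, override it with
 * the size of a single audio frame (sub_packet_size). */
static void rm_fixup_block_align(struct rm_audio_info *a)
{
    a->block_align = a->frame_size;
    if (memcmp(a->fourcc, "cook", 4) == 0)
        a->block_align = a->sub_packet_size;
}
```

The original packet size remains available in frame_size, which a parser still needs in order to know how many bytes to read per packet.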

sub_packet_h

This is described in ffmpeg as a 'descrambling parameter'. After the frames (sub_packets) are packed into packets, the packets are further packed into scrambling units, each containing sub_packet_h packets (packets, not sub_packets). So for a parser to construct proper audio frames that the decoder can handle, it first has to loop through the packets, 'descrambling' them. For this process, the parser has to determine the position of each audio frame in the scrambling unit according to a rather opaque mathematical formula. Luckily, the ffmpeg developers were able to figure out this formula, which is:

sps*(h*x+((h+1)/2)*(y&1)+(y>>1))

  • sps = sub_packet_size;
  • h = sub_packet_h;
  • x = the position of the current frame in its parent packet;
  • y = sub_packet_count; despite the name, a counter of the packets read so far in the current scrambling unit (0 to h-1).
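The formula above can be read as a destination offset: the sub_packet at position x of the y-th packet is copied to that byte offset in the descrambled unit. Below is a minimal sketch of that reading. The function names and the flat src/dst buffer layout are assumptions made for this example (a real demuxer reads from the file packet by packet), and packet_size is assumed to be a multiple of sps.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte offset of a sub_packet inside the descrambled unit.
 * sps = sub_packet_size, h = sub_packet_h,
 * x = index of the sub_packet within its packet,
 * y = index of the packet within the scrambling unit (0..h-1). */
static size_t cook_subpacket_offset(size_t sps, size_t h, size_t x, size_t y)
{
    return sps * (h * x + ((h + 1) / 2) * (y & 1) + (y >> 1));
}

/* Descramble one unit.
 * src: h packets of packet_size bytes each, in file order.
 * dst: output buffer of h * packet_size bytes.
 * Returns 0 on success, -1 on invalid parameters. */
static int cook_descramble_unit(const uint8_t *src, uint8_t *dst,
                                size_t packet_size, size_t sps, size_t h)
{
    if (sps == 0 || h == 0 || packet_size % sps != 0)
        return -1;

    size_t per_packet = packet_size / sps;  /* sub_packets per packet */

    for (size_t y = 0; y < h; y++) {
        for (size_t x = 0; x < per_packet; x++) {
            memcpy(dst + cook_subpacket_offset(sps, h, x, y), src, sps);
            src += sps;  /* source is consumed sequentially */
        }
    }
    return 0;
}
```

For a fixed x, even values of y fill the first half of a group of h slots and odd values fill the second half, so every slot of the unit is written exactly once.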

After one scrambling unit has been reconstructed, its audio frames are sent to the decoder. The decoder takes an input buffer of uint8_t* and produces an output buffer of int16_t*.
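Handing frames to the decoder then amounts to walking the descrambled unit in steps of sub_packet_size. The sketch below illustrates this with a callback standing in for the decoder. The frame_sink type, emit_frames, and count_bytes_sink are hypothetical names for this example only; a real decoder call would also produce int16_t output samples, which is omitted here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for "send one frame to the decoder".
 * Returns a negative value on failure. */
typedef int (*frame_sink)(const uint8_t *frame, size_t size, void *opaque);

/* Split a descrambled unit into frames of sps bytes and pass each one
 * to the sink. Returns the number of frames emitted, or -1 on error. */
static int emit_frames(const uint8_t *unit, size_t unit_size, size_t sps,
                       frame_sink sink, void *opaque)
{
    if (sps == 0 || unit_size % sps != 0)
        return -1;

    int count = 0;
    for (size_t off = 0; off < unit_size; off += sps) {
        if (sink(unit + off, sps, opaque) < 0)
            return -1;
        count++;
    }
    return count;
}

/* Example sink: only accumulates the number of bytes it was given. */
static int count_bytes_sink(const uint8_t *frame, size_t size, void *opaque)
{
    (void)frame;
    *(size_t *)opaque += size;
    return 0;
}
```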