Parsing and Splitting an AAC ADTS Bitstream

AAC container formats

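AAC has two container formats: ADIF and ADTS.

  1. ADIF (Audio Data Interchange Format) stores the header information needed for decoding and playback (sample rate, channel count, and so on) only once, at the start of the file. Decoding must therefore begin at the start of the file, so ADIF is generally used for files played from local disk.
  2. ADTS (Audio Data Transport Stream) treats the data as a sequence of audio frames, each of which carries its own copy of the header information needed for decoding and playback (sample rate, channel count, and so on). Decoding can start at any frame, which makes ADTS better suited to streaming.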
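Because every ADTS frame carries its own header, a reader that joins the stream part-way through can resynchronize by scanning for the syncword. The following is a minimal sketch of that idea. The function name `FindAdtsFrame` and the plausibility checks (layer bits equal to 0, and a second syncword exactly `frame_length` bytes later) are assumptions made for illustration, not part of the parser shown later in this article.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns the offset of the first plausible ADTS frame
// at or after `start`, or -1 if none is found in the buffer.
long FindAdtsFrame(const uint8_t *buf, size_t len, size_t start) {
    for (size_t i = start; i + 7 <= len; ++i) {
        // 12-bit syncword 0xFFF, and the 2 layer bits must be 0
        if (buf[i] != 0xFF || (buf[i + 1] & 0xF6) != 0xF0) continue;

        // The 13-bit frame_length spans header bytes 3..5
        size_t frame_len = ((buf[i + 3] & 0x03) << 11) | (buf[i + 4] << 3) |
                           (buf[i + 5] >> 5);
        if (frame_len < 7) continue;  // shorter than the smallest header

        size_t next = i + frame_len;
        // Accept the candidate if the frame ends exactly at the buffer end ...
        if (next == len) return (long)i;
        // ... or if another syncword follows it immediately
        if (next + 1 < len && buf[next] == 0xFF &&
            (buf[next + 1] & 0xF6) == 0xF0)
            return (long)i;
    }
    return -1;
}
```

A candidate whose frame would run past the end of the buffer is skipped, so a caller reading from a network socket would typically retry once more data has arrived.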

Inspecting basic AAC audio information with third-party tools

MediaInfo

In the menu, set View to Text and Debug > Details to 0, then open the file.

The complete information for every AAC data frame is then displayed.

VTCLab Media Analyzer App (media-analyzer.pro)

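Open this tool in a browser and use "Open File" to load a local file. It also lists every data frame in the AAC file, together with the offset of each frame, which is even more convenient.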

The ADTS header of AAC audio

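This article covers the ADTS format, which is the container type most commonly used for network transmission (live streaming).

The ADTS page on MultimediaWiki (https://wiki.multimedia.cx/index.php/ADTS) gives a complete description, and the type definitions in this article are based on it.

An AAC bitstream in ADTS format is made up of consecutive ADTS frames.

Each ADTS frame consists of a header (a fixed header plus a variable header) followed by the data. The header layout and the meaning of each field are as follows.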

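AAC ADTS header format

No.  Field                     Bits  Description
---  ------------------------  ----  -----------
1    syncword                  12    Always 0xFFF; used for synchronization and marks the start of an ADTS frame
2    id                        1     MPEG identifier: 0 = MPEG-4, 1 = MPEG-2
3    layer                     2     Always 00
4    protection_absent         1     CRC flag: 0 = a CRC is present (at the end of the header), 1 = no CRC
5    profile                   2     AAC profile: the MPEG-4 Audio Object Type minus 1
6    sampling_frequency_index  4     Sampling-rate index; see the list of values below
7    private_bit               1     Private bit: set to 0 when encoding, ignored when decoding
8    channel_configuration     3     Channel configuration; see the list of values below
9    originality               1     1 signals original audio, 0 otherwise; typically set to 0 when encoding and ignored when decoding
10   home                      1     1 signals home usage, 0 otherwise; typically set to 0 when encoding and ignored when decoding
11   copyright_id_bit          1     Next bit of a centrally registered copyright identifier; typically set to 0 when encoding and ignored when decoding
12   copyright_id_start        1     1 if this frame's copyright_id_bit is the first bit of the identifier, 0 otherwise; typically set to 0 when encoding and ignored when decoding
13   frame_length              13    Length of the ADTS frame in bytes, including the ADTS header and the raw AAC data
14   buffer_fullness           11    Buffer fullness. 0x7FF marks a variable-bitrate stream, for which this field is not needed. CBR streams may need it, and usage differs between encoders
15   num_raw_data_blocks       2     The ADTS frame holds num_raw_data_blocks + 1 raw AAC frames; 0 means the frame holds exactly one
16   crc                       16    Present only when protection_absent is 0

sampling_frequency_index values:

0: 96000 Hz
1: 88200 Hz
2: 64000 Hz
3: 48000 Hz
4: 44100 Hz
5: 32000 Hz
6: 24000 Hz
7: 22050 Hz
8: 16000 Hz
9: 12000 Hz
10: 11025 Hz
11: 8000 Hz
12: 7350 Hz
13: Reserved
14: Reserved
15: frequency is written explicitly (forbidden in ADTS)

channel_configuration values:

0: Defined in AOT Specific Config
1: 1 channel: front-center
2: 2 channels: front-left, front-right
3: 3 channels: front-center, front-left, front-right
4: 4 channels: front-center, front-left, front-right, back-center
5: 5 channels: front-center, front-left, front-right, back-left, back-right
6: 6 channels: front-center, front-left, front-right, back-left, back-right, LFE-channel
7: 8 channels: front-center, front-left, front-right, side-left, side-right, back-left, back-right, LFE-channel
8-15: Reserved

Channel names: front-center is the center channel, front-left and front-right are the left and right channels, back-left and back-right are the rear left and rear right channels, side-left and side-right are the side channels, and LFE-channel is the low-frequency effects channel.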
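Reading the table in the other direction, the same bit layout can be used to build a header, for example when wrapping raw AAC frames for streaming. The sketch below packs a 7-byte header without CRC. The function name `WriteAdtsHeader`, the choice of MPEG-4 for the id bit, and writing 0x7FF into buffer_fullness are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: fills `out` with a 7-byte ADTS header (no CRC).
// `profile` is the 2-bit header value (Audio Object Type minus 1).
// Returns false if a field is out of range or the frame exceeds 13 bits.
bool WriteAdtsHeader(uint8_t out[7], unsigned profile, unsigned sf_index,
                     unsigned channel_cfg, size_t payload_len) {
    const size_t frame_len = payload_len + 7;  // frame_length includes the header
    if (frame_len > 0x1FFF || profile > 3 || sf_index > 12 || channel_cfg > 7)
        return false;
    const unsigned fullness = 0x7FF;           // assumption: signal a VBR stream

    out[0] = 0xFF;                             // AAAAAAAA
    out[1] = 0xF1;                             // AAAABCCD: MPEG-4, layer 0, no CRC
    out[2] = (uint8_t)((profile << 6) | (sf_index << 2) |
                       (channel_cfg >> 2));    // EEFFFFGH, private bit = 0
    out[3] = (uint8_t)(((channel_cfg & 0x3) << 6) |
                       (frame_len >> 11));     // HHIJKLMM, I/J/K/L = 0
    out[4] = (uint8_t)((frame_len >> 3) & 0xFF);                     // MMMMMMMM
    out[5] = (uint8_t)(((frame_len & 0x7) << 5) | (fullness >> 6));  // MMMOOOOO
    out[6] = (uint8_t)((fullness & 0x3F) << 2);  // OOOOOOPP: one raw data block
    return true;
}
```

For a 1-byte payload with profile 1, sampling index 4 (44100 Hz) and 2 channels, this produces the bytes FF F1 50 80 01 1F FC.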

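Defining the struct

The colon syntax declares a bit-field, and the number after it is the width of the field in bits. Many ADTS header fields straddle byte boundaries, and the in-memory layout of bit-fields is implementation-defined, so the struct cannot simply be filled from the raw bytes with memcpy. Here the bit-field widths serve mainly as documentation.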


struct AdtsHeader {
    // 12 bits: syncword '1111 1111 1111', marks the start of an ADTS frame
    unsigned short syncword : 12;

    // 1 bit: MPEG identifier, 0 for MPEG-4, 1 for MPEG-2
    unsigned short id : 1;
    unsigned short layer : 2;             // 2 bits: always '00'
    unsigned short protection_absent : 1; // 1 bit: 1 = no CRC, 0 = CRC present

    unsigned short profile : 2; // 2 bits: which AAC profile is used
    unsigned short sampling_frequency_index : 4; // 4 bits: sampling-rate index
    unsigned short private_bit : 1;              // 1 bit
    unsigned short channel_configuration : 3;    // 3 bits: channel configuration
    unsigned short originality : 1;              // 1 bit
    unsigned short home : 1;                     // 1 bit

    /* The fields below form the variable header and can differ in every frame */
    unsigned short copyright_id_bit : 1;   // 1 bit
    unsigned short copyright_id_start : 1; // 1 bit
    // 13 bits: length of the ADTS frame, including the ADTS header and raw AAC data
    unsigned short frame_length : 13;
    unsigned short buffer_fullness : 11; // 11 bits: 0x7FF marks a variable-bitrate stream

    /* number_of_raw_data_blocks_in_frame:
     * the ADTS frame holds number_of_raw_data_blocks_in_frame + 1 raw AAC frames,
     * so a value of 0 means the frame holds one raw AAC frame, not none.
     * (One raw AAC frame carries 1024 samples plus related data.)
     */
    unsigned short num_raw_data_blocks : 2;
    unsigned short crc; // 16 bits: present only when protection_absent == 0
};
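Since the fields do not line up with byte boundaries, one alternative to hand-written masks and shifts is a small MSB-first bit reader that consumes the fields in table order. The sketch below is only an illustration of that approach. `BitReader`, `AdtsFields` and `ReadAdtsFields` are hypothetical names and are not used by the parser later in this article.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical MSB-first bit reader.
class BitReader {
public:
    BitReader(const uint8_t *data, size_t size)
        : data_(data), size_(size), pos_(0) {}

    // Reads `n` bits (n <= 32), most significant bit first.
    // Bits past the end of the buffer read as 0.
    uint32_t Read(unsigned n) {
        uint32_t value = 0;
        for (unsigned i = 0; i < n; ++i, ++pos_) {
            unsigned bit = 0;
            if (pos_ / 8 < size_)
                bit = (data_[pos_ / 8] >> (7 - pos_ % 8)) & 1;
            value = (value << 1) | bit;
        }
        return value;
    }

private:
    const uint8_t *data_;
    size_t size_;
    size_t pos_;  // current position, in bits
};

// A subset of the header fields, enough to locate and describe a frame.
struct AdtsFields {
    unsigned protection_absent;
    unsigned profile;
    unsigned sampling_frequency_index;
    unsigned channel_configuration;
    unsigned frame_length;
};

AdtsFields ReadAdtsFields(const uint8_t *data, size_t size) {
    BitReader br(data, size);
    AdtsFields f{};
    br.Read(12);                               // syncword
    br.Read(1);                                // id
    br.Read(2);                                // layer
    f.protection_absent = br.Read(1);
    f.profile = br.Read(2);
    f.sampling_frequency_index = br.Read(4);
    br.Read(1);                                // private_bit
    f.channel_configuration = br.Read(3);
    br.Read(4);                                // originality, home, copyright bits
    f.frame_length = br.Read(13);
    return f;
}
```

The reader is slower than direct masking because it extracts one bit at a time, but the field widths in the code match the table one to one, which makes mistakes easier to spot.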

Parsing an ADTS-format AAC file

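The frame_length field is 13 bits wide, so an ADTS frame can be at most 8191 bytes (2^13 - 1), and an 8192-byte buffer is always large enough to hold one. Note that frame_length includes the length of the ADTS header.

ADTS header length: if protection_absent is 0, a CRC is present and the header occupies 9 bytes. If protection_absent is 1, there is no CRC field and the header occupies 7 bytes.

Payload length: frame_length minus the ADTS header length.

Position of the next ADTS frame: the sum of the frame_length values of all preceding frames is the offset of the next frame.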

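Take the same file that was opened in MediaInfo above: the first frame is 218 bytes and the second is 192 bytes, so the third frame starts at offset 218 + 192 = 410, which is 0x19A in hexadecimal.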
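The three rules above (header length, payload length, next-frame offset) can be sketched as a loop over an in-memory buffer. This is a simplified illustration that assumes the first frame starts at offset 0. The function name `ListFrameOffsets` is hypothetical, and the file-based parser in the next section works differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the byte offset of every complete ADTS frame
// in `buf`, assuming the first frame starts at offset 0.
std::vector<size_t> ListFrameOffsets(const uint8_t *buf, size_t len) {
    std::vector<size_t> offsets;
    size_t pos = 0;
    while (pos + 7 <= len) {
        // Stop if the syncword is missing (sync lost)
        if (buf[pos] != 0xFF || (buf[pos + 1] & 0xF0) != 0xF0) break;

        // 13-bit frame_length from header bytes 3..5
        size_t frame_len = ((buf[pos + 3] & 0x03) << 11) |
                           (buf[pos + 4] << 3) | (buf[pos + 5] >> 5);
        // protection_absent == 1 means no CRC: 7-byte header, otherwise 9
        size_t header_len = (buf[pos + 1] & 0x01) ? 7 : 9;

        // Stop on a corrupt length or a truncated final frame
        if (frame_len < header_len || pos + frame_len > len) break;

        offsets.push_back(pos);
        pos += frame_len;  // the next frame starts right after this one
    }
    return offsets;
}
```

The payload of the frame at offset `pos` then occupies the bytes from `pos + header_len` up to, but not including, `pos + frame_len`.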

Code

#include <iostream>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/**
https://wiki.multimedia.cx/index.php/ADTS

AAAAAAAA AAAABCCD EEFFFFGH HHIJKLMM MMMMMMMM MMMOOOOO OOOOOOPP
(QQQQQQQQ QQQQQQQQ)

Header consists of 7 or 9 bytes (without or with CRC).

Letter  Length (bits)  Description
A       12  Syncword, all bits must be set to 1.
B       1   MPEG Version, set to 0 for MPEG-4 and 1 for MPEG-2.
C       2   Layer, always set to 0.
D       1   Protection absence, set to 1 if there is no CRC and 0 if there
            is CRC.
E       2   Profile, the MPEG-4 Audio Object Type minus 1.
F       4   MPEG-4 Sampling Frequency Index (15 is forbidden).
G       1   Private bit, guaranteed never to be used by MPEG, set to 0 when
            encoding, ignore when decoding.
H       3   MPEG-4 Channel Configuration (in the case of 0, the channel
            configuration is sent via an inband PCE (Program Config Element)).
I       1   Originality, set to 1 to signal originality of the audio and 0
            otherwise.
J       1   Home, set to 1 to signal home usage of the audio and 0 otherwise.
K       1   Copyright ID bit, the next bit of a centrally registered copyright
            identifier. This is transmitted by sliding over the bit-string in
            LSB-first order and putting the current bit value in this field
            and wrapping to start if reached end (circular buffer).
L       1   Copyright ID start, signals that this frame's Copyright ID bit is
            the first one by setting 1 and 0 otherwise.
M       13  Frame length, length of the ADTS frame including headers and CRC
            check.
O       11  Buffer fullness, states the bit-reservoir per frame.

            max_bit_reservoir = minimum_decoder_input_size - mean_bits_per_RDB; // for CBR

            // bit reservoir state/available bits (≥0 and <max_bit_reservoir);
            // for the i-th frame.
            bit_reservoir_state[i] = (int)(bit_reservoir_state[i - 1] +
                                           mean_framelength - framelength[i]);

            // NCC is the number of channels.
            adts_buffer_fullness = bit_reservoir_state[i] / (NCC * 32);

            However, a special value of 0x7FF denotes a variable bitrate, for
            which buffer fullness isn't applicable.
P       2   Number of AAC frames (RDBs (Raw Data Blocks)) in ADTS frame minus 1.
            For maximum compatibility always use one AAC frame per ADTS frame.
Q       16  CRC check (as of ISO/IEC 11172-3, subclause 2.4.3.1), if Protection
            absent is 0.
 */

struct AdtsHeader {
    // 12 bits: syncword '1111 1111 1111', marks the start of an ADTS frame
    unsigned short syncword : 12;

    // 1 bit: MPEG identifier, 0 for MPEG-4, 1 for MPEG-2
    unsigned short id : 1;
    unsigned short layer : 2;             // 2 bits: always '00'
    unsigned short protection_absent : 1; // 1 bit: 1 = no CRC, 0 = CRC present

    unsigned short profile : 2; // 2 bits: which AAC profile is used
    unsigned short sampling_frequency_index : 4; // 4 bits: sampling-rate index
    unsigned short private_bit : 1;              // 1 bit
    unsigned short channel_configuration : 3;    // 3 bits: channel configuration
    unsigned short originality : 1;              // 1 bit
    unsigned short home : 1;                     // 1 bit

    /* The fields below form the variable header and can differ in every frame */
    unsigned short copyright_id_bit : 1;   // 1 bit
    unsigned short copyright_id_start : 1; // 1 bit
    // 13 bits: length of the ADTS frame, including the ADTS header and raw AAC data
    unsigned short frame_length : 13;
    unsigned short buffer_fullness : 11; // 11 bits: 0x7FF marks a variable-bitrate stream

    /* number_of_raw_data_blocks_in_frame:
     * the ADTS frame holds number_of_raw_data_blocks_in_frame + 1 raw AAC frames,
     * so a value of 0 means the frame holds one raw AAC frame, not none.
     * (One raw AAC frame carries 1024 samples plus related data.)
     */
    unsigned short num_raw_data_blocks : 2;
    unsigned short crc; // 16 bits: present only when protection_absent == 0
};

#define AAC_MAX_FRAME_LENGTH 8192

static int AUDIO_SAMPLING_RATES[] = {
    96000, // 0
    88200, // 1
    64000, // 2
    48000, // 3
    44100, // 4
    32000, // 5
    24000, // 6
    22050, // 7
    16000, // 8
    12000, // 9
    11025, // 10
    8000,  // 11
    7350,  // 12
    -1,    // 13
    -1,    // 14
    -1,    // 15
};

FILE *bitstream = NULL; //!< the bit stream file

struct AdtsFrame {
    AdtsHeader header;
    unsigned char *body;
    unsigned short body_length;
};

bool MatchStartCode(unsigned char *pdata) {
    return pdata[0] == 0xFF && (pdata[1] & 0xF0) == 0xF0;
}


int GetNextAacFrame(AdtsFrame *frame) {
    // memset comes from <string.h>, which is already included (bzero would need <strings.h>)
    memset(&frame->header, 0, sizeof(frame->header));
    frame->body_length = 0;
    frame->body = nullptr;

    unsigned char *data_buf = new unsigned char[AAC_MAX_FRAME_LENGTH];

    size_t read_len = fread(data_buf, 1, AAC_MAX_FRAME_LENGTH, bitstream);
    if (read_len < 7) {
        delete[] data_buf;
        return 0;
    }

    // data cursor
    unsigned char *pdata = data_buf;
    // pdata must be smaller than this
    unsigned char *pdata_end = data_buf + read_len;

    if (!MatchStartCode(pdata)) {
        delete[] data_buf;
        return 0;
    } else {
        pdata += 1;
    }

    frame->header.syncword = 0xFFF;
    // AAAABCCD
    frame->header.id = (*pdata & 0x08) >> 3; // B
    frame->header.layer = (*pdata & 0x06) >> 1;        // C
    frame->header.protection_absent = *pdata & 0x01;  // D
    ++pdata;

    // EEFFFFGH
    frame->header.profile = (*pdata & 0xC0) >> 6;                  // E
    frame->header.sampling_frequency_index = (*pdata & 0x3C) >> 2; // F
    frame->header.private_bit = (*pdata & 0x02) >> 1;              // G
    // EEFFFFGH HHIJKLMM
    frame->header.channel_configuration =
        (pdata[0] & 0x01) << 2 | (pdata[1] & 0xC0) >> 6; // H
    ++pdata;

    // HHIJKLMM
    frame->header.originality = (pdata[0] & 0x20) >> 5;        // I
    frame->header.home = (pdata[0] & 0x10) >> 4;               // J
    frame->header.copyright_id_bit = (pdata[0] & 0x08) >> 3;   // K
    frame->header.copyright_id_start = (pdata[0] & 0x04) >> 2; // L

    // HHIJKLMM MMMMMMMM MMMOOOOO
    frame->header.frame_length =
        (pdata[0] & 0x03) << 11 | pdata[1] << 3 | (pdata[2] & 0xE0) >> 5; // M

    pdata += 2;

    // MMMOOOOO OOOOOOPP
    frame->header.buffer_fullness =
        (pdata[0] & 0x1F) << 6 | (pdata[1] & 0xFC) >> 2; // O

    ++pdata;
    // OOOOOOPP
    frame->header.num_raw_data_blocks = pdata[0] & 0x03; // P

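    ++pdata;
    const int header_length = frame->header.protection_absent == 0 ? 9 : 7;
    // Reject corrupt or truncated frames before touching the payload
    if (frame->header.frame_length < header_length ||
        (size_t)frame->header.frame_length > read_len) {
        delete[] data_buf;
        return 0;
    }
    if (frame->header.protection_absent == 0) {
        // Q: 16-bit CRC, high byte first; the two bytes are combined with OR
        frame->header.crc = pdata[0] << 8 | pdata[1];
        pdata += 2;
    }

    frame->body_length = frame->header.frame_length - header_length;
    frame->body = new unsigned char[frame->body_length];
    memcpy(frame->body, pdata, frame->body_length);
    pdata += frame->body_length;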

    // std::cout << "syncword " << frame->header.syncword << std::endl;
    // std::cout << "mpeg_version " << frame->header.id << std::endl;
    // std::cout << "layer " << frame->header.layer << std::endl;
    // std::cout << "protection_absence " << frame->header.protection_absent
    //           << std::endl;
    // std::cout << "profile " << frame->header.profile << std::endl;

    // std::cout << "sampling_frequency_index "
    //           << frame->header.sampling_frequency_index << std::endl;
    // std::cout << "private_bit " << frame->header.private_bit << std::endl;
    // std::cout << "channel " << frame->header.channel_configuration << std::endl;
    // std::cout << "originality " << frame->header.originality << std::endl;
    // std::cout << "home " << frame->header.home << std::endl;
    // std::cout << "copyright_id_bit " << frame->header.copyright_id_bit
    //           << std::endl;
    // std::cout << "copyright_id_start " << frame->header.copyright_id_start
    //           << std::endl;
    // std::cout << "frame_length " << frame->header.frame_length << std::endl;
    // std::cout << "buffer_fullness " << frame->header.buffer_fullness
    //           << std::endl;
    // std::cout << "frames " << frame->header.num_raw_data_blocks << std::endl;

    // fread pulled in a full buffer, which normally extends past this frame.
    // Seek back so that the next call starts at the following frame's header.
    long seek_back = (long)(pdata - pdata_end);

    if (0 != fseek(bitstream, seek_back, SEEK_CUR)) {
        delete[] data_buf;
        delete[] frame->body;
        frame->body = nullptr;
        printf("GetNextAacFrame: Cannot fseek in the bit stream file\n");
        return -1;
    }

    delete[] data_buf;

    return frame->header.frame_length;
}

int simplest_aac_parser(char *url) {

    // FILE *myout=fopen("output_log.txt","wb+");
    FILE *myout = stdout;

    bitstream = fopen(url, "rb"); // read-only access is enough for parsing
    if (bitstream == NULL) {
        printf("Open file error\n");
        return 0;
    }

    AdtsFrame *frame = new AdtsFrame;

    int data_offset = 0;
    int frame_num = 0;
    printf("-----+------------------------------------------ ADTS Table ---------------------------------------------------+\n");
    printf(" NUM |    POS     | Ver | Layer | Abst | Prof | Frq | Pri | Chn | Org | Home | Idb | Ids | Length | Full | Num |\n");
    printf("-----+------------+-----+-------+------+------+-----+-----+-----+-----+------+-----+-----+--------+------+-----+\n");

    while (!feof(bitstream)) {
        int data_length = GetNextAacFrame(frame);
        // 0 means no more frames, a negative value means an I/O error
        if (data_length <= 0)
            break;

        auto* header = &frame->header;
        fprintf(myout, "%5d| 0x%08X | %4d| %6d| %5d| %5d| %4d| %4d| %4d| %4d| %5d| %4d| %4d| %7d| %5d| %4d|\n", frame_num, data_offset, header->id, header->layer, header->protection_absent,
        header->profile, header->sampling_frequency_index, header->private_bit, header->channel_configuration,
        header->originality, header->home, header->copyright_id_bit, header->copyright_id_start, header->frame_length,
        header->buffer_fullness, header->num_raw_data_blocks);

        data_offset = data_offset + data_length;
        frame_num++;

        // Release the payload and clear the pointer so it is not freed twice
        delete[] frame->body;
        frame->body = nullptr;
    }

    // Free
    if (frame) {
        if (frame->body) 
            delete []frame->body;
        delete frame;
    }
    fclose(bitstream);
    bitstream = NULL;
    return 0;
}

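int main(int argc, char *argv[]) {
    // The input file path is required as the first argument
    if (argc < 2) {
        printf("Usage: %s <file.aac>\n", argv[0]);
        return 1;
    }
    char *url = argv[1];
    std::cout << "url: " << url << std::endl;

    simplest_aac_parser(url);
    return 0;
}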
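Once the header fields are available, they can be turned into timing information, which is useful when the frames are split out for streaming. The sketch below computes the duration of one ADTS frame. It assumes 1024 samples per raw data block, as stated in the struct comment above, and `AdtsFrameDurationUs` is a hypothetical name.

```cpp
#include <cstdint>

// Hypothetical helper: duration of one ADTS frame in microseconds,
// assuming 1024 samples per raw data block.
// Returns -1 for reserved or forbidden sampling_frequency_index values.
int64_t AdtsFrameDurationUs(unsigned sampling_frequency_index,
                            unsigned num_raw_data_blocks) {
    static const int kRates[] = {96000, 88200, 64000, 48000, 44100, 32000,
                                 24000, 22050, 16000, 12000, 11025, 8000,
                                 7350};
    if (sampling_frequency_index >= sizeof(kRates) / sizeof(kRates[0]))
        return -1;
    // The header stores the number of raw data blocks minus 1
    const int64_t samples = 1024LL * (num_raw_data_blocks + 1);
    return samples * 1000000 / kRates[sampling_frequency_index];
}
```

The result is truncated to whole microseconds, so a caller accumulating timestamps over many frames would do better to count samples and convert once, rather than summing the rounded per-frame durations.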

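Output

The program prints one table row per ADTS frame: the frame number, its byte offset in the file, and the value of each header field.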
