Muxing H.264 and AAC into MP4 with mp4v2

遇见里哦

Category: Audio/Video. Tags: c++

Article link: https://blog.csdn.net/qq_28581781/article/details/106862139

I. Download the source

China mirror: https://launchpad.net/ubuntu/+source/mp4v2

Overseas source: https://code.google.com/p/mp4v2/

Download a suitable version of the mp4v2 source. I used mp4v2_2.0.0~dfsg0.orig.tar.bz2.

II. Build

1. Building on Linux

tar jxf mp4v2_2.0.0~dfsg0.orig.tar.bz2
cd mp4v2-2.0.0
./configure --disable-debug
make

The resulting libmp4v2.a is in the .libs directory.

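The following custom options are available in configure:

--disable-debug  Do not generate debugging information. By default the compiler is told to generate debugging information if the platform supports it.

--disable-optimize  Do not optimize. By default compiler optimization is enabled if the platform supports it.

--disable-fvisibility  Do not set the default ELF symbol visibility. By default configure tries to detect whether the compiler supports this feature. On some platforms the detection can be inaccurate, in which case this option should be given.

--disable-gch  By default some platforms are marked to use GCC precompiled headers. This usually reduces build time considerably but can make iterative development more work: dependencies may not be tracked correctly when a header changes, so make clean may be needed more often. Use this option to disable GCC precompiled headers.

--disable-largefile  On some 32-bit platforms or configurations it may be necessary to build without large-file (LFS) support. By default configure tries to detect LFS support and enables it if found.

--disable-util  Do not build or install the utilities. Convenient for users who want to skip the command-line executables that are built by default.

--enable-bi=ARCH  On bi-arch platforms, generate 32-bit or 64-bit code by adding -m32 or -m64 respectively at compile and link time. Use this option to override the platform default.

--enable-ub[=ARCHS]  On OS X, generate universal binaries by adding one or more -arch ARCH arguments at compile and link time. Use this option to target architectures other than the platform default, or to build a universal binary.

--enable-dependency-tracking  Enable automatic dependency tracking for include files. Disabled by default.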

2. Cross-compiling

tar jxf mp4v2_2.0.0~dfsg0.orig.tar.bz2
cd mp4v2-2.0.0
./configure --host=arm-hisiv500-linux CC=arm-hisiv500-linux-gcc CXX=arm-hisiv500-linux-g++ --disable-debug
make

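3. Building on Windows

The mp4v2 project ships a Windows project, but it is missing several files: platform_win32.cpp, platform_win32_impl.h, and Version.rc.

These files can be fetched from the latest SVN repository at http://code.google.com/p/mp4v2/. After adding them to the corresponding project, the build succeeds.

They can also be downloaded here: https://download.csdn.net/download/qq_28581781/12536408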

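III. API reference

1. Create the MP4 file

MP4Create(sFileName); // pass the name of the MP4 file to create

2. Set the file time scale

MP4SetTimeScale(m_hFile, 90000); // first argument: the file handle from step 1; second: the time scale, here the 90 kHz clock commonly used for video

3. Create the H.264 video track

MP4TrackId MP4AddH264VideoTrack( // returns the track ID
    MP4FileHandle hFile, // handle of the created file
    uint32_t timeScale, // time scale of this track; 90000 for H.264
    MP4Duration sampleDuration, // duration of each frame in time-scale units; for 25 fps this is 90000/25 = 3600
    uint16_t width, // video width
    uint16_t height, // video height
    uint8_t AVCProfileIndication, // this and the next two parameters form the H.264 profile-level-id, taken from the second, third and fourth bytes of the SPS; this one is sps[1]
    uint8_t profile_compat, // sps[2]
    uint8_t AVCLevelIndication, // sps[3]
    uint8_t sampleLenFieldSizeMinusOne ); // size in bytes of the length field in front of each NALU, minus 1; we pass 3
    // When writing H.264 data later, strip the 00 00 00 01 start code
    // and put a 4-byte big-endian NALU length in front of the NALU instead.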
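
Passing 3 for sampleLenFieldSizeMinusOne means every NALU handed to the muxer must carry a 4-byte big-endian length in place of its start code. The following is a minimal sketch of that conversion, assuming the input holds exactly one NALU preceded by the 4-byte start code 00 00 00 01. ToLengthPrefixed is a hypothetical helper name and is not part of mp4v2.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of mp4v2): converts one Annex-B NALU
// (4-byte start code 00 00 00 01 + payload) into the length-prefixed form
// expected when sampleLenFieldSizeMinusOne is 3.
// Returns an empty vector if the input does not begin with the start code.
std::vector<uint8_t> ToLengthPrefixed(const uint8_t *data, size_t size)
{
    static const uint8_t kStartCode[4] = {0x00, 0x00, 0x00, 0x01};
    if (data == nullptr || size <= 4)
        return {};
    for (size_t i = 0; i < 4; ++i)
        if (data[i] != kStartCode[i])
            return {};

    const uint32_t naluSize = static_cast<uint32_t>(size - 4);
    std::vector<uint8_t> out;
    out.reserve(size);
    // 4-byte NALU length, most significant byte first (big-endian).
    out.push_back(static_cast<uint8_t>(naluSize >> 24));
    out.push_back(static_cast<uint8_t>(naluSize >> 16));
    out.push_back(static_cast<uint8_t>(naluSize >> 8));
    out.push_back(static_cast<uint8_t>(naluSize));
    // NALU payload, unchanged.
    out.insert(out.end(), data + 4, data + size);
    return out;
}
```

The output has the same size as the input, because the 4-byte start code is simply replaced by a 4-byte length. That is why the sample code in section IV can overwrite the start code in place rather than copy the frame.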

void MP4SetVideoProfileLevel( MP4FileHandle hFile, uint8_t value ); 

// Sets the video profile/level the file conforms to.
// First argument: the file handle.
// Second argument: we pass 1. The values are defined as follows:
MP4SetVideoProfileLevel sets the minimum profile/level of MPEG-4 video support necessary to render the contents of the file. 
ISO/IEC 14496-1:2001 MPEG-4 Systems defines the following values: 
0x00 Reserved 
0x01 Simple Profile @ Level 3 
0x02 Simple Profile @ Level 2 
0x03 Simple Profile @ Level 1 
0x04 Simple Scalable Profile @ Level 2 
0x05 Simple Scalable Profile @ Level 1 
0x06 Core Profile @ Level 2 
0x07 Core Profile @ Level 1 
0x08 Main Profile @ Level 4 
0x09 Main Profile @ Level 3 
0x0A Main Profile @ Level 2 
0x0B N-Bit Profile @ Level 2 
0x0C Hybrid Profile @ Level 2 
0x0D Hybrid Profile @ Level 1 
0x0E Basic Animated Texture @ Level 2 
0x0F Basic Animated Texture @ Level 1 
0x10 Scalable Texture @ Level 3 
0x11 Scalable Texture @ Level 2 
0x12 Scalable Texture @ Level 1 
0x13 Simple Face Animation @ Level 2 
0x14 Simple Face Animation @ Level 1 
0x15-0x7F Reserved 
0x80-0xFD User private 
0xFE No video profile specified 
0xFF No video required 

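The demo has MP4SetVideoProfileLevel(recordCtx->m_mp4FHandle, 0x7F); // Simple Profile @ Level 3. Going by the table above, that comment looks wrong: Simple Profile @ Level 3 should be 0x01. Set the video profile/level according to the profile the encoder actually uses; otherwise the decoded video may look blurry.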

void MP4AddH264SequenceParameterSet( // sets the SPS
    MP4FileHandle hFile, // file handle
    MP4TrackId trackId, // video track ID
    const uint8_t* pSequence, // SPS data, without the 00 00 00 01 start code
    uint16_t sequenceLen ); // data length

void MP4AddH264PictureParameterSet( // sets the PPS
    MP4FileHandle hFile, // file handle
    MP4TrackId trackId, // video track ID
    const uint8_t* pPict, // PPS data, without the 00 00 00 01 start code
    uint16_t pictLen ); // data length
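
To decide which of these calls a NALU belongs to, look at the low 5 bits of its first byte: 7 is an SPS, 8 is a PPS, 5 is an IDR slice. The sketch below shows that check and pulls sps[1], sps[2] and sps[3] out of an SPS, which are the three values MP4AddH264VideoTrack expects. H264ProfileLevel, NalType and ParseSpsProfileLevel are hypothetical names used only for illustration, and the input is assumed to have its start code already removed.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical struct for illustration: the three bytes that
// MP4AddH264VideoTrack takes from the SPS.
struct H264ProfileLevel {
    uint8_t profile;  // sps[1], profile_idc (66 = Baseline, 77 = Main, 100 = High)
    uint8_t compat;   // sps[2], constraint flags
    uint8_t level;    // sps[3], level_idc (for example 31 means level 3.1)
};

// Returns the NAL unit type: the low 5 bits of the first NALU byte
// (7 = SPS, 8 = PPS, 5 = IDR slice, 6 = SEI).
inline int NalType(uint8_t firstByte) { return firstByte & 0x1f; }

// Fills `out` from an SPS that has already had its start code removed.
// Returns false if the buffer is too short or is not an SPS.
bool ParseSpsProfileLevel(const uint8_t *sps, size_t size, H264ProfileLevel &out)
{
    if (sps == nullptr || size < 4 || NalType(sps[0]) != 7)
        return false;
    out.profile = sps[1];
    out.compat  = sps[2];
    out.level   = sps[3];
    return true;
}
```

For an SPS that begins 67 42 00 1F, this yields profile 66, compatibility 0 and level 31.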

4. Create the AAC audio track

MP4TrackId MP4AddAudioTrack( // creates the audio track and returns its track ID
    MP4FileHandle hFile, // MP4 file handle
    uint32_t timeScale, // audio time scale; set to the sample rate, e.g. 8000
    MP4Duration sampleDuration, // duration of each frame in time-scale units; an AAC frame holds 1024 samples, so pass 1024
    uint8_t audioType DEFAULT(MP4_MPEG4_AUDIO_TYPE) ); // audio type; we use MP4_MPEG4_AUDIO_TYPE

void MP4SetAudioProfileLevel( MP4FileHandle hFile, uint8_t value ); // sets the audio profile/level the file conforms to; first argument is the MP4 file handle, second we set to 2. The values are defined as follows:
MP4SetAudioProfileLevel sets the minimum profile/level of MPEG-4 audio support necessary to render the contents of the file. 
ISO/IEC 14496-1:2001 MPEG-4 Systems defines the following values: 
0x00 Reserved 
0x01 Main Profile @ Level 1 
0x02 Main Profile @ Level 2 
0x03 Main Profile @ Level 3 
0x04 Main Profile @ Level 4 
0x05 Scalable Profile @ Level 1 
0x06 Scalable Profile @ Level 2 
0x07 Scalable Profile @ Level 3 
0x08 Scalable Profile @ Level 4 
0x09 Speech Profile @ Level 1 
0x0A Speech Profile @ Level 2 
0x0B Synthesis Profile @ Level 1 
0x0C Synthesis Profile @ Level 2 
0x0D Synthesis Profile @ Level 3 
0x0E-0x7F Reserved 
0x80-0xFD User private 
0xFE No audio profile specified 
0xFF No audio required

bool MP4SetTrackESConfiguration( // sets the audio decoder configuration
    MP4FileHandle hFile, // MP4 file handle
    MP4TrackId trackId, // audio track ID
    const uint8_t* pConfig, // the AAC AudioSpecificConfig, two bytes; can be derived from the ADTS header
    uint32_t configSize); // length

The config consists of 2 bytes, 16 bits in total, laid out as follows:
5 bits | 4 bits | 4 bits | 3 bits
field 1 | field 2 | field 3 | field 4

Field 1: AAC object type
Field 2: sample rate index
Field 3: channel count
Field 4: don't care, set to 0
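
As a worked example of this layout, the sketch below packs the two bytes and also derives them from an ADTS header. MakeAudioSpecificConfig and ConfigFromAdts are hypothetical names. The ADTS variant assumes the caller passes at least the first 4 bytes of a valid header.

```cpp
#include <array>
#include <cstdint>

// Packs the 16-bit config described above:
// 5 bits object type | 4 bits sample-rate index | 4 bits channels | 3 bits zero.
std::array<uint8_t, 2> MakeAudioSpecificConfig(uint8_t objectType,
                                               uint8_t sampleRateIndex,
                                               uint8_t channels)
{
    const uint16_t config =
        static_cast<uint16_t>(((objectType & 0x1f) << 11) |
                              ((sampleRateIndex & 0x0f) << 7) |
                              ((channels & 0x0f) << 3));
    return {static_cast<uint8_t>(config >> 8), static_cast<uint8_t>(config & 0xff)};
}

// Derives the same two bytes from the first 4 bytes of an ADTS header.
// The ADTS profile field stores (object type - 1).
std::array<uint8_t, 2> ConfigFromAdts(const uint8_t *adts)
{
    const uint8_t objectType      = ((adts[2] >> 6) & 0x03) + 1;
    const uint8_t sampleRateIndex = (adts[2] >> 2) & 0x0f;
    const uint8_t channels        = ((adts[2] & 0x01) << 2) | ((adts[3] >> 6) & 0x03);
    return MakeAudioSpecificConfig(objectType, sampleRateIndex, channels);
}
```

AAC-LC (object type 2) at 44100 Hz (index 4) with 2 channels packs to 0x12 0x10. AAC-LC at 32000 Hz (index 5) mono, the setting used by the sample code in section IV, packs to 0x12 0x88.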

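5. Write audio and video data

bool MP4WriteSample( // writes audio or video data
    MP4FileHandle hFile, // MP4 file handle
    MP4TrackId trackId, // audio or video track ID
    const uint8_t* pBytes, // audio or video data. For AAC pass raw AAC without the ADTS header; for H.264 strip the 00 00 00 01 start code
                           // and prepend a 4-byte big-endian NALU length
    uint32_t numBytes, // data length
    MP4Duration duration DEFAULT(MP4_INVALID_DURATION), // frame duration in units of the track's time scale
    MP4Duration renderingOffset DEFAULT(0), // 0 is fine by default
    bool isSyncSample DEFAULT(true) ); // for H.264: true for an IDR frame, false otherwise

Purpose: writes one video frame or one chunk of audio.

Returns: true on success, false on failure.
Parameters: hFile is the file handle,
    trackId is the audio or video track ID,
    pBytes points to the data to write,
    numBytes is the data length in bytes,
    duration is the number of ticks between the previous video frame and the current one, or between the previous audio chunk and the current one,
    isSyncSample says whether a video sample is a key frame.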
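
Since pBytes must be raw AAC, an encoder that emits ADTS frames needs its header removed first. The sketch below assumes one complete ADTS frame per buffer. AdtsHeaderLength and StripAdts are hypothetical helpers, not mp4v2 functions. The header is 7 bytes, or 9 when the protection_absent bit is 0 and a CRC follows.

```cpp
#include <cstddef>
#include <cstdint>

// Returns the ADTS header length in bytes (7, or 9 when a CRC is present),
// or 0 if the buffer does not start with an ADTS syncword (12 bits of 1).
size_t AdtsHeaderLength(const uint8_t *data, size_t size)
{
    if (data == nullptr || size < 7)
        return 0;
    if (data[0] != 0xFF || (data[1] & 0xF0) != 0xF0)
        return 0;
    const bool protectionAbsent = (data[1] & 0x01) != 0;
    return protectionAbsent ? 7 : 9;
}

// Points `payload`/`payloadSize` at the raw AAC frame inside an ADTS frame.
// Returns false if no valid header is found or the frame is header-only.
bool StripAdts(const uint8_t *data, size_t size,
               const uint8_t *&payload, size_t &payloadSize)
{
    const size_t headerLen = AdtsHeaderLength(data, size);
    if (headerLen == 0 || size <= headerLen)
        return false;
    payload = data + headerLen;
    payloadSize = size - headerLen;
    return true;
}
```

The resulting payload pointer and payloadSize are what would be passed as pBytes and numBytes.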

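Notes:
1. duration is what keeps audio and video in sync. A wrong value causes A/V drift and can even crash (usually inside MP4Close).
2. For video, each MP4WriteSample call effectively writes the previous frame: compute duration from the current frame's timestamp and the previous frame's timestamp, then keep the current frame for the next MP4WriteSample call. Audio works the same way.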
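
The duration arithmetic in these notes can be sketched as follows, assuming timestamps are in milliseconds. TicksPerFrame and TicksBetween are hypothetical helper names.

```cpp
#include <cstdint>

// Fixed duration of one video frame in ticks, e.g. 90000 / 25 = 3600.
inline uint64_t TicksPerFrame(uint32_t timeScale, uint32_t fps)
{
    return fps == 0 ? 0 : timeScale / fps;
}

// Duration in ticks between two millisecond timestamps.
// Multiplies before dividing so that time scales which are not a multiple
// of 1000 (such as 44100) do not lose precision.
inline uint64_t TicksBetween(uint64_t prevMs, uint64_t nowMs, uint32_t timeScale)
{
    if (nowMs <= prevMs)
        return 0;   // clock did not advance; the caller decides how to handle it
    return (nowMs - prevMs) * timeScale / 1000;
}
```

With a 90000 time scale, two frames 40 ms apart are 3600 ticks apart, which matches the fixed 25 fps value. With a 32000 audio time scale, 32 ms corresponds to 1024 ticks, one AAC frame.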

6. Close the MP4 file

void MP4Close( // closes the MP4 file
    MP4FileHandle hFile, // file handle
    uint32_t flags DEFAULT(0) ); // flags; we pass MP4_CLOSE_DO_NOT_COMPUTE_BITRATE so that the bitrate is not computed at close time, which makes closing faster

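IV. Sample code

mp4v2_mp4.cpp

#include "mp4v2_mp4.h"
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "get_time.h" // author's helper (not shown); provides os_get_reltime_ms()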

/* AAC object types */
enum
{
    GF_M4A_AAC_MAIN = 1,
    GF_M4A_AAC_LC = 2,
    GF_M4A_AAC_SSR = 3,
    GF_M4A_AAC_LTP = 4,
    GF_M4A_AAC_SBR = 5,
    GF_M4A_AAC_SCALABLE = 6,
    GF_M4A_TWINVQ = 7,
    GF_M4A_CELP = 8,
    GF_M4A_HVXC = 9,
    GF_M4A_TTSI = 12,
    GF_M4A_MAIN_SYNTHETIC = 13,
    GF_M4A_WAVETABLE_SYNTHESIS = 14,
    GF_M4A_GENERAL_MIDI = 15,
    GF_M4A_ALGO_SYNTH_AUDIO_FX = 16,
    GF_M4A_ER_AAC_LC = 17,
    GF_M4A_ER_AAC_LTP = 19,
    GF_M4A_ER_AAC_SCALABLE = 20,
    GF_M4A_ER_TWINVQ = 21,
    GF_M4A_ER_BSAC = 22,
    GF_M4A_ER_AAC_LD = 23,
    GF_M4A_ER_CELP = 24,
    GF_M4A_ER_HVXC = 25,
    GF_M4A_ER_HILN = 26,
    GF_M4A_ER_PARAMETRIC = 27,
    GF_M4A_SSC = 28,
    GF_M4A_AAC_PS = 29,
    GF_M4A_LAYER1 = 32,
    GF_M4A_LAYER2 = 33,
    GF_M4A_LAYER3 = 34,
    GF_M4A_DST = 35,
    GF_M4A_ALS = 36
};


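// Returns true if a 4-byte start code (00 00 00 01) begins at buffer[pos].
static bool IsStartCode(const unsigned char *buffer, unsigned int size, unsigned int pos)
{
    return pos + 4 <= size &&
           buffer[pos] == 0x00 && buffer[pos + 1] == 0x00 &&
           buffer[pos + 2] == 0x00 && buffer[pos + 3] == 0x01;
}

// Finds the first NALU at or after offSet. Returns the number of bytes consumed
// from offSet (start code included), or 0 if no NALU was found.
int ReadOneNaluFromBuf(const unsigned char *buffer,unsigned int nBufferSize,unsigned int offSet,MP4ENC_NaluUnit &nalu)
{
    unsigned int i = offSet;
    // Advance one byte at a time so that a start code is never skipped.
    while (i < nBufferSize && !IsStartCode(buffer, nBufferSize, i))
        i++;
    if (i + 4 >= nBufferSize)   // no start code, or nothing after it
        return 0;
    i += 4;                     // first byte of the NALU payload

    // The NALU ends at the next start code or at the end of the buffer.
    unsigned int pos = i;
    while (pos < nBufferSize && !IsStartCode(buffer, nBufferSize, pos))
        pos++;

    nalu.size = pos - i;
    nalu.type = buffer[i] & 0x1f;
    nalu.data = (unsigned char *)&buffer[i];
    return (int)(pos - offSet);
}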

/* Returns the sample rate index */
static int GetSRIndex(unsigned int sampleRate)
{
    if (92017 <= sampleRate) return 0;
    if (75132 <= sampleRate) return 1;
    if (55426 <= sampleRate) return 2;
    if (46009 <= sampleRate) return 3;
    if (37566 <= sampleRate) return 4;
    if (27713 <= sampleRate) return 5;
    if (23004 <= sampleRate) return 6;
    if (18783 <= sampleRate) return 7;
    if (13856 <= sampleRate) return 8;
    if (11502 <= sampleRate) return 9;
    if (9391 <= sampleRate) return 10;
    return 11;
}

static void GetAudioSpecificConfig(uint8_t AudioType, uint8_t SampleRateID, uint8_t Channel, uint8_t *pHigh, uint8_t *pLow)
{
    uint16_t Config;

    Config = 0xffff&(AudioType & 0x1f);
    Config <<= 4;
    Config |= SampleRateID & 0x0f;
    Config <<= 4;
    Config |= Channel & 0x0f;
    Config <<= 3;

    *pLow  = Config & 0xff;
    Config >>= 8;
    *pHigh = Config & 0xff;
}

// Writes one H.264 NALU. buffer must hold exactly one NALU preceded by a
// 4-byte start code (00 00 00 01).
int Mp4v2WriteH264toMP4(MP4V2_MP4_T *pHandle, unsigned char *buffer, unsigned int frame_size)
{
    if (pHandle == NULL || buffer == NULL || frame_size <= 4)
        return 0;
    unsigned char nalu_type = buffer[4] & 0x1f;
    unsigned char *nalu_data = &buffer[4];
    unsigned int nalu_size = frame_size - 4;
    uint64_t nowvoltime = os_get_reltime_ms();
    // printf("nalu_type=%d, frame_size=%d\n", nalu_type, frame_size);
    if(nalu_type == 0x07 && 1 == pHandle->isFirstSPS) // SPS
    {
        if (nalu_size < 4) // sps[1..3] are needed below
            return 0;
        MP4SetTimeScale(pHandle->hMp4File,pHandle->m_nTimeScale);
        printf("isFirstSPS.\n");
        pHandle->m_videoId = MP4AddH264VideoTrack
                            (   pHandle->hMp4File,
                                    pHandle->m_nTimeScale,
                                    pHandle->m_nTimeScale / pHandle->m_nFrameRate,
                                    pHandle->m_nWidth,
                                    pHandle->m_nHeight,
                                    nalu_data[1],                 // sps[1] AVCProfileIndication
                                    nalu_data[2],                 // sps[2] profile_compat
                                    nalu_data[3],                 // sps[3] AVCLevelIndication
                                    3);                           // 4 bytes length before each NAL unit
        if (pHandle->m_videoId == MP4_INVALID_TRACK_ID)
        {
            //MP4Close(pHandle->hMp4File,0); //add in 20180619
            printf("add video track failed.\n");
            return 0;
        }
        MP4SetVideoProfileLevel(pHandle->hMp4File, 0x7F);
        MP4AddH264SequenceParameterSet(pHandle->hMp4File, pHandle->m_videoId, nalu_data, nalu_size);
        pHandle->isFirstSPS = 0;
    }
    else if(nalu_type == 0x08 && 1 == pHandle->isFirstPPS && !pHandle->isFirstSPS) // PPS, only after the SPS created the track
    {
        MP4AddH264PictureParameterSet(pHandle->hMp4File, pHandle->m_videoId, nalu_data, nalu_size);
        pHandle->isFirstPPS = 0;
        printf("isFirstPPS.\n");
    }
    else if(nalu_type == 0x06)  // SEI: ignored
    {

    }
    else if(!pHandle->isFirstSPS && !pHandle->isFirstPPS)
    {
        // Temporarily overwrite the start code with the big-endian NALU length.
        unsigned char header[4];
        memcpy(header, buffer, 4);
        buffer[0] = nalu_size >> 24;
        buffer[1] = nalu_size >> 16;
        buffer[2] = nalu_size >> 8;
        buffer[3] = nalu_size & 0xff;
        if(1 == pHandle->isFirstFrame)
        {
            if(nalu_type == 0x05)   // the first sample written must be an IDR frame
            {
                printf("isFirstFrame.\n");
                pthread_mutex_lock(&pHandle->mutex);
                MP4WriteSample(pHandle->hMp4File, pHandle->m_videoId, buffer, frame_size,
                    pHandle->m_nTimeScale/pHandle->m_nFrameRate, 0, true);
                pthread_mutex_unlock(&pHandle->mutex);

                pHandle->videotime = nowvoltime;
                pHandle->audiotime = nowvoltime;
                pHandle->isFirstFrame = 0;
            }
        }
        else
        {
            bool isSyncSample = (nalu_type == 0x05);
            pthread_mutex_lock(&pHandle->mutex);
            // Elapsed milliseconds converted to time-scale ticks (x90 for a 90000 time scale).
            MP4WriteSample(pHandle->hMp4File, pHandle->m_videoId, buffer, frame_size,
                (nowvoltime-pHandle->videotime)*pHandle->m_nTimeScale/1000, 0, isSyncSample);
            pthread_mutex_unlock(&pHandle->mutex);
            pHandle->videotime = nowvoltime;
        }
        memcpy(buffer, header, 4); // restore the caller's start code
    }
    return 0;
}

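// Writes one raw AAC frame (no ADTS header). Audio is dropped until the first
// video IDR frame has been written.
int Mp4v2WriteAACtoMP4(MP4V2_MP4_T *pHandle, unsigned char *buffer, unsigned int frame_size)
{
    if(!pHandle->isFirstFrame && pHandle->m_audioId != MP4_INVALID_TRACK_ID)
    {
        // If the input still carries an ADTS header, strip it first:
        // const unsigned char *buff = &buffer[7];
        // int size = frame_size -7;
        bool ret;
        uint64_t nowvoltime = os_get_reltime_ms();
        // Elapsed milliseconds converted to audio ticks; multiply before dividing to keep precision.
        uint64_t duration = (nowvoltime-pHandle->audiotime)*pHandle->samplerate/1000;
        pHandle->audiotime = nowvoltime;
        pthread_mutex_lock(&pHandle->mutex);
        ret = MP4WriteSample(pHandle->hMp4File, pHandle->m_audioId, buffer, frame_size, duration, 0, true);
        pthread_mutex_unlock(&pHandle->mutex);
        if(!ret)
        {
            printf("MP4WriteSample (audio) failed\n");
            return 0;
        }
    }
    return 0;
}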

MP4V2_MP4_T *Mp4v2CreateMP4File(const char *pFileName,int width,int height,int timeScale/* = 90000*/,int frameRate/* = 25*/)
{
    if(pFileName == NULL || timeScale <= 0 || frameRate <= 0)
    {
        return NULL;
    }
    // calloc zero-initializes every field, so nothing is read uninitialized later.
    MP4V2_MP4_T *pHandle = (MP4V2_MP4_T *)calloc(1, sizeof(MP4V2_MP4_T));
    if(pHandle == NULL)
    {
        return NULL;
    }
    pthread_mutex_init(&pHandle->mutex, NULL);

    // Create the MP4 file.
    // The MP4_CREATE_64BIT_DATA flag allows media data beyond 32-bit offsets.
    // As I understand it, that allows a single MP4 file larger than 2^32 bytes = 4 GB.
    pHandle->hMp4File = MP4Create(pFileName);
    // pHandle->hMp4File = MP4CreateEx(pFileName,  0, 1, 1, 0, 0, 0, 0); // create the MP4 file
    if (pHandle->hMp4File == MP4_INVALID_FILE_HANDLE)
    {
        printf("ERROR: open file failed.\n");
        pthread_mutex_destroy(&pHandle->mutex);
        free(pHandle);
        return NULL;
    }
    pHandle->m_videoId = MP4_INVALID_TRACK_ID; // created when the first SPS arrives

    // Add the AAC audio track (this sample assumes 32 kHz mono AAC-LC).
    pHandle->samplerate = 32000;
    pHandle->m_audioId = MP4AddAudioTrack(pHandle->hMp4File, pHandle->samplerate, MP4_INVALID_DURATION, MP4_MPEG4_AUDIO_TYPE);
    if (pHandle->m_audioId == MP4_INVALID_TRACK_ID)
    {
        printf("add audio track failed.\n");
    }
    else
    {
        MP4SetAudioProfileLevel(pHandle->hMp4File, 0x2);
        uint8_t Type = GF_M4A_AAC_LC;
        uint8_t SampleRate = GetSRIndex(pHandle->samplerate);
        uint8_t Channel = 1;
        uint8_t aacInfo[2];
        uint32_t aacInfoSize = sizeof(aacInfo);
        GetAudioSpecificConfig(Type,SampleRate, Channel, &aacInfo[0], &aacInfo[1]);
        printf("aacInfo=%#x, %#x\n", aacInfo[0], aacInfo[1]);
        MP4SetTrackESConfiguration(pHandle->hMp4File, pHandle->m_audioId, aacInfo, aacInfoSize);
    }

    pHandle->m_nWidth = width;
    pHandle->m_nHeight = height;
    pHandle->m_nTimeScale = timeScale;
    pHandle->m_nFrameRate = frameRate;
    pHandle->isFirstPPS = 1;
    pHandle->isFirstSPS = 1;
    pHandle->isFirstFrame = 1;

    return pHandle;
}
 
void Mp4v2CloseMP4File(MP4V2_MP4_T *pHandle)
{
    if(pHandle)
    {
        if(pHandle->hMp4File != MP4_INVALID_FILE_HANDLE)
        {
            MP4Close(pHandle->hMp4File, MP4_CLOSE_DO_NOT_COMPUTE_BITRATE);
            // MP4Close(pHandle->hMp4File, 0);
            pHandle->hMp4File = MP4_INVALID_FILE_HANDLE;
        }

        pthread_mutex_destroy(&pHandle->mutex);
        free(pHandle); // the caller must not use pHandle after this call
    }
}
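
GetSRIndex above maps any sample rate to the nearest bucket, so it always returns an index between 0 and 11. When the encoder's sample rate is known to be one of the standard AAC rates, an exact table lookup is an alternative that rejects unexpected values instead of rounding them. This is only a sketch and is not taken from the author's code. kAacSampleRates and ExactSampleRateIndex are hypothetical names.

```cpp
#include <cstdint>

// Sampling-frequency index table for AAC (index 0 to 12).
static const uint32_t kAacSampleRates[] = {
    96000, 88200, 64000, 48000, 44100, 32000, 24000,
    22050, 16000, 12000, 11025, 8000, 7350};

// Returns the index of an exact table match, or -1 if the rate is not in the table.
int ExactSampleRateIndex(uint32_t sampleRate)
{
    const int count = sizeof(kAacSampleRates) / sizeof(kAacSampleRates[0]);
    for (int i = 0; i < count; ++i)
        if (kAacSampleRates[i] == sampleRate)
            return i;
    return -1;
}
```

For the 32000 Hz rate used in Mp4v2CreateMP4File, both functions give index 5.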

mp4v2_mp4.h

#ifndef _MP4V2_MP4_H_
#define _MP4V2_MP4_H_

#include "mp4v2/mp4v2.h"
#include <pthread.h>
 
// One NAL unit
typedef struct _MP4ENC_NaluUnit
{
    int type;
    int size;
    unsigned char *data;
}MP4ENC_NaluUnit;

typedef struct mp4v2_mp4
{
    int m_nWidth;
    int m_nHeight;
    int m_nFrameRate;
    int m_nTimeScale;
    char isFirstPPS;
    char isFirstSPS;
    char isFirstFrame;
    int samplerate;
    uint64_t videotime;
    uint64_t audiotime;
    MP4TrackId m_videoId;
    MP4TrackId m_audioId;
    MP4FileHandle hMp4File;
    pthread_mutex_t mutex;
}MP4V2_MP4_T;

// Open or create an MP4 file.
MP4V2_MP4_T *Mp4v2CreateMP4File(const char *fileName,int width,int height,int timeScale,int frameRate);
void Mp4v2CloseMP4File(MP4V2_MP4_T *pHandle);
int Mp4v2WriteH264toMP4(MP4V2_MP4_T *pHandle, unsigned char *buffer, unsigned int frame_size);
int Mp4v2WriteAACtoMP4(MP4V2_MP4_T *pHandle, unsigned char *buffer, unsigned int frame_size);

#endif //_MP4V2_MP4_H_
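
Mp4v2WriteH264toMP4 reads the NALU type from buffer[4], so it expects one NALU per call with a 4-byte start code. If the encoder delivers several NALUs in one buffer (for example SPS, PPS and an IDR slice together), the buffer has to be split first. The sketch below is one way to do that and also accepts 3-byte start codes. NaluSpan, StartCodeAt and SplitAnnexB are hypothetical names, and the caller would still need to put a 4-byte start code in front of each NALU before calling the wrapper.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical type for illustration: one NALU inside the caller's buffer.
struct NaluSpan {
    const uint8_t *data;  // first byte after the start code
    size_t size;          // payload size, start code excluded
};

// Returns the start-code length (3 or 4) found at buf[pos], or 0 if none.
static size_t StartCodeAt(const uint8_t *buf, size_t size, size_t pos)
{
    if (pos + 3 <= size && buf[pos] == 0 && buf[pos + 1] == 0 && buf[pos + 2] == 1)
        return 3;
    if (pos + 4 <= size && buf[pos] == 0 && buf[pos + 1] == 0 &&
        buf[pos + 2] == 0 && buf[pos + 3] == 1)
        return 4;
    return 0;
}

// Splits an Annex-B buffer into NALUs. The spans point into `buf`.
std::vector<NaluSpan> SplitAnnexB(const uint8_t *buf, size_t size)
{
    std::vector<NaluSpan> nalus;
    size_t pos = 0;
    size_t begin = 0;      // payload start of the NALU being scanned
    bool inNalu = false;   // true once the first start code has been seen
    while (pos < size) {
        const size_t sc = StartCodeAt(buf, size, pos);
        if (sc == 0) { ++pos; continue; }
        if (inNalu && pos > begin)
            nalus.push_back({buf + begin, pos - begin});
        pos += sc;
        begin = pos;
        inNalu = true;
    }
    // The last NALU runs to the end of the buffer.
    if (inNalu && size > begin)
        nalus.push_back({buf + begin, size - begin});
    return nalus;
}
```

The spans reference the original buffer, so no payload bytes are copied during the split.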
