Audio/Video Development 35: FFmpeg Encoding - Muxing YUV and PCM into a Single MP4 File

hunandede

Last modified: 2024-07-13 14:58:39

Tags: audio/video, ffmpeg, pcm

First published: 2024-07-03 12:05:16

Article link: https://blog.csdn.net/hunandede/article/details/140147425


I. Purpose of the Program

/***
 * Purpose of this program:
 * Mux one PCM file and one YUV file into a single file, 0804_out.mp4.
 * Both the PCM file and the YUV file are extracted from sound_in_sync_test.mp4 with the ffmpeg command line,
 * so that the two MP4 files (sound_in_sync_test.mp4 and 0804_out.mp4) can be compared afterwards.
 *
 * 1. Extract the PCM from sound_in_sync_test.mp4:
 *    ffmpeg -i sound_in_sync_test.mp4 -vn -ar 44100 -ac 2 -f s16le 44100_2_s16le.pcm
 *    -vn means the video is not processed.
 *
 * 2. Extract the YUV from sound_in_sync_test.mp4:
 *    ffmpeg -i sound_in_sync_test.mp4 -pix_fmt yuv420p 720x576_yuv420p.yuv
 *
 * 3. Playback test
 *    PCM data:
 *    ffplay -ac 2 -ar 44100 -f s16le 44100_2_s16le.pcm
 *    YUV data:
 *    ffplay -pixel_format yuv420p -video_size 720x576 -framerate 25 720x576_yuv420p.yuv
 ***/
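
Both raw files have no header, so the program has to know how many bytes make up one picture and one block of audio. The following is a minimal sketch of that arithmetic for the formats used in the commands above (yuv420p at 720x576, s16le stereo). The helper names are hypothetical, and the block size of 1024 samples per channel is an assumption taken from the AAC frame size discussed later in this article.

```cpp
#include <cassert>
#include <cstddef>

// Bytes in one yuv420p picture: a full-resolution Y plane plus two
// chroma planes at a quarter of the resolution each.
std::size_t Yuv420pFrameBytes(std::size_t width, std::size_t height) {
    assert(width % 2 == 0 && height % 2 == 0);  // 4:2:0 needs even dimensions
    return width * height * 3 / 2;
}

// Bytes in one block of interleaved PCM.
std::size_t PcmBlockBytes(std::size_t samples_per_channel,
                          std::size_t channels,
                          std::size_t bytes_per_sample) {
    return samples_per_channel * channels * bytes_per_sample;
}

// Number of complete frames contained in a raw file of file_bytes bytes.
std::size_t WholeFrames(std::size_t file_bytes, std::size_t frame_bytes) {
    assert(frame_bytes > 0);
    return file_bytes / frame_bytes;
}
```

For 720x576 this gives 622080 bytes per picture, and 1024 stereo s16le samples occupy 4096 bytes. A read that returns fewer bytes than one full unit marks the end of the file.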

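[Figure: stream information of the original sound_in_sync_test.mp4]

II. Flow and Approach

The overall approach is as follows:

1. Put the video YUV data into an AVFrame, then encode it into AVPackets.

2. Write each AVPacket out with av_write_frame(this->_avformatContext,avpacket);. This requires an AVFormatContext, and along with it an AVStream.

3. For the PCM data, an audio resampler (audioresample) should be needed as well.

Design

1. Video encoder class: videoencoder.h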

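/// Design: this class stores yuv_data in an AVFrame, sends the AVFrame to the encoder context, and receives the encoded result as AVPackets.

/// Core function 1: initialize H.264. This means finding the encoder, allocating the encoder context, and setting the context parameters. All five of the following parameters are needed:
/// the YUV width (yuvwidth), the YUV height (yuvheight), the YUV pixel format (AVPixelFormat yuvpix_fmt), the frame rate (int yuvfps), and the video bit rate (video_bit_rate).
///     int InitH264(int yuvwidth, int yuvheight, AVPixelFormat yuvpix_fmt, int yuvfps, int video_bit_rate);
/// Initialization also allocates a member AVFrame and sets its parameters, so that the frame is ready for later use.
/// Unlike the audio encoder, this method sets the AVCodecContext time_base explicitly, to {1, 1000000}. For video, opening the encoder reports "The encoder timebase is not set" if the caller does not set it.

/// Core function 2: turn YUV data into AVPackets.
/// Parameter 1, yuv_data: pointer to one picture of YUV data.
/// Parameter 2, yuv_size: size of yuv_data in bytes.
/// Parameter 3, stream_index: the video stream index. It comes from the AVStream, is passed in through a member of the muxer, and is written into each AVPacket.
/// Parameter 4, pts: the time of this picture, counted in units of parameter 5. With yuv_fps = 25 one frame lasts 1/25 s, that is 1000000 * (1/25) = 40000 microseconds. Mind the unit.
///     The pts of the first YUV picture is then 40 000 microseconds,
///     and the pts of the second YUV picture is 80 000 microseconds.
///     The caller advances pts by one frame duration for each picture it reads: double video_frame_duration = 1.0/yuv_fps * video_time_base;
/// Parameter 5, time_base: denominator of the time base of pts. We use 1000000.
/// Parameters 4 and 5 exist so that the pts of the YUV data can be converted into the AVFrame pts, which is expressed in the AVCodecContext time_base set earlier in InitH264.
/// Note that the video and audio AVCodecContext time bases come from different places:
/// for audio, FFmpeg sets it inside avcodec_open2; for video, the programmer sets it.
/// The resulting packets are appended to the last parameter, a vector<AVPacket *>.
/// A return value below 0 means no packet was produced.
/// int videoencoder::Encode(uint8_t *yuv_data, int yuv_size,int stream_index, int64_t pts, int64_t time_base,std::vector<AVPacket *> &packets)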
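
The timestamp arithmetic described above can be checked without FFmpeg. The sketch below is a simplified stand-in for av_rescale_q (round to nearest, non-negative inputs only, no overflow protection). Rational, RescaleQ and VideoPtsMicros are hypothetical names for illustration, not FFmpeg API, and the 1/44100 audio time base is the value this article gives later for the audio encoder.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for AVRational: the fraction num/den, in seconds.
struct Rational {
    int64_t num;
    int64_t den;
};

// Simplified model of av_rescale_q: converts value from time base `from`
// to time base `to`, rounding to nearest. The real function also guards
// against 64-bit overflow and handles negative values.
int64_t RescaleQ(int64_t value, Rational from, Rational to) {
    assert(value >= 0 && from.num > 0 && from.den > 0 && to.num > 0 && to.den > 0);
    const int64_t mul = from.num * to.den;
    const int64_t div = from.den * to.num;
    return (value * mul + div / 2) / div;
}

// pts of the n-th picture (n = 1, 2, ...) in microseconds, following the
// article's convention that the first picture at 25 fps has pts 40000.
int64_t VideoPtsMicros(int64_t n, int fps) {
    assert(n >= 0 && fps > 0);
    return n * 1000000 / fps;
}
```

With the video encoder time base set to {1, 1000000}, rescaling a pts of 40000 microseconds leaves it at 40000. For audio, about 23219 microseconds (1024 samples at 44100 Hz) become 1024 ticks in a 1/44100 time base.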

videoencoder.h code

#ifndef VIDEOENCODER_H
#define VIDEOENCODER_H


#include <iostream>
using namespace std;
extern "C" {
    #include "libavutil/avassert.h" // #include <...> searches the system include paths; #include "..." searches the directory of the including file first
    #include "libavutil/channel_layout.h"
    #include "libavutil/opt.h"
    #include "libavutil/mathematics.h"
    #include "libavutil/timestamp.h"
    #include "libswscale/swscale.h"
    #include "libswresample/swresample.h"
    #include "libavutil/error.h"
    #include "libavutil/common.h"
    #include "libavcodec/avcodec.h"
    #include "libavformat/avformat.h"
    #include "libavutil/imgutils.h"
}
#include <vector>

class videoencoder
{
public:
    videoencoder();
    ~videoencoder();
    // Finds the H.264 encoder, allocates and configures its context, opens it, then allocates the AVFrame and sets its parameters
    int InitH264(int yuvwidth, int yuvheight, AVPixelFormat yuvpix_fmt, int yuvfps, int video_bit_rate);
    void DeInit();

    /// Encodes one YUV picture and returns at most one AVPacket.
    /// yuv_data:     pointer to the YUV data (nullptr flushes the encoder)
    /// yuv_size:     size of yuv_data in bytes
    /// stream_index: video_index of the AVStream
    /// pts:          time of this picture in units of 1/time_base seconds. With yuv_fps = 25 one frame lasts
    ///               1/25 s = 1000000 * (1/25) = 40000 microseconds, so the first picture has pts 40 000
    ///               and the second 80 000. The caller computes the step as
    ///               double video_frame_duration = 1.0/yuv_fps * video_time_base;
    /// time_base:    denominator of the pts time base, 1000000
    /// The picture is stored in the AVFrame, sent to the encoder context, and the encoded AVPacket is returned
    /// (NULL if no packet is available or on error).
    AVPacket *Encode(uint8_t *yuv_data, int yuv_size,
                     int stream_index, int64_t pts, int64_t time_base);

    /// Takes the same first five parameters as the overload above, but appends every packet
    /// the encoder produces to the last parameter, vector<AVPacket *> packets.
    /// A return value below 0 means no packet was produced.
    int Encode(uint8_t *yuv_data, int yuv_size, int stream_index, int64_t pts, int64_t time_base,
               std::vector<AVPacket *> &packets);
    AVCodecContext *GetCodecContext();

private:
    int _width = 0;
    int _height = 0;
    AVPixelFormat _pix_fmt = AV_PIX_FMT_NONE;
    int _fps = 25;  // default only, never actually used: InitH264 overwrites it with the real value
    int _bit_rate = 500*1024; // default only, never actually used: InitH264 overwrites it with the real value
    int64_t _pts = 0;
    AVCodecContext * _avcodecContext = NULL;
    AVFrame *_avframe = NULL;
    AVDictionary *_avdictionary = NULL;

};

#endif // VIDEOENCODER_H

videoencoder.cpp

#include "videoencoder.h"

videoencoder::videoencoder()
{

}

videoencoder::~videoencoder()
{
    if(this->_avcodecContext) {
        DeInit();
    }
}


int videoencoder::InitH264(int yuvwidth, int yuvheight, AVPixelFormat yuvpix_fmt, int yuvfps, int video_bit_rate)
{
    int ret =0;
    cout <<" videoencoder.InitH264 call yuvwidth = " << yuvwidth
         <<" yuvheight = "<< yuvheight
         <<" yuvpix_fmt = " << yuvpix_fmt
         <<" yuvfps = " <<yuvfps
         <<" video_bit_rate = " << video_bit_rate
         << endl;

    //1. Remember the values passed in as members of videoencoder

    this->_width = yuvwidth;
    this->_height = yuvheight;
    this->_pix_fmt = yuvpix_fmt;
    this->_fps = yuvfps;
    this->_bit_rate = video_bit_rate;

    //2. Find the video encoder
    const AVCodec * avcodec =  avcodec_find_encoder(AV_CODEC_ID_H264);
    if(avcodec == nullptr){
        ret = -1;
        cout<<" func InitH264 error because avcodec_find_encoder(AV_CODEC_ID_H264) error "<<endl;
        return ret;
    }

    //3. Allocate the encoder context for that encoder
    this->_avcodecContext = avcodec_alloc_context3(avcodec);
    if(_avcodecContext == nullptr){
        ret = -1;
        cout<<" func InitH264 error because avcodec_alloc_context3(avcodec) error "<<endl;
        return ret;
    }


    //3.1 Set the encoder context parameters. Besides the three basics (width, height, pixel format), flags must include AV_CODEC_FLAG_GLOBAL_HEADER

    _avcodecContext->width = yuvwidth;
    _avcodecContext->height = yuvheight;
    _avcodecContext->pix_fmt = yuvpix_fmt;
    _avcodecContext->framerate = {yuvfps, 1};


    /// AV_CODEC_FLAG_GLOBAL_HEADER: place global headers in extradata instead of in every keyframe.
    /// Two facts explain why the flag is needed:
    /// 1. AAC inside an MP4 file carries no ADTS header, so when muxing AAC into MP4 no ADTS header may be prepended to each AAC frame.
    /// 2. With AV_CODEC_FLAG_GLOBAL_HEADER the global header goes into extradata; for AAC this means no ADTS header in front of each frame.
    /// The same applies to H.264, which has two storage formats, Annex B and AVCC. MP4 stores H.264 as AVCC: the header (SPS/PPS) is kept once up front and everything after it is plain H.264 data. Encoding H.264 for MP4 should therefore set AV_CODEC_FLAG_GLOBAL_HEADER as well.
    _avcodecContext->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;

    // For video the time base must be set explicitly; otherwise: The encoder timebase is not set
    _avcodecContext->time_base = {1, 1000000};   // unit: microseconds


    /// Set the bit rate
    _avcodecContext->bit_rate = video_bit_rate;

    /// Set the GOP size equal to the fps. It may also be an integer multiple of the fps
    _avcodecContext->gop_size = yuvfps;

    // No B-frames at all
    _avcodecContext->max_b_frames = 0;

    // Extra H.264 encoder options can be passed through _avdictionary
    //    av_dict_set(&_avdictionary, "tune", "zerolatency", 0);


    //4. Open the encoder context
    ret = avcodec_open2(_avcodecContext, NULL, &_avdictionary);
    if(ret != 0) {
        char errbuf[1024] = {0};
        av_strerror(ret, errbuf, sizeof(errbuf) - 1);
        printf("avcodec_open2 failed:%s\n", errbuf);
        return -1;
    }


    //5. Allocate the AVFrame here, once, so that it is ready for every later Encode call

    this->_avframe = av_frame_alloc();
    if(!_avframe) {
        printf("av_frame_alloc failed\n");
        return -1;
    }
    _avframe->width = _width;
    _avframe->height = _height;
    _avframe->format = _avcodecContext->pix_fmt;
    printf("Inith264 success\n");

    return ret;


}

void videoencoder::DeInit()
{
    if(this->_avcodecContext) {
        avcodec_free_context(&_avcodecContext);
    }
    if(this->_avframe) {
        av_frame_free(&_avframe);
    }
    if(this->_avdictionary) {
        av_dict_free(&_avdictionary);
    }
}



/// Encodes one YUV picture and returns at most one AVPacket.
/// yuv_data is the YUV data (nullptr flushes the encoder), yuv_size its size in bytes, stream_index the video_index of the AVStream.
/// pts is the time of the picture in units of 1/time_base seconds: with yuv_fps = 25 and time_base = 1000000 the first picture
/// has pts 40 000 and the second 80 000. The caller computes the step as double video_frame_duration = 1.0/yuv_fps * video_time_base;
AVPacket *videoencoder::Encode(uint8_t *yuv_data,
                               int yuv_size,
                               int stream_index,
                               int64_t pts,
                               int64_t time_base)
{
    if(!this->_avcodecContext) {
        printf("codec_ctx_ null\n");
        return NULL;
    }
    int ret = 0;
    /// Time base conversion. For the first frame pts = 40000.
    /// The video encoder context time_base is set by the user, whereas the audio one is set by FFmpeg inside avcodec_open2.
    /// FFmpeg does not provide a default for video, so it was set earlier in InitH264:
    ///     _avcodecContext->time_base = {1, 1000000}
    /// For the first frame the rescaled pts is (40000 * (1/1000000)) / (1/1000000) = 40000.
    /// The value does not change here, but the conversion keeps Encode correct if the caller's time base ever differs from the encoder's.
    pts = av_rescale_q(pts, AVRational{1, (int)time_base}, _avcodecContext->time_base);
    this->_avframe->pts = pts;


    // Fill the AVFrame with yuv_data. There are two kinds of calls: with real data in yuv_data, or with yuv_data == nullptr to flush the encoder context

    if(yuv_data){
        int ret_size = av_image_fill_arrays(this->_avframe->data, this->_avframe->linesize,
                                            yuv_data, (AVPixelFormat)this->_avframe->format,
                                            this->_avframe->width, this->_avframe->height, 1);
        if(ret_size != yuv_size) {
            printf("ret_size:%d != yuv_size:%d -> failed\n", ret_size, yuv_size);
            return NULL;
        }
        ret = avcodec_send_frame(this->_avcodecContext, this->_avframe);

    } else {
        ret = avcodec_send_frame(this->_avcodecContext, NULL);
    }


    if(ret != 0) {
        char errbuf[1024] = {0};
        av_strerror(ret, errbuf, sizeof(errbuf) - 1);
        printf("avcodec_send_frame failed:%s\n", errbuf);
        return NULL;
    }
    AVPacket *packet = av_packet_alloc();
    ret = avcodec_receive_packet(this->_avcodecContext, packet);
    if(ret != 0) {
        char errbuf[1024] = {0};
        av_strerror(ret, errbuf, sizeof(errbuf) - 1);
        printf("h264 avcodec_receive_packet failed:%s\n", errbuf);
        av_packet_free(&packet);
        return NULL;
    }
    packet->stream_index = stream_index;
    return packet;
}


int videoencoder::Encode(uint8_t *yuv_data, int yuv_size,
                         int stream_index, int64_t pts, int64_t time_base,
                         std::vector<AVPacket *> &packets)
{
    if(!this->_avcodecContext) {
        printf("codec_ctx_ null\n");
        return -1;
    }
    int ret = 0;

    pts = av_rescale_q(pts, AVRational{1, (int)time_base}, this->_avcodecContext->time_base);
    this->_avframe->pts = pts;
    if(yuv_data) {
        int ret_size = av_image_fill_arrays(_avframe->data, _avframe->linesize,
                                            yuv_data, (AVPixelFormat)_avframe->format,
                                            _avframe->width, _avframe->height, 1);
        if(ret_size != yuv_size) {
            printf("ret_size:%d != yuv_size:%d -> failed\n", ret_size, yuv_size);
            return -1;
        }
        ret = avcodec_send_frame(this->_avcodecContext, this->_avframe);
    } else {
        ret = avcodec_send_frame(this->_avcodecContext, NULL);
    }

    if(ret != 0) {
        char errbuf[1024] = {0};
        av_strerror(ret, errbuf, sizeof(errbuf) - 1);
        printf("avcodec_send_frame failed:%s\n", errbuf);
        return -1;
    }
    while(1)
    {
        AVPacket *packet = av_packet_alloc();
        if(!packet) {
            printf("av_packet_alloc failed\n");
            return -1;
        }
        ret = avcodec_receive_packet(this->_avcodecContext, packet);
        if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF) {
            // The encoder needs more input, or it has been fully flushed: not an error
            ret = 0;
            av_packet_free(&packet);
            break;
        } else if (ret < 0) {
            char errbuf[1024] = {0};
            av_strerror(ret, errbuf, sizeof(errbuf) - 1);
            printf("h264 avcodec_receive_packet failed:%s\n", errbuf);
            av_packet_free(&packet);
            ret = -1;
            break;  // leave the loop: the packet has been freed and must not be used
        }
        packet->stream_index = stream_index;
        printf("h264 pts:%lld\n", (long long)packet->pts);
        packets.push_back(packet);
    }
    return ret;
}

AVCodecContext *videoencoder::GetCodecContext()
{
    if(this->_avcodecContext){
        return this->_avcodecContext;
    }
    return nullptr;
}
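
The send/receive pattern that the vector version of Encode relies on can be modelled without FFmpeg. The following is only a toy sketch: ToyEncoder, Status and EncodeAndDrain are hypothetical names, and the three status codes merely stand in for 0, AVERROR(EAGAIN) and AVERROR_EOF. It shows why one input may yield zero, one or several packets, and why the receive loop has to stop on every non-success status.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Toy status codes standing in for 0, AVERROR(EAGAIN) and AVERROR_EOF.
enum Status { kOk = 0, kAgain = -1, kEof = -2 };

// Hypothetical encoder model that holds back `delay` frames before it
// emits anything, similar to an encoder that buffers input internally.
class ToyEncoder {
public:
    explicit ToyEncoder(std::size_t delay) : delay_(delay) {}

    // has_frame == false plays the role of avcodec_send_frame(ctx, NULL).
    Status Send(bool has_frame, int pts) {
        if (flushing_) return kEof;
        if (has_frame) queue_.push_back(pts); else flushing_ = true;
        return kOk;
    }

    Status Receive(int *pts) {
        assert(pts != nullptr);
        if (queue_.size() > delay_ || (flushing_ && !queue_.empty())) {
            *pts = queue_.front();
            queue_.pop_front();
            return kOk;
        }
        return flushing_ ? kEof : kAgain;
    }

private:
    std::size_t delay_;
    bool flushing_ = false;
    std::deque<int> queue_;
};

// Sends one frame (or a flush request), then drains every packet that is
// ready. Returns 0 on success and a negative value on a real error.
int EncodeAndDrain(ToyEncoder &enc, bool has_frame, int pts, std::vector<int> &packets) {
    if (enc.Send(has_frame, pts) != kOk) return -1;
    for (;;) {
        int out = 0;
        const Status st = enc.Receive(&out);
        if (st == kAgain || st == kEof) return 0;  // nothing more for now
        if (st != kOk) return -1;                  // stop on any other error
        packets.push_back(out);
    }
}
```

With a delay of two frames, the first two calls produce no packet, the third produces one, and the flush call releases the rest. The real encoder behaves the same way in spirit, which is why the program must call Encode once more with yuv_data == nullptr at the end of the file.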

2. Audio resampler class: audioresample.h

audioresample.h

#ifndef AUDIORESAMPLER_H
#define AUDIORESAMPLER_H
extern "C" {
#include "libavutil/channel_layout.h"
#include "libavutil/samplefmt.h"
#include "libswresample/swresample.h"
}
#include <iostream>
using namespace std;

/// This class converts the raw PCM data into a PCM format that FFmpeg's built-in AAC encoder can process.
///
/// Core function 1: initialize the SwrContext. In effect it combines swr_alloc_set_opts2 and swr_init.
///
/// The required parameters are therefore const AVChannelLayout *out_ch_layout, enum AVSampleFormat out_sample_fmt, int out_sample_rate,
/// const AVChannelLayout *in_ch_layout, enum AVSampleFormat  in_sample_fmt, int  in_sample_rate,
/// so InitResampler needs at least these six parameters. It also needs a SwrContext instance, for which we use a member variable of the class.
///     int InitResampler(const AVChannelLayout *out_ch_layout, enum AVSampleFormat out_sample_fmt, int out_sample_rate,
///                       const AVChannelLayout *in_ch_layout, enum AVSampleFormat  in_sample_fmt, int  in_sample_rate);
/// /**
///* Allocate SwrContext if needed and set/reset common parameters.
///*
///* This function does not require *ps to be allocated with swr_alloc(). On the
///* other hand, swr_alloc() can use swr_alloc_set_opts2() to set the parameters
///* on the allocated context.
///*
///* @param ps              Pointer to an existing Swr context if available, or to NULL if not.
///*                        On success, *ps will be set to the allocated context.
///* @param out_ch_layout   output channel layout (e.g. AV_CHANNEL_LAYOUT_*)
///* @param out_sample_fmt  output sample format (AV_SAMPLE_FMT_*).
///* @param out_sample_rate output sample rate (frequency in Hz)
///* @param in_ch_layout    input channel layout (e.g. AV_CHANNEL_LAYOUT_*)
///* @param in_sample_fmt   input sample format (AV_SAMPLE_FMT_*).
///* @param in_sample_rate  input sample rate (frequency in Hz)
///* @param log_offset      logging level offset
///* @param log_ctx         parent logging context, can be NULL
///*
///* @see swr_init(), swr_free()
///* @return 0 on success, a negative AVERROR code on error.
///*         On error, the Swr context is freed and *ps set to NULL.
///*/
///int swr_alloc_set_opts2(struct SwrContext **ps,
///                       const AVChannelLayout *out_ch_layout, enum AVSampleFormat out_sample_fmt, int out_sample_rate,
///                       const AVChannelLayout *in_ch_layout, enum AVSampleFormat  in_sample_fmt, int  in_sample_rate,
///                       int log_offset, void *log_ctx);


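/// Core function 2: create the input buffer, which holds the raw PCM data, and the output buffer, which holds PCM in the format our audioencoder supports.
/// How is the input buffer created, and from which parameters?
/// It is created from the three basic properties of the input audio, although the sample rate (for example 44100) is not used. And where does the allocated buffer go?

/// Part 1: create the input buffer
///int av_samples_alloc_array_and_samples(uint8_t ***audio_data,
/// int *linesize,
/// int nb_channels, // number of channels, 2
/// int nb_samples, // samples per channel in one frame, 1152
/// enum AVSampleFormat sample_fmt, // sample format, s16le
/// int align);
///
/// Parameter 1, audio_data: address of the input buffer. It is a triple pointer, in effect the address of a double pointer, and is an out parameter.
/// Why a triple pointer? Because it is an out parameter, what the function modifies is the double pointer whose address we pass in.
/// Think of that double pointer as uint8_t * audiodata[8], where each audiodata[i] points to the data of one plane.
/// The extra level exists to support planar formats. Without planar data a double pointer would be enough.
///
/// Parameter 2, linesize: the aligned size of the audio buffer; may be NULL. Out parameter.
/// linesize is the size of each audio_data[x], not the size of the whole input buffer.
/// The documentation does not state this clearly, but it shows up when debugging.
/// For 2 channels, 1024 samples, s16le (2 bytes per sample), interleaved:
/// linesize = 2 * 1024 * 2 = 4096 bytes.
/// For 2 channels, 1024 samples, s16le (2 bytes per sample), planar:
/// linesize = 2 * 1024 = 2048 bytes (no factor for the 2 channels), because in planar mode audio_data[0] holds LLLLLLLLL and audio_data[1] holds RRRRRRRR.
/// In interleaved mode all PCM data is stored in audio_data[0], so linesize is the total size.
/// Parameter 3, nb_channels: number of channels of the input.
/// Parameter 4, nb_samples: number of samples per channel of the input; 1024 for AAC. In other words one AAC frame holds 1024 samples per channel. Do not confuse this with the sample rate: a sample rate of 44100 means 44100 samples are captured per second.
/// Parameter 5, sample_fmt: the AVSampleFormat of the input, for example AV_SAMPLE_FMT_DBL.
/// Parameter 6, align: buffer size alignment; 0 means aligned (the default), 1 means no alignment. Normally use alignment.
/// Why are parameters 3, 4 and 5 enough to compute the input buffer size?
/// Recall: size of one frame = channels * samples per channel * size of one sample.
/// These are exactly parameters 3, 4 and 5 (parameter 5 gives the size of one sample). The implementation presumably multiplies them and then applies the alignment.

/// Part 2: create the output buffer
/// The input buffer size was computed from three values:
/// Parameter 3, nb_channels: number of channels of the input.
/// Parameter 4, nb_samples: number of samples per channel of the input; one AAC frame is 1024 samples.
/// Parameter 5, sample_fmt: the AVSampleFormat of the input, for example AV_SAMPLE_FMT_DBL.
/// What should these three values be for the output buffer?
/// Parameter 3, nb_channels: number of output channels. It is fixed before the code is written.
/// Say the goal is to convert  -ar 44100 -ac 2 -f f32le  into  -ar 48000 -ac 1 -f s16le.
/// Then the output nb_channels is 1 and the output sample_fmt is AV_SAMPLE_FMT_S16.
/// Parameter 4, nb_samples: number of samples per channel of the output. This value changes because the sample rate changes.
/// Whatever the conversion, a 2-minute song must still be a 2-minute song afterwards; the duration cannot change.
/// With that in mind the rest is easy to follow.
/// Going from 44100 to 48000: before, 44100 samples were captured per second, and one frame of 1024 samples lasts 1024/44100 seconds.
/// At 48000 the duration must stay the same, so the only thing that can change is the number of samples.
/// Hence parameter 4 follows from: 1024/44100 = nb_samples/48000
/// nb_samples = (1024/44100) * 48000 = 1024 * 48000 / 44100, that is: output samples per channel = output sample rate * input samples per channel / input sample rate
/// Use the FFmpeg API av_rescale_rnd for this, with AV_ROUND_UP as the rounding (fourth) argument:
/// int64_t av_rescale_rnd(int64_t a, int64_t b, int64_t c, enum AVRounding rnd) av_const;

/// Based on this we design the following API. The three properties of the input PCM and of the output PCM are saved in member variables during initialization, so they need not appear in the API.
///int resamples_alloc_inbuffer_and_outbuffer(uint8_t ***in_audio_data,
///                                           int *in_linesize,
///                                           uint8_t ***out_audio_data,
///                                           int *out_linesize);

/// To recap: the goal is to produce PCM that the AAC encoder can process, and from it an AVFrame the encoder accepts.
/// We know the resampler output nb_samples must be 1024, so the input nb_samples has to be computed from it (this is only my own guess; I am not fully sure of it and it still needs testing):
/// input samples per channel = input sample rate * output samples per channel / output sample rate
/// Here the input sample rate is 44100 and the output samples per channel is 1024. The output sample rate is ours to choose; for convenience we also use 44100 (48000 or any other rate supported by FFmpeg's AAC encoder would work too).


/// Core function 3
///
/// The input and output buffers now exist. Data read from the source file is placed in the input buffer and converted with swr_convert:
/// int swr_convert(struct SwrContext *s, uint8_t **out, int out_count, const uint8_t **in , int in_count);
/// Assuming the conversion succeeds, the data ends up in uint8_t **out (out_count samples per channel). How do we turn that into an AVFrame?
/// Allocate a frame with AVFrame *av_frame_alloc(void); and give it buffers with int av_frame_get_buffer(AVFrame *frame, int align);
/// The three properties of the frame must match those of our audioencoder.
/// The frame's uint8_t *data[AV_NUM_DATA_POINTERS] then plays the role of out,
/// and the frame's nb_samples plays the role of out_count (linesize[] is the plane size in bytes, not a sample count).
/// For convenience, swr_convert can therefore be called directly on the frame:
/// int swr_convert(struct SwrContext *s, avframe->data, avframe->nb_samples, const uint8_t **in , int in_count);
/// In other words, prepare an AVFrame to receive the converted data before converting.
/// Its properties must match the audioencoder: AVSampleFormat must be AV_SAMPLE_FMT_FLTP and out_nb_samples must be 1024.
///
/// AVFrame *AllocFltpPcmFrame(AVChannelLayout out_ch_layout, AVSampleFormat out_sample_fmt, int out_sample_rate, int out_nb_samples);


/// Core function 4: the actual conversion
/// At this point swr is initialized, the input and output buffers exist, and AllocFltpPcmFrame has produced an AVFrame. What remains is to convert the data and place it in the AVFrame.
/// int swr_convert(struct SwrContext *s, uint8_t **out, int out_count, const uint8_t **in , int in_count);

///int swr_convert_and_fill_avframe(uint8_t **out, int out_count, const uint8_t **in , int in_count, AVFrame * outavframe);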



class AudioResampler
{
public:
    AudioResampler();
    ~AudioResampler();

//AVFrame
    int InitResampler(const AVChannelLayout *out_ch_layout,
                      enum AVSampleFormat out_sample_fmt,
                      int out_sample_rate,
                      const AVChannelLayout *in_ch_layout,
                      enum AVSampleFormat  in_sample_fmt,
                      int  in_sample_rate);
    void DeInitResampler();

    int resamples_alloc_inbuffer_and_outbuffer(uint8_t ***in_audio_data,
                                               int *in_linesize,
                                               uint8_t ***out_audio_data,
                                               int *out_linesize);
    AVFrame *AllocFltpPcmFrame(AVChannelLayout out_ch_layout, AVSampleFormat out_sample_fmt, int out_sample_rate,int out_nb_samples);


    int swr_convert_and_fill_avframe(uint8_t **out, int out_count, const uint8_t **in , int in_count, AVFrame * outavframe);

    int swr_convert_and_fill_avframe_changed( AVFrame * avframe, const uint8_t **in , int in_count);

    void FreeFltpavframe(AVFrame * avframe);
public:
    SwrContext* _swrContext = nullptr;

    const AVChannelLayout *_out_ch_layout = nullptr;
    AVSampleFormat _out_sample_fmt = AV_SAMPLE_FMT_NONE;
    int _out_sample_rate = 0;

    const AVChannelLayout *_in_ch_layout = nullptr;
    enum AVSampleFormat  _in_sample_fmt = AV_SAMPLE_FMT_NONE;
    int  _in_sample_rate = 0;

    // Input and output buffers for resampling
    uint8_t **in_audio_data_buf = nullptr;
    int in_linesize_buf = 0;
    uint8_t **out_audio_data_buf = nullptr;
    int out_linesize_buf = 0;

};

#endif // AUDIORESAMPLER_H
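
The sample-count rule from the header comments (the duration must stay the same, so the number of samples scales with the sample rate) can be worked through in a few lines. This is a sketch that models av_rescale_rnd with AV_ROUND_UP for non-negative inputs only; RescaleRoundUp and OutputSamples are hypothetical helper names, not FFmpeg functions.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model of av_rescale_rnd(a, b, c, AV_ROUND_UP):
// computes ceil(a * b / c) for non-negative a, b and positive c.
int64_t RescaleRoundUp(int64_t a, int64_t b, int64_t c) {
    assert(a >= 0 && b >= 0 && c > 0);
    return (a * b + c - 1) / c;
}

// Samples per channel needed at out_rate to cover the same duration as
// in_samples samples at in_rate.
int64_t OutputSamples(int64_t in_samples, int in_rate, int out_rate) {
    return RescaleRoundUp(in_samples, out_rate, in_rate);
}
```

For 1024 samples at 44100 Hz resampled to 48000 Hz, 1024 * 48000 / 44100 is about 1114.56, which rounds up to 1115. When both rates are 44100, as in this program, the count stays at 1024.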

audioresample.cpp

#include "audioresampler.h"

// This class handles audio resampling
AudioResampler::AudioResampler()
{

}

AudioResampler::~AudioResampler()
{
    DeInitResampler();
}

// Formats the FFmpeg error code held in `ret` and prints it
#define ERROR_BUF \
    char errbuf[1024]; \
    av_strerror(ret, errbuf, sizeof (errbuf)); \
    cout << "error: " << errbuf << endl;

int AudioResampler::InitResampler(const AVChannelLayout *out_ch_layout,
                                  enum AVSampleFormat out_sample_fmt,
                                  int out_sample_rate,
                                  const AVChannelLayout *in_ch_layout,
                                  enum AVSampleFormat  in_sample_fmt,
                                  int  in_sample_rate){
    cout << "InitResampler func "
         << " out_ch_layout = " << out_ch_layout
         << " out_sample_fmt = "<< out_sample_fmt
         << " out_sample_rate = " << out_sample_rate
         <<" in_ch_layout = " << in_ch_layout
        <<" in_sample_fmt = " << in_sample_fmt
       <<" in_sample_rate = " << in_sample_rate
      << endl;

    int ret = 0;
    this->_out_ch_layout = out_ch_layout;
    this->_out_sample_fmt = out_sample_fmt;
    this->_out_sample_rate = out_sample_rate;

    this->_in_ch_layout = in_ch_layout;
    this->_in_sample_fmt = in_sample_fmt;
    this->_in_sample_rate = in_sample_rate;
    ret = swr_alloc_set_opts2(&this->_swrContext,
                              out_ch_layout,
                              out_sample_fmt,
                              out_sample_rate,
                              in_ch_layout,
                              in_sample_fmt,
                              in_sample_rate,
                              0,nullptr);
    if(ret < 0){
        cout << "InitResampler func error because swr_alloc_set_opts2 error "
             << " out_ch_layout = " << out_ch_layout
             << " out_sample_fmt = "<< out_sample_fmt
             << " out_sample_rate = " << out_sample_rate
             <<" in_ch_layout = " << in_ch_layout
            <<" in_sample_fmt = " << in_sample_fmt
           <<" in_sample_rate = " << in_sample_rate
          << endl;
        ERROR_BUF;
        return ret;
    }

    ret = swr_init(this->_swrContext);
    if(ret < 0){
        cout<<"swr_init error "<<endl;
        ERROR_BUF;
        return ret;
    }
    return ret ;
}

void AudioResampler::DeInitResampler()
{
    if(this->_swrContext){
        swr_free(&this->_swrContext);
        this->_swrContext = nullptr;
    }
}


int AudioResampler::resamples_alloc_inbuffer_and_outbuffer(uint8_t ***in_audio_data,
                                           int *in_linesize,
                                           uint8_t ***out_audio_data,
                                           int *out_linesize){
    int ret = 0;
    int out_nb_number = 1024; // each output frame must hold 1024 samples per channel, as FFmpeg's AAC encoder requires
    /// input samples per channel = input sample rate * output samples per channel / output sample rate

    int in_nb_samples = (int)av_rescale_rnd(this->_in_sample_rate,out_nb_number,this->_out_sample_rate,AV_ROUND_UP);
//    int av_samples_alloc_array_and_samples(uint8_t ***audio_data, int *linesize, int nb_channels,
//                                           int nb_samples, enum AVSampleFormat sample_fmt, int align);
     ret = av_samples_alloc_array_and_samples(in_audio_data,
                                              in_linesize,
                                              this->_in_ch_layout->nb_channels,
                                              in_nb_samples,
                                              this->_in_sample_fmt,
                                              0);
     if(ret <0 ){
         cout<<"av_samples_alloc_array_and_samples inbuffer error"<<endl;
         ERROR_BUF;
         return ret;
     }

     ret = av_samples_alloc_array_and_samples(out_audio_data,out_linesize,
                                        this->_out_ch_layout->nb_channels,
                                        out_nb_number,
                                        this->_out_sample_fmt,
                                        0);
     if(ret <0 ){
         cout<<"av_samples_alloc_array_and_samples outbuffer error"<<endl;
         ERROR_BUF;
         return ret;
     }

    return ret;
}

AVFrame * AudioResampler::AllocFltpPcmFrame(AVChannelLayout out_ch_layout, AVSampleFormat out_sample_fmt, int out_sample_rate,int out_nb_samples){

    AVFrame * fltppcmavframe = av_frame_alloc();
    if(fltppcmavframe == nullptr){
        cout<<"AllocFltpPcmFrame func error because av_frame_alloc error"<<endl;
        return nullptr;
    }
    // Deep-copy the layout so that the frame owns its own copy
    int ret = av_channel_layout_copy(&fltppcmavframe->ch_layout, &out_ch_layout);
    if(ret < 0){
        cout<<"AllocFltpPcmFrame func error because av_channel_layout_copy error"<<endl;
        av_frame_free(&fltppcmavframe);
        return nullptr;
    }
    fltppcmavframe->format = out_sample_fmt;
    fltppcmavframe->sample_rate = out_sample_rate;
    fltppcmavframe->nb_samples = out_nb_samples;
    // Allocate the sample buffers according to the parameters set above
    ret  = av_frame_get_buffer(fltppcmavframe, 0);
    if(ret < 0 ){
        ERROR_BUF;
        cout<<"AllocFltpPcmFrame func error because av_frame_get_buffer error"<<endl;
        av_frame_free(&fltppcmavframe);  // do not leak the frame on failure
        return nullptr;
    }

    return fltppcmavframe;
}

int AudioResampler::swr_convert_and_fill_avframe_changed( AVFrame * avframe, const uint8_t **in , int in_count){
    // Converts in_count input samples per channel straight into the frame's own buffers
    int realconvert =  swr_convert(this->_swrContext,avframe->data,avframe->nb_samples,in,in_count);
    if(realconvert < 0 ){
        int ret = realconvert;
        ERROR_BUF;
        cout<<"swr_convert_and_fill_avframe_changed func error because swr_convert error"<<endl;
        return ret;  // report the failure to the caller
    }
    return 0;
}
int AudioResampler::swr_convert_and_fill_avframe(
                                                 uint8_t **out, int out_count,
                                                 const uint8_t **in , int in_count,
                                                 AVFrame * outavframe){
   // Converts at most in_count samples per channel from in into out, which has room for out_count samples per channel
   int realconvert =  swr_convert(this->_swrContext,out,out_count,in,in_count);
   if(realconvert < 0 ){
       int ret = realconvert;
       ERROR_BUF;
       cout<<"swr_convert_and_fill_avframe func error because swr_convert error"<<endl;
       return ret;  // report the failure to the caller
   }

   // realconvert is the number of samples per channel actually written; linesize is in bytes
   int bytes_per_sample = av_get_bytes_per_sample((AVSampleFormat)outavframe->format);
   int isplanner = av_sample_fmt_is_planar((AVSampleFormat)outavframe->format);
   if(isplanner == 1){
       // FFmpeg's AAC encoder uses a planar format: one plane per channel, all of the same size
       for(int i=0; i< outavframe->ch_layout.nb_channels; ++i){
           outavframe->data[i] = out[i];
           outavframe->linesize[i] = realconvert * bytes_per_sample;
       }
   }else {
       // Interleaved: only out[0] holds data, for all channels
       outavframe->data[0] = out[0];
       outavframe->linesize[0] = realconvert * bytes_per_sample * outavframe->ch_layout.nb_channels;
   }

   return 0;

}

void AudioResampler::FreeFltpavframe(AVFrame * avframe){
    if(avframe)
       av_frame_free(&avframe);
}
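
The linesize figures quoted in the audioresample.h comments (4096 bytes interleaved, 2048 bytes planar) follow from a short calculation. The sketch below ignores alignment padding and uses hypothetical names (BufferShape, AudioBufferShape); it is meant as a way to sanity-check the values seen in the debugger, not as a replacement for av_samples_alloc_array_and_samples.

```cpp
#include <cassert>
#include <cstddef>

struct BufferShape {
    std::size_t planes;       // number of data pointers in use
    std::size_t plane_bytes;  // size of each plane, i.e. the reported linesize
};

// Layout of an audio buffer, ignoring alignment padding.
BufferShape AudioBufferShape(std::size_t channels, std::size_t samples_per_channel,
                             std::size_t bytes_per_sample, bool planar) {
    assert(channels > 0 && bytes_per_sample > 0);
    if (planar) {
        // One plane per channel; each plane holds only that channel's samples.
        return {channels, samples_per_channel * bytes_per_sample};
    }
    // Interleaved: a single plane holds the samples of all channels.
    return {1, channels * samples_per_channel * bytes_per_sample};
}
```

Stereo s16le with 1024 samples gives one plane of 4096 bytes when interleaved and two planes of 2048 bytes when planar. The fltp output that the AAC encoder expects uses 4-byte samples, so each of its two planes is 4096 bytes.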

3. Audio encoder class: audioencoder.h

audioencoder.h

#ifndef AUDIOENCODER_H
#define AUDIOENCODER_H

/// 设计思路，该类主要的目的是将 avframe数据变成 aac 的avpacket。
///
/// 核心函数1，编码函数 encode：
/// 那么该用于编码的函数，假设我们叫做Encode函数，第一个参数就应该是 avframe，返回值理论上可以是avpacket，或者参数中有一个 AVPacket *，

/// 1.1我们知道一个avframe在 send到编码器上下文的时候，调用函数为    ret = avcodec_send_frame(this->_avcodecContext,frame);
/// 然后我们再调用        ret = avcodec_receive_packet(this->_avcodecContext, avpacket); 这时候 aac 编码的数据就到了 avpacket中了
/// 我们还知道，一个avframe 在send到编码器后，可能 avcodec_receive_packet 要多次才能接受完成。
/// 因此前面Encode函数中的参数要改动一下，不能是返回一个avpacket，应该是返回 一堆 avpacket，我们这里使用vector<AVPacket *>就可以，或者说参数中不能只有一个 AVPacket *，而应该是有 vector<AVPacket *> &packets

/// 1.2那么返回的这个的vector<AVPacket *> 中的AVPacket需不需要什么参数呢？我们可以想象一下，我们这里的aac会变成一堆avpacket，那么h264也会变成一堆avpacket，当返回这些AVPacket 都是会给 AVStream ，那么AVStream一定需要区分哪些是aac的avpacket，哪些是h264的avpacket
/// 因此这里avpacket是需要 一个 stream_index，表明自己是 aac的 stream_index,还是 h264的stream_index，那么这个参数怎么传递的呢？因此encode方法应该还有一个参数就做 stream_index
/// 而且最后需要执行     packet->stream_index = stream_index;

/// 1.3 Timestamps
/// PCM stage:
/// While the data is still raw PCM we compute how long 1024 sample frames last: 1024 / 44100 * 1,000,000. The factor 1,000,000 converts to the microsecond time base.
/// Every time 1024 samples are read from the PCM file, pts += 1024 / 44100 * 1,000,000.
/// In short: at the PCM stage the pts advances by 1024 / 44100 * 1,000,000 per read, in a time base of 1/1,000,000.

/// AVFrame stage:
/// When the samples are stored in an AVFrame, its pts must be expressed in the AVFrame's time base.
/// That time base is the time_base of the AVCodecContext. Where does it come from? For an audio encoder whose time_base is left unset, FFmpeg fills it in while opening the codec (avcodec_open2) as the reciprocal of avcodecContext->sample_rate, i.e. 1/44100.
/// sample_rate itself has to be set by the user, which is why it must be set on the encoder context before encoding.
/// We now know three values: the PCM pts, the PCM time base and the AVFrame time base. To get the AVFrame pts:
///  pcm pts * pcm time base = avframe pts * avframe time base (the AVCodecContext time_base)
/// so:  avframe pts = pcm pts * pcm time base / avframe time base (the AVCodecContext time_base)
/// To avoid integer overflow and rounding problems, use the helper FFmpeg provides: av_rescale_q(pts, AVRational{1, (int)time_base}, this->_avcodecContext->time_base)
/// In essence av_rescale_q multiplies the first argument by the second and divides by the third.

/// AVPacket stage:
/// What are the pts, dts and duration of the AVPacket that finally comes out?
/// At this point they are simply carried over from the AVFrame by avcodec_receive_packet(this->_avcodecContext, avpacket), so they are still in the codec time base.
/// That is not what the muxer needs. Recall that an AVFrame holds uncompressed PCM and an AVPacket holds the compressed data (AAC here).
/// The packet timestamps must therefore be converted from the AVFrame time base (the AVCodecContext time_base) to the time base the packet is written with, which is the AVStream time_base.
/// How is the AVStream time_base determined? It depends on the container format (the muxer), not on the encoder. It is final once avformat_write_header(fmt_ctx_, NULL); has returned, because the muxer may change it while writing the header.
/// old packet pts (copied from the avframe) * codec time base = final packet pts * stream time base
/// so:  final packet pts = old packet pts * codec time base / stream time base
/// The corresponding code is in sendPacket in muxer.cpp.
///
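/// Notes on time bases collected from other write-ups (recorded as found, not all verified):
/// time_base in the different structs
///1. AVStream time_base is in seconds. Its value depends on the format. MPEG, for example, stamps pts and dts with a 90 kHz clock, so one tick is 1/90000 s.
///2. AVCodecContext time_base is also in seconds, usually coarser than AVStream->time_base; for video it is typically 1/framerate.
///3. AVPacket pts and dts are in units of AVStream->time_base (so the numbers are comparatively large).
///4. In AVFrame, pkt_pts and pkt_dts are copied from the AVPacket and are also in AVStream->time_base units, while pts is meant for output (display) and is in AVCodecContext->time_base units.
///5. In the InputStream of the ffmpeg tool, pts and dts are in AV_TIME_BASE units (microseconds), possibly to avoid floating point.
///6. The OutputStream involves audio/video sync and is structured differently from InputStream. It is only noted here, not analysed.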

///1.4 Back to the parameters of Encode. Given the timing discussion above, it needs two more: the PCM pts and the PCM time base.

///1.5 The resulting prototype:
///int Encode(AVFrame *frame, int stream_index, int64_t pcmpts, int64_t pcmtime_base, vector<AVPacket *> &packets);
///It is worth recalling the InitAAC parameters here. They describe what the encoder context accepts, so resampling causes no problem: even if the resampler changes the sample format or sample rate, the AVFrame passed to Encode only has to match the encoder context.



///Core function 2: InitAAC. It sets up the encoder and the encoder context. Its parameters describe the PCM the encoder will consume (sample format, channel layout, sample rate), and the last one is the bit rate.
/// We need an encoder and an encoder context. Because we use FFmpeg's built-in AAC encoder, the only supported sample format is AV_SAMPLE_FMT_FLTP,
/// and the sample rate and channel layout are restricted as well.
/// InitAAC therefore takes four parameters: AVChannelLayout, AVSampleFormat, int samplerate, int bit_rate

///2.1 Strictly speaking the AVSampleFormat parameter is redundant, because the encoder is always FFmpeg's native AAC encoder, which only supports AV_SAMPLE_FMT_FLTP.
/// Initialising it with AV_SAMPLE_FMT_S16 is bound to fail.
/// What are AVChannelLayout, int samplerate and int bit_rate used for?
///     const AVCodec * avcodec =  avcodec_find_encoder(AV_CODEC_ID_AAC);
///     this->_avcodecContext  = avcodec_alloc_context3(avcodec);
/// They fill in the key fields of the AVCodecContext:
/// _avcodecContext->bit_rate = bit_rate;
/// _avcodecContext->sample_fmt = avsampleformat;
/// _avcodecContext->sample_rate = samplerate;
/// _avcodecContext->ch_layout = avchannel;
/// A more careful implementation would, after avcodec_find_encoder(AV_CODEC_ID_AAC);, check the arguments against the capability lists of the returned const AVCodec * avcodec (listed below; supported_framerates applies to video only) and warn the user or exit if they do not fit.
/// const AVRational *supported_framerates; ///< array of supported framerates, or NULL if any, array is terminated by {0,0}
/// const int *supported_samplerates;       ///< array of supported audio samplerates, or NULL if unknown, array is terminated by 0
/// const enum AVSampleFormat *sample_fmts; ///< array of supported sample formats, or NULL if unknown, array is terminated by -1
/// The current program skips this validation.

/// 2.2 For convenience the AudioEncoder class keeps the arguments in member variables,
/// and it also keeps the AVCodecContext, which is needed everywhere.


/// 3 Core function: number of samples per channel in one frame. Prototype: int AudioEncoder::GetFrameSize(); it simply returns avctx->frame_size

///We never set frame_size on the AVCodecContext ourselves, so why does it have a value here?
///Because it is set when InitAAC opens the encoder context, inside avcodec_open2.
///That raises a question: avcodec_open2 is generic FFmpeg code, while frame_size is specific to the encoder (1024 for AAC, 1152 for MP3). How do the two connect?
/// 3.1 Start from avcodec_find_encoder (or avcodec_find_decoder),
/// defined in D:\Ctool\yinshipin\ffmpeg-6.0source\libavcodec\allcodecs.c.
///
/// const AVCodec *avcodec_find_encoder(enum AVCodecID id)
/// {
///     return find_codec(id, av_codec_is_encoder);
/// }
///
/// 3.2 The core of find_codec is av_codec_iterate
///
/// 3.3 av_codec_iterate walks codec_list:     const FFCodec *c = codec_list[i];
///
/// 3.4 codec_list can contain several AAC entries; FFmpeg's native AAC encoder is ff_aac_encoder
///
/// 3.5 Searching the source for ff_aac_encoder leads to D:\Ctool\yinshipin\ffmpeg-6.0source\libavcodec\aacenc.c
/// const FFCodec ff_aac_encoder = {
/// .p.name         = "aac",
/// CODEC_LONG_NAME("AAC (Advanced Audio Coding)"),
/// .p.type         = AVMEDIA_TYPE_AUDIO,
/// .p.id           = AV_CODEC_ID_AAC,
/// .p.capabilities = AV_CODEC_CAP_DR1 | AV_CODEC_CAP_DELAY |AV_CODEC_CAP_SMALL_LAST_FRAME,
/// .priv_data_size = sizeof(AACEncContext),
/// .init           = aac_encode_init,
/// FF_CODEC_ENCODE_CB(aac_encode_frame),
/// .close          = aac_encode_end,
/// .defaults       = aac_encode_defaults,
/// .p.supported_samplerates = ff_mpeg4audio_sample_rates,
/// .caps_internal  = FF_CODEC_CAP_INIT_CLEANUP,
/// .p.sample_fmts  = (const enum AVSampleFormat[]){ AV_SAMPLE_FMT_FLTP, AV_SAMPLE_FMT_NONE },
/// .p.priv_class   = &aacenc_class,
/// };
///
/// 3.6 How does it get invoked?
/// ret = avcodec_open2(avcodecContext,nullptr,nullptr); does this internally:
/// if (codec2->init) {
/// which ends up calling
/// static av_cold int aac_encode_init(AVCodecContext *avctx)
/// and aac_encode_init sets     avctx->frame_size = 1024;




#include <iostream>
using namespace std;
extern "C" {
    #include "libavutil/avassert.h" // #include <...> searches the system include paths; #include "..." searches the directory of the including file first
    #include "libavutil/channel_layout.h"
    #include "libavutil/opt.h"
    #include "libavutil/mathematics.h"
    #include "libavutil/timestamp.h"
    #include "libswscale/swscale.h"
    #include "libswresample/swresample.h"
    #include "libavutil/error.h"
    #include "libavutil/common.h"
    #include "libavcodec/avcodec.h"
    #include "libavformat/avformat.h"
}
#include <vector>

#define ERROR_BUF \
    char errbuf[1024]; \
    av_strerror(ret, errbuf, sizeof (errbuf));

#define CODE(func, code) \
    if (ret < 0) { \
        ERROR_BUF; \
        cout << #func << "error" << errbuf; \
        code; \
    }

class AudioEncoder
{
public:
    AudioEncoder();
    ~AudioEncoder();

    /// Initialise the AAC encoder. aacavsampleformat must be AV_SAMPLE_FMT_FLTP for FFmpeg's native AAC encoder.
    int InitAAC(AVChannelLayout avchannel, AVSampleFormat aacavsampleformat,int aacsamplerate, int aacbit_rate);
//    int InitAAC(AVChannelLayout avchannel, int samplerate,int bit_rate); // variant without the redundant AVSampleFormat parameter

    // Reserved for MP3 encoding (declared only, not implemented)
    int InitMP3(AVChannelLayout avchannel, AVSampleFormat avsampleformat,int samplerate);

    // Release resources
    int Deinit();


    // Number of samples per channel in one frame
    int GetFrameSize();

    // Sample format the encoder expects
    int GetSampleFormat();

///  Encode an AVFrame (one frame of audio) into an AVPacket.
//   The caller advances pts by: double audio_frame_duration = 1.0 * audio_encoder.GetFrameSize()/pcm_sample_rate*audio_time_base;
//   1024 / 44100 * 1000000 = 23219.95464852608 microseconds, so one frame lasts about 23219 microseconds at the PCM stage


    /**
     * @brief Encode converts an AVFrame into an AVPacket
     * @param frame  the AVFrame to encode; note that this is the frame after audio resampling
     * @param stream_index  index of the stream (audio or video) the packet belongs to
     * @param pts          elapsed PCM time
     * @param time_base    denominator of the PCM time base (1000000 means microseconds)
     * pts and time_base, together with the AVFrame time base (which is the AVCodecContext time_base), are used to compute the AVFrame pts.
     * @return nullptr on failure, otherwise an AVPacket. Its pts is carried over from the AVFrame, so at this point it is still in the codec time base and not yet the final pts
     * Note that pts refers to the original PCM, while frame is the resampled frame.
     * Question: suppose the original PCM is 48000 Hz and pts advances by the duration of 1024 samples per read.
     * If the resampled PCM is 44100 Hz, should pts still advance by 1024/48000 * 1000000?
     * Answer: it works the same way. Resampling changes 48000 to 44100, and the AVCodecContext time_base follows the sample rate the encoder processes, so 44100 is the rate that should be used there.
     *
     */
    AVPacket *Encode(AVFrame *frame , int stream_index, int64_t pts, int64_t time_base);
    int Encode(AVFrame *frame, int stream_index, int64_t pts, int64_t time_base,
               vector<AVPacket *> &packets);


    AVCodecContext *GetCodecContext();
    AVChannelLayout GetChannelLayout();
    int GetSampleRate();

private:
    // PCM channel layout (-ac 2). AVChannelLayout has no "none" value, so stereo serves as a placeholder default. It is never used as such: the real value is passed in through InitAAC
    AVChannelLayout _avchannel = AV_CHANNEL_LAYOUT_STEREO;

    // PCM sample format (-f s16le)
    AVSampleFormat _avsampleFormat = AV_SAMPLE_FMT_NONE;

    // PCM sample rate (-ar 44100)
    int _samplerates = 0;

    // Bit rate of the AAC encoder
    int _bitrate = 0; // placeholder default, the real value is passed in through InitAAC


    // Encoder context
    AVCodecContext * _avcodecContext = nullptr;

};

#endif // AUDIOENCODER_H
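
The header comments above describe two timestamp conversions: PCM microseconds to the codec time base, then codec time base to the stream time base. The sketch below reproduces that arithmetic in plain C++ so the numbers can be checked without linking FFmpeg. Rational and rescale_q are stand-ins written for this illustration, not the FFmpeg types; the rounding is assumed to be round-to-nearest, and the sketch only handles non-negative values that fit comfortably in 64 bits, whereas av_rescale_q also guards against overflow.

```cpp
#include <cstdint>

// Stand-in for AVRational (assumption: num/den are positive).
struct Rational {
    int num;
    int den;
};

// Computes a * bq / cq rounded to nearest, for non-negative a.
// Simplified stand-in for av_rescale_q: no overflow protection.
int64_t rescale_q(int64_t a, Rational bq, Rational cq)
{
    const int64_t b = static_cast<int64_t>(bq.num) * cq.den;
    const int64_t c = static_cast<int64_t>(bq.den) * cq.num;
    return (a * b + c / 2) / c;
}

// PCM stage -> AVFrame stage: microseconds to 1/sample_rate ticks.
int64_t pcm_us_to_codec_pts(int64_t pts_us, int sample_rate)
{
    return rescale_q(pts_us, Rational{1, 1000000}, Rational{1, sample_rate});
}
```

For example, a PCM pts of 23219 microseconds maps to 1024 in a 1/44100 time base, and 46439 microseconds maps to 2048, which matches the values noted in the comments of Encode.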

audioencoder.cpp

#include "audioencoder.h"

AudioEncoder::AudioEncoder()
{

}

AudioEncoder::~AudioEncoder()
{
    if(this->_avcodecContext){
        Deinit();
    }

}

//    int InitAAC(AVChannelLayout avchannel, int samplerate,int bit_rate); signature used in the course sample code

// avsampleformat must be AV_SAMPLE_FMT_FLTP because we use FFmpeg's built-in AAC encoder.
// The values used are the ones passed in here, not the member defaults of AudioEncoder.
int AudioEncoder::InitAAC(AVChannelLayout avchannel,
                          AVSampleFormat avsampleformat,
                          int samplerate,
                          int bit_rate)
{

    cout<<"audioencoder.InitAAC avchannel.nb_channels = " << avchannel.nb_channels
       <<" avsampleformat = "<< avsampleformat
       <<" samplerate = " << samplerate
       <<" bit_rate = " << bit_rate
       <<endl;

    int ret = 0;
    //1. Remember the arguments in member variables
    this->_avchannel = avchannel;
    this->_avsampleFormat = avsampleformat;
    this->_samplerates = samplerate;
    this->_bitrate = bit_rate;

    //2. Find the encoder
    const AVCodec * avcodec =  avcodec_find_encoder(AV_CODEC_ID_AAC);
    if(avcodec == nullptr){
        ret = -1;
        cout<<" func InitAAC error because avcodec_find_encoder(AV_CODEC_ID_AAC) error "<<endl;
        return ret;
    }

    //2.1 Allocate an encoder context for that encoder
    this->_avcodecContext  = avcodec_alloc_context3(avcodec);
    if(_avcodecContext == nullptr){
        ret = -1;
        cout<<" func InitAAC error because avcodec_alloc_context3(avcodec) error "<<endl;
        return ret;
    }


    //2.2 Set the encoder context parameters: the three audio basics (sample format, sample rate, channel layout), the bit rate, and flags (so that the AAC data carries no ADTS header)

    ///AV_CODEC_FLAG_GLOBAL_HEADER: place global headers in extradata instead of every keyframe.
    ///Two facts explain why the flag is set here:
    /// 1. AAC inside an MP4 file has no ADTS headers, so no ADTS header may be prepended to the AAC frames that go into the MP4.
    /// 2. With AV_CODEC_FLAG_GLOBAL_HEADER the global header goes into extradata instead of every keyframe, so for AAC no ADTS header is put in front of each frame.
    /// The same reasoning extends to H.264, which has two storage modes, Annex B and AVCC. MP4 stores H.264 as AVCC, with the parameter sets kept once at the front and only H.264 data after that, so the H.264 encoder should also set AV_CODEC_FLAG_GLOBAL_HEADER when the target is MP4.
    _avcodecContext->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;

    // Fill the key fields of the AVCodecContext with the arguments
    _avcodecContext->bit_rate = bit_rate;
    _avcodecContext->sample_fmt = avsampleformat;
    _avcodecContext->sample_rate = samplerate;
    _avcodecContext->ch_layout = avchannel;

    //3. Open the encoder context with avcodec_open2. The second argument may be nullptr because the context was allocated with avcodec_alloc_context3(avcodec) for this codec
    ret = avcodec_open2(_avcodecContext,nullptr,nullptr);
    if(ret < 0 ){
        char errinfo[1024] = "";
        av_strerror(ret, errinfo,1024);
        cout << "InitAAC func error because avcodec_open2 error errinfo = " << errinfo << endl;
        avcodec_free_context(&_avcodecContext);
        return ret;
    }


    cout<<"init aac success..."<<endl;

    return ret;
}

int AudioEncoder::Deinit()
{
    if(this->_avcodecContext){
        avcodec_free_context(&this->_avcodecContext);
        this->_avcodecContext = nullptr;  // not strictly needed: avcodec_free_context already sets the pointer to nullptr
    }
    return 0;
}

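// Returns the number of samples per channel in one frame. We never set frame_size on the AVCodecContext ourselves, so why does it have a value here?
// Because it is set when InitAAC opens the encoder context, inside avcodec_open2.
// avcodec_open2 is generic FFmpeg code, while frame_size is specific to the encoder (1024 for AAC, 1152 for MP3); the header comments in audioencoder.h trace how the two connect.
//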
int AudioEncoder::GetFrameSize()
{
    if(this->_avcodecContext){
        return this->_avcodecContext->frame_size;
    }
    return -1;
}

// Returns the sample format, which was set by the user in InitAAC.
int AudioEncoder::GetSampleFormat()
{
    if(this->_avcodecContext){
        return this->_avcodecContext->sample_fmt;
    }
    return AV_SAMPLE_FMT_NONE; //-1
}


// Encodes the given frame into an AVPacket. The first parameter is the AVFrame, i.e. one frame of audio.
// The second parameter, stream_index, is the index of the target AVStream (the audio index or the video index).
// The third parameter, pts, grows with every processed frame by the duration of one audio frame (for AAC, the time covered by 1024 samples).
// The fourth parameter, time_base, is passed as 1000000 by the caller, i.e. microseconds.
// Together, the third and fourth parameters say when this frame of PCM is presented and in which time base.

//   pts grows by: double audio_frame_duration = 1.0 * audio_encoder.GetFrameSize()/pcm_sample_rate*audio_time_base;
//    1024 / 44100 * 1000000 = 23219.95464852608 microseconds, so at the PCM stage the first frame is at 23219.95464 and the second at 46439.90929705215
// time_base is 1000000.

AVPacket *AudioEncoder::Encode(AVFrame *frame,
                               int stream_index,
                               int64_t pts,
                               int64_t time_base)
{

    int ret =0;
    if(this->_avcodecContext == nullptr){
        cout<<"class AudioEncoder func Encode error because this->_avcodecContext == nullptr return nullptr"<<endl;
        return nullptr;
    }

    // this->_avcodecContext->time_base is 1/44100
    /// The FFmpeg source shows that for an AUDIO codec context with no time_base set, opening the codec sets time_base to 1/sample_rate
    ///     if (avctx->codec_type == AVMEDIA_TYPE_AUDIO &&
    ///         (!avctx->time_base.num || !avctx->time_base.den)) {
    ///             avctx->time_base.num = 1;
    ///             avctx->time_base.den = avctx->sample_rate;
    ///     }

    // Convert the timestamp: (pts * 1/1000000) / (1/44100)
    // After the conversion the first frame has pts 1024 and the second 2048
    pts = av_rescale_q(pts, AVRational{1, (int)time_base}, this->_avcodecContext->time_base);
    if(frame) {
        frame->pts = pts;
    }
    // The frame is already fully prepared, so it can be sent to the encoder directly
    ret = avcodec_send_frame(this->_avcodecContext,frame);
    if(ret < 0){
        // 0 means success, a negative value is an AVERROR code
        cout<<"AudioEncoder::Encode avcodec_send_frame error ret = " << ret  << endl;
        char errbuf[1024] = {0};
        av_strerror(ret, errbuf, sizeof(errbuf) - 1);
        printf("avcodec_send_frame failed:%s\n", errbuf);
        return nullptr;

    }
    // Encoding works like decoding: one send may be followed by several receives, until AVERROR(EAGAIN) or AVERROR_EOF.
    // This overload fetches at most one packet per call; the vector overload below drains the encoder.
    AVPacket *packet = av_packet_alloc();
    if(packet == nullptr){
        printf("av_packet_alloc failed\n");
        return nullptr;
    }
    // After avcodec_receive_packet the packet has pts, dts and duration. They appear to be derived from the avframe (judging from the source, though the author is not fully sure).
    ret = avcodec_receive_packet(this->_avcodecContext, packet);
    if(ret != 0) {
        char errbuf[1024] = {0};
        av_strerror(ret, errbuf, sizeof(errbuf) - 1);
        printf("aac avcodec_receive_packet failed:%s\n", errbuf);
        av_packet_free(&packet);
        return NULL;
    }
    packet->stream_index = stream_index;
    return packet;

}


// This overload encodes the avframe and appends every resulting AVPacket to the packets vector.
// Ownership of the packets passes to the caller, who must release each one with av_packet_free.
int AudioEncoder::Encode(AVFrame *frame, int stream_index, int64_t pts, int64_t time_base,
                         vector<AVPacket *> &packets)
{
    if(!this->_avcodecContext) {
        printf("codec_ctx_ null\n");
        return -1;
    }
    pts = av_rescale_q(pts, AVRational{1, (int)time_base}, this->_avcodecContext->time_base);
    if(frame) {
        frame->pts = pts;
    }
    int ret = avcodec_send_frame(this->_avcodecContext, frame);
    if(ret != 0) {
        char errbuf[1024] = {0};
        av_strerror(ret, errbuf, sizeof(errbuf) - 1);
        printf("avcodec_send_frame failed:%s\n", errbuf);
        return -1;
    }
    while(1)
    {
        // Is this a problem? If one frame needs 9 calls to avcodec_receive_packet, 9 AVPackets stay in memory until the caller releases them.
        AVPacket *packet = av_packet_alloc();
        if(!packet) {
            ret = AVERROR(ENOMEM);
            break;
        }
        ret = avcodec_receive_packet(this->_avcodecContext, packet);
        if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF) {
            // Nothing more to fetch for now: not an error
            ret = 0;
            av_packet_free(&packet);
            break;
        } else if (ret < 0) {
            char errbuf[1024] = {0};
            av_strerror(ret, errbuf, sizeof(errbuf) - 1);
            printf("aac avcodec_receive_packet failed:%s\n", errbuf);
            av_packet_free(&packet);
            ret = -1;
            break; // stop here, otherwise the freed packet would be pushed and the loop would never end
        }
        packet->stream_index = stream_index;
        packets.push_back(packet);
    }
    return ret;
}


AVCodecContext* AudioEncoder::GetCodecContext(){
    return this->_avcodecContext;
}


AVChannelLayout AudioEncoder::GetChannelLayout(){
    return this->_avchannel;
}
int AudioEncoder::GetSampleRate(){
    return this->_samplerates;
}
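
Encode above receives a pts in microseconds that the caller accumulates as a double and then rescales to the 1/sample_rate time base. Since every AAC frame holds a fixed number of samples, the same timestamps can also be derived from a frame counter, which avoids accumulating a fractional duration. The sketch below shows that alternative under stated assumptions: both helper names are hypothetical, and frame_index is assumed to start at 0 for the first frame.

```cpp
#include <cstdint>

// Hypothetical helper: pts of the n-th audio frame (n = 0, 1, 2, ...)
// expressed in a 1/sample_rate time base, i.e. counted in samples.
int64_t audio_pts_in_samples(int64_t frame_index, int frame_size)
{
    return frame_index * frame_size;
}

// Hypothetical helper: the same instant in microseconds, rounded to nearest.
// Only intended for non-negative sample counts.
int64_t samples_to_microseconds(int64_t samples, int sample_rate)
{
    return (samples * 1000000 + sample_rate / 2) / sample_rate;
}
```

With a frame size of 1024 at 44100 Hz, frame 1 gets pts 1024 and frame 2 gets pts 2048, the same values the rescaled microsecond timestamps produce in this program.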

4. Muxing class that combines video and audio: muxer.h

muxer.h

#ifndef MUXER_H
#define MUXER_H

// Class that muxes AAC and H.264 into one file
#include <iostream>
#include <string>
using namespace std;
extern "C" {
    #include "libavutil/avassert.h" // #include <...> searches the system include paths; #include "..." searches the directory of the including file first
    #include "libavutil/channel_layout.h"
    #include "libavutil/opt.h"
    #include "libavutil/mathematics.h"
    #include "libavutil/timestamp.h"
    #include "libswscale/swscale.h"
    #include "libswresample/swresample.h"
    #include "libavutil/error.h"
    #include "libavutil/common.h"
    #include "libavcodec/avcodec.h"
    #include "libavformat/avformat.h"
}

#define ERROR_BUF \
    char errbuf[1024]; \
    av_strerror(ret, errbuf, sizeof (errbuf));

#define CODE(func, code) \
    if (ret < 0) { \
        ERROR_BUF; \
        cout << #func << "error" << errbuf; \
        code; \
    }

class Muxer
{
public:
    Muxer();
    ~Muxer();

    //1. Initialise; mainly calls int avformat_alloc_output_context2(AVFormatContext **ctx, const AVOutputFormat *oformat, const char *format_name, const char *filename);
    int Init_Muxer(const char *url);


    // Release resources
    int DeInit_Muxer();

    // Create a stream; the new AVStream is stored in _audio_avstream or _video_avstream
    int  AddStream(AVCodecContext *avcodecContext);

    // The three steps of writing a media file
    int sendHeader();// write the MP4 header
    int sendPacket(AVPacket *avpacket);// write the MP4 body
    int sendTrailer(); // write the MP4 trailer

    // Open the AVIO output
    int open();

    int getAudioIndex();

    int getVideoIndex();

    AVCodecContext * getAudioAVCondecContext();

    AVCodecContext * getVideoAVCondecContext();


private:
    // 1. Used during initialisation. It may start as nullptr: it is passed to avformat_alloc_output_context2 as &_avformatContext (a pointer to pointer), and the function assigns the newly allocated context to it, so it is no longer nullptr after a successful call
    AVFormatContext * _avformatContext = nullptr;

    // 1. The url passed in by the user is stored here, i.e. the name of the output file such as out.mp4
    string _url = "";

    // Audio encoder context
    AVCodecContext *_audio_avcodecContext = nullptr;

    // Video encoder context
    AVCodecContext *_video_avcodecContext = nullptr;

    // Audio stream
    AVStream * _audio_avstream = nullptr;

    // Video stream
    AVStream * _video_avstream = nullptr;

    // Index of the audio stream
    int _audio_index = -1;

    // Index of the video stream
    int _video_index = -1;
};



#endif // MUXER_H

muxer.cpp

#include "muxer.h"


Muxer::Muxer()
{

}

Muxer::~Muxer()
{
    DeInit_Muxer();
}

// Initialisation
int Muxer::Init_Muxer(const char *url)
{
    int ret = 0;
    /// Allocate an AVFormatContext for an output format.
    /// The context must be released with avformat_free_context(), which is done in the cleanup function int DeInit_Muxer();
    ///    int avformat_alloc_output_context2(AVFormatContext **ctx, const AVOutputFormat *oformat, const char *format_name, const char *filename);
    /// For a local file the second and third arguments may be nullptr, meaning the format is guessed from the fourth. For network streaming the format usually has to be given explicitly.
    /// Parameters:
    ///    ctx: receives the created context; set to NULL on failure.
    ///    oformat: the AVOutputFormat to use. If it is not given, FFmpeg derives it from format_name or filename.
    ///    format_name: the name of the format, e.g. "flv" or "mpeg". If NULL, FFmpeg derives it from filename.
    ///    filename: path of the output file. If oformat and format_name are NULL, FFmpeg picks the muxer from the file extension, e.g. the flv muxer for xxx.flv.
    ///    @return  >= 0 in case of success, a negative AVERROR code in case of failure
    ///    Failure can therefore be detected in two ways: a negative return value, or a NULL first argument
    ret = avformat_alloc_output_context2(&_avformatContext,
                                         nullptr,
                                         nullptr,
                                         url);
    if(ret < 0){
        char errinfo[1024] = "";
        av_strerror(ret, errinfo,1024);
        cout << "Init_Muxer func error because avformat_alloc_output_context2 error errinfo = " << errinfo << endl;
        return ret;
    }
    //    ERROR_BUF; does the same as the two lines that fill errinfo above


    // Keep the url in _url
    this->_url = url;

    return ret;
}


// Release resources
int Muxer::DeInit_Muxer()
{
    if(this->_avformatContext){
        // Close the output opened in open(), if any; avformat_free_context does not close pb
        if(this->_avformatContext->pb && !(this->_avformatContext->oformat->flags & AVFMT_NOFILE)){
            avio_closep(&this->_avformatContext->pb);
        }
        avformat_free_context(this->_avformatContext);
        // Note: the sample code used avformat_close_input here. That function is meant for contexts opened with avformat_open_input (demuxing); for an output context avformat_free_context is the matching call
    }
    this->_url = "";
    this->_avformatContext = nullptr;
    this->_audio_avstream = nullptr;
    this->_video_avstream = nullptr;
    this->_audio_index = -1;
    this->_video_index = -1;
    this->_audio_avcodecContext = nullptr;
    this->_video_avcodecContext = nullptr;
    return 0;
}

int Muxer::AddStream(AVCodecContext *avcodecContext)
{
    int ret =0;
    // error check
    if(this->_avformatContext == nullptr){
        ret =-1;
        cout<<"func AddStream error because this->_avformatContext == nullptr"<<endl;
        return ret;
    }

    // error check
    if(avcodecContext == nullptr){
        ret =-1;
        cout<<"func AddStream error because avcodecContext == nullptr"<<endl;
        return ret;
    }

    //1. Create the AVStream
    AVStream *avstream = avformat_new_stream(this->_avformatContext,nullptr);
    if(avstream == nullptr){
        ret = -1; // without this the failure would be reported as success
        cout<<"func AddStream error because avformat_new_stream returned nullptr"<<endl;
        return ret;
    }

    //2. Copy the encoder parameters into the codecpar of the AVStream
    ret = avcodec_parameters_from_context(avstream->codecpar,avcodecContext);
    if(ret <0 ){
        ERROR_BUF;
        cout<<"func AddStream error because avcodec_parameters_from_context error " << errbuf << endl;
        return ret;
    }

    //3. Print information about the AVFormatContext. The four parameters are:
    //    ic: the input/output AVFormatContext to describe;
    //    index: index of the stream to dump information about;
    //    url: the URL to print, such as the source or destination file;
    //    is_output: 0 for an input context, 1 for an output context.
    av_dump_format(this->_avformatContext,0,this->_url.c_str(),1);


    //4. Check whether this is an audio or a video stream and store the context, stream and index accordingly
    if(avcodecContext->codec_type == AVMEDIA_TYPE_AUDIO){
        this->_audio_avcodecContext = avcodecContext;
        this->_audio_avstream = avstream;// the newly created stream is the audio stream
        this->_audio_index = avstream->index;
    }

    if(avcodecContext->codec_type == AVMEDIA_TYPE_VIDEO){
        this->_video_avcodecContext = avcodecContext;
        this->_video_avstream = avstream;// the newly created stream is the video stream
        this->_video_index = avstream->index;
    }


    return ret ;
}


// Write the header. While doing so the muxer finalises the AVStream time_base (1/90000 was observed in this program)
int Muxer::sendHeader()
{
    int ret =0;
    //error 判断
    if(this->_avformatContext == nullptr){
        ret =-1;
        cout<<"func sendHeader error bacause this->_avformatContext == nullptr"<<endl;
        return ret;
    }

    ret = avformat_write_header(this->_avformatContext,nullptr);
    cout << "avformat_write_header ret = " << ret << endl;
    if(ret < 0){
        ERROR_BUF;
        cout<<"func sendHeader error bacause avformat_write_header func error" << endl;
        return ret;
    }

    return ret;
}


// Write one packet to the output
int Muxer::sendPacket(AVPacket *avpacket)
{

    int ret =0;
    // error check
    if(this->_avformatContext == nullptr){
        ret =-1;
        cout<<"func sendPacket error because this->_avformatContext == nullptr"<<endl;
        // release the packet
        av_packet_free(&avpacket);
        return ret;
    }

    // error check
    if(avpacket == nullptr){
        ret =-1;
        cout<<"func sendPacket error because avpacket == nullptr"<<endl;
        return ret;
    }

    // avpacket->data points to the compressed payload, the actual data of the AVPacket
    uint8_t * avpacket_data = avpacket->data;

    // size of the payload
    int avpacket_size = avpacket->size;

    // error check
    if(avpacket_data == nullptr || avpacket_size <= 0){
        // no data, or a size of zero or less
        ret =-1;
        cout<<"func sendPacket error because avpacket_data == nullptr || avpacket_size <= 0"<<endl;
        // release the packet
        av_packet_free(&avpacket);
        return ret;
    }


    // stream_index tells whether the packet belongs to the video or the audio stream
    int avpacket_stream_index = avpacket->stream_index;


    AVRational src_timebase; // time_base of the AVCodecContext: 1/44100 for the AAC context, 1/1000000 for the H.264 context
    AVRational dst_timebase; // time base of the stream in the output file
    if(avpacket_stream_index == this->_video_index && this->_video_avstream && this->_video_avcodecContext ){
        // the packet belongs to the video stream, and both _video_avstream and _video_avcodecContext are set
        src_timebase = this->_video_avcodecContext->time_base; //(1,1000000)
        dst_timebase = this->_video_avstream->time_base;
    }else if(avpacket_stream_index == this->_audio_index && this->_audio_avstream && this->_audio_avcodecContext){
        // the packet belongs to the audio stream, and both _audio_avstream and _audio_avcodecContext are set
        src_timebase = this->_audio_avcodecContext->time_base;//(1,44100)
        dst_timebase = this->_audio_avstream->time_base;
    }else{
        // unknown stream index: without this branch the time bases below would be used uninitialised
        ret = -1;
        cout<<"func sendPacket error because stream_index " << avpacket_stream_index << " matches neither the audio nor the video stream"<<endl;
        av_packet_free(&avpacket);
        return ret;
    }

    // Time base conversion. av_rescale_q computes `a * bq / cq`.
//    x = a * bq / cq
//    The goal is to re-express avpacket->pts: the old pts * src_timebase is the real time in seconds.
//    For example, with pts = 10000 and src_timebase = 1/25 the real time is 400 s, and that time must not change.
//    If the AVStream time_base differs from the time base the packet currently uses, the pts has to be recomputed in AVStream units.
//    The AVStream itself has no pts; what gets written is the packet, and its timestamps must be in the AVStream time_base.

    avpacket->pts = av_rescale_q(avpacket->pts, src_timebase,dst_timebase);
    avpacket->dts = av_rescale_q(avpacket->dts, src_timebase,dst_timebase);
    avpacket->duration = av_rescale_q(avpacket->duration, src_timebase,dst_timebase);

    // Write the packet. av_interleaved_write_frame: "Write a packet to an output media file ensuring correct interleaving."
    // It buffers packets: if audio packets arrive with timestamps 30 and 60 and a video packet with 50, they are written as audio 30, video 50, audio 60.
    // Because of this buffering it is not recommended for live streaming.
    ret = av_interleaved_write_frame(this->_avformatContext, avpacket);
    if(ret <0 ){
        ERROR_BUF;
        cout<<"func sendPacket error because av_interleaved_write_frame func error " << errbuf << endl;
        // release the packet
        av_packet_free(&avpacket);
        return ret;
    }

    // For live streaming the following call is suggested instead
//    av_write_frame(this->_avformatContext,avpacket);


    // release the packet
    av_packet_free(&avpacket);
    return ret;


}















// Write the trailer
int Muxer::sendTrailer()
{
    int ret =0;
    //error 判断
    if(this->_avformatContext == nullptr){
        ret =-1;
        cout<<"func sendTrailer error bacause this->_avformatContext == nullptr"<<endl;
        return ret;
    }
    ret = av_write_trailer(this->_avformatContext);
    if(ret < 0){
        ERROR_BUF;
        cout<<"func sendTrailer error bacause av_write_trailer func error" << endl;
        return ret;
    }

    return ret;
}

int Muxer::open()
{
    // Create and initialise an AVIOContext for the resource named by the url
    int ret = avio_open(&this->_avformatContext->pb, this->_url.c_str(), AVIO_FLAG_WRITE);
    if(ret < 0) {
        char errbuf[1024] = {0};
        av_strerror(ret, errbuf, sizeof(errbuf) - 1);
        printf("avio_open %s failed:%s\n",this->_url.c_str(), errbuf);
        return -1;
    }
    return 0;
}

int Muxer::getAudioIndex()
{
    return this->_audio_index;
}

int Muxer::getVideoIndex()
{
    return this->_video_index;
}

AVCodecContext * Muxer::getAudioAVCondecContext(){
    return this->_audio_avcodecContext;
}

AVCodecContext * Muxer::getVideoAVCondecContext(){
    return this->_video_avcodecContext;
}
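
The comment in sendPacket describes how interleaving reorders packets from two streams by time: audio 30, audio 60 and video 50 come out as audio 30, video 50, audio 60. The sketch below imitates that ordering in plain C++ for packets whose timestamps are in different time bases. Tb, Pkt, compare_ts and interleave are names made up for this illustration. The comparison follows the same -1/0/1 contract as FFmpeg's av_compare_ts but uses plain 64-bit multiplication, so it is only safe for moderate values. As far as I know the real muxer orders by dts and buffers incrementally rather than sorting a complete list.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Stand-in for AVRational (assumption: positive num and den).
struct Tb {
    int num;
    int den;
};

// Minimal stand-in for a packet: stream name, timestamp, time base.
struct Pkt {
    std::string stream;
    int64_t ts;
    Tb tb;
};

// Returns -1 if a*tba is earlier than b*tbb, 1 if later, 0 if equal.
// Cross-multiplication avoids floating point; no overflow protection.
int compare_ts(int64_t a, Tb tba, int64_t b, Tb tbb)
{
    const int64_t lhs = a * tba.num * tbb.den;
    const int64_t rhs = b * tbb.num * tba.den;
    return (lhs > rhs) - (lhs < rhs);
}

// Orders packets by real time; packets with equal time keep their input order.
std::vector<Pkt> interleave(std::vector<Pkt> pkts)
{
    std::stable_sort(pkts.begin(), pkts.end(),
                     [](const Pkt &x, const Pkt &y) {
                         return compare_ts(x.ts, x.tb, y.ts, y.tb) < 0;
                     });
    return pkts;
}
```

For instance, audio packets at 1323 and 2646 in a 1/44100 time base (30 ms and 60 ms) and a video packet at 4500 in a 1/90000 time base (50 ms) come out in the order audio, video, audio.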

5. Main program

main.cpp

/***
 * Purpose of this program:
 * mux one PCM file and one YUV file into a single file, 0804_out.mp4.
 * The PCM and YUV files were extracted from sound_in_sync_test.mp4 with the ffmpeg command line,
 * so that the two MP4 files (sound_in_sync_test.mp4 and 0804_out.mp4) can be compared.
 *
 * 1. Extract the PCM from sound_in_sync_test.mp4:
 * ffmpeg -i sound_in_sync_test.mp4 -vn -ar 44100 -ac 2 -f s16le 44100_2_s16le.pcm
 * -vn disables video
 *
 * 2. Extract the YUV from sound_in_sync_test.mp4:
 *
 * ffmpeg -i sound_in_sync_test.mp4 -pix_fmt yuv420p 720x576_yuv420p.yuv
 *
 * 3. Playback test
 *  PCM data:
 ffplay -ac 2 -ar 44100 -f s16le 44100_2_s16le.pcm
 *  YUV data:
 ffplay -pixel_format yuv420p -video_size 720x576 -framerate 25  720x576_yuv420p.yuv
 ***/

#include <iostream>
#include <string>
#include "audioencoder.h"
#include "videoencoder.h"
#include "muxer.h"
#include "audioresampler.h"

extern "C" {

#include "libavutil/avassert.h"
#include <libavutil/channel_layout.h>
#include <libavutil/opt.h>
#include <libavutil/mathematics.h>
#include <libavutil/timestamp.h>
#include <libavformat/avformat.h>
#include <libswscale/swscale.h>
#include <libswresample/swresample.h>
#include <libavutil/error.h>
#include <libavutil/common.h>

}
using namespace std;


#define YUV_WIDTH 720
#define YUV_HEIGHT 576
#define YUV_FPS 25
#define YUV_PIX_FMT AV_PIX_FMT_YUV420P
#define VIDEO_BIT_RATE (500 * 1024)


#define PCM_SAMPLE_FORMAT AV_SAMPLE_FMT_S16
#define PCM_SAMPLE_RATE 44100
#define PCM_CHANNEL_LAYOUT AV_CHANNEL_LAYOUT_STEREO
#define AUDIO_BIT_RATE (128 * 1024)


// Time bases
#define AUDIO_TIME_BASE 1000000
#define VIDEO_TIME_BASE 1000000

int main()
{

    cout<<"aaa"<<endl;
    int ret =0;
    ///     0. Preparation: produce the PCM and YUV files with these commands
    ///     ffmpeg -i sound_in_sync_test.mp4 -pix_fmt yuv420p 720x576_yuv420p.yuv
    ///     ffmpeg -i sound_in_sync_test.mp4 -vn -ar 44100 -ac 2 -f s16le 44100_2_s16le.pcm

    /// Step 1: paths of the YUV and PCM input files and of the MP4 output
    string in_yuv_name = "D:/AllInformation/qtworkspacenew/08muxing_04_mp4/720x576_yuv420p.yuv";
    const char * in_pcm_name = "D:/AllInformation/qtworkspacenew/08muxing_04_mp4/44100_2_s16le.pcm";
    const char * out_mp4_name = "D:/AllInformation/qtworkspacenew/08muxing_04_mp4/out.mp4";

    FILE * in_yuv_fd = nullptr;
    FILE * in_pcm_fd = nullptr;

    ///第二步：初始化编码器，包括视频，音频编码器，分配yuv，pcm的帧buffer
    ///2.1 初始化视频编码器，这里为了后续的扩展和调试，我们使用变量，而不是宏定义完成
    int yuvwidth = YUV_WIDTH;
    int yuvheight = YUV_HEIGHT;
    int yuvfps = YUV_FPS;
    AVPixelFormat yuvpixfmt = YUV_PIX_FMT;
    int video_bit_rate = VIDEO_BIT_RATE;//注意，这个是video 的比特率，不是yuv
    videoencoder videoencoder;
    /// 2.1.1分配yuv buf
    int y_frame_size = 0;
    int u_frame_size = 0;
    int v_frame_size = 0;
    int yuv_frame_size = 0;
    uint8_t *yuv_frame_buf = nullptr;


    ///2.2 初始化音频编码器，这里为了后续的扩展和调试，我们使用变量，而不是宏定义完成
    int pcm_samplerate = PCM_SAMPLE_RATE;
    AVSampleFormat pcm_sampleformat = PCM_SAMPLE_FORMAT;
    AVChannelLayout pcm_avchannel_layout = PCM_CHANNEL_LAYOUT;
    int audio_bit_rate = AUDIO_BIT_RATE;
    AudioEncoder audioencoder;

    /// 2.2.1 分配pcm buf pcm_frame_size  = 单个采样点占用的字节 * 通道数量 * 每个通道有多少给采用点
    int pcm_frame_size = 0;
    uint8_t *pcm_frame_buf = nullptr;

    ///第三步：mp4初始化，包括新建流，open io，send header
    Muxer mp4_muxer;

    /// 4.1  时间戳相关 audio_pts 和 video_pts 刚开始都是0，audio_frame_duration是一帧aac数据
    double audio_pts = 0;
    double video_pts = 0;
    double audio_frame_duration = 0;
    double video_frame_duration = 0;
    int64_t audiotimebase = AUDIO_TIME_BASE;
    int64_t videotimebase = VIDEO_TIME_BASE;


    /// 4.2 video读完 或者 audio 读完 标志
    int audio_finish = 0;
    int video_finish = 0;
    int read_size = 0;  //用于记录每次读取的数据大小

    /// 4.3 Holds the AVPacket produced when PCM is turned into an AVFrame and then encoded
    AVPacket *avpacket = nullptr;

    /// 4.4 Audio resampling
    AVChannelLayout out_resample_ch_layout;
    AVSampleFormat out_resample_sample_fmt;
    int out_resample_sample_rate;

    AVChannelLayout in_resample_ch_layout;
    AVSampleFormat in_resample_sample_fmt;
    int in_resample_sample_rate;

    AudioResampler audioresample;
    vector<AVPacket *> packets;

    /// Step 1: open the YUV and PCM input files
    in_yuv_fd = fopen(in_yuv_name.c_str(),"rb");
    if(!in_yuv_fd){
        ret = -1;
        cout<<"main.cpp fopen(in_yuv_name,'rb') func error return "<<endl;
        goto mainend;
    }
    in_pcm_fd = fopen(in_pcm_name, "rb");
    if(!in_pcm_fd){
        ret = -1;
        cout<<"main.cpp fopen(in_pcm_name,'rb') func error return "<<endl;
        goto mainend;
    }




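    /// Step 2: initialize the video and audio encoders and allocate the YUV and PCM frame buffers
    /// 2.1 Initialize the video encoder; variables are used instead of macros to ease later extension and debugging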
    ret = videoencoder.InitH264(yuvwidth,yuvheight,yuvpixfmt,yuvfps,video_bit_rate);
    if(ret < 0 ){
        cout<<"videoencoder.InitH264 error yuvwidth = " << yuvwidth
           <<" yuvheight = "<< yuvheight
          <<" yuvpixfmt = " << yuvpixfmt
         <<" yuvfps = " <<yuvfps
        <<" video_bit_rate = " << video_bit_rate
        <<endl;
        goto mainend;
    }
    /// 2.1.1 Allocate the YUV buffer: a YUV420P frame is one full-size Y plane plus quarter-size U and V planes
    y_frame_size = yuvwidth * yuvheight;
    u_frame_size = yuvwidth * yuvheight /4;
    v_frame_size = yuvwidth * yuvheight /4;
    yuv_frame_size = y_frame_size + u_frame_size + v_frame_size;
    yuv_frame_buf = (uint8_t *)malloc(yuv_frame_size);
    if(!yuv_frame_buf)
    {
        ret =-1;
        printf("malloc(yuv_frame_size) failed\n");
        goto mainend;
    }


    /// 2.2 Initialize the audio encoder; variables are again used instead of macros to ease later extension
    ret = audioencoder.InitAAC(pcm_avchannel_layout,AV_SAMPLE_FMT_FLTP,pcm_samplerate,audio_bit_rate);
    if(ret < 0 ){
        cout<<"audioencoder.InitAAC error pcm_avchannel_layout.nb_channels = " << pcm_avchannel_layout.nb_channels
           <<" pcm_sampleformat = "<< pcm_sampleformat
          <<" pcm_samplerate = " << pcm_samplerate
         <<" audio_bit_rate = " << audio_bit_rate
        <<endl;
        ret = -1;
        goto mainend;
    }

    /// 2.2.1 Allocate the PCM buffer: pcm_frame_size = bytes per sample * channel count * samples per channel
    pcm_frame_size = av_get_bytes_per_sample((AVSampleFormat)pcm_sampleformat)
            *pcm_avchannel_layout.nb_channels * audioencoder.GetFrameSize();
    if(pcm_frame_size <= 0) {
        printf("pcm_frame_size <= 0\n");
        ret = -1;
        goto mainend; // release buffers and close files instead of returning directly
    }
    pcm_frame_buf = (uint8_t *)malloc(pcm_frame_size);
    if(!pcm_frame_buf)
    {
        printf("malloc(pcm_frame_size) failed\n");
        ret = -1;
        goto mainend;
    }




    /// Step 3: initialize the mp4 muxer: create the streams, open the output I/O, write the header

    /// 3.1 The core of Init_Muxer is avformat_alloc_output_context2
    ret = mp4_muxer.Init_Muxer(out_mp4_name);
    if(ret <0){
        cout<<"main func error muxer.Init_Muxer(out_mp4_name) error out_mp4_name = "<< out_mp4_name << endl;
        goto mainend;
    }

    /// 3.2 Create the video AVStream; it is stored in a member variable of mp4_muxer
    ret = mp4_muxer.AddStream(videoencoder.GetCodecContext());
    if(ret < 0)
    {
        printf("mp4_muxer.AddStream video failed\n");
        goto mainend;
    }

    /// 3.3 Create the audio AVStream; it is stored in a member variable of mp4_muxer
    ret = mp4_muxer.AddStream(audioencoder.GetCodecContext());
    if(ret < 0)
    {
        printf("mp4_muxer.AddStream audio failed\n");
        goto mainend; // release buffers and close files instead of returning directly
    }

    /// 3.4 The core call is avio_open
    ret = mp4_muxer.open();
    if(ret < 0)
    {
        printf("mp4_muxer.Open failed\n");
        goto mainend;
    }
    /// 3.5 Write the container header
    ret = mp4_muxer.sendHeader();

    if(ret < 0)
    {
        printf("mp4_muxer.SendHeader failed\n");
        goto mainend;
    }

    /// Before entering the loop of step 4: initialize the audio resampler
    ///     int InitResampler(const AVChannelLayout *out_ch_layout, enum AVSampleFormat out_sample_fmt, int out_sample_rate,
    ///                       const AVChannelLayout *in_ch_layout, enum AVSampleFormat  in_sample_fmt, int  in_sample_rate);
    out_resample_ch_layout = mp4_muxer.getAudioAVCondecContext()->ch_layout;
    out_resample_sample_fmt = mp4_muxer.getAudioAVCondecContext()->sample_fmt;
    out_resample_sample_rate =  mp4_muxer.getAudioAVCondecContext()->sample_rate;

    in_resample_ch_layout = PCM_CHANNEL_LAYOUT;
    in_resample_sample_fmt = PCM_SAMPLE_FORMAT;
    in_resample_sample_rate = PCM_SAMPLE_RATE;
    ret =  audioresample.InitResampler(&out_resample_ch_layout,
                                       out_resample_sample_fmt,
                                       out_resample_sample_rate,
                                       &in_resample_ch_layout,
                                       in_resample_sample_fmt,
                                       in_resample_sample_rate);
    if(ret <0){
        cout<<"main func  audioresample.InitResampler error "<<endl;
        goto mainend; // the loop below cannot resample without an initialized resampler
    }


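    /// Step 4: in the while loop, read YUV and PCM, encode them, and send the packets to the mp4 muxer
    /// 4.1 Timestamps
    ///  Duration of one AAC AVFrame: each AAC AVFrame holds 1024 sample frames,
    /// so this is the time covered by 1024 sample frames. The final multiplication by 1000000 converts seconds to microseconds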
    audio_frame_duration = 1.0 * audioencoder.GetFrameSize() / pcm_samplerate * audiotimebase;

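    /// Duration of one H.264 AVFrame: one frame holds a single picture. At 25 pictures per second the duration is computed as follows; the final multiplication by 1000000 converts seconds to microseconds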
    video_frame_duration = 1.0 / yuvfps * videotimebase;
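    /// Worked example, assuming the values used elsewhere in this article (GetFrameSize() == 1024,
    /// pcm_samplerate == 44100, yuvfps == 25) and assuming both time bases are 1000000:
    ///   audio_frame_duration = 1.0 * 1024 / 44100 * 1000000 ≈ 23219.95 microseconds
    ///   video_frame_duration = 1.0 / 25 * 1000000 = 40000 microseconds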

    /// 4.2 Flags marking that video or audio has been read to the end; the work is done only when both are finished
    //    audio_finish = 1;
    //    int video_finish = 0;
    //    int read_size = 0;  // size of the data returned by each read


    while(1){
        if(audio_finish && video_finish){
            break;
        }

        // Read video: fread reads yuv_frame_size elements of 1 byte each from in_yuv_fd into yuv_frame_buf
        if(video_finish != 1){
            read_size = fread(yuv_frame_buf, 1, yuv_frame_size, in_yuv_fd);
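            /// For video, each read should return exactly one picture, so a return value that differs from the picture size has to be handled here.
            /// What could cause that? read_size is 0 once the file is exhausted. Half a picture should not occur in theory.
            /// For simplicity, any read shorter than yuv_frame_size is treated as end of input.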
            if(read_size < yuv_frame_size){
                video_finish = 1;
            }


            if(video_finish != 1){
                /// In Encode, sending one frame may produce several packets, so a vector<AVPacket *> &packets collects all of them
//                avpacket =  videoencoder.Encode(yuv_frame_buf,read_size,mp4_muxer.getVideoIndex(),video_pts,videotimebase);

                ret =  videoencoder.Encode(yuv_frame_buf,read_size,mp4_muxer.getVideoIndex(),video_pts,videotimebase,packets);
            }else{
                // On the last pass video_finish becomes 1, so flush the encoder
//                avpacket = videoencoder.Encode(nullptr,0,mp4_muxer.getVideoIndex(),video_pts,videotimebase);

                cout<<"flush start"<<endl;
                ret = videoencoder.Encode(nullptr,0,mp4_muxer.getVideoIndex(),video_pts,videotimebase,packets);
                cout<<"flush end"<<endl;


            }
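            if(ret < 0){
                cout<<"videoencoder encode func error"<<endl;
                goto mainend; // release buffers and close files instead of returning directly
            }

            // After each read, advance video_pts by the duration of one video frame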
            video_pts += video_frame_duration;


//            if(avpacket){
//                ret = mp4_muxer.sendPacket(avpacket);
//            }
            // The packets are now available; send them to the muxer
            for(size_t i =0;i<packets.size();++i){
                ret = mp4_muxer.sendPacket(packets[i]);
                if(ret < 0){
                    cout<<"mp4_muxer send packet error packets[i]"<< packets[i] << endl;
                    continue; // open question: return on a send error, or keep going? continue seems the better choice here
                }
            }
//            avpacket = nullptr;
            packets.clear();
        }

        // Read audio
        if(audio_finish != 1){
            //            AVFrame * AudioResampler::AllocFltpPcmFrame(AVChannelLayout out_ch_layout, AVSampleFormat out_sample_fmt, int out_sample_rate,int out_nb_samples){
            //            out_resample_ch_layout = mp4_muxer.getAudioAVCondecContext()->ch_layout;
            //            out_resample_sample_fmt = mp4_muxer.getAudioAVCondecContext()->sample_fmt;
            //            out_resample_sample_rate =  mp4_muxer.getAudioAVCondecContext()->sample_rate;
            AVFrame * fltpavframe = audioresample.AllocFltpPcmFrame(out_resample_ch_layout,
                                                                    out_resample_sample_fmt,
                                                                    out_resample_sample_rate,
                                                                    audioencoder.GetFrameSize());
            //            int AudioResampler::resamples_alloc_inbuffer_and_outbuffer(uint8_t ***in_audio_data,
            //                                                       int *in_linesize,
            //                                                       uint8_t ***out_audio_data,
            //                                                       int *out_linesize);
            uint8_t **in_audio_data = nullptr;
            int in_linesize = 0;
            uint8_t **out_audio_data = nullptr;
            int out_linesize = 0;
            //            cout<<"before  resamples_alloc_inbuffer_and_outbuffer method aa  = " << in_audio_data[0]<<endl;
            //            cout<<"before  resamples_alloc_inbuffer_and_outbuffer method bb = " << out_audio_data[1]<<endl;

            ret = audioresample.resamples_alloc_inbuffer_and_outbuffer(&in_audio_data,
                                                                       &in_linesize,
                                                                       &out_audio_data,
                                                                       &out_linesize);

            if(ret <= 0 ){
                cout<<"main func  resamples_alloc_inbuffer_and_outbuffer error "<<endl;
            }

            int realnumber = fread(in_audio_data[0], 1, in_linesize, in_pcm_fd);
            cout<<"in_linesize = "<< in_linesize << " out_linesize = " << out_linesize << " realnumber = " << realnumber << endl;
            cout<<"in_audio_data = "<< in_audio_data << endl;
            cout<<"out_audio_data = "<< out_audio_data << endl;

            if(realnumber <in_linesize){
                audio_finish = 1;
            }

            if(audio_finish != 1){
                int out_nb_number = 1024; // each output frame must hold 1024 sample frames; this is determined by FFmpeg's AAC encoder
                /// input samples per channel = input sample rate * output samples per channel / output sample rate

                int in_nb_samplerate = (int)av_rescale_rnd(out_nb_number,
                                                           in_resample_sample_rate,
                                                           out_resample_sample_rate,
                                                           AV_ROUND_UP);
                audioresample.swr_convert_and_fill_avframe(
                                        out_audio_data,
                                        audioencoder.GetFrameSize(),
                                        (const uint8_t **)in_audio_data,
                                        in_nb_samplerate,
                                        fltpavframe);
    //            audioresample.swr_convert_and_fill_avframe_changed(fltpavframe,
    //                                                               (const uint8_t **)in_audio_data,
    //                                                               in_nb_samplerate);//这个1024应该是计算出来的 indata中的nb_number

                // Optimization: collect the encoder output in a vector<AVPacket *> &packets
    //            avpacket =  audioencoder.Encode(fltpavframe,mp4_muxer.getAudioIndex(),audio_pts,audiotimebase);
                ret =  audioencoder.Encode(fltpavframe,mp4_muxer.getAudioIndex(),audio_pts,audiotimebase,packets);

            } else {
                ret =  audioencoder.Encode(nullptr,mp4_muxer.getAudioIndex(),audio_pts,audiotimebase,packets);
            }

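            if(ret < 0 ){
                cout<<"audioencoder encode func error"<<endl;
                audioresample.FreeFltpavframe(fltpavframe); // do not leak the frame allocated for this iteration
                continue;
            }


            // After each read, advance audio_pts by the duration of one audio frame
            audio_pts += audio_frame_duration;

            // The packets are now available; send them to the muxer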

//            if(avpacket){
//                ret = mp4_muxer.sendPacket(avpacket);
//            }

            for(size_t i =0;i<packets.size();++i){
                ret = mp4_muxer.sendPacket(packets[i]);
                if(ret < 0){
                    cout<<"mp4_muxer send audio packet error packets[i]"<< packets[i] << endl;
                }
            }


            packets.clear();
            audioresample.FreeFltpavframe(fltpavframe);

        }
    }

    ret = mp4_muxer.sendTrailer();

mainend:
    cout<<"bbb"<<endl;
    //    if(avpacket){// No need to free avpacket here: mp4_muxer.sendPacket(avpacket) has already released it but did not set avpacket to nullptr, so the pointer is non-null while its contents are gone, and freeing it again would cause problems
    //        av_packet_free(&avpacket);
    //    }
    if(yuv_frame_buf){
        free(yuv_frame_buf);
        yuv_frame_buf = nullptr;
    }
    if(pcm_frame_buf){
        free(pcm_frame_buf);
        pcm_frame_buf = nullptr;
    }

    if(in_yuv_fd){
        fclose(in_yuv_fd);
        in_yuv_fd = nullptr;
    }
    if(in_pcm_fd){
        fclose(in_pcm_fd);
        in_pcm_fd = nullptr;
    }

    return ret;



    //    cout<<"ccc"<<endl;// code placed after return is never reached

}
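
The buffer sizes and the resampler input count that main computes inline can also be checked in isolation. The following is a minimal sketch, not the author's code: the helper names (yuv420p_frame_bytes, pcm_frame_bytes, input_samples_needed) are hypothetical, it assumes even frame dimensions, and it uses plain integer arithmetic in place of av_get_bytes_per_sample and av_rescale_rnd so that it compiles without FFmpeg.

```cpp
#include <cassert>
#include <cstdint>

// Bytes in one planar YUV 4:2:0 picture: a full-resolution Y plane plus
// quarter-resolution U and V planes. Assumes even width and height.
inline std::int64_t yuv420p_frame_bytes(int width, int height) {
    assert(width > 0 && height > 0 && width % 2 == 0 && height % 2 == 0);
    const std::int64_t y_plane = static_cast<std::int64_t>(width) * height;
    return y_plane + y_plane / 4 + y_plane / 4;
}

// Bytes of interleaved PCM needed to fill one encoder frame:
// bytes per sample * channel count * samples per channel.
inline std::int64_t pcm_frame_bytes(int bytes_per_sample, int channels,
                                    int samples_per_channel) {
    assert(bytes_per_sample > 0 && channels > 0 && samples_per_channel > 0);
    return static_cast<std::int64_t>(bytes_per_sample) * channels * samples_per_channel;
}

// Input samples per channel needed to produce out_samples output samples,
// rounded up: ceil(out_samples * in_rate / out_rate).
inline std::int64_t input_samples_needed(std::int64_t out_samples,
                                         int in_rate, int out_rate) {
    assert(out_samples >= 0 && in_rate > 0 && out_rate > 0);
    return (out_samples * in_rate + out_rate - 1) / out_rate;
}
```

With the inputs from this article, yuv420p_frame_bytes(720, 576) gives 622080 bytes per picture, and pcm_frame_bytes(2, 2, 1024) gives 4096 bytes per audio frame, assuming 16-bit stereo PCM. Because the resampler here converts 44100 Hz to 44100 Hz, input_samples_needed(1024, 44100, 44100) stays at 1024; with a 48000 Hz source it would be 1115.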

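The goto mainend pattern in main works, but every early exit has to remember to jump to the label. As an alternative, the sketch below lets destructors do the cleanup. It is only an illustration under assumptions: FileCloser, FilePtr, open_file and read_frame are hypothetical names, not part of the article's classes, and std::vector stands in for the malloc'ed frame buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <memory>
#include <vector>

// Deleter that lets std::unique_ptr close a FILE* automatically.
struct FileCloser {
    void operator()(std::FILE* f) const noexcept {
        if (f) std::fclose(f);
    }
};
using FilePtr = std::unique_ptr<std::FILE, FileCloser>;

// Hypothetical helper: the returned pointer is empty if fopen fails.
inline FilePtr open_file(const char* path, const char* mode) {
    return FilePtr(std::fopen(path, mode));
}

// Reads exactly buf.size() bytes into buf. Returns false on end of file
// or on a short read, which the loop in main treats as "finished".
inline bool read_frame(std::FILE* f, std::vector<std::uint8_t>& buf) {
    assert(f != nullptr);
    const std::size_t got = std::fread(buf.data(), 1, buf.size(), f);
    return got == buf.size();
}
```

With this shape, a function can simply return on any error: the FilePtr objects close their files and the vectors release their memory when they go out of scope.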