Audio Recording

一叶知秋@qqy

Category: ffmpeg笔记

Article link: https://blog.csdn.net/qq_41004932/article/details/117090385

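Raw Audio Data

PCM

PCM data is the pure audio data exactly as captured; it is the rawest form of audio data.

WAV

WAV is also raw audio data. It simply prepends a header to the PCM data so that a player can use the correct parameters for playback.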

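Basic Quantization Concepts

Sample size: the number of bits used to store one sample, most commonly 16 bit. It is also called bit depth. A higher bit depth can represent a wider range of amplitude values, so it can describe a larger range of sound intensity.

Sample rate: the sampling frequency, typically 8K, 16K, 32K, 44.1K or 48K. The higher the sample rate, the more closely the digital signal follows the analog signal and the smaller the error. Telephone calls often use 8K, which sounds slightly distorted.

Channel count: mono, two-channel, or multichannel. Two channels are called stereo; more than two are usually called multichannel (surround).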
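To make the bit depth idea concrete, here is a minimal sketch of how the sample size bounds the values a sample can take. It assumes signed integer samples; the function names `pcm_levels` and `pcm_max_amplitude` are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct levels an N-bit sample can take: 2^N. */
uint64_t pcm_levels(unsigned bits)
{
    assert(bits >= 1 && bits <= 32);
    return (uint64_t)1 << bits;
}

/* Largest positive value of a signed N-bit sample: 2^(N-1) - 1.
 * Assumption: samples are stored as signed integers. */
int64_t pcm_max_amplitude(unsigned bits)
{
    assert(bits >= 1 && bits <= 32);
    return ((int64_t)1 << (bits - 1)) - 1;
}
```

For 16 bit samples this gives 65536 levels and a peak value of 32767, against 256 levels and a peak of 127 for 8 bit samples.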

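Bitrate Calculation

The bitrate of a PCM audio stream is: sample size x sample rate x channel count

For example, a WAV file holding PCM audio with a 44.1KHz sample rate, a 16bit sample size and two channels has a bitrate of 44.1K x 16 x 2 = 1411.2Kb/s

A stream this large is clearly impractical to send over a network as it is, which is why audio is usually compressed before transmission.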
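The formula above can be sketched in a few lines of C. The helper names `pcm_bitrate_bps` and `pcm_bytes_for_duration` are assumptions made for this example and do not come from any library.

```c
#include <stdint.h>

/* Bitrate of uncompressed PCM in bits per second:
 * sample size (bits) x sample rate (Hz) x channel count. */
uint64_t pcm_bitrate_bps(uint32_t sample_rate_hz,
                         uint32_t bits_per_sample,
                         uint32_t channels)
{
    return (uint64_t)sample_rate_hz * bits_per_sample * channels;
}

/* Bytes needed to store `seconds` of PCM audio at that bitrate. */
uint64_t pcm_bytes_for_duration(uint32_t sample_rate_hz,
                                uint32_t bits_per_sample,
                                uint32_t channels,
                                uint32_t seconds)
{
    return pcm_bitrate_bps(sample_rate_hz, bits_per_sample, channels) / 8 * seconds;
}
```

With 44100, 16 and 2 as arguments, `pcm_bitrate_bps` returns 1411200, which matches the 1411.2Kb/s worked out above. One minute at that rate takes about 10.6 MB.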

WAV Header

[Figure: WAV header layout]

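Field descriptions:

ChunkID: the string "RIFF". Big-endian, 4 bytes

ChunkSize: the size of the whole chunk, that is, everything that follows this field (file size minus 8 bytes). Little-endian, 4 bytes

Format: the string "WAVE". It is followed by two sub-chunks: "fmt ", which describes the format of the data, and "data", which holds the actual audio. Big-endian, 4 bytes

Subchunk1 ID: the string "fmt " (with a trailing space), meaning that what follows describes the data. Big-endian, 4 bytes

Subchunk1 Size: the size of the fmt chunk (16 for PCM). Little-endian, 4 bytes

Audio Format: a value of 1 means PCM. Little-endian, 2 bytes

NumChannels: the channel count. Little-endian, 2 bytes

SampleRate: the sample rate. Little-endian, 4 bytes

ByteRate: the number of bytes per second, ByteRate = SampleRate * NumChannels * BitsPerSample / 8. Little-endian, 4 bytes

BlockAlign: the number of bytes in one sample frame across all channels, BlockAlign = NumChannels * BitsPerSample / 8; usually an even number. Little-endian, 2 bytes

BitsPerSample: the sample size (bit depth). Little-endian, 2 bytes

Subchunk2 ID: the string "data", meaning that the audio data follows. Big-endian, 4 bytes

Subchunk2 Size: the size of the data chunk in bytes. Little-endian, 4 bytes

data: the audio samples themselves. Little-endian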
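As a rough sketch of how the fields listed above fit together, the following C code fills in the 44-byte header of a plain PCM WAV file. The names `wav_build_header`, `put_le16` and `put_le32` are invented for this example, and it assumes the simplest layout: a 16-byte fmt chunk followed directly by the data chunk.

```c
#include <stdint.h>
#include <string.h>

enum { WAV_HEADER_SIZE = 44 };

/* Store a 16-bit value in little-endian byte order. */
void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

/* Store a 32-bit value in little-endian byte order. */
void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)(v >> 24);
}

/* Fill hdr with a PCM WAV header describing data_size bytes of samples. */
void wav_build_header(uint8_t hdr[WAV_HEADER_SIZE], uint32_t sample_rate,
                      uint16_t channels, uint16_t bits_per_sample,
                      uint32_t data_size)
{
    uint16_t block_align = (uint16_t)(channels * (bits_per_sample / 8));
    uint32_t byte_rate = sample_rate * block_align;

    memcpy(hdr + 0, "RIFF", 4);          /* ChunkID */
    put_le32(hdr + 4, 36 + data_size);   /* ChunkSize: file size minus 8 */
    memcpy(hdr + 8, "WAVE", 4);          /* Format */
    memcpy(hdr + 12, "fmt ", 4);         /* Subchunk1 ID */
    put_le32(hdr + 16, 16);              /* Subchunk1 Size for PCM */
    put_le16(hdr + 20, 1);               /* Audio Format: 1 = PCM */
    put_le16(hdr + 22, channels);        /* NumChannels */
    put_le32(hdr + 24, sample_rate);     /* SampleRate */
    put_le32(hdr + 28, byte_rate);       /* ByteRate */
    put_le16(hdr + 32, block_align);     /* BlockAlign */
    put_le16(hdr + 34, bits_per_sample); /* BitsPerSample */
    memcpy(hdr + 36, "data", 4);         /* Subchunk2 ID */
    put_le32(hdr + 40, data_size);       /* Subchunk2 Size */
}
```

Writing these 44 bytes to a file and appending the raw PCM samples should give a file that ordinary players can open, provided the parameters match the way the samples were captured.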

Here is an example WAV file:

[Figure: example WAV file header]

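Steps for Capturing Audio

Register devices

Before capturing audio, the devices must be registered. This covers audio and video devices, other external devices, and integrated (built-in) devices.

Set the capture method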

macOS -> avfoundation

Windows -> dshow

Linux -> alsa
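The mapping above can be expressed as a small compile-time switch. This is only a sketch: the function name `capture_format_name` is made up here, and it assumes that any target other than Windows or macOS should use ALSA.

```c
/* Name of the FFmpeg input device format for the platform
 * this code is compiled on. */
const char *capture_format_name(void)
{
#if defined(_WIN32)
    return "dshow";
#elif defined(__APPLE__)
    return "avfoundation";
#else
    return "alsa"; /* assumption: every other target is treated as Linux */
#endif
}
```

The returned string is the kind of name that is passed to av_find_input_format in the examples that follow.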

Open the audio device

The audio device can only be used for capture once it has been opened.

Example

Header file

#ifndef TESTC_H
#define TESTC_H

#include <stdio.h>
#include "libavutil/avutil.h"
#include "libavdevice/avdevice.h" // device registration
#include "libavformat/avformat.h" // ffmpeg treats every device and multimedia format as a format, all handled through avformat

void test(void);

#endif

C file

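#include "testc.h"

void test(void)
{
    int ret = 0;
    char error[1024] = {0};
    AVFormatContext *fmt_ctx = NULL; // pointer to the device context
    AVDictionary *options = NULL;    // options for the chosen capture method

    // audio device only: capture from the first audio device, selected by its index
    const char *devicename = "hw:0";

    // register all devices, not only audio ones but video and others as well
    avdevice_register_all();

    // get the input format: "alsa" on Linux, "avfoundation" on macOS
    // (newer FFmpeg versions return a const AVInputFormat * here)
    AVInputFormat *iformat = av_find_input_format("alsa");

    /*
    * AVFormatContext **ps : output; the context, which acts as a handle
    * const char *url : a network address or a local file; for a device, just the device name
    * AVInputFormat *iformat : the capture method used for the audio device
    * AVDictionary **options : options for that capture method; none are set here, so the dictionary is still NULL
    */
    if ((ret = avformat_open_input(&fmt_ctx, devicename, iformat, &options)) < 0)
    {
        /*
        * int errnum : error code
        * char *errbuf : buffer that receives the error message
        * size_t errbuf_size : size of that buffer
        */
        av_strerror(ret, error, sizeof(error));
        fprintf(stderr, "failed to open audio device, [%d]%s\n", ret, error); // print to standard error
        return;
    }

    // close the device and release the context
    avformat_close_input(&fmt_ctx);
}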

av_read_frame

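av_read_frame reads data, which may be audio or video. This example captures audio only, so no check is needed and every packet can be handled as audio; telling audio packets from video packets comes later, once video is involved. The function takes two important parameters, AVFormatContext and AVPacket. The AVFormatContext is the context obtained when the device was opened, and the AVPacket is an audio/video packet, which in this scenario is always an audio packet.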

AVPacket

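Two of its members matter here: data -> the audio/video data itself, and size -> the size of the data buffer

APIs related to AVPacket

av_init_packet -> initializes every field except data and size (deprecated in newer FFmpeg releases)

av_packet_unref -> releases the resources held by an AVPacket

av_packet_alloc -> allocates an AVPacket and initializes its fields to default values

av_packet_free -> the inverse of av_packet_alloc; it calls av_packet_unref to release the packet's resources and then frees the memory allocated by av_packet_alloc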

Example

Header file

#ifndef TESTC_H
#define TESTC_H

#include <stdio.h>
#include "libavutil/avutil.h"
#include "libavdevice/avdevice.h" // device registration
#include "libavformat/avformat.h" // ffmpeg treats every device and multimedia format as a format, all handled through avformat
#include "libavcodec/avcodec.h" // provides AVPacket

void rec_audio(void);

#endif

C file

#include "my_ffmpeg_test.h"

void rec_audio(void)
{
    int ret = 0;
    char error[1024] = {0};

    // ctx
    AVFormatContext *fmt_ctx = NULL; // pointer to the device context
    AVDictionary *options = NULL;    // options for the chosen capture method

    // packet
    int count = 0; // number of packets read so far
    AVPacket pkt;  // a packet on the stack, reused for every read

    // avfoundation (macOS) uses [[video device]:[audio device]], where brackets mean optional;
    // ALSA uses names such as hw:0 or plughw:0,0
    const char *devicename = "plughw:0,0"; // audio device only: the first audio device, selected by its index

    // set log level
    av_log_set_level(AV_LOG_DEBUG);

    // register all devices, not only audio ones but video and others as well
    avdevice_register_all();
    avformat_network_init();

    // get the input format: "alsa" on Linux, "avfoundation" on macOS
    // (newer FFmpeg versions return a const AVInputFormat * here)
    AVInputFormat *iformat = av_find_input_format("alsa");

    /*
    * AVFormatContext **ps : output; the context, which acts as a handle
    * const char *url : a network address or a local file; for a device, just the device name
    * AVInputFormat *iformat : the capture method used for the audio device
    * AVDictionary **options : options for that capture method; none are set here, so the dictionary is still NULL
    */
    // open device
    if ((ret = avformat_open_input(&fmt_ctx, devicename, iformat, &options)) < 0)
    {
        /*
        * int errnum : error code
        * char *errbuf : buffer that receives the error message
        * size_t errbuf_size : size of that buffer
        */
        av_strerror(ret, error, sizeof(error));
        fprintf(stderr, "Failed to open audio device, [%d]%s\n", ret, error); // print to standard error
        return;
    }

    /*
     * AVFormatContext *s : the context obtained when the device was opened
     * AVPacket *pkt : output; the data that was read (audio data here)
    */
    // read data from device; the counter is checked first so that a packet
    // is never read and then dropped without being unreferenced
    while (count++ < 500 && (ret = av_read_frame(fmt_ctx, &pkt)) == 0)
    {
        av_log(NULL, AV_LOG_INFO, "packet size is %d(%p), count=%d \n", pkt.size, pkt.data, count);
        av_packet_unref(&pkt); // release pkt
    }

    // close device and release ctx
    avformat_close_input(&fmt_ctx);
    av_log(NULL, AV_LOG_DEBUG, "finish!\n");
}

Recording Audio

Create the output file

Write the audio data to the file

Close the file

Header file

#ifndef TESTC_H
#define TESTC_H

#include <stdio.h>
#include "libavutil/avutil.h"
#include "libavdevice/avdevice.h" // device registration
#include "libavformat/avformat.h" // ffmpeg treats every device and multimedia format as a format, all handled through avformat
#include "libavcodec/avcodec.h" // provides AVPacket

void rec_audio(void);

#endif

C file

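#include "my_ffmpeg_test.h"

void rec_audio(void)
{
    int ret = 0;
    char error[1024] = {0};

    // ctx
    AVFormatContext *fmt_ctx = NULL; // pointer to the device context
    AVDictionary *options = NULL;    // options for the chosen capture method

    // packet
    int count = 0; // number of packets read so far
    AVPacket pkt;  // a packet on the stack, reused for every read

    // avfoundation (macOS) uses [[video device]:[audio device]], where brackets mean optional;
    // ALSA uses names such as hw:0 or plughw:0,0
    const char *devicename = "hw:0"; // audio device only: the first audio device, selected by its index

    // set log level
    av_log_set_level(AV_LOG_DEBUG);

    // register all devices, not only audio ones but video and others as well
    avdevice_register_all();
    avformat_network_init();

    // create the output file
    FILE *outfile = fopen("tem.pcm", "wb+");
    if (!outfile)
    {
        fprintf(stderr, "Failed to open output file tem.pcm\n");
        return;
    }

    // get the input format: "alsa" on Linux, "avfoundation" on macOS
    // (newer FFmpeg versions return a const AVInputFormat * here)
    AVInputFormat *iformat = av_find_input_format("alsa");

    /*
    * AVFormatContext **ps : output; the context, which acts as a handle
    * const char *url : a network address or a local file; for a device, just the device name
    * AVInputFormat *iformat : the capture method used for the audio device
    * AVDictionary **options : options for that capture method; none are set here, so the dictionary is still NULL
    */
    // open device
    if ((ret = avformat_open_input(&fmt_ctx, devicename, iformat, &options)) < 0)
    {
        /*
        * int errnum : error code
        * char *errbuf : buffer that receives the error message
        * size_t errbuf_size : size of that buffer
        */
        av_strerror(ret, error, sizeof(error));
        fprintf(stderr, "Failed to open audio device, [%d]%s\n", ret, error); // print to standard error
        fclose(outfile);
        return;
    }

    /*
     * AVFormatContext *s : the context obtained when the device was opened
     * AVPacket *pkt : output; the data that was read (audio data here)
    */
    // read data from device; the counter is checked first so that a packet
    // is never read and then dropped without being unreferenced
    while (count++ < 5000 && (ret = av_read_frame(fmt_ctx, &pkt)) == 0)
    {
        /*
        * const void *restrict __ptr : pointer to the data to write
        * size_t __size : size of one item in bytes; 1 here, so items are bytes
        * size_t __nitems : number of items to write; the packet size here
        * FILE *restrict __stream : pointer to the file
        */
        // write file
        fwrite(pkt.data, 1, pkt.size, outfile);
        fflush(outfile); // flush, otherwise the data is buffered for efficiency and written later in larger batches
        av_log(NULL, AV_LOG_INFO, "packet size is %d(%p), count=%d \n", pkt.size, pkt.data, count);
        av_packet_unref(&pkt); // release pkt
    }

    // close file
    fclose(outfile);

    // close device and release ctx
    avformat_close_input(&fmt_ctx);
    av_log(NULL, AV_LOG_DEBUG, "finish!\n");
}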
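
The recorded tem.pcm file has no header, so nothing in it says how long the recording is. One quick sanity check is to work out the duration from the file size. The sketch below does that; the name `pcm_duration_ms` is made up for this example, and the sample rate, sample size and channel count must be the ones the device actually delivered, which depends on the device and is assumed to be known.

```c
#include <assert.h>
#include <stdint.h>

/* Playback length, in milliseconds, of pcm_bytes bytes of raw PCM.
 * Assumption: the caller knows the parameters the data was captured with. */
uint64_t pcm_duration_ms(uint64_t pcm_bytes, uint32_t sample_rate_hz,
                         uint32_t bits_per_sample, uint32_t channels)
{
    uint64_t bytes_per_second;

    assert(bits_per_sample % 8 == 0);
    bytes_per_second = (uint64_t)sample_rate_hz * (bits_per_sample / 8) * channels;
    assert(bytes_per_second > 0);

    return pcm_bytes * 1000 / bytes_per_second;
}
```

For instance, a 1764000 byte file recorded at 44.1KHz, 16bit, two channels works out to 10000 ms. If the figure is far from the time actually spent recording, the assumed parameters are probably wrong.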