A Summary of Audio Encoding and Decoding Resources

pangdawa

Category: C++. Tags: audio encoding and decoding

Article link: https://blog.csdn.net/haithink/article/details/115254854

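Recently at work I had to implement audio push: sending audio data to a server, which decodes it and plays it back. The server side is the contractor's responsibility, but the contractor appears to have simply copied code from the internet without understanding the technical details or key points, so I ended up hitting every pitfall myself.

First I needed PCM data, so I found a WAV file, since WAV files usually contain raw PCM data directly. I then discovered that much of the material describing the WAV file format does not match real files, mainly in the structure of the file header. Take this Baidu Wenku article, for example: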

https://wenku.baidu.com/view/fb783def89eb172ded63b77e.html

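That article covers WAV files in PCM format and in ADPCM format.
It gives the impression that once the encoding of the audio data inside a WAV file is fixed, the file header is fixed too, but opening some real WAV files shows that this is not the case.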

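A WAV file header is made up of multiple chunks, and real files show that the number and types of chunks vary.
WAV is a RIFF-format file. The two articles below are worth reading with a few real files open in a hex editor:
https://www.jianshu.com/p/63d7aa88582b (WAV file format parsing and processing). The LIST chunk is a special case, because this kind of chunk can nest other chunks.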

https://www.cnblogs.com/wangguchangqing/p/5957531.html (RIFF and WAVE audio file formats)

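The audio data inside a WAV file is usually the last chunk. If it holds PCM data, the Linux dd command can extract all of the audio data into a separate file:
dd if=ima_adpcm.wav of=ima_adpcm.bin bs=1 skip=94
This assumes the file header occupies the first 94 bytes. The actual offset depends on which chunks the file contains, so check it in a hex editor first.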
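Rather than reading the header size off a hex dump, the chunk list can be walked in code. The following is a minimal sketch, assuming the whole file has already been read into memory; `findDataChunkOffset` is a hypothetical helper name, not something from the articles linked above. It treats a LIST chunk as one opaque top-level chunk and skips it entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not from the original post): returns the byte offset of
// the payload of the first top-level "data" chunk, or -1 if there is none.
// If dataSize is non-null it receives the payload size in bytes.
long findDataChunkOffset(const std::vector<uint8_t>& file, uint32_t* dataSize = nullptr)
{
    // RIFF header layout: "RIFF" <uint32 size> "WAVE"
    if (file.size() < 12 ||
        std::memcmp(file.data(), "RIFF", 4) != 0 ||
        std::memcmp(file.data() + 8, "WAVE", 4) != 0)
        return -1;

    std::size_t pos = 12;  // first chunk starts right after the RIFF header
    while (pos + 8 <= file.size()) {
        // Each chunk: 4-byte id, then a little-endian 32-bit payload size.
        uint32_t size = static_cast<uint32_t>(file[pos + 4])
                      | (static_cast<uint32_t>(file[pos + 5]) << 8)
                      | (static_cast<uint32_t>(file[pos + 6]) << 16)
                      | (static_cast<uint32_t>(file[pos + 7]) << 24);

        if (std::memcmp(file.data() + pos, "data", 4) == 0) {
            if (dataSize) *dataSize = size;
            return static_cast<long>(pos + 8);
        }
        // Skip the payload; RIFF pads odd-sized chunks to an even length.
        pos += 8 + static_cast<std::size_t>(size) + (size & 1u);
    }
    return -1;
}
```

The returned offset is the value that would go into `skip=` in the dd command, and the reported size tells you how many bytes to copy if the data chunk is not the last one.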

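With the PCM audio data in hand, it has to be encoded as ADPCM. Here I hit another problem: much of the material and code online is either incomplete, or encodes A into B and decodes B into C, only for C to turn out different from A.

For the ADPCM encoding algorithm and data structures, see: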
https://blog.csdn.net/houxiaoni01/article/details/104702570

https://blog.csdn.net/u013910954/article/details/78366451

struct ADPCMBlock
{
    qint16 sample0; // raw (uncompressed) PCM sample
    quint8 index;
    quint8 RESERVED;
    quint8 sampledata[252];
};

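The block here is 256 bytes, but it does not have to be that size. Likewise, each sample here is represented with 16 bits, which is not necessarily the case in practice.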
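The relationship between block size and the amount of PCM data per block can be written down directly. This is a small sketch under the assumptions used in this post: mono audio, a 4-byte block header (sample0, index, reserved), 4 bits per encoded sample, and 16-bit PCM. The function names are illustrative only.

```cpp
#include <cassert>

// Number of PCM samples represented by one ADPCM block:
// one uncompressed sample in the header plus two 4-bit codes per payload byte.
int samplesPerBlock(int blockAlign)
{
    assert(blockAlign > 4);  // must be larger than the 4-byte header
    return (blockAlign - 4) * 2 + 1;
}

// Number of 16-bit PCM bytes consumed (or produced) per ADPCM block.
int pcmBytesPerBlock(int blockAlign)
{
    return samplesPerBlock(blockAlign) * 2;
}
```

For a 256-byte block this gives 505 samples, i.e. 1010 bytes of PCM, which is where the 1010 used in test.cpp comes from.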

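Many articles give only the encoder and decoder without a complete calling sequence, and there are details to get right there, mainly how index and sample0 are filled in for each block: if one block is encoded incorrectly, every block after it is likely to be wrong as well.
The complete code follows, in three files.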
adpcm.h

/*
** adpcm.h - include file for adpcm coder.
**
** Version 1.0, 7-Jul-92.
*/

#ifndef _ADPCM_H_
#define _ADPCM_H_

#ifdef __cplusplus
extern "C" {
#endif

typedef struct adpcm_state_t {
	short	valprev;	/* Previous output value */
	char	index;		/* Index into stepsize table */
}adpcm_state;


void adpcm_coder(short *indata, char *outdata, int len, adpcm_state *state);

void adpcm_decoder(char *indata, short *outdata, int len, adpcm_state *state);

#ifdef __cplusplus
}
#endif

#endif

adpcm.c

/***********************************************************
Copyright 1992 by Stichting Mathematisch Centrum, Amsterdam, The
Netherlands.

All Rights Reserved

Permission to use, copy, modify, and distribute this software and its
documentation for any purpose and without fee is hereby granted,
provided that the above copyright notice appear in all copies and that
both that copyright notice and this permission notice appear in
supporting documentation, and that the names of Stichting Mathematisch
Centrum or CWI not be used in advertising or publicity pertaining to
distribution of the software without specific, written prior permission.

STICHTING MATHEMATISCH CENTRUM DISCLAIMS ALL WARRANTIES WITH REGARD TO
THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH CENTRUM BE LIABLE
FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT
OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

******************************************************************/

/*
** Intel/DVI ADPCM coder/decoder.
**
** The algorithm for this coder was taken from the IMA Compatability Project
** proceedings, Vol 2, Number 2; May 1992.
**
** Version 1.2, 18-Dec-92.
**
** Change log:
** - Fixed a stupid bug, where the delta was computed as
**   stepsize*code/4 in stead of stepsize*(code+0.5)/4.
** - There was an off-by-one error causing it to pick
**   an incorrect delta once in a blue moon.
** - The NODIVMUL define has been removed. Computations are now always done
**   using shifts, adds and subtracts. It turned out that, because the standard
**   is defined using shift/add/subtract, you needed bits of fixup code
**   (because the div/mul simulation using shift/add/sub made some rounding
**   errors that real div/mul don't make) and all together the resultant code
**   ran slower than just using the shifts all the time.
** - Changed some of the variable names to be more meaningful.
*/

#include "adpcm.h"
#include <stdint.h>
#include <stdio.h> /*DBG*/

#ifndef __STDC__
#define signed
#endif

//#define ADPCM_ENDIAN_SWAP


#ifdef ADPCM_ENDIAN_SWAP
uint16_t swap_uint16(uint16_t val)
{
	return (val << 8) | (val >> 8);
}

//! Byte swap short
int16_t swap_int16(int16_t val)
{
	return (val << 8) | ((val >> 8) & 0xFF);
}
#endif

/* Intel ADPCM step variation table */
static int indexTable[16] = {
	-1, -1, -1, -1, 2, 4, 6, 8,
	-1, -1, -1, -1, 2, 4, 6, 8,
};

static int stepsizeTable[89] = {
	7, 8, 9, 10, 11, 12, 13, 14, 16, 17,
	19, 21, 23, 25, 28, 31, 34, 37, 41, 45,
	50, 55, 60, 66, 73, 80, 88, 97, 107, 118,
	130, 143, 157, 173, 190, 209, 230, 253, 279, 307,
	337, 371, 408, 449, 494, 544, 598, 658, 724, 796,
	876, 963, 1060, 1166, 1282, 1411, 1552, 1707, 1878, 2066,
	2272, 2499, 2749, 3024, 3327, 3660, 4026, 4428, 4871, 5358,
	5894, 6484, 7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899,
	15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767
};

void adpcm_coder(short *indata, char *outdata, int len, adpcm_state *state)
{
	short *inp;			/* Input buffer pointer */
	signed char *outp;		/* output buffer pointer */
	int val;			/* Current input sample value */
	int sign;			/* Current adpcm sign bit */
	int delta;			/* Current adpcm output value */
	int diff;			/* Difference between val and valprev */
	int step;			/* Stepsize */
	int valpred;		/* Predicted output value */
	int vpdiff;			/* Current change to valpred */
	int index;			/* Current step change index */
	int outputbuffer;		/* place to keep previous 4-bit value */
	int bufferstep;		/* toggle between outputbuffer/output */
	short tmp;

	outp = (signed char *)outdata;
	inp = indata;

	valpred = state->valprev;
	index = state->index;
	step = stepsizeTable[index];

	bufferstep = 1;

	for (; len > 0; len--) {
#ifdef ADPCM_ENDIAN_SWAP	
		tmp = swap_int16(*inp);
		inp++;
		val = tmp;
#else
		val = *inp++;
#endif	

		/* Step 1 - compute difference with previous value */
		diff = val - valpred;
		sign = (diff < 0) ? 8 : 0;
		if (sign) diff = (-diff);

		/* Step 2 - Divide and clamp */
		/* Note:
		** This code *approximately* computes:
		**    delta = diff*4/step;
		**    vpdiff = (delta+0.5)*step/4;
		** but in shift step bits are dropped. The net result of this is
		** that even if you have fast mul/div hardware you cannot put it to
		** good use since the fixup would be too expensive.
		*/
		delta = 0;
		vpdiff = (step >> 3);

		if (diff >= step) {
			delta = 4;
			diff -= step;
			vpdiff += step;
		}
		step >>= 1;
		if (diff >= step) {
			delta |= 2;
			diff -= step;
			vpdiff += step;
		}
		step >>= 1;
		if (diff >= step) {
			delta |= 1;
			vpdiff += step;
		}

		/* Step 3 - Update previous value */
		if (sign)
			valpred -= vpdiff;
		else
			valpred += vpdiff;

		/* Step 4 - Clamp previous value to 16 bits */
		if (valpred > 32767)
			valpred = 32767;
		else if (valpred < -32768)
			valpred = -32768;

		/* Step 5 - Assemble value, update index and step values */
		delta |= sign;

		index += indexTable[delta];
		if (index < 0) index = 0;
		if (index > 88) index = 88;
		step = stepsizeTable[index];

		/* Step 6 - Output value */
		if (bufferstep) {
			outputbuffer = (delta << 4) & 0xf0;
		}
		else {
			*outp++ = (delta & 0x0f) | outputbuffer;
		}
		bufferstep = !bufferstep;
	}

	/* Output last step, if needed */
	if (!bufferstep)
		*outp++ = outputbuffer;

	state->valprev = valpred;
	state->index = index;
}


void adpcm_decoder(char *indata, short *outdata, int len, adpcm_state *state)
{
	printf("len is  %d", len);
	signed char *inp;		/* Input buffer pointer */
	short *outp;		/* output buffer pointer */
	int sign;			/* Current adpcm sign bit */
	int delta;			/* Current adpcm output value */
	int step;			/* Stepsize */
	int valpred;		/* Predicted value */
	int vpdiff;			/* Current change to valpred */
	int index;			/* Current step change index */
	int inputbuffer;		/* place to keep next 4-bit value */
	int bufferstep;		/* toggle between inputbuffer/input */
	short tmp;
	outp = outdata;
	inp = (signed char *)indata;

	valpred = state->valprev;
	index = state->index;
	step = stepsizeTable[index];

	bufferstep = 0;

	for (; len > 0; len--) {

		/* Step 1 - get the delta value */
		if (bufferstep) {
			delta = inputbuffer & 0xf;
		}
		else {
			inputbuffer = *inp++;
			delta = (inputbuffer >> 4) & 0xf;
		}
		bufferstep = !bufferstep;

		/* Step 2 - Find new index value (for later) */
		index += indexTable[delta];
		if (index < 0) index = 0;
		if (index > 88) index = 88;

		/* Step 3 - Separate sign and magnitude */
		sign = delta & 8;
		delta = delta & 7;

		/* Step 4 - Compute difference and new predicted value */
		/*
		** Computes 'vpdiff = (delta+0.5)*step/4', but see comment
		** in adpcm_coder.
		*/
		vpdiff = step >> 3;
		if (delta & 4) vpdiff += step;
		if (delta & 2) vpdiff += step >> 1;
		if (delta & 1) vpdiff += step >> 2;

		if (sign)
			valpred -= vpdiff;
		else
			valpred += vpdiff;

		/* Step 5 - clamp output value */
		if (valpred > 32767)
			valpred = 32767;
		else if (valpred < -32768)
			valpred = -32768;

		/* Step 6 - Update step value */
		step = stepsizeTable[index];

		/* Step 7 - Output value */
#ifdef ADPCM_ENDIAN_SWAP	
		tmp = valpred;
		tmp = swap_int16(tmp);
		*outp++ = tmp;
#else
		*outp++ = valpred;
#endif			

	}

	state->valprev = valpred;
	state->index = index;
}

test.cpp

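#include <cstdio>   // fopen, fread, fwrite, fseek, ftell, printf
#include <cstdlib>  // malloc, free
#include <cstring>  // memset
#include <iostream>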

#include "adpcm.h"

struct ADPCMBlock
{
	short sample0;
	unsigned char index;
	unsigned char RESERVED;
	char sampledata[252];
};

void convertPCM2ADPCM() {
	const char * rawPcmFile = "L43.raw"; // Note: samples must be 16-bit
	FILE * input_fd = fopen(rawPcmFile, "rb");

	if (!input_fd) {
		std::cout << " open input file failed\n";
		return;
	}
	fseek(input_fd, 0L, SEEK_END);
	long pcmSize = ftell(input_fd);

	fseek(input_fd, 0L, SEEK_SET);

	int totalblocks = 0;
	int blockcnt = 0;
	int readUnitSize = 1010; // With a 256-byte ADPCM block, each block covers 1010 bytes of PCM data; see https://blog.csdn.net/u013910954/article/details/78366451

	if (pcmSize % readUnitSize)
		totalblocks = (pcmSize) / readUnitSize + 1;                       //get total adpcm blocks
	else
		totalblocks = (pcmSize) / readUnitSize;

	adpcm_state_t state;
	state.index = 0;
	state.valprev = 0;

	int totalreadsize = 0;

	unsigned char * buf = (unsigned char *)malloc(2000);

	const char * outputFileName = "L43_c_gen.bin";
	FILE * output_fd = fopen(outputFileName, "wb");

	if (!output_fd || !buf) {
		std::cout << " open output file or allocate buffer failed\n";
		if (output_fd) fclose(output_fd);
		fclose(input_fd);
		free(buf);
		return;
	}

	while (blockcnt < totalblocks)
	{
		ADPCMBlock block = {};                                          // zero-filled, so a short last block is padded with zeros
		if (blockcnt == 0)
			block.index = 0;
		else
			block.index = state.index;                                                  //get block index
		int readdatasize = fread(buf, 1, readUnitSize, input_fd);       // read 1010 byte pcm data(505 sample points)
		if (readdatasize < 2)
			break;                                                      // less than one full sample left
		totalreadsize += readdatasize;                                                  //update total read size
		block.sample0 = (static_cast<short>(buf[1]) << 8) | buf[0];       //get first sample point of current block;
		block.RESERVED = 0;
		state.valprev = block.sample0;
		adpcm_coder(reinterpret_cast<short *>(&buf[2]), block.sampledata, (readdatasize - 2) / 2, &state);//convert the remain 504 sample points;
		
		std::cout << (int)state.index << std::endl;
		fwrite(&block, 1, sizeof(block), output_fd);                    //write 256 bytes block data

		blockcnt++;
	}


	fclose(output_fd);
	fclose(input_fd);
	free(buf);
}

void convertADPCM2PCM() {
	const char * inputFileName = "output8000_2.bin";
	FILE * input_fd = fopen(inputFileName, "rb");

	if (!input_fd) {
		std::cout << " open input file failed\n";
		return;
	}
	fseek(input_fd, 0L, SEEK_END);
	long adpcmfilesize = ftell(input_fd);

	fseek(input_fd, 0L, SEEK_SET);

	int blockalign = 256;

	int totalblocks;

	if ((adpcmfilesize) % (blockalign))
		totalblocks = (adpcmfilesize) / (blockalign) + 1;
	else
		totalblocks = (adpcmfilesize) / (blockalign);

	adpcm_state_t state = {};

	int blockcnt = 0;

	const char * outputFileName = "output8000_2_c_gen.pcm";
	FILE * output_fd = fopen(outputFileName, "wb");

	unsigned char * buf = (unsigned char *)malloc(2000);

	if (!output_fd || !buf) {
		std::cout << " open output file or allocate buffer failed\n";
		if (output_fd) fclose(output_fd);
		fclose(input_fd);
		free(buf);
		return;
	}

	int shortNumber = ((blockalign - 4) * 4 + 2) / 2;               // 505 samples per 256-byte block
	std::cout << "short number is " << shortNumber << std::endl;

	short * pcmbuffer = new short[shortNumber];

	while (blockcnt < totalblocks)
	{
		int readdatasize = fread(buf, 1, blockalign, input_fd);       // read one ADPCM block (up to 256 bytes)
		if (readdatasize < 4)
			break;                                                      // incomplete block header

		pcmbuffer[0] = static_cast<short>(buf[1]) << 8 | buf[0];      // sample0 is stored uncompressed
		state.index = buf[2] > 88 ? 88 : buf[2];                      // step index from the header, clamped to the table range
		state.valprev = pcmbuffer[0];

		int samples = (readdatasize - 4) * 2;                         // two 4-bit codes per payload byte
		adpcm_decoder((char *)&buf[4], &pcmbuffer[1], samples, &state);

		// write only what was decoded, so a short last block does not emit stale data
		fwrite(pcmbuffer, sizeof(short), samples + 1, output_fd);

		blockcnt++;
	}

	delete[]pcmbuffer;

	fclose(output_fd);
	fclose(input_fd);
	free(buf);
}

int main() {

	convertPCM2ADPCM();
	return 0;
	//convertADPCM2PCM();
	//return 0;

#if 0
	// Disabled: this part extracts the audio payloads from a captured stream dump.
	// It cannot be built as posted, because getFrameFromH264File() is not included
	// in this article and fopen_s/errno_t are MSVC-specific.
	char filename[] = "tcpdump.bin";

	FILE  * fd;
	errno_t err = fopen_s(&fd, filename, "rb");

	if (err != 0) {
		return -1;
	}

	char *  frame = (char *)malloc(500000);

	FILE  * fd_out;
	char filenameOut[] = "audio.bin";

	err = fopen_s(&fd_out, filenameOut, "wb");

	if (err != 0) {
		printf("fopen_s outfile failed\n");
		return -1;
	}

	while (1) {
		int frameSize = getFrameFromH264File(fd, frame, 500000);
		if (frameSize < 0)
		{
			printf("read finish\n");
			break;
		}

		// Check the header
		int type = (int)frame[15];

		if (type == 0x30) {
			// This should be an audio packet
			int t1 = *(unsigned char *)(frame + 24);
			int t2 = *(unsigned char *)(frame + 25);
			int loadSize = (t1 << 8) + t2;   // payload size, big-endian
			printf("type is %d \n", type);

			printf("loadSize is  %d\n", loadSize);

			fwrite(frame+26, 1, loadSize, fd_out);
		}
	}

	free(frame);
	fclose(fd);
	fclose(fd_out);

	return 0;
#endif
}

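How can you verify that the encoder and decoder are correct? Encode the PCM data to ADPCM, decode it back to PCM, and compare the result with the original PCM data. ADPCM is lossy, so the two will not be byte-identical: sample0 of every block is stored uncompressed and should match exactly, while the remaining samples should differ from the original only by a small quantization error. Large deviations, or deviations that keep growing from one block to the next, usually point to a mistake in how index and sample0 are handled per block.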
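The comparison can be automated instead of done by eye. Below is a minimal sketch, assuming both PCM streams have already been loaded as 16-bit samples; `maxAbsSampleDiff` is a hypothetical helper name, and what counts as an acceptable difference depends on the signal and is left to the reader.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Largest absolute per-sample difference between two PCM streams.
// Only the overlapping part is compared if the lengths differ.
int maxAbsSampleDiff(const std::vector<int16_t>& original,
                     const std::vector<int16_t>& decoded)
{
    const std::size_t n = std::min(original.size(), decoded.size());
    int worst = 0;
    for (std::size_t i = 0; i < n; ++i) {
        // Widen to int first so the subtraction cannot overflow 16 bits.
        const int d = std::abs(static_cast<int>(original[i]) - static_cast<int>(decoded[i]));
        worst = std::max(worst, d);
    }
    return worst;
}
```

Running this per block rather than over the whole file makes it easier to see where an error starts.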

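A useful tool here is Audacity, which some people call the go-to tool for audio analysis. It can play raw PCM data directly through the menu: File -> Import -> Raw Data.
Take care to fill in the bit depth, endianness, and sample rate correctly; all of them matter.

For a hex editor, I recommend HxD hex editor.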
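To see why endianness matters, here is a small illustrative sketch (the function names are made up for this example) that interprets the same two bytes as one 16-bit sample in both byte orders.

```cpp
#include <cassert>
#include <cstdint>

// Interpret two consecutive bytes as a signed 16-bit sample, little-endian:
// the first byte in the file is the low byte.
constexpr int16_t sampleLE(uint8_t first, uint8_t second)
{
    return static_cast<int16_t>(static_cast<uint16_t>(first | (second << 8)));
}

// Same two bytes, big-endian: the first byte in the file is the high byte.
constexpr int16_t sampleBE(uint8_t first, uint8_t second)
{
    return static_cast<int16_t>(static_cast<uint16_t>((first << 8) | second));
}

// The bytes 0x34 0x12 decode to two very different sample values.
static_assert(sampleLE(0x34, 0x12) == 0x1234, "little-endian: 4660");
static_assert(sampleBE(0x34, 0x12) == 0x3412, "big-endian: 13330");
```

Importing with the wrong byte order typically plays back as loud noise, which is a quick way to recognise the mistake.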

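Sometimes PCM data has to be extracted directly from files such as MP3. ffmpeg can do this:
ffmpeg -i L43.mp3 -f s16le -ac 1 -acodec pcm_s16le -ar 8000 L43.raw
-ac sets the number of channels, pcm_s16le means signed 16-bit samples stored little-endian, and -ar converts the sample rate to 8000 Hz.

ffmpeg also supports ADPCM, but it offers so many ADPCM variants that, after trying two and getting a different file header and data section from each, I could not tell which one I actually needed.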
