Java PCM resampling: resampling/upsampling sound frames from 8 kHz to 48 kHz (Java / Android)

The application I am developing for Android records frames at 48 kHz (16-bit PCM, mono) and sends them over the network. There is also an incoming audio stream at 8 kHz. I receive the 8 kHz frames and play them (my AudioTrack object is set to 8 kHz). Playback works, but the latency is huge: it takes around 3 seconds before you hear anything.

I think that if I upsample the received frames from 8 kHz to 48 kHz before playing them, the playback latency will not be so large. In fact, when I record and play frames at the same rate, the latency is really low. Unfortunately, I am forced to do it this way: send at 48 kHz and receive at 8 kHz.

As explained above, I'm trying to upsample sound frames (16-bit PCM) from 8 kHz to 48 kHz. Does anybody know of a routine, library, or API in Java that does this?

I know the basics of upsampling a discrete signal, but designing and implementing my own FIR filter and convolving it with the audio stream seems like way too much. It is also beyond my knowledge.

So, can anybody help me with this? Does anybody know of a library or routine in Java that I can use? Any suggestions or alternatives?

Solution

A quick and dirty solution would be linear interpolation. Since you're always upsampling by a factor of six, this is really easy to do.

It works roughly like this (C code, untested, and I don't handle the last iteration properly, but I think it shows the idea):

void resample (short * output, short * input, int n)
{
  // output ought to be 6 times as large as input (48000/8000).
  int i;

  // Stop at n-1 because each step reads input[i+1]; the last input
  // sample is not expanded, so the final six outputs are left unset.
  for (i = 0; i < n - 1; i++)
  {
    output[i*6+0] = input[i]*6/6 + input[i+1]*0/6;
    output[i*6+1] = input[i]*5/6 + input[i+1]*1/6;
    output[i*6+2] = input[i]*4/6 + input[i+1]*2/6;
    output[i*6+3] = input[i]*3/6 + input[i+1]*3/6;
    output[i*6+4] = input[i]*2/6 + input[i+1]*4/6;
    output[i*6+5] = input[i]*1/6 + input[i+1]*5/6;
  }
}
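
Since the question asks for Java, the same idea might look roughly like this there. This is only a minimal sketch and not part of the original answer: the class name `LinearUpsampler` and the method `upsample` are made up for illustration, and holding the last sample at the end of a frame is an assumption about how to deal with the unhandled last iteration.

```java
final class LinearUpsampler {
    private static final int FACTOR = 6; // 48000 / 8000

    /** Upsamples 16-bit mono PCM by a factor of 6 using linear interpolation. */
    static short[] upsample(short[] input) {
        short[] output = new short[input.length * FACTOR];
        for (int i = 0; i < input.length; i++) {
            int current = input[i];
            // Assumption: at the end of the frame, hold the last sample
            // instead of reading past the array.
            int next = (i + 1 < input.length) ? input[i + 1] : current;
            for (int k = 0; k < FACTOR; k++) {
                // Weighted average of the two neighbours; fits easily in an int.
                output[i * FACTOR + k] =
                        (short) ((current * (FACTOR - k) + next * k) / FACTOR);
            }
        }
        return output;
    }
}
```

Each received 8 kHz frame would be passed through `upsample` and the result written to an AudioTrack configured for 48 kHz. When processing a stream, passing the first sample of the following frame in as `next` (or delaying output by one sample) would avoid the small flat step at each frame boundary.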

Linear interpolation won't give you great sound quality, but it is cheap and fast. You can improve on it with cubic interpolation if you want to.
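
As a rough illustration of the cubic option, the sketch below uses a Catmull-Rom spline, which is one common form of cubic interpolation; the answer does not say which cubic to use, so that choice is an assumption, as are the names `CubicUpsampler` and `upsample`. Like linear interpolation, this is not a band-limited resampler, so some aliasing remains.

```java
final class CubicUpsampler {
    private static final int FACTOR = 6; // 48000 / 8000

    /** Upsamples 16-bit mono PCM by 6 using Catmull-Rom cubic interpolation. */
    static short[] upsample(short[] input) {
        int n = input.length;
        short[] output = new short[n * FACTOR];
        for (int i = 0; i < n; i++) {
            // Four neighbouring samples; indices are clamped at the frame edges.
            double p0 = input[Math.max(i - 1, 0)];
            double p1 = input[i];
            double p2 = input[Math.min(i + 1, n - 1)];
            double p3 = input[Math.min(i + 2, n - 1)];
            for (int k = 0; k < FACTOR; k++) {
                double t = (double) k / FACTOR; // position between p1 and p2
                double v = 0.5 * ((2 * p1)
                        + (-p0 + p2) * t
                        + (2 * p0 - 5 * p1 + 4 * p2 - p3) * t * t
                        + (-p0 + 3 * p1 - 3 * p2 + p3) * t * t * t);
                // A cubic curve can overshoot, so clamp to the 16-bit range.
                long r = Math.round(v);
                output[i * FACTOR + k] =
                        (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, r));
            }
        }
        return output;
    }
}
```

The original input samples are reproduced unchanged at every sixth output position, and the curve between them is smoother than the straight segments of the linear version, at the cost of a few more multiplications per output sample.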

If you want fast, high-quality resampling, I suggest compiling a C resampling library such as libresample with the Android NDK and calling it from Java through JNI. That will be a lot faster. Writing JNI code is something most people shy away from, but it's quite easy. The NDK has lots of examples for this.

This is a fairly complex problem that touches on audio decoding, audio processing, and audio output. The following is a rough outline of one approach, for reference.

1. First, use an AAC decoding library to decode the AAC audio data into PCM. Commonly used AAC decoders include FFmpeg and OpenCORE.
2. The decoded PCM then needs to be resampled to the target sample rate. Commonly used resampling libraries include libsamplerate and soxr.
3. The resampled PCM can then be played through an audio output device. You can use the audio output interface provided by the system, such as WASAPI on Windows or ALSA on Linux, or a third-party audio output library such as PortAudio or SDL.

Below is a simple example that shows how to decode AAC audio with FFmpeg and output it through SDL.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <SDL.h>
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libavutil/mathematics.h>
#include <libswresample/swresample.h>

#define AUDIO_FRAME_SIZE 1024
#define OUT_SAMPLE_RATE 48000  // must match the rate SDL is opened with

// Note: this uses the older FFmpeg API (channel_layout, channels,
// swr_alloc_set_opts, av_init_packet), which recent releases deprecate.

int main(int argc, char *argv[]) {
    if (argc < 2) {
        fprintf(stderr, "Usage: %s <audio_file>\n", argv[0]);
        return 1;
    }
    const char *audio_file = argv[1];

    // Open the input file
    AVFormatContext *fmt_ctx = NULL;
    if (avformat_open_input(&fmt_ctx, audio_file, NULL, NULL) < 0) {
        fprintf(stderr, "Error: could not open input file '%s'\n", audio_file);
        return 1;
    }
    // Read stream information so the codec parameters are filled in
    if (avformat_find_stream_info(fmt_ctx, NULL) < 0) {
        fprintf(stderr, "Error: could not read stream info from '%s'\n", audio_file);
        return 1;
    }

    // Find the audio stream
    int audio_stream_index = av_find_best_stream(fmt_ctx, AVMEDIA_TYPE_AUDIO, -1, -1, NULL, 0);
    if (audio_stream_index < 0) {
        fprintf(stderr, "Error: could not find audio stream in input file '%s'\n", audio_file);
        return 1;
    }

    // Look up the audio decoder
    AVCodecParameters *codec_params = fmt_ctx->streams[audio_stream_index]->codecpar;
    const AVCodec *codec = avcodec_find_decoder(codec_params->codec_id);
    if (!codec) {
        fprintf(stderr, "Error: unsupported audio codec '%s'\n", avcodec_get_name(codec_params->codec_id));
        return 1;
    }

    // Open the audio decoder
    AVCodecContext *codec_ctx = avcodec_alloc_context3(codec);
    if (!codec_ctx) {
        fprintf(stderr, "Error: could not allocate audio codec context\n");
        return 1;
    }
    if (avcodec_parameters_to_context(codec_ctx, codec_params) < 0) {
        fprintf(stderr, "Error: could not initialize audio codec context\n");
        return 1;
    }
    if (avcodec_open2(codec_ctx, codec, NULL) < 0) {
        fprintf(stderr, "Error: could not open audio codec\n");
        return 1;
    }

    // Initialize the resampler: decoder format and rate in, S16 at 48 kHz out
    SwrContext *swr_ctx = swr_alloc_set_opts(NULL,
        codec_ctx->channel_layout, AV_SAMPLE_FMT_S16, OUT_SAMPLE_RATE,
        codec_ctx->channel_layout, codec_ctx->sample_fmt, codec_ctx->sample_rate,
        0, NULL);
    if (!swr_ctx) {
        fprintf(stderr, "Error: could not allocate resampler context\n");
        return 1;
    }
    if (swr_init(swr_ctx) < 0) {
        fprintf(stderr, "Error: could not initialize resampler context\n");
        return 1;
    }

    // Initialize SDL
    if (SDL_Init(SDL_INIT_AUDIO) < 0) {
        fprintf(stderr, "Error: could not initialize SDL\n");
        return 1;
    }
    SDL_AudioSpec wanted_spec = {
        .freq = OUT_SAMPLE_RATE,
        .format = AUDIO_S16SYS,
        .channels = codec_ctx->channels,
        .samples = AUDIO_FRAME_SIZE
    };
    SDL_AudioSpec spec;
    SDL_AudioDeviceID dev = SDL_OpenAudioDevice(NULL, 0, &wanted_spec, &spec, 0);
    if (!dev) {
        fprintf(stderr, "Error: could not open SDL audio device\n");
        return 1;
    }
    SDL_PauseAudioDevice(dev, 0);

    // Decode and output the audio data
    int bytes_per_sample = codec_ctx->channels * av_get_bytes_per_sample(AV_SAMPLE_FMT_S16);
    AVPacket pkt;
    av_init_packet(&pkt);
    while (av_read_frame(fmt_ctx, &pkt) == 0) {
        if (pkt.stream_index == audio_stream_index) {
            AVFrame *frame = av_frame_alloc();
            if (!frame) {
                fprintf(stderr, "Error: could not allocate audio frame\n");
                break;
            }
            int ret = avcodec_send_packet(codec_ctx, &pkt);
            while (ret >= 0) {
                ret = avcodec_receive_frame(codec_ctx, frame);
                if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF)
                    break;
                else if (ret < 0) {
                    fprintf(stderr, "Error: could not decode audio frame\n");
                    break;
                }
                // Upper bound on the number of output samples after rate conversion
                int max_out = (int)av_rescale_rnd(
                    swr_get_delay(swr_ctx, codec_ctx->sample_rate) + frame->nb_samples,
                    OUT_SAMPLE_RATE, codec_ctx->sample_rate, AV_ROUND_UP);
                uint8_t *out_buf = malloc((size_t)max_out * bytes_per_sample);
                if (!out_buf) {
                    fprintf(stderr, "Error: could not allocate output buffer\n");
                    break;
                }
                int out_samples = swr_convert(swr_ctx, &out_buf, max_out,
                    (const uint8_t **)frame->extended_data, frame->nb_samples);
                if (out_samples > 0)
                    SDL_QueueAudio(dev, out_buf, out_samples * bytes_per_sample);
                free(out_buf);  // SDL_QueueAudio copies the data
            }
            av_frame_free(&frame);
        }
        av_packet_unref(&pkt);
    }

    // Let the queued audio finish playing before shutting down
    while (SDL_GetQueuedAudioSize(dev) > 0)
        SDL_Delay(100);

    // Clean up
    swr_free(&swr_ctx);
    avcodec_free_context(&codec_ctx);
    avformat_close_input(&fmt_ctx);
    SDL_CloseAudioDevice(dev);
    SDL_Quit();
    return 0;
}
```

The example uses FFmpeg for AAC decoding and PCM resampling, and SDL for audio output. It can be run after compiling and linking against the FFmpeg and SDL libraries. On Linux, the following command can be used:

```
gcc -o aacplayer aacplayer.c -lavcodec -lavformat -lavutil -lswresample -lSDL2
```

On Windows, the following command can be used:

```
gcc -o aacplayer.exe aacplayer.c -lavcodec -lavformat -lavutil -lswresample -lSDL2 -lws2_32
```

Note that the way to compile and link may differ between operating systems (for example, the SDL include path may have to be passed explicitly), so adjust the commands to your setup.