基于RNN的音频降噪算法 (附完整C代码)

 

前几天无意间看到一个项目rnnoise。

项目地址: https://github.com/xiph/rnnoise

基于RNN的音频降噪算法。

采用的是 GRU/LSTM 模型。

阅读下训练代码,可惜的是作者没有提供数据训练集。

不过基本可以断定他采用的数据集里,肯定有urbansound8k。

urbansound8k 数据集地址:

https://serv.cusp.nyu.edu/projects/urbansounddataset/urbansound8k.html

也可以考虑采用用作者训练的模型来构建数据集的做法,不过即费事,也麻烦。

经过实测,降噪效果很不错,特别是在背景噪声比较严重的情况下。

不过作者仅仅提供 pcm 的代码示例,并且还只支持48K采样率,

( 明显是为了兼容其另一个 项目  opus)

在很多应用场景下,这很不方便。

尽管稍微有点麻烦,但是事在人为,花了点时间,稍作修改。

 

具体修改如下:

1.支持wav格式

 采用dr_wav(https://github.com/mackron/dr_libs/blob/master/dr_wav.h )

2.支持全部采样率

采样率的处理问题,采用简单粗暴法,

详情请移步博主另一篇小文《简洁明了的插值音频重采样算法例子 (附完整C代码)

3.增加CMake文件

4.增加测试用 示例音频sample.wav 

取自(https://github.com/orctom/rnnoise-java)

 

贴上完整示例代码 :

/* Copyright (c) 2017 Mozilla */
/*
   Redistribution and use in source and binary forms, with or without
   modification, are permitted provided that the following conditions
   are met:

   - Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.

   - Redistributions in binary form must reproduce the above copyright
   notice, this list of conditions and the following disclaimer in the
   documentation and/or other materials provided with the distribution.

   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
   ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
   A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE FOUNDATION OR
   CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
   EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
   PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
   PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
   LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
   NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
   SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/

#include <stdio.h>
#include "rnnoise.h"
#include <stdlib.h>
#include <stdint.h>

#define DR_WAV_IMPLEMENTATION

#include "dr_wav.h"


void wavWrite_int16(char *filename, int16_t *buffer, int sampleRate, uint32_t totalSampleCount) {
    drwav_data_format format;
    format.container = drwav_container_riff;
    format.format = DR_WAVE_FORMAT_PCM;
    format.channels = 1;
    format.sampleRate = (drwav_uint32) sampleRate;
    format.bitsPerSample = 16;
    drwav *pWav = drwav_open_file_write(filename, &format);
    if (pWav) {
        drwav_uint64 samplesWritten = drwav_write(pWav, totalSampleCount, buffer);
        drwav_uninit(pWav);
        if (samplesWritten != totalSampleCount) {
            fprintf(stderr, "ERROR\n");
            exit(1);
        }
    }
}

int16_t *wavRead_int16(char *filename, uint32_t *sampleRate, uint64_t *totalSampleCount) {
    unsigned int channels;
    int16_t *buffer = drwav_open_and_read_file_s16(filename, &channels, sampleRate, totalSampleCount);
    if (buffer == NULL) {
        fprintf(stderr, "ERROR\n");
        exit(1);
    }
    if (channels != 1) {
        drwav_free(buffer);
        buffer = NULL;
        *sampleRate = 0;
        *totalSampleCount = 0;
    }
    return buffer;
}

void splitpath(const char *path, char *drv, char *dir, char *name, char *ext) {
    const char *end;
    const char *p;
    const char *s;
    if (path[0] && path[1] == ':') {
        if (drv) {
            *drv++ = *path++;
            *drv++ = *path++;
            *drv = '\0';
        }
    } else if (drv)
        *drv = '\0';
    for (end = path; *end && *end != ':';)
        end++;
    for (p = end; p > path && *--p != '\\' && *p != '/';)
        if (*p == '.') {
            end = p;
            break;
        }
    if (ext)
        for (s = end; (*ext = *s++);)
            ext++;
    for (p = end; p > path;)
        if (*--p == '\\' || *p == '/') {
            p++;
            break;
        }
    if (name) {
        for (s = p; s < end;)
            *name++ = *s++;
        *name = '\0';
    }
    if (dir) {
        for (s = path; s < p;)
            *dir++ = *s++;
        *dir = '\0';
    }
}

void resampleData(const int16_t *sourceData, int32_t sampleRate, uint32_t srcSize, int16_t *destinationData,
                  int32_t newSampleRate) {
    if (sampleRate == newSampleRate) {
        memcpy(destinationData, sourceData, srcSize * sizeof(int16_t));
        return;
    }
    uint32_t last_pos = srcSize - 1;
    uint32_t dstSize = (uint32_t) (srcSize * ((float) newSampleRate / sampleRate));
    for (uint32_t idx = 0; idx < dstSize; idx++) {
        float index = ((float) idx * sampleRate) / (newSampleRate);
        uint32_t p1 = (uint32_t) index;
        float coef = index - p1;
        uint32_t p2 = (p1 == last_pos) ? last_pos : p1 + 1;
        destinationData[idx] = (int16_t) ((1.0f - coef) * sourceData[p1] + coef * sourceData[p2]);
    }
}

void denoise_proc(int16_t *buffer, uint32_t buffen_len) {
    const int frame_size = 480;
    DenoiseState *st;
    st = rnnoise_create();
    int16_t patch_buffer[frame_size];
    if (st != NULL) {
        uint32_t frames = buffen_len / frame_size;
        uint32_t lastFrame = buffen_len % frame_size;
        for (int i = 0; i < frames; ++i) {
            rnnoise_process_frame(st, buffer, buffer);
            buffer += frame_size;
        }
        if (lastFrame != 0) {
            memset(patch_buffer, 0, frame_size * sizeof(int16_t));
            memcpy(patch_buffer, buffer, lastFrame * sizeof(int16_t));
            rnnoise_process_frame(st, patch_buffer, patch_buffer);
            memcpy(buffer, patch_buffer, lastFrame * sizeof(int16_t));
        }
    }
    rnnoise_destroy(st);
}

void rnnDeNoise(char *in_file, char *out_file) {
    uint32_t in_sampleRate = 0;
    uint64_t in_size = 0;
    int16_t *data_in = wavRead_int16(in_file, &in_sampleRate, &in_size);
    uint32_t out_sampleRate = 48000;
    uint32_t out_size = (uint32_t) (in_size * ((float) out_sampleRate / in_sampleRate));
    int16_t *data_out = (int16_t *) malloc(out_size * sizeof(int16_t));
    if (data_in != NULL && data_out != NULL) {
        resampleData(data_in, in_sampleRate, (uint32_t) in_size, data_out, out_sampleRate);
        denoise_proc(data_out, out_size);
        resampleData(data_out, out_sampleRate, (uint32_t) out_size, data_in, in_sampleRate);
        wavWrite_int16(out_file, data_in, in_sampleRate, (uint32_t) in_size);
        free(data_in);
        free(data_out);
    } else {
        if (data_in) free(data_in);
        if (data_out) free(data_out);
    }
}


int main(int argc, char **argv) {
    printf("Audio Noise Reduction\n");
    printf("blog:http://cpuimage.cnblogs.com/\n");
    printf("e-mail:gaozhihan@vip.qq.com\n");
    if (argc < 2)
        return -1;

    char *in_file = argv[1];
    char drive[3];
    char dir[256];
    char fname[256];
    char ext[256];
    char out_file[1024];
    splitpath(in_file, drive, dir, fname, ext);
    sprintf(out_file, "%s%s%s_out%s", drive, dir, fname, ext);
    rnnDeNoise(in_file, out_file);
    printf("press any key to exit.\n");
    getchar();
    return 0;
}

 

不多写注释,直接看代码吧。

 项目地址:https://github.com/cpuimage/rnnoise

 

示例具体流程为:

加载wav(拖放wav文件到可执行文件上)->重采样降噪->保存wav

 

若有其他相关问题或者需求也可以邮件联系俺探讨。

邮箱地址是: 
gaozhihan@vip.qq.com

转载于:https://www.cnblogs.com/cpuimage/p/8733742.html

  • 0
    点赞
  • 19
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
目前,语音算法有很多种。频谱减法有原理简单、容易实现的优点,是 语音的常用算法。但是频谱减法也有如下两个缺点:一是频谱减法性能的好 坏主要依赖于声估计,而声估计又依赖于端点检测算法。在声水平强度高 时,一般的端点检测算法会失效,无法检测出信号中声帧的具体位置,从而影 响了声估计值的准确性;二是带信号经过频谱减法后,由于在谱减时减 去的是同一声估计值,就使得信号会随机出现分离的谱区,这些谱区就形成了 容易让人耳听觉疲惫的“音乐声”。 针对频谱减法上述的两个缺点,本文对其进行了改进。第一:为了使得声 端点检测算法声水平高时也能获得正确的检测,我们求带信号的幅度值均 值,并根据这个均值与带信号开始数帧的幅度均值大小来判断带信号是以 声开始还是以带语音信号开始。然后根据连续两帧信号的差值的变化来判断 声帧和语音帧的起始位置,同时我们在判断的同时把得到的均值做为声估计值, 这样既考虑到了连续前后两帧信号的相关性又能够衰减声。除此之外,基于本 文改进的声端点检测方法的声估计值能够在整个带语音信号上快速的更新 声估计值,提高频谱减法的实时处理能力。第二:为了减少频谱减法所引入的 音乐声,我们实现了用 LMS 算法在时域上进行语音增强,来处理谱减后的 信号。LMS 算法能够在声水平的同时把音乐声转换为能量更低的白声, 减少了音乐声对人耳的刺激,有助于提高处理后的音频的语音质量,提高主客 观评价效果。

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值