An Alternative to Dolby Smart Volume

Article link: https://blog.csdn.net/timedog/article/details/38037753

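This post introduces Dolby Smart Volume, a technology that addresses inconsistent volume across TV channels. Its core is dynamic gain control, and similar behavior can be obtained with the compand filter in SoX/FFmpeg. The post compares three ways of implementing dynamic gain control and gives an application example built on compand.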


Dolby's Smart Volume is a Dolby solution aimed at TVs and embedded products. Its main purpose is to even out the loudness differences between TV channels.

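At its core it is dynamic gain control. With the compand audio effect (a compander) in SoX/FFmpeg, similar functionality can be implemented quite simply. SoX is a powerful open-source audio processing program. Its processing is still floating-point, but given how far processors have come, this is unlikely to be much of a problem. The SoX example documentation shows several cases of cascading multiple audio effects, and in testing they worked well.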
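The idea of dynamic gain control can be sketched in a few lines. The following is a minimal C++ illustration, not SoX's or Dolby's implementation: a one-pole level follower with separate attack and decay times, plus a gain rule that pulls the tracked level toward a target. The names `EnvelopeFollower` and `leveling_gain`, and the `target_level` and `max_gain` parameters, are hypothetical and exist only for this sketch.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper: tracks the signal level with a one-pole smoother that
// reacts at one speed when the level rises (attack) and at another when it
// falls (decay).
class EnvelopeFollower {
public:
    EnvelopeFollower(double sample_rate, double attack_s, double decay_s)
        : attack_(coef(sample_rate, attack_s)),
          decay_(coef(sample_rate, decay_s)),
          level_(0.0) {}

    // Feed one sample in [-1, 1]; returns the smoothed absolute level.
    double process(double sample) {
        const double delta = std::fabs(sample) - level_;
        level_ += delta * (delta > 0.0 ? attack_ : decay_);
        return level_;
    }

private:
    // Per-sample smoothing coefficient for a time constant given in seconds.
    // Time constants no longer than one sample period track the input instantly.
    static double coef(double sample_rate, double seconds) {
        if (seconds <= 1.0 / sample_rate) return 1.0;
        return 1.0 - std::exp(-1.0 / (sample_rate * seconds));
    }

    double attack_;
    double decay_;
    double level_;
};

// Hypothetical gain rule: scale the signal so that the tracked level lands on
// target_level, but never amplify by more than max_gain, so that silence is
// not boosted without bound.
double leveling_gain(double level, double target_level, double max_gain) {
    if (level <= 0.0) return max_gain;
    return std::min(max_gain, target_level / level);
}
```

For a 32 kHz stream, `EnvelopeFollower f(32000.0, 0.1, 0.3)` would use the same attack and decay times as the listing later in this post; each output sample would then be `sample * leveling_gain(f.process(sample), target_level, max_gain)`.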

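On a TV, there are three main ways to implement dynamic gain control:

1. Hardware DRC.

2. A third-party solution, from Dolby or another vendor; I have also seen solutions from domestic (Chinese) vendors.

3. The open-source compander.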

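Compared with DRC, the compander has one advantage. DRC normalizes the signal as a whole, which is simple and works for speakers with good linearity, whereas the compander can be tuned to the frequency response of the system. Domestic manufacturers commonly use speakers with poor frequency response, so the compander has the edge there. I once had to work with a poor speaker supplied by a vendor, and even after a long time tuning the hardware DRC the result was still bad.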

Below is an application. For simplicity, a zero-delay configuration was chosen. compand has many parameters; they can be looked up on the SoX forum.
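Before the listing, a reading aid for the two main arguments it passes to compand: the transfer function "-60,-60,-30,-15,-20,-12,-4,-8,-2,-7" is a list of input/output level pairs in dB, and "-9" is a gain in dB applied to the whole curve. The C++ sketch below evaluates such a curve by plain linear interpolation in dB. It is a simplification, not SoX's code: it ignores the soft knee, and the handling below the first breakpoint (keep that breakpoint's gain) and above the last one (hold the output) are assumptions of this sketch. The final (0,0) point is added because the SoX manual states that this point is assumed unless overridden. `Point`, `transfer_db`, and `article_curve` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One breakpoint of the static curve: input level and output level, in dB.
struct Point {
    double in_db;
    double out_db;
};

// Piecewise-linear interpolation in the dB domain, plus an overall gain.
// Assumptions of this sketch: below the first breakpoint the gain of that
// breakpoint is kept (unity slope); above the last breakpoint the output is
// held at the last breakpoint's level. Breakpoints must be sorted by in_db.
double transfer_db(const std::vector<Point>& pts, double in_db, double gain_db) {
    assert(!pts.empty());
    if (in_db <= pts.front().in_db) {
        return in_db + (pts.front().out_db - pts.front().in_db) + gain_db;
    }
    for (std::size_t i = 1; i < pts.size(); ++i) {
        if (in_db <= pts[i].in_db) {
            const Point& a = pts[i - 1];
            const Point& b = pts[i];
            const double t = (in_db - a.in_db) / (b.in_db - a.in_db);
            return a.out_db + t * (b.out_db - a.out_db) + gain_db;
        }
    }
    return pts.back().out_db + gain_db;
}

// The breakpoints from the listing, followed by the implied (0,0) point.
std::vector<Point> article_curve() {
    return {{-60.0, -60.0}, {-30.0, -15.0}, {-20.0, -12.0},
            {-4.0, -8.0},   {-2.0, -7.0},   {0.0, 0.0}};
}
```

Under these assumptions, `transfer_db(article_curve(), -30.0, -9.0)` gives -24 dB (-15 dB from the curve plus the -9 dB gain), while an input at -60 dB or below is only lowered by the 9 dB gain. Quiet material is therefore raised relative to loud material, which is the leveling effect the post is after.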

    // Post Processing
   {
       sox_signalinfo_t signal;
       sox_encodinginfo_t encoding;
       static sox_effect_handler_t in_handler = {
           "input", NULL, SOX_EFF_CHAN, NULL, NULL, NULL, input_drain, NULL, NULL, 0
       };
       static sox_effect_handler_t out_handler = {
           "output", NULL, SOX_EFF_CHAN, NULL, NULL, output_flow, NULL, NULL, NULL, 0
       };
       sox_effect_t * e;
       char * args[3];

       signal.channels =1;
       signal.precision=16;
       signal.rate=32000.0;
       signal.mult = NULL;

       encoding.bits_per_sample =16;
       encoding.encoding = SOX_ENCODING_SIGN2;
       encoding.compression = 1.0;
       encoding.reverse_bits = sox_option_default;
       encoding.reverse_bytes = sox_option_default;
       encoding.reverse_nibbles = sox_option_default;
       encoding.opposite_endian = sox_false;

       if(sox_init() != SOX_SUCCESS)
           WEBRTC_TRACE(kTraceError, kTraceUtility, -1, "Sox init failed");

       _chain = sox_create_effects_chain(&encoding, &encoding);

       e = sox_create_effect(&in_handler);
       sox_add_effect(_chain, e, &signal, &signal);
       free(e);
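       e = sox_create_effect(sox_find_effect("compand"));
       /* attack,decay times in seconds */
       args[0] = (char *)"0.1,0.3";
       /* transfer function as in,out pairs in dB: -60 -> -60, -30 -> -15, ... */
       args[1] = (char *)"-60,-60,-30,-15,-20,-12,-4,-8,-2,-7";
       /* overall gain in dB, applied to every point of the transfer function */
       args[2] = (char *)"-9";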

       sox_effect_options(e, 3, args);

       /* Add the effect to the end of the effects processing chain: */
       sox_add_effect(_chain, e, &signal, &signal);
       free(e);

       e = sox_create_effect(&out_handler);
       sox_add_effect(_chain, e, &signal, &signal);
       free(e);
   }

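/* One frame of 320 samples (10 ms of 32 kHz mono audio), shared between
 * PostProcess() and the chain's input/output handlers. */
static short io_temp[320];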

int Channel::PostProcess(AudioFrame& frame) {
   int i;

   WEBRTC_TRACE(kTraceInfo, kTraceVoice, VoEId(_instanceId,_channelId),
       "Channel::PostProcess()");
   CriticalSectionScoped cs(&_callbackCritSect);

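   /* Hand the frame to the chain's input handler through the shared buffer. */
   for (i = 0; i < 320; i++) io_temp[i] = frame.data_[i];
   /* Run input -> compand -> output once; the result ends up in io_temp. */
   sox_flow_effects(_chain, NULL, NULL);
   for (i = 0; i < 320; i++) frame.data_[i] = io_temp[i];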
   return 0;
}

int Channel::input_drain(
                       sox_effect_t * effp, sox_sample_t * obuf, size_t * osamp)
{
   int i;

   (void)effp;
   /* The output buffer must be able to hold one whole frame. */
   if (*osamp < 320)
   {
       *osamp = 0;
       return SOX_EOF;
   }

   /* Scale 16-bit PCM up to SoX's 32-bit sample range, so that compand's dB
    * breakpoints are measured against full scale. */
   for (i = 0; i < 320; i++) obuf[i] = SOX_SIGNED_16BIT_TO_SAMPLE(io_temp[i],);

   /* Report how many samples were actually written. */
   *osamp = 320;

   return SOX_EOF;
}

/* The function that will be called to output samples from the effects chain.
* Here the processed samples are copied back into io_temp, from where
* PostProcess() writes them into the audio frame. */
int Channel::output_flow(sox_effect_t *effp LSX_UNUSED, sox_sample_t const * ibuf,
                       sox_sample_t * obuf LSX_UNUSED, size_t * isamp, size_t * osamp)
{
   int i;
   size_t clips = 0; /* incremented by the conversion macro when it saturates */
   SOX_SAMPLE_LOCALS;

   /* This handler ends the chain and never produces output of its own. */
   *osamp = 0;

   /* Only a complete 320-sample frame is copied back, scaled down to 16 bits. */
   if (*isamp == 320)
   {
       for (i = 0; i < 320; i++)
           io_temp[i] = SOX_SAMPLE_TO_SIGNED_16BIT(ibuf[i], clips);
   }

   return SOX_SUCCESS; /* All samples output successfully */
}
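
One detail worth checking in a listing like this is sample scaling. `sox_sample_t` is a 32-bit integer and effects measure level relative to its full range, so 16-bit PCM is normally scaled by 2^16 on the way into the chain and scaled back on the way out (sox.h provides conversion macros such as `SOX_SIGNED_16BIT_TO_SAMPLE` for this). The SoX-independent C++ sketch below shows the arithmetic and why it matters; the helper names are hypothetical.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helpers, named for this sketch only.

// Widen a 16-bit PCM sample to the 32-bit range (multiply by 2^16).
std::int32_t pcm16_to_full_scale(std::int16_t s) {
    return static_cast<std::int32_t>(s) * 65536;
}

// Narrow a 32-bit sample back to 16 bits by dropping the low 16 bits.
// Division truncates toward zero; there is no rounding or dithering here.
std::int16_t full_scale_to_pcm16(std::int32_t s) {
    return static_cast<std::int16_t>(s / 65536);
}

// Level of a 32-bit sample relative to 32-bit full scale, in dB.
// A zero sample yields negative infinity.
double dbfs(std::int32_t s) {
    return 20.0 * std::log10(std::fabs(static_cast<double>(s)) / 2147483648.0);
}
```

A full-scale 16-bit sample (32767) that is merely cast to 32 bits sits at roughly -96 dB relative to 32-bit full scale, below every breakpoint of the transfer function in the listing, so the compander would have nothing to work on. After scaling, the same sample sits at about 0 dB.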