Voice detection in pjsip

Reposted from http://blog.sina.com.cn/s/blog_513f4e8401011hf9.html

pjmedia (part of pjsip) ships with an implementation of silence detection / voice activity detection (VAD).

The macro below controls how long the stream suspends the VAD at the start of a session:

/**
 * Specify how long (in miliseconds) the stream should suspend the
 * silence detector/voice activity detector (VAD) during the initial
 * period of the session. This feature is useful to open bindings in
 * all NAT routers between local and remote endpoint since most NATs
 * do not allow incoming packet to get in before local endpoint sends
 * outgoing packets.
 *
 * Specify zero to disable this feature.
 *
 * Default: 600 msec (which gives good probability that some RTP 
 *                    packets will reach the destination, but without
 *                    filling up the jitter buffer on the remote end).
 */
#ifndef PJMEDIA_STREAM_VAD_SUSPEND_MSEC
#   define PJMEDIA_STREAM_VAD_SUSPEND_MSEC	600
#endif
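
This is a compile-time constant; assuming the usual pjsip build layout, a different suspend window would be set by overriding the macro in config_site.h before building pjmedia, for example:

/* config_site.h -- hypothetical override; the 1000 ms value is chosen
 * purely for illustration. */
#define PJMEDIA_STREAM_VAD_SUSPEND_MSEC    1000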


The method is to accumulate the (absolute) sample values of each frame into a signal level and compare the result against a threshold.

Detection works in either fixed mode or adaptive (dynamic) mode. Fixed mode simply compares the accumulated level against a given threshold; adaptive mode keeps re-estimating the threshold from recent levels, as the code quoted further below shows.
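
For orientation, here is a minimal sketch of driving the detector through the public API in pjmedia/silencedet.h; the 8 kHz clock, 160-sample (20 ms) frame size, the pool, and the frame source are assumptions made for the example:

#include <pjmedia/silencedet.h>

/* Sketch: create a detector for 8 kHz, 20 ms (160-sample) frames, pick a
 * mode, then classify one captured frame. */
static pj_status_t vad_demo(pj_pool_t *pool,
                            const pj_int16_t *frame, pj_size_t samples)
{
    pjmedia_silence_det *sd;
    pj_status_t status;

    status = pjmedia_silence_det_create(pool, 8000, 160, &sd);
    if (status != PJ_SUCCESS)
        return status;

    /* Fixed mode: compare against a constant threshold ... */
    /* pjmedia_silence_det_set_fixed(sd, 1000); */

    /* ... or adaptive mode, whose inner workings are shown below
     * (a negative threshold is assumed to select pjmedia's default). */
    pjmedia_silence_det_set_adaptive(sd, -1);

    if (pjmedia_silence_det_detect(sd, frame, samples, NULL)) {
        /* Frame classified as silence: the codec may skip transmitting it. */
    }
    return PJ_SUCCESS;
}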
The call path for voice detection is:

g711_encode() -> pjmedia_silence_det_detect()

Excerpt from g711_encode() (the VAD branch):

    is_silence = pjmedia_silence_det_detect(priv->vad,
                                            (const pj_int16_t*) input->buf,
                                            (input->size >> 1), NULL);
    if (is_silence &&
        (PJMEDIA_CODEC_MAX_SILENCE_PERIOD == -1 ||
         silence_period < PJMEDIA_CODEC_MAX_SILENCE_PERIOD*8000/1000))
    {
        /* Silence: emit an empty frame instead of encoded audio */
        output->type = PJMEDIA_FRAME_TYPE_NONE;
        output->buf = NULL;
        output->size = 0;
        output->timestamp = input->timestamp;
        return PJ_SUCCESS;
    } else {
        /* Voiced (or silence has lasted too long): remember when we
         * last transmitted, then fall through to normal encoding */
        priv->last_tx = input->timestamp;
    }
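
One detail worth spelling out: PJMEDIA_CODEC_MAX_SILENCE_PERIOD is given in milliseconds, while silence_period is measured in RTP timestamp units at G.711's 8000 Hz clock, hence the *8000/1000 in the comparison. With a hypothetical limit of 80 ms, for example, suppression stops once silence_period reaches 80 * 8000 / 1000 = 640 timestamp ticks, and the codec transmits a frame again even though it is still classified as silence.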

........

Here is the code of pjmedia_silence_det_apply():
PJ_DEF(pj_bool_t) pjmedia_silence_det_apply( pjmedia_silence_det *sd,
                                             pj_uint32_t level)
{
    int avg_recent_level;

    /* VAD disabled: never report silence */
    if (sd->mode == VAD_MODE_NONE)
        return PJ_FALSE;

    /* Fixed mode: simply compare the level against the threshold */
    if (sd->mode == VAD_MODE_FIXED)
        return (level < sd->threshold);

    /* Adaptive mode: keep a running average of recent levels */
    sd->sum_level += level;
    ++sd->sum_cnt;
    avg_recent_level = (sd->sum_level / sd->sum_cnt);

    if (level > sd->threshold || level >= PJMEDIA_SILENCE_DET_MAX_THRESHOLD) {
        /* Frame is voiced */
        sd->silence_timer = 0;
        sd->voiced_timer += sd->ptime;

        switch (sd->state) {
        case STATE_VOICED:
            if (sd->voiced_timer > sd->recalc_on_voiced) {
                /* Voiced for a long time; the threshold seems too low,
                 * move it towards the recent average level. */
                sd->threshold = (avg_recent_level + sd->threshold) >> 1;
                TRACE_((THIS_FILE, "Re-adjust threshold (in talk burst) "
                        "to %d", sd->threshold));
                sd->voiced_timer = 0;

                /* Reset sum & count of level */
                sd->sum_level = avg_recent_level;
                sd->sum_cnt = 1;
            }
            break;

        case STATE_SILENCE:
            TRACE_((THIS_FILE, "Starting talk burst (level=%d threshold=%d)",
                    level, sd->threshold));
            /* Fall through */

        case STATE_START_SILENCE:
            sd->state = STATE_VOICED;

            /* Reset sum & count of level */
            sd->sum_level = level;
            sd->sum_cnt = 1;
            break;

        default:
            pj_assert(0);
            break;
        }
    } else {
        /* Frame is below threshold (silent) */
        sd->voiced_timer = 0;
        sd->silence_timer += sd->ptime;

        switch (sd->state) {
        case STATE_SILENCE:
            if (sd->silence_timer >= sd->recalc_on_silence) {
                /* Silent for a long time; the threshold seems too high,
                 * lower it to twice the recent average level. */
                sd->threshold = avg_recent_level << 1;
                TRACE_((THIS_FILE, "Re-adjust threshold (in silence) "
                        "to %d", sd->threshold));
                sd->silence_timer = 0;

                /* Reset sum & count of level */
                sd->sum_level = avg_recent_level;
                sd->sum_cnt = 1;
            }
            break;

        case STATE_VOICED:
            sd->state = STATE_START_SILENCE;

            /* Reset sum & count of level */
            sd->sum_level = level;
            sd->sum_cnt = 1;
            /* Deliberately fall through to STATE_START_SILENCE */

        case STATE_START_SILENCE:
            if (sd->silence_timer >= sd->before_silence) {
                sd->state = STATE_SILENCE;
                sd->threshold = avg_recent_level << 1;
                TRACE_((THIS_FILE, "Starting silence (level=%d threshold=%d)",
                        level, sd->threshold));

                /* Reset sum & count of level */
                sd->sum_level = avg_recent_level;
                sd->sum_cnt = 1;
            }
            break;

        default:
            pj_assert(0);
            break;
        }
    }

    return (sd->state == STATE_SILENCE);
}
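
As for where the level passed to pjmedia_silence_det_apply() comes from: pjmedia_silence_det_detect() essentially computes the frame's average signal level and then classifies it with the function above. A minimal sketch of that combination (the wrapper name frame_is_silent is made up; the two pjmedia calls are the real API):

#include <pjmedia/silencedet.h>

/* Hypothetical helper mirroring what pjmedia_silence_det_detect() does:
 * compute the average signal level of the frame, then let the detector
 * decide whether it is silence. */
static pj_bool_t frame_is_silent(pjmedia_silence_det *sd,
                                 const pj_int16_t *samples,
                                 pj_size_t count)
{
    pj_uint32_t level = pjmedia_calc_avg_signal(samples, count);
    return pjmedia_silence_det_apply(sd, level);
}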

