Undergraduate Innovation Project: A Smart Attendance System Based on Speaker Recognition and Positioning, Weekly Report for 2021.4.12-2021.4.18

hallo~~

Tags: python

Article link: https://blog.csdn.net/hallo_world__/article/details/115840600

A Close Reading of the Paper Behind 3D-convolutional-speaker-recognition

I. Paper Abstract
II. Paper Figures
III. Reading Notes
Summary

I. Paper Abstract

In this paper, a novel method using 3D Convolutional Neural Network (3D-CNN) architecture has been proposed for speaker verification in the text-independent setting. One of the main challenges is the creation of the speaker models. Most of the previously-reported approaches create speaker models based on averaging the extracted features from utterances of the speaker, which is known as the d-vector system. In our paper, we propose an adaptive feature learning by utilizing the 3D-CNNs for direct speaker model creation in which, for both development and enrollment phases, an identical number of spoken utterances per speaker is fed to the network for representing the speakers’ utterances and creation of the speaker model. This leads to simultaneously capturing the speaker-related information and building a more robust system to cope with within-speaker variation. We demonstrate that the proposed method significantly outperforms the traditional d-vector verification system. Moreover, the proposed system can also be an alternative to the traditional d-vector system which is a one-shot speaker modeling system by utilizing 3D-CNNs.
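To make the contrast in the abstract more concrete, here is a minimal sketch of the two ideas it mentions. It is not the paper's code and not the code from the 3D-convolutional-speaker-recognition repository. The function names (`enroll_d_vector`, `cosine_score`, `stack_utterances`), the use of cosine similarity for scoring, and all array shapes are assumptions chosen for illustration. The first function builds a speaker model the d-vector way, by averaging per-utterance embeddings. The last one shows how an identical number of utterances per speaker could be stacked into a single 3D input, which is the kind of input the abstract says is fed to the 3D-CNN.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis` (eps guards against zero vectors)."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def enroll_d_vector(utterance_embeddings):
    """d-vector style enrollment: average the per-utterance embeddings.

    `utterance_embeddings` has shape (n_utterances, embedding_dim).
    Returns one unit-length speaker model of shape (embedding_dim,).
    """
    emb = l2_normalize(np.asarray(utterance_embeddings, dtype=np.float64))
    return l2_normalize(emb.mean(axis=0))


def cosine_score(speaker_model, test_embedding):
    """Cosine similarity between a speaker model and one test embedding.

    Cosine scoring is an assumption made for this sketch.
    """
    a = l2_normalize(np.asarray(speaker_model, dtype=np.float64))
    b = l2_normalize(np.asarray(test_embedding, dtype=np.float64))
    return float(np.dot(a, b))


def stack_utterances(utterance_features):
    """Stack the same number of utterances per speaker into one 3D array.

    Each item is a 2D feature map of shape (frames, features); the result has
    shape (n_utterances, frames, features) and could serve as one network input.
    """
    shapes = {np.shape(u) for u in utterance_features}
    if len(shapes) != 1:
        raise ValueError("all utterances must share the same (frames, features) shape")
    return np.stack(utterance_features, axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical embeddings: 5 enrollment utterances, 64-dimensional.
    enrollment = rng.normal(size=(5, 64))
    model = enroll_d_vector(enrollment)
    print("score vs. an enrollment utterance:", cosine_score(model, enrollment[0]))
    print("score vs. a random embedding:", cosine_score(model, rng.normal(size=64)))
    # Placeholder sizes: 20 utterances of 80 frames with 40 features each.
    cube = stack_utterances([rng.normal(size=(80, 40)) for _ in range(20)])
    print("3D input shape:", cube.shape)
```

In the averaging approach the speaker model is computed after feature extraction, so the network never sees several utterances of one speaker together. Stacking the utterances along a new axis lets a 3D convolution look across utterances and within each utterance at the same time, which is how the abstract describes capturing speaker information while coping with within-speaker variation.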

II. Paper Figures

1. The CNN architecture as the feature extractor

A Close Reading of the Paper Behind 3D-convolutional-speaker-recognition
I. Paper Abstract
II. Paper Figures
1. The CNN architecture as the feature extractor
2. The 3D-CNN architecture
3. The data input pipeline
4. The effect of the number of provided utterances on evaluation performance
5. The comparison of dif