Synced with the WeChat official account arXiv每日学术速递 (arXiv Daily Academic Digest)
【1】 Addressing Missing Labels in Large-scale Sound Event Recognition using a Teacher-student Framework with Loss Masking
Authors: Eduardo Fonseca, Xavier Serra
Link: https://arxiv.org/abs/2005.00878
【2】 VisualEchoes: Spatial Image Representation Learning through Echolocation
Authors: Ruohan Gao, Kristen Grauman
Link: https://arxiv.org/abs/2005.01616
【3】 Noise2Weight: On Detecting Payload Weight from Drones Acoustic Emissions
Authors: Omar Adel Ibrahim, Roberto Di Pietro
Link: https://arxiv.org/abs/2005.01347
【4】 MultiQT: Multimodal Learning for Real-Time Question Tracking in Speech
Authors: Jakob Drachmann Havtorn, Željko Agić
Comments: Accepted at ACL 2020
Link: https://arxiv.org/abs/2005.0081
【5】 Multi-episodic Perceived Quality of an Audio-on-Demand Service
Authors: Dennis Guse, Sebastian Möller
Comments: To appear at IEEE QoMEX 2020
Link: https://arxiv.org/abs/2005.00400
【6】 Jukebox: A Generative Model for Music
Authors: Prafulla Dhariwal, Ilya Sutskever
Link: https://arxiv.org/abs/2005.00341
【7】 Multi-head Monotonic Chunkwise Attention For Online Speech Recognition
Authors: Baiji Liu, Long Ma
Link: https://arxiv.org/abs/2005.00205
【8】 Unsupervised Domain Adaptation for Acoustic Scene Classification Using Band-Wise Statistics Matching
Authors: Alessandro Ilic Mezza, Augusto Sarti
Comments: 5 pages, 1 figure, 3 tables, submitted to EUSIPCO 2020
Link: https://arxiv.org/abs/2005.00145
【9】 An Early Study on Intelligent Analysis of Speech under COVID-19: Severity, Sleep Quality, Fatigue, and Anxiety
Authors: Jing Han, Björn W. Schuller
Link: https://arxiv.org/abs/2005.0009
【10】 Seeing voices and hearing voices: learning discriminative embeddings using cross-modal self-supervision
Authors: Soo-Whan Chung, Joon Son Chung
Link: https://arxiv.org/abs/2004.14326
【11】 VGGSound: A Large-scale Audio-Visual Dataset
Authors: Honglie Chen, Andrew Zisserman
Comments: ICASSP 2020
Link: https://arxiv.org/abs/2004.14368
【12】 Meta-Transfer Learning for Code-Switched Speech Recognition
Authors: Genta Indra Winata, Pascale Fung
Comments: Accepted in ACL 2020. The first two authors contributed equally to this work
Link: https://arxiv.org/abs/2004.14228
【13】 Determined BSS based on time-frequency masking and its application to harmonic vector analysis
Authors: Kohei Yatabe, Daichi Kitamura
Link: https://arxiv.org/abs/2004.14091
【14】 Conditional Spoken Digit Generation with StyleGAN
Authors: Kasperi Palkama, Alexander Ilin
Link: https://arxiv.org/abs/2004.1376
【15】 Adversarial Feature Learning and Unsupervised Clustering based Speech Synthesis for Found Data with Acoustic and Textual Noise
Authors: Shan Yang, Lei Xie
Link: https://arxiv.org/abs/2004.13595
【16】 Research on Modeling Units of Transformer Transducer for Mandarin Speech Recognition
Authors: Li Fu, Libo Zi
Link: https://arxiv.org/abs/2004.13522
【17】 Detect Language of Transliterated Texts
Author: Sourav Sen
Link: https://arxiv.org/abs/2004.13521
【18】 L-Vector: Neural Label Embedding for Domain Adaptation
Authors: Zhong Meng, Chin-Hui Lee
Comments: 5 pages, 2 figures, ICASSP 2020
Link: https://arxiv.org/abs/2004.13480
【19】 When Hearing Defers to Touch
Authors: Hudin Charles, Hayward Vincent
Link: https://arxiv.org/abs/2004.13462
【20】 Autoencoding Neural Networks as Musical Audio Synthesizers
Authors: Joseph Colonel, Sam Keene
Link: https://arxiv.org/abs/2004.13172
【21】 A session-based song recommendation approach involving user characterization along the play power-law distribution
Authors: Diego Sánchez-Moreno, María N. Moreno-García
Link: https://arxiv.org/abs/2004.1300