[SpeechHome Open Course] SRD: A Dataset and Benchmark Perspective

 

This SpeechHome open course features Guoguo Chen (陈果果), who will present "Speech Recognition Development: A Dataset and Benchmark Perspective".

Course Overview

Topic: Speech Recognition Development: A Dataset and Benchmark Perspective

Time: December 15 (Thursday), 14:00-15:00

Guoguo Chen

Speaker Bio

Dr. Guoguo Chen holds a Ph.D. degree in Electrical and Computer Engineering from Johns Hopkins University and a B.Eng. degree in Electronic Engineering from Tsinghua University. During his Ph.D., he spent 5 years at the Center for Language and Speech Processing, Johns Hopkins University, where he worked on various aspects of speech recognition and was one of the key contributors to the open source speech recognition toolkit Kaldi and the open source deep learning toolkit CNTK. He is an author of LibriSpeech, one of the most cited (3,500+ Google Scholar citations) speech recognition datasets/benchmarks. He also spent two summers at Google Inc., where he developed the prototype of Android's wake word detection engine for "Okay Google", serving billions of Android/Google Home users. After graduation, Dr. Chen co-founded KITT.AI, a 2017 CB Insights AI 100 company funded by Amazon’s Alexa Fund, Paul Allen’s Allen Institute for Artificial Intelligence, Madrona Venture Group, Founders’ Co-op, and A Level Capital. The company released two products: a customizable wake word engine and a conversational AI toolkit. It had more than 100,000 developers and customers in over 20 countries across 4 continents. In 2017, KITT.AI was acquired by Baidu, which set up its first Seattle office following the deal. In 2020, Dr. Chen co-founded Seasalt.ai. Dr. Chen also initiated SpeechColab, a volunteer organization for the speech recognition community, which released GigaSpeech, one of the largest speech recognition datasets, covering 10,000 hours of transcribed audio and 33,000 hours of total audio for speech recognition research.

Abstract

The previous decade saw remarkable development in automatic speech recognition technologies. While many technical articles explain the improvements from the model point of view, the impact of datasets and benchmarks on speech recognition development is not well studied. In this talk, we first investigate the contribution of datasets and benchmarks to speech recognition development. We then introduce a large-scale English speech recognition dataset named GigaSpeech. We will demonstrate the data creation pipeline, as well as initial benchmarks on this dataset. Finally, we close the talk by outlining our ongoing work on speech recognition benchmarks.

Agenda

How to Watch

The talk will be streamed live on CSDN and can be watched on both mobile and PC.

👇👇👇

[SpeechHome Open Course] SRD: A Dataset and Benchmark Perspective - CSDN Live

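Prizes

On December 15, one SpeechHome-themed baseball cap and one AISHELL 5th-anniversary doll will be given away in the live stream room. Watch the stream and join the interaction to enter the draw.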

 

 
