Speech datasets

The largest site for downloading speech data:

openslr.org

VoxCeleb speaker recognition dataset: cannot be downloaded

Related reading: "OpenSpeaker之声纹数据整理" (Organizing voiceprint data for OpenSpeaker), the second article in the OpenSpeaker series on Zhihu, which covers data preparation using publicly released speech corpora: https://zhuanlan.zhihu.com/p/419979036

A Chinese counterpart to VoxCeleb, which can be downloaded:

openslr.org

 

How to extract the AISHELL-1 dataset

$ tar xzf data_aishell.tgz
$ cd data_aishell/wav
$ for tar in *.tar.gz; do tar xvf "$tar"; done
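
The inner extraction loop can also be written in Python, which may be convenient on systems without a Unix shell. This is a minimal sketch using only the standard library; the function name `extract_all_archives` is hypothetical, and it assumes the inner archives sit directly inside the `wav` directory, as the shell commands above imply.

```python
import tarfile
from pathlib import Path


def extract_all_archives(wav_dir):
    """Extract every *.tar.gz found directly inside wav_dir, in place.

    Returns the names of the archives that were extracted, sorted.
    """
    wav_path = Path(wav_dir)
    extracted = []
    for archive in sorted(wav_path.glob("*.tar.gz")):
        # "r:*" lets tarfile detect the compression, like `tar xvf` does.
        with tarfile.open(archive, "r:*") as tar:
            # Newer Python versions support extraction filters; use the
            # safer "data" filter when it is available.
            if hasattr(tarfile, "data_filter"):
                tar.extractall(path=wav_path, filter="data")
            else:
                tar.extractall(path=wav_path)
        extracted.append(archive.name)
    return extracted
```

Calling `extract_all_archives("data_aishell/wav")` after unpacking `data_aishell.tgz` should have the same effect as the `for` loop above.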

How the data is organized, using speech recognition as an example:
 

{
    "dict_filename": "dict.txt",

    "dataset":{
        "train":[
            {
                "name": "thchs30_train",
                "data_list": "datalist/thchs30/train.wav.lst",
                "data_path": "/data/speech_data",
                "label_list": "datalist/thchs30/train.syllable.txt"
            },
            {
                "name": "stcmds_train",
                "data_list": "datalist/st-cmds/train.wav.txt",
                "data_path": "/data/speech_data",
                "label_list": "datalist/st-cmds/train.syllable.txt"
            },
            {
                "name": "primewords_train",
                "data_list": "datalist/primewords/train.wav.lst",
                "data_path": "/data/speech_data",
                "label_list": "datalist/primewords/train.syllable.txt"
            },
            {
                "name": "aishell_train",
                "data_list": "datalist/aishell/train.wav.lst",
                "data_path": "/data/speech_data",
                "label_list": "datalist/aishell/train.syllable.txt"
            },
            {
                "name": "aidatatang_train",
                "data_list": "datalist/aidatatang_lst/train.wav.lst",
                "data_path": "/data/speech_data",
                "label_list": "datalist/aidatatang_lst/train.syllable.txt"
            },
            {
                "name": "magicdata_train",
                "data_list": "datalist/magicdata_lst/train.wav.lst",
                "data_path": "/data/speech_data/magicdata",
                "label_list": "datalist/magicdata_lst/train.syllable.txt"
            }
        ],

        "dev":[
            {
                "name": "thchs30_dev",
                "data_list": "datalist/thchs30/cv.wav.lst",
                "data_path": "/data/speech_data",
                "label_list": "datalist/thchs30/cv.syllable.txt"
            },
            {
                "name": "stcmds_dev",
                "data_list": "datalist/st-cmds/dev.wav.txt",
                "data_path": "/data/speech_data",
                "label_list": "datalist/st-cmds/dev.syllable.txt"
            },
            {
                "name": "primewords_dev",
                "data_list": "datalist/primewords/dev.wav.lst",
                "data_path": "/data/speech_data",
                "label_list": "datalist/primewords/dev.syllable.txt"
            },
            {
                "name": "aishell_dev",
                "data_list": "datalist/aishell/dev.wav.lst",
                "data_path": "/data/speech_data",
                "label_list": "datalist/aishell/dev.syllable.txt"
            },
            {
                "name": "aidatatang_dev",
                "data_list": "datalist/aidatatang_lst/dev.wav.lst",
                "data_path": "/data/speech_data",
                "label_list": "datalist/aidatatang_lst/dev.syllable.txt"
            },
            {
                "name": "magicdata_dev",
                "data_list": "datalist/magicdata_lst/dev.wav.lst",
                "data_path": "/data/speech_data/magicdata",
                "label_list": "datalist/magicdata_lst/dev.syllable.txt"
            }
        ],

        "test":[
            {
                "name": "thchs30_test",
                "data_list": "datalist/thchs30/test.wav.lst",
                "data_path": "/data/speech_data",
                "label_list": "datalist/thchs30/test.syllable.txt"
            },
            {
                "name": "stcmds_test",
                "data_list": "datalist/st-cmds/test.wav.txt",
                "data_path": "/data/speech_data",
                "label_list": "datalist/st-cmds/test.syllable.txt"
            },
            {
                "name": "primewords_test",
                "data_list": "datalist/primewords/test.wav.lst",
                "data_path": "/data/speech_data",
                "label_list": "datalist/primewords/test.syllable.txt"
            },
            {
                "name": "aishell_test",
                "data_list": "datalist/aishell/test.wav.lst",
                "data_path": "/data/speech_data",
                "label_list": "datalist/aishell/test.syllable.txt"
            },
            {
                "name": "aidatatang_test",
                "data_list": "datalist/aidatatang_lst/test.wav.lst",
                "data_path": "/data/speech_data",
                "label_list": "datalist/aidatatang_lst/test.syllable.txt"
            },
            {
                "name": "magicdata_test",
                "data_list": "datalist/magicdata_lst/test.wav.lst",
                "data_path": "/data/speech_data/magicdata",
                "label_list": "datalist/magicdata_lst/test.syllable.txt"
            }
        ]
    }
}
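
A config of this shape can be inspected with a few lines of Python. The sketch below assumes the config is stored as JSON with the keys shown above, and that the `data_list` and `label_list` paths are relative to a project root (an assumption; the source does not say). The helper names `summarize_config` and `missing_files` and the shortened `SAMPLE` are hypothetical, and nothing is assumed about the contents of the list files themselves.

```python
import json
from pathlib import Path

# Shortened sample with the same shape as the config above.
SAMPLE = json.loads("""
{
    "dict_filename": "dict.txt",
    "dataset": {
        "train": [
            {
                "name": "thchs30_train",
                "data_list": "datalist/thchs30/train.wav.lst",
                "data_path": "/data/speech_data",
                "label_list": "datalist/thchs30/train.syllable.txt"
            }
        ],
        "dev": [
            {
                "name": "thchs30_dev",
                "data_list": "datalist/thchs30/cv.wav.lst",
                "data_path": "/data/speech_data",
                "label_list": "datalist/thchs30/cv.syllable.txt"
            }
        ]
    }
}
""")


def summarize_config(config):
    """Map each split name to the list of corpus names it contains."""
    return {
        split: [entry["name"] for entry in entries]
        for split, entries in config["dataset"].items()
    }


def missing_files(config, root="."):
    """Return the list files named in the config that are absent under root."""
    root_path = Path(root)
    wanted = [config["dict_filename"]]
    for entries in config["dataset"].values():
        for entry in entries:
            wanted.append(entry["data_list"])
            wanted.append(entry["label_list"])
    return [name for name in wanted if not (root_path / name).is_file()]
```

Running `missing_files` before training is a cheap way to catch a mistyped list path, for example the `.wav.txt` versus `.wav.lst` suffixes that differ between corpora in the config.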

Image processing datasets:

Common deep learning image processing dataset downloads (萌1萌哒小萌萌's blog, CSDN)
A roundup of current open-source deep learning datasets (林老头、's blog, CSDN)
