I. Commonly used dataset websites
1. UCI official site (old version): https://archive.ics.uci.edu/ml/index.php
UCI official site (new version): https://archive-beta.ics.uci.edu/
2. Papers With Code, Machine Learning Datasets: https://paperswithcode.com/datasets?lang=chinese&page=1
3. Kaggle datasets (logging in is a bit of a hassle): https://www.kaggle.com/datasets
4. Tianchi, Alibaba Cloud's crowd-intelligence big data platform (worth trying if your English is not strong): https://tianchi.aliyun.com
5. PaddlePaddle (飞桨) datasets on Baidu's AI open platform (choose [Development Platform]; [Datasets] is in the second column): https://aistudio.baidu.com/aistudio/datasetoverview
6. Alibaba datahub: https://github.com/alibaba/EasyNLP/tree/master/datahub
7. OpenSLR, Open Speech and Language Resources (speech datasets, including ones released by Tsinghua University): https://www.openslr.org/resources.php
II. Download links for commonly used datasets
1. Iris dataset: https://archive.ics.uci.edu/ml/datasets/Iris
2. Wine dataset: https://archive.ics.uci.edu/ml/datasets/Wine
4. Lenses (contact lens) dataset: https://archive.ics.uci.edu/ml/datasets/lenses
5. Horse Colic dataset: http://archive.ics.uci.edu/ml/datasets/Horse+Colic
6. Bank Marketing dataset (marketing campaigns of a Portuguese banking institution): http://archive.ics.uci.edu/ml/datasets/Bank+Marketing
7. Congressional Voting Records dataset (1984 US Congress votes): http://archive.ics.uci.edu/ml/datasets/Congressional+Voting+Records
8. Mushroom dataset (for finding features shared by poisonous mushrooms): https://archive.ics.uci.edu/ml/datasets/mushroom
9. THUCNews dataset: http://thuctc.thunlp.org/
10. Toutiao (今日头条) news text classification dataset: https://github.com/fate233/toutiao-text-classfication-dataset
The remaining ones are Kaggle datasets (they cannot be downloaded without logging in, and logging in is a hassle):
11. San Francisco crime classification: https://www.kaggle.com/c/sf-crime
12. Titanic survivor prediction: https://www.kaggle.com/c/titanic/data
13. Handwritten digit recognition (Digit Recognizer): https://www.kaggle.com/c/digit-recognizer/data
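
The UCI pages listed above are landing pages, so the data file itself usually has to be fetched and parsed separately. Below is a minimal sketch for the Iris dataset using only the Python standard library. The direct file URL in IRIS_URL and the column names are assumptions that do not appear in the list above (the raw file has no header row), and parse_iris and download_iris are hypothetical helper names used only for illustration.

```python
import csv
import io
import urllib.request

# Assumed direct-download URL for the raw Iris file; it is not given in the list above.
IRIS_URL = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"

# The raw file has no header row, so column names are supplied here (assumed names).
COLUMNS = ["sepal_length", "sepal_width", "petal_length", "petal_width", "species"]


def parse_iris(text):
    """Parse headerless, comma-separated Iris rows into a list of dicts."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if len(record) != len(COLUMNS):
            continue  # skip blank or malformed lines
        *features, label = record
        # The four measurements become floats; the class label stays a string.
        row = {name: float(value) for name, value in zip(COLUMNS, features)}
        row[COLUMNS[-1]] = label
        rows.append(row)
    return rows


def download_iris(url=IRIS_URL):
    """Fetch the raw file over HTTP and parse it (requires network access)."""
    with urllib.request.urlopen(url) as response:
        return parse_iris(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Offline demonstration on two sample rows written in the file's format.
    sample = "5.1,3.5,1.4,0.2,Iris-setosa\n7.0,3.2,4.7,1.4,Iris-versicolor\n\n"
    for row in parse_iris(sample):
        print(row)
```

Calling download_iris() would fetch the full file if the assumed URL is still valid. The other UCI datasets in the list use different column layouts, so COLUMNS would need to be adapted for each one.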