sklearn——数据集
Scikit-learn(sklearn)
是机器学习中常用的第三方模块,对常用的机器学习方法进行了封装,包括回归(Regression)
、降维(Dimensionality Reduction)
、分类(Classfication)
、聚类(Clustering)
等方法。作为机器学习库,sklearn
内置了非常丰富的数据集,本文介绍sklearn
数据集及导入过程
1.安装
pip install scikit-learn
2.导入sklearn
、查看所有数据集
import sklearn.datasets as sds
print(dir(sds))
'''输出
['__all__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', '_base', '_california_housing', '_covtype', '_kddcup99', '_lfw', '_olivetti_faces', '_openml', '_rcv1', '_samples_generator', '_species_distributions', '_svmlight_format_fast', '_svmlight_format_io', '_twenty_newsgroups',
'clear_data_home', 'dump_svmlight_file',
'fetch_20newsgroups', 'fetch_20newsgroups_vectorized', 'fetch_california_housing', 'fetch_covtype', 'fetch_kddcup99', 'fetch_lfw_pairs', 'fetch_lfw_people', 'fetch_olivetti_faces', 'fetch_openml', 'fetch_rcv1', 'fetch_species_distributions',
'get_data_home',
'load_boston', 'load_breast_cancer', 'load_diabetes', 'load_digits', 'load_files', 'load_iris', 'load_linnerud', 'load_sample_image', 'load_sample_images', 'load_svmlight_file', 'load_svmlight_files', 'load_wine',
'make_biclusters', 'make_blobs', 'make_checkerboard', 'make_circles', 'make_classification', 'make_friedman1', 'make_friedman2', 'make_friedman3', 'make_gaussian_quantiles', 'make_hastie_10_2', 'make_low_rank_matrix', 'make_moons', 'make_multilabel_classification', 'make_regression', 'make_s_curve', 'make_sparse_coded_signal', 'make_sparse_spd_matrix', 'make_sparse_uncorrelated', 'make_spd_matrix', 'make_swiss_roll']
'''
sklearn
中的数据集分为几类:
自带的小数据集(packaged dataset):sklearn.datasets.load_<name>
可在线下载的数据集(Downloaded Dataset):sklearn.datasets.fetch_<name