sklearn.base: Base classes and utility functions
sklearn.calibration: Probability Calibration
sklearn.cluster: Clustering
sklearn.compose: Composite Estimators
sklearn.covariance: Covariance Estimators
sklearn.cross_decomposition: Cross decomposition
sklearn.datasets: Datasets
sklearn.decomposition: Matrix Decomposition
sklearn.discriminant_analysis: Discriminant Analysis
sklearn.dummy: Dummy estimators
sklearn.ensemble: Ensemble Methods
sklearn.exceptions: Exceptions and warnings
sklearn.experimental: Experimental
sklearn.feature_extraction: Feature Extraction
sklearn.feature_selection: Feature Selection
sklearn.gaussian_process: Gaussian Processes
sklearn.impute: Impute
sklearn.inspection: Inspection
sklearn.isotonic: Isotonic regression
sklearn.kernel_approximation: Kernel Approximation
sklearn.kernel_ridge: Kernel Ridge Regression
sklearn.linear_model: Linear Models
sklearn.manifold: Manifold Learning
sklearn.metrics: Metrics
sklearn.mixture: Gaussian Mixture Models
sklearn.model_selection: Model Selection
sklearn.multiclass: Multiclass and multilabel classification
sklearn.multioutput: Multioutput regression and classification
sklearn.naive_bayes: Naive Bayes
sklearn.neighbors: Nearest Neighbors
sklearn.neural_network: Neural network models
sklearn.pipeline: Pipeline
sklearn.preprocessing: Preprocessing and Normalization
sklearn.random_projection: Random projection
sklearn.semi_supervised: Semi-Supervised Learning
sklearn.svm: Support Vector Machines
sklearn.tree: Decision Trees
sklearn.utils: Utilities
Functionality
sklearn.base: Base classes and utility functions
Base classes for all estimators.
Used for VotingClassifier
Base classes
base.BaseEstimator
Base class for all estimators in scikit-learn
base.BiclusterMixin
Mixin class for all bicluster estimators in scikit-learn
base.ClassifierMixin
Mixin class for all classifiers in scikit-learn.
base.ClusterMixin
Mixin class for all cluster estimators in scikit-learn.
base.DensityMixin
Mixin class for all density estimators in scikit-learn.
base.RegressorMixin
Mixin class for all regression estimators in scikit-learn.
base.TransformerMixin
Mixin class for all transformers in scikit-learn.
Functions
base.clone(estimator[, safe])
Constructs a new estimator with the same parameters.
base.is_classifier(estimator)
Return True if the given estimator is (probably) a classifier.
base.is_regressor(estimator)
Return True if the given estimator is (probably) a regressor.
config_context(**new_config)
Context manager for global scikit-learn configuration
get_config()
Retrieve current values for configuration set by set_config
set_config([assume_finite, working_memory, …])
Set global scikit-learn configuration
show_versions()
Print useful debugging information
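As a rough illustration of the helpers listed above, the following minimal sketch clones an estimator, checks its type, and temporarily changes the global configuration. The choice of LogisticRegression and the variable names are assumptions made only for this example.

```python
from sklearn import config_context, get_config
from sklearn.base import clone, is_classifier, is_regressor
from sklearn.linear_model import LogisticRegression

# Illustrative estimator; any scikit-learn estimator could be used here.
original = LogisticRegression(C=0.5)

# clone() builds a new, unfitted estimator with the same parameters.
copy = clone(original)
same_params = copy.get_params() == original.get_params()
is_distinct = copy is not original

# config_context() changes the global configuration only inside the block.
with config_context(assume_finite=True):
    inside = get_config()["assume_finite"]
outside = get_config()["assume_finite"]

print(same_params, is_distinct, is_classifier(copy), is_regressor(copy))
print(inside, outside)
```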
sklearn.calibration: Probability Calibration
Calibration of predicted probabilities.
User guide: See the Probability calibration section for further details.
calibration.CalibratedClassifierCV([…])
Probability calibration with isotonic regression or sigmoid.
calibration.calibration_curve(y_true, y_prob)
Compute true and predicted probabilities for a calibration curve.
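A minimal sketch of how the two entries above might be combined: a classifier is wrapped in CalibratedClassifierCV, and calibration_curve then summarises its predicted probabilities on held-out data. The synthetic dataset, the GaussianNB base model, and the bin count are assumptions chosen for illustration.

```python
from sklearn.calibration import CalibratedClassifierCV, calibration_curve
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic binary classification data (illustrative only).
X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# Wrap the base classifier; sigmoid calibration with 3-fold cross-validation.
calibrated = CalibratedClassifierCV(GaussianNB(), method="sigmoid", cv=3)
calibrated.fit(X_train, y_train)

# Probability of the positive class on the held-out split.
y_prob = calibrated.predict_proba(X_test)[:, 1]

# Fraction of positives and mean predicted probability per non-empty bin.
prob_true, prob_pred = calibration_curve(y_test, y_prob, n_bins=5)
print(prob_true)
print(prob_pred)
```

For a well-calibrated model the two arrays should be close to each other bin by bin.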
sklearn.cluster: Clustering
The sklearn.cluster module gathers popular unsupervised clustering algorithms.
User guide: See the Clustering and Biclustering sections for further details.
Classes
cluster.AffinityPropagation([damping, …])
Perform Affinity Propagation Clustering of data.
cluster.AgglomerativeClustering([…])
Agglomerative Clustering
cluster.Birch([threshold, branching_factor, …])
Implements the Birch clustering algorithm.
cluster.DBSCAN([eps, min_samples, metric, …])
Perform DBSCAN clustering from vector array or distance matrix.
cluster.FeatureAgglomeration([n_clusters, …])
Agglomerate features.
cluster.KMeans([n_clusters, init, n_init, …])
K-Means clustering.
cluster.MiniBatchKMeans([n_clusters, init, …])
Mini-Batch K-Means clustering.
cluster.MeanShift([bandwidth, seeds, …])
Mean shift clustering using a flat kernel.
cluster.OPTICS([min_samples, max_eps, …])
Estimate clustering structure from vector array.
cluster.SpectralClustering([n_clusters, …])
Apply clustering to a projection of the normalized Laplacian.
cluster.SpectralBiclustering([n_clusters, …])
Spectral biclustering (Kluger, 2003).
cluster.SpectralCoclustering([n_clusters, …])
Spectral Co-Clustering algorithm (Dhillon, 2001).
Functions
cluster.affinity_propagation(S[, …])
Perform Affinity Propagation Clustering of data
cluster.cluster_optics_dbscan(reachability, …)
Performs DBSCAN extraction for an arbitrary epsilon.
cluster.cluster_optics_xi(reachability, …)
Automatically extract clusters according to the Xi-steep method.
cluster.compute_optics_graph(X, min_samples, …)
Computes the OPTICS reachability graph.
cluster.dbscan(X[, eps, min_samples, …])
Perform DBSCAN clustering from vector array or distance matrix.
cluster.estimate_bandwidth(X[, quantile, …])
Estimate the bandwidth to use with the mean-shift algorithm.
cluster.k_means(X, n_clusters[, …])
K-means clustering algorithm.
cluster.mean_shift(X[, bandwidth, seeds, …])
Perform mean shift clustering of data using a flat kernel.
cluster.spectral_clustering(affinity[, …])
Apply clustering to a projection of the normalized Laplacian.
cluster.ward_tree(X[, connectivity, …])
Ward clustering based on a Feature matrix.
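The clustering classes above share the usual fit / fit_predict interface. As a minimal sketch, assuming three well-separated synthetic blobs (the centers and parameters below are made up for the example), KMeans can be used like this:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Three well-separated 2-D blobs (illustrative centers).
centers = np.array([[-5.0, -5.0], [0.0, 5.0], [5.0, -5.0]])
X, y_true = make_blobs(
    n_samples=150, centers=centers, cluster_std=0.5, random_state=0
)

# n_init is set explicitly so the behaviour does not depend on version defaults.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

print(labels.shape)
print(kmeans.cluster_centers_)
```

Other estimators in the module, such as DBSCAN or AgglomerativeClustering, follow the same pattern but take different parameters.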
sklearn.compose: Composite Estimators
Meta-estimators for building composite models with transformers
In addition to its current contents, this module will eventually be home to refurbished versions of Pipeline and FeatureUnion.
User guide: See the Pipelines and composite estimators section for further details.
compose.ColumnTransformer(transformers[, …])
Applies transformers to columns of an array or pandas DataFrame.
compose.TransformedTargetRegressor([…])
Meta-estimator to regress on a transformed target.
compose.make_column_transformer(…)
Construct a ColumnTransformer from the given transformers.
compose.make_column_selector([pattern, …])
Create a callable to select columns to be used with ColumnTransformer.
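The sketch below shows one way the composite estimators above can be used: ColumnTransformer applies different preprocessing to different columns, and TransformedTargetRegressor fits a regressor on a transformed target. The toy arrays, column indices, and the log/exp transform pair are assumptions for illustration.

```python
import numpy as np
from sklearn.compose import ColumnTransformer, TransformedTargetRegressor
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Columns 0 and 1 are numeric; column 2 holds category codes (toy data).
X = np.array([
    [1.0, 10.0, 0],
    [2.0, 20.0, 1],
    [3.0, 30.0, 2],
    [4.0, 40.0, 0],
    [5.0, 50.0, 1],
    [6.0, 60.0, 2],
])

# Scale the numeric columns and one-hot encode the categorical one.
# sparse_threshold=0 requests a dense output array.
ct = ColumnTransformer(
    [("scale", StandardScaler(), [0, 1]), ("onehot", OneHotEncoder(), [2])],
    sparse_threshold=0,
)
Xt = ct.fit_transform(X)
print(Xt.shape)

# Regress on log(y), then map predictions back with exp.
x = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
y = np.exp(2.0 * x.ravel() + 1.0)
ttr = TransformedTargetRegressor(
    regressor=LinearRegression(), func=np.log, inverse_func=np.exp
)
ttr.fit(x, y)
y_pred = ttr.predict(x)
```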
sklearn.covariance: Covariance Estimators
The sklearn.covariance module includes methods and algorithms to robustly estimate the covariance of features given a set of points. The precision matrix defined as the inverse of the covariance is also estimated. Covariance estimation is closely related to the theory of Gaussian Graphical Models.
User guide: See the Covariance estimation section for further details.
covariance.EmpiricalCovariance([…])
Maximum likelihood covariance estimator
covariance.EllipticEnvelope([…])
An object for detecting outliers in a Gaussian distributed dataset.
covariance.GraphicalLasso([alpha, mode, …])
Sparse inverse covariance estimation with an l1-penalized estimator.
covariance.GraphicalLassoCV([alphas, …])
Sparse inverse covariance with cross-validated choice of the l1 penalty.
covariance.LedoitWolf([store_precision, …])
LedoitWolf Estimator
covariance.MinCovDet([store_precision, …])
Minimum Covariance Determinant (MCD): robust estimator of covariance.
covariance.OAS([store_precision, …])
Oracle Approximating Shrinkage Estimator
covariance.ShrunkCovariance([…])
Covariance estimator with shrinkage
covariance.empirical_covariance(X[, …])
Computes the Maximum likelihood covariance estimator
covariance.graphical_lasso(emp_cov, alpha[, …])
l1-penalized covariance estimator
covariance.ledoit_wolf(X[, assume_centered, …])
Estimates the shrunk Ledoit-Wolf covariance matrix.
covariance.oas(X[, assume_centered])
Estimate covariance with the Oracle Approximating Shrinkage algorithm.
covariance.shrunk_covariance(emp_cov[, …])
Calculates a covariance matrix shrunk on the diagonal
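A minimal sketch comparing two of the estimators above on synthetic Gaussian data. The true covariance matrix and the sample size are assumptions for the example; the fitted covariance_, precision_, and shrinkage_ attributes are what the estimators expose after fit.

```python
import numpy as np
from sklearn.covariance import EmpiricalCovariance, LedoitWolf

# Draw samples from a known 2-D Gaussian (illustrative parameters).
rng = np.random.RandomState(0)
true_cov = np.array([[1.0, 0.6], [0.6, 2.0]])
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=true_cov, size=500)

# Maximum likelihood estimate and its inverse (the precision matrix).
emp = EmpiricalCovariance().fit(X)
print(emp.covariance_)
print(emp.precision_)

# Ledoit-Wolf shrinks the empirical estimate; the coefficient is data-driven.
lw = LedoitWolf().fit(X)
print(lw.covariance_)
print(lw.shrinkage_)
```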
sklearn.cross_decomposition: Cross decomposition
User guide: See the Cross decomposition section for further details.
cross_decomposition.CCA([n_components, …])
CCA Canonical Correlation Analysis.
cross_decomposition.PLSCanonical([…])
PLSCanonical implements the 2-block canonical PLS of the original Wold algorithm [Tenenhaus 1998] p.204, referred to as PLS-C2A in [Wegelin 2000].
cross_decomposition.PLSRegression([…])
PLS regression
cross_decomposition.PLSSVD([n_components, …])
Partial Least Squares SVD
sklearn.datasets: Datasets
The sklearn.datasets module includes utilities to load datasets, including methods to load and fetch popular reference datasets. It also features some artificial data generators.
User guide: See the Dataset loading utilities section for further details.
Loaders
datasets.clear_data_home([data_home])
Delete all the content of the data home cache.
datasets.dump_svmlight_file(X, y, f[, …])
Dump the dataset in svmlight / libsvm file format.
datasets.fetch_20newsgroups([data_home, …])
Load the filenames and data from the 20 newsgroups dataset (classification).
datasets.fetch_20newsgroups_vectorized([…])
Load the 20 newsgroups dataset and vectorize it into token counts (classification).
datasets.fetch_california_housing([…])
Load the California housing dataset (regression).
datasets.fetch_covtype([data_home, …])
Load the covertype dataset (classification).
datasets.fetch_kddcup99([subset, data_home, …])
Load the kddcup99 dataset (classification).
datasets.fetch_lfw_pairs([subset, …])
Load the Labeled Faces in the Wild (LFW) pairs dataset (classification).
datasets.fetch_lfw_people([data_home, …])
Load the Labeled Faces in the Wild (LFW) people dataset (classification).
datasets.fetch_olivetti_faces([data_home, …])
Load the Olivetti faces data-set from AT&T (classification).
datasets.fetch_openml([name, version, …])
Fetch dataset from openml by name or dataset id.
datasets.fetch_rcv1([data_home, subset, …])
Load the RCV1 multilabel dataset (classification).
datasets.fetch_species_distributions([…])
Loader for species distribution dataset from Phillips et al.
datasets.get_data_home([data_home])
Return the path of the scikit-learn data dir.
datasets.load_boston([return_X_y])
Load and return the boston house-prices dataset (regression).
datasets.load_breast_cancer([return_X_y])
Load and return the breast cancer wisconsin dataset (classification).
datasets.load_diabetes([return_X_y])
Load and return the diabetes dataset (regression).
datasets.load_digits([n_class, return_X_y])
Load and return the digits dataset (classification).
datasets.load_files(container_path[, …])
Load text files with categories as subfolder names.
datasets.load_iris([return_X_y])
Load and return the iris dataset (classification).
datasets.load_linnerud([return_X_y])
Load and return the linnerud dataset (multivariate regression).
datasets.load_sample_image(image_name)
Load the numpy array of a single sample image
datasets.load_sample_images()
Load sample images for image manipulation.
datasets.load_svmlight_file(f[, n_features, …])
Load datasets in the svmlight / libsvm format into sparse CSR matrix
datasets.load_svmlight_files(files[, …])
Load dataset from multiple files in SVMlight format
datasets.load_wine([return_X_y])
Load and return the wine dataset (classification).
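A minimal sketch of the loaders above: load_iris returns a bundled dataset, and dump_svmlight_file / load_svmlight_file round-trip it through the svmlight format. The in-memory buffer is an assumption used here to avoid touching the file system.

```python
from io import BytesIO

import numpy as np
from sklearn.datasets import dump_svmlight_file, load_iris, load_svmlight_file

# Bundled toy dataset: features and integer class labels.
X, y = load_iris(return_X_y=True)
print(X.shape, np.unique(y))

# Write the data in svmlight / libsvm format to an in-memory binary buffer.
buffer = BytesIO()
dump_svmlight_file(X, y, buffer, zero_based=True)

# Read it back; the features come back as a sparse CSR matrix.
buffer.seek(0)
X_loaded, y_loaded = load_svmlight_file(
    buffer, n_features=X.shape[1], zero_based=True
)
print(X_loaded.shape)
```

The fetch_* functions work similarly but download data on first use and cache it under the directory returned by get_data_home.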
Sample generators
datasets.make_biclusters(shape, n_clusters)
Generate an array with constant block diagonal structure for biclustering.
datasets.make_blobs([n_samples, n_features, …])
Generate isotropic Gaussian blobs for clustering.
datasets.make_checkerboard(shape, n_clusters)
Generate an array with block checkerboard structure for biclustering.
datasets.make_circles([n_samples, shuffle, …])
Make a large circle containing a smaller circle in 2d.
datasets.make_classification([n_samples, …])
Generate a random n-class classification problem.
datasets.make_friedman1([n_samples, …])
Generate the “Friedman #1” regression problem
datasets.make_friedman2([n_samples, noise, …])
Generate the “Friedman #2” regression problem
datasets.make_friedman3([n_samples, noise, …])
Generate the “Friedman #3” regression problem
datasets.make_gaussian_quantiles([mean, …])
Generate isotropic Gaussian and label samples by quantile
datasets.make_hastie_10_2([n_samples, …])
Generates data for binary classification used in Hastie et al.
datasets.make_low_rank_matrix([n_samples, …])
Generate a mostly low rank matrix with bell-shaped singular values
datasets.make_moons([n_samples, shuffle, …])
Make two interleaving half circles
datasets.make