scikit-learn(project中用的相对较多的模型介绍):2.3. Clustering(可用于特征的无监督降维)...

參考:http://scikit-learn.org/stable/modules/clustering.html


在实际项目中,我们真的非常少用到那些简单的模型,比方LR、kNN、NB等。尽管经典,但在project中确实不有用。

今天我们不关注详细的模型,而关注无监督的聚类方法。


之所以关注无监督聚类方法。是由于。在实际项目中,我们除了使用PCA等方法降维外。有时候我们也会考虑使用聚类的方法降维特征


Overview of clustering methods:


A comparison of the clustering algorithms in scikit-learn

Method name Parameters Scalability Usecase Geometry (metric used)
K-Means number of clusters Very large n_samples, medium n_clusterswith MiniBatch code General-purpose, even cluster size, flat geometry, not too many clusters Distances between points
Affinity propagation damping, sample preference Not scalable with n_samples Many clusters, uneven cluster size, non-flat geometry Graph distance (e.g. nearest-neighbor graph)
Mean-shift bandwidth Not scalable withn_samples Many clusters, uneven cluster size, non-flat geometry Distances between points
Spectral clustering number of clusters Medium n_samples, small n_clusters Few clusters, even cluster size, non-flat geometry Graph distance (e.g. nearest-neighbor graph)
Ward hierarchical clustering number of clusters Large n_samples andn_clusters Many clusters, possibly connectivity constraints Distances between points
Agglomerative clustering number of clusters, linkage type, distance Large n_samples andn_clusters Many clusters, possibly connectivity constraints, non Euclidean distances Any pairwise distance
DBSCAN neighborhood size Very large n_samples, medium n_clusters Non-flat geometry, uneven cluster sizes Distances between nearest points
Gaussian mixtures many Not scalable Flat geometry, good for density estimation Mahalanobis distances to centers
Birch branching factor, threshold, optional global clusterer. Large n_clusters andn_samples Large dataset, outlier removal, data reduction. Euclidean distance between points



评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值