A summary of sklearn.covariance

muzhen_xupeng

Category: sklearn

Article link: https://blog.csdn.net/muzhen_xupeng/article/details/53991856

sklearn.covariance provides three families of estimators: empirical estimators (EmpiricalCovariance and its robust relatives), shrinkage estimators, and GraphLasso.

EmpiricalCovariance: maximum likelihood covariance estimator.
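
As a minimal sketch of what "maximum likelihood" means here, the snippet below fits EmpiricalCovariance on synthetic Gaussian data and compares the result with NumPy's biased covariance, which divides by N rather than N - 1. The data, the seed, and the variable names are assumptions made for illustration.

```python
import numpy as np
from sklearn.covariance import EmpiricalCovariance

# Synthetic 2-D Gaussian sample (illustrative data, not from the article).
rng = np.random.RandomState(0)
true_cov = np.array([[2.0, 0.6], [0.6, 1.0]])
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=true_cov, size=500)

emp = EmpiricalCovariance().fit(X)

# The maximum likelihood estimate divides by N, so it should agree
# with np.cov(..., bias=True) computed on the same data.
mle_cov = np.cov(X, rowvar=False, bias=True)
same_as_numpy = np.allclose(emp.covariance_, mle_cov)

print(emp.covariance_)
print("matches biased np.cov:", same_as_numpy)
```

The fitted object also exposes `location_` (the estimated mean) and `precision_` (the inverse of the estimated covariance).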

If the sample contains noisy data (outliers), we can use MinCovDet to get a robust covariance estimate.

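MCD (Minimum Covariance Determinant): https://tr8dr.wordpress.com/2010/09/24/minimum-covariance-determination/

MCD looks for a subset of the observations whose empirical covariance matrix has the smallest determinant. Because the volume of the tolerance ellipse grows with the determinant of the covariance matrix (it is proportional to its square root), this amounts to choosing the tolerance ellipse of minimum volume that covers the chosen fraction of the sample.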
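
The effect of outliers on the two estimators can be sketched as follows. The contamination scheme (20 of 200 points replaced by uniform noise far from the bulk) and the error measure (Frobenius norm against the true covariance) are assumptions chosen only to make the difference visible.

```python
import numpy as np
from sklearn.covariance import EmpiricalCovariance, MinCovDet

rng = np.random.RandomState(42)
true_cov = np.array([[1.0, 0.5], [0.5, 1.0]])
X = rng.multivariate_normal([0.0, 0.0], true_cov, size=200)

# Replace the first 20 rows (10% of the sample) with far-away outliers.
X[:20] = rng.uniform(low=8.0, high=12.0, size=(20, 2))

emp = EmpiricalCovariance().fit(X)
mcd = MinCovDet(random_state=0).fit(X)

# Distance of each estimate from the covariance of the clean distribution.
emp_err = np.linalg.norm(emp.covariance_ - true_cov)
mcd_err = np.linalg.norm(mcd.covariance_ - true_cov)

print("empirical error:", emp_err)
print("MCD error:", mcd_err)
```

`mcd.support_` is a boolean mask marking the observations that were used to compute the robust estimate.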

We can also use EllipticEnvelope, which uses the MCD estimator as its covariance estimator, to detect outliers.
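
A hedged sketch of outlier detection with EllipticEnvelope is shown below. The synthetic data and the `contamination=0.1` setting are assumptions; in practice the contamination level is usually unknown and has to be estimated or tuned.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope

rng = np.random.RandomState(42)
X = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], size=200)

# Plant 20 obvious outliers in the first 20 rows (illustrative only).
X[:20] = rng.uniform(low=8.0, high=12.0, size=(20, 2))

# contamination is the assumed proportion of outliers in the data.
env = EllipticEnvelope(contamination=0.1, random_state=0).fit(X)

labels = env.predict(X)  # +1 for inliers, -1 for outliers
n_flagged = int((labels == -1).sum())
planted_found = float(np.mean(labels[:20] == -1))

print("points flagged as outliers:", n_flagged)
print("fraction of planted outliers found:", planted_found)
```

`env.mahalanobis(X)` returns the squared Mahalanobis distances under the robust fit, which is handy for ranking points rather than just labelling them.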

Besides noisy data, we may also face the situation where the number of data points N is small and the number of features P is large. Shrinkage methods can handle this problem.

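ShrunkCovariance: covariance estimator with a shrinkage coefficient chosen by the user.

LedoitWolf and OAS compute the shrinkage coefficient from the data with a closed-form formula, so they are usually the better choices.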
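
The following sketch compares the three shrinkage estimators on a sample with N < P. It also checks ShrunkCovariance against the convex combination given in the scikit-learn documentation, (1 - shrinkage) * cov + shrinkage * mu * I with mu = trace(cov) / P. The data shape (20 x 40) and the coefficient 0.1 are arbitrary assumptions.

```python
import numpy as np
from sklearn.covariance import (
    LedoitWolf,
    OAS,
    ShrunkCovariance,
    empirical_covariance,
)

rng = np.random.RandomState(0)
X = rng.normal(size=(20, 40))  # N = 20 samples, P = 40 features
n_features = X.shape[1]

emp_cov = empirical_covariance(X)

# User-chosen shrinkage coefficient.
shrinkage = 0.1
shrunk = ShrunkCovariance(shrinkage=shrinkage).fit(X)

# Same estimate written out by hand from the documented formula.
mu = np.trace(emp_cov) / n_features
manual = (1 - shrinkage) * emp_cov + shrinkage * mu * np.eye(n_features)
matches_formula = np.allclose(shrunk.covariance_, manual)

# Data-driven shrinkage coefficients.
lw = LedoitWolf().fit(X)
oas = OAS().fit(X)

print("manual formula matches ShrunkCovariance:", matches_formula)
print("Ledoit-Wolf shrinkage:", lw.shrinkage_)
print("OAS shrinkage:", oas.shrinkage_)
```

With N < P the empirical covariance is singular, while the shrunk estimates remain invertible; this is the main practical benefit of shrinkage in this regime.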

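GraphLasso (renamed GraphicalLasso in scikit-learn 0.20): sparse inverse covariance estimation with an l1-penalized estimator. The scikit-learn documentation gives the objective as

$$\hat{K} = \operatorname{argmin}_K \left( \operatorname{tr}(S K) - \log\det K + \alpha \lVert K \rVert_1 \right)$$

where K is the precision matrix to be estimated, S is the sample covariance matrix, and \lVert K \rVert_1 is the sum of the absolute values of the off-diagonal coefficients of K.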

GraphLasso (and GraphLassoCV) is another way to estimate the precision matrix when N is small and P is large. In particular, when N is on the order of P or smaller, sparse inverse covariance estimators tend to work better than shrinkage methods.

With GraphLasso we get a sparse, well-conditioned precision matrix: if two features are independent conditionally on the others, the corresponding coefficient in the precision matrix is zero.
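
To illustrate the link between conditional independence and zeros in the precision matrix, here is a minimal sketch on synthetic data. The chain-structured true precision matrix, the sample size, and `alpha=0.1` are assumptions made for the example. The import is guarded because the class is called GraphicalLasso from scikit-learn 0.20 onward and GraphLasso in older releases.

```python
import numpy as np

try:
    from sklearn.covariance import GraphicalLasso  # scikit-learn >= 0.20
except ImportError:
    from sklearn.covariance import GraphLasso as GraphicalLasso  # older name

# Chain-structured precision matrix: each feature is linked only to
# its immediate neighbours, so K[i, j] = 0 whenever |i - j| > 1.
p = 5
true_prec = np.eye(p)
for i in range(p - 1):
    true_prec[i, i + 1] = 0.4
    true_prec[i + 1, i] = 0.4
true_cov = np.linalg.inv(true_prec)

rng = np.random.RandomState(0)
X = rng.multivariate_normal(np.zeros(p), true_cov, size=500)

model = GraphicalLasso(alpha=0.1).fit(X)
prec = model.precision_

# Count off-diagonal entries estimated as exactly zero.
off_diag = ~np.eye(p, dtype=bool)
n_zero = int(np.sum(prec[off_diag] == 0))

print(np.round(prec, 2))
print("off-diagonal zeros:", n_zero)
```

Larger values of `alpha` push more coefficients to zero; the cross-validated variant (GraphicalLassoCV, formerly GraphLassoCV) selects `alpha` automatically.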