A summary of sklearn.covariance

sklearn.covariance has three categories of estimators: EmpiricalCovariance and its robust counterpart MinCovDet, Shrinkage (ShrunkCovariance, LedoitWolf, OAS), and GraphLasso.

EmpiricalCovariance: the maximum likelihood covariance estimator.
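
A minimal sketch of the estimator in use. The synthetic data, `true_cov` and the sample size are assumptions made up for illustration. With the default `assume_centered=False`, the result should match NumPy's biased (1/N) covariance:

```python
import numpy as np
from sklearn.covariance import EmpiricalCovariance

# Synthetic 2-feature Gaussian sample (illustrative values only)
rng = np.random.RandomState(0)
true_cov = np.array([[2.0, 0.6],
                     [0.6, 1.0]])
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=true_cov, size=500)

emp = EmpiricalCovariance().fit(X)

# Maximum likelihood estimate: normalized by N, not N - 1
emp_cov = emp.covariance_
# The fitted estimator also exposes the inverse (precision) matrix
emp_precision = emp.precision_

print(emp_cov)
```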

If the sample contains noisy observations or outliers, we can use MinCovDet to get a robust covariance estimate.


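MCD (Minimum Covariance Determinant) looks for the subset of observations whose empirical covariance matrix has the smallest determinant. Because the volume of a tolerance ellipse grows with the determinant of its covariance matrix, this amounts to choosing the tolerance ellipse of minimum volume that still covers a given fraction of the sample.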
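
To make the robustness concrete, here is a small sketch comparing MinCovDet with EmpiricalCovariance on contaminated data. The data, the 10% contamination level and the error measure (Frobenius norm) are assumptions chosen for illustration:

```python
import numpy as np
from sklearn.covariance import EmpiricalCovariance, MinCovDet

rng = np.random.RandomState(42)
true_cov = np.array([[1.0, 0.3],
                     [0.3, 1.0]])
X = rng.multivariate_normal([0.0, 0.0], true_cov, size=200)

# Replace 10% of the rows with points far away from the main cloud
X[:20] = rng.uniform(low=8.0, high=12.0, size=(20, 2))

emp_cov = EmpiricalCovariance().fit(X).covariance_
mcd = MinCovDet(random_state=0).fit(X)
mcd_cov = mcd.covariance_

# Distance of each estimate from the covariance of the clean data
emp_err = np.linalg.norm(emp_cov - true_cov)
mcd_err = np.linalg.norm(mcd_cov - true_cov)
print("empirical error:", emp_err)
print("MCD error:", mcd_err)
```

In this setup the MCD estimate should stay much closer to the covariance of the clean data, since the far-away points inflate the empirical estimate.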

We can also use EllipticEnvelope, which relies on the MCD estimator for its covariance, to detect outliers.
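
A short sketch of outlier detection with EllipticEnvelope. The data and the `contamination=0.05` setting are assumptions for illustration; in practice the contamination rate has to be guessed or tuned:

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope

rng = np.random.RandomState(0)
inliers = rng.normal(loc=0.0, scale=1.0, size=(190, 2))
outliers = rng.uniform(low=6.0, high=9.0, size=(10, 2))
X = np.vstack([inliers, outliers])  # the last 10 rows are the outliers

# contamination is the assumed fraction of outliers in the data
detector = EllipticEnvelope(contamination=0.05, random_state=0).fit(X)

# predict returns +1 for inliers and -1 for outliers
labels = detector.predict(X)
n_flagged = int((labels == -1).sum())
print("flagged as outliers:", n_flagged)
```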

Besides noisy data, we also encounter the situation where the number of data points N is small and the number of features P is large. Shrinkage methods can handle this problem.


LedoitWolf and OAS compute the shrinkage coefficient from a closed-form formula instead of requiring it to be set by hand, so they are usually the better choices.
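
The three shrinkage estimators can be compared with a sketch like the one below. The data shape (N=20, P=40) and the hand-picked `shrinkage=0.1` are assumptions for illustration. As described in the scikit-learn user guide, the shrunk estimate is a convex combination of the empirical covariance and a scaled identity matrix:

```python
import numpy as np
from sklearn.covariance import ShrunkCovariance, LedoitWolf, OAS

rng = np.random.RandomState(0)
n_samples, n_features = 20, 40  # fewer samples than features (N < P)
X = rng.normal(size=(n_samples, n_features))

# Shrinkage coefficient chosen by hand (illustrative value)
shrunk = ShrunkCovariance(shrinkage=0.1).fit(X)

# Shrinkage coefficient computed from the data by a closed-form formula
lw = LedoitWolf().fit(X)
oas = OAS().fit(X)

print("Ledoit-Wolf shrinkage:", lw.shrinkage_)
print("OAS shrinkage:", oas.shrinkage_)

# With N < P the empirical covariance is singular, while the shrunk
# estimates remain positive definite and can therefore be inverted
lw_min_eig = np.linalg.eigvalsh(lw.covariance_).min()
shrunk_min_eig = np.linalg.eigvalsh(shrunk.covariance_).min()
```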

GraphLasso: sparse inverse covariance estimation with an l1-penalized estimator.
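
For reference, the scikit-learn user guide states the GraphLasso objective as follows. The symbols S (the sample covariance matrix) and alpha (the penalty weight) are taken from that guide rather than from this post, and the norm of K is the sum of the absolute values of its off-diagonal coefficients:

```latex
\hat{K} = \operatorname*{argmin}_{K} \left( \operatorname{tr}(S K) - \log \det K + \alpha \, \|K\|_1 \right)
```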

Here K denotes the precision matrix, i.e. the inverse of the covariance matrix.

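GraphLasso (and its cross-validated variant GraphLassoCV) is another way to estimate the precision matrix when N is small and P is large. In particular, when N is on the order of P or smaller, GraphLasso tends to work better than the shrinkage methods, although it can be numerically unstable for very correlated data.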

By using GraphLasso, we get a sparse precision matrix that is well-conditioned. The sparsity has a direct interpretation: if two features are independent conditionally on the others, the corresponding coefficient in the precision matrix is zero.
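
A minimal sketch of sparse precision estimation. Note that GraphLasso was renamed GraphicalLasso in scikit-learn 0.20, so the code below uses the newer name. The chain-structured `true_prec`, the sample size and `alpha=0.1` are assumptions chosen for illustration:

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.RandomState(0)
n_features = 5

# Hypothetical ground truth: a tridiagonal precision matrix, meaning each
# feature is conditionally dependent only on its immediate neighbours
true_prec = np.eye(n_features)
for i in range(n_features - 1):
    true_prec[i, i + 1] = true_prec[i + 1, i] = 0.4
true_cov = np.linalg.inv(true_prec)

X = rng.multivariate_normal(np.zeros(n_features), true_cov, size=200)

# alpha is the weight of the l1 penalty; larger values give sparser results
model = GraphicalLasso(alpha=0.1).fit(X)
est_prec = model.precision_

# Count the coefficients that were driven to (numerically) zero
n_zeros = int((np.abs(est_prec) < 1e-8).sum())
print(np.round(est_prec, 2))
print("zero coefficients:", n_zeros)
```

When a good value of alpha is not known in advance, GraphicalLassoCV can select it by cross-validation.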
