PCA Outlier Detection
From the pyod documentation
Principal component analysis (PCA) can be used to detect outliers. PCA is a linear dimensionality reduction method that uses the Singular Value Decomposition of the data to project it into a lower-dimensional space.
In this procedure, the covariance matrix of the data is decomposed into orthogonal vectors, called eigenvectors, each associated with an eigenvalue. The eigenvectors with high eigenvalues capture most of the variance in the data.
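As a quick illustration of that decomposition (a numpy sketch on synthetic data, not part of the pyod docs), the eigenvalues of the sample covariance matrix tell you how much variance each eigenvector direction captures:

```python
import numpy as np

rng = np.random.default_rng(0)
# 200 samples in 3-D: most variance lies along the first axis
X = rng.normal(size=(200, 3)) * np.array([5.0, 1.0, 0.2])

cov = np.cov(X, rowvar=False)              # 3x3 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)     # eigh: for symmetric matrices
order = np.argsort(eigvals)[::-1]          # sort by decreasing eigenvalue
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# fraction of total variance captured by each eigenvector
explained = eigvals / eigvals.sum()
print(explained)   # the leading direction dominates
```

Here the first eigenvector alone accounts for the bulk of the variance, which is exactly why a low-dimensional hyperplane built from the top-k eigenvectors summarizes the data well.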
Therefore, a low-dimensional hyperplane constructed from k eigenvectors can capture most of the variance in the data. Outliers, however, deviate from normal data points, and this deviation is most visible along the eigenvectors with small eigenvalues.
Outlier scores can therefore be obtained as the sum of the projected distances of a sample onto all eigenvectors. See [BSCSC03, BAgg15] for details.
Score(X) = sum of the weighted Euclidean distances from each sample to the hyperplane constructed from the selected eigenvectors
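A minimal numpy sketch of this idea, using one common weighting (squared projection onto each eigenvector divided by its eigenvalue, as in Shyu et al. [BSCSC03]); this is an illustration of the scoring principle, not necessarily pyod's exact internals:

```python
import numpy as np

def pca_outlier_scores(X):
    """Sum over all eigenvectors of the eigenvalue-weighted squared
    projection distance; larger score = more anomalous."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    proj = Xc @ eigvecs                    # coordinates along each eigenvector
    # dividing by the eigenvalue up-weights deviations along the
    # small-eigenvalue directions, where outliers stand out most
    return (proj ** 2 / eigvals).sum(axis=1)

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 2)) * np.array([3.0, 0.1])   # data near a line
X_all = np.vstack([X, [[0.0, 2.0]]])                   # one point off the hyperplane
scores = pca_outlier_scores(X_all)
print(scores[-1] > scores[:-1].max())
```

The appended point sits far from the main hyperplane along the low-variance direction, so its score dwarfs those of the normal samples.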
Practice on the breast cancer dataset
import numpy as np
from pyod.models import pca

# train_data is a DataFrame whose last column holds the labels ('n' = normal)
data = train_data.values
X = data[:, :-1].astype(float)   # feature columns
y = data[:, -1]                  # label column

n_samples = X.shape[0]
split = int(n_samples * 0.8)
train_set, test_set = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

my_pca = pca.PCA()
my_pca.fit(train_set)             # unsupervised: labels are not used in fitting
y_pre = my_pca.predict(test_set)  # 0 = inlier, 1 = outlier

def trans(c):
    # map the raw label to pyod's convention: 'n' (normal) -> 0, else -> 1
    return 0 if c == 'n' else 1

y_ = np.array(list(map(trans, y_test)))
print('Prediction accuracy: %.2f%%' % ((y_ == y_pre).sum() / y_pre.shape[0] * 100))
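Raw accuracy can be misleading when outliers are rare, so it is often worth also checking a ranking metric such as ROC AUC on the detector's continuous scores. A hedged sketch with sklearn; the label and score arrays below are made up for illustration, not taken from the run above:

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

# illustrative ground truth and detector outputs
y_true = np.array([0, 0, 0, 0, 1, 1])
y_pred = np.array([0, 0, 0, 1, 1, 0])               # binary labels, e.g. predict()
scores = np.array([0.1, 0.2, 0.15, 0.6, 0.9, 0.4])  # continuous outlier scores

print('accuracy: %.2f' % accuracy_score(y_true, y_pred))
print('ROC AUC : %.2f' % roc_auc_score(y_true, scores))
```

Here the detector mislabels two points (accuracy 0.67) yet ranks outliers above most inliers (AUC 0.88), which accuracy alone would not reveal.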