Several Approaches to Detecting and Removing Outliers: MAD, 3-sigma

  • Outlier Detection

    In statistics, outliers are data points that do not belong to a certain population. An outlier is an abnormal observation that lies far away from other values and diverges from otherwise well-structured data.

    There are several ways to detect anomalies.

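    The term "outlier detection" is used mostly in the context of preparing data for machine learning.

    Removing extreme values is a broader concept. Extreme values are one kind of outlier: first finding the extreme values (outliers) and then removing them makes up the complete "extreme value removal" process.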

  • Detect Anomalies

  • 1.Standard Deviation

    If a data distribution is approximately normal, then about 68% of the data values lie within one standard deviation of the mean, about 95% lie within two standard deviations, and about 99.7% lie within three standard deviations. Under the 3-sigma rule, values more than three standard deviations from the mean are treated as outliers.
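
    The 3-sigma rule can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a definitive implementation: the function name `three_sigma_outliers`, the parameter `k`, and the sample data are all assumptions made for the example, and the population standard deviation is used.

```python
import statistics

def three_sigma_outliers(values, k=3.0):
    """Return the values lying more than k standard deviations from the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # all values identical, nothing can be an outlier
    return [v for v in values if abs(v - mean) > k * std]

# Hypothetical data: 20 readings close to 10 plus one extreme value.
data = [10.0, 10.2, 9.8, 10.1, 9.9] * 4 + [25.0]
outliers = three_sigma_outliers(data)
print(outliers)
```

    Note that the mean and standard deviation are themselves pulled toward extreme values, so on small samples a single outlier may inflate the standard deviation enough to escape the threshold.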

  • 2.Boxplots

    Interquartile Range
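
    A boxplot flags points that fall outside the whiskers, which are commonly placed 1.5 times the interquartile range (IQR) beyond the first and third quartiles. The sketch below assumes that common 1.5 multiplier and the "inclusive" quantile method; the function name `iqr_outliers` and the sample data are hypothetical.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # quantiles(n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical data: nine ordinary values and one extreme value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
outliers = iqr_outliers(data)
print(outliers)
```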

  • 3.DBSCAN Clustering

    DBSCAN is a clustering algorithm that is used to cluster data into groups. Points that do not belong to any cluster are labeled as noise and can be treated as outliers.
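
    As a rough illustration, scikit-learn's `DBSCAN` assigns the label `-1` to noise points, which can be read as outliers. The data set and the `eps` and `min_samples` values below are assumptions chosen for this toy example; real data needs its own tuning.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical data: two tight groups of four points and one isolated point.
X = np.array([
    [1.0, 1.0], [1.1, 1.0], [0.9, 1.1], [1.0, 0.9],
    [5.0, 5.0], [5.1, 5.0], [4.9, 5.1], [5.0, 4.9],
    [10.0, 0.0],
])

# eps is the neighborhood radius; min_samples counts the point itself.
labels = DBSCAN(eps=0.5, min_samples=3).fit_predict(X)

# DBSCAN marks points that belong to no cluster with the label -1.
outlier_mask = labels == -1
print(X[outlier_mask])
```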

  • 4.Isolation Forest

    Isolation Forest is an unsupervised learning algorithm that belongs to the ensemble decision trees family.
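
    A minimal sketch using scikit-learn's `IsolationForest` follows. The synthetic data, the `contamination` value, and the random seeds are assumptions for illustration only; `fit_predict` returns `-1` for points the model considers outliers and `1` for inliers.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical data: 200 points from a standard normal plus one far-away point.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, size=(200, 2)), [[8.0, 8.0]]])

# contamination is the assumed share of outliers in the data.
model = IsolationForest(n_estimators=100, contamination=0.01, random_state=0)
pred = model.fit_predict(X)  # -1 = outlier, 1 = inlier

print(X[pred == -1])
```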

  • 5.Robust Random Cut Forest

    The Random Cut Forest (RCF) algorithm is Amazon’s unsupervised algorithm for detecting anomalies.

  • 6.Minimum Covariance Determinant
  • 7.Local Outlier Factor

    The local outlier factor (LOF) is a technique that attempts to harness the idea of nearest neighbors for outlier detection.
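
    The nearest-neighbor idea can be tried with scikit-learn's `LocalOutlierFactor`, which compares the local density around each point with the density around its neighbors. The synthetic data and the `n_neighbors` value below are assumptions for this sketch.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

# Hypothetical data: 100 points from a standard normal plus one far-away point.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, size=(100, 2)), [[8.0, 8.0]]])

lof = LocalOutlierFactor(n_neighbors=20)
pred = lof.fit_predict(X)  # -1 = outlier, 1 = inlier

# negative_outlier_factor_ stores the negated LOF, so flip the sign:
# larger scores mean the point is more isolated relative to its neighbors.
scores = -lof.negative_outlier_factor_
print(X[pred == -1])
```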

  • 8.One-Class SVM
  • 9.Z-Score
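
    The title names MAD (median absolute deviation) alongside 3-sigma. One common way to combine it with the Z-score idea is the modified Z-score, which replaces the mean with the median and the standard deviation with the MAD. The sketch below assumes the widely used constant 0.6745 and cutoff 3.5; the function name `mad_outliers` and the sample data are hypothetical.

```python
import statistics

def mad_outliers(values, threshold=3.5):
    """Return the values whose modified Z-score exceeds the threshold."""
    med = statistics.median(values)
    # MAD: the median of absolute deviations from the median.
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return []  # more than half the values are identical; score undefined
    # 0.6745 rescales the MAD to be comparable with a standard deviation
    # for normally distributed data.
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]

# Hypothetical data: seven readings close to 10 plus one extreme value.
data = [10.0, 10.2, 9.8, 10.1, 9.9, 10.3, 9.7, 25.0]
outliers = mad_outliers(data)
print(outliers)
```

    Because the median and the MAD are barely affected by a few extreme values, this variant tends to hold up better than the plain 3-sigma rule on small or contaminated samples.
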
  • Anomaly Detection vs. Outlier detection

    Outlier detection and n
