Outlier removal using Hampel identifier

https://www.mathworks.com/help/signal/ref/hampel.html

hampel

Outlier removal using Hampel identifier

collapse all in page

Syntax

y = hampel(x)

y = hampel(x,k)

y = hampel(x,k,nsigma)

[y,j] = hampel(___)

[y,j,xmedian,xsigma] = hampel(___)

hampel(___)

Description

example

y = hampel(x) applies a Hampel filter to the input vector, x, to detect and remove outliers. For each sample of x, the function computes the median of a window composed of the sample and its six surrounding samples, three per side. It also estimates the standard deviation of each sample about its window median using the median absolute deviation. If a sample differs from the median by more than three standard deviations, it is replaced with the median. If x is a matrix, then hampeltreats each column of x as an independent channel.

y = hampel(x,k) specifies the number of neighbors, k, on either side of each sample of x in the measurement window.

example

y = hampel(x,k,nsigma) specifies a number of standard deviations, nsigma, by which a sample of x must differ from the local median for it to be replaced with the median.

example

[y,j] = hampel(___) also returns a logical matrix that is true at the locations of all points identified as outliers. This syntax accepts any of the input arguments from previous syntaxes.

example

[y,j,xmedian,xsigma] = hampel(___) also returns the local medians and the estimated standard deviations for each element of x.

hampel(___) with no output arguments plots the filtered signal and annotates the outliers that were removed.

Examples

collapse all

Remove Spikes from Sinusoid

Try This Example

Generate 100 samples of a sinusoidal signal. Replace the sixth and twentieth samples with spikes.

x = sin(2*pi*(0:99)/100);
x(6) = 2;
x(20) = -2;

Use hampel to locate every sample that differs by more than three standard deviations from the local median. The measurement window is composed of the sample and its six surrounding samples, three per side.

[y,i,xmedian,xsigma] = hampel(x);

Plot the filtered signal and annotate the outliers.

n = 1:length(x);
plot(n,x)
hold on
plot(n,xmedian-3*xsigma,n,xmedian+3*xsigma)
plot(find(i),x(i),'sk')
hold off
legend('Original signal','Lower limit','Upper limit','Outliers')

Repeat the computation, but now take just one adjacent sample on each side when computing the median. The function considers the extrema as outliers.

hampel(x,1)

Hampel Filtering of Multichannel Signal

Try This Example

Generate a two-channel signal consisting of sinusoids of different frequencies. Place spikes in random places. Use NaNs to add missing samples at random. Reset the random number generator for reproducible results. Plot the signal.

rng('default')

n = 59;
x = sin(pi./[15 10]'*(1:n)+pi/3)';

spk = randi(2*n,9,1);
x(spk) = x(spk)*2;
x(randi(2*n,6,1)) = NaN;

plot(x)

Filter the signal using hampel with the default settings.

y = hampel(x);
plot(y)

Increase the length of the moving window and decrease the threshold to treat a sample as an outlier.

y = hampel(x,4,2);
plot(y)

Output the running median for each channel. Overlay the medians on a plot of the signal.

[y,j,xmd,xsd] = hampel(x,4,2);
plot(x)
hold on
plot(xmd,'--')

Find Outliers in Multichannel Signal

Try This Example

Generate a multichannel signal that consists of two sinusoids of different frequencies embedded in white Gaussian noise of unit variance.

rng('default')

t = 0:60;
x = sin(pi./[10;2]*t)'+randn(numel(t),2);

Apply a Hampel filter to the signal. Take as outliers those points that differ by more than two standard deviations from the median of a surrounding nine-sample window. Output a logical matrix that is true at the locations of the outliers.

k = 4;
nsig = 2;

[y,h] = hampel(x,k,nsig);

Plot each channel of the signal in its own set of axes. Draw the original signal, the filtered signal, and the outliers. Annotate the outlier locations.

for k = 1:2
    hk = h(:,k);
    ax = subplot(2,1,k);
    plot(t,x(:,k))
    hold on
    plot(t,y(:,k))
    plot(t(hk),x(hk,k),'*')
    hold off
    ax.XTick = t(hk);
end

Signal Statistics Returned by Hampel Filter

Try This Example

Generate 100 samples of a sinusoidal signal. Replace the sixth and twentieth samples with spikes.

n = 1:100;
x = sin(2*pi*n/100);
x(6) = 2;
x(20) = -2;

Use hampel to compute the local median and estimated standard deviation for every sample. Use the default values of the input parameters:

  • The window size is .

  • The points that differ from their window median by more than three standard deviations are considered outliers.

Plot the result.

[y,i,xmedian,xsigma] = hampel(x);

plot(n,x)
hold on
plot(n,[1;1]*xmedian+3*[-1;1]*xsigma)
plot(find(i),x(i),'sk')
hold off
legend('Signal','Lower','Upper','Outliers')

Repeat the calculation using a window size of  and two standard deviations as the criteria for identifying outliers.

sds = 2;
adj = 10;
[y,i,xmedian,xsigma] = hampel(x,adj,sds);

plot(n,x)
hold on
plot(n,[1;1]*xmedian+sds*[-1;1]*xsigma)
plot(find(i),x(i),'sk')
hold off
legend('Signal','Lower','Upper','Outliers')

评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值