Introduction
The Kalman filter is one of the most widely used linear filters. It involves no frequency-domain transformation, operating entirely in the time domain, but it can only describe linear relationships between states.
In the history of computer science, Stanley Schmidt is credited with the first implementation of the Kalman filter. During a visit by Rudolf Kálmán to NASA's Ames Research Center, Schmidt recognized that Kálmán's method could be applied to the orbit-prediction problem of the Apollo program, and such filters were later used in the navigation computers of the Apollo spacecraft. Papers on this filter were published by Swerling (1958), Kalman (1960), and Kalman and Bucy (1961).
Applications of Kalman filters in mathematical modeling and computer science
Target tracking, mathematical inversion and model optimization, data prediction, signal extraction, and so on.
The core idea of the Kalman filter
Find balance in uncertainty.
The Basics (the following applies only to linear systems)
Basic principle of Kalman filter
This section can be summarized in 6 parts:
1. State transition
To make it easier to understand the Kalman filter, let’s take a simple one-dimensional case:
Suppose, in the ideal case, that a person pushing a box moves along the x axis, where u_t is the acceleration, constant in both magnitude and direction.
Denote the overall state of the person and the box at time t by X_t:
X_t = [p_t; v_t]
where p_t is the horizontal position and v_t is the horizontal velocity. Together they make up the state (other quantities of the studied object could be included as well); this state is the central object that the Kalman filter operates on.
If we know the state at time t-1, then the horizontal position and horizontal velocity at time t follow from elementary kinematics:
p_t = p_{t-1} + v_{t-1}·Δt + (1/2)·u_t·Δt²
v_t = v_{t-1} + u_t·Δt
where Δt is the time step.
By looking at the above equation, we can see that the output variables are all linear combinations of the input variables.
Since the relationship between the variables is linear, linear algebra lets us write this in matrix form:
[p_t; v_t] = [1 Δt; 0 1]·[p_{t-1}; v_{t-1}] + [Δt²/2; Δt]·u_t
Collecting the matrices into named variables, the above equation simplifies to:
X_t = F_t X_{t-1} + B_t u_t
This is also known as the state prediction formula. F_t is called the state transition matrix and projects the state to the next time step; B_t is called the control matrix and describes how the control input acts on the current state. It is worth mentioning that the state in the state prediction formula is only an estimate (the prior, written X_t^- below): the quantity on the left side of the equals sign still needs to be corrected by the steps that follow.
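To make the prediction step concrete, here is a minimal MATLAB sketch (not from the original article; the time step, acceleration, and initial state are assumed values):
dt = 1;              %time step Δt (assumed)
F = [1 dt; 0 1];     %state transition matrix Ft
B = [dt^2/2; dt];    %control matrix Bt
u = 0.5;             %constant acceleration ut (assumed)
X = [0; 1];          %state [position; velocity] at time t-1 (assumed)
X_prior = F*X + B*u  %predicted (a priori) state at time t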
2. Covariance matrix
Using the state prediction formula from the first subsection, we can predict the state of the studied object at any time, but every prediction is noisy, and the greater the noise, the greater the uncertainty. So how do we quantify the uncertainty that a prediction carries? With the covariance matrix.
For one-dimensional measurements, the data follow a Gaussian distribution, so we can record the center (mean) and the variance of the distribution while exploring the data.
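As a minimal sketch (the sample size, true value, and noise level are assumed), the center and variance of such a one-dimensional measurement can be recorded in MATLAB like this:
n = 1000;
x = 5 + 2*randn(n,1); %noisy 1-D measurements: true value 5, noise std 2 (assumed)
mu = mean(x);         %center of the distribution
sigma2 = var(x);      %variance of the distribution
histogram(x);         %the histogram shows the Gaussian shape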
For the two-dimensional case:
clc;
figure;
hold on;
A = zeros(1000,2);   %preallocate storage for 1000 samples
for i = 1:1000
    x = 2*randn(1);  %independent Gaussian noise in dimension x
    y = 2*randn(1);  %independent Gaussian noise in dimension y
    A(i,1) = x;
    A(i,2) = y;
    scatter(x,y,'red','filled')
end
If the noise in the two dimensions is independent, the center and variance of each distribution can be recorded separately for study.
But if the two dimensions are correlated, the data can be distributed in two further ways besides the one above:
clc;
mu1 = [0 0];
sigma1 = [4 3; 3 4];          %positive covariance between x and y
r1 = mvnrnd(mu1,sigma1,1000); %sample the correlated 2-D Gaussian
mu2 = [0 0];
sigma2 = [4 -3; -3 4];        %negative covariance between x and y
r2 = mvnrnd(mu2,sigma2,1000);
ax1 = subplot(1,2,1);
scatter(ax1,r1(:,1),r1(:,2),"magenta","filled");
xlabel(ax1,"x"); ylabel(ax1,"y");
ax2 = subplot(1,2,2);
scatter(ax2,r2(:,1),r2(:,2),"magenta","filled");
xlabel(ax2,"x"); ylabel(ax2,"y");
Now we introduce the covariance matrix to quantify the correlation between the two dimensions:
Σ = [cov(x,x) cov(x,y); cov(y,x) cov(y,y)]
The three figures above correspond to the three cases where the covariance cov(x,y) is equal to 0, greater than 0, and less than 0.
Tips: All the uncertainty in the Kalman filter is described by the covariance matrix.
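As a sanity check (a sketch; the sample count is enlarged so the estimates converge), MATLAB's cov function recovers these covariance matrices from samples, with the off-diagonal signs matching the cases above:
r1 = mvnrnd([0 0],[4 3;3 4],100000);   %positively correlated samples
r2 = mvnrnd([0 0],[4 -3;-3 4],100000); %negatively correlated samples
C1 = cov(r1)                           %off-diagonal entries near +3
C2 = cov(r2)                           %off-diagonal entries near -3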
3. Transfer of noise covariance matrix (transfer of uncertainty between time instants)
Applying the state prediction formula to the covariance gives the prior covariance at time t:
P_t^- = F_t P_{t-1} F_t^T + Q
where Q is the process noise covariance added at each step. The mathematical identity behind this equation is cov(Ax, Bx) = A cov(x, x) B^T.
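A minimal numeric sketch of this uncertainty propagation (all matrix values are assumed):
F = [1 1; 0 1];      %state transition matrix (Δt = 1)
P = eye(2);          %state covariance at time t-1 (assumed)
Q = 0.01*eye(2);     %process noise covariance (assumed)
P_prior = F*P*F' + Q %prior covariance at time t: cov(FX) = F*cov(X)*F'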
4. Observation matrix
Extending the example from the first subsection, suppose we have a sensor, and let the observed value be Z_t.
In practice Z_t is the recorded data of a sensor, which in most cases is a scalar, so the dimensions of Z_t and X_t may not match.
Therefore the observation matrix H is introduced; in our example only the position is measured, so H = [1 0].
This lets us build the relationship between observation and state:
Z_t = H X_t + v
where v is the observation noise. The observation noise also has a covariance matrix, which is denoted R.
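A hedged sketch of this observation model for the box example, where only the position component of the state is measured (the state and R are assumed values):
H = [1 0];              %observation matrix: picks the position out of [p; v]
R = 1;                  %observation noise variance (assumed)
X = [10; 1];            %true state (assumed)
Z = H*X + sqrt(R)*randn %simulated sensor reading Zt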
5. State update
The corrected (posterior) state combines the prediction with the observation:
X_t = X_t^- + K_t (Z_t − H X_t^-)
where K_t = P_t^- H^T (H P_t^- H^T + R)^{-1} is the Kalman coefficient (Kalman gain). The Kalman coefficient has two main effects:
1. It trades off the predicted state covariance P against the observation noise covariance R, determining whether we trust the prediction model or the observation more. The larger the gain, the more weight the observation receives; the smaller the gain, the more weight the prediction receives (see the sketch after this list).
2. It transforms the residual from the observation domain to the state domain.
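The following scalar sketch (the variances are assumed values) illustrates effect 1: the gain K = P/(P + R) moves toward 1 when the observation noise R is small and toward 0 when R is large:
P = 4;              %predicted state variance (assumed)
R = [0.1 1 10 100]; %a range of observation noise variances (assumed)
K = P./(P + R)      %near 1: trust the sensor; near 0: trust the model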
6. Update of noise covariance matrix
After the correction, the uncertainty of the estimate shrinks, and the covariance is updated as:
P_t = (I − K_t H) P_t^-
Formula set
Collecting the five equations above, prediction:
X_t^- = F_t X_{t-1} + B_t u_t
P_t^- = F_t P_{t-1} F_t^T + Q
Update:
K_t = P_t^- H^T (H P_t^- H^T + R)^{-1}
X_t = X_t^- + K_t (Z_t − H X_t^-)
P_t = (I − K_t H) P_t^-
MATLAB Code Implementation
Using the example presented in the basics section:
clc;
Z = (1:100);                  %true positions: constant velocity of 1
noise = randn(1,100);         %measurement noise, std 1
Z = Z + noise;                %noisy position measurements
X = [0; 0];                   %initial state [position; velocity]
P = [1 0; 0 1];               %initial state covariance
F = [1 1; 0 1];               %state transition matrix (Δt = 1)
Q = [0.0001 0; 0 0.0001];     %process noise covariance
H = [1 0];                    %observation matrix: only position is measured
R = 1;                        %observation noise variance
y = 1;                        %true velocity, plotted for reference
figure;
hold on;
for i = 1:100
    X_ = F*X;                 %predict the state
    P_ = F*P*F' + Q;          %predict the covariance
    K_ = P_*H'/(H*P_*H' + R); %Kalman gain
    X = X_ + K_*(Z(i) - H*X_);%correct the state with the measurement
    P = (eye(2) - K_*H)*P_;   %correct the covariance
    scatter(X(1),X(2),"red","filled"); %estimated velocity vs. position
    scatter(X(1),y,"blue","v");        %true velocity, for comparison
end
xlabel("LOCATION");
ylabel("SPEED");
title("KALMAN FILTER");
As the figure above shows, when the object is assumed to move forward at a constant velocity, the Kalman filter still estimates the magnitude of the velocity well even when the noise is large.
Tips: One of the earliest applications of the Kalman filter was estimating external temperature in spaceflight: the sensors could only be placed inside, and the external temperature was estimated from the internal readings.
Noise affecting the Kalman filter in practical situations
For example, when we are driving on the road, how can we determine our position?
1. Acceleration information
2. Odometer information
3. GPS information
…
But in real life, all of these measurements contain errors.
The Kalman filter combines the known information to estimate the optimal position; it is essentially an optimal estimation algorithm.
For the motion example mentioned earlier, any state is affected by the external environment (for example, the box running into a rock), and these disturbances are usually assumed to be normally distributed.
Kalman Filter model
The Kalman filter model essentially combines the estimated value and the observed value (the value predicted for the next frame and the value detected in the next frame).
The purpose of the Kalman coefficient K_t is to minimize the variance of the resulting best estimate.
Two extreme cases of the Kalman coefficient (neither occurs in reality; a numeric sketch follows this list):
1. When there is no observation noise (R = 0): in the scalar case K_t = P_t^-/(P_t^- + R) → 1, so the best estimate equals the observation.
2. When there is no noise in the state estimate (P_t^- = 0): K_t → 0, so the best estimate equals the prediction.
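A numeric sketch of the two extremes in scalar form (the prediction, observation, and variances are assumed values):
P = 4; x_prior = 10; z = 12;    %assumed prior variance, prediction, observation
K1 = P/(P + 0);                 %case 1: R = 0 gives K = 1
x1 = x_prior + K1*(z - x_prior) %equals the observation, 12
K2 = 0/(0 + 1);                 %case 2: P = 0 gives K = 0
x2 = x_prior + K2*(z - x_prior) %equals the prediction, 10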
A simple application of the Kalman filter
A general framework for solving practical problems:
Assume the actual length of the measured object is 50 mm, the first estimate is 40 mm with an estimated error of 5 mm, and the first measurement is 51 mm with a measurement error of 3 mm. We can then run the Kalman filter prediction (implemented in Excel):
As the figure above shows, through the iterative operation of the Kalman filter, both the estimated and measured values finally hover around the actual object length of 50 mm, which is a very good result. Note that the data in the yellow cells must be supplied by sensors and similar components at each iteration.
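Since the Excel sheet itself is not reproduced here, the same scalar iteration can be sketched in MATLAB (the measurements are simulated with the stated errors, and the simple error-based form of the scalar gain is assumed, as in the Excel example):
true_len = 50;              %actual length of the object (mm)
x = 40; p = 5;              %first estimate and its error (mm)
r = 3;                      %measurement error (mm)
for k = 1:20
    z = true_len + r*randn; %measurement fed in at each iteration
    K = p/(p + r);          %scalar Kalman gain
    x = x + K*(z - x);      %corrected estimate
    p = (1 - K)*p;          %shrinking estimation error
end
x                           %hovers around 50 mm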
Important mathematical derivations
Data fusion
The Kalman gain is designed to minimize the variance of the fused estimate. For two measurements z_1 and z_2 of the same quantity with variances σ_1² and σ_2², the fused value is
ẑ = z_1 + K (z_2 − z_1),   with K = σ_1²/(σ_1² + σ_2²)
When the Kalman gain takes this form, the variance of ẑ is minimized and the optimal estimate is obtained.
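A minimal sketch of this fusion with two assumed measurements:
z1 = 50.4; s1 = 2;                %measurement 1 and its standard deviation (assumed)
z2 = 49.2; s2 = 4;                %measurement 2 and its standard deviation (assumed)
K = s1^2/(s1^2 + s2^2);           %gain that minimizes the fused variance
z = z1 + K*(z2 - z1)              %fused estimate, nearer the more precise sensor
s = sqrt((1-K)^2*s1^2 + K^2*s2^2) %fused std, smaller than either input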
Detailed derivation of the Kalman coefficient
Two common cases of taking the derivative of a trace, used in this derivation:
∂ tr(AB)/∂A = B^T
∂ tr(ABA^T)/∂A = A(B + B^T)   (= 2AB when B is symmetric)
A priori estimates of the noise covariance matrix
Extensions of Kalman Filters – Applications to nonlinear systems (EKF)
The core idea is to linearize the nonlinear system, because a nonlinear system cannot be expressed in the linear state-space form that the Kalman filter requires.
Tips: The linearization is not global; it is performed pointwise, around the current estimate.
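A minimal sketch of such a point linearization via a first-order Taylor expansion (the nonlinear function, the point, and the perturbation are all assumed):
f = @(x) [x(1)*x(2); sin(x(1))]; %an assumed nonlinear state function
x0 = [1; 2];                     %linearization point
J = [x0(2) x0(1); cos(x0(1)) 0]; %Jacobian of f evaluated at x0
dx = [0.01; -0.02];              %small perturbation around x0
f_lin = f(x0) + J*dx;            %linear approximation near x0
f_exact = f(x0 + dx);            %agrees with f_lin for small dx
disp([f_exact f_lin]);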
It is worth noting that the nonlinear functions used in this article are assumed to be differentiable, so that they can be linearized.
Then we can build the model:
x_t = f(x_{t-1}, u_t) + w_t,   z_t = h(x_t) + v_t
where f and h are nonlinear functions, w_t and v_t are the process and observation noise, and the Jacobians F_t = ∂f/∂x (evaluated at the previous estimate) and H_t = ∂h/∂x (evaluated at the prior estimate) take the places of F and H in the linear formula set.
Possible problems in the mathematical part of the EKF model derivation:
MATLAB implementation of a high-dimensional EKF
%% EKF 3 dimensions %%
clc;
clear;
q=0.1; %Set the process standard deviation
r=0.1; %Set the measurement standard deviation
Q=q^2*eye(3); %Covariance matrix for the case where the three variables are independent of each other
R=r^2*eye(3); %Measurement noise covariance: three independent channels, each with variance r^2
%Define a function f and h with input x
f=@(x)[x(2);x(3);0.05*x(1)*(x(2)+x(3))]; %Equation of state
h=@(x)[x(1);x(2);x(3)]; %Equation of measurement
s=[0;0;1]; %Set the initial value of the state quantity
%% Setting the initial state %%
x=s+q*randn(3,1); %Three rows and one column of noise are added to the initial value s
p=eye(3); %Covariance matrix, third-order identity matrix
xV=zeros(3,50); %The EKF posterior estimate is stored
sV=zeros(3,50);%A zero matrix that holds the actual values
zV=zeros(3,50); %Store observation zero matrix
for k=1:50 %Iterate 50 times
z=h(s)+r*randn(3,1); %Add independent observation noise to each channel of the theoretical measurement to obtain the observed value
sV(:,k)=s; %Actual value
zV(:,k)=z; %Measured value
[x1,A]=jaccsd(f,x); %Compute the Jacobian A of the state transition matrix f, where x1 is the prior value of the state quantities computed by the applied model
p=A*p*A'+Q; %Compute the prior covariance matrix
[z1,H]=jaccsd(h,x1);%Calculate the Jacobian of h
K=p*H'/(H*p*H'+R); %Solve for the Kalman gain (matrix right-division instead of inv, for numerical stability)
x=x1+K*(z-z1); %Estimated value of state EKF, z is the observed value, and z1 is the theoretical observed value calculated by the model
p=p-K*H*p;%Calculate the posterior covariance matrix
xV(:,k)=x; %Store the calculated posterior estimate
s=f(s)+q*randn(3,1); %Calculate the actual value
end
%% drawing the figure %%
for k=1:3
FontSize=14; %Set the font size
LineWidth=1; %Set line width
figure();
plot(sV(k,:),'r-'); %draw the real value
hold on;
plot(xV(k,:),'b-','LineWidth',LineWidth); %Plot the best estimate
plot(zV(k,:),'kx'); %Plot observed value
legend('True value','EKF posterior estimate','Observed value');
xl=xlabel('Number of iterations');
t=['State variable',num2str(k)];
yl=ylabel(t);
set(xl,'fontsize',FontSize);
set(yl,'fontsize',FontSize);
hold off;
set(gca,'FontSize',FontSize);
end
%% Define the function of the Jacobian matrix %%
function [z,A]=jaccsd(fun,x)
%JACCSD evaluates fun(x) and the Jacobian of fun at x using
%complex-step differentiation, which is accurate to machine precision
z=fun(x);
n=numel(x);
m=numel(z);
A=zeros(m,n);
h=n*eps; %step size for the complex perturbation
for k=1:n
    x1=x;
    x1(k)=x1(k)+h*1i; %perturb the k-th input along the imaginary axis
    A(:,k)=imag(fun(x1))/h; %the imaginary part recovers the partial derivatives
end
end
The resulting figures (one per state variable) show that the posterior estimates of the three-dimensional EKF track the true values closely.