Mathematical Derivation
Problem statement:
Suppose a set of noisy data points $[x^{*},y^{*}]$ follows the model
$$y_i^{*}=e^{a{x_i^{*}}^2+bx_i^{*}+c}+\delta$$
where $\delta$ is Gaussian noise. From this data we want to estimate $[a,b,c]^T$.
Solve for the parameters as a least-squares problem using the Gauss-Newton method:
$$[a,b,c] = \arg\min_{[a,b,c]}\sum_{i=1}^n \left\| y_i^{*}-e^{ax_i^2+bx_i+c} \right\|_2^2 \tag{1}$$
Let $v=[a,b,c]^T$; $v$ is the three-dimensional vector of values to be optimized. (Note the distinction between the $x$ in the model and the vector $v$ to be estimated; readers often confuse the quantity being optimized with the symbol $x$ in the equation.)
The error for the $i$-th data point is:
$$error_i = y_i-e^{ax_i^2+bx_i+c} \tag{2}$$
Taking the partial derivatives of the error with respect to $a$, $b$, and $c$, the corresponding Jacobian is:
$$J(v)=\begin{bmatrix} -x_i^2e^{ax_i^2+bx_i+c} \\ -x_ie^{ax_i^2+bx_i+c} \\ -e^{ax_i^2+bx_i+c} \end{bmatrix}$$
Form the Gauss-Newton approximation of the Hessian (here $J(v)$ is a column vector, so $J(v)J(v)^T$ is a $3\times 3$ matrix):
$$H(v) = J(v)J(v)^T \tag{4}$$
Then the bias term:
$$b(v)=-error(v)J(v) \tag{5}$$
These yield the linear system
$$H(v)\Delta v=b(v) \tag{6}$$
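Equations (4)–(6) are the standard Gauss-Newton normal equations; sketching where they come from, linearize the residual of the $i$-th point around the current estimate $v$:

$$error_i(v+\Delta v) \approx error_i(v) + J_i(v)^T \Delta v$$

Minimizing $\sum_{i=1}^n \left\| error_i(v) + J_i(v)^T \Delta v \right\|^2$ over $\Delta v$ and setting the derivative to zero gives

$$\Big(\sum_{i=1}^n J_i J_i^T\Big)\Delta v = -\sum_{i=1}^n error_i\, J_i$$

i.e. $H(v)\Delta v = b(v)$, with $H$ and $b$ accumulated over all $n$ data points, exactly as in the inner loop of the implementation below.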
Solve for the increment $\Delta v$:
$$\Delta v = H(v)^{-1}b(v) \tag{7}$$
Update $v_{k+1}$:
$$v_{k+1} = v_k + \Delta v \tag{8}$$
Iterate until the cost stops decreasing, i.e. stop when
$$cost_{current} > cost_{last} \tag{9}$$
(this is the criterion used in the implementation below). The resulting $v$ will then be close to the true, noise-free parameter values.
C++ Implementation
//
// Created by wpr on 18-12-17.
//
#include <iostream>
#include <cmath>
#include <opencv2/opencv.hpp>
#include <Eigen/Core>
#include <Eigen/Dense>

using namespace std;
using namespace Eigen;

int main(int argc, char **argv) {
    std::cout << "Hello GN.cpp" << std::endl;
    double a = 1.0, b = 2.0, c = 1.0;      // ground-truth parameters
    int N = 100;                           // number of data points
    double w_sigma = 1.0;                  // noise sigma
    cv::RNG rng;                           // OpenCV random number generator
    double ae = 2.0, be = -1.0, ce = 5.0;  // initial estimates of a, b, c

    vector<double> x_data, y_data;         // data
    cout << "generating data..." << endl;
    for (int i = 0; i < N; i++) {
        double x = i / 100.0;
        x_data.push_back(x);
        y_data.push_back(exp(a * x * x + b * x + c) + rng.gaussian(w_sigma));
    }

    // start iterating
    int iterations = 100;           // maximum number of iterations
    double cost = 0, lastCost = 0;  // cost of the current / previous iteration
    for (int iter = 0; iter < iterations; iter++) {
        Matrix3d H = Matrix3d::Zero();  // Hessian approximation H = J * J^T
        Vector3d b = Vector3d::Zero();  // bias
        cost = 0;
        for (int i = 0; i < N; i++) {
            double xi = x_data[i], yi = y_data[i];  // the i-th data point
            double error = yi - exp(ae * xi * xi + be * xi + ce);  // residual, equation (2)
            Vector3d J;  // Jacobian of the residual
            J[0] = -xi * xi * exp(ae * xi * xi + be * xi + ce);  // d(error)/da
            J[1] = -xi * exp(ae * xi * xi + be * xi + ce);       // d(error)/db
            J[2] = -exp(ae * xi * xi + be * xi + ce);            // d(error)/dc
            H += J * J.transpose();
            b += -error * J;
            cost += error * error;
        }
        // solve H * dx = b, equation (6)
        Vector3d dx = H.inverse() * b;
        if (std::isnan(dx[0])) {
            cout << "result is nan!" << endl;
            break;
        }
        if (iter > 0 && cost > lastCost) {
            // cost increased: the update made things worse, so stop
            cout << "cost: " << cost << ", last cost: " << lastCost << endl;
            break;
        }
        // update the estimates of a, b, c
        ae += dx[0];
        be += dx[1];
        ce += dx[2];
        lastCost = cost;
        cout << "total cost: " << cost << endl;
    }
    cout << "estimated abc = " << ae << ", " << be << ", " << ce << endl;
    return 0;
}