Introduction
In mathematics, statistics and computer science, particularly in machine learning and inverse problems, regularization is the process of adding information in order to solve an ill-posed problem or to prevent overfitting.
Regularization applies to objective functions in ill-posed optimization problems.
Classification
One use of regularization is in classification. Empirical learning of classifiers (from a finite data set) is always an underdetermined problem, because it attempts to infer a function of any $x$ given only examples $x_1, x_2, \ldots, x_n$.
A regularization term (or regularizer) $R(f)$ is added to a loss function:
$$\min_{f} \sum_{i=1}^{n} V(f(x_{i}), y_{i}) + \lambda R(f)$$
where
$V$ is an underlying loss function that describes the cost of predicting $f(x)$ when the label is $y$, such as the square loss or hinge loss;
$\lambda$ is a parameter which controls the importance of the regularization term;
$R(f)$ is typically chosen to impose a penalty on the complexity of $f$.
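As a concrete illustration of the objective above, here is a minimal sketch (with made-up data) using a linear model $f(x) = w^\top x$, the square loss for $V$, and the L2 penalty $R(f) = \lVert w \rVert^2$; this special case is ridge regression, whose minimizer has a closed form.

```python
import numpy as np

# Illustrative data only: 50 examples x_i with 3 features and noisy labels y_i.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=50)

lam = 0.1  # lambda: weight of the regularization term


def objective(w):
    residuals = X @ w - y              # f(x_i) - y_i for each example
    V = np.sum(residuals ** 2)         # square loss summed over the data
    R = np.sum(w ** 2)                 # complexity penalty R(f) = ||w||^2
    return V + lam * R


# For this choice of V and R the minimizer has a closed form:
#   w = (X^T X + lambda * I)^{-1} X^T y
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)

# w_ridge is the global minimizer of the convex objective, so it scores
# no worse than any other weight vector, including the true one.
print(objective(w_ridge) <= objective(w_true))  # True
```

Increasing `lam` shrinks the fitted weights toward zero, trading a larger loss on the training data for a simpler (lower-complexity) $f$.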