Common Kernel Functions

Original post, 2013-12-02 11:06:04

Kernel Functions

Below is a list of some kernel functions available from the existing literature. As with previous articles, the LaTeX notation for each formula below is available from the alternate-text attribute of its HTML image tag. I cannot guarantee that all of them are perfectly correct, so use them at your own risk. Most of them have links to the articles where they were originally used or proposed.

1. Linear Kernel

The Linear kernel is the simplest kernel function. It is given by the inner product <x,y> plus an optional constant c. Kernel algorithms using a linear kernel are often equivalent to their non-kernel counterparts, e.g. KPCA with a linear kernel is the same as standard PCA.

k(x, y) = x^T y + c

2. Polynomial Kernel

The Polynomial kernel is a non-stationary kernel. Polynomial kernels are well suited for problems where all the training data is normalized.

k(x, y) = (\alpha x^T y + c)^d
Adjustable parameters are the slope alpha, the constant term c and the polynomial degree d.
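
As a minimal sketch, the linear and polynomial kernels above could be written with NumPy as follows. The function names and the default parameter values are assumptions chosen for illustration, not part of the original list.

```python
import numpy as np

def linear_kernel(x, y, c=0.0):
    """k(x, y) = x^T y + c"""
    return float(np.dot(x, y) + c)

def polynomial_kernel(x, y, alpha=1.0, c=1.0, d=2):
    """k(x, y) = (alpha * x^T y + c)^d"""
    return float((alpha * np.dot(x, y) + c) ** d)

x = np.array([1.0, 2.0])
y = np.array([3.0, 4.0])
print(linear_kernel(x, y))       # x^T y = 1*3 + 2*4 = 11
print(polynomial_kernel(x, y))   # (11 + 1)^2 = 144
```

With alpha = 1 and d = 1 the polynomial kernel reduces to the linear kernel with the same constant c.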

3. Gaussian Kernel

The Gaussian kernel is an example of radial basis function kernel.

k(x, y) = \exp\left(-\frac{ \lVert x-y \rVert ^2}{2\sigma^2}\right)

Alternatively, it could also be implemented using

k(x, y) = \exp\left(- \gamma \lVert x-y \rVert ^2 \right)

The adjustable parameter sigma plays a major role in the performance of the kernel, and should be carefully tuned to the problem at hand. If overestimated, the exponential will behave almost linearly and the higher-dimensional projection will start to lose its non-linear power. On the other hand, if underestimated, the function will lack regularization and the decision boundary will be highly sensitive to noise in the training data.
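
The two forms of the Gaussian kernel coincide when gamma = 1 / (2 sigma^2). The following is a small sketch of both forms, with hypothetical function names, that also prints the kernel value for a very small, a moderate and a very large sigma to illustrate the behaviour described above.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """k(x, y) = exp(-||x - y||^2 / (2 sigma^2))"""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))

def gaussian_kernel_gamma(x, y, gamma=0.5):
    """k(x, y) = exp(-gamma ||x - y||^2)"""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))

x = np.array([0.0, 0.0])
y = np.array([1.0, 1.0])

# Small sigma: distinct points look unrelated (value near 0).
# Large sigma: all points look alike (value near 1).
for sigma in (0.1, 1.0, 10.0):
    print(sigma, gaussian_kernel(x, y, sigma))
```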

4. Exponential Kernel

The exponential kernel is closely related to the Gaussian kernel, with only the square of the norm left out. It is also a radial basis function kernel.

k(x, y) = \exp\left(-\frac{ \lVert x-y \rVert }{2\sigma^2}\right)

5. Laplacian Kernel

The Laplacian kernel is equivalent to the exponential kernel up to a reparameterization of sigma. Because sigma enters linearly rather than squared, it is less sensitive to changes in the sigma parameter. Being equivalent, it is also a radial basis function kernel.

k(x, y) = \exp\left(- \frac{\lVert x-y \rVert }{\sigma}\right)

It is important to note that the observations made about the sigma parameter for the Gaussian kernel also apply to the Exponential and Laplacian kernels.
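
To make the relation between the two kernels concrete, here is a short sketch (function names are illustrative assumptions). With the formulas as given above, a Laplacian kernel with parameter 2 sigma^2 reproduces an exponential kernel with parameter sigma.

```python
import numpy as np

def exponential_kernel(x, y, sigma=1.0):
    """k(x, y) = exp(-||x - y|| / (2 sigma^2))"""
    r = np.linalg.norm(np.asarray(x, dtype=float) - np.asarray(y, dtype=float))
    return float(np.exp(-r / (2.0 * sigma ** 2)))

def laplacian_kernel(x, y, sigma=1.0):
    """k(x, y) = exp(-||x - y|| / sigma)"""
    r = np.linalg.norm(np.asarray(x, dtype=float) - np.asarray(y, dtype=float))
    return float(np.exp(-r / sigma))

x = np.array([0.0, 1.0])
y = np.array([2.0, 3.0])
s = 1.5
# Same value: the two kernels differ only in how sigma is parameterized.
print(exponential_kernel(x, y, sigma=s), laplacian_kernel(x, y, sigma=2.0 * s ** 2))
```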

6. ANOVA Kernel

The ANOVA kernel is also a radial basis function kernel, just as the Gaussian and Laplacian kernels. It is said to perform well in multidimensional regression problems (Hofmann, 2008).

k(x, y) =  \sum_{k=1}^n  \exp (-\sigma (x^k - y^k)^2)^d
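
A possible reading of this formula in code is sketched below. It assumes that the power d applies to each exponential term before summation, exactly as the formula is written; the function name and defaults are illustrative.

```python
import numpy as np

def anova_kernel(x, y, sigma=1.0, d=2):
    """k(x, y) = sum_k exp(-sigma (x^k - y^k)^2)^d, summed over components k."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum(np.exp(-sigma * (x - y) ** 2) ** d))

x = np.array([1.0, 2.0, 3.0])
# For identical inputs every term equals 1, so the kernel equals the dimension n.
print(anova_kernel(x, x))
```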

7. Hyperbolic Tangent (Sigmoid) Kernel

The Hyperbolic Tangent Kernel is also known as the Sigmoid Kernel and as the Multilayer Perceptron (MLP) kernel. The Sigmoid Kernel comes from the Neural Networks field, where the bipolar sigmoid function is often used as an activation function for artificial neurons.

k(x, y) = \tanh (\alpha x^T y + c)

It is interesting to note that an SVM model using a sigmoid kernel function is equivalent to a two-layer perceptron neural network. This kernel was quite popular for support vector machines due to its origin in neural network theory. Also, despite being only conditionally positive definite, it has been found to perform well in practice.

There are two adjustable parameters in the sigmoid kernel, the slope alpha and the intercept constant c. A common value for alpha is 1/N, where N is the data dimension. A more detailed study on sigmoid kernels can be found in the work by Hsuan-Tien Lin and Chih-Jen Lin.
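
The sketch below builds a sigmoid Gram matrix with the alpha = 1/N rule of thumb and checks its smallest eigenvalue. The toy data and the choice c = -1 are assumptions picked to make the lack of positive semi-definiteness visible: for the zero vector, k(x, x) = tanh(c) is negative.

```python
import numpy as np

def sigmoid_gram(X, alpha=None, c=0.0):
    """Gram matrix K[i, j] = tanh(alpha * x_i^T x_j + c)."""
    X = np.asarray(X, dtype=float)
    if alpha is None:
        alpha = 1.0 / X.shape[1]   # rule of thumb: 1 / data dimension
    return np.tanh(alpha * (X @ X.T) + c)

X = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [0.0, 2.0]])
K = sigmoid_gram(X, c=-1.0)

# A negative eigenvalue means K is not positive semi-definite.
min_eig = np.linalg.eigvalsh(K).min()
print(K[0, 0], min_eig)
```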

8. Rational Quadratic Kernel

The Rational Quadratic kernel is less computationally intensive than the Gaussian kernel and can be used as an alternative when using the Gaussian becomes too expensive.

k(x, y) = 1 - \frac{\lVert x-y \rVert^2}{\lVert x-y \rVert^2 + c}
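
A minimal sketch of this kernel follows (the function name and default c are assumptions). Note that no exponential is evaluated, and that the expression is algebraically the same as c / (||x - y||^2 + c).

```python
import numpy as np

def rational_quadratic_kernel(x, y, c=1.0):
    """k(x, y) = 1 - ||x - y||^2 / (||x - y||^2 + c)"""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    r2 = float(np.dot(diff, diff))
    return 1.0 - r2 / (r2 + c)

x = np.array([0.0, 0.0])
y = np.array([1.0, 1.0])
# ||x - y||^2 = 2, so with c = 2 the value is 1 - 2/4 = 0.5
print(rational_quadratic_kernel(x, y, c=2.0))
```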

9. Multiquadric Kernel

The Multiquadric kernel can be used in the same situations as the Rational Quadratic kernel. As is the case with the Sigmoid kernel, it is also an example of a non-positive definite kernel.

k(x, y) = \sqrt{\lVert x-y \rVert^2 + c^2}

10. Inverse Multiquadric Kernel

As with the Gaussian kernel, the Inverse Multiquadric kernel results in a kernel matrix with full rank (Micchelli, 1986) and thus forms an infinite-dimensional feature space.

k(x, y) = \frac{1}{\sqrt{\lVert x-y \rVert^2 + \theta^2}}
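
The multiquadric and inverse multiquadric kernels can be sketched side by side as below (function names and defaults are illustrative assumptions). The multiquadric grows with distance, while its inverse decays with distance like the other radial kernels in this list.

```python
import numpy as np

def multiquadric_kernel(x, y, c=1.0):
    """k(x, y) = sqrt(||x - y||^2 + c^2)"""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.sqrt(np.dot(diff, diff) + c ** 2))

def inverse_multiquadric_kernel(x, y, theta=1.0):
    """k(x, y) = 1 / sqrt(||x - y||^2 + theta^2)"""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(1.0 / np.sqrt(np.dot(diff, diff) + theta ** 2))

x = np.array([0.0, 0.0])
y = np.array([3.0, 0.0])
# ||x - y||^2 = 9, so with c = theta = 4: sqrt(25) = 5 and 1/5 = 0.2
print(multiquadric_kernel(x, y, c=4.0), inverse_multiquadric_kernel(x, y, theta=4.0))
```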

11. Circular Kernel

The circular kernel comes from a statistics perspective. It is an example of an isotropic stationary kernel and is positive definite in R^2.

k(x, y) = \frac{2}{\pi} \arccos \left( \frac{ \lVert x-y \rVert}{\sigma} \right) - \frac{2}{\pi} \frac{ \lVert x-y \rVert}{\sigma} \sqrt{1 - \left(\frac{ \lVert x-y \rVert}{\sigma} \right)^2}
\mbox{if}~ \lVert x-y \rVert < \sigma \mbox{, zero otherwise}

12. Spherical Kernel

The spherical kernel is similar to the circular kernel, but is positive definite in R^3.

k(x, y) = 1 - \frac{3}{2} \frac{\lVert x-y \rVert}{\sigma} + \frac{1}{2} \left( \frac{ \lVert x-y \rVert}{\sigma} \right)^3

\mbox{if}~ \lVert x-y \rVert < \sigma \mbox{, zero otherwise}
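
The spherical kernel has compact support: it is exactly zero once the distance reaches sigma. A minimal sketch, with an assumed function name and default sigma:

```python
import numpy as np

def spherical_kernel(x, y, sigma=1.0):
    """k = 1 - 1.5 (r / sigma) + 0.5 (r / sigma)^3 if r < sigma, else 0."""
    r = float(np.linalg.norm(np.asarray(x, dtype=float) - np.asarray(y, dtype=float)))
    if r >= sigma:
        return 0.0
    u = r / sigma
    return 1.0 - 1.5 * u + 0.5 * u ** 3

print(spherical_kernel([0.0], [0.0]))   # distance 0 gives 1
print(spherical_kernel([0.0], [0.5]))   # 1 - 0.75 + 0.0625 = 0.3125
print(spherical_kernel([0.0], [2.0]))   # outside the support gives 0
```

Because most pairs of distant points yield exactly zero, the resulting kernel matrix can be sparse.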

13. Wave Kernel

The Wave kernel is also symmetric positive semi-definite (Huang, 2008).

k(x, y) = \frac{\theta}{\lVert x-y \rVert} \sin \left( \frac{\lVert x-y \rVert }{\theta} \right)
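
Evaluated naively, the wave kernel divides by zero when x = y. One way around this, sketched below under an assumed function name, is to use NumPy's normalized sinc function, np.sinc(t) = sin(pi t) / (pi t), which returns 1 at t = 0.

```python
import numpy as np

def wave_kernel(x, y, theta=1.0):
    """k(x, y) = (theta / r) * sin(r / theta), with r = ||x - y||."""
    r = np.linalg.norm(np.asarray(x, dtype=float) - np.asarray(y, dtype=float))
    # sin(r / theta) / (r / theta) rewritten with the normalized sinc
    return float(np.sinc(r / (np.pi * theta)))

print(wave_kernel([1.0, 2.0], [1.0, 2.0]))    # limit value at r = 0
print(wave_kernel([0.0], [1.0], theta=2.0))   # 2 * sin(0.5)
```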

14. Power Kernel

The Power kernel is also known as the (unrectified) triangular kernel. It is an example of a scale-invariant kernel (Sahbi and Fleuret, 2004) and is also only conditionally positive definite.

k(x,y) = - \lVert x-y \rVert ^d

15. Log Kernel

The Log kernel seems to be particularly interesting for images, but is only conditionally positive definite.

k(x,y) = - \log (\lVert x-y \rVert ^d + 1)

16. Spline Kernel

The Spline kernel is given as a piece-wise cubic polynomial, as derived in the works by Gunn (1998).

k(x, y) = 1 + xy + xy \min(x,y) - \frac{x+y}{2} \min(x,y)^2 + \frac{1}{3}\min(x,y)^3

However, what it actually means is:

k(x,y) = \prod_{i=1}^d \left( 1 + x_i y_i + x_i y_i \min(x_i, y_i) - \frac{x_i + y_i}{2} \min(x_i,y_i)^2 + \frac{\min(x_i,y_i)^3}{3} \right)

with x, y \in R^d.
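
The multidimensional form is a product of one-dimensional spline terms, one per coordinate. A short sketch of that product, with an assumed function name:

```python
import numpy as np

def spline_kernel(x, y):
    """Product over coordinates of the one-dimensional cubic spline term."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    y = np.atleast_1d(np.asarray(y, dtype=float))
    m = np.minimum(x, y)
    terms = 1.0 + x * y + x * y * m - 0.5 * (x + y) * m ** 2 + m ** 3 / 3.0
    return float(np.prod(terms))

# One dimension, x = y = 1: 1 + 1 + 1 - 1 + 1/3 = 7/3
print(spline_kernel([1.0], [1.0]))
# Two dimensions: the per-coordinate terms multiply, giving (7/3)^2
print(spline_kernel([1.0, 1.0], [1.0, 1.0]))
```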

17. B-Spline (Radial Basis Function) Kernel

The B-Spline kernel is defined on the interval [−1, 1]. It is given by the recursive formula:

k(x,y) = B_{2p+1}(x-y)

\mbox{where~} p \in N \mbox{~with~} B_{i+1} := B_i \otimes  B_0.

In the work by Bart Hamers it is given by:

k(x, y) = \prod_{p=1}^d B_{2n+1}(x_p - y_p)

Alternatively, B_n can be computed using the explicit expression (Fomel, 2000):

B_n(x) = \frac{1}{n!} \sum_{k=0}^{n+1} \binom{n+1}{k} (-1)^k (x + \frac{n+1}{2} - k)^n_+

where x^d_+ is defined as the truncated power function:

x^d_+ = \begin{cases} x^d, & \mbox{if }x > 0 \\  0, & \mbox{otherwise} \end{cases}
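
The explicit expression translates fairly directly into code. The sketch below uses hypothetical function names and evaluates B_n through the truncated power function, then forms the product kernel over coordinates with B_{2n+1}.

```python
import numpy as np
from math import comb, factorial

def truncated_power(t, d):
    """t^d if t > 0, else 0 (elementwise)."""
    t = np.asarray(t, dtype=float)
    return np.where(t > 0, t ** d, 0.0)

def b_spline(x, n):
    """Centered B-spline of degree n via the explicit sum."""
    x = np.asarray(x, dtype=float)
    total = 0.0
    for k in range(n + 2):
        shifted = x + (n + 1) / 2.0 - k
        total = total + comb(n + 1, k) * (-1) ** k * truncated_power(shifted, n)
    return total / factorial(n)

def b_spline_kernel(x, y, n=1):
    """k(x, y) = prod_p B_{2n+1}(x_p - y_p)"""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.prod(b_spline(diff, 2 * n + 1)))

print(float(b_spline(0.0, 1)))   # degree 1 is the hat function, peak value 1
print(float(b_spline(0.0, 3)))   # cubic B-spline at the origin: 2/3
```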

18. Bessel Kernel

The Bessel kernel is well known in the theory of function spaces of fractional smoothness. It is given by:

k(x, y) = \frac{J_{v+1}( \sigma \lVert x-y \rVert)}{ \lVert x-y \rVert ^ {-n(v+1)} }

where J is the Bessel function of the first kind. However, in the Kernlab for R documentation, the Bessel kernel is said to be:

k(x,x') = - \mathrm{Bessel}_{(\nu+1)}^n (\sigma |x - x'|^2)

19. Cauchy Kernel

The Cauchy kernel comes from the Cauchy distribution (Basak, 2008). It is a long-tailed kernel and can be used to give long-range influence and sensitivity over the high dimension space.

k(x, y) = \frac{1}{1 + \frac{\lVert x-y \rVert^2}{\sigma} }
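
To illustrate the long tail, the sketch below (function name and test points are assumptions) compares the Cauchy kernel with a Gaussian of unit sigma for two points at distance 5.

```python
import numpy as np

def cauchy_kernel(x, y, sigma=1.0):
    """k(x, y) = 1 / (1 + ||x - y||^2 / sigma)"""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(1.0 / (1.0 + np.dot(diff, diff) / sigma))

x = np.array([0.0, 0.0])
y = np.array([3.0, 4.0])                 # ||x - y||^2 = 25

cauchy_value = cauchy_kernel(x, y)       # 1 / 26
gaussian_value = float(np.exp(-25.0 / 2.0))
# The Cauchy value decays polynomially, the Gaussian exponentially.
print(cauchy_value, gaussian_value)
```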

20. Chi-Square Kernel

The Chi-Square kernel comes from the Chi-Square distribution.

k(x,y) = 1 - \sum_{i=1}^n \frac{(x_i-y_i)^2}{\frac{1}{2}(x_i+y_i)}
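
A minimal sketch of this formula is given below. It assumes non-negative inputs such as histograms, and it skips bins where x_i + y_i = 0, since the formula is undefined there; that handling is an assumption of this sketch, not something stated in the source.

```python
import numpy as np

def chi_square_kernel(x, y):
    """k(x, y) = 1 - sum_i (x_i - y_i)^2 / (0.5 (x_i + y_i))"""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    denom = 0.5 * (x + y)
    mask = denom > 0          # assumption: ignore bins empty in both inputs
    return float(1.0 - np.sum((x[mask] - y[mask]) ** 2 / denom[mask]))

h = np.array([0.5, 0.5, 0.0])
print(chi_square_kernel(h, h))                    # identical histograms give 1
print(chi_square_kernel([1.0, 0.0], [0.0, 1.0]))  # 1 - (2 + 2) = -3
```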

21. Histogram Intersection Kernel

The Histogram Intersection Kernel is also known as the Min Kernel and has been proven useful in image classification.

k(x,y) = \sum_{i=1}^n \min(x_i,y_i)

22. Generalized Histogram Intersection

The Generalized Histogram Intersection kernel is built based on the Histogram Intersection Kernel for image classification but applies in a much larger variety of contexts (Boughorbel, 2005). It is given by:

k(x,y) = \sum_{i=1}^m \min(|x_i|^\alpha,|y_i|^\beta)
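
Both intersection kernels are one-liners in NumPy. The sketch below uses assumed function names; with alpha = beta = 1 and non-negative inputs the generalized form reduces to the plain histogram intersection.

```python
import numpy as np

def histogram_intersection(x, y):
    """k(x, y) = sum_i min(x_i, y_i)"""
    return float(np.sum(np.minimum(x, y)))

def generalized_histogram_intersection(x, y, alpha=1.0, beta=1.0):
    """k(x, y) = sum_i min(|x_i|^alpha, |y_i|^beta)"""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum(np.minimum(np.abs(x) ** alpha, np.abs(y) ** beta)))

x = np.array([0.2, 0.5, 0.3])
y = np.array([0.1, 0.6, 0.3])
# min per bin: 0.1 + 0.5 + 0.3 = 0.9
print(histogram_intersection(x, y))
print(generalized_histogram_intersection(x, y))
```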

23. Generalized T-Student Kernel

The Generalized T-Student Kernel has been proven to be a Mercer kernel, thus having a positive semi-definite kernel matrix (Boughorbel, 2004). It is given by:

k(x,y) = \frac{1}{1 + \lVert x-y \rVert ^d}

24. Bayesian Kernel

The Bayesian kernel could be given as:

k(x,y) = \prod_{l=1}^N \kappa_l (x_l,y_l)

where

\kappa_l(a,b) = \sum_{c \in \{0;1\}} P(Y=c \mid X_l=a) ~ P(Y=c \mid X_l=b)

However, it really depends on the problem being modeled. For more information, please see the work by Alashwal, Deris and Othman, in which they used an SVM with Bayesian kernels in the prediction of protein-protein interactions.

25. Wavelet Kernel

The Wavelet kernel (Zhang et al, 2004) comes from Wavelet theory and is given as:

k(x,y) = \prod_{i=1}^N h(\frac{x_i-c_i}{a}) \:  h(\frac{y_i-c_i}{a})

Where a and c are the wavelet dilation and translation coefficients, respectively (the form presented above is a simplification, please see the original paper for details). A translation-invariant version of this kernel can be given as:

k(x,y) = \prod_{i=1}^N h(\frac{x_i-y_i}{a})

In both forms, h(x) denotes a mother wavelet function. In the paper by Li Zhang, Weida Zhou, and Licheng Jiao, the authors suggest a possible h(x) as:

h(x) = \cos(1.75x) \exp\left(-\frac{x^2}{2}\right)

which they also prove to be an admissible kernel function.
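
The translation-invariant form with this mother wavelet can be sketched as follows. The function names and the default dilation a = 1 are assumptions for illustration.

```python
import numpy as np

def mother_wavelet(u):
    """h(u) = cos(1.75 u) * exp(-u^2 / 2)"""
    u = np.asarray(u, dtype=float)
    return np.cos(1.75 * u) * np.exp(-u ** 2 / 2.0)

def wavelet_kernel(x, y, a=1.0):
    """k(x, y) = prod_i h((x_i - y_i) / a)"""
    u = (np.asarray(x, dtype=float) - np.asarray(y, dtype=float)) / a
    return float(np.prod(mother_wavelet(u)))

x = np.array([0.3, -1.2])
y = np.array([1.0, 0.4])
print(wavelet_kernel(x, x))   # h(0) = 1 in every coordinate
print(wavelet_kernel(x, y))
```

Since h is an even function, the kernel is symmetric in x and y.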


Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission.
