Kernel density estimation

The following is adapted from https://en.wikipedia.org/wiki/Kernel_density_estimation
In statistics, kernel density estimation (KDE) is a non-parametric way to estimate the probability density function of a random variable. Kernel density estimation is a fundamental data smoothing problem where inferences about the population are made, based on a finite data sample. In some fields such as signal processing and econometrics it is also termed the Parzen–Rosenblatt window method, after Emanuel Parzen and Murray Rosenblatt, who are usually credited with independently creating it in its current form.

Definition

Let $(x_1, x_2, \ldots, x_n)$ be an independent and identically distributed sample drawn from some distribution with an unknown density $f$. We are interested in estimating the shape of this function $f$. Its kernel density estimator is

$$\hat{f}_h(x) = \frac{1}{n} \sum_{i=1}^{n} K_h(x - x_i) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right)$$

where $K(\cdot)$ is the kernel (a non-negative function that integrates to one and has mean zero) and $h > 0$ is a smoothing parameter called the bandwidth. A kernel with subscript $h$ is called the scaled kernel and is defined as $K_h(x) = \frac{1}{h} K\left(\frac{x}{h}\right)$. Intuitively one wants to choose $h$ as small as the data allow; however, there is always a trade-off between the bias of the estimator and its variance. The choice of bandwidth is discussed in more detail below.

A range of kernel functions are commonly used: uniform, triangular, biweight, triweight, Epanechnikov, normal, and others. The Epanechnikov kernel is optimal in a mean square error sense, though the loss of efficiency is small for the kernels listed previously. Due to its convenient mathematical properties, the normal kernel is often used, which means $K(x) = \phi(x)$, where $\phi$ is the standard normal density function.
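To make the definition concrete, here is a minimal NumPy sketch of the estimator above with a normal kernel (the function names `gaussian_kernel` and `kde` are illustrative, not from the source):

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density: non-negative, integrates to one, mean zero."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def kde(x, data, h):
    """Evaluate f_hat_h(x) = 1/(n*h) * sum_i K((x - x_i) / h) at the points x."""
    x = np.asarray(x, dtype=float)
    data = np.asarray(data, dtype=float)
    # Scaled differences between every evaluation point and every sample.
    u = (x[:, None] - data[None, :]) / h
    return gaussian_kernel(u).sum(axis=1) / (data.size * h)
```

Each evaluation point receives a contribution $K_h(x - x_i)$ from every sample, so the estimate is itself a density that integrates to one for any bandwidth $h > 0$.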
The construction of a kernel density estimate finds interpretations in fields outside of density estimation.
For example, in thermodynamics, this is equivalent to the amount of heat generated when heat kernels (the fundamental solution to the heat equation) are placed at each data point location $x_i$. Similar methods are used to construct discrete Laplace operators on point clouds for manifold learning.
Kernel density estimates are closely related to histograms, but can be endowed with properties such as smoothness or continuity by using a suitable kernel. To see this, we compare the construction of histogram and kernel density estimators using these 6 data points: $x_1 = -2.1$, $x_2 = -1.3$, $x_3 = -0.4$, $x_4 = 1.9$, $x_5 = 5.1$, $x_6 = 6.2$. For the histogram, the horizontal axis is first divided into sub-intervals, or bins, which cover the range of the data. In this case, we have 6 bins, each of width 2. Whenever a data point falls inside a bin, we place a box of height 1/12. If more than one data point falls inside the same bin, we stack the boxes on top of each other.
For the kernel density estimate, we place a normal kernel with variance 2.25 on each of the data points $x_i$. The kernels are summed to make the kernel density estimate. The smoothness of the kernel density estimate is evident compared to the discreteness of the histogram, as kernel density estimates converge faster to the true underlying density for continuous random variables.
[Figure: histogram and kernel density estimate constructed from the six data points; the KDE is the sum of normal kernels centered at each point]
As the figure above shows, the KDE can be seen as a superposition of normal density functions, one placed on each data point.
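This construction is easy to reproduce numerically. Below is a sketch using the six data points above, with width-2 histogram bins and a normal kernel of variance 2.25 (so bandwidth $h = 1.5$); the bin edges are an assumed choice covering the data, since the source does not list them:

```python
import numpy as np

data = np.array([-2.1, -1.3, -0.4, 1.9, 5.1, 6.2])

# Histogram: 6 bins of width 2 covering the data (assumed edges); with
# density=True each point contributes a box of height 1/12 = 1/(6*2).
edges = np.arange(-4.0, 8.0 + 1e-9, 2.0)
hist, _ = np.histogram(data, bins=edges, density=True)

# KDE: normal kernel with variance 2.25, i.e. bandwidth h = 1.5.
h = 1.5
grid = np.linspace(-7.0, 11.0, 361)
u = (grid[:, None] - data[None, :]) / h
kde_vals = (np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)).sum(axis=1) / (data.size * h)
```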
The details of bandwidth selection are not discussed here.

### Answer 1

Kernel density estimation is a non-parametric method that estimates the distribution of data through its probability density function. The method smooths the data and estimates the underlying density, giving a clearer picture of how the data are distributed. A kernel function is chosen to build the estimate; commonly used kernels include the Gaussian, rectangular (uniform), and triangular kernels. Kernel density estimation is widely applied in data analysis, signal processing, image processing, and other fields.

### Answer 2

Kernel density estimation (KDE) is a non-parametric statistical method for estimating the shape and location of a probability density function (PDF).

The core idea is to place a kernel function around each data point to build a smooth density estimate. The kernel is itself a standard probability density; it opens a local density window around each data point and accumulates that point's contribution into the window. Where data points cluster together, many kernels overlap and their contributions pile up into a higher density; where data points are far apart, fewer kernels contribute and the estimate stays low.

KDE also involves an important parameter, the bandwidth, which controls the width of the kernel window. With a small bandwidth the density curve becomes narrow and spiky, which tends to give a low-bias but high-variance estimate. Conversely, with a large bandwidth the curve becomes flatter and smoother, which tends to give a high-bias but low-variance estimate.

Kernel density estimates can be used to visualize and compare data distributions, or as a building block for other statistical methods such as classification and clustering. Because KDE is non-parametric, it does not rely on distributional assumptions or priors, so it can be applied to a wide range of data sets and statistical problems.

### Answer 3

Kernel density estimation is a non-parametric statistical method for estimating a probability density function. In short, it places a kernel function at each data point and sums the kernels to obtain the density estimate. The kernel can be any continuous function that is non-negative, symmetric about the origin, and integrates to one.

The strength of kernel density estimation is that it can handle very complicated distributions without assuming a particular parametric form. The method has very broad applications, most commonly in data analysis, data mining, pattern recognition, and signal processing.

Implementing kernel density estimation involves three steps: choosing the kernel, choosing the bandwidth, and computing the density estimate. The Gaussian or Epanechnikov kernel is usually chosen: the Gaussian kernel is $K(x)=\frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}x^2}$, and the Epanechnikov kernel is $K(x)=\frac{3}{4}(1-x^2)$ for $|x| \le 1$. The bandwidth is usually chosen by cross-validation. The density estimate is then obtained by shifting and scaling the kernel to each data point and averaging the resulting contributions. If the bandwidth is too small for the sample, the estimate becomes increasingly wiggly; increasing the bandwidth smooths out these fluctuations.

In summary, kernel density estimation is a very useful statistical method with broad practical applications, and its interpretability and flexibility meet the needs of many real problems.
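Answer 3 mentions choosing the bandwidth by cross-validation. A minimal sketch of that idea with scikit-learn follows; the candidate bandwidth grid and the leave-one-out setup are illustrative assumptions, not prescriptions from the source:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KernelDensity

data = np.array([-2.1, -1.3, -0.4, 1.9, 5.1, 6.2]).reshape(-1, 1)

# Grid-search the bandwidth by maximizing the held-out log-likelihood;
# with six points, cv=6 amounts to leave-one-out cross-validation.
search = GridSearchCV(
    KernelDensity(kernel="gaussian"),
    {"bandwidth": np.linspace(0.1, 3.0, 30)},
    cv=len(data),
)
search.fit(data)
print("selected bandwidth:", search.best_params_["bandwidth"])

# score_samples returns log-density; exponentiate to get the density.
xs = np.linspace(-6.0, 10.0, 5).reshape(-1, 1)
print(np.exp(search.best_estimator_.score_samples(xs)))
```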