Maximum Likelihood Estimation (MLE)
First, let us look at the definition of the likelihood function.
1 Introduction
The principle of maximum likelihood is relatively straightforward. As before, we begin with a sample X = (X1,...,Xn) of random variables chosen according to one of a family of probabilities Pθ. In addition, f(x|θ), x = (x1,...,xn), will be used to denote the density function for the data when θ is the true state of nature. The principle of maximum likelihood then yields a choice of the estimator θ̂ as the value of the parameter that makes the observed data most probable.

Definition 1. The likelihood function is the density function regarded as a function of θ:
L(θ|x) = f(x|θ), θ ∈ Θ. (1)
The maximum likelihood estimator (MLE) is

θ̂(x) = arg max_θ L(θ|x). (2)

We will learn that, especially for large samples, maximum likelihood estimators have many desirable properties. However, especially for high-dimensional data, the likelihood can have many local maxima, so finding the global maximum can be a major computational challenge.

This class of estimators has an important property, known as invariance: if θ̂(x) is a maximum likelihood estimate for θ, then g(θ̂(x)) is a maximum likelihood estimate for g(θ). For example, if θ is a parameter for the variance and θ̂ is its maximum likelihood estimator, then √θ̂ is the maximum likelihood estimator for the standard deviation. This flexibility in the estimation criterion is not available in the case of unbiased estimators.

Typically, maximizing the score function, ln L(θ|x), the logarithm of the likelihood, will be easier. Having the parameter values be the variable of interest is somewhat unusual, so we will next look at several examples of the likelihood function.
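As a first numerical illustration of definitions (1) and (2), here is a minimal sketch (assuming NumPy and a small hypothetical Bernoulli sample) that evaluates the likelihood on a grid and picks the maximizer:

```python
import numpy as np

# Hypothetical data: 10 Bernoulli(theta) trials with 7 successes.
x = np.array([1, 1, 0, 1, 1, 0, 1, 1, 0, 1])

def likelihood(theta, x):
    # L(theta | x) = prod_i theta^x_i * (1 - theta)^(1 - x_i)
    return np.prod(theta ** x * (1.0 - theta) ** (1 - x))

# Brute-force search for the argmax over a grid covering Theta = (0, 1).
grid = np.linspace(0.001, 0.999, 999)
theta_hat = grid[np.argmax([likelihood(t, x) for t in grid])]
# For Bernoulli data the MLE is the sample proportion, here 7/10.
```

A grid search like this only works for a one-dimensional parameter; it is meant to make the "choose θ that makes the data most probable" idea concrete, not to be an efficient optimizer.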
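The invariance property can also be checked numerically. In this sketch (assuming NumPy and a hypothetical normal sample with known mean 0), the value of σ that directly maximizes the log-likelihood agrees with the square root of the closed-form variance MLE:

```python
import numpy as np

# Hypothetical sample from a normal distribution with known mean 0.
rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=2.0, size=500)

# Closed-form MLE for the variance when the mean is known to be 0:
# sigma2_hat = (1/n) * sum(x_i^2)
sigma2_hat = np.mean(x ** 2)

# Directly maximize the log-likelihood over sigma (dropping constants):
# ln L(sigma | x) = -n ln(sigma) - sum(x_i^2) / (2 sigma^2) + const
def log_likelihood(sigma):
    return -len(x) * np.log(sigma) - np.sum(x ** 2) / (2.0 * sigma ** 2)

grid = np.linspace(0.5, 5.0, 4501)
sigma_hat = grid[np.argmax([log_likelihood(s) for s in grid])]

# Invariance: the maximizer over sigma agrees with sqrt(sigma2_hat)
# up to the grid resolution.
```

Note that the code maximizes ln L rather than L itself, as suggested above; since the logarithm is strictly increasing, both have the same maximizer.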
In other words: