The logarithm of the likelihood function $L$ for estimating the population mean and variance of an i.i.d. normal sample is as follows:
$$L=-\frac{n}{2}\log(2\pi \sigma^2)-\frac{1}{2\sigma^2}\sum(Y_i-\mu_Y)^2$$
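As a quick sanity check, this expression can be evaluated numerically and compared with the sum of normal log-densities. The sketch below is illustrative only: the simulated sample `y`, the candidate parameter values, and the use of NumPy/SciPy are assumptions made for this example, not part of the original notes.

```python
# Minimal sketch: evaluate the log-likelihood formula and cross-check it
# against the sum of normal log-densities (data simulated for illustration).
import numpy as np
from scipy import stats

def normal_loglik(y, mu, sigma2):
    """Log-likelihood of an i.i.d. N(mu, sigma2) sample y."""
    n = len(y)
    return -n / 2 * np.log(2 * np.pi * sigma2) - np.sum((y - mu) ** 2) / (2 * sigma2)

y = np.random.default_rng(0).normal(loc=1.0, scale=2.0, size=50)  # hypothetical sample
mu, sigma2 = 1.0, 4.0                                             # candidate parameter values

print(normal_loglik(y, mu, sigma2))                               # formula above
print(stats.norm.logpdf(y, loc=mu, scale=np.sqrt(sigma2)).sum())  # should agree
```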
- Maximum likelihood estimators
Taking the derivative with respect to the two parameters $\mu_Y$ and $\sigma^2$ results in
$$\frac{\partial L}{\partial \mu_Y}=-\frac{1}{2\sigma^2}\sum 2(Y_i-\mu_Y)(-1)=\frac{1}{\sigma^2}\sum (Y_i-\mu_Y)$$
$$\frac{\partial L}{\partial \sigma^2}=-\frac{n}{2\sigma^2}+\frac{1}{2\sigma^4}\sum(Y_i-\mu_Y)^2$$
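These analytic derivatives can be checked against a numerical gradient of the log-likelihood. The sketch below again assumes NumPy/SciPy and a simulated sample; `scipy.optimize.approx_fprime` is used only as a convenient finite-difference check.

```python
# Sketch: compare the analytic score with a finite-difference gradient
# of the log-likelihood (sample data simulated here for illustration only).
import numpy as np
from scipy.optimize import approx_fprime

def loglik(theta, y):
    mu, sigma2 = theta
    return -len(y) / 2 * np.log(2 * np.pi * sigma2) - np.sum((y - mu) ** 2) / (2 * sigma2)

def score(theta, y):
    """Analytic partial derivatives of L with respect to mu_Y and sigma^2."""
    mu, sigma2 = theta
    d_mu = np.sum(y - mu) / sigma2
    d_sigma2 = -len(y) / (2 * sigma2) + np.sum((y - mu) ** 2) / (2 * sigma2 ** 2)
    return np.array([d_mu, d_sigma2])

y = np.random.default_rng(1).normal(1.0, 2.0, size=50)
theta = np.array([0.5, 3.0])                     # arbitrary evaluation point
print(score(theta, y))                           # analytic derivatives
print(approx_fprime(theta, loglik, 1e-6, y))     # numerical check, should match
```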
The maximum likelihood estimators are then the values of $\mu_Y$ and $\sigma^2$ that maximize the (log) likelihood function. Setting both derivatives to zero, and assuming that this yields a maximum rather than a minimum, gives
$$\mu_{Y,MLE}=\frac{1}{n}\sum Y_i=\overline Y$$
$$\sigma^2_{MLE}=\frac{1}{n}\sum(Y_i-\overline Y)^2$$
- the variance estimator differs from the OLS estimator
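For a numerical confirmation of these closed-form solutions, one can maximize the log-likelihood directly (by minimizing its negative) and compare the result with the sample mean and $\frac{1}{n}\sum(Y_i-\overline Y)^2$. The data, starting values, and bounds in the sketch below are illustrative assumptions.

```python
# Sketch: confirm the closed-form MLEs by maximizing the log-likelihood
# numerically (data simulated for illustration).
import numpy as np
from scipy.optimize import minimize

def negloglik(theta, y):
    mu, sigma2 = theta
    return len(y) / 2 * np.log(2 * np.pi * sigma2) + np.sum((y - mu) ** 2) / (2 * sigma2)

y = np.random.default_rng(2).normal(1.0, 2.0, size=200)
res = minimize(negloglik, x0=[0.0, 1.0], args=(y,),
               bounds=[(None, None), (1e-8, None)])     # keep sigma^2 positive

print(res.x)                                    # numerical MLEs (mu, sigma^2)
print(y.mean(), np.mean((y - y.mean()) ** 2))   # closed-form: Ybar and (1/n)*sum of squares
```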
The maximum likelihood estimator of the population mean is therefore the sample mean. Since the OLS estimator is identical and unbiased, the MLE is also unbiased. However, the MLE of the population variance differs from the OLS estimator, and since the OLS estimator is unbiased, the MLE must be biased. The difference between the two estimators vanishes as $n$ increases, however, so the MLE is consistent.
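A small Monte Carlo experiment makes both the bias and its disappearance visible: for a normal sample, $E[\sigma^2_{MLE}]=\frac{n-1}{n}\sigma^2$, so the bias shrinks as $n$ grows. The simulation settings below (true variance, sample sizes, number of replications) are illustrative choices.

```python
# Sketch: Monte Carlo illustration of the bias of the variance MLE and
# how it vanishes with the sample size n.
import numpy as np

rng = np.random.default_rng(3)
sigma2_true = 4.0
for n in (10, 100, 1000):
    draws = rng.normal(0.0, np.sqrt(sigma2_true), size=(20_000, n))
    sigma2_mle = ((draws - draws.mean(axis=1, keepdims=True)) ** 2).mean(axis=1)
    # average MLE across replications vs. the theoretical value (n-1)/n * sigma^2
    print(n, sigma2_mle.mean(), (n - 1) / n * sigma2_true)
```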
Microeconometrics course, University of Liverpool