Week 10

This week, we will discuss maximum likelihood estimators.
Published

April 3, 2023

Learning Outcomes

First Lecture

  • Maximum Likelihood Estimator

Second Lecture

  • MLE Properties

Important Concepts

Data

Let \(X_1,\ldots,X_n\overset{iid}{\sim}F(\boldsymbol \theta)\) where \(F(\cdot)\) is a known distribution function and \(\boldsymbol\theta\) is a vector of parameters. Let \(\boldsymbol X = (X_1,\ldots, X_n)^\mathrm{T}\) be the collected sample.

Maximum Likelihood Estimator

Likelihood Function

Using the joint pdf or pmf of the sample \(\boldsymbol X\), the likelihood function is a function of \(\boldsymbol \theta\), given the observed data \(\boldsymbol X =\boldsymbol x\), defined as

\[ L(\boldsymbol \theta|\boldsymbol x)=f(\boldsymbol x|\boldsymbol \theta) \]

If the data is iid, then

\[ f(\boldsymbol x|\boldsymbol \theta) = \prod^n_{i=1}f(x_i|\boldsymbol\theta) \]
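As a minimal sketch of this product form (not from the notes — the Exponential model and the sample values are made up for illustration), the likelihood of an iid Exponential(rate \(=\theta\)) sample, with density \(f(x|\theta)=\theta e^{-\theta x}\), is just the product of the individual densities:

```python
import numpy as np

# Hypothetical example: likelihood of an iid Exponential(rate = theta) sample,
# where f(x | theta) = theta * exp(-theta * x).

def likelihood(theta, x):
    """L(theta | x) = prod_i f(x_i | theta)."""
    return np.prod(theta * np.exp(-theta * x))

x = np.array([0.3, 1.1, 0.7, 0.2, 0.9])  # made-up observed sample
print(likelihood(1.0, x), likelihood(3.0, x))
```

Evaluating the likelihood at several candidate values of \(\theta\) this way is exactly what maximization over \(\theta\) formalizes.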

Estimator

The maximum likelihood estimator (MLE) is the value \(\hat{\boldsymbol\theta}\) of \(\boldsymbol\theta\) that maximizes \(L(\boldsymbol\theta|\boldsymbol x)\).

Log-Likelihood Approach

Because \(\ln(\cdot)\) is a strictly increasing function, maximizing \(\ln\{L(\boldsymbol \theta)\}\) yields the same maximizer as maximizing \(L(\boldsymbol \theta)\). The log-likelihood is usually easier to work with, since the product over observations becomes a sum.
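A minimal numerical sketch (the Exponential model and grid search are illustrative choices, not the course's prescribed method): for Exponential(rate \(=\theta\)) data, \(\ell(\theta)=n\ln\theta-\theta\sum_i x_i\), and the closed-form MLE is \(1/\bar x\). A crude grid search over the log-likelihood should land very close to it.

```python
import numpy as np

# Sketch: maximize the Exponential log-likelihood
# ell(theta) = n*log(theta) - theta*sum(x) over a grid,
# and compare against the closed-form MLE 1 / mean(x).

def log_likelihood(theta, x):
    return len(x) * np.log(theta) - theta * np.sum(x)

x = np.array([0.3, 1.1, 0.7, 0.2, 0.9])  # made-up sample
grid = np.linspace(0.01, 10, 100_000)
theta_hat = grid[np.argmax(log_likelihood(grid, x))]
print(theta_hat, 1 / x.mean())  # grid maximizer vs closed-form MLE
```

In practice a proper optimizer would replace the grid, but the grid makes the "maximize \(\ell(\theta)\)" step transparent.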

MLE Properties

Unbiasedness

Let \(X_1,\ldots,X_n\) be a random sample from a distribution with parameter \(\theta\). Let \(\hat \theta\) be an estimator for a parameter \(\theta\). Then \(\hat \theta\) is an unbiased estimator if \(E(\hat \theta) = \theta\). Otherwise, \(\hat\theta\) is considered biased.
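A small Monte Carlo sketch of bias (the normal-variance example is an illustration, not taken from the notes): the MLE of a normal variance divides by \(n\), and its expectation is \(\frac{n-1}{n}\sigma^2\) rather than \(\sigma^2\), so it is biased; dividing by \(n-1\) removes the bias.

```python
import numpy as np

# Sketch: E(sigma2_hat) for the MLE of a normal variance (divide by n)
# is (n - 1)/n * sigma^2, so the MLE is biased downward.

rng = np.random.default_rng(42)
n, sigma2, reps = 5, 1.0, 200_000
samples = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
mle_var = samples.var(axis=1, ddof=0)   # divide by n   -> biased
unbiased = samples.var(axis=1, ddof=1)  # divide by n-1 -> unbiased
print(mle_var.mean())   # close to (n-1)/n * sigma2 = 0.8
print(unbiased.mean())  # close to sigma2 = 1.0
```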

Consistency

Let \(X_1,\ldots,X_n\) be a random sample from a distribution with parameter \(\theta\). The estimator \(\hat \theta\) is a consistent estimator of \(\theta\) if

  1. \(E\{(\hat\theta-\theta)^2\}\rightarrow0\) as \(n\rightarrow \infty\)
  2. \(P(|\hat\theta-\theta|\ge \epsilon)\rightarrow0\) as \(n\rightarrow \infty\) for every \(\epsilon>0\)

Condition 1 (convergence in mean square) implies condition 2 (convergence in probability) by Chebyshev's inequality.
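Consistency can be seen in simulation. A minimal sketch (the sample mean of normal data is an illustrative estimator, not from the notes): the Monte Carlo estimate of \(E\{(\hat\theta-\theta)^2\}\) shrinks as \(n\) grows — here it equals \(\sigma^2/n\) exactly.

```python
import numpy as np

# Sketch: the mean squared error E{(theta_hat - theta)^2} of the sample
# mean shrinks toward 0 as n grows (it equals sigma^2 / n = 1/n here).

rng = np.random.default_rng(7)
theta, reps = 3.0, 50_000
for n in (10, 100, 1000):
    theta_hat = rng.normal(theta, 1.0, size=(reps, n)).mean(axis=1)
    print(n, np.mean((theta_hat - theta) ** 2))  # roughly 1/n
```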

Invariance Principle

If \(\hat \theta\) is an ML estimator of \(\theta\), then for any one-to-one function \(g\), the ML estimator for \(g(\theta)\) is \(g(\hat\theta)\).
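A short sketch of invariance (the Exponential model is again an illustrative assumption): in the Exponential(rate \(=\lambda\)) model the MLE of \(\lambda\) is \(1/\bar x\), so the MLE of the mean \(g(\lambda)=1/\lambda\) is \(g(\hat\lambda)=\bar x\) — no separate optimization is needed.

```python
import numpy as np

# Sketch: invariance in the Exponential(rate = lambda) model.
# MLE of lambda is 1/xbar; MLE of the mean g(lambda) = 1/lambda
# is obtained by plugging in: g(lambda_hat) = xbar.

x = np.array([0.3, 1.1, 0.7, 0.2, 0.9])  # made-up sample
lambda_hat = 1 / x.mean()   # MLE of the rate
mean_hat = 1 / lambda_hat   # MLE of the mean, by invariance
print(lambda_hat, mean_hat)
```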

Large Sample MLE Properties

Let \(X_1,\ldots,X_n\) be a random sample from a distribution with parameter \(\theta\), and let \(\hat \theta\) be the maximum likelihood estimator of \(\theta\). As \(n\rightarrow\infty\), the distribution of \(\hat \theta\) is approximately normal with mean \(\theta\) and variance \(1/\{nI(\theta)\}\), where

\[ I(\theta)=E\left[-\frac{\partial^2}{\partial\theta^2}\log\{f(X;\theta)\}\right] \]
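A Monte Carlo sketch of this result (the Bernoulli model is an illustrative choice, not from the notes): for Bernoulli(\(\theta\)), \(I(\theta)=1/\{\theta(1-\theta)\}\), so the large-sample variance of the MLE (the sample proportion) should be \(\theta(1-\theta)/n\).

```python
import numpy as np

# Sketch: for Bernoulli(theta), I(theta) = 1 / {theta * (1 - theta)},
# so the asymptotic variance of the MLE (the sample proportion) is
# 1 / {n * I(theta)} = theta * (1 - theta) / n.

rng = np.random.default_rng(0)
theta, n, reps = 0.3, 500, 100_000
theta_hat = rng.binomial(n, theta, size=reps) / n  # MLE = sample proportion
fisher = 1 / (theta * (1 - theta))
print(theta_hat.var())   # empirical variance of the MLE
print(1 / (n * fisher))  # theoretical 1 / {n * I(theta)} = 0.00042
```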

Resources

First Lecture

  • Slides: Slides
  • Videos: Video 001, Video 002
  • Notes: Notes 001, Notes 002

Second Lecture

  • Slides: Slides
  • Videos: Video 001, Video 002
  • Notes: Notes 001, Notes 002