Maximum likelihood estimation

Prof. Maria Tackett

Oct 10, 2024

Announcements

  • Office hours:

    • This week: Thursday - Friday

    • Next week: Wednesday - Friday

  • No class next Monday or Tuesday


🍁 Have a good Fall Break! 🍁

Topics

  • Likelihood

  • Maximum likelihood estimation

  • MLE for linear regression

  • Properties of maximum likelihood estimator

Motivation

  • We can find the estimators of $\beta$ and $\sigma_\epsilon^2$ for the model

    $$y = X\beta + \epsilon, \quad \epsilon \sim N(0, \sigma_\epsilon^2 I)$$

    using least-squares estimation

  • We have also shown some nice properties of the least-squares estimator $\hat{\beta}$, given $E(\epsilon) = 0$ and $Var(\epsilon) = \sigma_\epsilon^2 I$

  • Today we will introduce another way to find these estimators - maximum likelihood estimation. We will see…

    • the maximum likelihood estimators have nice properties

    • the least-squares estimator is equal to the maximum likelihood estimator when certain assumptions hold

Maximum likelihood estimation

Example: Shooting free throws

Suppose a basketball player shoots a single free throw, such that the probability of making a basket is p

  • What is the probability distribution for this random phenomenon?

  • Suppose the probability is $p = 0.5$. What is the probability the player makes a single shot, given this value of $p$?

  • Suppose the probability is $p = 0.8$. What is the probability the player makes a single shot, given this value of $p$?

Shooting three free throws

Suppose the player shoots three free throws. They are all independent and the player has the same probability p of making each shot.

Let B represent a made basket, and M represent a missed basket. The player shoots three free throws with the outcome BBM.

  • Suppose the probability is $p = 0.5$. What is the probability of observing the data BBM, given this value of $p$?

  • Suppose the probability is $p = 0.3$. What is the probability of observing the data BBM, given this value of $p$?

Shooting three free throws

Suppose the player shoots three free throws. They are all independent and the player has the same probability p of making each shot.

The player shoots three free throws with the outcome BBM.

  • How would you describe in words the probabilities we previously calculated?

  • New question: What parameter value of p do you think maximizes the probability of observing this data?

  • We will use a likelihood function to answer this question.

Likelihood

  • A likelihood is a function that tells us how likely we are to observe our data for a given parameter value (or values).

  • Note that this is not the same as the probability function.

  • Probability function: Fixed parameter value(s) + input possible outcomes ⇒ probability of seeing the different outcomes given the parameter value(s)

  • Likelihood function: Fixed data + input possible parameter values ⇒ probability of seeing the fixed data for each parameter value

Likelihood: shooting three free throws

The likelihood function for the probability of a basket $p$, given we observed BBM when shooting three independent free throws, is $L(p \mid BBM) = p \times p \times (1 - p)$


Thus, the likelihood for $p = 0.8$ is

$$L(p = 0.8 \mid BBM) = 0.8 \times 0.8 \times (1 - 0.8) = 0.128$$
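As a quick check, likelihood values like these can be computed directly. A minimal Python sketch (the function name is ours):

```python
# Likelihood of observing the sequence BBM (basket, basket, miss)
# for a given free-throw probability p: L(p | BBM) = p * p * (1 - p)
def likelihood_bbm(p):
    return p * p * (1 - p)

for p in [0.5, 0.8]:
    print(f"L(p = {p} | BBM) = {likelihood_bbm(p):.3f}")
```

Evaluating at $p = 0.8$ reproduces the $0.128$ above.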

Likelihood: shooting three free throws

  • What is the general formula for the likelihood function for p given the observed data BBM?

  • Why do we need to assume independence?

  • Why does having identically distributed data simplify things?

Likelihood: shooting three free throws

The likelihood function for p given the data BBM is

$$L(p \mid BBM) = p \times p \times (1 - p) = p^2(1 - p)$$

  • We want to find the value of $p$ that maximizes this likelihood function, i.e., the value of $p$ that is most likely given the observed data.

  • The process of finding this value is maximum likelihood estimation.

  • There are three primary ways to find the maximum likelihood estimator

    • Approximate using a graph

    • Using calculus

    • Numerical approximation
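The graphical and numerical approaches can both be mimicked with a grid search: evaluate $L(p) = p^2(1 - p)$ over a fine grid of candidate values and take the argmax. A sketch (the grid resolution is arbitrary):

```python
# Grid-search approximation of the MLE of p for the data BBM:
# evaluate L(p) = p^2 (1 - p) on a fine grid and take the argmax
grid = [i / 10000 for i in range(10001)]
likelihoods = [p**2 * (1 - p) for p in grid]
p_hat = grid[likelihoods.index(max(likelihoods))]
print(p_hat)  # close to 2/3
```

The grid maximizer is very close to $2/3$, the value the calculus approach gives for this likelihood.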

Finding the MLE using graphs

What do you think is the approximate value of the MLE of p given the data?

Finding the MLE using calculus

  • Find the MLE using the first derivative of the likelihood function.
  • This can be tricky because of the Product Rule, so we maximize the log-likelihood instead. The same value of $p$ maximizes both the likelihood and the log-likelihood.

Use calculus to find the MLE of p given the data BBM.

Shooting n free throws

Suppose the player shoots n free throws. They are all independent and the player has the same probability p of making each shot.

Suppose the player makes k baskets out of the n free throws. This is the observed data.

  • What is the formula for the probability distribution to describe this random phenomenon?
  • What is the formula for the likelihood function for p given the observed data?

  • For what value of p do we maximize the likelihood given the observed data? Use calculus to find the response.
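Setting the derivative of the binomial log-likelihood $k \log p + (n - k)\log(1 - p)$ to zero gives $\tilde{p} = k/n$. A numeric sketch checking this, with hypothetical data $k = 7$ makes out of $n = 10$ shots:

```python
import math

# Binomial log-likelihood for p (up to a constant that does not depend on p):
# k log(p) + (n - k) log(1 - p)
def log_lik(p, k, n):
    return k * math.log(p) + (n - k) * math.log(1 - p)

k, n = 7, 10                                  # hypothetical observed data
grid = [i / 10000 for i in range(1, 10000)]   # avoid p = 0 and p = 1
p_hat = max(grid, key=lambda p: log_lik(p, k, n))
print(p_hat)  # close to k / n = 0.7
```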

Why maximum likelihood estimation?

  • “Maximum likelihood estimation is, by far, the most popular technique for deriving estimators.” (Casella and Berger 2024, 315)

  • MLEs have nice statistical properties. They are

    • Consistent

    • Efficient - Have the smallest MSE among all consistent estimators

    • Asymptotically normal

Note

If the normality assumption holds, the least squares estimator is the maximum likelihood estimator for β. Therefore, it has all these properties of the MLE.

MLE in linear regression

Linear regression

Recall the linear model

$$y = X\beta + \epsilon, \quad \epsilon \sim N(0, \sigma_\epsilon^2 I)$$

  • We have discussed least-squares estimation to find $\hat{\beta}$ and $\hat{\sigma}_\epsilon^2$
  • We have discussed properties of $\hat{\beta}$ that depend on $E(\epsilon) = 0$ and $Var(\epsilon) = \sigma_\epsilon^2 I$
  • We have used the fact that $\hat{\beta} \sim N(\beta, \sigma_\epsilon^2 (X^TX)^{-1})$ when doing hypothesis testing and confidence intervals
  • Now we will discuss how we know $\hat{\beta}$ is normally distributed, as we introduce MLE for linear regression

Simple linear regression model

Suppose we have the simple linear regression (SLR) model

$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i, \quad \epsilon_i \sim N(0, \sigma_\epsilon^2)$$

such that the $\epsilon_i$ are independent and identically distributed.


We can write this model in the form below and use it to find the MLE

$$y_i \mid x_i \sim N(\beta_0 + \beta_1 x_i, \sigma_\epsilon^2)$$

Side note: Normal distribution

Let $X$ be a random variable such that $X \sim N(\mu, \sigma^2)$. Then the probability density function is

$$f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{1}{2\sigma^2}(x - \mu)^2\right\}$$
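A quick numeric sanity check of the normal density formula above (standard library only):

```python
import math

# Normal density f(x | mu, sigma^2), transcribed from the formula above
def normal_pdf(x, mu, sigma2):
    return (1 / math.sqrt(2 * math.pi * sigma2)) * math.exp(-(x - mu) ** 2 / (2 * sigma2))

print(normal_pdf(0, 0, 1))  # standard normal density at 0, about 0.3989
```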

Likelihood for SLR

The likelihood function for $\beta_0, \beta_1, \sigma_\epsilon^2$ is

$$
\begin{aligned}
L(\beta_0, \beta_1, \sigma_\epsilon^2 \mid x_1, \dots, x_n, y_1, \dots, y_n) &= \prod_{i=1}^n \frac{1}{\sqrt{2\pi\sigma_\epsilon^2}} \exp\left\{-\frac{1}{2\sigma_\epsilon^2}\left(y_i - [\beta_0 + \beta_1 x_i]\right)^2\right\} \\
&= (2\pi\sigma_\epsilon^2)^{-\frac{n}{2}} \exp\left\{-\frac{1}{2\sigma_\epsilon^2}\sum_{i=1}^n (y_i - \beta_0 - \beta_1 x_i)^2\right\}
\end{aligned}
$$

Log-likelihood for SLR

The log-likelihood function for $\beta_0, \beta_1, \sigma_\epsilon^2$ is

$$\log L(\beta_0, \beta_1, \sigma_\epsilon^2 \mid x_1, \dots, x_n, y_1, \dots, y_n) = -\frac{n}{2}\log(2\pi\sigma_\epsilon^2) - \frac{1}{2\sigma_\epsilon^2}\sum_{i=1}^n (y_i - \beta_0 - \beta_1 x_i)^2$$
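The log-likelihood can be evaluated directly on toy data to see that parameter values near the truth score higher. A sketch with hypothetical numbers (where $y \approx 2x$):

```python
import math

# SLR log-likelihood: -n/2 log(2 pi sigma2) - SSE / (2 sigma2)
def slr_log_lik(beta0, beta1, sigma2, x, y):
    n = len(x)
    sse = sum((yi - beta0 - beta1 * xi) ** 2 for xi, yi in zip(x, y))
    return -n / 2 * math.log(2 * math.pi * sigma2) - sse / (2 * sigma2)

x = [1, 2, 3, 4]            # hypothetical data with y roughly equal to 2x
y = [2.1, 3.9, 6.2, 8.1]
print(slr_log_lik(0, 2, 1, x, y))  # near-true slope: higher log-likelihood
print(slr_log_lik(0, 0, 1, x, y))  # slope of 0: much lower log-likelihood
```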


We will use the log-likelihood function to find the MLEs

MLE for β0

1️⃣ Take the derivative of $\log L$ with respect to $\beta_0$ and set it equal to 0

$$\frac{\partial \log L}{\partial \beta_0} = -\frac{2}{2\sigma_\epsilon^2}\sum_{i=1}^n (y_i - \beta_0 - \beta_1 x_i)(-1) = 0$$

MLE for β0

2️⃣ Find the $\tilde{\beta}_0$ that satisfies the equality on the previous slide

After a few steps…

$$
\begin{aligned}
&\Rightarrow \sum_{i=1}^n y_i - n\tilde{\beta}_0 - \tilde{\beta}_1\sum_{i=1}^n x_i = 0 \\
&\Rightarrow \sum_{i=1}^n y_i - \tilde{\beta}_1\sum_{i=1}^n x_i = n\tilde{\beta}_0 \\
&\Rightarrow \frac{1}{n}\sum_{i=1}^n y_i - \frac{1}{n}\tilde{\beta}_1\sum_{i=1}^n x_i = \tilde{\beta}_0
\end{aligned}
$$

MLE for β0

3️⃣ We can use the second derivative to show we’ve found the maximum

$$\frac{\partial^2 \log L}{\partial \beta_0^2} = -\frac{n}{\tilde{\sigma}_\epsilon^2} < 0$$


Therefore, we have found the maximum. Thus, the MLE for $\beta_0$ is

$$\tilde{\beta}_0 = \bar{y} - \tilde{\beta}_1\bar{x}$$

MLE for β1 and σϵ2

We can use a similar process to find the MLEs for β1 and σϵ2

$$\tilde{\beta}_1 = \frac{\sum_{i=1}^n y_i(x_i - \bar{x})}{\sum_{i=1}^n (x_i - \bar{x})^2}$$

$$\tilde{\sigma}_\epsilon^2 = \frac{\sum_{i=1}^n (y_i - \tilde{\beta}_0 - \tilde{\beta}_1 x_i)^2}{n} = \frac{\sum_{i=1}^n e_i^2}{n}$$
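On hypothetical data, the closed-form MLEs can be computed directly. Note the residuals sum to zero, exactly as the first-order condition in the $\beta_0$ derivation implies:

```python
# Closed-form SLR MLEs on hypothetical data (y roughly equal to 2x)
x = [1, 2, 3, 4, 5]
y = [2.0, 4.1, 5.9, 8.2, 9.8]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

# beta1_tilde = sum y_i (x_i - xbar) / sum (x_i - xbar)^2
beta1 = sum(yi * (xi - xbar) for xi, yi in zip(x, y)) / sum((xi - xbar) ** 2 for xi in x)
# beta0_tilde = ybar - beta1_tilde * xbar
beta0 = ybar - beta1 * xbar
resid = [yi - beta0 - beta1 * xi for xi, yi in zip(x, y)]
# MLE of the error variance divides by n (not n - 2, as the unbiased estimate does)
sigma2 = sum(e ** 2 for e in resid) / n

print(beta0, beta1, sigma2)
print(sum(resid))  # essentially zero, by the first-order condition for beta0
```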

Putting it all together

  • The MLEs $\tilde{\beta}_0$ and $\tilde{\beta}_1$ are equivalent to the least-squares estimators when the errors follow independent and identical normal distributions

  • This means the least-squares estimators $\hat{\beta}_0$ and $\hat{\beta}_1$ inherit all the nice properties of MLEs

    • Consistency
    • Efficiency - minimum variance among all consistent estimators
    • Asymptotically normal

Putting it all together

  • From previous work, we also know the estimators $\tilde{\beta}_0$ and $\tilde{\beta}_1$ are unbiased

  • Note that the MLE $\tilde{\sigma}_\epsilon^2$ is asymptotically unbiased

    • The estimate from least squares $\hat{\sigma}_\epsilon^2$ is unbiased

References

Casella, George, and Roger Berger. 2024. Statistical Inference. CRC Press.

🔗 STA 221 - Fall 2024
