M. Drew LaMar
February 20, 2019
“If all the statisticians in the world were laid head to toe, they wouldn't be able to reach a conclusion.”
- Anonymous
Definition: The Likelihood of model parameters \( \theta \) given the model (\( g \)) and data (\( x \)) is given by: \[ \mathcal{L}(\theta | x, g) \]
Note: Likelihood theory describes how to find the most likely parameters of a model that fit the data the best.
Suppose you observe 10 coin flips and you see the following result:
H H H H H H T T T H
Discuss: What’s the most likely value for the probability of heads on an individual coin flip?
Let \( p \) denote the probability of getting heads.
This follows what is known as a binomial model, with the probability of getting 7 heads out of 10 given by:
\[ \mathrm{Prob}(7 \ \textrm{heads}) = \left(\begin{array}{c}10 \\ 7\end{array}\right)p^{7}(1-p)^{3} \]
In this example \[ \mathcal{L}(\theta | x, g) = \left(\begin{array}{c}10 \\ 7\end{array}\right)p^{7}(1-p)^{3} \] we have
\[ \mathcal{L}(p | 10, 7; \mathrm{binomial}) = \left(\begin{array}{c}10 \\ 7\end{array}\right)p^{7}(1-p)^{3} \]
The most likely parameter given the model and data is where the likelihood function is maximized.
It's usually easier to deal with summation rather than products, so we look at the log-likelihood function instead:
\[ \log(\mathcal{L}(\theta | g, x)) \]
which in our case becomes
\[ \log\left(\begin{array}{c}10 \\ 7\end{array}\right) + 7\log p +3\log (1-p) \]
Log-likelihood function: