Negative log likelihood

Perhaps some graphs will do. The density of the normal distribution with unknown mean \( \mu \) and variance 1 is given by

\[ f(x; \mu, 1) = (1/\sqrt{2\pi}) \cdot \exp(-(1/2)(x-\mu)^2) \]

Here we have 3 data points, chosen independently, so the joint density of the three points is found by multiplying:

\[ l(\mu; x_1, x_2, x_3) = f(x_1; \mu, 1) \cdot f(x_2; \mu, 1) \cdot f(x_3; \mu, 1) \]
Notice how clever the statisticians are: on the left, \( \mu \) is the main variable (the x's are held fixed), while on the right the x's are the main variable of each density.

The negative log of \( l \) is
\[ L(\mu) = -\log(l) = -(\log(f(x_1; \mu, 1)) + \log(f(x_2; \mu,1)) + \log(f(x_3; \mu,1))) \]
Because \( f \) is a constant times an exponential, each log splits into two terms:

\[ \log(f(x_1; \mu, 1)) = - \log(\sqrt{2\pi}) - (1/2) \cdot (x_1 - \mu)^2 \]

Ditto for the other x's. Simplifying we get:

\[ L(\mu) = 3 \log(\sqrt{2\pi}) + \sum_{i=1}^{3} (1/2) \cdot (x_i - \mu)^2. \]
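As a quick sanity check on the simplification, here is a minimal sketch comparing the direct definition of \( L \) with the simplified form at a few values of \( \mu \). The sample `x`, the grid `mus`, and the names `L_direct`, `L_simplified` and `gap` are assumptions made for illustration only. Note that negating the logs makes the constant enter with a plus sign.

```r
# Density of the normal distribution with mean mu and variance 1, as in the text
f <- function(x, mu) (1 / sqrt(2 * pi)) * exp(-(1 / 2) * (x - mu)^2)

# Hypothetical sample, chosen only for illustration
x <- c(0.3, -1.2, 2.0)

# Direct definition: minus the log of the product of the densities
L_direct <- function(mu) -log(prod(f(x, mu)))

# Simplified form: constant term plus half the sum of squared deviations
L_simplified <- function(mu) {
  length(x) * log(sqrt(2 * pi)) + sum((1 / 2) * (x - mu)^2)
}

# Compare the two on a small grid of mu values
mus <- seq(-2, 2, by = 0.5)
gap <- max(abs(sapply(mus, L_direct) - sapply(mus, L_simplified)))
gap  # should be at the level of floating-point rounding
```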

Here is where we use calculus to find the minimum, which we did in the notes. Let's look graphically instead. Suppose the x's are +1, 0 and -1. Then the minimum of \( L \) (equivalently, the maximum of \( l \)) should occur at the average of the three x's, which is 0. Here are some graphs to see.
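For reference, the calculus step can be sketched in a couple of lines. The constant term does not depend on \( \mu \), so only the sum of squares contributes to the derivative (the symbol \( \hat{\mu} \) for the minimizer is introduced here for convenience):

\[ \frac{dL}{d\mu} = -\sum_{i=1}^{3} (x_i - \mu) = 3\mu - \sum_{i=1}^{3} x_i \]

Setting this to zero gives

\[ \hat{\mu} = \frac{x_1 + x_2 + x_3}{3}, \]

and since the second derivative is \( 3 > 0 \), this is a minimum. With the x's equal to +1, 0 and -1, this gives \( \hat{\mu} = 0 \).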

First, let's look at \( l \) (before the negative log). Here we should see a maximum at 0.

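# Density of the normal distribution with mean mu and variance 1
f <- function(x, mu) (1/sqrt(2 * pi)) * exp(-(1/2) * (x - mu)^2)
# The three data points
x1 <- -1
x2 <- 0
x3 <- +1
# Likelihood: product of the three densities, viewed as a function of mu
l <- function(mu) f(x1, mu) * f(x2, mu) * f(x3, mu)
curve(l, -1.5, 1.5)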


Okay, the peak is at 0. The values on the y axis are really small, hence a reason to take the logs. (Actually, there are better ones.) For the negative log, 0 should be a minimum.

L <- function(mu) -log(l(mu))
curve(L, -1.5, 1.5)
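The graph can be backed up numerically. Here is a minimal sketch that uses R's built-in `optimize` to search the plotted interval for the minimizer of `L` and compares it with the sample mean. The setup is repeated so the chunk runs on its own, and the name `fit` is an assumption for illustration.

```r
# Same setup as in the text
f <- function(x, mu) (1 / sqrt(2 * pi)) * exp(-(1 / 2) * (x - mu)^2)
x1 <- -1
x2 <- 0
x3 <- +1
l <- function(mu) f(x1, mu) * f(x2, mu) * f(x3, mu)
L <- function(mu) -log(l(mu))

# One-dimensional search for the minimizer of L over the plotted interval
fit <- optimize(L, interval = c(-1.5, 1.5))

fit$minimum          # location of the minimum, numerically close to 0
fit$objective        # value of L at that point
mean(c(x1, x2, x3))  # sample mean, exactly 0 here
```

The search is only accurate up to the tolerance of `optimize`, so expect a value very near 0 rather than exactly 0.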

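One commonly cited reason for working with logs, possibly among the "better ones" hinted at above, is numerical: a product of many small densities underflows to zero in floating point, while the sum of their logs stays finite. The sketch below illustrates this with a simulated sample of 2000 points, which is an assumption for illustration and not data from the text. It uses R's built-in `dnorm` in place of the hand-written `f`.

```r
# Hypothetical larger sample, simulated only for illustration
set.seed(1)
x <- rnorm(2000, mean = 0, sd = 1)

# Product of 2000 densities: every factor is below 0.4,
# so the product is far smaller than any positive double
lik <- prod(dnorm(x, mean = 0, sd = 1))

# Negative sum of log densities: the same information on a workable scale
nll <- -sum(dnorm(x, mean = 0, sd = 1, log = TRUE))

lik  # underflows to 0
nll  # finite
```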