Negative log likelihood

Perhaps some graphs will help. The normal density with unknown mean $ \mu $ and variance 1 is given by

\[ f(x; \mu, 1) = \frac{1}{\sqrt{2\pi}} \cdot \exp\left(-\frac{1}{2}(x-\mu)^2\right) \]

Here we have 3 data points, chosen independently, so the joint density of the three points is found by multiplying:

\[ l(\mu; x_1, x_2, x_3) = f(x_1; \mu, 1) \cdot f(x_2; \mu, 1) \cdot f(x_3; \mu, 1) \]
Notice how clever the statisticians are, using $ \mu $ as the main variable on the left and the x's as the main variables on the right.

The negative log of $l$ is
\[ L(\mu) = -\log(l) = -(\log(f(x_1; \mu, 1)) + \log(f(x_2; \mu,1)) + \log(f(x_3; \mu,1))) \]
Due to the special nature of $ f(x) $, the logs become:

\[ \log(f(x_1; \mu, 1)) = - \log(\sqrt{2\pi}) - \frac{1}{2} (x_1 - \mu)^2 \]

Ditto for the other x's. Simplifying we get:

\[ L(\mu) = 3 \log(\sqrt{2\pi}) + \sum_{i=1}^{3} \frac{1}{2} (x_i - \mu)^2. \]
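
A quick numerical check of this simplification may help. The sketch below compares the negative log of the product of the three densities with the simplified form, in which the constant $ 3 \log(\sqrt{2\pi}) $ enters with a plus sign because of the leading minus in $ -\log(l) $. The data values and the trial value of `mu` are made up for the check and are not from the notes.

```r
# Hypothetical data and trial mean, chosen only for this check
x  <- c(0.3, -1.2, 2.5)
mu <- 0.7

# Density of a normal with mean mu and variance 1
f <- function(x, mu) (1 / sqrt(2 * pi)) * exp(-(1 / 2) * (x - mu)^2)

# Negative log of the product of the three densities
direct <- -log(prod(f(x, mu)))

# Simplified form: constant term plus the sum of half squared deviations
simplified <- length(x) * log(sqrt(2 * pi)) + sum((1 / 2) * (x - mu)^2)

# Compare the two values up to floating point tolerance
all.equal(direct, simplified)
```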

Here is where we use calculus to find the minimum. That we did in the notes. Let's look graphically instead. Suppose the x's are +1, 0 and -1. Then the minimum of $L$ should occur at the average of the three x's, which is 0. Here are some graphs to see.
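
For reference, the calculus step is short. As a sketch, using only the fact that the constant term in $L$ does not depend on $ \mu $, set the derivative to zero:

\[ \frac{dL}{d\mu} = -\sum_{i=1}^{3} (x_i - \mu) = 0 \quad \Longrightarrow \quad \mu = \frac{x_1 + x_2 + x_3}{3} \]

The second derivative is $ 3 > 0 $, so this critical point is a minimum. For x's of +1, 0 and -1 it gives $ \mu = 0 $.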

First, let's look at l (before the negative log). Here we should see a maximum at 0.

f <- function(x, mu) (1/sqrt(2 * pi)) * exp(-(1/2) * (x - mu)^2)
x1 <- -1
x2 <- 0
x3 <- +1
l <- function(mu) f(x1, mu) * f(x2, mu) * f(x3, mu)
curve(l, -1.5, 1.5)


Okay, the peak is at 0. The values on the y axis are really small, which is one reason to take logs. (Actually, there are better ones.) After the negative log, 0 should be a minimum.

L <- function(mu) -log(l(mu))
curve(L, -1.5, 1.5)

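
The graph can also be backed up numerically. The sketch below is one possible check, not part of the original notes: it rebuilds the negative log likelihood with base R's `dnorm(..., log = TRUE)` and minimizes it over the plotted range with `optimize`. The name `negloglik` is made up for this example.

```r
# Data from the example: the three x's
x <- c(-1, 0, 1)

# Negative log likelihood for a normal with unknown mean and variance 1,
# summing log densities rather than multiplying densities
negloglik <- function(mu) -sum(dnorm(x, mean = mu, sd = 1, log = TRUE))

# One-dimensional numerical minimization over the plotted range
fit <- optimize(negloglik, interval = c(-1.5, 1.5))

fit$minimum    # location of the minimum, close to mean(x)
fit$objective  # value at the minimum, close to 3 * log(sqrt(2 * pi)) + 1
```

Summing log densities rather than taking the log of a product also avoids the very small numbers seen on the y axis of the first graph.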