🐸 Frogs \(\rightarrow\) Log Regression

A flow.

This walk through builds intuition for:

glm(y ~ x, family = binomial)

Using the flow:

Linear predictor → plogis → likelihood (dbinom) → log-likelihood (sum(log)) → maximize (MLE)

With a binary predictor & response.


The Story

You surveyed 20 ponds for a frog and you detected frogs in 12 ponds.

You hypothesize that pond size affects frog presence. And so you categorize ponds:

  • x = 0 \(\rightarrow\) small pond

  • x = 1 \(\rightarrow\) large pond

Out of all of the ponds you visited, you labeled 10 small and 10 large.

y <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0) #frog presence or absence

x <- c(1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1) #ponds large (1) or small (0)

1 Linear Predictor

\(\eta = b_0 + b_1x\)

\(\eta\) = eta

This step:

  • uses only predictors

  • produces log odds

  • is not yet a model

  • you choose the coefficients

b0 <- -0.5 #here you are choosing these and it feels arbitrary but THIS IS WHAT THE GRIDSEARCH IS DOING. So, you can eventually use a sequence to find the best slope and intercept. 

b1 <- 1.0

eta <- b0 + b1*x

plot(eta, main="Log odds, b0: -0.5, b1: 1.0", col="#c77cff", ylab = expression(eta), xlab = "Index")

2 Plogis() \(\rightarrow\) Probabilities

Now convert those log-odds into probabilities:

allprobs <- plogis(eta)
  • small ponds get the probability: plogis(b0)

  • large ponds get the probability: plogis(b0 + b1)

plot(allprobs, main="Log probabilities converted", ylab="Probabilities", xlab = "Index", col="#00bfc4")

3 Dbinom() \(\rightarrow\) Likelihood

Likelihood is:

  • a method to measure how well the model fits the data

  • bigger #’s mean better fit

  • depends on the following:

    ** the model

    ** the parameters

    ** the data

Given the predicted probabilities in step 2, how likely is each observation?

all_likelihoods <- dbinom(y, size=1, prob = allprobs)

plot(all_likelihoods, main="Likelihoods for each observation of y given x", ylab = "Likelihood", xlab = "Index", col = "#f8766d")

4 Sum(log) \(\rightarrow\) Log-Likelihood

How well do these coefficients explain the frog data?

sum_log <- sum(dbinom(y, 1, allprobs, log=T))

By itself sum_log is just a value you can use to compare. Use it against different chosen values of b0 and b1 to find the least negative value!

The least negative gives you the best fit. This is when gridsearch/heatmaps come in handy. They point you in the less arbitrary direction quick.

5 Maximize/Gridsearching

To estimate coefficients, we would:

  • try many (b0, b1) combinations

  • compute the log-likelihood each time

  • keep the pair with the largest value

This is maximum likelihood estimation.

(Grid search does this manually; glm() software does it automatically.)

# I used CHAT to create this function. I did not make it from scratch. 
maximize_loglik_frogs <- function(y, x, b0_seq, b1_seq) {

  grid <- expand.grid(b0 = b0_seq, b1 = b1_seq)
  grid$loglik <- NA

  for (i in 1:nrow(grid)) {
    b0 <- grid$b0[i]
    b1 <- grid$b1[i]
    p <- plogis(b0 + b1 * x)
    grid$loglik[i] <- sum(dbinom(y, 1, p, log = TRUE))
  }

  best <- grid[which.max(grid$loglik), ]
  list(best = best, grid = grid)
}

out <- maximize_loglik_frogs(y, x, seq(-6,6,0.1), seq(-6,6,0.1))
out$best
##       b0   b1    loglik
## 6845 0.8 -0.4 -12.84116
out$grid$rel_loglik <- out$grid$loglik - max(out$grid$loglik)

ggplot(out$grid, aes(b0, b1, fill = rel_loglik)) +
  geom_raster() +
  geom_point(data = out$best, aes(b0, b1),
             inherit.aes = FALSE, size = 3) +
  theme_minimal() +
  labs(title = "Relative log-likelihood (0 = best)",
       x = "Intercept (b0)", y = "Slope (b1)", fill = "Rel log-lik")+
  scale_fill_viridis_c()

6 🐛 Metamorphosis \(\rightarrow\) Glm()

Steps 1-5 can be performed using the function glm() in r. So, we basically just did each step to create it! Easy peasy.

glm_frogs <- glm(y~x, family = binomial)

glm_frogs
## 
## Call:  glm(formula = y ~ x, family = binomial)
## 
## Coefficients:
## (Intercept)            x  
##      0.8473      -0.4418  
## 
## Degrees of Freedom: 19 Total (i.e. Null);  18 Residual
## Null Deviance:       25.9 
## Residual Deviance: 25.68     AIC: 29.68