Heike Hofmann
Stat 579, Fall 2013
No homework this week - but make sure to look at the practice exams!
Please submit Homework #4 online to the Blackboard system.
Grading Scheme C/S/O - each area gives 3.5 points, you could potentially over-score
Best answers for homework #3 are now online
mymean <- function(x) {
return(sum(x)/length(x))
}
mymean(1:15)
[1] 8
mymean(c(1:15, NA))
[1] NA
We would like to be able to specify na.rm=TRUE to be able to ignore missing values
mymean <- function(x, na.rm) {
## if na.rm is true, we would like to ignore missing values and
## compute average then
return(sum(x)/length(x))
}
mymean(1:15)
[1] 8
mymean(c(1:15, NA))
[1] NA
if (condition) {
statement
} [else { ## else is optional
statement
}]
condition is a length one logical value, i.e. either TRUE or FALSEmymean <- function(x, na.rm) {
## if na.rm is true, we would like to ignore missing values and
## compute average then
if (na.rm == TRUE) x <- na.omit(x)
return(sum(x)/length(x))
}
mymean(c(1:15, NA), na.rm=TRUE)
[1] 8
But a regular call without missing values doesn't work anymore. This is what we get for mymean(1:15)
Error in na.rm == TRUE : 'na.rm' is missing
We need to either specify na.rm for each call or use a default value
mymean <- function(x, na.rm=TRUE) {
## if na.rm is true, we would like to ignore missing values and
## compute average then
if (na.rm == TRUE) x <- na.omit(x)
return(sum(x)/length(x))
}
mymean(1:15)
[1] 8
mymean(c(1:15, NA))
[1] 8
mymean(c(1:15, NA), na.rm=T)
[1] 8
mymeanx, na.rm=T
if(na.rm) x <- na.omit(x)return(sum(x)/length(x))Similar to what we did with mean, write a function mysd that computes the standard deviation of variable x from scratch'. Include a parameter na.rm in it.
\( L(\lambda; x) = - n\lambda + \log(\lambda) \sum_i x_i \)
is the log likelihood function of a Poisson variable \( X \) with parameter \( \lambda > 0 \).
Write a function loglikpois with parameters lambda and (vector) x, that computes the log likelihood value for \( \lambda \) given observations \( x \). Make sure to check that lambda is positive, return an error message (stop) if not.
Plot the likelihood values for lambda between 0 and 10 for
x = c(1, 3, 2, 3, 0, 1, 0, 1, 3, 3)
mysd <- function(x, na.rm=TRUE) {
## if na.rm is true, we would like to ignore missing values and
## compute standard deviation then
if (na.rm == TRUE) x <- na.omit(x)
if (length(x) == 1) return(0)
tss <- sum((x - mean(x))^2)
return(sqrt(tss/(length(x)-1)))
}
x <- rnorm(10)
mysd(x)
[1] 1.177
## compare with
sd(x)
[1] 1.177
mysd(c(x, NA), na.rm=T)
[1] 1.177
## compare with
sd(x,na.rm=T)
[1] 1.177
loglikpois <- function(x, lambda) {
stopifnot(lambda>0, all(x>=0))
n <- length(x)
return(- n*lambda + log(lambda)*sum(x))
}
x <- c(1, 3, 2, 3, 0, 1, 0, 1, 3, 3)
lambda <- seq(0.01, 10, by=0.01)
library(ggplot2)
qplot(lambda, loglikpois(x, lambda))
Debugging functions is an art – very much like finding errors in a proof. What you can do yourself:
browser()Want to run the same block of code multiple times:
players <- unique(baseball$id)
for (i in players) {
player <- subset(baseball, id == i)
avg <- with(player, mean(h/ab, na.rm=T))
print(avg)
}
Loop or iteration
Re-do the first part of the homework without ddply; i.e. use for-loops to gather all the player information.