These exercises accompany the Functions tutorial.
1. Write a function that takes two arguments: x, which is the numeric value to check, and threshold, which is the numeric threshold. Have the function return a logical value: TRUE if the value is above the threshold and FALSE if it is equal to or below the threshold.
2. Use your function to check each of the 10 values generated by rnorm(n = 10, mean = 35, sd = 10) to see if it's above a threshold of 35.
3. Use the sapply() function to find out how many days had an 8-hour ozone value above 0.075 ppm in the chicago_air dataset.
4. Use the mapply() function to find out how many days had an 8-hour ozone value above 0.075, how many days had a temperature above 90, and how many days had a solar radiation value above 1.25. (Hint: You will have to write a new threshold function to use in the mapply() function.)

Functions should have descriptive names that describe what the function does. Long names are better than shorter, less descriptive names.
checkAboveThreshold <- function(x, standard){
  x > standard
}
checkAboveThreshold(x = 10, standard = 5)
## [1] TRUE
To make the answer reproducible, we set the seed so that the random numbers being generated can be recreated.
set.seed(22)
y <- rnorm(n = 10, mean = 35, sd = 10)
y
## [1] 29.87861 59.85184 45.07826 37.92815 32.91041 53.58092 34.33974
## [8] 33.37235 33.00139 38.00562
# Preallocate the result vector, then check each value in turn.
above_threshold <- logical(length(y))
for(i in seq_along(y)){
  above_threshold[i] <- checkAboveThreshold(x = y[i], standard = 35)
}
data.frame(y, above_threshold)
## y above_threshold
## 1 29.87861 FALSE
## 2 59.85184 TRUE
## 3 45.07826 TRUE
## 4 37.92815 TRUE
## 5 32.91041 FALSE
## 6 53.58092 TRUE
## 7 34.33974 FALSE
## 8 33.37235 FALSE
## 9 33.00139 FALSE
## 10 38.00562 TRUE
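Because the > operator in R is vectorized, the loop is not strictly necessary here. The following is a minimal sketch rather than part of the original solution: the function and data are redefined so the block runs on its own, and above_loop and above_vectorized are illustrative names.

```r
checkAboveThreshold <- function(x, standard) {
  x > standard
}

set.seed(22)
y <- rnorm(n = 10, mean = 35, sd = 10)

# Element-by-element check with a loop, as in the solution above.
above_loop <- logical(length(y))
for (i in seq_along(y)) {
  above_loop[i] <- checkAboveThreshold(x = y[i], standard = 35)
}

# The comparison operator works on whole vectors, so one call is enough.
above_vectorized <- checkAboveThreshold(x = y, standard = 35)

# Both approaches should give the same logical vector.
identical(above_loop, above_vectorized)
```

The loop is still a useful exercise, but the single call is shorter and avoids managing an index by hand.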
library(region5air)
data(chicago_air)
head(chicago_air)
## date ozone temp solar month weekday
## 1 2013-01-01 0.032 17 0.65 1 3
## 2 2013-01-02 0.020 15 0.61 1 4
## 3 2013-01-03 0.021 28 0.17 1 5
## 4 2013-01-04 0.028 18 0.62 1 6
## 5 2013-01-05 0.025 26 0.48 1 7
## 6 2013-01-06 0.026 36 0.47 1 1
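Before we can use our function, we need to modify it to handle NA values. Comparing NA with a number returns NA rather than TRUE or FALSE, which would make the final count NA as well.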
checkAboveThreshold <- function(x, standard){
  if(is.na(x)){
    FALSE
  }else{
    x > standard
  }
}
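The if statement in this version checks one value at a time, which is why the function is applied element by element with sapply() in the next step. As an alternative sketch, not part of the original solution, a variant that accepts a whole vector and still treats NA as FALSE could look like the following. The names checkAboveThresholdVec and ozone_example are hypothetical, and the values are made up for illustration.

```r
checkAboveThresholdVec <- function(x, standard) {
  # NA values count as "not above"; all other values are compared directly.
  !is.na(x) & x > standard
}

# Made-up values, including one NA and one value equal to the threshold.
ozone_example <- c(0.080, NA, 0.050, 0.075)
result_example <- checkAboveThresholdVec(ozone_example, standard = 0.075)
result_example
```

Only the first value is flagged: the NA and the value equal to the threshold both come back FALSE.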
Now we can use our threshold function and take advantage of the fact that, in R, TRUE is equivalent to a numeric value of 1 and FALSE is equivalent to 0.
violation <- sapply(chicago_air$ozone, checkAboveThreshold, standard = 0.075)
head(violation)
## [1] FALSE FALSE FALSE FALSE FALSE FALSE
total_violations <- sum(violation)
total_violations
## [1] 2
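To see why sum() counts the violations, it can help to look at the coercion on a small vector. This is only an illustration: flags is a hypothetical vector with made-up values, not data from chicago_air.

```r
flags <- c(TRUE, FALSE, TRUE, TRUE)

# Logical values are coerced to 1 and 0 when treated as numbers.
as.numeric(flags)

# Summing therefore counts the TRUE values.
sum(flags)

# The mean gives the proportion of TRUE values.
mean(flags)
```

The same idea means mean(violation) would give the share of days above the threshold.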
First, we make our own table with a parameter column and the corresponding threshold.
my_thresholds <- data.frame(parameter = c("ozone", "temp", "solar"),
                            threshold = c(0.075, 90, 1.25),
                            stringsAsFactors = FALSE)
my_thresholds
## parameter threshold
## 1 ozone 0.075
## 2 temp 90.000
## 3 solar 1.250
Next, we write a function that takes the vector of values for one parameter and that parameter's threshold, and counts the days above the threshold.
sumAboveThreshold <- function(values, threshold){
  sum(sapply(values, checkAboveThreshold, standard = threshold))
}
Now we pass our new function to mapply() and use the my_thresholds data frame to supply the arguments.
mapply(FUN = sumAboveThreshold,
       chicago_air[, my_thresholds$parameter],
       my_thresholds$threshold)
## ozone temp solar
## 2 2 67
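mapply() walks through its arguments in parallel: the first column is paired with the first threshold, the second column with the second threshold, and so on. The sketch below shows that pairing on a small made-up data frame, so it runs without the region5air package. The names toy_air, toy_thresholds, countAbove, and toy_counts are hypothetical, the values are invented, and countAbove uses a vectorized comparison with na.rm = TRUE instead of the sapply() approach above.

```r
# Made-up data with the same column names as chicago_air.
toy_air <- data.frame(ozone = c(0.080, 0.060, NA, 0.090),
                      temp  = c(95, 70, 88, 91),
                      solar = c(1.30, 0.90, 1.10, NA))

toy_thresholds <- data.frame(parameter = c("ozone", "temp", "solar"),
                             threshold = c(0.075, 90, 1.25),
                             stringsAsFactors = FALSE)

# Count the values above the threshold, ignoring NA.
countAbove <- function(values, threshold) {
  sum(values > threshold, na.rm = TRUE)
}

# Each column of toy_air is paired with the threshold in the same position.
toy_counts <- mapply(FUN = countAbove,
                     toy_air[, toy_thresholds$parameter],
                     toy_thresholds$threshold)
toy_counts
```

Selecting the columns by my_thresholds$parameter, as the solution does, keeps the column order and the threshold order aligned, which is what makes the positional pairing safe.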