Exercises

These exercises accompany the Functions tutorial.

  1. Write a function that checks if a value is above a threshold value. The function should have two parameters: x which is the numeric value to check, and threshold which is the numeric threshold. Have the function return a logical value, TRUE if the value is above the threshold and FALSE if it is equal to or below the threshold.

  2. Write a for loop that uses the function from the first exercise and checks each value in the random vector rnorm(n = 10, mean = 35, sd = 10) to see if it’s above a threshold of 35.

  3. Use the function from the first exercise and the sapply() function to find out how many days had an 8-hour ozone value above 0.075 ppm in the chicago_air dataset.

  4. Use the mapply() function to find out how many days had an 8-hour ozone value above 0.075, how many days had a temperature above 90, and how many days had a solar radiation value above 1.25. (Hint: You will have to write a new threshold function to use in the mapply() function.)

Solutions

Solution 1

Give functions descriptive names that say what they do. A long, descriptive name is better than a short, cryptic one. Note that this solution names the second parameter standard rather than threshold.

checkAboveThreshold <- function(x, standard){
  x > standard
}

checkAboveThreshold(x = 10, standard = 5)
## [1] TRUE
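
Because the > operator is vectorized in R, this version of the function also accepts a whole vector at once. A minimal sketch, using made-up values (example_values is a hypothetical name, not part of the tutorial):

```r
checkAboveThreshold <- function(x, standard){
  x > standard
}

# Hypothetical values chosen for illustration only
example_values <- c(2, 10, 5)

# One comparison per element; 5 is not above 5, so the last result is FALSE
vector_result <- checkAboveThreshold(x = example_values, standard = 5)
vector_result
## [1] FALSE  TRUE FALSE
```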

Solution 2

To make the answer reproducible, we set the seed so that the random numbers being generated can be recreated.

set.seed(22)
y <- rnorm(n = 10, mean = 35, sd = 10)
y
##  [1] 29.87861 59.85184 45.07826 37.92815 32.91041 53.58092 34.33974
##  [8] 33.37235 33.00139 38.00562
above_threshold <- logical(length(y))
for(i in seq_along(y)){
  above_threshold[i] <- checkAboveThreshold(x = y[i], standard = 35)
}
data.frame(y, above_threshold)
##           y above_threshold
## 1  29.87861           FALSE
## 2  59.85184            TRUE
## 3  45.07826            TRUE
## 4  37.92815            TRUE
## 5  32.91041           FALSE
## 6  53.58092            TRUE
## 7  34.33974           FALSE
## 8  33.37235           FALSE
## 9  33.00139           FALSE
## 10 38.00562            TRUE
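
The exercise asks for a for loop, but as an aside, the same check can be sketched with vapply(), which applies the function to each element and verifies that every result is a single logical value. The names loop_result and vapply_result are introduced here only for the comparison:

```r
checkAboveThreshold <- function(x, standard){
  x > standard
}

set.seed(22)
y <- rnorm(n = 10, mean = 35, sd = 10)

# Loop version, as in the solution
loop_result <- logical(length(y))
for(i in seq_along(y)){
  loop_result[i] <- checkAboveThreshold(x = y[i], standard = 35)
}

# vapply() version; FUN.VALUE declares the expected type of each result
vapply_result <- vapply(y, checkAboveThreshold, FUN.VALUE = logical(1),
                        standard = 35)

# Both approaches give the same logical vector
identical(loop_result, vapply_result)
## [1] TRUE
```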

Solution 3

library(region5air)
data(chicago_air)
head(chicago_air)
##         date ozone temp solar month weekday
## 1 2013-01-01 0.032   17  0.65     1       3
## 2 2013-01-02 0.020   15  0.61     1       4
## 3 2013-01-03 0.021   28  0.17     1       5
## 4 2013-01-04 0.028   18  0.62     1       6
## 5 2013-01-05 0.025   26  0.48     1       7
## 6 2013-01-06 0.026   36  0.47     1       1

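Before we can use our function, we need to modify it to handle missing values (NAs). Comparing NA to a threshold returns NA rather than TRUE or FALSE, and a single NA would make the later sum NA as well.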

checkAboveThreshold <- function(x, standard){
  if(is.na(x)){
    FALSE
  }else{
    x > standard
  }
}
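
To see why the NA check matters, the sketch below compares the function with and without it on a short, hypothetical vector (example_ozone and checkWithoutNA are illustrative names, not part of the tutorial or the dataset):

```r
# Original version, without the NA check
checkWithoutNA <- function(x, standard){
  x > standard
}

# Version from the solution, which treats NA as "not above"
checkAboveThreshold <- function(x, standard){
  if(is.na(x)){
    FALSE
  }else{
    x > standard
  }
}

# Hypothetical ozone values, one of them missing
example_ozone <- c(0.080, NA, 0.030)

sum_without_check <- sum(sapply(example_ozone, checkWithoutNA, standard = 0.075))
sum_with_check <- sum(sapply(example_ozone, checkAboveThreshold, standard = 0.075))

sum_without_check
## [1] NA
sum_with_check
## [1] 1
```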

Now we can apply our threshold function to every ozone value. Because R treats TRUE as 1 and FALSE as 0 in arithmetic, summing the logical results counts the days above the threshold.

violation <- sapply(chicago_air$ozone, checkAboveThreshold, standard = 0.075)
head(violation)
## [1] FALSE FALSE FALSE FALSE FALSE FALSE
total_violations <- sum(violation)
total_violations
## [1] 2
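
As a side note, the same kind of count can be sketched without sapply(), because > works on whole vectors and sum() can skip missing values with na.rm = TRUE. The vector below is hypothetical stand-in data, not the chicago_air values:

```r
# Hypothetical ozone values (ppm), one of them missing
example_ozone <- c(0.080, NA, 0.030, 0.076)

# The comparison yields TRUE, NA, FALSE, TRUE
above <- example_ozone > 0.075

# TRUE counts as 1 and FALSE as 0; na.rm = TRUE drops the NA
count_above <- sum(above, na.rm = TRUE)
count_above
## [1] 2
```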

Solution 4

First, we make our own table with a parameter column and the corresponding threshold.

my_thresholds <- data.frame(parameter = c("ozone", "temp", "solar"),
                            threshold = c(0.075, 90, 1.25),
                            stringsAsFactors = FALSE)
my_thresholds
##   parameter threshold
## 1     ozone     0.075
## 2      temp    90.000
## 3     solar     1.250

Next we write a function that takes a vector of values and a threshold, and counts how many of the values are above that threshold.

sumAboveThreshold <- function(values, threshold){
  sum(sapply(values, checkAboveThreshold, standard = threshold))
}

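Now we pass our new function to mapply() and use the my_thresholds data frame to supply the arguments: the chicago_air columns named in the parameter column, each paired with the matching value in the threshold column.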

mapply(FUN = sumAboveThreshold, chicago_air[, my_thresholds$parameter],
       my_thresholds$threshold)
## ozone  temp solar 
##     2     2    67
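
The pairing that mapply() performs may be easier to see on a small example. In this sketch, toy_data, toy_thresholds, and countAbove are hypothetical names with made-up values. mapply() calls the function once per column, matching the first column with the first threshold and the second column with the second threshold:

```r
# Hypothetical data frame with two measurement columns
toy_data <- data.frame(a = c(1, 5, 9), b = c(10, 20, 30))
toy_thresholds <- c(4, 15)

# Counts the values above a single threshold (no NA handling in this sketch)
countAbove <- function(values, threshold){
  sum(values > threshold)
}

# First call:  countAbove(toy_data$a, 4)
# Second call: countAbove(toy_data$b, 15)
toy_counts <- mapply(FUN = countAbove, toy_data, toy_thresholds)
toy_counts
## a b
## 2 2
```

The result takes its names from the columns of the first argument, which is why the output in the solution is labeled ozone, temp, and solar.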