Info

Material Covered

Lectures 1–3:

  • Arithmetic and logical operators in R
  • Generating vectors and matrices (including incremental vectors and replicated vectors)
  • Built-in base functions (sum(), mean(), sd(), etc.)
  • Type conversion (numeric, logical, character)
  • Indexing using numeric, logical, and character vectors
  • Data frames and tibbles
  • Writing custom functions
  • Pipe operator
  • Data manipulation with tidyverse: subsetting, mutating, grouping, summarising, reshaping
  • Plotting with tidyverse

Objective:

Solve as many questions as you can. Write your answers in the provided spaces. Knit your submission to PDF (or knit to HTML and print to PDF), then upload it to NTULearn.

If some of your R code doesn’t run, you may comment it out.

Mode

This is a restricted open-book quiz. You may use R manuals and course materials, but internet search and AI tools are not allowed.

Time limit

You have 30 minutes to complete the quiz.

# Load required packages
library(tidyverse)

Question 1

Compute the following sum using a single R command (no loops or if-else statements):

\[ \sum_{k=3}^{42}\frac{1}{4k+1}=\frac{1}{13} + \frac{1}{17}+\frac{1}{21}+\cdots+\frac{1}{169} \]

# ANSWER HERE
# First method: build the denominators 13, 17, ..., 169 directly
sum(1 / seq(from = 13, to = 169, by = 4))
## [1] 0.6846003

# Second method: compute 4k + 1 for k = 3, ..., 42
sum(1 / (4 * (3:42) + 1))
## [1] 0.6846003
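
Both methods give the same result because they build the same vector of denominators. The short check below is a sketch using illustrative variable names (`k`, `denominators`) that are not part of the quiz:

```r
# Index values from the summation: k = 3, 4, ..., 42
k <- 3:42

# Denominators 4k + 1, as used in the second method
denominators <- 4 * k + 1

length(denominators)   # number of terms in the sum
range(denominators)    # smallest and largest denominator

# TRUE if the sequence from the first method matches element by element
all(denominators == seq(from = 13, to = 169, by = 4))
```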

Question 2

Below we define a reference matrix:

X <- matrix(rep(1, 100), nrow = 10)
X <- 1.3 * row(X) - col(X)
X
##       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
##  [1,]  0.3 -0.7 -1.7 -2.7 -3.7 -4.7 -5.7 -6.7 -7.7  -8.7
##  [2,]  1.6  0.6 -0.4 -1.4 -2.4 -3.4 -4.4 -5.4 -6.4  -7.4
##  [3,]  2.9  1.9  0.9 -0.1 -1.1 -2.1 -3.1 -4.1 -5.1  -6.1
##  [4,]  4.2  3.2  2.2  1.2  0.2 -0.8 -1.8 -2.8 -3.8  -4.8
##  [5,]  5.5  4.5  3.5  2.5  1.5  0.5 -0.5 -1.5 -2.5  -3.5
##  [6,]  6.8  5.8  4.8  3.8  2.8  1.8  0.8 -0.2 -1.2  -2.2
##  [7,]  8.1  7.1  6.1  5.1  4.1  3.1  2.1  1.1  0.1  -0.9
##  [8,]  9.4  8.4  7.4  6.4  5.4  4.4  3.4  2.4  1.4   0.4
##  [9,] 10.7  9.7  8.7  7.7  6.7  5.7  4.7  3.7  2.7   1.7
## [10,] 12.0 11.0 10.0  9.0  8.0  7.0  6.0  5.0  4.0   3.0

Write a single R command that computes the fraction of negative entries in any matrix (not necessarily the one above). For the given reference matrix, you should get \(0.37\).

# ANSWER HERE
mean(X < 0) 
## [1] 0.37
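
This works because `X < 0` produces a logical matrix, and `mean()` treats TRUE as 1 and FALSE as 0, so the mean equals the proportion of TRUE entries. A minimal sketch on a small made-up matrix (`M` is purely illustrative and not part of the quiz data):

```r
# Illustrative 2 x 3 matrix with three negative entries out of six
M <- matrix(c(-1, 2, -3, 4, 5, -6), nrow = 2)

M < 0                    # logical matrix marking the negative entries
sum(M < 0)               # number of negatives: each TRUE counts as 1
sum(M < 0) / length(M)   # fraction computed explicitly
mean(M < 0)              # same fraction in a single step
```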

Question 3

What does the following function do?

mystery_function <- function(x) {
  rep(0, length(x)) +
    1 * (x == "0-5yrs") +
    2 * (x == "6-11yrs") +
    3 * (x == "12+ yrs")
}

As a reference, here we apply this function to a random sample of 10 values from the education column of the infert data set:

infert %>%
  pull(education) %>%
  sample(size = 10) %>%
  mystery_function()
##  [1] 2 3 3 3 2 3 2 3 3 3

Type your answer below:

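ANSWER This function recodes a categorical variable (character or factor) into numeric values: "0-5yrs" becomes 1, "6-11yrs" becomes 2, "12+ yrs" becomes 3, and any other non-missing value becomes 0.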
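
To see how the function works, it helps to look at one term at a time: each comparison yields a logical vector, which is multiplied by its numeric code, and the terms are summed. The sketch below repeats the definition so it is self-contained; the input vector `edu`, including the label "unknown", is a hypothetical example:

```r
# Same definition as in the question
mystery_function <- function(x) {
  rep(0, length(x)) +
    1 * (x == "0-5yrs") +
    2 * (x == "6-11yrs") +
    3 * (x == "12+ yrs")
}

# Hypothetical input; "unknown" matches none of the three labels
edu <- c("0-5yrs", "12+ yrs", "6-11yrs", "unknown")

edu == "6-11yrs"         # logical vector: TRUE only at position 3
2 * (edu == "6-11yrs")   # TRUE/FALSE become 1/0, giving 0 0 2 0
mystery_function(edu)    # 1 3 2 0

# Factors are compared by label, so the result is the same
mystery_function(factor(edu))
```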

Question 4

Interpret the following plot. Specifically, which species has the longest petals, which has the shortest, and how do you know?

iris %>%
  ggplot(aes(x = Petal.Length, fill = Species)) + 
  geom_density(alpha = 0.6)

Type your answer below:

ANSWER Virginica has the longest petals and setosa the shortest. The density curve for virginica sits furthest to the right (largest Petal.Length values), while the curve for setosa sits furthest to the left.
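
The visual reading can be backed up numerically. The sketch below is not part of the expected answer; it summarises Petal.Length per species and sorts by the median, with `petal_summary` and the column names chosen here for illustration:

```r
library(dplyr)

# Minimum, median, and maximum petal length per species,
# sorted from the shortest to the longest median
petal_summary <- iris %>%
  group_by(Species) %>%
  summarise(
    min_length    = min(Petal.Length),
    median_length = median(Petal.Length),
    max_length    = max(Petal.Length)
  ) %>%
  arrange(median_length)

petal_summary
```

The first row of the result corresponds to the species whose density curve lies furthest to the left, and the last row to the one furthest to the right.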

Question 5

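Write a single chain of pipe operators that:

  • takes the USArrests data and log-transforms every column except UrbanPop
  • reshapes the log-transformed columns into long format
  • draws a violin plot of the log values for each crime type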

# ANSWER HERE
USArrests %>%
  mutate(across(-UrbanPop, log)) %>%   # log-transform every column except UrbanPop
  pivot_longer(cols = -UrbanPop, names_to = "crime", values_to = "log_value") %>%
  ggplot(aes(x = crime, y = log_value)) +
  geom_violin()                        # one violin per crime type

Question 6

Write a pipe chain that computes the mean uptake in CO2 for each combination of Type and Treatment, and reshapes the result into a table with one row per Type and one column per Treatment.

Example format:

Type         nonchilled  chilled
Quebec
Mississippi

# ANSWER HERE
CO2 %>%
  group_by(Type, Treatment) %>%
  summarise(mean_uptake = mean(uptake), .groups = "drop") %>%   # drop grouping after summarising
  pivot_wider(names_from = Treatment, values_from = mean_uptake)
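
As a cross-check, the same cell means can be obtained with base R. This sketch swaps in `tapply()` instead of a pipe chain, so it is not an answer to the question as posed; the name `mean_table` is illustrative:

```r
# Mean uptake for every Type x Treatment cell, returned as a matrix
# with Type in the rows and Treatment in the columns
mean_table <- with(CO2, tapply(uptake, list(Type, Treatment), mean))

mean_table
dimnames(mean_table)   # row and column labels come from the factor levels
```

The values in this matrix should match the nonchilled and chilled columns of the reshaped table.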