Quarto Assignment

Breaking Down Homework #4

By: Bronwyn Kiernan

Sophisticated Data Structures


In this homework assignment, we look into data structures like lists and dataframes that simplify:

  • Data Representation
  • Manipulation
  • Analysis


Also introduced the factor, which is a special kind of variable that is used to represent categorical objects.

Question 1)

From the spuRs package you can obtain the dataset ufc.csv, with forest inventory observations from the University of Idaho Experimental Forest.

What are the species of the three tallest trees? Of the five fattest trees? (Use the order command.)

What are the mean diameters by species?

What are the two species that have the largest third quartile diameters?

What are the two species with the largest median slenderness (height/diameter) ratios? How about the two species with the smallest median slenderness ratios?

What is the identity of the tallest tree of the species that was the fattest on average?

Step 1: Loading data into R


Before answering any question we have to upload ufc data

ufc <- read.csv("/Users/bronwynkiernan/math 329/ufc.csv", header = TRUE)


What’s contained in this dataset?

head(ufc) #will show first 6 rows of data
  plot tree species dbh.cm height.m
1    2    1      DF     39     20.5
2    2    2      WL     48     33.0
3    3    2      GF     52     30.0
4    3    5      WC     36     20.7
5    3    8      WC     38     22.5
6    4    1      WC     46     18.0


Now onto part a)!

Part a)


We want to use the order function to determine the heights of trees in the ufc dataset

three_tallest_trees <- ufc[order(ufc$height.m,  decreasing=TRUE), ][1:3, "species"] 
#we want to use the height.m column to categorize the df by height
#we set decreasing=TRUE, so it sorts the heights in descending order (from tallest to shortest).


so, the function return the species with the three tallest trees

print(three_tallest_trees)
[1] "GF" "WL" "DF"

Part a)


Now, for the fattest trees we will be using the dbh.cm column

five_fattest_trees <- ufc[order(ufc$dbh.cm, decreasing=TRUE), ][1:5, "species"] 


so, the function return the species with the five fattest trees

print(five_fattest_trees)
[1] "WC" "DF" "WC" "WC" "GF"

Part b)

tapply() is a function that allows us to vectorize the application of a function to subsets of data. In conjunction with factors, this can make for exceptionally efficient code. it looks like the form: tapply(X, INDEX, FUN, ...)

Using tapply() we are able to calculate the mean DBH for each species separately.

mean_diameters <- tapply(ufc$dbh.cm, ufc$species, mean)

so, the function is applied to each subset of dbh.cm defined by the corresponding species & calculates the average diameter for trees of each species.

print(round(mean_diameters,2))
   DF    GF    WC    WL 
39.91 35.21 38.84 33.73 

Part c)


We want to find the two species with the largest third quartile diameters


This means we are going to use the tapply() function again

# tapply applies the quantile function to dbh.cm (diameter) values grouped by species
third_quartile_diameters <- tapply(ufc$dbh.cm, ufc$species, quantile, probs = 0.75)
# this calculates the third quartile diameter (0.75 quantile) for each species in the dataset.

Part c)


So, in return it will calculate all species’ third quartile diameters then sort by the largest to smallest

# Print sorted third quartile diameters
print(sort(third_quartile_diameters, decreasing=TRUE)) 
   DF    WC    GF    WL 
50.20 49.20 44.40 43.85 
# Print the names of the two species with the largest third quartile diameters
print(names(sort(third_quartile_diameters, decreasing=TRUE))[1:2]) 
[1] "DF" "WC"

Part d)

  • The slenderness ratio is calculated as the ratio of height to diameter.
  • Using tapply(), we compute the median ratio for each species.
  • Then we sort the results to identify the species with the largest and smallest median slenderness ratios.
# Calculate the median slenderness for each species
slenderness <- tapply(ufc$height.m / ufc$dbh.cm, ufc$species, median)

# Sort the ratios for output
print(sort(slenderness))
       WC        DF        GF        WL 
0.6435644 0.6487342 0.7136517 0.9244338 
# Display the largest and smallest median ratios
cat('largest: ', names(sort(slenderness, decreasing = TRUE))[1:2], '\n')
largest:  WL GF 
cat('smallest:' ,names(sort(slenderness, decreasing = FALSE))[1:2], '\n')
smallest: WC DF 

Part e) Finding the identity of the tallest tree that was the fattest on average

# Calculate the volume of each tree using the formula for the volume of a cylinder
ufc$volume.m3 <- pi * (ufc$dbh.cm/200)^2 * ufc$height.m / 2
# Calculate the mean volume for each species in the dataset
fatness_per_species <- tapply(ufc$volume.m3, ufc$species, mean)
print(fatness_per_species) # Prints the result shows the average volume for each species.
      DF       GF       WC       WL 
2.127623 1.774630 2.060298 1.472980 

Part e)


The names() function is used for part of this question to extract the species name


# Sorts the species by their average volume in descending order and selects the one with the highest average fattest" species
fattest_species <- names(sort(fatness_per_species, decreasing= TRUE))[1]
# Filters the dataset to include only trees belonging to the fattest species (using subset())
fattest_species_list <- subset(ufc, subset= species %in% c(fattest_species), select=c(plot, tree, species, dbh.cm, height.m,volume.m3))
# Sorts the trees of the fattest species by height in descending order and selects the tallest one
tallest_tree_in_fattest_species <- fattest_species_list[order(fattest_species_list$height.m, decreasing=TRUE), ][1, ]
print(tallest_tree_in_fattest_species)
    plot tree species dbh.cm height.m volume.m3
141   55    2      DF   99.8       42  16.42745

Question 2) Create a list with the following info:

  • your full name, gender, age,
  • a list of your 3 favorite movies,
  • the answer to the question ‘Do you support the United Nations?’, and
  • a list of the birth day and month of your immediate family members


Do the same for three close friends, & write a program to check if there are any shared birthdays or names in the four lists.

Lists

  • A list is an indexed set of objects (and so has a length), but unlike a vector the elements of a list can be of different types, including other lists.
  • A list is created using the list(...) command, with comma-separated arguments
  • Single square brackets are used to select a sublist; double square brackets are used to extract a single element.

Information Lists

My Info List

my_info <- list(
  name = "Bronwyn Kiernan",               
  gender = "Female",
  age = 20,
  favorite_movies = c("Pitch Perfect", "Mamma Mia", "Maze Runner"),
  support_UN = TRUE,
  family_birthdays = list(
    Bronwyn = c(day = 7, month = 9),   # My birthday
    Jennifer = c(day = 20, month = 8),   # Immediate family member 1 (mother)
    Erin = c(day = 27, month = 9)    # Immediate family member 2 (sister)
  )
)

Friend #1 List

friend_1 <- list(
  name = "Hayden",
  gender = "Female",
  age = 19,
  favorite_movies = c("La La Land", "Barbie", "Pitch Perfect"),
  support_UN = TRUE,
  family_birthdays = list(
    Hayden = c(day = 29, month = 9),    # Friend 1's birthday
    PJ = c(day = 23, month = 12),
    Wendy = c(day = 16, month = 4)
  )
)

Information Lists

Friend #2 List

friend_2 <- list(
  name = "Ruby",
  gender = "Female",
  age = 18,
  favorite_movies = c("Quiet Place", "Harry Potter", "Get Out"),
  support_UN = TRUE,
  family_birthdays = list(
    Ruby = c(day = 24, month = 10),      # Friend 2's birthday
    Posey = c(day = 23, month = 4),
    Sunny = c(day = 18, month = 5)
  )
)

Friend #3 List

friend_3 <- list(
  name = "Cassie",
  gender = "Female",
  age = 20,
  favorite_movies = c("Coco", "Toy Story", "Moana"),
  support_UN = TRUE,
  family_birthdays = list(
    Cassie = c(day = 25, month = 7),   # Friend 3's birthday
    Tori = c(day = 20, month = 6),
    Denise = c(day = 20, month = 8)
  )
)

Check for Shared Birthdays

  • all_families: combines all family information into a single list for easier iteration.
  • birthdays: initializes an empty list to store all individual birthday entries.
  • The for loop iterates over each family in all_families.
  • It extracts the family_birthdays (assumed to be a list of birthdays for each family) and adds them to the birthdays list.
  • The class(birthdays) & print(birthdays) statements are used for debugging, displaying the data type and contents of the birthdays list.
  • Initializes shared_birthdays as an empty list to store any birthdays that match.
  • Uses for loops, the outer loop (i) iterates from the first birthday to the second-to-last & the inner loop (j) iterates from the next birthday after i to the last.
  • Inside the loops, it prints out the current pair of birthdays for debugging & checks if the day/month match for the 2 birthdays being compared. If they do, it appends the matching birthday to the shared_birthdays list.

Code for checking shared birthdays

check_shared_birthdays <- function(my_info, friend_1, friend_2, friend_3) {
  all_families <- list(my_info, friend_1, friend_2, friend_3)
  birthdays <- list()
  
  for (family in all_families) {
    birthdays <- c(birthdays, family$family_birthdays)
  }
  class(birthdays)
  print(birthdays)
  
  shared_birthdays <- list()
  for (i in 1:(length(birthdays)-1)) {
    for (j in (i + 1):length(birthdays)) {
        print(birthdays[[i]])
        print(birthdays[[j]])
      if (birthdays[[i]]["day"] == birthdays[[j]]["day"] && 
          birthdays[[i]]["month"] == birthdays[[j]]["month"]) {
        shared_birthdays <- append(shared_birthdays, list(birthdays[[i]]))
      }
    }
  }
  
  return(shared_birthdays)
}

Check for Shared Names

Repeat the similar process for names

  • Create a function to collect everyone’s names
  • Place all names into the same list
  • Check for duplicates in names

Code

check_shared_names <- function(my_info, friend_1, friend_2, friend_3) {
  all_families <- list(my_info, friend_1, friend_2, friend_3)
  names <- list()
  
  for (family in all_families) {
    names <- c(names, names(family$family_birthdays))
  }
  
  shared_names <- names[duplicated(names)]
  
  return(shared_names)
}

Answer

Combine all lists we made into the two functions we have

shared_birthdays <- check_shared_birthdays(my_info, friend_1, friend_2, friend_3)
$Bronwyn
  day month 
    7     9 

$Jennifer
  day month 
   20     8 

$Erin
  day month 
   27     9 

$Hayden
  day month 
   29     9 

$PJ
  day month 
   23    12 

$Wendy
  day month 
   16     4 

$Ruby
  day month 
   24    10 

$Posey
  day month 
   23     4 

$Sunny
  day month 
   18     5 

$Cassie
  day month 
   25     7 

$Tori
  day month 
   20     6 

$Denise
  day month 
   20     8 

  day month 
    7     9 
  day month 
   20     8 
  day month 
    7     9 
  day month 
   27     9 
  day month 
    7     9 
  day month 
   29     9 
  day month 
    7     9 
  day month 
   23    12 
  day month 
    7     9 
  day month 
   16     4 
  day month 
    7     9 
  day month 
   24    10 
  day month 
    7     9 
  day month 
   23     4 
  day month 
    7     9 
  day month 
   18     5 
  day month 
    7     9 
  day month 
   25     7 
  day month 
    7     9 
  day month 
   20     6 
  day month 
    7     9 
  day month 
   20     8 
  day month 
   20     8 
  day month 
   27     9 
  day month 
   20     8 
  day month 
   29     9 
  day month 
   20     8 
  day month 
   23    12 
  day month 
   20     8 
  day month 
   16     4 
  day month 
   20     8 
  day month 
   24    10 
  day month 
   20     8 
  day month 
   23     4 
  day month 
   20     8 
  day month 
   18     5 
  day month 
   20     8 
  day month 
   25     7 
  day month 
   20     8 
  day month 
   20     6 
  day month 
   20     8 
  day month 
   20     8 
  day month 
   27     9 
  day month 
   29     9 
  day month 
   27     9 
  day month 
   23    12 
  day month 
   27     9 
  day month 
   16     4 
  day month 
   27     9 
  day month 
   24    10 
  day month 
   27     9 
  day month 
   23     4 
  day month 
   27     9 
  day month 
   18     5 
  day month 
   27     9 
  day month 
   25     7 
  day month 
   27     9 
  day month 
   20     6 
  day month 
   27     9 
  day month 
   20     8 
  day month 
   29     9 
  day month 
   23    12 
  day month 
   29     9 
  day month 
   16     4 
  day month 
   29     9 
  day month 
   24    10 
  day month 
   29     9 
  day month 
   23     4 
  day month 
   29     9 
  day month 
   18     5 
  day month 
   29     9 
  day month 
   25     7 
  day month 
   29     9 
  day month 
   20     6 
  day month 
   29     9 
  day month 
   20     8 
  day month 
   23    12 
  day month 
   16     4 
  day month 
   23    12 
  day month 
   24    10 
  day month 
   23    12 
  day month 
   23     4 
  day month 
   23    12 
  day month 
   18     5 
  day month 
   23    12 
  day month 
   25     7 
  day month 
   23    12 
  day month 
   20     6 
  day month 
   23    12 
  day month 
   20     8 
  day month 
   16     4 
  day month 
   24    10 
  day month 
   16     4 
  day month 
   23     4 
  day month 
   16     4 
  day month 
   18     5 
  day month 
   16     4 
  day month 
   25     7 
  day month 
   16     4 
  day month 
   20     6 
  day month 
   16     4 
  day month 
   20     8 
  day month 
   24    10 
  day month 
   23     4 
  day month 
   24    10 
  day month 
   18     5 
  day month 
   24    10 
  day month 
   25     7 
  day month 
   24    10 
  day month 
   20     6 
  day month 
   24    10 
  day month 
   20     8 
  day month 
   23     4 
  day month 
   18     5 
  day month 
   23     4 
  day month 
   25     7 
  day month 
   23     4 
  day month 
   20     6 
  day month 
   23     4 
  day month 
   20     8 
  day month 
   18     5 
  day month 
   25     7 
  day month 
   18     5 
  day month 
   20     6 
  day month 
   18     5 
  day month 
   20     8 
  day month 
   25     7 
  day month 
   20     6 
  day month 
   25     7 
  day month 
   20     8 
  day month 
   20     6 
  day month 
   20     8 
shared_names <- check_shared_names(my_info, friend_1, friend_2, friend_3)
# Output results
print("Shared Birthdays:")
[1] "Shared Birthdays:"
print(shared_birthdays)
[[1]]
  day month 
   20     8 
print("Shared Names:")
[1] "Shared Names:"
print(shared_names)
list()

Question 3)


Using the tree growth data (Section 6.4.3, available from the spuRs package), plot tree age versus height for each tree, broken down by habitat type. That is, create a grid of 5 plots, each showing the trees from a single habitat.

  • In this example instead of having all habitats on one plot, we are going to break it up into 5 different plots for easier viewing
Load in dataset from computer
treegrowth <- read.csv("/Users/bronwynkiernan/math 329/treegrowth.csv")
head(treegrowth)
  tree.ID forest habitat dbh.in height.ft age
1       1      4       5   14.6      71.4  55
2       1      4       5   12.4      61.4  45
3       1      4       5    8.8      40.1  35
4       1      4       5    7.0      28.6  25
5       1      4       5    4.0      19.6  15
6       2      4       5   20.0     103.4 107

Code

# mfrow; sets up the plotting area to display multiple plots in a 2x3 grid
# las; changes the orientation of the axis labels to be horizontal
# mar; sets the margins of the plots (bottom, left, top, right) to ensure sufficient space for labels and titles
par(mfrow=c(2,3), las=1, mar=c(4,4,3,2))
# unique(); retrieves all unique habitat types from the treegrowth dataset, which will be used to create separate plots for each habitat
habitat_types <- unique(treegrowth$habitat)

Better View of Plot

Here we can see all 5 separate plots

par(mfrow = c(2, 3)) for (habitat in habitat_types) { habitat_data <- subset(treegrowth, habitat == treegrowth\(habitat) plot(habitat_data\)age, habitat_data\(height.ft, main = paste("habitat:", habitat), xlab = "age", ylab = "height", type = "n") points(habitat_data\)age, habitat_data$height.ft, pch = 19) }

Question 4) Pascal’s Triangle


What is Pascal’s Triangle?

  • A triangular array of binomial coefficients.
  • Each number is the sum of the two numbers directly above it.

Function to Generate the Next Row

  • Code:
next_row <- function(prev_row) {
  return(c(1, prev_row[-length(prev_row)] + prev_row[-1], 1))
}
  • Explanation: Takes the previous row of Pascal’s Triangle as input. Then adds adjacent elements of the previous row & surrounds the sum with 1 at both ends to form the next row.

Function to Generate Pascal’s Triangle

  • Initializes the first row of Pascal’s Triangle as [1].
  • It generates rows up to the specified depth using next_row.
  • Stores each row in a list to form the entire triangle.
  • Code
pascals_triangle <- function(depth) {
  # Start with the first row
  triangle <- list(c(1))
  
  # Iteratively generate each subsequent row
  for (i in 2:depth) {
    # Get the last row and calculate the next row
    next_r <- next_row(triangle[[i - 1]])
    triangle[[i]] <- next_r
  }
  
  return(triangle)
}

Extending Pascal’s Triangle

  • Takes an existing Pascal’s Triangle.
  • Generates the next row by extending the last row of the triangle.
  • Appends the new row to the triangle.
  • Code
extend_pascals_triangle <- function(triangle) {
  # Get the last row of the current triangle
  last_row <- triangle[[length(triangle)]]
  
  # Generate the next row and append it to the triangle
  next_r <- next_row(last_row)
  triangle[[length(triangle) + 1]] <- next_r
  
  return(triangle)
}

Verifying the 11th Row

  • Generates Pascal’s Triangle up to 10 rows.
  • Extends it to get the 11th row.
  • Verifies that the 11th row corresponds to the binomial coefficients 10 choose i.
  • Code
depth_10_triangle <- pascals_triangle(10)
depth_11_triangle <- extend_pascals_triangle(depth_10_triangle)

# The 11th row of Pascal's triangle
eleventh_row <- depth_11_triangle[[11]]
print(eleventh_row)
 [1]   1  10  45 120 210 252 210 120  45  10   1
print(pascals_triangle(11))
[[1]]
[1] 1

[[2]]
[1] 1 1

[[3]]
[1] 1 2 1

[[4]]
[1] 1 3 3 1

[[5]]
[1] 1 4 6 4 1

[[6]]
[1]  1  5 10 10  5  1

[[7]]
[1]  1  6 15 20 15  6  1

[[8]]
[1]  1  7 21 35 35 21  7  1

[[9]]
[1]  1  8 28 56 70 56 28  8  1

[[10]]
 [1]   1   9  36  84 126 126  84  36   9   1

[[11]]
 [1]   1  10  45 120 210 252 210 120  45  10   1

Question 5) Horse Racing

  • Goal: organize racing data into a structured format & plot the race results, highlighting winning horses.
  • Each line in the data file represents information for a race.
  • This formal is Race number, Win/Loss status (1 or 0), race time, and pairs of values (time and price).

Function to Read & Process the Racing Data

  • Using readLines() to read the entire file into memory.
  • Store information for each race; race_num, win_loss, times & prices
  • Handling race transitions: Each race is stored separately in the list races.

The Code

read_racing_data <- function() {
  raw_data <- readLines("/Users/bronwynkiernan/math 329/racing.txt")
  
  # Initialize variables
  races <- list()
  current_race <- NULL
  current_horses <- list()
  
  for (line in raw_data) {
    # Split the line by tab or spaces
    tokens <- as.numeric(unlist(strsplit(line, "\\s+")))
    tokens <- tokens[!is.na(tokens)]
    
    race_num <- tokens[1]
    win_loss <- tokens[2] == 1
    off_time <- tokens[3]
    # Extract the times and prices from remaining tokens in pairs
    times_prices <- tokens[-c(1:3)]
    times <- times_prices[seq(2, length(times_prices), by = 2)]
    prices <- times_prices[seq(1, length(times_prices), by = 2)]
    # Create a horse entry
    horse <- list(win = win_loss, times = times, prices = prices)
    # Check if we are still in the same race or a new race has started
    if (is.null(current_race)) {
      current_race <- race_num
      current_horses <- list(horse)
    } else if (race_num == current_race) {
      current_horses <- append(current_horses, list(horse))
    } else {
      races[[current_race]] <- current_horses
      current_race <- race_num
      current_horses <- list(horse)
    }
  }
  # Add the last race
  if (!is.null(current_race)) {
    races[[current_race]] <- current_horses
  }
  return(races)
}

Plotting a Single Race

  • Use plot() with empty data (NULL) and set axes limits based on the range of times and prices.
  • Assign red to winning horses and black to the others.
  • For each horse, plot a line connecting the time points (times) and the log of prices (log(prices)), with different line widths for winners (thicker).

Actual Code

# Function to plot a single race
plot_race <- function(race) {
  # Initialize plot
  plot(NULL, xlim = range(unlist(lapply(race, function(horse) horse$times), use.names = FALSE), na.rm = TRUE),
       ylim = range(log(unlist(lapply(race, function(horse) horse$prices), use.names = FALSE)), na.rm = TRUE),
       xlab = "Time", ylab = "Log(Price)", main = "Log Price vs Time")
  
  # Set colors
  colors <- rep("black", length(race))
  for (i in 1:length(race)) {
    if (race[[i]]$win) {
      colors[i] <- "red"  # Highlight the winning horse in red
    }
  }
  
  # Plot each horse
  for (i in 1:length(race)) {
    lines(race[[i]]$times, log(race[[i]]$prices), col = colors[i], lwd = ifelse(race[[i]]$win, 2, 1))
  }
  
  legend("topright", legend = c("Winning Horse", "Losing Horse"), col = c("red", "black"), lty = 1, lwd = c(2, 1))
}

Question 6)

Here is a recursive program that prints all the possible ways that an amount x (in cents) can be made up using Australian coins.

To avoid repetition, each possible decomposition is ordered.
  • Goal: Find all possible ways to make up an amount x (in cents) using Australian coins.
  • x: The target amount to break into coins.
  • y.vec: A vector to keep track of coins used so far.
  • coins: the available denominations: [200, 100, 50, 20, 10, 5] (all divisible by 5).

Breakdown of Code

  • This function prints all possible combinations of coins that sum up to x.
  • If x == 0, it prints y.vec, which holds the combination of coins used.
  • Subtract each coin from x to create a new potential remaining amount (new.x).
  • Filter new.x to keep only non-negative amounts.
  • For each valid new.x, create a new coin combination (y.tmp) and call change() recursively.
  • The identical(y.tmp, sort(y.tmp)) ensures combinations are sorted in descending order to avoid duplicates.

Actual Code

change <- function(x, y.vec = c()) {
  # finds possible ways of making up amount x using Australian coins
  # x is given in cents and we assume it is divisible by 5
  # y.vec are coins already used (so total amount is x + sum(y.vec))
  if (x == 0) {
    cat(y.vec, "\n")
  } else {
    coins <- c(200, 100, 50, 20, 10, 5)
    new.x <- x - coins
    new.x <- new.x[new.x >= 0]
    for (z in new.x) {
      y.tmp <- c(y.vec, x - z)
      if (identical(y.tmp, sort(y.tmp))) {
        change(z, y.tmp)
      }
    }
    return(invisible(NULL))
  }
}

Question 7)

Rewrite our function for finding out which team won the premiership in 1967, so that it can accept a year as an argument.
  • Heres the orginal code:
premierships <- list(
  Adelaide = c(1997, 1998),
  Carlton = c(1906, 1907, 1908, 1914, 1915, 1938, 1945, 1947,1968, 1970, 1972, 1979, 1981, 1982, 1987, 1995),
  Collingwood = c(1902, 1903, 1910, 1917, 1919, 1927, 1928, 1929, 1930, 1935, 1936, 1953, 1958, 1990, 2010),
  Essendon = c(1897, 1901, 1911, 1912, 1923, 1924, 1942, 1946, 1949, 1950, 1962, 1965, 1984, 1985, 1993, 2000),
  Fitzroy_Brisbane = c(1898, 1899, 1904, 1905, 1913, 1916, 1922, 1944, 2001, 2002, 2003),
  Footscray_W.B. = c(1954),
  Fremantle = c(),
  Geelong = c(1925, 1931, 1937, 1951, 1952, 1963, 2007, 2009, 2011),
  Hawthorn = c(1961, 1971, 1976, 1978, 1983, 1986, 1988, 1989, 1991, 2008, 2013),
  Melbourne = c(1900, 1926, 1939, 1940, 1941, 1948, 1955, 1956, 1957, 1959, 1960, 1964),
  N.Melb_Kangaroos = c(1975, 1977, 1996, 1999),
  PortAdelaide = c(2004),
  Richmond = c(1920, 1921, 1932, 1934, 1943, 1967, 1969, 1973, 1974, 1980),
  StKilda = c(1966),
  S.Melb_Sydney = c(1909, 1918, 1933, 2005, 2012),
  WestCoast = c(1992, 1994, 2006)
  )
year <- 1967
for (i in 1:length(premierships)) {
  if (year %in% premierships[[i]]) {
  winner <- names(premierships)[i]
    }
}
winner
[1] "Richmond"

Here’s how to rewrite it:

  • Initialize winner as NULL: This variable will store the name of the winning team if found.
  • Loop through the list of premierships, checking if the year appears in any team’s championship years.
  • If the year is found in one of the team’s premiership years (year %in% premierships[[i]]), the team’s name is extracted from the list names (names(premierships)[i]).
  • Then once a match is found, the loop breaks to avoid unnecessary iteration (break).

Actual Code

team_winner <- function(year, premierships) {
  if (!is.numeric(year) || length(year) != 1 || year < 1870 || year > 2023) {
    stop("provide a valid year between up to 2023.")
  }
  winner <- NULL
  for (i in 1:length(premierships)) {
    if (year %in% premierships[[i]]) {
      winner <- names(premierships)[i]
      break  
    }
  }
  if (!is.null(winner)) {
    return(paste("The team that won in", year, "is", winner))
  } else {
    return(paste("None found", year))
  }
}
team_winner(1967, premierships)
[1] "The team that won in 1967 is Richmond"

Question 8) Which function has the largest number of named arguments?

What is the apropos function?
  • apropos: find objects by (partial) name
  • apropos() returns a character vector giving the names of objects in the search list matching (as a regular expression) what.
Code
args("plot")
function (x, y, ...) 
NULL
str(formals("plot"))
Dotted pair list of 3
 $ x  : symbol 
 $ y  : symbol 
 $ ...: symbol