One way to make your code more readable is to be careful about the order you pass arguments when you call functions, and whether you pass the arguments by position or by name.
gold_medals, a numeric vector of the number of gold medals won by each country in the 2016 Summer Olympics, is provided.
gold_medals <- c(46, 27, 26, 19, 17, 12, 10, 9, 8, 8, 8, 8, 7, 7, 6, 6, 5, 5, 4, 4)
names(gold_medals) <- c("USA", "GBR", "CHN", "RUS", "GER", "JPN", "FRA", "KOR", "ITA", "AUS", "NED", "HUN", "BRA", "ESP", "KEN", "JAM", "CRO", "CUB", "NZL", "CAN")
gold_medals
USA GBR CHN RUS GER JPN FRA KOR ITA AUS NED HUN BRA ESP KEN JAM CRO CUB
46 27 26 19 17 12 10 9 8 8 8 8 7 7 6 6 5 5
NZL CAN
4 4
For convenience, the arguments of median() and rank() are displayed using args(). Setting rank()’s na.last argument to “keep” means “keep the rank of NA values as NA”.
Best practice for calling functions is to include them in the order shown by args(), and to only name rare arguments.
# Note the arguments to mean()
args(mean)
function (x, ...)
NULL
# Rewrite this function call, following best practices
median(gold_medals, na.rm = TRUE)
[1] 8
# Note the arguments to rank()
args(rank)
function (x, na.last = TRUE, ties.method = c("average", "first",
"last", "random", "max", "min"))
NULL
# Rewrite this function call, following best practices
rank(-gold_medals, na.last = "keep", ties.method = "min")
USA GBR CHN RUS GER JPN FRA KOR ITA AUS NED HUN BRA ESP KEN JAM CRO CUB
1 2 3 4 5 6 7 8 9 9 9 9 13 13 15 15 17 17
NZL CAN
19 19
Pass the arguments in the order suggested by the documentation, and give names for rarer arguments.
It’s a really good idea when writing functions to start simple. You can always make a function more complicated later if it’s really necessary, so let’s not worry about arguments for now.
coin_sides <- c("head", "tail")
# Sample from coin_sides once
sample(coin_sides, 1)
[1] "head"
# Your function
toss_coin <- function() {
coin_sides <- c("head", "tail")
sample(coin_sides, 1)
}
# Call your function
toss_coin()
[1] "tail"
Most functions require some sort of input to determine what to compute. The inputs to functions are called arguments. You specify them inside the parentheses after the word “function.”
coin_sides <- c("head", "tail")
n_flips <- 10
# Sample from coin_sides n_flips times with replacement
sample(coin_sides, size = n_flips, replace = TRUE)
[1] "head" "head" "tail" "head" "head" "head" "head" "tail" "head"
[10] "tail"
We can update the definition of toss_coin() to accept a single argument, n_flips. The function should sample coin_sides n_flips times with replacement.
# Update the function to return n coin tosses
toss_coin <- function(n_flips) {
coin_sides <- c("head", "tail")
sample(coin_sides, size = n_flips, replace = TRUE)
}
# Generate 10 coin tosses
toss_coin(10)
[1] "tail" "tail" "tail" "tail" "tail" "tail" "tail" "tail" "head"
[10] "head"
If a function should have more than one argument, list them in the function signature, separated by commas.
To solve this exercise, you need to know how to specify sampling weights to sample(). Set the prob argument to a numeric vector with the same length as x. Each value of prob is the probability of sampling the corresponding element of x, so their values add up to one. In the following example, each sample has a 20% chance of “bat”, a 30% chance of “cat” and a 50% chance of “rat”.
sample(c("bat", "cat", "rat"), 10, replace = TRUE, prob = c(0.2, 0.3, 0.5))
# Bias the coin by weighting the sampling.
n_flips <- 10
p_head <- 0.8
# Define a vector of weights
weights <- c(p_head, 1 - p_head)
# Update so that heads are sampled with prob p_head
sample(coin_sides, n_flips, replace = TRUE, prob = weights)
[1] "tail" "head" "head" "head" "head" "head" "head" "head" "head"
[10] "head"
we can update our function:
# Update the function so heads have probability p_head
toss_coin <- function(n_flips, p_head) {
coin_sides <- c("head", "tail")
# Define a vector of weights
weights <- c(p_head, 1 - p_head)
# Modify the sampling to be weighted
sample(coin_sides, n_flips, replace = TRUE, prob = weights)
}
# Generate 10 coin tosses
toss_coin(10, 0.8)
[1] "tail" "head" "head" "tail" "tail" "head" "head" "head" "tail"
[10] "head"
Data arguments are what a function computes on, and detail arguments advise on how the computation should be performed.
R’s generalized linear regression function, glm(), suffers the same usability problems as lm(): its name is an acronym, and its formula and data arguments are in the wrong order.
To solve this exercise, you need to know two things about generalized linear regression:
glm() formulas are specified like lm() formulas: response is on the left, and explanatory variables are added on the right. To model count data, set glm()’s family argument to poisson, making it a Poisson regression.
library(COUNT)
data(loomis)
# Run a generalized linear regression
glm(
# Model no. of visits vs. gender, income, travel
anvisits ~ gender + income + travel,
# Use the snake_river_visits dataset
data = loomis,
# Make it a Poisson regression
family = poisson
)
Call: glm(formula = anvisits ~ gender + income + travel, family = poisson,
data = loomis)
Coefficients:
(Intercept) gender income travel
5.2318 0.3069 -0.2608 -0.9456
Degrees of Freedom: 341 Total (i.e. Null); 338 Residual
(68 observations deleted due to missingness)
Null Deviance: 18630
Residual Deviance: 11960 AIC: 13290
Re-writing the function
# Write a function to run a Poisson regression
run_poisson_regression <- function(data, formula){
glm(formula, data, family = poisson)
}
Re-run Poisson regression using the new function
library(dplyr)
# Re-run the Poisson regression, using your function
model <- loomis %>%
run_poisson_regression(anvisits ~ gender + income + travel)
model
Call: glm(formula = formula, family = poisson, data = data)
Coefficients:
(Intercept) gender income travel
5.2318 0.3069 -0.2608 -0.9456
Degrees of Freedom: 341 Total (i.e. Null); 338 Residual
(68 observations deleted due to missingness)
Null Deviance: 18630
Residual Deviance: 11960 AIC: 13290
cut_by_quantile() converts a numeric vector into a categorical variable where quantiles define the cut points. This is a useful function, but at the moment you have to specify five arguments to make it work. This is too much thinking and typing.
# quantile function
cut_by_quantile <- function(x, n, na.rm, labels, interval_type) {
probs <- seq(0, 1, length.out = n + 1)
qtiles <- quantile(x, probs, na.rm = na.rm, names = FALSE)
right <- switch(interval_type, "(lo, hi]" = TRUE, "[lo, hi)" = FALSE)
cut(x, qtiles, labels = labels, right = right, include.lowest = TRUE)
}
By specifying default arguments, you can make it easier to use. Let’s start with n, which specifies how many categories to cut x into.
loomis$anvisits[is.na(loomis$anvisits)] <- 0
loomis$anvisits
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[18] 0 0 12 100 35 1 6 2 1 1 1 1 1 1 100 80 104
[35] 55 350 20 60 250 100 50 40 9 200 200 100 8 6 2 15 12
[52] 30 120 52 35 30 75 10 250 15 4 25 50 114 50 100 15 30
[69] 120 30 0 160 12 25 3 15 14 15 8 125 96 260 25 30 30
[86] 1 50 6 12 72 20 25 50 30 1 1 5 1 3 6 50 10
[103] 9 4 12 2 15 50 7 100 10 50 2 50 2 100 30 1 1
[120] 1 1 2 1 1 1 1 1 1 4 1 1 2 2 1 1 2
[137] 1 1 1 2 2 10 3 3 4 5 1 1 2 1 2 6 1
[154] 1 1 1 200 13 150 25 10 40 10 1 30 6 35 24 100 17
[171] 20 40 52 15 60 30 20 6 70 35 30 24 300 100 0 30 50
[188] 26 17 6 100 30 50 20 150 20 70 20 100 1 20 100 50 0
[205] 1 10 60 10 2 7 24 6 0 4 90 20 75 12 70 22 2
[222] 20 3 52 5 25 25 30 1 12 20 20 50 25 3 3 1 1
[239] 1 1 7 2 1 0 1 3 15 1 1 3 1 1 1 1 1
[256] 1 1 1 1 1 150 5 0 150 3 40 5 2 50 6 6 6
[273] 2 25 6 25 50 3 150 60 3 4 50 14 4 60 30 3 2
[290] 1 10 1 1 1 1 1 2 1 1 1 1 1 1 1 1 2
[307] 1 1 1 1 2 1 1 2 1 1 1 1 2 1 1 1 1
[324] 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[341] 1 1 1 1 1 8 1 1 1 1 1 1 1 1 1 1 1
[358] 1 2 1 1 1 1 1 1 1 1 1 1 1 2 1 0 1
[375] 50 30 40 208 50 20 150 50 80 75 6 10 6 26 60 30 30
[392] 15 12 30 20 20 120 15 75 4 35 2 30 76 2 1 3 1
[409] 3 2
attr(,"label")
[1] "annual visits to park"
attr(,"class")
[1] "labelled"
attr(,"format")
[1] "%8.0g"
# Set the default for n to 5
cut_by_quantile <- function(x, n = 5, na.rm, labels, interval_type) {
probs <- seq(0, 1, length.out = n + 1)
qtiles <- quantile(x, probs, na.rm = na.rm, names = FALSE)
right <- switch(interval_type, "(lo, hi]" = TRUE, "[lo, hi)" = FALSE)
cut(x, qtiles, labels = labels, right = right, include.lowest = TRUE)
}
# Remove the n argument from the call
cut_by_quantile(
loomis$anvisits,
na.rm = FALSE,
labels = c("very low", "low", "medium", "high", "very high"),
interval_type = "(lo, hi]"
)
[1] very low very low very low very low very low very low
[7] very low very low very low very low very low very low
[13] very low very low very low very low very low very low
[19] very low high very high high very low medium
[25] low very low very low very low very low very low
[31] very low very high very high very high very high very high
[37] high very high very high very high very high very high
[43] medium very high very high very high medium medium
[49] low high high high very high very high
[55] high high very high medium very high high
[61] medium high very high very high very high very high
[67] high high very high high very low very high
[73] high high medium high high high
[79] medium very high very high very high high high
[85] high very low very high medium high very high
[91] high high very high high very low very low
[97] medium very low medium medium very high medium
[103] medium medium high low high very high
[109] medium very high medium very high low very high
[115] low very high high very low very low very low
[121] very low low very low very low very low very low
[127] very low very low medium very low very low low
[133] low very low very low low very low very low
[139] very low low low medium medium medium
[145] medium medium very low very low low very low
[151] low medium very low very low very low very low
[157] very high high very high high medium very high
[163] medium very low high medium high high
[169] very high high high very high very high high
[175] very high high high medium very high high
[181] high high very high very high very low high
[187] very high high high medium very high high
[193] very high high very high high very high high
[199] very high very low high very high very high very low
[205] very low medium very high medium low medium
[211] high medium very low medium very high high
[217] very high high very high high low high
[223] medium very high medium high high high
[229] very low high high high very high high
[235] medium medium very low very low very low very low
[241] medium low very low very low very low medium
[247] high very low very low medium very low very low
[253] very low very low very low very low very low very low
[259] very low very low very high medium very low very high
[265] medium very high medium low very high medium
[271] medium medium low high medium high
[277] very high medium very high very high medium medium
[283] very high high medium very high high medium
[289] low very low medium very low very low very low
[295] very low very low low very low very low very low
[301] very low very low very low very low very low low
[307] very low very low very low very low low very low
[313] very low low very low very low very low very low
[319] low very low very low very low very low very low
[325] low very low very low very low very low very low
[331] very low very low very low very low very low very low
[337] very low very low very low very low very low very low
[343] very low very low very low medium very low very low
[349] very low very low very low very low very low very low
[355] very low very low very low very low low very low
[361] very low very low very low very low very low very low
[367] very low very low very low very low low very low
[373] very low very low very high high very high very high
[379] very high high very high very high very high very high
[385] medium medium medium high very high high
[391] high high high high high high
[397] very high high very high medium high low
[403] high very high low very low medium very low
[409] medium low
Levels: very low low medium high very high
cut_by_quantile() is now slightly easier to use, but you still always have to specify the na.rm argument. This removes missing values – it behaves the same as the na.rm argument to mean() or sd().
Where functions have an argument for removing missing values, the best practice is to not remove them by default (in case you hadn’t spotted that you had missing values). That means that the default for na.rm should be FALSE.
# Set the default for na.rm to FALSE
cut_by_quantile <- function(x, n = 5, na.rm = FALSE, labels, interval_type) {
probs <- seq(0, 1, length.out = n + 1)
qtiles <- quantile(x, probs, na.rm = na.rm, names = FALSE)
right <- switch(interval_type, "(lo, hi]" = TRUE, "[lo, hi)" = FALSE)
cut(x, qtiles, labels = labels, right = right, include.lowest = TRUE)
}
# Remove the na.rm argument from the call
cut_by_quantile(
loomis$anvisits,
labels = c("very low", "low", "medium", "high", "very high"),
interval_type = "(lo, hi]"
)
[1] very low very low very low very low very low very low
[7] very low very low very low very low very low very low
[13] very low very low very low very low very low very low
[19] very low high very high high very low medium
[25] low very low very low very low very low very low
[31] very low very high very high very high very high very high
[37] high very high very high very high very high very high
[43] medium very high very high very high medium medium
[49] low high high high very high very high
[55] high high very high medium very high high
[61] medium high very high very high very high very high
[67] high high very high high very low very high
[73] high high medium high high high
[79] medium very high very high very high high high
[85] high very low very high medium high very high
[91] high high very high high very low very low
[97] medium very low medium medium very high medium
[103] medium medium high low high very high
[109] medium very high medium very high low very high
[115] low very high high very low very low very low
[121] very low low very low very low very low very low
[127] very low very low medium very low very low low
[133] low very low very low low very low very low
[139] very low low low medium medium medium
[145] medium medium very low very low low very low
[151] low medium very low very low very low very low
[157] very high high very high high medium very high
[163] medium very low high medium high high
[169] very high high high very high very high high
[175] very high high high medium very high high
[181] high high very high very high very low high
[187] very high high high medium very high high
[193] very high high very high high very high high
[199] very high very low high very high very high very low
[205] very low medium very high medium low medium
[211] high medium very low medium very high high
[217] very high high very high high low high
[223] medium very high medium high high high
[229] very low high high high very high high
[235] medium medium very low very low very low very low
[241] medium low very low very low very low medium
[247] high very low very low medium very low very low
[253] very low very low very low very low very low very low
[259] very low very low very high medium very low very high
[265] medium very high medium low very high medium
[271] medium medium low high medium high
[277] very high medium very high very high medium medium
[283] very high high medium very high high medium
[289] low very low medium very low very low very low
[295] very low very low low very low very low very low
[301] very low very low very low very low very low low
[307] very low very low very low very low low very low
[313] very low low very low very low very low very low
[319] low very low very low very low very low very low
[325] low very low very low very low very low very low
[331] very low very low very low very low very low very low
[337] very low very low very low very low very low very low
[343] very low very low very low medium very low very low
[349] very low very low very low very low very low very low
[355] very low very low very low very low low very low
[361] very low very low very low very low very low very low
[367] very low very low very low very low low very low
[373] very low very low very high high very high very high
[379] very high high very high very high very high very high
[385] medium medium medium high very high high
[391] high high high high high high
[397] very high high very high medium high low
[403] high very high low very low medium very low
[409] medium low
Levels: very low low medium high very high
The cut() function used by cut_by_quantile() can automatically provide sensible labels for each category. The code to generate these labels is pretty complicated, so rather than appearing in the function signature directly, its labels argument defaults to NULL, and the calculation details are shown on the ?cut help page.
# Set the default for labels to NULL
cut_by_quantile <- function(x, n = 5, na.rm = FALSE, labels = NULL, interval_type) {
probs <- seq(0, 1, length.out = n + 1)
qtiles <- quantile(x, probs, na.rm = na.rm, names = FALSE)
right <- switch(interval_type, "(lo, hi]" = TRUE, "[lo, hi)" = FALSE)
cut(x, qtiles, labels = labels, right = right, include.lowest = TRUE)
}
# Remove the labels argument from the call
cut_by_quantile(
loomis$anvisits,
interval_type = "(lo, hi]"
)
[1] [0,1] [0,1] [0,1] [0,1] [0,1] [0,1] [0,1]
[8] [0,1] [0,1] [0,1] [0,1] [0,1] [0,1] [0,1]
[15] [0,1] [0,1] [0,1] [0,1] [0,1] (10,35] (35,350]
[22] (10,35] [0,1] (2,10] (1,2] [0,1] [0,1] [0,1]
[29] [0,1] [0,1] [0,1] (35,350] (35,350] (35,350] (35,350]
[36] (35,350] (10,35] (35,350] (35,350] (35,350] (35,350] (35,350]
[43] (2,10] (35,350] (35,350] (35,350] (2,10] (2,10] (1,2]
[50] (10,35] (10,35] (10,35] (35,350] (35,350] (10,35] (10,35]
[57] (35,350] (2,10] (35,350] (10,35] (2,10] (10,35] (35,350]
[64] (35,350] (35,350] (35,350] (10,35] (10,35] (35,350] (10,35]
[71] [0,1] (35,350] (10,35] (10,35] (2,10] (10,35] (10,35]
[78] (10,35] (2,10] (35,350] (35,350] (35,350] (10,35] (10,35]
[85] (10,35] [0,1] (35,350] (2,10] (10,35] (35,350] (10,35]
[92] (10,35] (35,350] (10,35] [0,1] [0,1] (2,10] [0,1]
[99] (2,10] (2,10] (35,350] (2,10] (2,10] (2,10] (10,35]
[106] (1,2] (10,35] (35,350] (2,10] (35,350] (2,10] (35,350]
[113] (1,2] (35,350] (1,2] (35,350] (10,35] [0,1] [0,1]
[120] [0,1] [0,1] (1,2] [0,1] [0,1] [0,1] [0,1]
[127] [0,1] [0,1] (2,10] [0,1] [0,1] (1,2] (1,2]
[134] [0,1] [0,1] (1,2] [0,1] [0,1] [0,1] (1,2]
[141] (1,2] (2,10] (2,10] (2,10] (2,10] (2,10] [0,1]
[148] [0,1] (1,2] [0,1] (1,2] (2,10] [0,1] [0,1]
[155] [0,1] [0,1] (35,350] (10,35] (35,350] (10,35] (2,10]
[162] (35,350] (2,10] [0,1] (10,35] (2,10] (10,35] (10,35]
[169] (35,350] (10,35] (10,35] (35,350] (35,350] (10,35] (35,350]
[176] (10,35] (10,35] (2,10] (35,350] (10,35] (10,35] (10,35]
[183] (35,350] (35,350] [0,1] (10,35] (35,350] (10,35] (10,35]
[190] (2,10] (35,350] (10,35] (35,350] (10,35] (35,350] (10,35]
[197] (35,350] (10,35] (35,350] [0,1] (10,35] (35,350] (35,350]
[204] [0,1] [0,1] (2,10] (35,350] (2,10] (1,2] (2,10]
[211] (10,35] (2,10] [0,1] (2,10] (35,350] (10,35] (35,350]
[218] (10,35] (35,350] (10,35] (1,2] (10,35] (2,10] (35,350]
[225] (2,10] (10,35] (10,35] (10,35] [0,1] (10,35] (10,35]
[232] (10,35] (35,350] (10,35] (2,10] (2,10] [0,1] [0,1]
[239] [0,1] [0,1] (2,10] (1,2] [0,1] [0,1] [0,1]
[246] (2,10] (10,35] [0,1] [0,1] (2,10] [0,1] [0,1]
[253] [0,1] [0,1] [0,1] [0,1] [0,1] [0,1] [0,1]
[260] [0,1] (35,350] (2,10] [0,1] (35,350] (2,10] (35,350]
[267] (2,10] (1,2] (35,350] (2,10] (2,10] (2,10] (1,2]
[274] (10,35] (2,10] (10,35] (35,350] (2,10] (35,350] (35,350]
[281] (2,10] (2,10] (35,350] (10,35] (2,10] (35,350] (10,35]
[288] (2,10] (1,2] [0,1] (2,10] [0,1] [0,1] [0,1]
[295] [0,1] [0,1] (1,2] [0,1] [0,1] [0,1] [0,1]
[302] [0,1] [0,1] [0,1] [0,1] (1,2] [0,1] [0,1]
[309] [0,1] [0,1] (1,2] [0,1] [0,1] (1,2] [0,1]
[316] [0,1] [0,1] [0,1] (1,2] [0,1] [0,1] [0,1]
[323] [0,1] [0,1] (1,2] [0,1] [0,1] [0,1] [0,1]
[330] [0,1] [0,1] [0,1] [0,1] [0,1] [0,1] [0,1]
[337] [0,1] [0,1] [0,1] [0,1] [0,1] [0,1] [0,1]
[344] [0,1] [0,1] (2,10] [0,1] [0,1] [0,1] [0,1]
[351] [0,1] [0,1] [0,1] [0,1] [0,1] [0,1] [0,1]
[358] [0,1] (1,2] [0,1] [0,1] [0,1] [0,1] [0,1]
[365] [0,1] [0,1] [0,1] [0,1] [0,1] [0,1] (1,2]
[372] [0,1] [0,1] [0,1] (35,350] (10,35] (35,350] (35,350]
[379] (35,350] (10,35] (35,350] (35,350] (35,350] (35,350] (2,10]
[386] (2,10] (2,10] (10,35] (35,350] (10,35] (10,35] (10,35]
[393] (10,35] (10,35] (10,35] (10,35] (35,350] (10,35] (35,350]
[400] (2,10] (10,35] (1,2] (10,35] (35,350] (1,2] [0,1]
[407] (2,10] [0,1] (2,10] (1,2]
Levels: [0,1] (1,2] (2,10] (10,35] (35,350]
If you use this capability, make sure to document how the argument behaves in the function’s help
When cutting up a numeric vector, you need to worry about what happens if a value lands exactly on a boundary. You can either put this value into a category of the lower interval or the higher interval. That is, you can choose your intervals to include values at the top boundary but not the bottom (in mathematical terminology, “open on the left, closed on the right”, or (lo, hi]). Or you can choose the opposite (“closed on the left, open on the right”, or [lo, hi)). cut_by_quantile() should allow these two choices.
The pattern for categorical defaults is:
function(cat_arg = c("choice1", "choice2")) {
cat_arg <- match.arg(cat_arg)
}
# Set the categories for interval_type to "(lo, hi]" and "[lo, hi)"
cut_by_quantile <- function(x, n = 5, na.rm = FALSE, labels = NULL,
interval_type = c("(lo, hi]" , "[lo, hi)")) {
# Match the interval_type argument
interval_type <- match.arg(interval_type)
probs <- seq(0, 1, length.out = n + 1)
qtiles <- quantile(x, probs, na.rm = na.rm, names = FALSE)
right <- switch(interval_type, "(lo, hi]" = TRUE, "[lo, hi)" = FALSE)
cut(x, qtiles, labels = labels, right = right, include.lowest = TRUE)
}
# Remove the interval_type argument from the call
cut_by_quantile(loomis$anvisits)
[1] [0,1] [0,1] [0,1] [0,1] [0,1] [0,1] [0,1]
[8] [0,1] [0,1] [0,1] [0,1] [0,1] [0,1] [0,1]
[15] [0,1] [0,1] [0,1] [0,1] [0,1] (10,35] (35,350]
[22] (10,35] [0,1] (2,10] (1,2] [0,1] [0,1] [0,1]
[29] [0,1] [0,1] [0,1] (35,350] (35,350] (35,350] (35,350]
[36] (35,350] (10,35] (35,350] (35,350] (35,350] (35,350] (35,350]
[43] (2,10] (35,350] (35,350] (35,350] (2,10] (2,10] (1,2]
[50] (10,35] (10,35] (10,35] (35,350] (35,350] (10,35] (10,35]
[57] (35,350] (2,10] (35,350] (10,35] (2,10] (10,35] (35,350]
[64] (35,350] (35,350] (35,350] (10,35] (10,35] (35,350] (10,35]
[71] [0,1] (35,350] (10,35] (10,35] (2,10] (10,35] (10,35]
[78] (10,35] (2,10] (35,350] (35,350] (35,350] (10,35] (10,35]
[85] (10,35] [0,1] (35,350] (2,10] (10,35] (35,350] (10,35]
[92] (10,35] (35,350] (10,35] [0,1] [0,1] (2,10] [0,1]
[99] (2,10] (2,10] (35,350] (2,10] (2,10] (2,10] (10,35]
[106] (1,2] (10,35] (35,350] (2,10] (35,350] (2,10] (35,350]
[113] (1,2] (35,350] (1,2] (35,350] (10,35] [0,1] [0,1]
[120] [0,1] [0,1] (1,2] [0,1] [0,1] [0,1] [0,1]
[127] [0,1] [0,1] (2,10] [0,1] [0,1] (1,2] (1,2]
[134] [0,1] [0,1] (1,2] [0,1] [0,1] [0,1] (1,2]
[141] (1,2] (2,10] (2,10] (2,10] (2,10] (2,10] [0,1]
[148] [0,1] (1,2] [0,1] (1,2] (2,10] [0,1] [0,1]
[155] [0,1] [0,1] (35,350] (10,35] (35,350] (10,35] (2,10]
[162] (35,350] (2,10] [0,1] (10,35] (2,10] (10,35] (10,35]
[169] (35,350] (10,35] (10,35] (35,350] (35,350] (10,35] (35,350]
[176] (10,35] (10,35] (2,10] (35,350] (10,35] (10,35] (10,35]
[183] (35,350] (35,350] [0,1] (10,35] (35,350] (10,35] (10,35]
[190] (2,10] (35,350] (10,35] (35,350] (10,35] (35,350] (10,35]
[197] (35,350] (10,35] (35,350] [0,1] (10,35] (35,350] (35,350]
[204] [0,1] [0,1] (2,10] (35,350] (2,10] (1,2] (2,10]
[211] (10,35] (2,10] [0,1] (2,10] (35,350] (10,35] (35,350]
[218] (10,35] (35,350] (10,35] (1,2] (10,35] (2,10] (35,350]
[225] (2,10] (10,35] (10,35] (10,35] [0,1] (10,35] (10,35]
[232] (10,35] (35,350] (10,35] (2,10] (2,10] [0,1] [0,1]
[239] [0,1] [0,1] (2,10] (1,2] [0,1] [0,1] [0,1]
[246] (2,10] (10,35] [0,1] [0,1] (2,10] [0,1] [0,1]
[253] [0,1] [0,1] [0,1] [0,1] [0,1] [0,1] [0,1]
[260] [0,1] (35,350] (2,10] [0,1] (35,350] (2,10] (35,350]
[267] (2,10] (1,2] (35,350] (2,10] (2,10] (2,10] (1,2]
[274] (10,35] (2,10] (10,35] (35,350] (2,10] (35,350] (35,350]
[281] (2,10] (2,10] (35,350] (10,35] (2,10] (35,350] (10,35]
[288] (2,10] (1,2] [0,1] (2,10] [0,1] [0,1] [0,1]
[295] [0,1] [0,1] (1,2] [0,1] [0,1] [0,1] [0,1]
[302] [0,1] [0,1] [0,1] [0,1] (1,2] [0,1] [0,1]
[309] [0,1] [0,1] (1,2] [0,1] [0,1] (1,2] [0,1]
[316] [0,1] [0,1] [0,1] (1,2] [0,1] [0,1] [0,1]
[323] [0,1] [0,1] (1,2] [0,1] [0,1] [0,1] [0,1]
[330] [0,1] [0,1] [0,1] [0,1] [0,1] [0,1] [0,1]
[337] [0,1] [0,1] [0,1] [0,1] [0,1] [0,1] [0,1]
[344] [0,1] [0,1] (2,10] [0,1] [0,1] [0,1] [0,1]
[351] [0,1] [0,1] [0,1] [0,1] [0,1] [0,1] [0,1]
[358] [0,1] (1,2] [0,1] [0,1] [0,1] [0,1] [0,1]
[365] [0,1] [0,1] [0,1] [0,1] [0,1] [0,1] (1,2]
[372] [0,1] [0,1] [0,1] (35,350] (10,35] (35,350] (35,350]
[379] (35,350] (10,35] (35,350] (35,350] (35,350] (35,350] (2,10]
[386] (2,10] (2,10] (10,35] (35,350] (10,35] (10,35] (10,35]
[393] (10,35] (10,35] (10,35] (10,35] (35,350] (10,35] (35,350]
[400] (2,10] (10,35] (1,2] (10,35] (35,350] (1,2] [0,1]
[407] (2,10] [0,1] (2,10] (1,2]
Levels: [0,1] (1,2] (2,10] (10,35] (35,350]
As a bonus, match.arg() handles throwing an error if the user types a value that wasn’t specified.
The harmonic mean is the reciprocal of the arithmetic mean of the reciprocal of the data. That is
harmonic_mean(x)=1/arithmetic_mean(1/x)
The harmonic mean is often used to average ratio data.
library(readr)
std_and_poor500 <- read_delim("sp500.csv", ";", escape_double = FALSE,
trim_ws = TRUE)
Parsed with column specification:
cols(
symbol = [31mcol_character()[39m,
company = [31mcol_character()[39m,
sector = [31mcol_character()[39m,
industry = [31mcol_character()[39m,
pe_ratio = [32mcol_double()[39m
)
# Look at the Standard and Poor 500 data
glimpse(std_and_poor500)
Rows: 505
Columns: 5
$ symbol [3m[90m<chr>[39m[23m "MMM", "ABT", "ABBV", "ABMD", "ACN", "ATVI", "ADBE", …
$ company [3m[90m<chr>[39m[23m "3M Company", "Abbott Laboratories", "AbbVie Inc.", "…
$ sector [3m[90m<chr>[39m[23m "Industrials", "Health Care", "Health Care", "Health …
$ industry [3m[90m<chr>[39m[23m "Industrial Conglomerates", "Health Care Equipment", …
$ pe_ratio [3m[90m<dbl>[39m[23m 18.31678, 57.66621, 22.43805, 45.63993, 27.00233, 20.…
# Write a function to calculate the reciprocal
get_reciprocal <- function(x) {
1/x
}
# Write a function to calculate the harmonic mean
calc_harmonic_mean <- function(x) {
x %>%
get_reciprocal() %>%
mean %>%
get_reciprocal()
}
std_and_poor500 %>%
# Group by sector
group_by(sector) %>%
# Summarize, calculating harmonic mean of P/E ratio
summarise(hmean_pe_ratio = calc_harmonic_mean(pe_ratio))
It looks like we have a problem though: most sectors have missing values.
Many sectors had an NA value for the harmonic mean. It would be useful for your function to be able to remove missing values before calculating.
we can modify the signature and body of calc_harmonic_mean() so it has an na.rm argument, defaulting to false, that gets passed to mean().
calc_harmonic_mean <- function(x, na.rm = FALSE) {
x %>%
get_reciprocal() %>%
mean(na.rm = na.rm) %>%
get_reciprocal()
}
std_and_poor500 %>%
# Group by sector
group_by(sector) %>%
# Summarize, calculating harmonic mean of P/E ratio
summarise(hmean_pe_ratio = calc_harmonic_mean(pe_ratio, na.rm = TRUE))
Using this metric, Real Estate is by far the most expensive sector.
Rather than explicitly giving calc_harmonic_mean() and na.rm argument, you can use … to simply “pass other arguments” to mean().
# Swap na.rm arg for ... in signature and body
calc_harmonic_mean <- function(x, ...) {
x %>%
get_reciprocal() %>%
mean(...) %>%
get_reciprocal()
}
std_and_poor500 %>%
# Group by sector
group_by(sector) %>%
# Summarize, calculating harmonic mean of P/E ratio
summarise(hmean_pe_ratio = calc_harmonic_mean(pe_ratio, na.rm = TRUE))
Using … doesn’t change how people use your function; it just means the function is more flexible. Whether flexible means better (or not) is up to you to decide.
If a user provides a bad input to a function, the best course of action is to throw an error letting them know. The two rules are
You can use the assert_*() functions from assertive to check inputs and throw errors when they fail.
library(assertive)
calc_harmonic_mean <- function(x, na.rm = FALSE) {
# Assert that x is numeric
assert_is_numeric(x)
x %>%
get_reciprocal() %>%
mean(na.rm = na.rm) %>%
get_reciprocal()
}
# See what happens when you pass it strings
calc_harmonic_mean(std_and_poor500$sector)
Error in calc_harmonic_mean(std_and_poor500$sector) :
is_numeric : x is not of class 'numeric'; it has class 'character'.
Sometimes the assert_*() functions in assertive don’t give the most informative error message. For example, the assertions that check if a number is in a numeric range will tell the user that a value is out of range, but the won’t say why that’s a problem. In that case, you can use the is_*() functions in conjunction with messages, warnings, or errors to define custom feedback.
calc_harmonic_mean <- function(x, na.rm = FALSE) {
assert_is_numeric(x)
# Check if any values of x are non-positive
if(any(is_non_positive(x), na.rm = TRUE)) {
# Throw an error
stop("x contains non-positive values, so the harmonic mean makes no sense.")
}
x %>%
get_reciprocal() %>%
mean(na.rm = na.rm) %>%
get_reciprocal()
}
# See what happens when you pass it negative numbers
calc_harmonic_mean(std_and_poor500$pe_ratio - 20)
Error in calc_harmonic_mean(std_and_poor500$pe_ratio - 20) :
x contains non-positive values, so the harmonic mean makes no sense.
We still need to provide some checks on the na.rm argument. This time, rather than throwing errors when the input is in an incorrect form, we are going to try to fix it.
na.rm should be a logical vector with one element (that is, TRUE, or FALSE).
# Update the function definition to fix the na.rm argument
calc_harmonic_mean <- function(x, na.rm = FALSE) {
assert_is_numeric(x)
if(any(is_non_positive(x), na.rm = TRUE)) {
stop("x contains non-positive values, so the harmonic mean makes no sense.")
}
# Use the first value of na.rm, and coerce to logical
na.rm <- coerce_to(use_first(na.rm), target_class = "logical")
x %>%
get_reciprocal() %>%
mean(na.rm = na.rm) %>%
get_reciprocal()
}
# See what happens when you pass it malformed na.rm
calc_harmonic_mean(std_and_poor500$pe_ratio, na.rm = 1:5)
Only the first value of na.rm (= 1) will be used.Coercing use_first(na.rm) to class ‘logical’.
[1] 18.23871
Sometimes, you don’t need to run through the whole body of a function to get the answer. In that case you can return early from that function using return().
Leap year is every 400th year (like the year 2000) or every 4th year that isn’t a century (like 1904 but not 1900 or 1905).
is_leap_year <- function(year) {
# If year is div. by 400 return TRUE
if(year %% 400 == 0) {
return(TRUE)
}
# If year is div. by 100 return FALSE
if(year %% 100 == 0) {
return(FALSE)
}
# If year is div. by 4 return TRUE
if (year %% 4 == 0){
TRUE
}
# Otherwise return FALSE
else {
FALSE
}
}
is_leap_year(year = 1900)
[1] FALSE
When the main purpose of a function is to generate output, like drawing a plot or printing something in the console, you may not want a return value to be printed as well. In that case, the value should be invisibly returned.
The base R plot function returns NULL, since its main purpose is to draw a plot. This isn’t helpful if you want to use it in piped code: instead it should invisibly return the plot data to be piped on to the next step.
Recall that plot() has a formula interface: instead of giving it vectors for x and y, you can specify a formula describing which columns of a data frame go on the x and y axes, and a data argument for the data frame. Note that just like lm(), the arguments are the wrong way round because the detail argument, formula, comes before the data argument.
plot(y ~ x, data = data)
# Using cars, draw a scatter plot of dist vs. speed
plt_dist_vs_speed <- plot(dist ~ speed, data = cars)
# Oh no! The plot object is NULL
plt_dist_vs_speed
NULL
# Define a pipeable plot fn with data and formula args
pipeable_plot <- function(data, formula) {
# Call plot() with the formula interface
plot(formula, data)
# Invisibly return the input dataset
invisible(data)
}
# Draw the scatter plot of dist vs. speed again
plt_dist_vs_speed <- cars %>%
pipeable_plot(dist ~ speed)
# Now the plot object has a value
plt_dist_vs_speed
Return values are usually desirable (so you can use the objects in later code), even if you don’t want them printing to the console.
library(broom)
library(zeallot)
# Look at the structure of model (it's a mess!)
str(model)
List of 31
$ coefficients : Named num [1:4] 5.232 0.307 -0.261 -0.946
..- attr(*, "names")= chr [1:4] "(Intercept)" "gender" "income" "travel"
$ residuals : 'labelled' Named num [1:342] -0.72 -0.86 -0.926 -0.81 -0.892 ...
..- attr(*, "label")= chr "annual visits to park"
..- attr(*, "format")= chr "%8.0g"
..- attr(*, "names")= chr [1:342] "25" "26" "27" "29" ...
$ fitted.values : Named num [1:342] 7.14 7.14 13.52 5.25 9.27 ...
..- attr(*, "names")= chr [1:342] "25" "26" "27" "29" ...
$ effects : Named num [1:342] -357.82 -30.23 34.15 59.34 -2.41 ...
..- attr(*, "names")= chr [1:342] "(Intercept)" "gender" "income" "travel" ...
$ R : num [1:4, 1:4] -97.4 0 0 0 -149.8 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:4] "(Intercept)" "gender" "income" "travel"
.. ..$ : chr [1:4] "(Intercept)" "gender" "income" "travel"
$ rank : int 4
$ qr :List of 5
..$ qr : num [1:342, 1:4] -97.3858 0.0274 0.0378 0.0235 0.0313 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:342] "25" "26" "27" "29" ...
.. .. ..$ : chr [1:4] "(Intercept)" "gender" "income" "travel"
..$ rank : int 4
..$ qraux: num [1:4] 1.03 1.02 1.05 1.05
..$ pivot: int [1:4] 1 2 3 4
..$ tol : num 1e-11
..- attr(*, "class")= chr "qr"
$ family :List of 12
..$ family : chr "poisson"
..$ link : chr "log"
..$ linkfun :function (mu)
..$ linkinv :function (eta)
..$ variance :function (mu)
..$ dev.resids:function (y, mu, wt)
..$ aic :function (y, n, mu, wt, dev)
..$ mu.eta :function (eta)
..$ initialize: expression({ if (any(y < 0)) stop("negative values not allowed for the 'Poisson' family") n <- rep.int(1, nobs| __truncated__
..$ validmu :function (mu)
..$ valideta :function (eta)
..$ simulate :function (object, nsim)
..- attr(*, "class")= chr "family"
$ linear.predictors: Named num [1:342] 1.97 1.97 2.6 1.66 2.23 ...
..- attr(*, "names")= chr [1:342] "25" "26" "27" "29" ...
$ deviance : num 11959
$ aic : num 13289
$ null.deviance : num 18630
$ iter : int 6
$ weights : Named num [1:342] 7.14 7.14 13.52 5.25 9.27 ...
..- attr(*, "names")= chr [1:342] "25" "26" "27" "29" ...
$ prior.weights : Named num [1:342] 1 1 1 1 1 1 1 1 1 1 ...
..- attr(*, "names")= chr [1:342] "25" "26" "27" "29" ...
$ df.residual : int 338
$ df.null : int 341
$ y : 'labelled' Named int [1:342] 2 1 1 1 1 1 80 104 55 350 ...
..- attr(*, "label")= chr "annual visits to park"
..- attr(*, "format")= chr "%8.0g"
..- attr(*, "names")= chr [1:342] "25" "26" "27" "29" ...
$ converged : logi TRUE
$ boundary : logi FALSE
$ model :'data.frame': 342 obs. of 4 variables:
..$ anvisits: 'labelled' int [1:342] 2 1 1 1 1 1 80 104 55 350 ...
.. ..- attr(*, "label")= chr "annual visits to park"
.. ..- attr(*, "format")= chr "%8.0g"
..$ gender : 'labelled' int [1:342] 2 2 1 1 2 1 2 2 1 2 ...
.. ..- attr(*, "label")= chr "1=male:0=female"
.. ..- attr(*, "format")= chr "%8.0g"
..$ income : int [1:342] 4 4 4 4 3 1 1 4 2 2 ...
.. ..- attr(*, "format")= chr "%9.0g"
..$ travel : int [1:342] 3 3 2 3 3 1 1 1 2 1 ...
.. ..- attr(*, "format")= chr "%9.0g"
..- attr(*, "terms")=Classes 'terms', 'formula' language anvisits ~ gender + income + travel
.. .. ..- attr(*, "variables")= language list(anvisits, gender, income, travel)
.. .. ..- attr(*, "factors")= int [1:4, 1:3] 0 1 0 0 0 0 1 0 0 0 ...
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : chr [1:4] "anvisits" "gender" "income" "travel"
.. .. .. .. ..$ : chr [1:3] "gender" "income" "travel"
.. .. ..- attr(*, "term.labels")= chr [1:3] "gender" "income" "travel"
.. .. ..- attr(*, "order")= int [1:3] 1 1 1
.. .. ..- attr(*, "intercept")= int 1
.. .. ..- attr(*, "response")= int 1
.. .. ..- attr(*, ".Environment")=<environment: 0x5611cb7ffc08>
.. .. ..- attr(*, "predvars")= language list(anvisits, gender, income, travel)
.. .. ..- attr(*, "dataClasses")= Named chr [1:4] "numeric" "numeric" "numeric" "numeric"
.. .. .. ..- attr(*, "names")= chr [1:4] "anvisits" "gender" "income" "travel"
..- attr(*, "na.action")= 'omit' Named int [1:68] 1 2 3 4 5 6 7 8 9 10 ...
.. ..- attr(*, "names")= chr [1:68] "1" "2" "3" "4" ...
$ na.action : 'omit' Named int [1:68] 1 2 3 4 5 6 7 8 9 10 ...
..- attr(*, "names")= chr [1:68] "1" "2" "3" "4" ...
$ call : language glm(formula = formula, family = poisson, data = data)
$ formula :Class 'formula' language anvisits ~ gender + income + travel
.. ..- attr(*, ".Environment")=<environment: 0x5611cb7ffc08>
$ terms :Classes 'terms', 'formula' language anvisits ~ gender + income + travel
.. ..- attr(*, "variables")= language list(anvisits, gender, income, travel)
.. ..- attr(*, "factors")= int [1:4, 1:3] 0 1 0 0 0 0 1 0 0 0 ...
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : chr [1:4] "anvisits" "gender" "income" "travel"
.. .. .. ..$ : chr [1:3] "gender" "income" "travel"
.. ..- attr(*, "term.labels")= chr [1:3] "gender" "income" "travel"
.. ..- attr(*, "order")= int [1:3] 1 1 1
.. ..- attr(*, "intercept")= int 1
.. ..- attr(*, "response")= int 1
.. ..- attr(*, ".Environment")=<environment: 0x5611cb7ffc08>
.. ..- attr(*, "predvars")= language list(anvisits, gender, income, travel)
.. ..- attr(*, "dataClasses")= Named chr [1:4] "numeric" "numeric" "numeric" "numeric"
.. .. ..- attr(*, "names")= chr [1:4] "anvisits" "gender" "income" "travel"
$ data :'data.frame': 410 obs. of 11 variables:
..$ anvisits: 'labelled' int [1:410] NA NA NA NA NA NA NA NA NA NA ...
.. ..- attr(*, "label")= chr "annual visits to park"
.. ..- attr(*, "format")= chr "%8.0g"
..$ gender : 'labelled' int [1:410] 1 1 1 2 1 2 2 2 1 1 ...
.. ..- attr(*, "label")= chr "1=male:0=female"
.. ..- attr(*, "format")= chr "%8.0g"
..$ income : int [1:410] 4 2 4 2 4 2 4 4 4 4 ...
.. ..- attr(*, "format")= chr "%9.0g"
..$ income1 : 'labelled' int [1:410] 0 0 0 0 0 0 0 0 0 0 ...
.. ..- attr(*, "label")= chr "<=$25000"
.. ..- attr(*, "format")= chr "%8.0g"
..$ income2 : 'labelled' int [1:410] 0 1 0 1 0 1 0 0 0 0 ...
.. ..- attr(*, "label")= chr ">$25000 - $55000"
.. ..- attr(*, "format")= chr "%8.0g"
..$ income3 : 'labelled' int [1:410] 0 0 0 0 0 0 0 0 0 0 ...
.. ..- attr(*, "label")= chr ">$55000 - $95000"
.. ..- attr(*, "format")= chr "%8.0g"
..$ income4 : 'labelled' int [1:410] 1 0 1 0 1 0 1 1 1 1 ...
.. ..- attr(*, "label")= chr ">$95000"
.. ..- attr(*, "format")= chr "%8.0g"
..$ travel : int [1:410] NA NA NA NA NA NA NA NA NA NA ...
.. ..- attr(*, "format")= chr "%9.0g"
..$ travel1 : 'labelled' int [1:410] NA NA NA NA NA NA NA NA NA NA ...
.. ..- attr(*, "label")= chr "<.25 hrs"
.. ..- attr(*, "format")= chr "%8.0g"
..$ travel2 : 'labelled' int [1:410] NA NA NA NA NA NA NA NA NA NA ...
.. ..- attr(*, "label")= chr ">=.25 - <4 hrs"
.. ..- attr(*, "format")= chr "%8.0g"
..$ travel3 : 'labelled' int [1:410] NA NA NA NA NA NA NA NA NA NA ...
.. ..- attr(*, "label")= chr ">=4 hrs"
.. ..- attr(*, "format")= chr "%8.0g"
..- attr(*, "stata.info")=List of 5
.. ..$ datalabel : chr ""
.. ..$ version : int 10
.. ..$ time.stamp: chr " 1 Jul 2010 17:34"
.. ..$ val.labels: chr [1:11] "" "" "" "" ...
.. ..$ NA : NULL
$ offset : NULL
$ control :List of 3
..$ epsilon: num 1e-08
..$ maxit : num 25
..$ trace : logi FALSE
$ method : chr "glm.fit"
$ contrasts : NULL
$ xlevels : Named list()
- attr(*, "class")= chr [1:2] "glm" "lm"
# Use broom tools to get a list of 3 data frames
list(
# Get model-level values
model = glance(model),
# Get coefficient-level values
coefficients = tidy(model),
# Get observation-level values
observations = augment(model)
)
$model
$coefficients
$observations
NA
# Wrap this code into a function, groom_model
groom_model <- function(model){
list(
model = glance(model),
coefficients = tidy(model),
observations = augment(model)
)
}
# Call groom_model on model, assigning to 3 variables
c(mdl, cff, obs) %<-% groom_model(model)
# See these individual variables
mdl; cff; obs
Returning many values is as easy as collecting them into a list. The groomed model has data frames that are easy to program against.
Sometimes you want the return multiple things from a function, but you want the result to have a particular class (for example, a data frame or a numeric vector), so returning a list isn’t appropriate. This is common when you have a result plus metadata about the result. (Metadata is “data about the data”. For example, it could be the file a dataset was loaded from, or the username of the person who created the variable, or the number of iterations for an algorithm to converge.)
In that case, you can store the metadata in attributes. Recall the syntax for assigning attributes is as follows.
attr(object, "attribute_name") <- attribute_value
pipeable_plot <- function(data, formula) {
plot(formula, data)
# Add a "formula" attribute to data
attr(data, "formula") <- formula
invisible(data)
}
# From previous exercise
plt_dist_vs_speed <- cars %>%
pipeable_plot(dist ~ speed)
# Examine the structure of the result
str(plt_dist_vs_speed)
'data.frame': 50 obs. of 2 variables:
$ speed: num 4 4 7 7 8 9 10 10 10 11 ...
$ dist : num 2 10 4 22 16 10 18 26 34 17 ...
- attr(*, "formula")=Class 'formula' language dist ~ speed
.. ..- attr(*, ".Environment")=<environment: 0x5611cba2a698>
We can include metadata in the return value by storing it as attributes.
Environments are used to store other variables. Mostly, you can think of them as lists, but there’s an important extra property that is relevant to writing functions. Every environment has a parent environment (except the empty environment, at the root of the environment tree). This determines which variables R know about at different places in your code.
# Add capitals, national_parks, & population to a named list
rsa_lst <- list(
capitals = capitals,
national_parks = national_parks,
population = population
)
# List the structure of each element of rsa_lst
ls.str(rsa_lst)
# Convert the list to an environment
rsa_env <- list2env(rsa_lst)
# List the structure of each variable
ls.str(rsa_env)
# Find the parent environment of rsa_env
parent <- parent.env(rsa_env)
# Print its name
environmentName(parent)
The parent of the environment you defined is the global environment.
If R cannot find a variable in the current environment, it will look in the parent environment, then the grandparent environment, and so on until it finds it.
# Compare the contents of the global environment and rsa_env
ls.str(globalenv())
ls.str(rsa_env)
# Does population exist in rsa_env?
exists("population", envir = rsa_env)
# Does population exist in rsa_env, ignoring inheritance?
exists("population", envir = rsa_env, inherits = FALSE)
R searches for variables in all the parent environments, unless you explicitly tell it not to.
x_times_y <- function(x) {
x * y
}
x_times_y(10)
[1] 40
x_times_y <- function(x) {
x * y
}
y <- 4
x_times_y(10)
[1] 40
x_times_y <- function(x) {
x * y
}
y <- 4
x_times_y(10)
[1] 40
print(x)
Error in print(x) : object 'x' not found
x_times_y <- function(x) {
y <- 6
x * y
}
y <- 4
x_times_y(10)
[1] 60
x_times_y <- function(x) {
x <- 9
y <- 6
x * y
}
y <- 4
x_times_y(10)
[1] 54