Today you will get some practice writing functions. Load package tidyverse
.
library(tidyverse)
Below is a template for how R functions are structured. The three main components are
my_fcn_name <- function(arg_1, arg_2, arg_3) {
# body
# of
# the
# function
# goes
# here
return(r_object) # object to return
}
viewr()
Your goal is to create a function called viewr()
that will take a data frame as its data input and return a randomly specified number of rows and specified number of columns.
The following two functions will be useful in your development of viewr()
.
dim()
Function dim()
retrieves the dimension of an object. For example,
dim(mtcars)
dim(diamonds)
gives the number of rows and columns of the data frames mtcars
and diamonds
as a vector of length two.
sample_n()
Function sample_n()
is a function in package dplyr
that randomly selects rows from a data frame. For example
randomly selects five rows from mtcars
.
The first step in building any function is to get your code working before turning it into a function.
Use a sequence of dplyr
functions to select the first 3 columns of diamonds
and randomly pick 7 rows.
Modify your above sequence to select all the columns and randomly pick 20 rows.
Use your code in Step 1 as a template to create a function named viewr_1()
. The function should have three arguments:
df
: will take a data frame as its inputnrow
: will take a positive integer as its inputncol
: will take a positive integer as its inputThe function will return part of the data frame provided as an input, but with a randomly chosen number of rows specified by nrow
and the first ncol
columns as specified.
After you write your function, test your function with the following calls:
viewr_1(df = diamonds, nrow = 4, ncol = 3)
viewr_1(df = diamonds, nrow = 10, ncol = 10)
viewr_1(df = diamonds, nrow = 10, ncol = 11)
You should have received an error when you ran the third line. Data set diamonds
only has 10 variables, so it cannot select 11 columns. You will fix this in step 3.
Modify viewr_1()
to include an if-else statement. If ncol
does not exceed the number of columns in the data frame, then the function should run as normal. Otherwise, the function should output a message with cat("The data frame does not have that many columns!")
. Call this new function viewr_2()
Test viewr_2()
with the following calls:
viewr_2(df = diamonds, nrow = 4, ncol = 3)
viewr_2(df = diamonds, nrow = 10, ncol = 10)
viewr_2(df = diamonds, nrow = 10, ncol = 11)
viewr_2(df = diamonds, nrow = 10000000, ncol = 3)
You should have received an error when you ran the fourth line. Data set diamonds
does not have 10000000 rows. To fix this, proceed to Step 4.
Modify viewr_1()
to include a stop-if-not statement with function stopifnot()
. Call this new function viewr().
If any of the expressions are not all true, stop is called, producing an error message indicating the first expression which was not true. Separate conditions with a comma. Below is an example.
ranger <- function(x){
# check if x is a numeric vector of length at least 2
stopifnot(is.numeric(x), length(x)>=2)
value <- max(x) - min(x)
return(value)
}
# function test
ranger(x = 1:10)
ranger(x = 3)
ranger(x = letters)
Using stopifnot()
is more convenient than including multiple if-else statements.
Test viewr()
with the following calls:
viewr(df = diamonds, nrow = 4, ncol = 3)
viewr(df = diamonds, nrow = 10, ncol = 11)
viewr(df = diamonds, nrow = 10000000, ncol = 3)
limit_e()
Create a function called limit_e()
that takes one argument, n
. Argument n
shoud be a vector of integers greater than zero. Function limit_e()
will compute and return the evaluated quantity
\[\bigg(1 + \frac{1}{n}\bigg)^n.\]
From calculus you know that the mathematical constant \(e\) is defined as
\[e = \lim_{n \to \infty}\bigg(1 + \frac{1}{n}\bigg)^n.\]
After you write your function, test it with the following function calls. What do you notice happens? Why is this happening?
limit_e(100)
limit_e(1000000)
limit_e(c(1, 1000000, 100000000000))
limit_e(c(1, 1000000, 1000000000000000000))
R, as does most software, uses floating point arithmetic, which is not the same as the arithmetic we learn. Computers cannot represent all numbers exactly.
For your mind to be blown, run the following examples.
# example 1
0.2 == 0.6 / 3
# example 2
point3 <- c(0.3, 0.4 - 0.1, 0.5 - 0.2, 0.6 - 0.3, 0.7 - 0.4)
point3
point3 == 0.3
To work around these issues, use all.equal()
for checking the equality of two numeric quantities in R.
# example 1, all.equal()
all.equal(0.2, 0.6 / 3)
# example 2, all.equal()
point3 <- c(0.3, 0.4 - 0.1, 0.5 - 0.2, 0.6 - 0.3, 0.7 - 0.4)
point3
all.equal(point3, rep(.3, length(point3)))