From an R script in RStudio (a file with extension .R), you can send code for execution.
Use the ‘Run’ arrow to execute the commands that have been selected.
The key combination CTRL+ENTER can also be used to execute the commands that have been selected.
Find the following windows in RStudio. What is their purpose?
Control flows connect several building blocks of your code together.
A loop can be used to repeat the same portion of code (or block of code) a number of times.
In an algorithm, logical operations are used to decide whether a loop should continue or not.
An ‘if’ statement is used to execute a command if some condition is satisfied.
A ‘for’ loop is used to repeat a building block a pre-determined number of times.
A ‘while’ loop is used to repeat a building block until some condition fails.
Note: We will come back to the notion of ‘Algorithm’ in much more detail in Week 7.
x <- 3
y <- 2
if (x<=y) {
print("x smaller than y")
} else {
print("x larger than y")
}
## [1] "x larger than y"
Syntax of a ‘for’ loop (m,n are integers): for (i in m:n){}
The commands between the braces {} are executed for all values of i starting at i=m up to i=n. Note i is just a name here (it can be anything).
x <- c(1,3,7,2)
for (i in 1:4){print(x[i])}
## [1] 1
## [1] 3
## [1] 7
## [1] 2
for (k in 2:4){
x[k-1] <- x[k]
}
x
## [1] 3 7 2 2

Suppose you open a new bank account. At time \(0\), you deposit \(\$500\) into your account. Your account earns \(i\%\) per annum of interest in year \(i\) (i.e. \(1\%\) in the first year, \(2\%\) in the second year, etc.)
What would be your account balance at the end of the 10th year? Use a ‘for’ loop in your solution.
balance <- 500
for (i in 1:10) {
balance <- balance*(1 + i/100)
}
balance
## [1] 850.9107
Syntax of a ‘while’ loop: while (condition){}
The commands between the braces {} are executed as long as condition is TRUE.
x <- 2
y <- 1
while(x+y<6){
x<-x+y
print(x+y)}
## [1] 4
## [1] 5
## [1] 6
x
## [1] 5
# (x=2,y=1) -> (x+y=3<6) -> (x=3,y=1) -> (x+y=4<6) ->
#(x=4,y=1) -> (x+y=5<6) -> (x=5,y=1) -> (x+y=6) -> End

With the help of function rpois(n=1, lambda = 2),
generate a series of random variables \(X_1,
X_2, X_3, \ldots\) until one of them is equal to \(5\). Store those random variables and
display the full sequence.
X <- rpois(n=1, lambda = 2) # We generate our first random variable
X.vector <- X # To start with, our sequence only contains one result (X)
while (X != 5) {
X <- rpois(n=1, lambda = 2) # We generate a new random variable
X.vector <- c(X.vector,X) # We add it to the sequence
}
X.vector
## [1] 0 0 1 0 4 3 2 1 1 1 2 3 2 2 6 0 4 4 3 1 1 1 6 0 3 2 1 1 1 4 1 3 2 3 1 4 1 0
## [39] 2 2 2 3 2 3 0 0 3 2 1 3 0 6 3 3 1 1 0 2 3 0 0 4 2 1 5
Loops can be inefficient in R and, where possible, you should replace them with vectorised operations.
Other loop controls exist: break can be used to exit a loop, and next moves to the next iteration within a loop. See more info and examples on Pages 119-120.
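As a small illustration of break and next, here is a minimal sketch. The vector v and the variable total are made-up names and data, chosen only for this example.

```r
v <- c(4, -1, 7, 0, 9, 3)   # made-up data for illustration
total <- 0
for (value in v) {
  if (value < 0) next       # skip negative values and move to the next iteration
  if (value == 0) break     # leave the loop entirely at the first zero
  total <- total + value
}
total
## [1] 11
```

Only 4 and 7 are added: -1 is skipped by next, and the loop stops at 0 before reaching 9 and 3.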
x <- c(1,2,3)
y <- c(4,5,6)
x+y
## [1] 5 7 9
M <- matrix(1:9, nrow=3)
exp(M)
## [,1] [,2] [,3]
## [1,] 2.718282 54.59815 1096.633
## [2,] 7.389056 148.41316 2980.958
## [3,] 20.085537 403.42879 8103.084
x <- runif(5000000) # Generate 5 million random elements
z <- 0;
system.time(for(i in 1:5000000){z <- z + x[i]})
## user system elapsed
## 0.073 0.000 0.073
z
## [1] 2500198
system.time(zz <- sum(x))
## user system elapsed
## 0.008 0.000 0.008
zz
## [1] 2500198
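As a further sketch of replacing a loop with a vectorised calculation, the bank account balance computed earlier with a ‘for’ loop can also be obtained with prod(). The names rates and balance_vec are ours, chosen for illustration.

```r
# Growth factors for years 1 to 10: 1.01, 1.02, ..., 1.10
rates <- 1 + (1:10)/100
# Multiply the initial deposit by all growth factors at once
balance_vec <- 500 * prod(rates)
balance_vec
## [1] 850.9107
```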
A function in R is defined by its name and by the list of its parameters (or arguments). Most functions output a value.
Using a function (or calling or executing it) is done by typing
its name followed, in brackets, by the list of arguments to be used.
Arguments are separated by commas. Each argument can be followed by the
sign = and the value to be given to the argument.
functionname(arg1 = value1, arg2 = value2, arg3 = value3)
Note that you do not necessarily need to indicate the names of the arguments, but only their values, as long as you follow their order.
For any R function, some arguments must be specified and others are optional (because a default value is already given in the code of the function).
Can you name some functions you already know and that we have seen?
Some functions don’t have arguments!
factorial(6)
## [1] 720
date()
## [1] "Mon Jun 19 02:03:04 2023"
The function log(x, base = exp(1)) can take two
arguments: x (its value must be specified) and
base (optional, because a default value is provided as
exp(1)).
You can call a function by playing with the arguments in several different ways. This is an important feature of R which makes it easier to use. All the following commands execute the same calculation:
log(3)
## [1] 1.098612
log(x = 3)
## [1] 1.098612
log(x = 3, base = exp(1))
## [1] 1.098612
log(x = 3, exp(1))
## [1] 1.098612
log(3, base = exp(1))
## [1] 1.098612
log(3, exp(1))
## [1] 1.098612
log(base = exp(1), 3)
## [1] 1.098612
log(base = exp(1), x = 3)
## [1] 1.098612
An important part of coding in R is creating your own functions.
Creating a function is done following the general syntax:
function(<list of arguments>){<body of the function>},
where
<list of arguments> is a list of named
arguments (also called formal arguments);
<body of the function> represents, as the name
suggests, the contents of the code to execute when the function is
called.
To execute it, the user needs to call the function, followed by the
effective arguments listed between brackets () and
separated by commas. Here an effective argument is the value assigned to
a formal argument.
# This line creates a function called 'hello' with one argument called 'name'
hello <- function(name){cat("Hello, my dear", name, "!")}
# This line executes the function, with the effective argument 'Josephine'
hello(name = "Josephine")
## Hello, my dear Josephine !
Note: R allows calling a function without typing in the complete name of a formal argument:
hello(na="Jinxia")
## Hello, my dear Jinxia !
hello(n="Samantha")
## Hello, my dear Samantha !
The body of a function can be a simple R instruction, or a sequence of R instructions. In the latter case, the instructions must be enclosed between the characters { and } to delimit the beginning and end of the body of the function.
Several R instructions can be written on the same line as long as they are separated by a semicolon ‘;’
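The two remarks above can be sketched as follows. The function names square and hypotenuse are hypothetical, introduced only for this example.

```r
# Body made of a single instruction: the braces are optional
square <- function(x) x^2
# Body made of several instructions on one line, separated by ';'
hypotenuse <- function(a, b){s <- a^2 + b^2; sqrt(s)}
square(4)
## [1] 16
hypotenuse(3, 4)
## [1] 5
```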
Create a function hello() in R, such that
there is a single argument called name
it returns ‘Hello, my dear’ followed by the name in capital letters followed by ‘ !’.
Target output
>hello("Peter")
Hello, my dear PETER !
Hint: Use function toupper().
hello <- function(name){
# Convert the name to upper case.
name <- toupper(name)
cat("Hello, my dear", name, "!")
}
hello("Peter")
## Hello, my dear PETER !
Of course, a function can have more than one argument. Here, function
CDF.pois() has two arguments, x and
lambda. It calculates the CDF \(F_X(x)\) at x of a Poisson
random variable with parameter equal to lambda. Note
the use of a for loop.
CDF.pois <- function(x, lambda){
# Initialise the cdf to 0
cdf = 0
# For k from 0 to x, add together the probability masses p(k)
for (k in 0:x){
cdf = cdf + exp(-lambda)*lambda^k/factorial(k)
}
# Return the result
return(cdf)
}
CDF.pois(x = 3, lambda = 4)
## [1] 0.4334701
Note: we have every right to use a function within a function. For
instance, here we used the (already defined) function
factorial() inside our new function
CDF.pois().
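For comparison, here is a sketch of the same calculation without an explicit loop, using the built-in functions dpois() and ppois(). The name cdf_pois_vec is hypothetical and not part of the original material.

```r
# Same Poisson CDF without an explicit loop
cdf_pois_vec <- function(x, lambda){
  # dpois() returns the probability masses p(0), ..., p(x) as a vector
  sum(dpois(0:x, lambda))
}
cdf_pois_vec(x = 3, lambda = 4)
## [1] 0.4334701
# The built-in function ppois() gives the same value
all.equal(cdf_pois_vec(3, 4), ppois(3, lambda = 4))
## [1] TRUE
```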
Code a function which takes two arguments \(n\) and \(p\) and calculates the binomial coefficient \[{n \choose p}=\frac{n!}{p!(n-p)!}\]
Test your function by evaluating the result of \[{5 \choose 3}\] which should yield \(10\).
binomial <- function(n,p) {factorial(n)/(factorial(p)*factorial(n-p))}
binomial(5,3)
## [1] 10
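A hand-coded function like this can be checked against the built-in function choose(). In the sketch below the function is named binom_coef (a hypothetical name), because binomial is also the name of an existing function in the stats package, which a user-defined binomial in the workspace would mask.

```r
# Hand-coded binomial coefficient under a name that does not clash with stats::binomial
binom_coef <- function(n, p) {factorial(n)/(factorial(p)*factorial(n-p))}
# Compare with the built-in choose()
all.equal(binom_coef(5, 3), choose(5, 3))
## [1] TRUE
```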
When declaring a function, all arguments are identified by a unique name.
Each argument can be associated with a default value. To specify a default value, use the character = followed by the default value.
When the function is called with no effective argument for that argument, the default value will be used.
# Declare function 'binomial' with default values
binomial <- function(n=5,p=3){factorial(n)/(factorial(p)*factorial(n-p))}
binomial() # Use both default values
## [1] 10
binomial(n=6) # Specify first argument, but use default value for the second
## [1] 20
A function can return its result with the instruction return(). This instruction halts the execution of the code in the body of the function and returns the object between brackets.
binomial <- function(n=5,p=3){
return(factorial(n)/(factorial(p)*factorial(n-p)))
# Everything below will not be executed, because we returned the first part
my.unif <- runif(1)
while (my.unif <= 1){ # Note that this part would be an infinite loop! Beware of those!
my.unif <- runif(1)
}
}
If there is no ‘return()’ in the body of the function, then the function will return the result of the last evaluated expression.
Variables defined inside the body of a function have a local scope during function execution. This means that a variable inside the body of a function is physically different from another variable with the same name, but defined in the work space of your R session.
Generally speaking, local scope means that a variable only exists inside the body of the function. After the execution of the function, the variable is thus automatically deleted from the memory of the computer.
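The following minimal sketch illustrates local scope. The names scope_demo, local_only and inside_value are hypothetical, chosen only for this example.

```r
x <- 10                      # variable defined in the work space
scope_demo <- function() {
  x <- 99                    # local x, different from the x in the work space
  local_only <- "inside"     # exists only while the function is running
  x                          # value returned by the function
}
inside_value <- scope_demo()
inside_value
## [1] 99
x                            # the work space variable is unchanged
## [1] 10
exists("local_only")         # the local variable is gone after the call
## [1] FALSE
```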
Create a function in R that calculates the present value of an annuity (paying \(1\) per year). The inputs are
the number of years, which is by default \(1\)
whether the payments are paid in arrears or not, which is by default TRUE
the annual interest rate, which is by default \(6\%\)
Note: recall that the present value of an annuity that pays \(1\) at the end of each year for \(n\) years is \[\frac{1-(1+i)^{-n}}{i}.\] If payments occur at the beginning of the year (rather than in arrears), then the present value is \[(1+i)\frac{1-(1+i)^{-n}}{i}.\]
annuity_cal <- function(n=1, arrears=TRUE, i=0.06){
discount <- 1/(1+i)
temp <- (1-discount^n)/i
if (arrears){
return(temp)
}
else{
return(temp*(1+i))
}
}
annuity_cal()
## [1] 0.9433962
annuity_cal(a=FALSE)
## [1] 1
annuity_cal(n=5, a=FALSE, i=0.07)
## [1] 4.387211
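As a quick numerical check, the closed-form expression used in annuity_cal() can be compared with the explicit sum of discounted payments. This is only a sketch; the names pv_sum, pv_formula and pv_due_sum are ours.

```r
n <- 5; i <- 0.07
# Present value as an explicit sum of payments of 1 at times 1, ..., n
pv_sum <- sum((1 + i)^(-(1:n)))
# Closed-form expression used in annuity_cal()
pv_formula <- (1 - (1 + i)^(-n))/i
all.equal(pv_sum, pv_formula)
## [1] TRUE
# Payments at the beginning of each year: times 0, ..., n-1
pv_due_sum <- sum((1 + i)^(-(0:(n-1))))
all.equal(pv_due_sum, (1 + i)*pv_formula)
## [1] TRUE
```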
Create a function in R that plots the density/distribution function of a normal random variable. The arguments are
mean \(\mu\), which is by default 0
variance \(\sigma^2\), which is by default 1
whether a density function is plotted, which is by default TRUE; if FALSE, then the cumulative distribution function is plotted
The output is either the density or the distribution function over the range \((\mu-4\sigma,\mu+4\sigma)\).
Hint: You will need functions dnorm() and
pnorm() as well as function plot().
Note: There is more to come about graphical tools in Week 5.
plot_norm <- function(mean=0, variance=1, density=TRUE) {
temp <- seq(from=mean-4*sqrt(variance), to=mean+4*sqrt(variance), by=sqrt(variance)/50)
if (density) # Note writing 'if (density)' is equivalent to writing 'if (density == TRUE)'
plot(temp, dnorm(temp, mean, sqrt(variance)))
else
plot(temp, pnorm(temp, mean, sqrt(variance)))
}
plot_norm()
plot_norm(1,4,TRUE)
plot_norm(1,4,FALSE)