A for(i in x) loop is a convenient way to execute a block of code repeatedly, once for each element of x. In this recitation, we will (1) execute basic loops, (2) create basic functions, and (3) demonstrate the Law of Large Numbers and the Central Limit Theorem using loops.
Relevant functions: set.seed(), rnorm(), for(i in x), sample(), cat(), print(), replicate(), sqrt(), prod().
rnorm()
set.seed(150) # Setting the seed for replication purposes
myData <- rnorm(1000,45,15) # Drawing 1000 random observations from a normal distribution (mean=45, sd=15)
length(), mean() and sd()
length(myData) # How many observations?
## [1] 1000
mean(myData) # What is the mean?
## [1] 44.52407
sd(myData) # What is the standard deviation?
## [1] 14.85105
for(i in x)
sample() and print()
set.seed(300) # Setting the seed for replication purposes
for (i in 1:5) # Specifying the number of iterations
{
obs <- sample(myData,size=1) # Sampling one observation from the myData vector and storing it into the "obs" object
print(obs) # Printing the value of that "obs" object
cat("I have finished", i,"iterations \n") # Printing a string of characters after each iteration
}
## [1] 51.39313
## I have finished 1 iterations
## [1] 46.5734
## I have finished 2 iterations
## [1] 58.46658
## I have finished 3 iterations
## [1] 46.90793
## I have finished 4 iterations
## [1] 26.78559
## I have finished 5 iterations
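To make the role of the loop variable concrete, here is a minimal sketch. The vector `fruits` and the counter `n_iter` are made-up names for illustration and are not part of the recitation data. It shows that the loop variable takes each element of the vector in turn, and that `seq_along()` can be used when the position of each element is needed instead.

```r
fruits <- c("apple", "banana", "cherry") # Hypothetical example vector

for (f in fruits) # f takes each element of fruits in turn
{
cat("Current element:", f, "\n")
}

n_iter <- 0 # Counter for the number of iterations
for (i in seq_along(fruits)) # i takes the positions 1, 2, 3
{
cat("Element", i, "is", fruits[i], "\n")
n_iter <- n_iter + 1
}
```

The loop body runs once per element, so `n_iter` ends up equal to `length(fruits)`.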
sample() and sqrt()
set.seed(300) # Setting the seed for replication purposes
results <- c() # Creating an empty vector to hold the results
for (i in 1:5) # Specifying the number of iterations
{
obs <- sample(myData,size=1) # Sampling one observation from the myData vector and storing it into the "obs" object
results[i] <- sqrt(obs) # Calculating the square root of that "obs" object and storing it into the "results" vector
cat("The square root of", obs, "is", results[i],"\n") # Printing a string of characters after each iteration
}
## The square root of 51.39313 is 7.1689
## The square root of 46.5734 is 6.824471
## The square root of 58.46658 is 7.646344
## The square root of 46.90793 is 6.848937
## The square root of 26.78559 is 5.175479
summary(results) # Using summary as a sanity check
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##   5.175   6.824   6.849   6.733   7.169   7.646
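Growing a vector from `c()` works well for a handful of iterations. For longer loops, a common alternative is to create the results vector at its final length before the loop starts. The sketch below assumes a stand-in data vector called `demoData` (a hypothetical name, generated the same way as the data above) so that it runs on its own; `n_iter` and `results_pre` are likewise illustrative names.

```r
set.seed(150) # Setting the seed for replication purposes
demoData <- rnorm(1000, 45, 15) # Stand-in data vector so that this sketch is self-contained

n_iter <- 5 # Number of iterations
results_pre <- numeric(n_iter) # Pre-allocating a numeric vector of length 5 (filled with zeros)

for (i in seq_len(n_iter)) # seq_len(5) is the same as 1:5
{
obs <- sample(demoData, size = 1) # Sampling one observation
results_pre[i] <- sqrt(obs) # Overwriting the i-th zero with the square root
}

length(results_pre) # The vector keeps the length it was created with
```

With pre-allocation, the vector is filled in place rather than being extended on every pass through the loop.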
sample() and sum()
set.seed(300) # Setting the seed for replication purposes
results <- c() # Creating an empty vector to hold the results
for (i in 1:5)
{
obs <- sample(myData,size=2) # Sampling two observations from the myData vector and storing them in the "obs" object
results[i] <- sum(obs) # Calculating the sum of the two elements of "obs" and storing it in the "results" vector
cat("The sum of", obs[1], "and", obs[2], "is", results[i],"\n") # Printing a string of characters after each iteration
}
## The sum of 51.39313 and 42.72983 is 94.12297
## The sum of 58.46658 and 46.90793 is 105.3745
## The sum of 26.78559 and 44.83856 is 71.62414
## The sum of 41.26901 and 26.99162 is 68.26063
## The sum of 68.06239 and 43.11086 is 111.1733
summary(results) # Using summary as a sanity check
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##   68.26   71.62   94.12   90.11  105.37  111.17
die_mean() function to calculate the mean of n number of die rolls
die <- 1:6 # Creating a "die" vector with all numbers from 1 to 6
die_mean <- function(n) {
mean(sample(die, size = n, replace = TRUE)) # Sampling with replacement "n" number of times from the following vector: c(1,2,3,4,5,6)
}
n number of die rolls using die_mean()
set.seed(500) # Setting the seed for replication purposes
for (n in c(10,100,1000,10000,100000)) # Looping over five sample sizes: "n" takes each value of this vector in turn
{
result <- die_mean(n) # Calculating the mean of n die rolls
cat("The mean of", n, "number of die rolls is", result,"\n") # Printing a string of characters after each iteration
}
## The mean of 10 number of die rolls is 4.7
## The mean of 100 number of die rolls is 3.26
## The mean of 1000 number of die rolls is 3.526
## The mean of 10000 number of die rolls is 3.485
## The mean of 1e+05 number of die rolls is 3.49953
The sample mean tends to get closer to the expected value (the population parameter, which is 3.5 for a fair six-sided die) as the sample size increases. This is the Law of Large Numbers.
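To see this convergence numerically, the sketch below computes the expected value of a fair die directly and then tracks the running mean of a single long sequence of rolls. The names `expected_value`, `rolls`, `running_mean` and `abs_error` are illustrative, and the seed is arbitrary.

```r
die <- 1:6 # All faces of a fair six-sided die
expected_value <- mean(die) # (1+2+3+4+5+6)/6 = 3.5

set.seed(500) # Setting the seed for replication purposes
rolls <- sample(die, size = 10000, replace = TRUE) # One long sequence of 10000 rolls
running_mean <- cumsum(rolls) / seq_along(rolls) # Mean of the first k rolls, for k = 1 to 10000
abs_error <- abs(running_mean - expected_value) # Distance between each running mean and 3.5

abs_error[c(10, 100, 1000, 10000)] # Distance after 10, 100, 1000 and 10000 rolls
```

The distance usually shrinks as more rolls are included, although not necessarily at every single step, because each new roll can move the mean in either direction.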
i number of observations from a vector of means of 15 die rolls using replicate() and graphing the results using hist()
set.seed(100) # Setting the seed for replication purposes
for (i in c(100,1000,100000))
{
x <- replicate(i, { # replicate() re-evaluates the expression below i times, so each replication uses fresh random draws; the i results are stored in the vector x
mm <- die_mean(15) # Calculating the mean of 15 die rolls
mm # Returning that mean, so that x holds i means of 15 die rolls (either 100, 1000 or 100000 of them)
})
a <- round(sd(x),digits =3) # Rounding the standard deviation of the simulated means to 3 decimals (used in the plot title)
hist(x, # Specifying which data will be plotted
xlab="Mean of 15 die rolls", # Labeling the x axis
xlim=c(1,6), # Delimiting the x axis
col="#4286f4", # Check out "color picker" on Google if you want to select a custom color!
main=paste("Histogram for ", i," random draws (sd = ", a,")", sep="")) # Using the paste() function to generate the main plot title
}
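Each histogram approximates the sampling distribution of the mean of 15 die rolls, and its bell shape becomes clearer as the number of draws grows. The Central Limit Theorem states that the distribution of the sample mean tends toward a normal distribution as the sample size (here, the number of rolls behind each mean) increases.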
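One way to check the histograms numerically is to compare the standard deviation of the simulated means with the value predicted by theory, namely the population standard deviation of a single roll divided by the square root of the number of rolls. The sketch below is self-contained; `n_rolls`, `pop_sd`, `theoretical_se`, `sim_means` and `simulated_se` are illustrative names, and 10000 replications is an arbitrary choice.

```r
die <- 1:6 # All faces of a fair six-sided die
n_rolls <- 15 # Number of rolls behind each mean

pop_sd <- sqrt(mean((die - mean(die))^2)) # Population sd of a single roll, about 1.708
theoretical_se <- pop_sd / sqrt(n_rolls) # Predicted sd of the mean of 15 rolls, about 0.441

set.seed(100) # Setting the seed for replication purposes
sim_means <- replicate(10000, mean(sample(die, size = n_rolls, replace = TRUE))) # 10000 means of 15 rolls
simulated_se <- sd(sim_means) # sd of the simulated means

c(theoretical = theoretical_se, simulated = simulated_se) # Placing both values side by side
```

The two values should be close to each other, and both should be in line with the "sd" reported in the histogram titles.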
To learn more about graphical parameters in base R, you can check this out: https://www.statmethods.net/advgraphs/parameters.html
Calculate the square root of each integer i from 100 to 115. Print a textual output that states “The square root of i is answer” for each iteration.
Relevant functions: sqrt(), cat().
Set your seed at 250. Draw 5000 observations from a normal distribution with a mean of 20 and a standard deviation of 3. Calculate the product (*) of 10 pairs of observations sampled from these data. Print a textual output that states “The product of first value and second value is answer” for each iteration.
Relevant functions: set.seed(), rnorm(), prod(), cat().
Set your seed at 100. Provide an example of the Central Limit Theorem by generating 4 histograms of the means of 20 coin flips (one for 10 draws, one for 50 draws, one for 100 draws, and one for 1000 draws).
Relevant functions: function(), mean(), replicate(), hist().