Let’s print the sentence "Today is [day of the week]" once for each day of the week: Monday, Tuesday, and so on.
print(paste("Today is", "Monday"))
## [1] "Today is Monday"
print(paste("Today is", "Tuesday"))
## [1] "Today is Tuesday"
print(paste("Today is", "Wednesday"))
## [1] "Today is Wednesday"
print(paste("Today is", "Thursday"))
## [1] "Today is Thursday"
print(paste("Today is", "Friday"))
## [1] "Today is Friday"
print(paste("Today is", "Saturday"))
## [1] "Today is Saturday"
print(paste("Today is", "Sunday"))
## [1] "Today is Sunday"
However, this violates the DRY principle (Don’t Repeat Yourself), which applies in every programming language. Instead, we can use a for loop:
for (day in c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")) {
  print(paste("Today is", day))
}
## [1] "Today is Monday"
## [1] "Today is Tuesday"
## [1] "Today is Wednesday"
## [1] "Today is Thursday"
## [1] "Today is Friday"
## [1] "Today is Saturday"
## [1] "Today is Sunday"
The loop variable can have any name. Here it is called i, and it takes each day in turn as its value:
for (i in c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")) {
  print(paste("Today is", i))
}
## [1] "Today is Monday"
## [1] "Today is Tuesday"
## [1] "Today is Wednesday"
## [1] "Today is Thursday"
## [1] "Today is Friday"
## [1] "Today is Saturday"
## [1] "Today is Sunday"
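When the position of each element matters, the loop can run over numeric positions instead of the values themselves. A minimal sketch, assuming a vector called days and a result vector called msgs (both names are illustrative), with seq_along() generating the positions 1 to 7:

```r
# 'days' and 'msgs' are illustrative names, not part of the example above
days <- c("Monday", "Tuesday", "Wednesday", "Thursday",
          "Friday", "Saturday", "Sunday")
msgs <- character(length(days))  # pre-allocate the result vector

for (i in seq_along(days)) {
  # here i really is an index: 1, 2, ..., 7
  msgs[i] <- paste("Day", i, "is", days[i])
  print(msgs[i])
}
```

Looping over positions is handy when the result has to be stored back into a vector of the same length.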
Now, you write a for loop. Using a for() loop, add 1 to each number in the sequence c(7, 4, 3, 8, 9, 25). Store the result in an object y, then print the first 4 elements of y.
x <- c(7, 4, 3, 8, 9, 25)
y <- numeric(length(x))  # empty vector to hold the results
for (i in seq_along(x)) {
  y[i] <- x[i] + 1  # add 1 to the i-th element only
}
y[1:4]
## [1] 8 5 4 9
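The exercise asks for a loop, but arithmetic in R is vectorized, so the same result is available without one. A short sketch comparing the two approaches (y_loop and y_vec are illustrative names):

```r
x <- c(7, 4, 3, 8, 9, 25)

# loop version: fill y_loop one element at a time
y_loop <- numeric(length(x))
for (i in seq_along(x)) {
  y_loop[i] <- x[i] + 1
}

# vectorized version: + is applied to every element of x at once
y_vec <- x + 1

identical(y_loop, y_vec)  # TRUE when both approaches agree
y_vec[1:4]
```

For simple element-wise arithmetic the vectorized form is usually the more idiomatic choice; loops earn their keep when each step depends on the previous one.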
Using a for loop, print the integers from 0 to 100 in increments of 5.
for (i in seq(0, 100, 5)) {
  print(i)
}
## [1] 0
## [1] 5
## [1] 10
## [1] 15
## [1] 20
## [1] 25
## [1] 30
## [1] 35
## [1] 40
## [1] 45
## [1] 50
## [1] 55
## [1] 60
## [1] 65
## [1] 70
## [1] 75
## [1] 80
## [1] 85
## [1] 90
## [1] 95
## [1] 100
Remember the example from class? Calculate the expected value, aka the mean
y <- c(0, 1, 2, 3, 4)
p <- c(0.1, 0.35, 0.07, 0.36, 0.12)
sum(y*p)
## [1] 2.05
# alternatively
weighted.mean(y, p)
## [1] 2.05
Calculate the variance. Each squared deviation from the expected value is weighted by its probability:
ev <- sum(y*p)
v <- sum((y - ev)^2 * p)
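For a discrete random variable, the variance is the probability-weighted sum of squared deviations from the expected value, Var(Y) = sum of (y - E[Y])^2 * p(y). A quick sketch that checks this definition against the shortcut E[Y^2] - E[Y]^2, using the y and p from the example (v_def and v_short are illustrative names):

```r
y <- c(0, 1, 2, 3, 4)
p <- c(0.1, 0.35, 0.07, 0.36, 0.12)

ev <- sum(y * p)                  # expected value
v_def <- sum((y - ev)^2 * p)      # variance from the definition
v_short <- sum(y^2 * p) - ev^2    # variance from the shortcut formula

all.equal(v_def, v_short)         # the two formulas should agree
sqrt(v_def)                       # standard deviation
```

Both formulas give a variance of 1.5875 for this distribution.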
Assume that the test scores of a college entrance exam fit a normal distribution. Furthermore, the mean test score is 72, and the standard deviation is 15.2. What is the percentage of students scoring 84 or more in the exam?
Hint: look up ?pnorm.
We apply the function pnorm of the normal distribution with mean 72 and standard deviation 15.2. Since we are looking for the percentage of students scoring higher than 84, we are interested in the upper tail of the normal distribution.
pnorm(84, mean=72, sd=15.2, lower.tail=FALSE) # notice the use of lower.tail=FALSE.
## [1] 0.2149176
The right tail is the upper tail; the left tail is the lower tail.
From the help page: lower.tail is logical; if TRUE (default), probabilities are P[X ≤ x], otherwise P[X > x].
pnorm(84, mean=72, sd=15.2)
## [1] 0.7850824
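The two tails are complementary, and the same probability can be obtained by standardizing the score first. A minimal sketch using the exam numbers above (upper, lower, z, and upper_z are illustrative names):

```r
upper <- pnorm(84, mean = 72, sd = 15.2, lower.tail = FALSE)  # P[X > 84]
lower <- pnorm(84, mean = 72, sd = 15.2)                      # P[X <= 84]
upper + lower  # the two tails together cover the whole distribution

# standardize the score, then use the standard normal defaults
z <- (84 - 72) / 15.2
upper_z <- pnorm(z, lower.tail = FALSE)
all.equal(upper, upper_z)
```

So roughly 21.5% of students score 84 or more, whichever way the probability is computed.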
Suppose IQ scores are normally distributed with mean 100 and standard deviation 15. What is the probability that a person has an IQ score higher than 107?
Keep in mind that a standard normal distribution has a mean of 0 and a standard deviation of 1. Unless we specify otherwise, these are the defaults the functions below use.
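For the IQ question, the defaults do not apply, so the mean and standard deviation have to be passed explicitly. One possible solution, sketched with an illustrative variable name p_above_107:

```r
# P[IQ > 107] for a normal distribution with mean 100 and sd 15
p_above_107 <- pnorm(107, mean = 100, sd = 15, lower.tail = FALSE)
p_above_107

# equivalent: 1 minus the lower tail
1 - pnorm(107, mean = 100, sd = 15)
```

The result is roughly 0.32, so about a third of people would have an IQ score above 107 under these assumptions.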
dnorm returns the height of the probability density at each point.
dnorm(0)
## [1] 0.3989423
dnorm(0, mean = 3, sd = 1.3)
## [1] 0.02140727
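These heights come from the normal density formula, f(x) = 1 / (sd * sqrt(2 * pi)) * exp(-(x - mean)^2 / (2 * sd^2)). A small sketch that writes the formula by hand and compares it with dnorm (normal_density is a hypothetical helper name, not a built-in function):

```r
# hand-written normal density; 'normal_density' is an illustrative helper
normal_density <- function(x, mean = 0, sd = 1) {
  1 / (sd * sqrt(2 * pi)) * exp(-(x - mean)^2 / (2 * sd^2))
}

all.equal(normal_density(0), dnorm(0))
all.equal(normal_density(0, mean = 3, sd = 1.3), dnorm(0, mean = 3, sd = 1.3))
```

Note that a density height is not a probability; probabilities come from areas under the curve.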
pnorm is for looking up probabilities. It is also known as the cumulative distribution function (CDF). It computes the probability that a normally distributed random number will be less than or equal to the given number.
pnorm(0)
## [1] 0.5
pnorm(1)
## [1] 0.8413447
pnorm(0, mean = 3, sd = 1.3)
## [1] 0.01050813
To get the probability that a number is larger than the given number, we can use the lower.tail argument
pnorm(1)
## [1] 0.8413447
pnorm(1,lower.tail=FALSE)
## [1] 0.1586553
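Subtracting two cumulative probabilities gives the probability of landing in an interval. A brief sketch for the interval from -1 to 1 on the standard normal (p_within_1sd is an illustrative name):

```r
# P[-1 < X <= 1] for a standard normal variable
p_within_1sd <- pnorm(1) - pnorm(-1)
p_within_1sd

# by symmetry, the lower tail below -1 equals the upper tail above 1
all.equal(pnorm(-1), pnorm(1, lower.tail = FALSE))
```

The result is about 0.683, the familiar rule that roughly 68% of values fall within one standard deviation of the mean.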
rnorm is for generating samples of normally distributed variables.
rnorm(5)
## [1] -1.8583180 -0.9333285 0.3557916 2.4206877 -1.5596239
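Because the draws are random, the numbers above will differ on every run. Setting a seed with set.seed() makes them reproducible. A minimal sketch, where the seed value 42 is arbitrary and a and b are illustrative names:

```r
set.seed(42)   # any integer works; 42 is an arbitrary choice
a <- rnorm(5)

set.seed(42)   # resetting the same seed replays the same draws
b <- rnorm(5)

identical(a, b)
```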
y <- rnorm(200, mean = 2, sd = 4)
hist(y)
Create this histogram in ggplot. Add a title and make the bars purple
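One possible solution, offered as a sketch and not the only way to do it. It assumes the ggplot2 package is installed; the seed, the number of bins, the title text, and the names df and hist_plot are all illustrative choices:

```r
library(ggplot2)  # assumes the ggplot2 package is installed

set.seed(123)  # arbitrary seed so the plot is reproducible
df <- data.frame(y = rnorm(200, mean = 2, sd = 4))  # ggplot expects a data frame

hist_plot <- ggplot(df, aes(x = y)) +
  geom_histogram(bins = 30, fill = "purple", color = "white") +
  labs(title = "200 draws from a normal distribution (mean 2, sd 4)",
       x = "y", y = "Count")

print(hist_plot)
```

The fill argument colors the bars, while color sets their outline.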
qnorm is the quantile function. It is the inverse of pnorm: we give it a probability, and it returns the number whose cumulative probability matches it.
qnorm(0.5)
## [1] 0
qnorm(0.8413)
## [1] 0.9998151
qnorm(0.25,mean = 2, sd = 2)
## [1] 0.6510205
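Since qnorm inverts pnorm, applying one after the other should return the starting value. A short sketch of that round trip, where the test value 1.25 is arbitrary and x and q975 are illustrative names:

```r
x <- 1.25  # arbitrary test value

# standard normal round trip
all.equal(qnorm(pnorm(x)), x)

# the same holds with a non-default mean and sd, as long as both calls match
all.equal(qnorm(pnorm(x, mean = 2, sd = 2), mean = 2, sd = 2), x)

# the 97.5th percentile of the standard normal, the familiar 1.96
q975 <- qnorm(0.975)
q975
```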
Go back to the slides for more exercises! :)
The end!