Source file ⇒ R-MT-Review_final_proposal.Rmd
## Problem 1
A prospective graduate student wants to take the GRE (the graduate-level counterpart of the SAT). He estimates his probability of failing a single GRE test to be 0.7, but he needs to book all his test sessions right now. As a result, he wants to book several tests so that he is very likely to pass at least one.
To help him, write a function GRE that outputs the minimum number \(n\) of tests he needs to take so that the probability he fails all of the tests is less than or equal to 0.1.
Hint: The probability of failing \(n\) tests is \(0.7^n\). Your function does not need any argument.
GRE <- function(){
  prob <- 1   # probability of failing every test booked so far
  n <- 0      # number of tests booked
  while(prob > 0.1){
    prob <- prob*0.7
    n <- n+1
  }
  return(n)
}
#test your function
GRE()
## [1] 7
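As a cross-check on the loop, the same answer follows from solving \(0.7^n \le 0.1\) for \(n\) with logarithms. The sketch below is only an illustration: GRE_closed_form is a hypothetical helper name, and its arguments p_fail and target are assumptions added so the formula can be reused.

```r
# Hypothetical helper (not part of the exercise): closed-form version of GRE().
# Solving p_fail^n <= target gives n >= log(target) / log(p_fail),
# because log(p_fail) is negative and dividing by it flips the inequality.
GRE_closed_form <- function(p_fail = 0.7, target = 0.1){
  ceiling(log(target) / log(p_fail))
}

n <- GRE_closed_form()
n            # 7, matching the loop-based GRE()
0.7^(n - 1)  # 0.117649, still above 0.1
0.7^n        # 0.0823543, at or below 0.1
```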
## Problem 2
Write a function convert_time that takes as input a vector with three components, such as c(7,13,1), which means the time 7:13 pm. (The last component is interpreted as follows: 0 means “am”, 1 means “pm”.) The output should be the time converted to 24-hour format. With the example given, the output should be c(19,13).
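convert_time <- function(time){
  # time[1] = hour (1 to 12), time[2] = minutes, time[3] = 0 for am, 1 for pm
  hour <- time[1] %% 12      # maps 12 to 0, so 12 am becomes 0 and 12 pm becomes 12 below
  if(time[3]==1){
    hour <- hour + 12
  }
  return(c(hour,time[2]))
}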
#test your function
convert_time(c(7,13,0))
## [1] 7 13
convert_time(c(7,13,1))
## [1] 19 13
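The 12 o'clock hour is the usual edge case in this conversion: 12:xx am is hour 0, while 12:xx pm stays hour 12. A minimal sketch of one way to handle it is shown here; the helper name to_24h and the modulo formulation are illustrative assumptions, not part of the exercise.

```r
# Hypothetical helper: same input convention as convert_time,
# c(hour, minutes, flag) with flag 0 = am and 1 = pm.
to_24h <- function(time){
  hour <- time[1] %% 12 + 12 * time[3]   # 12 %% 12 is 0, then add 12 for pm
  c(hour, time[2])
}

to_24h(c(7, 13, 1))   # 19 13
to_24h(c(12, 5, 0))   # 0 5   (12:05 am)
to_24h(c(12, 5, 1))   # 12 5  (12:05 pm)
```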
## Problem 3
3a) Write a function percentile that takes two arguments:
- a vector of real numbers x that is already sorted
- a number perc giving the percentile as a fraction between 0 and 1 (for example, 0.5 for the median), which is how the test in 3b calls it
Your function should return the perc-th percentile of the vector. You will need the function ceiling, which rounds a real number up to the nearest integer.
Hint: Try to do it first for the median, the first quartile, and the third quartile. Afterwards, generalize your thought process to any percentile.
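Following the hint, here is a worked sketch of the median and the quartiles on a small sorted vector, using the nearest-rank idea: take the element at position ceiling(n * p), with p written as a fraction. The vector x_demo is made up for illustration.

```r
x_demo <- c(2, 4, 6, 8, 10, 12, 14, 16)   # already sorted, n = 8
n <- length(x_demo)

x_demo[ceiling(n * 0.25)]   # first quartile: position 2, value 4
x_demo[ceiling(n * 0.50)]   # median:         position 4, value 8
x_demo[ceiling(n * 0.75)]   # third quartile: position 6, value 12

# quantile() with type = 1 follows the same nearest-rank definition
quantile(x_demo, c(0.25, 0.5, 0.75), type = 1)
```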
percentile <- function(x,perc){
  # x: sorted numeric vector; perc: percentile as a fraction in [0, 1]
  n <- length(x)
  # nearest-rank index; max() keeps it at 1 when perc is 0, since x[0] is empty
  index <- max(1, ceiling(n*perc))
  return(x[index])
}
3b) To test if your function from 3a works, you can run the following R chunk. If it only prints “TRUE”, it is highly likely that your function percentile above is correct. Try to write a few lines about what exactly the function test_percentile is doing.
test_percentile <- function(){
  for(i in 0:10){
    print(percentile(cars$speed,i/10)==quantile(cars$speed,i/10,type=1))
  }
}
test_percentile()
## 0%
## TRUE
## 10%
## TRUE
## 20%
## TRUE
## 30%
## TRUE
## 40%
## TRUE
## 50%
## TRUE
## 60%
## TRUE
## 70%
## TRUE
## 80%
## TRUE
## 90%
## TRUE
## 100%
## TRUE
## Problem 4
Write a function cumulative that takes as input a vector u. The output should be a vector whose component at position i is the sum of the components of u at positions 1 through i.
For instance, if u <- c(4,14,19,9,2,13,4,9), cumulative(u) should output the following: 4 18 37 46 48 61 65 74. Notice that 18 = 4 + 14, 37 = 4 + 14 + 19 = 18 + 19, etc.
cumulative <- function(u){
  cum_u <- c()
  for(i in seq_along(u)){      # seq_along() also copes with an empty u
    total <- 0                 # named total to avoid masking base::sum
    for(j in 1:i){
      total <- total+u[j]
    }
    cum_u <- c(cum_u,total)
  }
  return(cum_u)
}
u <- c(4,14,19,9,2,13,4,9)
cumulative(u)
## [1] 4 18 37 46 48 61 65 74
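A second solution reuses the previous partial sum instead of recomputing it:
cumulative <- function(u){
  res <- c(u[1])
  n <- length(u)
  for(i in seq_len(n)[-1]){    # i = 2, ..., n; empty when n is 1
    res <- c(res,res[i-1] + u[i])
  }
  return(res)
}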
u <- c(4,14,19,9,2,13,4,9)
cumulative(u)
## [1] 4 18 37 46 48 61 65 74
#test your function
u <- c(4,14,19,9,2,13,4,9)
cumulative(u)
## [1] 4 18 37 46 48 61 65 74
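Base R already provides this operation as cumsum(), which makes a convenient reference when checking a hand-written cumulative(). A short sketch follows; cumulative_ref is a hypothetical name used only for this illustration.

```r
u <- c(4, 14, 19, 9, 2, 13, 4, 9)
cumsum(u)                             # 4 18 37 46 48 61 65 74

# Reduce() with accumulate = TRUE spells out the same running sum
cumulative_ref <- function(u) Reduce(`+`, u, accumulate = TRUE)
all(cumulative_ref(u) == cumsum(u))   # TRUE
```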
## Problem 5
The sequence of Fibonacci numbers is defined as follows: \(F_1 = 1\), \(F_2 = 1\), and \(F_n = F_{n-1} + F_{n-2}\) for \(n \ge 3\).
Write a function fibo to compute the 100th Fibonacci number. You should find a number like 3.542248e+20.
Hint: The simplest solution is to have three variables Fn_2, Fn_1 and Fn that you update at each iteration.
fibo <- function(n){
  if(n <= 2){
    return(1)        # the first two Fibonacci numbers are both 1
  }
  Fn_2 <- 1
  Fn_1 <- 1
  for(i in 3:n){
    Fn <- Fn_2 + Fn_1
    Fn_2 <- Fn_1
    Fn_1 <- Fn
  }
  return(Fn)
}
#test your function fibo
paste("The 100th Fibonacci number is ", fibo(100))
## [1] "The 100th Fibonacci number is 3.54224848179262e+20"
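To see the recurrence at work on small cases, the sketch below stores the first few Fibonacci numbers in a vector. fibo_seq is a hypothetical helper, not part of the exercise. It also hints at why the expected answer is quoted in scientific notation: R stores these values as doubles, which represent every integer exactly only up to 2^53, so the 100th number is a close approximation of the exact value 354224848179261915075.

```r
# Hypothetical helper: returns the first n Fibonacci numbers (F_1 = F_2 = 1).
fibo_seq <- function(n){
  f <- numeric(n)
  f[seq_len(min(n, 2))] <- 1          # fill F_1 and F_2 when they exist
  for(i in seq_len(n)[-(1:2)]){       # i = 3, ..., n (empty when n <= 2)
    f[i] <- f[i-1] + f[i-2]
  }
  f
}

fibo_seq(10)                 # 1 1 2 3 5 8 13 21 34 55
fibo_seq(100)[100]           # about 3.542248e+20
fibo_seq(100)[100] > 2^53    # TRUE: past 2^53, doubles no longer hold every integer
```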
## Problem 6
Imagine you want to find the z-score corresponding to the 60th percentile of a standard normal distribution, and you only have access to the function pnorm(z), which gives the percentile corresponding to the z-score z.
Write a function using the dichotomy technique (also known as bisection) and targeting an accuracy of 0.0001. The principle is quite intuitive. You know that the z-score of the 60th percentile is between 0 and 1. Evaluate what percentile corresponds to a z-score of 0.5. If the percentile is more than 60%, then the z-score you are looking for is less than 0.5. If not, it is more than 0.5. And so on, until the percentile of your current z-score falls in the interval [0.6 - 0.0001, 0.6 + 0.0001].
Hint: You can have three variables inf, sup and mid, where [inf, sup] is an interval that contains the z-score you are looking for, and mid is simply (inf + sup)/2.
If the percentile corresponding to the z-score mid is greater than 60%, then you want to restrict your search area to [inf, mid]. Otherwise, you want to restrict your search area to [mid, sup]. And so on at each iteration…
Note: Your function doesn’t need any argument.
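dichotomy <- function(){
  inf = 0
  sup = 1
  mid = (inf + sup) / 2
  # stop once pnorm(mid) is within 0.0001 of the target 0.6
  while(abs(pnorm(mid) - 0.6) > 0.0001){
    if(pnorm(mid)>0.6){
      sup = mid    # the target z-score is below mid
    }
    else{
      inf = mid    # the target z-score is above mid
    }
    mid = (inf + sup) / 2
  }
  return(mid)
}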
#test your function
paste("Your function should yield ",dichotomy(), " to compare with the actual value ", qnorm(0.6))
## [1] "Your function should yield 0.25341796875 to compare with the actual value 0.2533471031358"
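The same dichotomy loop works for other percentiles once the target and the starting interval become arguments. The sketch below is one possible generalization, not the exercise's required form: the name bisect_quantile and its parameters p, lower, upper and tol are hypothetical, and it assumes the answer lies inside [lower, upper].

```r
# Hypothetical generalization of dichotomy(): find z with pnorm(z) close to p.
bisect_quantile <- function(p, lower, upper, tol = 0.0001){
  mid <- (lower + upper) / 2
  while(abs(pnorm(mid) - p) > tol){
    if(pnorm(mid) > p){
      upper <- mid      # pnorm is increasing, so the answer is below mid
    } else {
      lower <- mid      # otherwise the answer is above mid
    }
    mid <- (lower + upper) / 2
  }
  mid
}

bisect_quantile(0.6, 0, 1)      # same search as dichotomy()
bisect_quantile(0.975, 0, 3)    # compare with qnorm(0.975), about 1.96
```

Note that tol bounds the error on the percentile, not on the z-score itself, which is why the result can differ from qnorm() by a little more than 0.0001.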