Exercises on Functions

Harold Nelson

1/26/2022

Exercise 1

Create a function my_range() that returns the range of a numeric vector, that is, the difference between its maximum and minimum values.

Solution

my_range = function(x){
  
  return(max(x) - min(x))
}

rn = rnorm(1000)
my_range(rn)
## [1] 6.100267
range(rn)
## [1] -3.022921  3.077346

Note that the built-in range() function returns the minimum and the maximum as a two-element vector; it does not subtract them.
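The built-in output can still be turned into a single number by differencing it. The following is a minimal sketch; my_range() is repeated from the solution so the block runs on its own, and the vector v is made up for illustration.

```r
# my_range() as defined in the solution above
my_range = function(x){
  return(max(x) - min(x))
}

# Illustrative vector (not part of the original exercise)
v = c(3, -1, 7, 2)

range(v)        # two values: the minimum and the maximum
diff(range(v))  # maximum minus minimum
my_range(v)     # same value as diff(range(v))
```

For this vector both diff(range(v)) and my_range(v) return 8.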

Exercise 2

Create a function range_95() that returns the difference between the 95th percentile and the 5th percentile of a numeric vector.

Solution

range_95 = function(x){
  
  return(quantile(x,.95) - quantile(x,.05))
}

range_95(rn)
##      95% 
## 3.158651

Exercise 3

Create a function range_85() that returns the difference between the 85th percentile and the 15th percentile of a numeric vector.

Solution

range_85 = function(x){
  
  return(quantile(x,.85) - quantile(x,.15))
}

range_85(rn)
##      85% 
## 2.110154

Exercise 4

We’ve created separate functions range_85() and range_95(). In addition, we have the built-in function IQR(), which is essentially range_75(): the difference between the 75th and 25th percentiles. Create a function gen_range(x,pct), where the parameter pct takes the place of the 75, 85, and 95 in our examples.

Solution

gen_range = function(x,pct){
  
  top = quantile(x,pct/100)
  bottom = quantile(x,1 - pct/100)
  return(top - bottom)
}

rn = rnorm(1000)
gen_range(rn,85)
##      85% 
## 2.080405
range_85(rn)
##      85% 
## 2.080405
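One way to check the claim that IQR() is essentially range_75() is to compare it with gen_range(x,75). The sketch below repeats gen_range() so it runs on its own; the seed and the vector z are arbitrary choices for illustration. It also shows what happens when pct is 50 or less, a case the exercise does not address.

```r
# gen_range() as defined in the solution above
gen_range = function(x,pct){
  top = quantile(x,pct/100)
  bottom = quantile(x,1 - pct/100)
  return(top - bottom)
}

set.seed(1)      # arbitrary seed, used only to make the sketch reproducible
z = rnorm(1000)

# unname() drops the "75%" label so only the values are compared
all.equal(unname(gen_range(z,75)), IQR(z))

# pct = 50 gives 0, and a pct below 50 gives a negative result,
# because top and bottom trade places
gen_range(z,50)
gen_range(z,25)
```

The comparison returns TRUE with the default quantile type. Since gen_range() is only meaningful for pct between 50 and 100, a caller may want to check that before relying on the result.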

Exercise 5

Create a function rmsd(x,y) which returns the square root of the mean of the squares of the differences between x and y.

Solution

rmsd = function(x,y){
  
  diffs = x - y
  diffs_sq = diffs^2
  mdiffs_sq = mean(diffs_sq)
  return(sqrt(mdiffs_sq))
}

x = rnorm(1000)
y = rnorm(1000)
rmsd(x,y)
## [1] 1.415952
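The intermediate variables in the solution can be collapsed into one expression, and the result can be sanity-checked. For two independent standard normal vectors, the difference x - y has variance 2, so the result should be close to sqrt(2), about 1.414, which agrees with the value printed above. A minimal sketch follows; the vectors a and b and the seed are illustrative assumptions.

```r
# Compact version of rmsd(); same computation as the step-by-step solution
rmsd = function(x,y){
  return(sqrt(mean((x - y)^2)))
}

# Small case that can be checked by hand: every difference is +1 or -1
a = c(1,2,3,4)
b = c(2,1,4,3)
rmsd(a,b)

set.seed(1)      # arbitrary seed, used only to make the sketch reproducible
x = rnorm(1000)
y = rnorm(1000)
rmsd(x,y)        # should land near sqrt(2)
sqrt(2)
```

The hand-checkable case returns exactly 1, because the squared differences are all 1.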

Exercise 6

Create a function mad(x,y) which returns the mean of the absolute values of the differences between x and y.

Solution

mad = function(x,y){
  
  diffs = x - y
  abs_diffs = abs(diffs)
  return(mean(abs_diffs))
  
}

x = c(1,2,3,4)
y = c(2,1,4,3)

mad(x,y)
## [1] 1
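One thing to be aware of: R's stats package already has a function named mad(), which computes the median absolute deviation of a single vector. Defining our own mad() in the workspace masks it. The sketch below shows that the built-in remains reachable through its namespace, and that a different name avoids the clash; mean_abs_diff is a hypothetical name chosen here, not part of the exercise.

```r
# mad() as defined in the solution above (masks stats::mad in the workspace)
mad = function(x,y){
  return(mean(abs(x - y)))
}

v = c(1,2,3,4,100)

# The built-in median absolute deviation is still available via stats::
stats::mad(v)

# A hypothetical alternative name that does not collide with the built-in
mean_abs_diff = function(x,y){
  return(mean(abs(x - y)))
}
mean_abs_diff(c(1,2,3,4), c(2,1,4,3))
```

Removing the workspace definition with rm(mad) makes the built-in visible again under its plain name.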

Quantile

You probably noticed that the quantile() function returns a named vector. The reason is that its second argument, probs, may be a vector of probabilities rather than a single value. In that case, the labels identify which quantile each value corresponds to.

Example

rn = rnorm(1000)
values = quantile(rn,c(.1,.25,.5,.75,.9))
values
##         10%         25%         50%         75%         90% 
## -1.34313095 -0.70910765  0.05523986  0.69312419  1.34090791
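When the labels get in the way, as with the "95%" and "85%" labels attached to the differences in the earlier exercises, they can be dropped. The sketch below shows two options: the names argument of quantile() and the unname() function. The seed and the vector z are arbitrary choices for illustration.

```r
set.seed(1)      # arbitrary seed, used only to make the sketch reproducible
z = rnorm(1000)

# Default behaviour: the result carries a label
q_named = quantile(z,.95)
names(q_named)

# names = FALSE asks quantile() to return a plain number
q_plain = quantile(z,.95,names = FALSE)
names(q_plain)

# unname() strips the label after the arithmetic is done
unname(quantile(z,.95) - quantile(z,.05))
```

The first names() call returns "95%" and the second returns NULL.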

Exercise 7

Create an inverse of the quantile() function, qinv(x,val). The parameter x is a numeric vector. The parameter val is a single number. The function returns the fraction of the values of x that are less than val.

Solution

qinv = function(x,val){
  
  return( mean(x < val) )
}

# Example 

rn = rnorm(1000)
qinv(rn,2)
## [1] 0.981
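To see in what sense qinv() inverts quantile(), feed a quantile back in and check that the original probability comes out, at least approximately. This is a sketch under the assumption that the values of x are distinct; the seed and the names z and cut90 are illustrative.

```r
# qinv() as defined in the solution above
qinv = function(x,val){
  return( mean(x < val) )
}

set.seed(1)      # arbitrary seed, used only to make the sketch reproducible
z = rnorm(1000)

# Round trip: the 90th percentile should have about 90% of the data below it
cut90 = quantile(z,.9,names = FALSE)
qinv(z,cut90)

# For standard normal data, qinv() approximates pnorm()
qinv(z,2)
pnorm(2)
```

The agreement with pnorm() is only approximate, because qinv() works from a finite sample.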

The summary function

When we apply the summary function to a numeric vector like county$pop2017, we get some useful results.

Example

load("county.rda")
res = summary(county$pop2017)
res
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.     NA's 
##       88    10976    25857   103763    67756 10163507        3
str(res)
##  'summaryDefault' Named num [1:7] 88 10976 25857 103763 67756 ...
##  - attr(*, "names")= chr [1:7] "Min." "1st Qu." "Median" "Mean" ...

res is a named vector. We can wrap summary() inside another function that appends the standard deviation to the output vector before returning the final result.

tb_summary = function(x){
  res = summary(x)
  out = c(res,sd(x,na.rm = TRUE))
  names(out) = c(names(res),"SD")
  return(out)
}

tb_summary(county$pop2017)
##       Min.    1st Qu.     Median       Mean    3rd Qu.       Max.       NA's 
##       88.0    10975.5    25857.0   103763.4    67756.0 10163507.0        3.0 
##         SD 
##   333194.5
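Because the result is a named vector, individual statistics can be pulled out by name. The sketch below uses a small made-up vector v in place of county$pop2017, since county.rda may not be at hand. It also shows that the NA's entry appears only when the input contains missing values.

```r
# tb_summary() as defined above
tb_summary = function(x){
  res = summary(x)
  out = c(res,sd(x,na.rm = TRUE))
  names(out) = c(names(res),"SD")
  return(out)
}

v = c(2,4,4,NA,10)   # made-up data with one missing value
out = tb_summary(v)

names(out)           # includes "NA's" and ends with "SD"
out["Median"]        # single bracket keeps the label
out[["SD"]]          # double bracket returns the bare number

# Without missing values, summary() omits the NA's entry
names(tb_summary(c(2,4,4,10)))
```

Building the names from names(res) rather than a fixed list is what lets tb_summary() handle both cases.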