Harold Nelson
1/26/2022
Create a function my_range() that returns the range of a numeric vector, that is, the maximum minus the minimum, as a single number.
## [1] 6.100267
## [1] -3.022921 3.077346
Note that the built-in range() function returns the minimum and the maximum as a two-element vector; it does not subtract one from the other.
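The code for this exercise is not shown with the output above. One possible sketch, assuming the input has no missing values, subtracts the minimum from the maximum. The vector v is made-up illustration data, not the data behind the printed results.

```r
# A possible my_range(): maximum minus minimum of a numeric vector.
# Assumes x contains no NA values; a sketch, not necessarily the author's code.
my_range = function(x){
  return(max(x) - min(x))
}

v = c(2, 7, 4, 10)   # hypothetical data
my_range(v)          # 10 - 2 = 8
range(v)             # built-in: the pair (2, 10), no subtraction
diff(range(v))       # same value as my_range(v)
```

Writing diff(range(x)) is an equivalent one-liner, since diff() subtracts the first element of the pair from the second.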
Create a function range_95() that returns the difference between the 95th percentile and the 5th percentile of a numeric vector.
## 95%
## 3.158651
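The solution code is not shown with this output. A minimal sketch, assuming the default quantile() settings and no missing values, might look like the following. The vector v is hypothetical and chosen so the percentiles are easy to read off.

```r
# A possible range_95(): 95th percentile minus 5th percentile.
# Sketch only; assumes x has no NA values and uses quantile()'s defaults.
range_95 = function(x){
  top = quantile(x, 0.95)
  bottom = quantile(x, 0.05)
  return(top - bottom)   # the result keeps the label "95%" from top
}

v = 0:100        # hypothetical data: the p-th percentile is simply p
range_95(v)      # 95 - 5 = 90, printed under the label "95%"
```

The "95%" label in the printed output comes from the first operand of the subtraction, which is why the difference appears under that name.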
Create a function range_85() that returns the difference between the 85th percentile and the 15th percentile of a numeric vector.
## 85%
## 2.110154
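The range_85() exercise has the same shape as range_95(), with 0.85 and 0.15 as the probabilities. The sketch below is one way to write it under the same assumptions (default quantile() settings, no missing values); v is again hypothetical data.

```r
# A possible range_85(): 85th percentile minus 15th percentile.
# Sketch only; assumes x has no NA values.
range_85 = function(x){
  top = quantile(x, 0.85)
  bottom = quantile(x, 0.15)
  return(top - bottom)
}

v = 0:100        # hypothetical data
range_85(v)      # 85 - 15 = 70, printed under the label "85%"
```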
We’ve created separate functions range_85() and range_95(). In addition, we have the built-in function IQR(), which is essentially range_75(). Create a function gen_range(x,pct), where the parameter pct takes the place of the 75, 85, and 95 in our examples, so that gen_range(x,95) gives the same value as range_95(x).
gen_range = function(x,pct){
  # pct is a percentage such as 75, 85, or 95
  top = quantile(x,pct/100)
  bottom = quantile(x,1 - pct/100)
  return(top - bottom)
}
rn = rnorm(1000)
gen_range(rn,85)
## 85%
## 2.080405
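The claim that IQR() is essentially range_75() can be checked directly. The sketch below repeats gen_range() so that it runs on its own; the seed and the vector z are arbitrary choices made for this illustration.

```r
gen_range = function(x,pct){
  top = quantile(x,pct/100)
  bottom = quantile(x,1 - pct/100)
  return(top - bottom)
}

set.seed(1)          # arbitrary seed, only to make the sketch reproducible
z = rnorm(1000)      # hypothetical data

# unname() drops the "75%" label so only the values are compared
all.equal(unname(gen_range(z, 75)), IQR(z))   # TRUE
```

Note that a pct below 50 swaps the roles of top and bottom, so the result is negative; gen_range(z, 25) is the negative of gen_range(z, 75).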
Create a function rmsd(x,y) which returns the square root of the mean of the squares of the differences between x and y.
rmsd = function(x,y){
  # work outward: differences, then squares, then mean, then square root
  diffs = x - y
  diffs_sq = diffs^2
  mdiffs_sq = mean(diffs_sq)
  return(sqrt(mdiffs_sq))
}
x = rnorm(1000)
y = rnorm(1000)
rmsd(x,y)
## [1] 1.415952
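The printed value can be sanity-checked. Because x and y are independent standard normal samples, x - y has variance 1 + 1 = 2, so the RMSD should be close to sqrt(2), about 1.414. The sketch below uses a compact form of rmsd(); the seed and the larger sample size are assumptions made here to tighten the comparison.

```r
# Compact equivalent of rmsd(): root of the mean squared difference
rmsd = function(x,y){
  return(sqrt(mean((x - y)^2)))
}

set.seed(42)         # arbitrary seed for reproducibility
x = rnorm(100000)    # larger sample than in the text, to reduce noise
y = rnorm(100000)

rmsd(x,y)            # close to sqrt(2)
sqrt(2)
```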
Create a function mad(x,y) which returns the mean of the absolute values of the differences between x and y.
mad = function(x,y){
  diffs = x - y
  abs_diffs = abs(diffs)
  return(mean(abs_diffs))
}
x = c(1,2,3,4)
y = c(2,1,4,3)
mad(x,y)
## [1] 1
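One thing to keep in mind: R's stats package already has a function named mad(), which computes the median absolute deviation of a single vector. Defining mad(x,y) in the workspace masks it, but the built-in stays reachable as stats::mad(). The sketch below repeats the definition so that it runs on its own.

```r
mad = function(x,y){
  diffs = x - y
  abs_diffs = abs(diffs)
  return(mean(abs_diffs))
}

x = c(1,2,3,4)
y = c(2,1,4,3)

mad(x,y)         # 1: every pair differs by exactly 1
stats::mad(x)    # built-in median absolute deviation of x: 1.4826
```

Choosing a different name, for example a hypothetical mean_abs_diff(), would avoid the clash entirely.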
You probably noticed that the quantile() function produces a named vector as a result. You may want to know why. The answer is that its second argument, probs, may be a vector of probabilities rather than a single value. In that case, the labels identify which percentile each returned value belongs to.
## 10% 25% 50% 75% 90%
## -1.34313095 -0.70910765 0.05523986 0.69312419 1.34090791
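The call that produced this output is not shown. A sketch of what such a call might look like follows; the seed and the vector rn are chosen here for illustration, so the numbers will differ from those printed above.

```r
set.seed(1)              # arbitrary seed for reproducibility
rn = rnorm(1000)         # hypothetical data

# probs is given as probabilities between 0 and 1, not as percentages
q = quantile(rn, c(0.10, 0.25, 0.50, 0.75, 0.90))
q

names(q)                 # "10%" "25%" "50%" "75%" "90%"
q["50%"]                 # pick out one value by its label
unname(q)                # drop the labels when only the numbers are wanted
```

quantile() also accepts names = FALSE to skip the labels at the source.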
Create an inverse of the quantile() function, qinv(x,val). The parameter x is a numeric vector. The parameter val is a single number. The function returns the fraction of the values of x that are less than val.
## [1] 0.981
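The solution and the call behind this output are not shown. One possible sketch relies on the fact that the mean of a logical vector is the fraction of TRUE values. It assumes x has no missing values, and v is made-up data.

```r
# A possible qinv(): fraction of the values of x that are below val.
# Sketch only; assumes x has no NA values.
qinv = function(x,val){
  return(mean(x < val))   # TRUE counts as 1, FALSE as 0
}

v = c(1,2,3,4)   # hypothetical data
qinv(v, 3)       # 0.5: two of the four values are less than 3
```

The comparison is strict, so values equal to val are not counted, matching the wording "less than val" in the exercise.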
When we apply the summary() function to a numeric vector like county$pop2017, we get some useful results.
Example
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 88 10976 25857 103763 67756 10163507 3
## 'summaryDefault' Named num [1:7] 88 10976 25857 103763 67756 ...
## - attr(*, "names")= chr [1:7] "Min." "1st Qu." "Median" "Mean" ...
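The calls behind this output are not shown. Since the county data may not be at hand, the sketch below runs the same kind of inspection on a small made-up vector v that includes one missing value.

```r
v = c(88, 10976, NA, 25857, 67756)   # hypothetical data with one NA

res = summary(v)
res              # the printed summary, including an NA's column
str(res)         # shows that res is a named numeric vector
names(res)       # "Min." "1st Qu." "Median" "Mean" "3rd Qu." "Max." "NA's"
res["Median"]    # individual entries can be picked out by name
```

The NA's entry appears only when the input contains missing values.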
The result of summary(), stored here as res, is a named vector. We can wrap summary() inside another function and append to the output vector before returning the final result.
tb_summary = function(x){
  res = summary(x)
  # append the standard deviation, ignoring missing values
  out = c(res,sd(x,na.rm = TRUE))
  names(out) = c(names(res),"SD")
  return(out)
}
tb_summary(county$pop2017)
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 88.0 10975.5 25857.0 103763.4 67756.0 10163507.0 3.0
## SD
## 333194.5