Aim: to grasp the concept of random variable as a function with Tidyeval.

generate i.i.d. random random variables

Load the libraries:

library(purrr)
library(rlang)
## 
## Attaching package: 'rlang'
## The following objects are masked from 'package:purrr':
## 
##     %@%, as_function, flatten, flatten_chr, flatten_dbl,
##     flatten_int, flatten_lgl, flatten_raw, invoke, list_along,
##     modify, prepend, splice

Let’s define a function generating random variables as a list of independent and identically distributed (i.i.d.) functions from an distribution:

generate_Xn <- function(sample_size, distribution, ...){
  parameters <- rlang::enexprs(...)
  X <- rlang::enexpr(distribution)
  Xi <- rlang::call2(X, 1, !!!parameters)
  Xn <- c()
  purrr::imap(1:sample_size, ~c(Xn, Xi)) %>% unlist()
}

Here is an example of generating a series of i.i.d. random variables:

Xn <- generate_Xn(sample_size=5, distribution=rnorm, mean=2, sd=3)
Xn
## [[1]]
## rnorm(1, mean = 2, sd = 3)
## 
## [[2]]
## rnorm(1, mean = 2, sd = 3)
## 
## [[3]]
## rnorm(1, mean = 2, sd = 3)
## 
## [[4]]
## rnorm(1, mean = 2, sd = 3)
## 
## [[5]]
## rnorm(1, mean = 2, sd = 3)

Let’s define a function that realize values from the series:

realize_Xn <- function(Xn){
  purrr::map(Xn, eval)
}
realize_Xn(Xn)
## [[1]]
## [1] -0.4390016
## 
## [[2]]
## [1] 8.507265
## 
## [[3]]
## [1] 3.211729
## 
## [[4]]
## [1] 1.761915
## 
## [[5]]
## [1] 5.65294

It should be noted that they are just an example (a sample) of the series of random variables. In the real world problem, we obtain realized values from an unkown distribution. Here, we artificially generate the random variables from a normal distribution.

Indeed, if we perform another trial of the realization,

realize_Xn(Xn)
## [[1]]
## [1] 9.257417
## 
## [[2]]
## [1] 1.221062
## 
## [[3]]
## [1] 7.220217
## 
## [[4]]
## [1] -0.1279103
## 
## [[5]]
## [1] -3.746255

a different set of values was realized.

Let’s realize many random variables from a normal distribution and make a histogram:

Xn2 <- generate_Xn(1e3, rnorm, mean=4.5, sd = 2)
realize_Xn(Xn2) %>% unlist() %>% hist(.,40)

Let’s generate random variables from a different distribution (an uniform distribution):

Xn3 <- generate_Xn(5, runif, min=1, max=5)
Xn3
## [[1]]
## runif(1, min = 1, max = 5)
## 
## [[2]]
## runif(1, min = 1, max = 5)
## 
## [[3]]
## runif(1, min = 1, max = 5)
## 
## [[4]]
## runif(1, min = 1, max = 5)
## 
## [[5]]
## runif(1, min = 1, max = 5)

function of random variables

A random variable is a function of a random event. Therefore, a function of a function of a random event is also a random variable.

Sample mean is a random variable composed from the sample. Let’s define the varialbe:

expr_plus <- function(a, b){
  aa <- enexpr(a)
  bb <- enexpr(b)
  expr(!!aa + !!bb)
}

my_sum <- reduce(Xn3, expr_plus)
my_mean <- expr(!!my_sum/length(Xn3))
my_mean
## (runif(1, min = 1, max = 5) + runif(1, min = 1, max = 5) + runif(1, 
##     min = 1, max = 5) + runif(1, min = 1, max = 5) + runif(1, 
##     min = 1, max = 5))/length(Xn3)

Let’s realize a value from my_mean:

realize_X <- function(X){
  eval(X)
}
my_mean %>% realize_X()
## [1] 2.700836

Another trial:

my_mean %>% realize_X()
## [1] 2.703306

What does a distribution of the mean from the sample from a uniform distribution look like?

Xn4 <- generate_Xn(64, runif, min=2, max=10)
my_sum <- reduce(Xn4, expr_plus)
my_mean <- expr(!!my_sum/length(Xn4))
my_mean %>% realize_X()
## [1] 6.340585
my_mean_lst <- c()
my_mean_lst <- imap(1:1e3, ~c(my_mean_lst, my_mean %>% eval()))

my_mean_lst %>% unlist() %>% hist(., 40)

This simulation revealed that it looks like a normal distribution.

By the way, we do not need those functions if you just want a realized values from a distribution.

my_mean <- map(1:1e3, ~mean(runif(64)))
my_mean %>% unlist() %>% hist(., 40)

This code is faster than the method shown above.