Simulation

There are a number of functions for generating random variables from the standard probability distributions.

rnorm: generates random Normal variates with a given mean and standard deviation

rpois: generates random Poisson variates with a given rate (the value of lambda)

str(rnorm)

## function (n, mean = 0, sd = 1)

rnorm(10) # generate 10 random Normal variates with mean 0 and sd 1 (the defaults)

##  [1] -0.6932141 -0.8622156  1.3121772  1.2695060 -0.4875785 -0.9104652
##  [7] -1.0864099  2.0544296 -0.7017115  0.8121763

rnorm(10,20,2) # generate 10 random Normal variates with mean 20 and sd 2 (given explicitly)

##  [1] 21.61330 22.03902 20.76473 19.19569 19.71153 17.42064 23.41415 24.19128
##  [9] 19.34963 19.64094
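As a rough check that the mean and sd arguments behave as described, the sketch below draws a larger sample and compares the sample mean and standard deviation with the requested values. The seed (123) and the sample size (10000) are arbitrary choices made for illustration and are not part of the original example.

```r
set.seed(123)                             # arbitrary seed, only for reproducibility
draws <- rnorm(10000, mean = 20, sd = 2)  # 10000 Normal variates with mean 20 and sd 2
mean(draws)                               # should be close to 20
sd(draws)                                 # should be close to 2
```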

str(rpois)

## function (n, lambda)

x<-rpois(10,2) # generate 10 random Poisson variates with rate 2
str(rbinom)

## function (n, size, prob)

# generate a single random variable representing the number of heads in 100 flips of an unfair coin that lands heads with probability 0.7
rbinom(1, size = 100, prob = 0.7) # prob of success .7

## [1] 68

# to see all of the individual 0s and 1s, request 100 observations, each of size 1, with success probability 0.7
rbinom(100,size=1,prob=.7)

##   [1] 0 1 1 1 1 1 1 1 1 0 1 0 1 1 1 0 1 1 1 1 0 1 1 1 1 1 1 1 1 1 0 1 1 1 1 0 0
##  [38] 1 1 1 0 1 1 1 1 1 1 1 1 0 1 1 0 0 0 1 1 1 0 0 1 1 0 0 0 1 1 0 1 0 0 1 1 1
##  [75] 0 0 0 1 1 0 0 0 1 1 0 0 0 1 1 1 1 1 1 1 1 0 1 1 1 1
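The two calls above are closely related: the sum of 100 draws of size 1 has the same distribution as a single draw of size 100. The sketch below illustrates this under an arbitrary seed; the variable names single_flips and totals are hypothetical and used only here.

```r
set.seed(123)                                      # arbitrary seed
single_flips <- rbinom(100, size = 1, prob = 0.7)  # 100 individual flips (0 or 1)
sum(single_flips)                                  # number of heads among those flips
totals <- rbinom(1000, size = 100, prob = 0.7)     # 1000 repetitions of the 100-flip experiment
mean(totals)                                       # should be close to 100 * 0.7 = 70
```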

set.seed() lets us reproduce the random numbers that we generate. The seed can be any integer; resetting the same seed restarts the same sequence of random numbers.

set.seed(1)
rnorm(10)

##  [1] -0.6264538  0.1836433 -0.8356286  1.5952808  0.3295078 -0.8204684
##  [7]  0.4874291  0.7383247  0.5757814 -0.3053884

rnorm(10)

##  [1]  1.51178117  0.38984324 -0.62124058 -2.21469989  1.12493092 -0.04493361
##  [7] -0.01619026  0.94383621  0.82122120  0.59390132

set.seed(1)
rnorm(10)

##  [1] -0.6264538  0.1836433 -0.8356286  1.5952808  0.3295078 -0.8204684
##  [7]  0.4874291  0.7383247  0.5757814 -0.3053884
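Rather than comparing the printed values by eye, the same behaviour can be checked programmatically. This is a minimal sketch; the names a and b are hypothetical.

```r
set.seed(1)
a <- rnorm(10)     # first batch after setting the seed
set.seed(1)
b <- rnorm(10)     # same seed again, so the same batch
identical(a, b)    # TRUE
```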

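For each probability distribution there are four functions available, distinguished by the prefix of the function name: d for the density, p for the cumulative distribution function, q for the quantile function, and r for random number generation (as in rnorm and rpois above).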
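Taking the Normal distribution as an example, the family consists of dnorm, pnorm, qnorm and rnorm. The following sketch evaluates each one at a convenient point of the standard Normal distribution.

```r
dnorm(0)      # density at 0, equal to 1 / sqrt(2 * pi)
pnorm(0)      # cumulative probability P[X <= 0], which is 0.5
qnorm(0.5)    # quantile for probability 0.5 (the median), which is 0
rnorm(3)      # three random draws
```

qnorm() is the inverse of pnorm(), so qnorm(pnorm(z)) returns z up to floating-point error.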

str(ppois) # lower.tail: logical; if TRUE (the default), probabilities are P[X ≤ x] (for example P[X ≤ 2]); otherwise P[X > x]

## function (q, lambda, lower.tail = TRUE, log.p = FALSE)

# to know the probability that a Poisson random variable is less than or equal to 2 with rate 2
ppois(2,2)

## [1] 0.6766764

ppois(4,2)

## [1] 0.947347

ppois(6,2)

## [1] 0.9954662
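The cumulative probability is the sum of the point probabilities given by dpois(), and lower.tail = FALSE returns the complementary upper tail. The sketch below shows both relationships for rate 2; the names cdf_sum and upper are hypothetical.

```r
cdf_sum <- sum(dpois(0:2, lambda = 2))     # P[X = 0] + P[X = 1] + P[X = 2]
all.equal(cdf_sum, ppois(2, 2))            # TRUE
upper <- ppois(2, 2, lower.tail = FALSE)   # P[X > 2]
all.equal(upper, 1 - ppois(2, 2))          # TRUE
```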

Simulate random numbers from the linear model y = b0 + b1*x + e. Here b0 = 0.5 and b1 = 2, x follows a standard Normal distribution (mean 0, sd 1), and the random noise e (epsilon) follows a Normal distribution with mean 0 and sd 2.

set.seed(20)
x<-rnorm(100,0,1)
e<-rnorm(100,0,2)
y<-.5+2*x+e
summary(y)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -6.4084 -1.5402  0.6789  0.6893  2.9303  6.5052
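One way to sanity-check the simulated data is to fit the model back with lm() and see whether the estimated coefficients land near the values used in the simulation. This is only a sketch; the name fit is hypothetical, and with 100 observations and noise of sd 2 the estimates will not match exactly.

```r
set.seed(20)
x <- rnorm(100, 0, 1)
e <- rnorm(100, 0, 2)
y <- 0.5 + 2 * x + e
fit <- lm(y ~ x)    # ordinary least squares fit of y on x
coef(fit)           # intercept should be near 0.5, slope near 2
```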

Random Sampling: sample() draws randomly from a set of objects that we specify. By default it samples without replacement.

str(sample) # prob: a vector of probability weights for obtaining the elements of the vector being sampled.

## function (x, size, replace = FALSE, prob = NULL)

sample(1:10,4)

## [1]  5 10  3  6

sample(1:10,4)

## [1] 2 3 5 6

# Suppose we want to simulate 100 flips of an unfair two-sided coin that lands 'tails' with probability 0.3 and 'heads' with probability 0.7. Let 0 represent tails and 1 represent heads. Use sample() to draw a sample of size 100 from the vector c(0,1), with replacement; prob gives the weights in the same order as c(0,1).
flips<-sample(c(0,1),100,replace=TRUE,prob=c(.3,.7))
flips

##   [1] 1 1 0 1 1 1 1 1 1 1 0 0 1 0 1 1 1 1 1 1 0 0 1 1 1 0 0 0 0 1 1 1 0 1 0 1 1
##  [38] 0 1 0 0 1 1 1 1 1 1 1 1 1 0 0 1 1 1 0 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 0
##  [75] 1 1 1 0 1 0 1 1 1 0 0 1 1 0 0 1 0 1 1 1 1 1 0 1 1 1

table(flips)  # counts of tails (0) and heads (1)

## flips
##  0  1 
## 28 72

sum(flips) # number of heads, since each head is coded as 1

## [1] 72
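Because heads are coded as 1, the mean of the vector is the observed proportion of heads, which should sit near the specified probability of 0.7. The sketch below regenerates the flips under an arbitrary seed (42), so its values will differ from the output shown above.

```r
set.seed(42)                     # arbitrary seed, not the one behind the output above
flips <- sample(c(0, 1), 100, replace = TRUE, prob = c(0.3, 0.7))
mean(flips)                      # observed proportion of heads, expected near 0.7
prop.table(table(flips))         # proportions of tails (0) and heads (1)
```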

sample(1:10) # permutation

##  [1]  9  8  6  4  7  2  5  1  3 10

sample(1:10,replace=TRUE)

##  [1] 8 8 6 7 1 3 1 2 7 2
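The difference between the last two calls can be checked directly: without replacement every value appears exactly once, while with replacement repeats are allowed. A minimal sketch, with the hypothetical names perm and boot:

```r
perm <- sample(1:10)                   # without replacement: a permutation of 1:10
identical(sort(perm), 1:10)            # TRUE, every value appears exactly once
boot <- sample(1:10, replace = TRUE)   # with replacement: values may repeat
anyDuplicated(boot) > 0                # almost always TRUE for 10 draws from 10 values
```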

To see how much space an object (such as a dataset) occupies in memory, use the object.size() function.

x<-matrix(1:6,2,3,byrow = TRUE)
object.size(x)

## 248 bytes
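For larger objects the size in bytes is hard to read. The result of object.size() can be formatted or printed in other units through the units argument; the sketch below reuses the small matrix from above, and the name size is hypothetical.

```r
x <- matrix(1:6, 2, 3, byrow = TRUE)
size <- object.size(x)
format(size, units = "Kb")     # the same size expressed in kilobytes
print(size, units = "auto")    # let R choose a readable unit
```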

For a data frame, names() returns a character vector of the column (i.e. variable) names.

df<-data.frame(a=1:3,b=c(0,0,0))
df

##   a b
## 1 1 0
## 2 2 0
## 3 3 0

names(df)

## [1] "a" "b"
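names() can also be used on the left-hand side of an assignment to rename the columns. In the sketch below the new names "id" and "value" are hypothetical, chosen only for illustration.

```r
df <- data.frame(a = 1:3, b = c(0, 0, 0))
names(df) <- c("id", "value")   # replace the column names in one step
names(df)                       # now "id" "value"
```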

Prottoy Kumar Prodhan Joy

2/27/2021