DATA 605 - Assignment #9

PROBLEM SET 1

(1) Write a function that will produce a sample of a random variable that is distributed according to the PDF.

# Function based on PDF 1
pdf_function_1 <- function() {

  # draw one value from Uniform(0, 2)
  x <- runif(1, 0, 2)

  # keep values in [0, 1]; reflect values in (1, 2] back to 2 - x
  if (x <= 1) {
    return(x)
  } else {
    return(2 - x)
  }
}

(2) Write a function that will produce a sample of a random variable that is distributed as follows.

# Function based on PDF 2

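pdf_function_2 <- function() {
  # draw one value from Uniform(0, 2); the function takes no arguments
  x <- runif(1, 0, 2)

  # map values in [0, 1] to 1 - x and values in (1, 2] to x - 1
  if (x <= 1) {
    return(1 - x)
  } else {
    return(x - 1)
  }
}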

(3) Draw 1000 samples from each of the two distributions and plot the resulting histograms.

# Call each function 1000 times and store the outputs in vectors f1 and f2,
# then create a histogram for each vector

f1 <- replicate(1000, pdf_function_1())
hist(f1)

f2 <- replicate(1000, pdf_function_2())
hist(f2)

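Both histograms are approximately flat over the interval from 0 to 1. Each function draws from Uniform(0, 2) and folds the result onto [0, 1], which yields a Uniform(0, 1) variable, so every value in that range is about equally likely to be sampled.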
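The flatness seen in the histograms can also be checked numerically. The sketch below reuses the same draw-and-reflect logic as pdf_function_1 under the hypothetical name fold_sample, with an arbitrary seed for reproducibility, and compares the sample mean and variance with the Uniform(0, 1) values of 1/2 and 1/12. It is an illustrative check, not part of the assignment.

```r
# Illustrative check; fold_sample and the seed are assumptions for this sketch
set.seed(605)

# same logic as pdf_function_1: draw from Uniform(0, 2) and reflect values above 1
fold_sample <- function() {
  x <- runif(1, 0, 2)
  if (x <= 1) x else 2 - x
}

check_draws <- replicate(1000, fold_sample())

# a Uniform(0, 1) variable has mean 1/2 and variance 1/12
sample_mean <- mean(check_draws)
sample_var <- var(check_draws)
print(c(mean = sample_mean, variance = sample_var, theoretical_variance = 1 / 12))

# Kolmogorov-Smirnov test against Uniform(0, 1);
# a large p-value is consistent with a uniform sample
ks_result <- ks.test(check_draws, "punif", 0, 1)
print(ks_result$p.value)
```

Swapping the body of fold_sample for the logic of pdf_function_2 should give the same kind of result, since that function also folds a Uniform(0, 2) draw onto [0, 1].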

(4) Write a program that will take a sample set size n as a parameter and the PDF as the second parameter, and perform 1000 iterations where it samples from the PDF, each time taking n samples and computing the mean of these n samples. It then plots a histogram of the 1000 means that it computes.

# This function takes a sample size and a sampling function
sample_means <- function(samplesize, pdf_function){
  # repeat 1000 times: draw samplesize values from pdf_function and take their mean
  m <- replicate(1000, mean(replicate(samplesize, pdf_function())))
  # plot the distribution of the 1000 sample means
  hist(m)
}

(5) Verify that as you set n to something like 10 or 20, each of the two PDFs produces normally distributed sample means, empirically verifying the Central Limit Theorem.

# Run samples with first function
sample_means(10, pdf_function_1)

sample_means(20, pdf_function_1)

# Run samples with second function
sample_means(10, pdf_function_2)

sample_means(20, pdf_function_2)

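The histograms show that the distribution of the sample means is approximately normal for both functions at n = 10 and n = 20, as the Central Limit Theorem predicts. As the sample size grows, the means cluster more tightly around the center, because the standard deviation of the sample mean shrinks in proportion to 1/sqrt(n), so the central peak becomes more pronounced.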
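The effect of the sample size can also be checked against the Central Limit Theorem's prediction that the standard deviation of the sample mean is sigma / sqrt(n). The sketch below assumes the samplers behave like a Uniform(0, 1) variable (sigma = sqrt(1/12)), which follows from folding a Uniform(0, 2) draw onto [0, 1]. The names fold_sample and draw_means and the seed are hypothetical and chosen only for this illustration.

```r
# Illustrative check; fold_sample, draw_means, and the seed are assumptions for this sketch
set.seed(605)

# same logic as pdf_function_1: draw from Uniform(0, 2) and reflect values above 1
fold_sample <- function() {
  x <- runif(1, 0, 2)
  if (x <= 1) x else 2 - x
}

# return the sample means instead of plotting them, so they can be inspected
draw_means <- function(n, sampler, iterations = 1000) {
  replicate(iterations, mean(replicate(n, sampler())))
}

sample_sizes <- c(10, 20)
sigma <- sqrt(1 / 12)  # standard deviation of a Uniform(0, 1) variable

# observed spread of the sample means versus the CLT prediction sigma / sqrt(n)
observed_sd <- sapply(sample_sizes, function(n) sd(draw_means(n, fold_sample)))
predicted_sd <- sigma / sqrt(sample_sizes)

print(data.frame(n = sample_sizes, observed_sd = observed_sd, predicted_sd = predicted_sd))
```

If the observed and predicted columns agree closely, the spread of the sample means is shrinking at the rate the theorem predicts. A normal Q-Q plot of the means, for example with qqnorm, would be a natural visual complement.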