Directions

The objective of this assignment is to introduce you to R and R Markdown and to complete some basic data simulation exercises.

Please include all code needed to perform the tasks. This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

To submit this homework, create the document in RStudio, knit it with the knitr package (the Knit button in RStudio), and publish it to your RPubs account. Once it is uploaded, submit the link to that document on Moodle. Please make sure that the link is hyperlinked and that I can see both the visualization and the code required to create it.

Questions

  1. Simulate data for 30 draws from a normal distribution where the means and standard deviations vary among three distributions.
# place the code to simulate the data here
set.seed(123)
x <- rnorm(n=30, mean=c(50,80,102), sd=c(5.2,10.5,20))
x
##  [1]  47.08553  77.58314 133.17417  50.36664  81.35752 136.30130  52.39676
##  [8]  66.71686  88.26294  47.68256  92.85286 109.19628  52.08401  81.16217
## [15]  90.88318  59.29195  85.22743  62.66766  53.64705  75.03569  80.64353
## [22]  48.86653  69.22695  87.42218  46.74980  62.28972 118.75574  50.79754
## [29]  68.04956 127.07630
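The call above works because `rnorm()` recycles the length-3 `mean` and `sd` vectors across the 30 draws, so consecutive draws cycle through the three distributions. A minimal sketch that makes this assignment explicit; the `dist` labels and the `sim` data frame are illustrative names, not part of the original answer:

```r
set.seed(123)
means <- c(50, 80, 102)
sds   <- c(5.2, 10.5, 20)

# rnorm() recycles mean and sd: draw i uses parameter pair ((i - 1) %% 3) + 1
x <- rnorm(n = 30, mean = means, sd = sds)

# Label each draw with the distribution it came from (hypothetical labels)
sim <- data.frame(dist = rep(c("d1", "d2", "d3"), length.out = 30), x = x)

# Ten draws per distribution
table(sim$dist)

# Per-distribution sample means, expected to fall roughly near 50, 80 and 102
tapply(sim$x, sim$dist, mean)
```

Keeping the label next to each draw makes it straightforward to check the simulation per distribution instead of inspecting one mixed vector.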
  2. Simulate 2 continuous variables (normal distribution) (n=20) and plot the relationship between them.
# place the code to simulate the data here
# n = 20 draws per variable, as the question specifies
x = rnorm(20, mean = 20, sd=5.5)
y = rnorm(20, mean = 30, sd=10.5)
plot(y~x)
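Because `x` and `y` are drawn independently here, the scatterplot should show no systematic trend. If a visible relationship is wanted, one option is to build `y` from `x` plus noise. The sketch below assumes an arbitrary slope of 1.5 and a noise standard deviation of 3; both values are illustrative choices, not part of the assignment:

```r
set.seed(42)
n <- 20

x <- rnorm(n, mean = 20, sd = 5.5)

# y depends linearly on x plus random noise (slope and noise sd are assumed)
y <- 1.5 * x + rnorm(n, mean = 0, sd = 3)

plot(y ~ x)

# Sample correlation between the two simulated variables
cor(x, y)
```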

  3. Simulate 3 variables (x1, x2 and y). x1 and x2 should be drawn from a uniform distribution and y should be drawn from a normal distribution. Fit a multiple linear regression.
# place the code to simulate the data here
set.seed(12)
x1 = runif(40 ,min= 0,max=1)
x2 = runif(40 ,min = 0.5, max =4)
y = rnorm(40,mean=0,sd=1)
lm(y ~ x1 + x2)
## 
## Call:
## lm(formula = y ~ x1 + x2)
## 
## Coefficients:
## (Intercept)           x1           x2  
##     0.51357     -0.01396     -0.19519
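In the fit above, `y` is simulated independently of `x1` and `x2`, so the estimated slopes reflect only sampling noise. A sketch of the complementary case, where `y` is generated from known coefficients and the regression should approximately recover them; the intercept 2, slopes 3 and -1.5, and the object name `fit` are assumptions made for illustration:

```r
set.seed(12)
n <- 40

x1 <- runif(n, min = 0, max = 1)
x2 <- runif(n, min = 0.5, max = 4)

# Assumed true model: intercept 2, slope 3 for x1, slope -1.5 for x2, noise sd 1
y <- 2 + 3 * x1 - 1.5 * x2 + rnorm(n, mean = 0, sd = 1)

fit <- lm(y ~ x1 + x2)

# Estimated coefficients, to compare against the assumed true values
coef(fit)

# Proportion of variance in y explained by the two predictors
summary(fit)$r.squared
```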
  4. Simulate 3 letters, repeating each letter twice and the whole sequence 2 times.
# place the code to simulate the data here
rep(letters[1:3], each=2, times=2)
##  [1] "a" "a" "b" "b" "c" "c" "a" "a" "b" "b" "c" "c"
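The two arguments of `rep()` do different things, and when both are supplied `each` is applied before `times`. A short sketch showing each argument separately and then combined:

```r
# each: repeat every element in place
rep(letters[1:3], each = 2)    # "a" "a" "b" "b" "c" "c"

# times: repeat the whole vector
rep(letters[1:3], times = 2)   # "a" "b" "c" "a" "b" "c"

# Combined: the each-expanded vector is repeated twice, giving 12 elements
length(rep(letters[1:3], each = 2, times = 2))
```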
  5. Create a data frame with 3 groups, 2 factors, and two quantitative response variables. Use the replicate function (n = 25).
# place the code to simulate the data here
set.seed(12)
group = rep(letters[10:12],length.out=25)
factor= rep(letters[15:16], length.out=25)
response = replicate(n=2, expr = rnorm(25, mean = 3, sd = 4))
data.frame(group, factor, response)
##    group factor         X1         X2
## 1      j      o -2.9222704  1.9304607
## 2      k      p  9.3086779  2.2035774
## 3      l      o -0.8269779  3.5244904
## 4      j      p -0.6800210  3.5831996
## 5      k      o -4.9905684  4.4482589
## 6      l      p  1.9108158  5.6959247
## 7      j      o  1.7386052 11.2881431
## 8      k      p  0.4869791  0.8358854
## 9      l      o  2.5741445 -1.2819686
## 10     j      p  4.7120592  1.5101731
## 11     k      o -0.1108783  1.0594346
## 12     l      p -2.1755292  4.0991367
## 13     j      o -0.1182660  1.0819498
## 14     k      p  3.0478070  6.1924213
## 15     l      o  2.3903350 -1.0178048
## 16     j      p  0.1861430  3.4199369
## 17     k      o  7.7555166 -1.6239716
## 18     l      p  4.3620491  5.3125385
## 19     j      o  5.0278727 -3.3825026
## 20     k      p  1.8267794  1.7659854
## 21     l      o  3.8945657  4.7978637
## 22     j      p 11.0288058 -0.9082131
## 23     k      o  7.0479165  3.7599914
## 24     l      p  1.7901630  5.9258134
## 25     j      o -1.1009794  1.0296036
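In the output above, `replicate()` returns a 25 x 2 matrix, which `data.frame()` splits into the default columns `X1` and `X2`, and the grouping columns are stored as plain character vectors. A sketch of one possible variation that names the response columns and stores the grouping variables as factors; the names `n_obs`, `dat`, `resp1` and `resp2` are illustrative and not part of the original answer:

```r
set.seed(12)
n_obs <- 25

# replicate() evaluates the expression twice and binds the results column-wise
response <- replicate(n = 2, expr = rnorm(n_obs, mean = 3, sd = 4))
colnames(response) <- c("resp1", "resp2")  # illustrative column names

dat <- data.frame(
  group  = factor(rep(letters[10:12], length.out = n_obs)),
  factor = factor(rep(letters[15:16], length.out = n_obs)),
  response
)

# Column types and the number of rows in each group-by-factor cell
str(dat)
table(dat$group, dat$factor)
```

With 25 rows spread over 3 x 2 cells the design cannot be perfectly balanced, which the cross-tabulation makes visible.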