Directions

The objective of this assignment is to introduce you to R and R Markdown and to complete some basic data simulation exercises.

Please include all code needed to perform the tasks. This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

To submit this homework, create the document in RStudio using the knitr package (the Knit button is included in RStudio), then publish the document to your RPubs account. Once it is uploaded, submit the link to that document on Moodle. Please make sure that the link is hyperlinked and that I can see both the visualization and the code required to create it.

Questions

  1. Simulate data for 30 draws from a normal distribution where the means and standard deviations vary among three distributions.
# 30 draws in total; the three means and sds are recycled in order across the draws
rnorm(30, mean = c(10, 20, 30), sd = c(10, 20, 30))
##  [1]  0.4642120 23.3603678 65.7516882 -3.9554267  5.5847457 19.8950573
##  [7]  2.6945169  3.7808181 40.9022037 -2.6708016 14.4075507 40.9902862
## [13]  6.5518186 14.4189569 73.3193926 -5.8883631 -3.8203586 14.3069844
## [19]  4.7733510 49.4985842 52.1800248  9.4080226 -5.7537133 -4.2309170
## [25]  3.1446177  0.6270015  5.6515733 15.9383017 12.8850809 14.1226818
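The call above relies on vector recycling, so it may help to make explicit which draw belongs to which distribution. The following is a minimal sketch, assuming hypothetical labels `d1`, `d2`, `d3` and an arbitrary seed that are not part of the original answer:

```r
set.seed(1)  # hypothetical seed, used only so the sketch is reproducible

means <- c(10, 20, 30)
sds   <- c(10, 20, 30)

# mean and sd are recycled, so draws 1, 4, 7, ... use means[1] and sds[1],
# draws 2, 5, 8, ... use means[2] and sds[2], and so on
draws <- rnorm(30, mean = means, sd = sds)

# label each draw with the distribution it came from (labels are hypothetical)
sim <- data.frame(
  dist  = rep(c("d1", "d2", "d3"), length.out = 30),
  value = draws
)

# each distribution contributes 10 of the 30 draws
table(sim$dist)

# sample mean and sd per distribution, for comparison with the targets
tapply(sim$value, sim$dist, mean)
tapply(sim$value, sim$dist, sd)
```

With only 10 draws per distribution, the sample means and standard deviations can differ noticeably from the targets.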
  2. Simulate 2 continuous variables (normal distribution) (n=20) and plot the relationship between them.
# simulate two independent normal variables, 20 draws each
x = rnorm(20, mean = 10, sd = 10)
y = rnorm(20, mean = 20, sd = 2)
# scatterplot of y against x with the fitted least-squares line in red
plot(y ~ x)
abline(lm(y ~ x), col = "red")
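Because `x` and `y` are drawn independently, the red line estimates a true slope of zero, and any visible trend is sampling noise. One way to put numbers on the plotted relationship is sketched below; the seed and the object name `fit` are assumptions added for illustration:

```r
set.seed(2)  # hypothetical seed, used only so the sketch is reproducible

x <- rnorm(20, mean = 10, sd = 10)
y <- rnorm(20, mean = 20, sd = 2)

fit <- lm(y ~ x)

coef(fit)   # intercept and slope of the fitted line
cor(x, y)   # sample correlation between the two variables

# in simple linear regression the slope equals cor(x, y) * sd(y) / sd(x)
all.equal(unname(coef(fit)["x"]), cor(x, y) * sd(y) / sd(x))
```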

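  3. Simulate 3 variables (x1, x2 and y). x1 and x2 should be drawn from a uniform distribution and y should be drawn from a normal distribution. Fit a multiple linear regression.
# simulate two uniform predictors and a normal response, then fit the regression
# note: min and max are both 10 for x1, so x1 is constant and collinear with the
# intercept, and lm() reports NA for its coefficient; distinct bounds are needed
# for an estimable x1 coefficient
x1 = runif(25, min=10, max=10)
x2 = runif(25, min=2, max=20)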
y = rnorm(25, mean = 10, sd = 10)
fit = lm(y ~ x1 + x2)
fit
## 
## Call:
## lm(formula = y ~ x1 + x2)
## 
## Coefficients:
## (Intercept)           x1           x2  
##     16.5345           NA      -0.3797
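The `NA` coefficient for `x1` in the output above follows from `min` and `max` both being 10, which makes `x1` constant. The sketch below contrasts that case with a predictor that actually varies. The bounds `min = 1, max = 10`, the seed, and the names `fit_const` and `x1_const` are assumptions chosen for illustration, not values from the assignment:

```r
set.seed(3)  # hypothetical seed, used only so the sketch is reproducible

x1 <- runif(25, min = 1, max = 10)   # assumed bounds; any min < max works
x2 <- runif(25, min = 2, max = 20)
y  <- rnorm(25, mean = 10, sd = 10)

# with a varying x1, all three coefficients are estimable
fit <- lm(y ~ x1 + x2)
coef(fit)

# for comparison: equal bounds give a constant predictor
x1_const <- runif(25, min = 10, max = 10)
unique(x1_const)

# a constant column is collinear with the intercept, so its coefficient is NA
fit_const <- lm(y ~ x1_const + x2)
coef(fit_const)
```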
  4. Simulate 3 letters, repeating each letter twice, 2 times.
# each = 2 repeats every letter twice; times = 2 then repeats that whole sequence
rep(LETTERS[1:3], times = 2, each = 2)
##  [1] "A" "A" "B" "B" "C" "C" "A" "A" "B" "B" "C" "C"
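When `each` and `times` are both supplied, `rep()` applies `each` first and `times` second. A short sketch of that ordering, using the hypothetical names `a` and `b`:

```r
# each and times in a single call
a <- rep(LETTERS[1:3], times = 2, each = 2)

# the same result built in two explicit steps: each first, then times
b <- rep(rep(LETTERS[1:3], each = 2), times = 2)

identical(a, b)

# 3 letters * 2 (each) * 2 (times) = 12 elements
length(a)
```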
  5. Create a data frame with 3 groups, 2 factors and 2 quantitative response variables. Use the replicate function (n = 25).
# recycle the group and factor labels to length n, then simulate the two responses
n = 25
groups = rep(LETTERS[1:3], length.out = n)
factors = rep(letters[4:5], length.out = n)
var1 = rnorm(n, mean = 10, sd = 10)
var2 = runif(n, min=2, max=20)

dat = data.frame(groups, factors, var1, var2)
dat
##    groups factors      var1      var2
## 1       A       d  8.975084 11.264521
## 2       B       e -9.010786  8.579722
## 3       C       d 17.593911 12.174055
## 4       A       e 12.924234 15.022979
## 5       B       d  8.851101 17.178417
## 6       C       e 13.839431  9.156156
## 7       A       d  1.553243  5.854456
## 8       B       e 22.755929  4.829224
## 9       C       d 27.695548 16.721481
## 10      A       e  2.912156 12.544045
## 11      B       d 11.782754  2.360489
## 12      C       e 13.995570 18.832731
## 13      A       d 19.352594 19.064665
## 14      B       e 14.742201  8.308876
## 15      C       d 13.044270 16.718763
## 16      A       e 29.641786 19.209226
## 17      B       d 15.103992  6.161840
## 18      C       e -2.472361 16.957045
## 19      A       d  3.395860 17.915644
## 20      B       e 10.037858  4.022905
## 21      C       d 10.256646 10.123290
## 22      A       e  9.753056  4.566554
## 23      B       d  7.619863  3.747702
## 24      C       e 19.456485 11.756887
## 25      A       d  6.594690  9.206108
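The question asks for the `replicate` function, which the code above does not call. The sketch below shows one possible way to generate the two response variables with `replicate()`. It assumes both responses are normal (the original draws `var2` from a uniform distribution), and the names `n_obs` and `resp` and the seed are hypothetical:

```r
set.seed(5)  # hypothetical seed, used only so the sketch is reproducible

n_obs <- 25

# replicate() evaluates the expression twice and binds the results
# as the columns of an n_obs x 2 matrix
resp <- replicate(2, rnorm(n_obs, mean = 10, sd = 10))
colnames(resp) <- c("var1", "var2")

# the matrix columns become the var1 and var2 columns of the data frame
dat <- data.frame(
  groups  = rep(LETTERS[1:3], length.out = n_obs),
  factors = rep(letters[4:5], length.out = n_obs),
  resp
)

str(dat)
```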