Directions

The objective of this assignment is to introduce you to R and R Markdown and to complete some basic data simulation exercises.

Please include all code needed to perform the tasks. This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown, see http://rmarkdown.rstudio.com.

To submit this homework, create the document in RStudio using the knitr package (the Knit button is included in RStudio), then publish the document to your RPubs account. Once it is uploaded, submit the link to that document on Moodle. Please make sure that the link is hyperlinked and that I can see the visualization and the code required to create it.

Questions

  1. Simulate data for 30 draws from a normal distribution where the means and standard deviations vary among three distributions.
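# mean and sd are recycled across the 30 draws, giving 10 draws from each of the three distributions
rnorm(30, mean = c(10, 20, 30), sd = c(5, 15, 25))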
##  [1] 15.3345035  8.1569921 58.2696878  7.4199825 36.8410360 10.0698822
##  [7]  6.7375014 19.6683760 10.8452232 14.5262757 39.4518779  9.5418867
## [13] 12.6712691 11.5471868 39.3310589 12.7576049 29.8074537 37.5201608
## [19] 11.6128182 42.6691894 31.6457251  7.1563238 28.3989176 10.0986324
## [25] 19.4638160 -0.7853741 21.6161883 15.2661620 -0.2788826 42.5345483
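
Because `rnorm()` recycles the `mean` and `sd` vectors, the draws above alternate between the three distributions rather than arriving in blocks. A minimal sketch of how one might label the draws and check each distribution separately is shown here; the seed and the labels `A`, `B`, `C` are assumptions added for illustration and are not part of the original answer.

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible

means <- c(10, 20, 30)
sds   <- c(5, 15, 25)

# Draw i uses means[((i - 1) %% 3) + 1] and the matching sd
draws <- rnorm(30, mean = means, sd = sds)

# Hypothetical labels that follow the same recycling pattern
dist_id <- rep(c("A", "B", "C"), times = 10)

# Number of draws per distribution
table(dist_id)

# Sample mean of the draws from each distribution
group_means <- tapply(draws, dist_id, mean)
group_means
```

With only 10 draws per distribution, the sample means can sit noticeably away from 10, 20 and 30, especially for the distributions with larger standard deviations.
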
  1. Simulate 2 continuous variables (normal distribution) (n=20) and plot the relationship between them
x <- rnorm(20, mean = 10, sd = 10)
y <- rnorm(20, mean = 10, sd = 10)
plot(x, y)
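
Since `x` and `y` are drawn independently, the scatterplot is expected to show no systematic relationship. As a rough sketch for comparison, the code below also simulates a second response that depends on `x`; the seed, the names `y_indep` and `y_dep`, and the intercept, slope and noise values are all assumptions chosen for illustration.

```r
set.seed(42)  # assumed seed for reproducibility

x <- rnorm(20, mean = 10, sd = 10)

# Independent of x, as in the answer above
y_indep <- rnorm(20, mean = 10, sd = 10)

# Hypothetical linear dependence on x plus normal noise
y_dep <- 2 + 0.8 * x + rnorm(20, mean = 0, sd = 3)

# Pearson correlations for the two cases
cor_indep <- cor(x, y_indep)
cor_dep   <- cor(x, y_dep)

plot(x, y_dep)
```
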

  1. Simulate 3 variables (x1, x2 and y). x1 and x2 should be drawn from a uniform distribution and y should be drawn from a normal distribution. Fit a multiple linear regression.
x1 <- runif(10, min = 0, max = 10)
x2 <- runif(10, min = 10, max = 20)
y <- rnorm(10, mean = 5, sd = 5)
lm(y ~ x1 + x2)
## 
## Call:
## lm(formula = y ~ x1 + x2)
## 
## Coefficients:
## (Intercept)           x1           x2  
##     15.0522      -1.0558      -0.5281
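
Printing the `lm` object shows only the coefficient estimates. A short sketch of how the fit could be stored and inspected further follows; the seed and the object name `fit` are assumptions, so the estimates will not match the output above.

```r
set.seed(123)  # assumed seed; the original output was produced without one

n  <- 10
x1 <- runif(n, min = 0, max = 10)
x2 <- runif(n, min = 10, max = 20)
y  <- rnorm(n, mean = 5, sd = 5)

# Store the fitted model so it can be inspected afterwards
fit <- lm(y ~ x1 + x2)

# Coefficient table with standard errors, t values and p-values
summary(fit)

# Named vector of estimates: (Intercept), x1, x2
coef(fit)
```

Because `y` is simulated independently of `x1` and `x2`, the true slopes are zero, and any nonzero estimates from ten observations reflect sampling noise.
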
  1. Simulate 3 letters repeating each letter twice, 2 times.
# each = 2 doubles every letter first; times = 2 then repeats the whole sequence
rep(letters[24:26], each = 2, times = 2)
##  [1] "x" "x" "y" "y" "z" "z" "x" "x" "y" "y" "z" "z"
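
To make the roles of the two arguments explicit, the sketch below applies `each` and `times` separately and then together. The object names are assumptions introduced for illustration.

```r
x <- letters[24:26]  # "x" "y" "z"

# each repeats every element in place
by_each <- rep(x, each = 2)
by_each
# "x" "x" "y" "y" "z" "z"

# times repeats the whole vector
by_times <- rep(x, times = 2)
by_times
# "x" "y" "z" "x" "y" "z"

# With both, each is applied first and the result is then repeated
combined <- rep(x, each = 2, times = 2)
length(combined)
# 12
```
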
  1. Create a dataframe (n = 27) with 3 groups, 2 factors and two quantitative response variables. Use the replicate function.
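# rep() with no times/each returns the vector as-is; data.frame() then recycles
# the 3 group labels and 2 factor levels to match the 30 draws in x and y
data.frame(group = rep(letters[24:26]),
           factor = rep(LETTERS[22:23]),
           x = rnorm(30, mean = 5, sd = 10),
           y = rnorm(30, mean = 10, sd = 15))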
##    group factor           x           y
## 1      x      V   2.1402916  19.8696669
## 2      y      W  13.0919709  46.1252129
## 3      z      V -12.5404129  13.1850831
## 4      x      W   7.9292425  25.3979756
## 5      y      V  10.4382985   4.0575204
## 6      z      W   6.2830973   9.2277501
## 7      x      V  12.1413293   9.1803589
## 8      y      W   2.3827360 -16.1060147
## 9      z      V  13.6474568  -4.9131286
## 10     x      W   7.4295104   1.4957102
## 11     y      V  -1.3319576  11.7015417
## 12     z      W  11.9336959 -16.8972679
## 13     x      V   9.7701289  13.5985927
## 14     y      W   2.6552614   4.7634648
## 15     z      V   5.3271742  13.8838140
## 16     x      W   8.7982415  13.4642673
## 17     y      V  -0.4968102  -4.3623327
## 18     z      W  15.9380359  15.3218145
## 19     x      V  25.1346867 -18.2980601
## 20     y      W   1.9546573   7.8525988
## 21     z      V  -7.8164726   0.9049078
## 22     x      W  -0.8779896   8.7140074
## 23     y      V -12.5494418   3.7661348
## 24     z      W  10.4746700  18.9670676
## 25     x      V  -1.5182702  15.4574353
## 26     y      W  23.3907257 -13.4268685
## 27     z      V  23.0396131   6.4339132
## 28     x      W -15.8844719  -1.7311025
## 29     y      V  12.2954227  19.0803778
## 30     z      W  15.3209601  16.0518760
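
The question asks for n = 27 and for the `replicate` function, whereas the code above yields 30 rows and relies on recycling. One possible sketch that matches the stated size is given below. It assumes that "2 factors" means a factor with two levels, as in the answer above, and that both response variables share one mean and standard deviation for brevity; the seed and the object names `responses` and `dat` are also assumptions.

```r
set.seed(2024)  # assumed seed for reproducibility

n <- 27

# replicate() evaluates rnorm() twice and binds the results
# into a 27 x 2 matrix, one column per response variable
responses <- replicate(2, rnorm(n, mean = 10, sd = 15))
colnames(responses) <- c("x", "y")

dat <- data.frame(
  group  = rep(letters[24:26], each = 9),        # 3 groups of 9 rows
  factor = rep(LETTERS[22:23], length.out = n),  # 2 levels, cut off at 27
  responses
)

str(dat)
```

`length.out = n` is needed for the two-level factor because 27 is not a multiple of 2, so `data.frame()` would refuse to recycle a length-2 vector to 27 rows.
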