Directions

The objective of this assignment is to introduce you to R and R Markdown and to complete some basic data simulation exercises.

Please include all code needed to perform the tasks. This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

To submit this homework, create the document in RStudio, knit it with the knitr package (the Knit button in RStudio), and publish the knitted document to your RPubs account. Once it is uploaded, submit the link to that document on Canvas. Please make sure that the link is hyperlinked and that I can see the visualization and the code required to create it.

Questions

  1. Simulate data for 30 draws from a normal distribution where the means and standard deviations vary among three distributions.
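# Simulate 30 draws; the length-3 mean and sd vectors are recycled,
# so draws 1, 4, 7, ... use mean 0 and sd 2, draws 2, 5, 8, ... use mean 5 and sd 14, and so on
set.seed(16)
rnorm(30, mean = c(0, 5, 20), sd = c(2, 14, 25))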
##  [1]   0.9528268   3.2446800  47.4054050  -2.8884581  21.0696101   8.2896989
##  [7]  -2.0119012   5.8898775  45.6243150   1.1462840  30.8605494  22.7983342
## [13]  -1.4920746  28.2149912  38.0430142  -3.3261610  13.0627335  31.8190029
## [19]  -1.0854633  20.7876190 -21.1949404  -0.6283479   2.4424580  56.7619623
## [25]  -1.7317976  26.3845378  46.3544515   2.0601420  16.7622520  25.4241176
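The call above relies on R recycling the length-3 `mean` and `sd` vectors across the 30 draws. As a minimal sketch (the names `means`, `sds`, and `sim` are illustrative and not part of the assignment), the draws can be labelled by the distribution they came from, which makes the recycling pattern easier to check:

```r
set.seed(16)
means <- c(0, 5, 20)
sds   <- c(2, 14, 25)

# Same call as above, with the parameter vectors named
draws <- rnorm(30, mean = means, sd = sds)

# Recycling pairs draw i with element ((i - 1) %% 3) + 1 of means and sds
sim <- data.frame(
  dist  = rep(1:3, times = 10),
  mean  = rep(means, times = 10),
  sd    = rep(sds, times = 10),
  value = draws
)

# Sample mean of the 10 draws from each of the three distributions
tapply(sim$value, sim$dist, mean)
```

With only 10 draws per distribution, the sample means can sit far from 0, 5, and 20, especially for the distributions with large standard deviations.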
  1. Simulate 2 continuous variables (normal distribution, n = 20) and plot the relationship between them.
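# Seed once so that x and y are independent draws; re-seeding before y
# would make y an exact linear function of x
set.seed(16)
x <- rnorm(20, mean = 0, sd = 1)
y <- rnorm(20, mean = 1, sd = 2)
# Scatterplot of the relationship between the two variables
plot(x, y)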
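One thing to watch when simulating two variables: calling `set.seed()` with the same value immediately before each `rnorm()` call restarts the random number stream, so both calls use the same underlying standard normal draws. The sketch below (the name `y_same` is illustrative) compares that case with seeding once:

```r
# Case 1: same seed before each draw
set.seed(16)
x <- rnorm(20, mean = 0, sd = 1)
set.seed(16)
y_same <- rnorm(20, mean = 1, sd = 2)

# y_same is just a shifted and rescaled copy of x
all.equal(y_same, 1 + 2 * x)
cor(x, y_same)  # equal to 1 up to floating-point error

# Case 2: seed once, then draw both variables from the same stream
set.seed(16)
x <- rnorm(20, mean = 0, sd = 1)
y <- rnorm(20, mean = 1, sd = 2)
cor(x, y)       # sample correlation between two independent draws
```

Seeding once still keeps the whole script reproducible, while leaving `x` and `y` independent.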
  1. Simulate 3 variables (x1, x2 and y). x1 and x2 should be drawn from a uniform distribution and y should be drawn from a normal distribution. Fit a multiple linear regression.
set.seed(16)
x1 <- runif(25, min = 0, max = 30)
x2 <- runif(25, min = 30, max = 100)
y <- rnorm(25, mean = 10, sd = 2)
# Multiple linear regression of y on x1 and x2
model <- lm(y ~ x1 + x2)
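The fitted model object is not printed above, so here is a minimal sketch of how it might be inspected. Because `y` is simulated independently of `x1` and `x2`, the slope estimates would be expected to be close to zero and the intercept close to 10, although any single sample of 25 can deviate from that:

```r
set.seed(16)
x1 <- runif(25, min = 0, max = 30)
x2 <- runif(25, min = 30, max = 100)
y  <- rnorm(25, mean = 10, sd = 2)

model <- lm(y ~ x1 + x2)

# Estimated intercept and the two slopes
coef(model)

# Full coefficient table with standard errors, t values, and p values
summary(model)$coefficients

# Proportion of variance in y explained by x1 and x2
summary(model)$r.squared
```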
  1. Simulate 3 letters, repeating each letter twice and the whole sequence 2 times.
rep(letters[1:3], each = 2, times = 2)
##  [1] "a" "a" "b" "b" "c" "c" "a" "a" "b" "b" "c" "c"
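The `each` and `times` arguments of `rep()` do different things, and the call above combines them. A short sketch that separates the two (the variable names are illustrative):

```r
# each: repeat every element in place before moving to the next one
by_each <- rep(letters[1:3], each = 2)
by_each   # "a" "a" "b" "b" "c" "c"

# times: repeat the whole vector end to end
by_times <- rep(letters[1:3], times = 2)
by_times  # "a" "b" "c" "a" "b" "c"

# both: apply each first, then repeat that result times over
both <- rep(letters[1:3], each = 2, times = 2)
length(both)  # 12
```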
  1. Create a data frame (n = 27) with 3 groups, 2 factors and two quantitative response variables. Use the replicate function.
set.seed(16)
# Repeat the three labels 9 times each pass to fill all 27 rows
data.frame(group = rep(letters[1:3], times = 9),
           factor = rep(LETTERS[4:6], times = 9),
           x = rnorm(27, mean = 0, sd = 1),
           y = rnorm(27, mean = 10, sd = 15))
##    group factor           x           y
## 1      a      D  0.47641339 25.45106513
## 2      b      E -0.12538000 22.60241286
## 3      c      F  1.09621620 13.25447055
## 4      a      D -1.44422904 -0.08788364
## 5      b      E  1.14782930 11.98897791
## 6      c      F -0.46841204  8.93608980
## 7      a      D -1.00595059 -4.14043206
## 8      b      E  0.06356268 -5.33046500
## 9      c      F  1.02497260 14.20832686
## 10     a      D  0.57314202 18.17175050
## 11     b      E  1.84718210 11.96304628
## 12     c      F  0.11193337 14.22766589
## 13     a      D -0.74603732  5.60903873
## 14     b      E  1.65821366 -9.88029682
## 15     c      F  0.72172057 40.97703478
## 16     a      D -1.66308050 13.63259561
## 17     b      E  0.57590953  4.76354156
## 18     c      F  0.47276012  0.53781440
## 19     a      D -0.54273166 14.25860074
## 20     b      E  1.12768707 11.82365480
## 21     c      F -1.64779762 18.49516164
## 22     a      D -0.31417395 18.53549346
## 23     b      E -0.18268157  8.64119856
## 24     c      F  1.47047849 13.45744741
## 25     a      D -0.86589878 21.29280677
## 26     b      E  1.52746698 22.95078117
## 27     c      F  1.05417806 22.57297957
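The question mentions the `replicate()` function, which the code above does not call. As one possible sketch, the data frame construction can be wrapped in a helper and repeated several times. The helper name `make_df`, its `n_per_group` argument, and the choice of 3 replicates are assumptions made for illustration and do not come from the assignment:

```r
set.seed(16)

# Hypothetical helper: builds one simulated data set of 3 * n_per_group rows
make_df <- function(n_per_group = 9) {
  n <- 3 * n_per_group
  data.frame(
    group  = rep(letters[1:3], times = n_per_group),
    factor = rep(LETTERS[4:6], times = n_per_group),
    x      = rnorm(n, mean = 0, sd = 1),
    y      = rnorm(n, mean = 10, sd = 15)
  )
}

# replicate() re-evaluates the expression on every iteration, so each
# data frame gets fresh random draws; simplify = FALSE keeps a list
sims <- replicate(n = 3, expr = make_df(), simplify = FALSE)

length(sims)        # 3
sapply(sims, nrow)  # 27 27 27
```

Each element of `sims` has the same columns as the data frame shown above, so the same analysis can be applied to every simulated data set, for example with `lapply()`.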