Directions

The objective of this assignment is to introduce you to R and R Markdown and to complete some basic data simulation exercises.

Please include all code needed to perform the tasks. This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

To submit this homework, create the document in RStudio, knit it with the knitr package (the Knit button in RStudio), and publish it to your RPubs account. Once it is uploaded, submit the link to that document on Canvas. Please make sure the link is hyperlinked and that I can see both the visualization and the code required to create it.

Questions

  1. Simulate data for 30 draws from a normal distribution where the means and standard deviations vary among three distributions.
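# mean and sd are recycled across the 30 draws: draws 1, 4, 7, ... use mean 0, sd 1;
# draws 2, 5, 8, ... use mean 10, sd 10; draws 3, 6, 9, ... use mean 15, sd 15
random = rnorm(30, mean = c(0, 10, 15), sd = c(1, 10, 15))
random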
##  [1]   0.04234419  19.42119050  -3.95479323   0.09587249 -11.55391971
##  [6]   1.25486930   0.42952044  -4.08173062  28.34804285   0.76950597
## [11]   5.69970601   5.66903042  -0.11247435  -3.32118975  21.76707827
## [16]  -2.17521579  21.22360083  25.01348132  -0.70347591   3.20077530
## [21]   3.77232841  -0.81555328   3.86329722   7.22337982   0.58427312
## [26]  10.31222783  12.86797706   1.45203162   7.82917491   1.67358472
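The recycling of `mean` and `sd` can be made explicit by labelling each draw with the distribution it came from. The following is a minimal sketch, not part of the original answer: the seed value and the names `dist_id` and `sim` are assumptions chosen for illustration.

```r
# Sketch: label each draw with the distribution it was sampled from.
set.seed(1)  # arbitrary seed, used only to make this sketch reproducible

means <- c(0, 10, 15)
sds   <- c(1, 10, 15)

# rnorm() recycles mean and sd, so draw i uses element ((i - 1) %% 3) + 1
draws <- rnorm(30, mean = means, sd = sds)

# hypothetical label column: which of the three distributions produced each draw
dist_id <- rep(1:3, length.out = 30)
sim <- data.frame(dist_id = factor(dist_id), value = draws)

# ten draws per distribution
table(sim$dist_id)

# per-distribution sample means; with only ten draws each they will be
# roughly, not exactly, 0, 10 and 15
aggregate(value ~ dist_id, data = sim, FUN = mean)
```

Keeping the label next to the values makes it easy to check or plot the three distributions separately.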
  2. Simulate 2 continuous variables (normal distribution) (n=20) and plot the relationship between them
# one seed for the whole chunk, so x and y are independent draws
set.seed(9)
x = rnorm(20, mean = 1, sd = 2.5)
y = rnorm(20, mean = 2.5, sd = 4)

plot(y~x)
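If the goal is a visible relationship in the plot, one option is to generate the second variable from the first plus normal noise. This is a minimal sketch under assumed values: the intercept, slope and residual standard deviation below are illustrative and do not come from the assignment.

```r
# Sketch: simulate y so that it depends linearly on x.
set.seed(9)
n <- 20

x <- rnorm(n, mean = 1, sd = 2.5)

# assumed relationship: intercept 2.5, slope 1.5, residual sd 2 (illustrative values)
y <- 2.5 + 1.5 * x + rnorm(n, mean = 0, sd = 2)

# sample correlation between the two variables
r <- cor(x, y)
r

# scatterplot with the least-squares line overlaid
plot(y ~ x, xlab = "x", ylab = "y", main = "Simulated linear relationship")
abline(lm(y ~ x), lty = 2)
```

Because y is still a normal variable (a linear function of a normal variable plus normal noise), this stays within the "2 continuous variables (normal distribution)" requirement.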

  3. Simulate 3 variables (x1, x2 and y). x1 and x2 should be drawn from a uniform distribution and y should be drawn from a normal distribution. Fit a multiple linear regression.
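# y is drawn independently of x1 and x2, so the true slopes are zero
set.seed(10)
y = rnorm(40, mean = 0, sd = 1)
x1 = runif(40, min = 10, max = 50)
x2 = runif(40, min = 200, max = 500)

# fit y on both predictors
model = lm(y ~ x1 + x2)
# residual diagnostic plots for the fitted model
plot(model)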
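To check that the regression recovers known effects, y can instead be built from x1 and x2 with chosen coefficients. The sketch below assumes illustrative "true" coefficients (5, 0.8 and -0.05) that are not part of the assignment; the names `dat` and `fit` are also hypothetical.

```r
# Sketch: simulate y with known coefficients and see whether lm() recovers them.
set.seed(10)
n <- 40

x1 <- runif(n, min = 10, max = 50)
x2 <- runif(n, min = 200, max = 500)

# assumed true model (illustrative values): y = 5 + 0.8 * x1 - 0.05 * x2 + noise
y <- 5 + 0.8 * x1 - 0.05 * x2 + rnorm(n, mean = 0, sd = 1)

dat <- data.frame(y, x1, x2)
fit <- lm(y ~ x1 + x2, data = dat)

# coefficient table, standard errors and R-squared
summary(fit)

# estimated coefficients, to compare with 5, 0.8 and -0.05
coef(fit)
```

Passing a data frame through the `data` argument keeps the variables together, which makes later calls such as `predict()` simpler.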

  4. Simulate 3 letters repeating each letter twice, 2 times.
# each = 2 doubles every letter; times = 2 then repeats the whole sequence
rep(letters[1:3], each=2, times =2)
##  [1] "a" "a" "b" "b" "c" "c" "a" "a" "b" "b" "c" "c"
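The difference between `each` and `times` is easier to see when they are applied separately. A short sketch, using the same three letters:

```r
# Sketch: how each and times change the output of rep().
x <- letters[1:3]

rep(x, each = 2)             # "a" "a" "b" "b" "c" "c"
rep(x, times = 2)            # "a" "b" "c" "a" "b" "c"

# each is applied first, then the doubled sequence is repeated times times
rep(x, each = 2, times = 2)

# a vector for times repeats each element a different number of times
rep(x, times = c(1, 2, 3))   # "a" "b" "b" "c" "c" "c"
```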
  5. Create a list of 6 datasets (n = 30) each with 3 groups, 2 factors and two quantitative response variables. Use the replicate function.
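# six data frames built by hand; the seed is reset to 16 before each one, so all six
# share the same underlying standard normal draws, shifted and rescaled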
set.seed(16)
DataFrame1 = data.frame(group=rep(LETTERS[1:3], length.out=30),factor= rep(letters[7:8],length.out=30),
                            response1 = rnorm(30, mean= 10, sd= 20), response2 = rnorm(30, mean= 100, sd= 150))
set.seed(16)
DataFrame2 = data.frame(group=rep(LETTERS[2:4], length.out=30),factor= rep(letters[8:9],length.out=30),
                            response1 = rnorm(30, mean= 10, sd= 20), response2 = rnorm(30, mean= 10, sd= 15))

set.seed(16)
DataFrame3 = data.frame(group=rep(LETTERS[3:5], length.out=30),factor= rep(letters[8:9],length.out=30),
                            response1 = rnorm(30, mean= 0, sd= 1), response2 = rnorm(30, mean= 1, sd= 15))

set.seed(16)
DataFrame4 = data.frame(group=rep(LETTERS[1:3], length.out=30),factor= rep(letters[9:10],length.out=30),
                            response1 = rnorm(30, mean= 10, sd= 100), response2 = rnorm(30, mean= 1, sd= 2))

set.seed(16)
DataFrame5 = data.frame(group=rep(LETTERS[2:4], length.out=30),factor= rep(letters[10:11],length.out=30),
                            response1 = rnorm(30, mean= 100, sd= 250), response2 = rnorm(30, mean= 1, sd= 20))

set.seed(16)
DataFrame6 = data.frame(group=rep(LETTERS[2:4], length.out=30),factor= rep(letters[5:6],length.out=30),
                            response1 = rnorm(30, mean= 100, sd= 250), response2 = rnorm(30, mean= 150, sd= 200))
# collect the six data frames into the list the question asks for
datasets = list(DataFrame1, DataFrame2, DataFrame3, DataFrame4, DataFrame5, DataFrame6)
DataFrame6
##    group factor  response1  response2
## 1      B      e  219.10335   15.49488
## 2      C      f   68.65500  176.51971
## 3      D      e  374.05405  135.81453
## 4      B      f -261.05726  -38.53909
## 5      C      e  386.95732  -54.40620
## 6      D      f  -17.10301  206.11102
## 7      B      e -151.48765  258.95667
## 8      C      f  115.89067  176.17395
## 9      D      e  356.24315  206.36888
## 10     B      f  243.28550   91.45385
## 11     C      e  561.79553 -115.07062
## 12     D      f  127.98334  563.02713
## 13     B      e  -86.50933  198.43461
## 14     C      f  514.55342   80.18055
## 15     D      e  280.43014   23.83753
## 16     B      f -315.77012  206.78134
## 17     C      e  243.97738  174.31540
## 18     D      f  218.19003  263.26882
## 19     B      e  -35.68291  263.80658
## 20     C      f  381.92177  131.88265
## 21     D      e -311.94940  196.09930
## 22     B      f   21.45651  300.57076
## 23     C      e   54.32961  322.67708
## 24     D      f  467.61962  317.63973
## 25     B      e -116.47470 -203.17042
## 26     C      f  481.86675  286.69890
## 27     D      e  363.54451  382.60226
## 28     B      f  357.51775  -90.12037
## 29     C      e  310.04021  799.90602
## 30     D      f  154.24118   14.85285
DataFrame5
##    group factor  response1   response2
## 1      B      j  219.10335 -12.4505115
## 2      C      k   68.65500   3.6519705
## 3      D      j  374.05405  -0.4185469
## 4      B      k -261.05726 -17.8539094
## 5      C      j  386.95732 -19.4406200
## 6      D      k  -17.10301   6.6111025
## 7      B      j -151.48765  11.8956673
## 8      C      k  115.89067   3.6173950
## 9      D      j  356.24315   6.6368878
## 10     B      k  243.28550  -4.8546150
## 11     C      j  561.79553 -25.5070624
## 12     D      k  127.98334  42.3027130
## 13     B      j  -86.50933   5.8434608
## 14     C      k  514.55342  -5.9819446
## 15     D      j  280.43014 -11.6162475
## 16     B      k -315.77012   6.6781343
## 17     C      j  243.97738   3.4315397
## 18     D      k  218.19003  12.3268822
## 19     B      j  -35.68291  12.3806579
## 20     C      k  381.92177  -0.8117353
## 21     D      j -311.94940   5.6099299
## 22     B      k   21.45651  16.0570757
## 23     C      j   54.32961  18.2677082
## 24     D      k  467.61962  17.7639728
## 25     B      j -116.47470 -34.3170423
## 26     C      k  481.86675  14.6698900
## 27     D      j  363.54451  24.2602256
## 28     B      k  357.51775 -23.0120370
## 29     C      j  310.04021  65.9906021
## 30     D      k  154.24118 -12.5147146
DataFrame4
##    group factor   response1  response2
## 1      A      i   57.641339 -0.3450512
## 2      B      j   -2.538000  1.2651971
## 3      C      i  119.621620  0.8581453
## 4      A      j -134.422904 -0.8853909
## 5      B      i  124.782930 -1.0440620
## 6      C      j  -36.841204  1.5611102
## 7      A      i  -90.595059  2.0895667
## 8      B      j   16.356268  1.2617395
## 9      C      i  112.497260  1.5636888
## 10     A      j   67.314202  0.4145385
## 11     B      i  194.718210 -1.6507062
## 12     C      j   21.193337  5.1302713
## 13     A      i  -64.603732  1.4843461
## 14     B      j  175.821366  0.3018055
## 15     C      i   82.172057 -0.2616247
## 16     A      j -156.308050  1.5678134
## 17     B      i   67.590953  1.2431540
## 18     C      j   57.276012  2.1326882
## 19     A      i  -44.273166  2.1380658
## 20     B      j  122.768707  0.8188265
## 21     C      i -154.779762  1.4609930
## 22     A      j  -21.417395  2.5057076
## 23     B      i   -8.268157  2.7267708
## 24     C      j  157.047849  2.6763973
## 25     A      i  -76.589878 -2.5317042
## 26     B      j  162.746698  2.3669890
## 27     C      i  115.417806  3.3260226
## 28     A      j  113.007101 -1.4012037
## 29     B      i   94.016086  7.4990602
## 30     C      j   31.696470 -0.3514715
DataFrame3
##    group factor   response1   response2
## 1      C      h  0.47641339  -9.0878836
## 2      D      i -0.12538000   2.9889779
## 3      E      h  1.09621620  -0.0639102
## 4      C      i -1.44422904 -13.1404321
## 5      D      h  1.14782930 -14.3304650
## 6      E      i -0.46841204   5.2083269
## 7      C      h -1.00595059   9.1717505
## 8      D      i  0.06356268   2.9630463
## 9      E      h  1.02497260   5.2276659
## 10     C      i  0.57314202  -3.3909613
## 11     D      h  1.84718210 -18.8802968
## 12     E      i  0.11193337  31.9770348
## 13     C      h -0.74603732   4.6325956
## 14     D      i  1.65821366  -4.2364584
## 15     E      h  0.72172057  -8.4621856
## 16     C      i -1.66308050   5.2586007
## 17     D      h  0.57590953   2.8236548
## 18     E      i  0.47276012   9.4951616
## 19     C      h -0.54273166   9.5354935
## 20     D      i  1.12768707  -0.3588014
## 21     E      h -1.64779762   4.4574474
## 22     C      i -0.31417395  12.2928068
## 23     D      h -0.18268157  13.9507812
## 24     E      i  1.47047849  13.5729796
## 25     C      h -0.86589878 -25.4877818
## 26     D      i  1.52746698  11.2524175
## 27     E      h  1.05417806  18.4451692
## 28     C      i  1.03007101 -17.0090277
## 29     D      h  0.84016086  49.7429516
## 30     E      i  0.21696470  -9.1360360
DataFrame2
##    group factor   response1    response2
## 1      B      h  19.5282679  -0.08788364
## 2      C      i   7.4924000  11.98897791
## 3      D      h  31.9243240   8.93608980
## 4      B      i -18.8845807  -4.14043206
## 5      C      h  32.9565859  -5.33046500
## 6      D      i   0.6317591  14.20832686
## 7      B      h -10.1190119  18.17175050
## 8      C      i  11.2712536  11.96304628
## 9      D      h  30.4994520  14.22766589
## 10     B      i  21.4628403   5.60903873
## 11     C      h  46.9436420  -9.88029682
## 12     D      i  12.2386674  40.97703478
## 13     B      h  -4.9207464  13.63259561
## 14     C      i  43.1642732   4.76354156
## 15     D      h  24.4344114   0.53781440
## 16     B      i -23.2616099  14.25860074
## 17     C      h  21.5181907  11.82365480
## 18     D      i  19.4552023  18.49516164
## 19     B      h  -0.8546331  18.53549346
## 20     C      i  32.5537414   8.64119856
## 21     D      h -22.9559523  13.45744741
## 22     B      i   3.7165210  21.29280677
## 23     C      h   6.3463686  22.95078117
## 24     D      i  39.4095699  22.57297957
## 25     B      h  -7.3179757 -16.48778176
## 26     C      i  40.5493397  20.25241750
## 27     D      h  31.0835612  27.44516919
## 28     B      i  30.6014202  -8.00902772
## 29     C      h  26.8032171  58.74295156
## 30     D      i  14.3392941  -0.13603598
DataFrame1
##    group factor   response1    response2
## 1      A      g  19.5282679   -0.8788364
## 2      B      h   7.4924000  119.8897791
## 3      C      g  31.9243240   89.3608980
## 4      A      h -18.8845807  -41.4043206
## 5      B      g  32.9565859  -53.3046500
## 6      C      h   0.6317591  142.0832686
## 7      A      g -10.1190119  181.7175050
## 8      B      h  11.2712536  119.6304628
## 9      C      g  30.4994520  142.2766589
## 10     A      h  21.4628403   56.0903873
## 11     B      g  46.9436420  -98.8029682
## 12     C      h  12.2386674  409.7703478
## 13     A      g  -4.9207464  136.3259561
## 14     B      h  43.1642732   47.6354156
## 15     C      g  24.4344114    5.3781440
## 16     A      h -23.2616099  142.5860074
## 17     B      g  21.5181907  118.2365480
## 18     C      h  19.4552023  184.9516164
## 19     A      g  -0.8546331  185.3549346
## 20     B      h  32.5537414   86.4119856
## 21     C      g -22.9559523  134.5744741
## 22     A      h   3.7165210  212.9280677
## 23     B      g   6.3463686  229.5078117
## 24     C      h  39.4095699  225.7297957
## 25     A      g  -7.3179757 -164.8778176
## 26     B      h  40.5493397  202.5241750
## 27     C      g  31.0835612  274.4516919
## 28     A      h  30.6014202  -80.0902772
## 29     B      g  26.8032171  587.4295156
## 30     C      h  14.3392941   -1.3603598
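The question asks for the `replicate` function, which can build the same kind of list in one call. The following is a minimal sketch rather than the original answer: `make_dataset` and `sim_list` are hypothetical names, the column layout mirrors DataFrame1 above, and the means and standard deviations are reused from DataFrame1 for illustration.

```r
# Sketch: build a list of six simulated datasets with replicate().
set.seed(16)  # set once, so each dataset gets fresh random draws

# hypothetical helper: one dataset with n rows, 3 groups, a 2-level factor
# and two quantitative responses (same layout as DataFrame1)
make_dataset <- function(n = 30) {
  data.frame(
    group     = rep(LETTERS[1:3], length.out = n),
    factor    = rep(letters[7:8], length.out = n),
    response1 = rnorm(n, mean = 10, sd = 20),
    response2 = rnorm(n, mean = 100, sd = 150)
  )
}

# simplify = FALSE keeps the result as a list of data frames
sim_list <- replicate(6, make_dataset(), simplify = FALSE)

length(sim_list)     # number of datasets in the list
str(sim_list[[1]])   # structure of the first dataset
```

Because the seed is set only once before `replicate`, the six datasets contain different draws while the whole list remains reproducible.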