Directions

The objective of this assignment is to introduce you to R and R Markdown and to complete some basic data simulation exercises.

Please include all code needed to perform the tasks. This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

To submit this homework, create the document in RStudio, knit it with the knitr package (the Knit button in RStudio), and publish it to your RPubs account. Once it is uploaded, submit the link to that document on Canvas. Please make sure the link is hyperlinked and that I can see both the visualization and the code required to create it.

Questions

  1. Simulate data for 30 draws from a normal distribution where the means and standard deviations vary among three distributions.
set.seed(2020)
# mean and sd are recycled across the 30 draws, giving 10 draws per distribution
rnorm(30, mean = c(0, 10, 20), sd = c(1, 2, 3))
##  [1]  0.3769721 10.6030967 16.7059305 -1.1304059  4.4069314 22.1617205
##  [7]  0.9391210  9.5412445 25.2773940  0.1173668  8.2937544 22.7277775
## [13]  1.1963730  9.2568322 19.6302193  1.8000431 13.4079918 10.8837062
## [19] -2.2889749 10.1166070 26.5230958  1.0981827 10.6364406 19.7805573
## [25]  0.8342687 10.3975013 23.8935242  0.9367183  9.7051336 20.3312960
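
To make the recycling behind this output easier to see, here is a minimal sketch that labels each draw with the distribution it came from and summarizes each group. The names `mu`, `sigma`, `draws`, and `dist_id` are illustrative and not part of the original answer.

```r
set.seed(2020)  # same seed and same rnorm() call as above
mu    <- c(0, 10, 20)
sigma <- c(1, 2, 3)
draws <- rnorm(30, mean = mu, sd = sigma)

# rnorm() recycles mean and sd, so draw i uses parameter set ((i - 1) %% 3) + 1
dist_id <- rep(1:3, length.out = 30)

# Sample mean and standard deviation of the 10 draws from each distribution
group_means <- tapply(draws, dist_id, mean)
group_sds   <- tapply(draws, dist_id, sd)
group_means
group_sds
```

With only 10 draws per distribution, the sample means and standard deviations will sit near, but not exactly on, the values passed to `rnorm()`.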
  1. Simulate 2 continuous variables (normal distribution, n = 20) and plot the relationship between them.
x <- rnorm(20, mean = 10, sd = 1)
y <- rnorm(20, mean = 20, sd = 2)
# x and y are drawn independently of each other
plot(y ~ x)
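
Because `x` and `y` above are simulated independently, the scatterplot should show no systematic trend. If a visible relationship is wanted, one option is to build `y` from `x` plus noise. The sketch below assumes a slope of 2, an intercept of 0, and a noise standard deviation of 1; these values and the seed are arbitrary choices for illustration.

```r
set.seed(1)  # arbitrary seed, used only so the example is reproducible
n <- 20
x <- rnorm(n, mean = 10, sd = 1)

# y depends on x (assumed slope 2, intercept 0) plus normal noise
y <- 2 * x + rnorm(n, mean = 0, sd = 1)

plot(y ~ x)
abline(lm(y ~ x), lty = 2)  # dashed least-squares line for reference
r <- cor(x, y)
r
```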

  1. Simulate 3 variables (x1, x2 and y). x1 and x2 should be drawn from a uniform distribution and y should be drawn from a normal distribution. Fit a multiple linear regression.
x1 <- runif(100, min = 0, max = 10)
x2 <- runif(100, min = 10, max = 100)
# y is drawn independently of x1 and x2, so the true slopes are zero
y <- rnorm(100)
model <- lm(y ~ x1 + x2)
summary(model)
## 
## Call:
## lm(formula = y ~ x1 + x2)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.55503 -0.65567 -0.02635  0.72341  2.86790 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)  
## (Intercept) -0.490241   0.301358  -1.627   0.1070  
## x1          -0.017637   0.033474  -0.527   0.5995  
## x2           0.010318   0.003941   2.618   0.0103 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.031 on 97 degrees of freedom
## Multiple R-squared:  0.0716, Adjusted R-squared:  0.05245 
## F-statistic:  3.74 on 2 and 97 DF,  p-value: 0.02724
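
In the fit above, `y` is generated without reference to `x1` or `x2`, so any significant coefficient (such as `x2` here) arises by chance. As a contrast, the following sketch simulates `y` from an assumed linear model and checks that `lm()` roughly recovers the coefficients. The true values (intercept 1, slopes 0.5 and -0.2) and the seed are made up for this illustration.

```r
set.seed(1)  # arbitrary seed for reproducibility
n  <- 100
x1 <- runif(n, min = 0, max = 10)
x2 <- runif(n, min = 10, max = 100)

# Assumed true model: y = 1 + 0.5 * x1 - 0.2 * x2 + N(0, 1) noise
y <- 1 + 0.5 * x1 - 0.2 * x2 + rnorm(n)

fit <- lm(y ~ x1 + x2)
coef(fit)     # estimates should land near 1, 0.5, and -0.2
confint(fit)  # 95% confidence intervals for each coefficient
```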
  1. Simulate 3 letters, repeating each letter twice and the whole sequence 2 times.
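# draw 3 alphabet positions; replace = TRUE means a letter can be picked more than once
letter_pick <- sample(1:26, 3, replace = TRUE)
letter_pick
## [1] 1 5 3
# each = 2 doubles every letter, times = 2 repeats the whole sequence
rep(LETTERS[letter_pick], each = 2, times = 2)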
##  [1] "A" "A" "E" "E" "C" "C" "A" "A" "E" "E" "C" "C"
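
Sampling with `replace = TRUE` can return the same letter more than once. If three distinct letters are wanted, a possible variant is to sample from `LETTERS` directly without replacement, which is the default for `sample()`. The name `letters3` and the seed are illustrative.

```r
set.seed(1)  # arbitrary seed for reproducibility

# sample() without replacement (the default) returns 3 different letters
letters3 <- sample(LETTERS, 3)

# each = 2 doubles every letter, times = 2 repeats the whole sequence
out <- rep(letters3, each = 2, times = 2)
out
```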
  1. Create a dataframe (n = 27) with 3 groups, 2 factors and two quantitative response variables. Use the replicate function.
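data.frame(
  # rep() with length.out cycles the labels until all 27 rows are filled
  group = rep(c("Group 1", "Group 2", "Group 3"), length.out = 27),
  factor = as.factor(rep(LETTERS[24:25], length.out = 27)),
  a = rnorm(27, 10, 2),             # normal response
  b = runif(27, min = 0, max = 10)  # uniform response
)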
##      group factor         a         b
## 1  Group 1      X  9.619306 6.2680398
## 2  Group 2      Y 10.772977 2.8300062
## 3  Group 3      X 15.045563 5.8312475
## 4  Group 1      Y 11.932518 8.2034376
## 5  Group 2      X 10.284003 8.5307469
## 6  Group 3      Y  8.692623 4.1025107
## 7  Group 1      X 13.688665 7.0722686
## 8  Group 2      Y  8.659964 8.8143742
## 9  Group 3      X 10.660699 2.7943871
## 10 Group 1      Y 11.372948 9.3593118
## 11 Group 2      X 13.377795 1.6994108
## 12 Group 3      Y 10.646845 9.2477975
## 13 Group 1      X  8.434141 4.6915996
## 14 Group 2      Y 11.725131 9.9002539
## 15 Group 3      X  8.653601 2.7236549
## 16 Group 1      Y 11.034717 9.8671852
## 17 Group 2      X 12.944128 1.9130525
## 18 Group 3      Y  9.899534 8.0183751
## 19 Group 1      X 11.688621 4.7926237
## 20 Group 2      Y  5.057191 5.4103349
## 21 Group 3      X 11.265932 7.9785120
## 22 Group 1      Y  8.008532 4.2869039
## 23 Group 2      X 11.361114 7.1188112
## 24 Group 3      Y 12.380601 5.2609393
## 25 Group 1      X 10.845043 0.7032919
## 26 Group 2      Y 11.668283 5.1255029
## 27 Group 3      X 11.567692 4.8654911
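
The question asks for the `replicate()` function, which the answer above does not call. One possible way to use it is sketched below: `replicate()` generates both response columns, and the layout is a fully crossed 3 x 3 x 3 design. This rests on assumptions not in the original: "2 factors" is read as two factor columns (`factor1`, `factor2`) with made-up level names, and both responses are drawn from the same normal distribution rather than one normal and one uniform.

```r
set.seed(1)  # arbitrary seed for reproducibility

# replicate() evaluates the expression twice; each call returns 27 values,
# so the result is simplified to a 27 x 2 matrix (one column per response)
responses <- replicate(2, rnorm(27, mean = 10, sd = 2))
colnames(responses) <- c("a", "b")

dat <- data.frame(
  group   = rep(c("Group 1", "Group 2", "Group 3"), each = 9),
  factor1 = factor(rep(c("X", "Y", "Z"), times = 9)),
  factor2 = factor(rep(c("P", "Q", "R"), each = 3, times = 3)),
  responses  # matrix columns become data frame columns a and b
)
str(dat)
```

Each combination of `group`, `factor1`, and `factor2` appears exactly once, which can be checked with `table(dat$group, dat$factor1, dat$factor2)`.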