ANLY 505 - Problem Set #1

Directions

The objective of this assignment is to introduce you to R and R Markdown and to have you complete some basic data simulation exercises.

Please include all code needed to perform the tasks. This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

To submit this homework, create the document in RStudio using the knitr package (the Knit button is included in RStudio), then publish the document to your RPubs account. Once it is uploaded, submit the link to that document on Moodle. Please make sure that the link is hyperlinked and that I can see both the visualization and the code required to create it.

Questions

Simulate data for 30 draws from a normal distribution where the means and standard deviations vary among three distributions.

set.seed(1)
# mean and sd are recycled across the 30 draws, cycling through the three distributions
rnorm(30, mean = c(1, 5, 15), sd = c(1, 10, 50))

##  [1]   0.3735462   6.8364332 -26.7814306   2.5952808   8.2950777
##  [6] -26.0234192   1.4874291  12.3832471  43.7890676   0.6946116
## [11]  20.1178117  34.4921618   0.3787594 -17.1469989  71.2465459
## [16]   0.9550664   4.8380974  62.1918105   1.8212212  10.9390132
## [21]  60.9488686   1.7821363   5.7456498 -84.4675848   1.6198257
## [26]   4.4387126   7.2102247  -0.4707524   0.2184994  35.8970780
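
The call above relies on R recycling the `mean` and `sd` vectors: draw 1 uses the first pair, draw 2 the second, draw 3 the third, and then the cycle repeats. A minimal sketch that labels each draw by its source distribution follows; the names `means`, `sds`, `draws`, and `dist` are illustrative and not part of the original answer.

```r
set.seed(1)
means <- c(1, 5, 15)
sds   <- c(1, 10, 50)

# rnorm() recycles mean and sd, so draw i uses element ((i - 1) %% 3) + 1
draws <- rnorm(30, mean = means, sd = sds)

# Label each draw with the index of the distribution it came from
dist <- rep(1:3, length.out = 30)

# Ten draws per distribution
table(dist)

# Per-distribution sample means and SDs; with only ten draws each,
# these are noisy estimates of the target values
tapply(draws, dist, mean)
tapply(draws, dist, sd)
```

With only ten draws per distribution, the sample statistics can sit far from the targets, especially for the distribution with a standard deviation of 50.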

Simulate 2 continuous variables from normal distributions (n = 20 each) and plot the relationship between them.

set.seed(2)
x <- rnorm(20, mean = 10, sd = 1)
y <- rnorm(20, mean = 5, sd = 2)
plot(x, y)
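
Because `x` and `y` are drawn independently, the scatterplot should show no systematic pattern. One way to check this numerically, sketched here as an optional addition to the original answer, is the sample correlation. The variable `r` and the axis labels are illustrative.

```r
set.seed(2)
x <- rnorm(20, mean = 10, sd = 1)
y <- rnorm(20, mean = 5, sd = 2)

# Sample Pearson correlation; x and y are independent, so the true value is 0,
# but a sample of 20 can still show a nonzero estimate by chance
r <- cor(x, y)
r

# Same scatterplot with descriptive axis labels
plot(x, y, xlab = "x ~ N(10, 1)", ylab = "y ~ N(5, 2)")
```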

Simulate 3 variables (x1, x2 and y). x1 and x2 should be drawn from a uniform distribution and y should be drawn from a normal distribution. Fit a multiple linear regression.

set.seed(3)
x1 <- runif(20, 0, 100)
x2 <- runif(20, 50, 100)
y <- rnorm(20, 0, 10)
# Regress y on both predictors and print the fitted coefficients
z <- lm(y ~ x1 + x2)
z

## 
## Call:
## lm(formula = y ~ x1 + x2)
## 
## Coefficients:
## (Intercept)           x1           x2  
##     -2.2102      -0.1276       0.1325
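
In this simulation `y` is generated independently of `x1` and `x2`, so the slopes are estimates of a true value of zero. A short sketch of how the fit could be inspected further, using standard functions from base R's stats package, is shown below. The object name `fit` is illustrative (the original answer calls it `z`).

```r
set.seed(3)
x1 <- runif(20, 0, 100)
x2 <- runif(20, 50, 100)
y  <- rnorm(20, 0, 10)
fit <- lm(y ~ x1 + x2)

# Coefficient table: estimates, standard errors, t values, and p-values
coef(summary(fit))

# 95% confidence intervals for the intercept and both slopes
confint(fit)
```

If the intervals for `x1` and `x2` cover zero, that is consistent with how the data were simulated.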

Generate a sequence of 3 letters in which each letter is repeated twice and the whole sequence is repeated 2 times.

# rep() is deterministic, so no seed is needed here
rep(letters[4:6], each = 2, times = 2)

##  [1] "d" "d" "e" "e" "f" "f" "d" "d" "e" "e" "f" "f"
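
The `each` and `times` arguments of `rep()` do different things, and when both are supplied `each` is applied first. A small sketch that separates the two steps (the names `x` and `both` are illustrative):

```r
x <- letters[4:6]

# each = 2 repeats every element in place
rep(x, each = 2)     # "d" "d" "e" "e" "f" "f"

# times = 2 repeats the whole vector
rep(x, times = 2)    # "d" "e" "f" "d" "e" "f"

# Combined: each is applied first, then the result is repeated times
both <- rep(x, each = 2, times = 2)
length(both)         # 12
```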

Create a data frame with 3 groups, 2 factors, and two quantitative response variables. Use the replicate function (n = 25).

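set.seed(5)
z <- data.frame(
  group = rep(LETTERS[1:3], length.out = 25),   # 3 groups, cycled A, B, C
  factor = rep(letters[1:2], length.out = 25),  # 2 levels, cycled a, b
  x = rnorm(25, 5, 1),                          # first response
  y = rnorm(25, 0, 3)                           # second response
)
z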

##    group factor        x          y
## 1      A      a 4.159145 -0.8804455
## 2      B      b 6.384359  4.2557672
## 3      C      a 3.744508  4.4963215
## 4      A      b 5.070143 -1.9712463
## 5      B      a 6.711441 -2.5583863
## 6      C      b 4.397092  0.9477451
## 7      A      a 4.527834  3.3290825
## 8      B      b 4.364629  6.6463817
## 9      C      a 4.714226  3.6513109
## 10     A      b 5.138108  4.4376654
## 11     B      a 6.227630  2.8547215
## 12     C      b 4.198221 -3.0285979
## 13     A      a 3.919607 -6.0014182
## 14     B      b 4.842466 -5.2865576
## 15     C      a 3.928240 -0.4278244
## 16     A      b 4.861014  4.6501811
## 17     B      a 4.402687 -2.4072695
## 18     C      b 2.816033 -0.2237368
## 19     A      a 5.240817  5.6870039
## 20     B      b 4.740645 -1.3697068
## 21     C      a 5.900512  1.6866701
## 22     A      b 5.941869 -2.6610255
## 23     B      a 6.467962 -1.3807337
## 24     C      b 5.706761 -2.1729855
## 25     A      a 5.819009 -0.2076335
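
The question asks for `replicate()`, while the answer above builds the columns with `rep()` and two separate `rnorm()` calls. One possible variant that does use `replicate()` for the two response variables is sketched below. It is an assumption about what the question intends, the names `n`, `resp`, `resp1`, `resp2`, and `dat` are illustrative, and both responses share one distribution here, unlike the original answer.

```r
set.seed(5)
n <- 25

# replicate() evaluates rnorm(n, ...) twice and binds the results as matrix columns
resp <- replicate(2, rnorm(n, mean = 5, sd = 1))
colnames(resp) <- c("resp1", "resp2")

# The matrix columns become data frame columns named resp1 and resp2
dat <- data.frame(
  group  = rep(LETTERS[1:3], length.out = n),
  factor = rep(letters[1:2], length.out = n),
  resp
)

str(dat)
```

Because the random draws are made in a different way, the values will not match the table printed above.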

Dan Liu

June 4, 2019
