ANLY 505 - Problem Set #1

Questions

Simulate data for 30 draws from a normal distribution where the means and standard deviations vary among three distributions.

# 30 draws; rnorm() recycles mean and sd, so the draws cycle through the three (mean, sd) pairs
set.seed(505)
q1 <- rnorm(30, mean=c(0,-1,1), sd=c(1,2,3))
q1

##  [1] -1.1211894 -3.5641141 -5.1180276 -0.9377324 -2.0203213 -0.2385342
##  [7] -0.9103679 -0.7184777  2.6941523 -0.4322744 -5.4137940  0.2630103
## [13] -0.4786858 -1.4017334  2.4403230  0.3570010 -0.1326702  5.0896719
## [19]  0.8242841 -2.6801050 -5.1995761 -0.3895488  1.0792671  2.5229397
## [25]  0.9979214  0.1768755 -0.1880582  0.2071501 -0.7283914  5.4006912
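
Because `rnorm()` recycles the `mean` and `sd` vectors, every third draw comes from the same distribution. The sketch below makes that assignment explicit and summarises each group separately. The names `params`, `idx`, and `q1_check` are illustrative and not part of the original answer; with the same seed, the draws should match `q1`.

```r
# Illustrative parameter table (names are assumptions, not from the original answer)
params <- data.frame(mean = c(0, -1, 1), sd = c(1, 2, 3))

# Index of the distribution used for each of the 30 draws under recycling
idx <- rep(1:3, length.out = 30)

set.seed(505)
q1_check <- rnorm(30, mean = params$mean[idx], sd = params$sd[idx])

# Ten draws per distribution
table(idx)

# Sample mean and standard deviation within each distribution
tapply(q1_check, idx, mean)
tapply(q1_check, idx, sd)
```

With only ten draws per distribution, the group means and standard deviations can sit noticeably away from the nominal values.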

Simulate 2 continuous variables (normal distribution, n = 20) and plot the relationship between them.

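# x ~ N(0, 1) and y ~ N(1, 1), drawn separately with different seeds
set.seed(505); x <- rnorm(20, 0, 1)
set.seed(512); y <- rnorm(20, 1, 1)
par(mfrow=c(1,2))
# Plot A: the draws as originally paired
plot(x, y, pch=20, main="Two variables plot A", sub="x and y are not sorted")
# Plot B: sorting each vector separately breaks the pairing; this is an empirical Q-Q view of x against y
plot(sort(x), sort(y), pch=20, xlab="x", ylab="y", main="Two variables plot B", sub="x and y are sorted")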
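
The two panels answer different questions. Plot A shows the draws as paired, while sorting each vector separately matches the smallest `x` with the smallest `y` and so forces an increasing pattern. A minimal sketch that quantifies the difference with `cor()`, reusing the seeds above; the names `r_paired` and `r_sorted` are illustrative:

```r
set.seed(505); x <- rnorm(20, 0, 1)
set.seed(512); y <- rnorm(20, 1, 1)

# Correlation of the draws as originally paired
r_paired <- cor(x, y)

# Correlation after sorting each vector independently;
# the pairing is lost and the points are forced to increase together
r_sorted <- cor(sort(x), sort(y))

c(paired = r_paired, sorted = r_sorted)
```

The sorted correlation can never be smaller than the paired one, so plot B should not be read as evidence of a relationship between `x` and `y`.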

Simulate 3 variables (x1, x2 and y). x1 and x2 should be drawn from a uniform distribution and y should be drawn from a normal distribution. Fit a multiple linear regression.

# x1 ~ U(-1, 1), x2 ~ U(0, 2), y ~ N(0, 1); y is simulated independently of x1 and x2
set.seed(1); x1 <- runif(50, min=-1, max=1)
set.seed(2); x2 <- runif(50, 0, 2)
set.seed(3); y <- rnorm(50, 0, 1)
fit <- lm(y~x1+x2)
summary(fit)

## 
## Call:
## lm(formula = y ~ x1 + x2)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.14971 -0.71589 -0.05474  0.79216  1.82903 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.06848    0.25121  -0.273    0.786
## x1          -0.11375    0.24043  -0.473    0.638
## x2           0.01171    0.21303   0.055    0.956
## 
## Residual standard error: 0.9064 on 47 degrees of freedom
## Multiple R-squared:  0.004744,   Adjusted R-squared:  -0.03761 
## F-statistic: 0.112 on 2 and 47 DF,  p-value: 0.8943
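
In the simulation above, `y` is generated without reference to `x1` or `x2`, so slope estimates near zero and a large overall p-value are what one would expect. For contrast, here is a sketch in which the response is built from the predictors. The intercept 1, the slopes 2 and -1.5, the seed, and the names `sim`, `fit_dep` are all illustrative assumptions, not part of the original answer.

```r
set.seed(505)
n <- 50

# Predictors drawn from the same uniform ranges as in the answer above
sim <- data.frame(
  x1 = runif(n, min = -1, max = 1),
  x2 = runif(n, min = 0, max = 2)
)

# Hypothetical true model: intercept 1, slopes 2 and -1.5, unit-variance noise
sim$y <- 1 + 2 * sim$x1 - 1.5 * sim$x2 + rnorm(n, mean = 0, sd = 1)

fit_dep <- lm(y ~ x1 + x2, data = sim)

# Estimates should land near 1, 2 and -1.5, up to sampling error
coef(fit_dep)
```

Keeping the variables in a data frame and passing `data = sim` avoids overwriting the `x1`, `x2`, and `y` objects used in the original fit.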

Simulate 3 letters, repeating each letter twice and the whole sequence 2 times.

# each=2 doubles every letter in place; times=2 then repeats the whole sequence
q4 <- rep(LETTERS[1:3], each=2, times=2)
q4

##  [1] "A" "A" "B" "B" "C" "C" "A" "A" "B" "B" "C" "C"
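
The order in which `rep()` applies `each` and `times` is easier to see when the two arguments are used separately. A short sketch; the names `by_each` and `combined` are illustrative:

```r
# each: repeat every element in place
rep(LETTERS[1:3], each = 2)

# times: repeat the whole vector
rep(LETTERS[1:3], times = 2)

# each is applied first, then times repeats the result
by_each  <- rep(LETTERS[1:3], each = 2)
combined <- rep(by_each, times = 2)

# Check that the two-step version matches the single call used above
identical(combined, rep(LETTERS[1:3], each = 2, times = 2))
```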

Create a dataframe with 3 groups, 2 factors and two quantitative response variables. Use the replicate function (n = 25).

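# replicate() with simplify=FALSE returns a list of 3 data frames, each with 25 rows
# No seed is set here, so the values printed below will differ between runs
q5 <- replicate(n=3, expr=data.frame("Group"=rep(letters[1:2], length.out=25), "Response1"=rnorm(25, 0, 1), "Response2"=runif(25, -1, 1)), simplify=FALSE)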
q5

## [[1]]
##    Group  Response1    Response2
## 1      a  0.7268389  0.545293995
## 2      b -0.8094409 -0.749749524
## 3      a  0.2670851  0.776421847
## 4      b -1.7372637 -0.381306476
## 5      a -1.4114251  0.298545919
## 6      b -0.4535512 -0.475211228
## 7      a -1.0354913 -0.676873218
## 8      b  1.3621429 -0.788986073
## 9      a  0.9174567 -0.124639215
## 10     b -0.7851422 -0.922958508
## 11     a  0.5735182  0.917353949
## 12     b  0.9181962  0.015484269
## 13     a  0.2562873 -0.275385440
## 14     b  0.3519666  0.758956562
## 15     a  1.1743374  0.508950183
## 16     b -0.4808464  0.519018507
## 17     a -0.4188297  0.779200941
## 18     b  0.9551128 -0.639299639
## 19     a -1.2890066  0.572977041
## 20     b  0.1861974  0.094749399
## 21     a -0.0313255 -0.005108341
## 22     b  0.4670973 -0.782438051
## 23     a  1.0241977  0.173467279
## 24     b  0.2673585  0.794645122
## 25     a  0.2318261 -0.624632324
## 
## [[2]]
##    Group   Response1   Response2
## 1      a  0.07520150  0.43610804
## 2      b  0.78211420  0.89450250
## 3      a  0.16721830 -0.83713728
## 4      b -0.50320660 -0.52912476
## 5      a -1.29118679 -0.27490533
## 6      b -0.40806768 -0.34951834
## 7      a -1.15556232 -0.16694680
## 8      b -0.45595162  0.01137519
## 9      a  1.04157993  0.23467284
## 10     b  0.14629904  0.17082877
## 11     a -0.27739764  0.79846135
## 12     b  1.31969417  0.14980545
## 13     a -0.58542196 -0.37347620
## 14     b  1.08252391 -0.03999595
## 15     a -0.01774240  0.19124274
## 16     b -0.28648589 -0.86519504
## 17     a  0.47925590  0.46974672
## 18     b -1.84112900  0.28700579
## 19     a -0.05863315 -0.86529030
## 20     b -0.81669816  0.39494438
## 21     a  1.93104304 -0.30147908
## 22     b -1.17659954 -0.37185220
## 23     a  0.62585862  0.96463378
## 24     b  0.87962268  0.50023250
## 25     a  0.24321807  0.40971438
## 
## [[3]]
##    Group  Response1   Response2
## 1      a -0.7624486 -0.80341742
## 2      b  0.3860738 -0.36106340
## 3      a -0.6640033  0.99158739
## 4      b -1.7243442  0.45257728
## 5      a  1.1563191  0.37379289
## 6      b  0.6935066  0.25851857
## 7      a  0.1431564  0.60683534
## 8      b  1.4928136  0.08638962
## 9      a -1.6321535  0.72360027
## 10     b  0.1278460  0.96839008
## 11     a -2.4036637  0.17880944
## 12     b  1.4439283 -0.88847207
## 13     a -0.8788931  0.05437264
## 14     b -1.3064383  0.23830242
## 15     a -0.8771990 -0.67528531
## 16     b -1.1643805  0.88091454
## 17     a -1.9823477 -0.81009305
## 18     b -0.9899442  0.62264850
## 19     a -0.1516846  0.98626251
## 20     b  0.9125068  0.28865800
## 21     a  0.4076698 -0.49422327
## 22     b -1.2421844 -0.58466051
## 23     a -0.6426944  0.63868522
## 24     b  1.9302437 -0.45655068
## 25     a  0.4101994  0.66528698
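
The result above is a list of three data frames rather than a single data frame. If one data frame is wanted, a possible sketch is to stack the list and record which replicate each row came from. The column name `Replicate` and the object names `sim_list` and `sim_df` are assumptions for illustration, and the grouping columns are converted to factors as one reading of the question.

```r
# Same simulation as above, stored under an illustrative name
sim_list <- replicate(
  n = 3,
  expr = data.frame(
    Group = rep(letters[1:2], length.out = 25),
    Response1 = rnorm(25, 0, 1),
    Response2 = runif(25, -1, 1)
  ),
  simplify = FALSE
)

# Stack the three data frames into one with 75 rows
sim_df <- do.call(rbind, sim_list)

# Label each block of 25 rows with its replicate and store both groupings as factors
sim_df$Replicate <- factor(rep(1:3, each = 25))
sim_df$Group <- factor(sim_df$Group)

str(sim_df)
```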

Zhengxiao Wei

2019-04-20
