ANLY 505 - Problem Set #1

Questions

Simulate data for 30 draws from a normal distribution where the means and standard deviations vary among three distributions.

# mean and sd are recycled, so the draws cycle through N(1, 4), N(2, 5), N(3, 6)
rnorm(n = 30, mean = c(1, 2, 3), sd = c(4, 5, 6))

##  [1]  -0.3067032   2.8180971   6.9897292   2.3074020  -1.4061108
##  [6]   7.9029913  -6.3842908  -2.1422810   7.3223359  -3.9906920
## [11]   7.9602306  11.5993905   7.9464580  -3.1864483  -0.5861491
## [16]   1.2742506  14.9678922 -11.6288837   4.7243579   0.1304831
## [21]  -2.1442739  -1.3278426  -2.1214800   5.0850962  -0.7540866
## [26]   4.1949175   4.8625771   0.7484552   5.5428752  -1.0099523
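Because `rnorm` recycles the `mean` and `sd` vectors across the 30 draws, draws 1, 4, 7, ... use mean 1 and sd 4, draws 2, 5, 8, ... use mean 2 and sd 5, and so on. The following is a minimal sketch of one way to spell that recycling out and summarize each distribution separately. The seed value and the names `params`, `dist_id`, and `draws` are assumptions for illustration and are not part of the original answer.

```r
set.seed(505)  # arbitrary seed, assumed here only for reproducibility

# One row per distribution: its mean and standard deviation
params <- data.frame(mean = c(1, 2, 3), sd = c(4, 5, 6))

# Label each of the 30 draws with its distribution: 1, 2, 3, 1, 2, 3, ...
dist_id <- rep(1:3, length.out = 30)

# Same call as rnorm(n = 30, mean = c(1, 2, 3), sd = c(4, 5, 6)),
# with the recycling of mean and sd written out explicitly
draws <- rnorm(n = 30, mean = params$mean[dist_id], sd = params$sd[dist_id])

# Sample mean and sd within each distribution (10 draws each)
tapply(draws, dist_id, mean)
tapply(draws, dist_id, sd)
```

With only 10 draws per distribution and standard deviations of 4 to 6, the sample means can sit well away from 1, 2, and 3.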

Simulate 2 continuous variables (normal distribution, n = 20) and plot the relationship between them.

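x <- rnorm(n = 20, mean = 1, sd = 2)
y <- rnorm(n = 20, mean = 1, sd = 2)

# x and y are drawn independently, so the scatterplot should show no clear trend
plot(x, y, main = "Relationship between x and y", xlab = "x", ylab = "y")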
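Since `x` and `y` are simulated independently, the plot is expected to look like a cloud with no trend. The sketch below shows one way to put a number on that with `cor` and to overlay a least-squares line as a visual reference. The seed and the name `r` are assumptions added for illustration.

```r
set.seed(505)  # arbitrary seed, assumed for reproducibility

x <- rnorm(n = 20, mean = 1, sd = 2)
y <- rnorm(n = 20, mean = 1, sd = 2)

# Sample correlation; for independent draws it fluctuates around 0
r <- cor(x, y)
r

plot(x, y, main = "Relationship between x and y", xlab = "x", ylab = "y")
abline(lm(y ~ x), lty = 2)  # dashed least-squares line as a visual reference
```

With n = 20 the sample correlation can still be noticeably different from 0 by chance.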

Simulate 3 variables (x1, x2 and y). x1 and x2 should be drawn from a uniform distribution and y should be drawn from a normal distribution. Fit a multiple linear regression.

x1 <- runif(n = 1000, min = 0, max = 2)
x2 <- runif(n = 1000, min = 2, max = 4)

# y is simulated independently of x1 and x2
y <- rnorm(n = 1000, mean = 3, sd = 1)

reg_mod <- lm(y ~ x1 + x2)
reg_mod

## 
## Call:
## lm(formula = y ~ x1 + x2)
## 
## Coefficients:
## (Intercept)           x1           x2  
##     2.81859      0.05442      0.04671

summary(reg_mod)

## 
## Call:
## lm(formula = y ~ x1 + x2)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -3.10126 -0.63950  0.00546  0.64658  2.85365 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.81859    0.17138  16.446   <2e-16 ***
## x1           0.05442    0.05332   1.021    0.308    
## x2           0.04671    0.05307   0.880    0.379    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9859 on 997 degrees of freedom
## Multiple R-squared:  0.00179,    Adjusted R-squared:  -0.0002122 
## F-statistic: 0.894 on 2 and 997 DF,  p-value: 0.4093
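In the fit above, `y` is generated without reference to `x1` or `x2`, which is consistent with the slopes near zero, the large p-values, and an R-squared close to 0. For contrast, here is a minimal sketch in which the response is built from the predictors so that `lm` has a relationship to recover. The coefficients 3, 2, and -1, the seed, and the names `y_dep` and `dep_mod` are hypothetical choices, not part of the original answer.

```r
set.seed(505)  # arbitrary seed, assumed for reproducibility

x1 <- runif(n = 1000, min = 0, max = 2)
x2 <- runif(n = 1000, min = 2, max = 4)

# Hypothetical data-generating model: y = 3 + 2 * x1 - 1 * x2 + N(0, 1) noise
y_dep <- rnorm(n = 1000, mean = 3 + 2 * x1 - 1 * x2, sd = 1)

dep_mod <- lm(y_dep ~ x1 + x2)
coef(dep_mod)               # estimates should land near 3, 2, and -1
summary(dep_mod)$r.squared  # well above the near-zero value of the independent fit
```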

Simulate 3 letters, repeating each letter twice and the whole sequence 2 times.

rep(letters[1:3], each=2, times=2)

##  [1] "a" "a" "b" "b" "c" "c" "a" "a" "b" "b" "c" "c"
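When both arguments are given, `rep` applies `each` first and `times` second, which is why the output reads a a b b c c and then repeats. A short sketch that splits the call into those two steps; the names `step1` and `step2` are assumptions for illustration.

```r
# Step 1: each = 2 repeats every element in place
step1 <- rep(letters[1:3], each = 2)
step1
# "a" "a" "b" "b" "c" "c"

# Step 2: times = 2 repeats the whole result of step 1
step2 <- rep(step1, times = 2)

# The two-step version matches the single call used above
identical(step2, rep(letters[1:3], each = 2, times = 2))
```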

Create a data frame with 3 groups, 2 factors, and two quantitative response variables. Use the replicate function (n = 25).

# Cycle the labels until each vector has 25 entries
group <- rep(c("yes", "no", "neutral"), length.out = 25)
factor <- as.factor(rep(c("male", "female"), length.out = 25))
response1 <- rnorm(n = 25, mean = 10, sd = 5)    # normal response
response2 <- runif(n = 25, min = 3, max = 300)   # uniform response
data.frame(group, factor, response1, response2)

##      group factor    response1 response2
## 1      yes   male  8.603269227 267.43980
## 2       no female 12.651119509 149.48640
## 3  neutral   male  6.022458818 187.43164
## 4      yes female  5.768209431 285.07984
## 5       no   male  6.776919052  27.18542
## 6  neutral female 17.143810734  15.16214
## 7      yes   male -0.003513084  16.75086
## 8       no female  3.946159404 254.36478
## 9  neutral   male 13.282292344 174.58154
## 10     yes female 12.565481394 112.89924
## 11      no   male  7.981242373 136.73615
## 12 neutral female 14.866150678  68.27719
## 13     yes   male 14.703748983 145.86205
## 14      no female 13.444493147 120.69169
## 15 neutral   male 10.351937444 271.52680
## 16     yes female 19.311115948  99.15760
## 17      no   male  4.524498946 220.78543
## 18 neutral female  3.570664210  75.16221
## 19     yes   male 15.338909863 204.24108
## 20      no female  8.708006591 123.05323
## 21 neutral   male 16.579883972 119.86432
## 22     yes female 13.957163314 241.02280
## 23      no   male  9.848620630 172.20234
## 24 neutral female 17.888703111 254.52428
## 25     yes   male 20.144823589  25.95670
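The question asks for the `replicate` function, while the answer above relies on `rep`. The sketch below is one possible way to bring `replicate` in, by using it to generate both response columns in a single call. It is an illustration under stated assumptions: both responses are drawn from the same normal distribution here (unlike the uniform `response2` above), and the seed and the names `n`, `responses`, `sex`, and `sim_df` are hypothetical.

```r
set.seed(505)  # arbitrary seed, assumed for reproducibility
n <- 25

# replicate() evaluates the expression twice and binds the results as columns,
# giving a 25 x 2 matrix of normal draws
responses <- replicate(2, rnorm(n = n, mean = 10, sd = 5))
colnames(responses) <- c("response1", "response2")

# Two factor columns (3 levels and 2 levels) plus the two response columns
sim_df <- data.frame(
  group = factor(rep(c("yes", "no", "neutral"), length.out = n)),
  sex   = factor(rep(c("male", "female"), length.out = n)),
  responses
)

str(sim_df)
```

Naming the second column `sex` rather than `factor` avoids reusing the name of the base R function `factor`.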

anil jhanwar

April 20, 2019
