The objective of this problem set is to orient you to a number of activities in R and to conduct a thoughtful exercise in appreciating the importance of data visualization. For each question, create a code chunk or text response that completes the activity or answers the question. Finally, upon completion, name your final output .html file YourName_ANLY512-Section-Year-Semester.html and upload it to the “Problem Set 2” assignment on Moodle.
Load the anscombe data that is part of library(datasets) in R and assign it to a new object called data.
data <- anscombe
fBasics() package!)
summary(data)
##        x1             x2             x3             x4
##  Min.   : 4.0   Min.   : 4.0   Min.   : 4.0   Min.   : 8
##  1st Qu.: 6.5   1st Qu.: 6.5   1st Qu.: 6.5   1st Qu.: 8
##  Median : 9.0   Median : 9.0   Median : 9.0   Median : 8
##  Mean   : 9.0   Mean   : 9.0   Mean   : 9.0   Mean   : 9
##  3rd Qu.:11.5   3rd Qu.:11.5   3rd Qu.:11.5   3rd Qu.: 8
##  Max.   :14.0   Max.   :14.0   Max.   :14.0   Max.   :19
##        y1               y2              y3              y4
##  Min.   : 4.260   Min.   :3.100   Min.   : 5.39   Min.   : 5.250
##  1st Qu.: 6.315   1st Qu.:6.695   1st Qu.: 6.25   1st Qu.: 6.170
##  Median : 7.580   Median :8.140   Median : 7.11   Median : 7.040
##  Mean   : 7.501   Mean   :7.501   Mean   : 7.50   Mean   : 7.501
##  3rd Qu.: 8.570   3rd Qu.:8.950   3rd Qu.: 7.98   3rd Qu.: 8.190
##  Max.   :10.840   Max.   :9.260   Max.   :12.74   Max.   :12.500
var(data$x1)
## [1] 11
var(data$x2)
## [1] 11
var(data$x3)
## [1] 11
var(data$x4)
## [1] 11
var(data$y1)
## [1] 4.127269
var(data$y2)
## [1] 4.127629
var(data$y3)
## [1] 4.12262
var(data$y4)
## [1] 4.123249
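The eight separate var() calls above can also be written as one call over all columns. This is a minimal sketch, assuming the same anscombe data; the data object is recreated so the chunk runs on its own, and the name col_vars is illustrative rather than part of the original answer.

```r
# Recreate the data object so this chunk is self-contained.
data <- datasets::anscombe

# Apply var() to every column; sapply() returns a named numeric vector
# with one variance per column (x1..x4, y1..y4).
col_vars <- sapply(data, var)
print(round(col_vars, 6))
```

The x variances should all agree exactly, while the y variances agree only to about two decimal places, which matches the individual results listed above.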
cor(data)
##            x1         x2         x3         x4         y1         y2
## x1  1.0000000  1.0000000  1.0000000 -0.5000000  0.8164205  0.8162365
## x2  1.0000000  1.0000000  1.0000000 -0.5000000  0.8164205  0.8162365
## x3  1.0000000  1.0000000  1.0000000 -0.5000000  0.8164205  0.8162365
## x4 -0.5000000 -0.5000000 -0.5000000  1.0000000 -0.5290927 -0.7184365
## y1  0.8164205  0.8164205  0.8164205 -0.5290927  1.0000000  0.7500054
## y2  0.8162365  0.8162365  0.8162365 -0.7184365  0.7500054  1.0000000
## y3  0.8162867  0.8162867  0.8162867 -0.3446610  0.4687167  0.5879193
## y4 -0.3140467 -0.3140467 -0.3140467  0.8165214 -0.4891162 -0.4780949
##            y3         y4
## x1  0.8162867 -0.3140467
## x2  0.8162867 -0.3140467
## x3  0.8162867 -0.3140467
## x4 -0.3446610  0.8165214
## y1  0.4687167 -0.4891162
## y2  0.5879193 -0.4780949
## y3  1.0000000 -0.1554718
## y4 -0.1554718  1.0000000
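The full matrix contains many cross-set correlations (for example x1 with y4) that are not of interest here. The sketch below pulls out only the four matched pairs (x1 with y1, x2 with y2, and so on). It assumes the same anscombe data; pair_cor is a hypothetical name introduced for illustration.

```r
# Recreate the data object so this chunk is self-contained.
data <- datasets::anscombe

# Correlation of each x_i with its matching y_i only.
pair_cor <- sapply(1:4, function(i) {
  cor(data[[paste0("x", i)]], data[[paste0("y", i)]])
})
names(pair_cor) <- paste0("x", 1:4, ",y", 1:4)
print(round(pair_cor, 7))
```

All four values should be close to 0.816, which is the point of the exercise: the summary statistics barely distinguish the four data sets.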
plot(data$x1,data$y1,main="Scatter Plots for each X and Y",xlab="x1",ylab="y1",pch=0)
plot(data$x2,data$y2,main="Scatter Plots for each X and Y",xlab="x2",ylab="y2",pch=0)
plot(data$x3,data$y3,main="Scatter Plots for each X and Y",xlab="x3",ylab="y3",pch=0)
plot(data$x4,data$y4,main="Scatter Plots for each X and Y",xlab="x4",ylab="y4",pch=0)
par(mfrow=c(2,2))
plot(data$x1,data$y1, main="Scatterplot between x1,y1",pch=19)
plot(data$x2,data$y2, main="Scatterplot between x2,y2",pch=19)
plot(data$x3,data$y3, main="Scatterplot between x3,y3",pch=19)
plot(data$x4,data$y4, main="Scatterplot between x4,y4",pch=19)
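The four panel calls above differ only in the column index, so they can be generated in a loop. This is a sketch under the assumption that the same titles and plotting symbol are wanted; old_par is an illustrative name used to restore the graphics settings afterwards.

```r
# Recreate the data object so this chunk is self-contained.
data <- datasets::anscombe

# Switch to a 2 x 2 panel layout and remember the previous setting.
old_par <- par(mfrow = c(2, 2))

for (i in 1:4) {
  x_name <- paste0("x", i)
  y_name <- paste0("y", i)
  plot(data[[x_name]], data[[y_name]],
       main = paste0("Scatterplot between ", x_name, ",", y_name),
       xlab = x_name, ylab = y_name, pch = 19)
}

# Restore the previous layout so later plots are not affected.
par(old_par)
```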
lm() function.
model1<-lm(data$y1~data$x1)
model1
##
## Call:
## lm(formula = data$y1 ~ data$x1)
##
## Coefficients:
## (Intercept) data$x1
## 3.0001 0.5001
model2<-lm(data$y2~data$x2)
model2
##
## Call:
## lm(formula = data$y2 ~ data$x2)
##
## Coefficients:
## (Intercept) data$x2
## 3.001 0.500
model3<-lm(data$y3~data$x3)
model3
##
## Call:
## lm(formula = data$y3 ~ data$x3)
##
## Coefficients:
## (Intercept) data$x3
## 3.0025 0.4997
model4<-lm(data$y4~data$x4)
model4
##
## Call:
## lm(formula = data$y4 ~ data$x4)
##
## Coefficients:
## (Intercept) data$x4
## 3.0017 0.4999
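To compare the four fits side by side, the models can be built in a loop and their coefficients collected into one table. This is a minimal sketch rather than the approach used above: it uses the formula-plus-data form of lm(), with reformulate() from the stats package building each formula, and the names models and coef_table are illustrative.

```r
# Recreate the data object so this chunk is self-contained.
data <- datasets::anscombe

# Fit y_i ~ x_i for i = 1..4 using the data argument instead of data$...
models <- lapply(1:4, function(i) {
  lm(reformulate(paste0("x", i), response = paste0("y", i)), data = data)
})

# One row per model, one column per coefficient.
coef_table <- t(sapply(models, coef))
rownames(coef_table) <- paste0("model", 1:4)
colnames(coef_table) <- c("intercept", "slope")
print(round(coef_table, 4))
```

Every row should show an intercept near 3 and a slope near 0.5, in line with the individual model printouts above.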
par(mfrow=c(2,2))
plot(model1)
plot(model2)
plot(model3)
plot(model4)
## Warning: not plotting observations with leverage one:
## 8
## Warning: not plotting observations with leverage one:
## 8
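The warning refers to observation 8 in the fourth data set. In anscombe, x4 equals 8 for every row except row 8, where it is 19, so that single point alone determines the slope and has a leverage (hat value) of one. The sketch below checks this with hatvalues() from the stats package; the name lev is illustrative.

```r
# Recreate the data object and the fourth model so this chunk is self-contained.
data <- datasets::anscombe
model4 <- lm(y4 ~ x4, data = data)

# Leverage of each observation: the diagonal of the hat matrix.
lev <- hatvalues(model4)
print(round(lev, 3))

# The x4 values show why: only row 8 differs from the rest.
print(data$x4)
```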
anova(model1)
## Analysis of Variance Table
##
## Response: data$y1
##           Df Sum Sq Mean Sq F value  Pr(>F)
## data$x1    1 27.510 27.5100   17.99 0.00217 **
## Residuals  9 13.763  1.5292
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
anova(model2)
## Analysis of Variance Table
##
## Response: data$y2
##           Df Sum Sq Mean Sq F value   Pr(>F)
## data$x2    1 27.500 27.5000  17.966 0.002179 **
## Residuals  9 13.776  1.5307
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
anova(model3)
## Analysis of Variance Table
##
## Response: data$y3
##           Df Sum Sq Mean Sq F value   Pr(>F)
## data$x3    1 27.470 27.4700  17.972 0.002176 **
## Residuals  9 13.756  1.5285
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
anova(model4)
## Analysis of Variance Table
##
## Response: data$y4
##           Df Sum Sq Mean Sq F value   Pr(>F)
## data$x4    1 27.490 27.4900  18.003 0.002165 **
## Residuals  9 13.742  1.5269
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
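The four ANOVA tables are easier to compare when the F statistic and p-value of each model sit in one small table. This is a sketch assuming the same four regressions, refitted here with the formula-plus-data form of lm() so the chunk runs on its own; f_stats is a hypothetical name.

```r
# Recreate the data object so this chunk is self-contained.
data <- datasets::anscombe

# For each pair (x_i, y_i), fit the regression and pull the first row
# of the ANOVA table: the F statistic and its p-value.
f_stats <- sapply(1:4, function(i) {
  fit <- lm(reformulate(paste0("x", i), response = paste0("y", i)), data = data)
  tab <- anova(fit)
  c(F = tab[["F value"]][1], p = tab[["Pr(>F)"]][1])
})
colnames(f_stats) <- paste0("model", 1:4)
print(signif(f_stats, 4))
```

The F values should all be close to 18 with p-values near 0.002, so the ANOVA tables, like the other summaries, do not separate the four data sets; only the plots do.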