install.packages("contrib.url")
The objective of this problem set is to orient you to a number of activities in R and to conduct a thoughtful exercise in appreciating the importance of data visualization. For each question, create a code chunk or text response that completes or answers the activity or question requested. Finally, upon completion, post your assignment on RPubs and upload a link to it to the “Problem Set 2” assignment on Moodle.
Load the anscombe data that is part of the library(datasets) in R, and assign that data to a new object called data.
library(datasets)
data<-anscombe
data
##    x1 x2 x3 x4    y1   y2    y3    y4
## 1  10 10 10  8  8.04 9.14  7.46  6.58
## 2   8  8  8  8  6.95 8.14  6.77  5.76
## 3  13 13 13  8  7.58 8.74 12.74  7.71
## 4   9  9  9  8  8.81 8.77  7.11  8.84
## 5  11 11 11  8  8.33 9.26  7.81  8.47
## 6  14 14 14  8  9.96 8.10  8.84  7.04
## 7   6  6  6  8  7.24 6.13  6.08  5.25
## 8   4  4  4 19  4.26 3.10  5.39 12.50
## 9  12 12 12  8 10.84 9.13  8.15  5.56
## 10  7  7  7  8  4.82 7.26  6.42  7.91
## 11  5  5  5  8  5.68 4.74  5.73  6.89
(Hint: use the fBasics() package!)
install.packages("fBasics")
library("fBasics")
library("timeDate")
library("timeSeries")
mean(data$x1)
## [1] 9
var(data$x1)
## [1] 11
mean(data$x2)
## [1] 9
var(data$x2)
## [1] 11
mean(data$x3)
## [1] 9
var(data$x3)
## [1] 11
mean(data$x4)
## [1] 9
var(data$x4)
## [1] 11
mean(data$y1)
## [1] 7.500909
var(data$y1)
## [1] 4.127269
mean(data$y2)
## [1] 7.500909
var(data$y2)
## [1] 4.127629
mean(data$y3)
## [1] 7.5
var(data$y3)
## [1] 4.12262
mean(data$y4)
## [1] 7.500909
var(data$y4)
## [1] 4.123249
correlationTest(data$x1,data$y1)
correlationTest(data$x2,data$y2)
correlationTest(data$x3,data$y3)
correlationTest(data$x4,data$y4)
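The per-column calls above can be condensed. The following is a minimal sketch using only base R: `sapply()` for the means and variances, and `cor()` from the stats package swapped in for `correlationTest()` (which additionally reports test statistics). The object names `col_means`, `col_vars`, and `pair_cor` are illustrative and not part of the assignment.

```r
library(datasets)
data <- anscombe

# Mean and variance of every column, one call each
col_means <- sapply(data, mean)
col_vars  <- sapply(data, var)

# Pearson correlation for each matching pair (x1,y1), ..., (x4,y4)
pair_cor <- sapply(1:4, function(i) {
  cor(data[[paste0("x", i)]], data[[paste0("y", i)]])
})
names(pair_cor) <- paste0("x", 1:4, "_y", 1:4)

print(round(col_means, 3))
print(round(col_vars, 3))
print(round(pair_cor, 3))
```

All four correlations come out at roughly 0.816, which matches the near-identical means and variances shown above.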
plot(data$x1, data$y1, main="Scatterplot between x1,y1")
plot(data$x2, data$y2, main="Scatterplot between x2,y2")
plot(data$x3, data$y3, main="Scatterplot between x3,y3")
plot(data$x4, data$y4, main="Scatterplot between x4,y4")
par(mfrow=c(2,2))
plot(data$x1,data$y1, main="Scatterplot between x1,y1",pch=20)
plot(data$x2,data$y2, main="Scatterplot between x2,y2",pch=20)
plot(data$x3,data$y3, main="Scatterplot between x3,y3",pch=20)
plot(data$x4,data$y4, main="Scatterplot between x4,y4",pch=20)
Fit linear models using the lm() function.
fit1<-lm(data$y1~data$x1)
fit2<-lm(data$y2~data$x2)
fit3<-lm(data$y3~data$x3)
fit4<-lm(data$y4~data$x4)
par(mfrow=c(2,2))
plot(data$x1,data$y1, main="Scatterplot between x1,y1",pch=20)
abline(fit1, col="blue")
plot(data$x2,data$y2, main="Scatterplot between x2,y2",pch=20)
abline(fit2, col="blue")
plot(data$x3,data$y3, main="Scatterplot between x3,y3",pch=20)
abline(fit3, col="blue")
plot(data$x4,data$y4, main="Scatterplot between x4,y4",pch=20)
abline(fit4, col="blue")
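The four fit-and-plot steps above repeat the same pattern, so they can also be written as a loop. This is a sketch under a few assumptions: the formulas are built with `reformulate()` and passed to `lm()` with a `data` argument (rather than the `data$y1~data$x1` form used above), and the names `fits` and `old_par` are hypothetical.

```r
library(datasets)
data <- anscombe

# Fit y_i ~ x_i for i = 1..4, building each formula from the column names
fits <- lapply(1:4, function(i) {
  lm(reformulate(paste0("x", i), response = paste0("y", i)), data = data)
})

# Draw the four scatterplots with their fitted lines in a 2 x 2 grid
old_par <- par(mfrow = c(2, 2))
for (i in 1:4) {
  x <- data[[paste0("x", i)]]
  y <- data[[paste0("y", i)]]
  plot(x, y, main = paste0("Scatterplot between x", i, ",y", i), pch = 20)
  abline(fits[[i]], col = "blue")
}
par(old_par)  # restore the previous plotting layout
```

Using the `data` argument keeps the coefficient names short (`x1` instead of `data$x1`), which makes the model summaries easier to read.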
anova(fit1)
Analysis of Variance Table

Response: data$y1
          Df Sum Sq Mean Sq F value  Pr(>F)
data$x1    1 27.510 27.5100   17.99 0.00217 **
Residuals  9 13.763  1.5292
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
anova(fit2)
Analysis of Variance Table

Response: data$y2
          Df Sum Sq Mean Sq F value   Pr(>F)
data$x2    1 27.500 27.5000  17.966 0.002179 **
Residuals  9 13.776  1.5307
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
anova(fit3)
Analysis of Variance Table

Response: data$y3
          Df Sum Sq Mean Sq F value   Pr(>F)
data$x3    1 27.470 27.4700  17.972 0.002176 **
Residuals  9 13.756  1.5285
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
anova(fit4)
Analysis of Variance Table

Response: data$y4
          Df Sum Sq Mean Sq F value   Pr(>F)
data$x4    1 27.490 27.4900  18.003 0.002165 **
Residuals  9 13.742  1.5269
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
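The four ANOVA tables above are nearly identical, and the same holds for the fitted coefficients. As a minimal sketch (the names `fits`, `coef_table`, and `r_squared` are illustrative, and the models are refit here with the `data` argument so the block runs on its own), the intercepts, slopes, and R-squared values can be put side by side:

```r
library(datasets)

# Refit the four simple regressions on the built-in anscombe data
fits <- list(
  fit1 = lm(y1 ~ x1, data = anscombe),
  fit2 = lm(y2 ~ x2, data = anscombe),
  fit3 = lm(y3 ~ x3, data = anscombe),
  fit4 = lm(y4 ~ x4, data = anscombe)
)

# One row per model: intercept and slope
coef_table <- t(sapply(fits, coef))
colnames(coef_table) <- c("intercept", "slope")

# Share of variance explained by each model
r_squared <- sapply(fits, function(f) summary(f)$r.squared)

print(round(cbind(coef_table, r_squared), 3))
```

Each model has an intercept close to 3, a slope close to 0.5, and an R-squared close to 0.67, even though the scatterplots show four very different relationships.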
Anscombe’s Quartet shows that data sets whose summary statistics are nearly identical can look completely different when graphed. Plotting the four data sets makes the importance of data visualization clear: the graphs reveal differences that the means, variances, correlations, and regression fits do not.