The objectives of this problem set are to orient you to a number of activities in R and to conduct a thoughtful exercise in appreciating the importance of data visualization. For each question, create a code chunk or text response that completes the activity or answers the question. Finally, upon completion, name your final output .html file YourName_ANLY512-Section-Year-Semester.html, publish it to your RPubs account, and submit the link to the “Problem Set 2” assignment on Moodle. Points will be deducted for uploading in an improper format.
Load the anscombe data that is part of library(datasets) in R, and assign that data to a new object called data.
library(datasets)
data <- anscombe
fBasics() package!)
Mean <- apply(data, 2, mean)
Mean
## x1 x2 x3 x4 y1 y2 y3 y4
## 9.000000 9.000000 9.000000 9.000000 7.500909 7.500909 7.500000 7.500909
Var <- apply(data, 2, var)
Var
## x1 x2 x3 x4 y1 y2 y3
## 11.000000 11.000000 11.000000 11.000000 4.127269 4.127629 4.122620
## y4
## 4.123249
# Correlation matrix of the x columns against the y columns
Cor <- cor(data[, 1:4], data[, 5:8])
# Keep only the diagonal: the matched pairs (x1, y1), ..., (x4, y4)
Cor <- diag(Cor)
Cor
## [1] 0.8164205 0.8162365 0.8162867 0.8165214
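The three summaries above can also be collected into a single table, which makes the near-identical values easier to compare side by side. This is only a minimal sketch using base R; the names `pair_summary`, `mean_x`, `var_x`, and `cor_xy` are illustrative and not part of the assignment.

```r
library(datasets)

# Assumes the quartet is stored in `data`, as in the first question.
data <- anscombe

# One row per x-y pair: means, variances, and the within-pair correlation.
pair_summary <- do.call(rbind, lapply(1:4, function(i) {
  x <- data[[paste0("x", i)]]
  y <- data[[paste0("y", i)]]
  data.frame(pair = i,
             mean_x = mean(x), mean_y = mean(y),
             var_x = var(x), var_y = var(y),
             cor_xy = cor(x, y))
}))

# Round for display only; the stored values keep full precision.
print(round(pair_summary, 3))
```

Each row should closely match the corresponding entries of Mean, Var, and Cor printed above.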
plot(data$x1, data$y1)
plot(data$x2, data$y2)
plot(data$x3, data$y3)
plot(data$x4, data$y4)
par(mfrow = c(2, 2))
plot(data$x1, data$y1, pch = 16)
plot(data$x2, data$y2, pch = 16)
plot(data$x3, data$y3, pch = 16)
plot(data$x4, data$y4, pch = 16)
Fit a linear model to each x-y pair using the lm() function.
lm1 <- lm(y1 ~ x1, data = data)
lm2 <- lm(y2 ~ x2, data = data)
lm3 <- lm(y3 ~ x3, data = data)
lm4 <- lm(y4 ~ x4, data = data)
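To compare the four fits numerically, the estimated coefficients can be stacked into one small table. The sketch below refits the same four models in a loop so that it runs on its own; `fits` and `coef_table` are illustrative names, and `reformulate()` is used here only as a convenient way to build each formula from column names.

```r
library(datasets)

# Assumes the quartet is stored in `data`, as in the first question.
data <- anscombe

# Refit y_i ~ x_i for i = 1..4 (the same models as lm1 to lm4).
fits <- lapply(1:4, function(i) {
  lm(reformulate(paste0("x", i), response = paste0("y", i)), data = data)
})

# One row per model, with the intercept and slope as columns.
coef_table <- t(sapply(fits, coef))
colnames(coef_table) <- c("intercept", "slope")
rownames(coef_table) <- paste0("lm", 1:4)

print(round(coef_table, 3))
```

If the quartet behaves as expected, the four rows should be almost the same, which is another summary that fails to distinguish the pairs.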
par(mfrow = c(2, 2))
with(data, plot(x1, y1, pch = 16))
abline(lm1)
with(data, plot(x2, y2, pch = 16))
abline(lm2)
with(data, plot(x3, y3, pch = 16))
abline(lm3)
with(data, plot(x4, y4, pch = 16))
abline(lm4)
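As an optional complement to the fitted-line panels, plotting residuals against fitted values is another way to see how differently the same straight line suits each pair. This goes slightly beyond what the question asks and is only a sketch; `fits` is an illustrative name, and the models are refitted so the chunk runs on its own.

```r
library(datasets)

# Assumes the quartet is stored in `data`, as in the first question.
data <- anscombe

# Refit y_i ~ x_i for i = 1..4 (the same models as lm1 to lm4).
fits <- lapply(1:4, function(i) {
  lm(reformulate(paste0("x", i), response = paste0("y", i)), data = data)
})

# 2 x 2 grid of residual-versus-fitted plots, one panel per pair.
par(mfrow = c(2, 2))
for (i in 1:4) {
  plot(fitted(fits[[i]]), resid(fits[[i]]), pch = 16,
       xlab = "Fitted values", ylab = "Residuals",
       main = paste("Pair", i))
  abline(h = 0, lty = 2)  # reference line at zero residual
}
```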
summary(lm1)$adj.r.squared
## [1] 0.6294916
summary(lm2)$adj.r.squared
## [1] 0.6291578
summary(lm3)$adj.r.squared
## [1] 0.6292489
summary(lm4)$adj.r.squared
## [1] 0.6296747
I applied simple summary statistics and data visualization to the anscombe dataset. As we can observe, the summary statistics (means, variances, correlations, and adjusted R-squared values) are nearly identical across the four x-y pairs. However, the scatter plots show that the pairs have very different relationships. I therefore conclude that visualizing the data is very important: summary statistics alone are not enough to understand your data.