In this exercise you will learn to visualize the pairwise relationships between a set of quantitative variables. To this end, you will write your own notes on Section 8.1, Correlation plots, from Data Visualization with R.

# import data
data(SaratogaHouses, package="mosaicData")

# select numeric variables
df <- dplyr::select_if(SaratogaHouses, is.numeric)

# calculate the correlations
r <- cor(df, use="complete.obs")
round(r,2)
##            price lotSize   age landValue livingArea pctCollege bedrooms
## price       1.00    0.16 -0.19      0.58       0.71       0.20     0.40
## lotSize     0.16    1.00 -0.02      0.06       0.16      -0.03     0.11
## age        -0.19   -0.02  1.00     -0.02      -0.17      -0.04     0.03
## landValue   0.58    0.06 -0.02      1.00       0.42       0.23     0.20
## livingArea  0.71    0.16 -0.17      0.42       1.00       0.21     0.66
## pctCollege  0.20   -0.03 -0.04      0.23       0.21       1.00     0.16
## bedrooms    0.40    0.11  0.03      0.20       0.66       0.16     1.00
## fireplaces  0.38    0.09 -0.17      0.21       0.47       0.25     0.28
## bathrooms   0.60    0.08 -0.36      0.30       0.72       0.18     0.46
## rooms       0.53    0.14 -0.08      0.30       0.73       0.16     0.67
##            fireplaces bathrooms rooms
## price            0.38      0.60  0.53
## lotSize          0.09      0.08  0.14
## age             -0.17     -0.36 -0.08
## landValue        0.21      0.30  0.30
## livingArea       0.47      0.72  0.73
## pctCollege       0.25      0.18  0.16
## bedrooms         0.28      0.46  0.67
## fireplaces       1.00      0.44  0.32
## bathrooms        0.44      1.00  0.52
## rooms            0.32      0.52  1.00
library(ggplot2)
library(ggcorrplot)

# visualize the correlations
ggcorrplot(r, 
           hc.order = TRUE, 
           type = "lower",
           lab = TRUE)

Q1 What factors have a strong positive correlation with home price?

Living area (0.71) and bathrooms (0.60) have the strongest positive correlations with home price.

Q2 Continued from Q1: Does the strong correlation mean the variable causes home price to go up and down?

No. A correlation only shows that two variables move together; by itself it cannot prove that one causes the other.

Q3 Continued from Q1: Do you think there is a confounding variable?

No. A confounding variable would be a third variable that tricks you into thinking there is a relationship between two factors.

Q4 What factors have strong negative correlation with home price?

None. Age is the only factor with a negative correlation with home price (-0.19), and that correlation is weak.

Q5 What factors have little correlation with home price?

Lot size has the weakest correlation with home price (0.16).

Q6 Simply based on the correlation coefficient, would you be sure that there is no relation at all? What would you do to check?

No. The correlation coefficient only measures linear relationships, so two variables with a coefficient near zero could still have a nonlinear relationship. To check, draw a scatterplot of the two variables.

Q7 Plot correlation for CPS85 in the same way as above. Repeat Q1-Q6.

Hint: The CPS85 data set is from the mosaicData package. Explain wage instead of home price.

# import data
data(CPS85, package="mosaicData")

# select numeric variables
df <- dplyr::select_if(CPS85, is.numeric)

# calculate the correlations
r <- cor(df, use="complete.obs")
round(r,2)
##       wage  educ exper   age
## wage  1.00  0.38  0.09  0.18
## educ  0.38  1.00 -0.35 -0.15
## exper 0.09 -0.35  1.00  0.98
## age   0.18 -0.15  0.98  1.00
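
Q7 asks for the plot as well as the matrix. The following is a minimal sketch, assuming the same `ggcorrplot()` options used for SaratogaHouses are wanted here. It also adds a Q6-style check: a scatterplot with a loess smoother for wage against exper, the pair with the smallest coefficient in the matrix above (0.09). The names `p_cor` and `p_scatter` and the choice of this pair are illustrative assumptions, not part of the original exercise.

```r
library(ggplot2)
library(ggcorrplot)

# import data and keep the numeric variables, as above
data(CPS85, package = "mosaicData")
df <- dplyr::select_if(CPS85, is.numeric)

# correlation matrix for the numeric variables
r <- cor(df, use = "complete.obs")

# visualize the correlations with the same options as the SaratogaHouses plot
p_cor <- ggcorrplot(r,
                    hc.order = TRUE,
                    type = "lower",
                    lab = TRUE)
p_cor

# Q6-style check (illustrative): a scatterplot with a loess smoother shows
# whether a weak linear correlation hides a curved relationship
p_scatter <- ggplot(df, aes(x = exper, y = wage)) +
  geom_point(alpha = 0.4) +
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE)
p_scatter
```

If the smoothed line bends noticeably, the correlation coefficient alone understates how the two variables are related.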

Q8 Hide the messages, the code and its results on the webpage.

Hint: Use message, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.
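
One possible way to apply the hint is to set the three options once for the whole document. This is a sketch, assuming it is placed in a setup chunk near the top of the .Rmd file; it is not the only way to do it.

```r
# Assumption: this runs in a setup chunk before the other chunks.
# It sets defaults for every later chunk; a chunk's own options still override them.
knitr::opts_chunk$set(
  message = FALSE,   # hide package startup and other messages
  echo    = FALSE,   # hide the code itself
  results = "hide"   # hide printed text results such as the correlation matrix
)
```

The same options can instead be written in the header of a single chunk, for example `{r, message=FALSE, echo=FALSE, results='hide'}`. Note that `results = "hide"` suppresses printed text output only; plots are still shown.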

Q9 Display the title and your name correctly at the top of the webpage.

Q10 Use the correct slug.