In this exercise you will learn to visualize the pairwise relationships among a set of quantitative variables. To this end, you will write your own notes on Section 8.1, Correlation plots, from Data Visualization with R.

# import data
data(SaratogaHouses, package="mosaicData")

# select numeric variables
df <- dplyr::select_if(SaratogaHouses, is.numeric)

# calculate the correlations
r <- cor(df, use="complete.obs")
round(r,2)
##            price lotSize   age landValue livingArea pctCollege bedrooms
## price       1.00    0.16 -0.19      0.58       0.71       0.20     0.40
## lotSize     0.16    1.00 -0.02      0.06       0.16      -0.03     0.11
## age        -0.19   -0.02  1.00     -0.02      -0.17      -0.04     0.03
## landValue   0.58    0.06 -0.02      1.00       0.42       0.23     0.20
## livingArea  0.71    0.16 -0.17      0.42       1.00       0.21     0.66
## pctCollege  0.20   -0.03 -0.04      0.23       0.21       1.00     0.16
## bedrooms    0.40    0.11  0.03      0.20       0.66       0.16     1.00
## fireplaces  0.38    0.09 -0.17      0.21       0.47       0.25     0.28
## bathrooms   0.60    0.08 -0.36      0.30       0.72       0.18     0.46
## rooms       0.53    0.14 -0.08      0.30       0.73       0.16     0.67
##            fireplaces bathrooms rooms
## price            0.38      0.60  0.53
## lotSize          0.09      0.08  0.14
## age             -0.17     -0.36 -0.08
## landValue        0.21      0.30  0.30
## livingArea       0.47      0.72  0.73
## pctCollege       0.25      0.18  0.16
## bedrooms         0.28      0.46  0.67
## fireplaces       1.00      0.44  0.32
## bathrooms        0.44      1.00  0.52
## rooms            0.32      0.52  1.00
library(ggplot2)
library(ggcorrplot)

# visualize the correlations
ggcorrplot(r, 
           hc.order = TRUE,
           type = "lower",
           lab = TRUE)

Q1 What factors have positive correlation with home price?

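Every factor except age has a positive correlation with home price. The strongest are living area (0.71), bathrooms (0.60), land value (0.58), and rooms (0.53), all above 0.5 and shown in red on the plot. The bigger the house, the more expensive it tends to be.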

Q2 What factors have strong positive correlation with home price?

Living area has a strong positive correlation with home price (0.71).

Q3 What factors have negative correlation with home price?

Home price has a negative relationship with age (-0.19). The older the home, the less expensive it tends to be, although the relationship is weak.

Q4 What factors have strong negative correlation with home price?

No factors have a strong negative correlation with home price.
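
The answers to Q1 through Q4 can be cross-checked by ranking the price correlations numerically instead of reading them off the plot. The following is a minimal sketch: the values are copied from the rounded matrix printed above so that it runs without mosaicData, and the names `price_cor`, `ranked`, and `strong_cutoff` are illustrative. The 0.7 cutoff for "strong" is an assumption; conventions differ, so adjust it to whatever your course uses.

```r
# Correlations with price, copied from the rounded matrix printed above.
price_cor <- c(lotSize = 0.16, age = -0.19, landValue = 0.58,
               livingArea = 0.71, pctCollege = 0.20, bedrooms = 0.40,
               fireplaces = 0.38, bathrooms = 0.60, rooms = 0.53)

# Rank the factors from most positive to most negative.
ranked <- sort(price_cor, decreasing = TRUE)
print(ranked)

# Assumed cutoff for a "strong" correlation (not defined in the exercise).
strong_cutoff <- 0.7

positive        <- names(ranked[ranked > 0])
strong_positive <- names(ranked[ranked >= strong_cutoff])
negative        <- names(ranked[ranked < 0])
strong_negative <- names(ranked[ranked <= -strong_cutoff])
```

With the real data, the same steps apply to `r["price", ]` after dropping the `price` entry itself.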

Q5 What set of two variables has the highest positive Pearson Product-Moment correlation coefficient? What set of two variables has the greatest negative Pearson Product-Moment correlation coefficient?

Living area and number of rooms have the highest positive correlation (0.73). Bathrooms and age have the greatest negative correlation (-0.36).
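
One way to confirm the Q5 answer without scanning the plot is to search the off-diagonal entries of the correlation matrix for the maximum and minimum. This sketch re-enters the rounded matrix printed above so it is self-contained; `r_houses`, `pairs_only`, `highest_pair`, and `lowest_pair` are illustrative names, not part of the exercise.

```r
vars <- c("price", "lotSize", "age", "landValue", "livingArea",
          "pctCollege", "bedrooms", "fireplaces", "bathrooms", "rooms")

# Rounded correlations copied row by row from the output above.
r_houses <- matrix(c(
   1.00,  0.16, -0.19,  0.58,  0.71,  0.20,  0.40,  0.38,  0.60,  0.53,
   0.16,  1.00, -0.02,  0.06,  0.16, -0.03,  0.11,  0.09,  0.08,  0.14,
  -0.19, -0.02,  1.00, -0.02, -0.17, -0.04,  0.03, -0.17, -0.36, -0.08,
   0.58,  0.06, -0.02,  1.00,  0.42,  0.23,  0.20,  0.21,  0.30,  0.30,
   0.71,  0.16, -0.17,  0.42,  1.00,  0.21,  0.66,  0.47,  0.72,  0.73,
   0.20, -0.03, -0.04,  0.23,  0.21,  1.00,  0.16,  0.25,  0.18,  0.16,
   0.40,  0.11,  0.03,  0.20,  0.66,  0.16,  1.00,  0.28,  0.46,  0.67,
   0.38,  0.09, -0.17,  0.21,  0.47,  0.25,  0.28,  1.00,  0.44,  0.32,
   0.60,  0.08, -0.36,  0.30,  0.72,  0.18,  0.46,  0.44,  1.00,  0.52,
   0.53,  0.14, -0.08,  0.30,  0.73,  0.16,  0.67,  0.32,  0.52,  1.00),
  nrow = 10, byrow = TRUE, dimnames = list(vars, vars))

# Keep each pair once (lower triangle) and drop the diagonal of 1s.
pairs_only <- r_houses
pairs_only[upper.tri(pairs_only, diag = TRUE)] <- NA

# Locate the largest and smallest remaining entries.
hi <- which(pairs_only == max(pairs_only, na.rm = TRUE), arr.ind = TRUE)
lo <- which(pairs_only == min(pairs_only, na.rm = TRUE), arr.ind = TRUE)

highest_pair <- c(rownames(pairs_only)[hi[1, "row"]],
                  colnames(pairs_only)[hi[1, "col"]])
lowest_pair  <- c(rownames(pairs_only)[lo[1, "row"]],
                  colnames(pairs_only)[lo[1, "col"]])

print(highest_pair)
print(lowest_pair)
```

Because the values are rounded to two decimals, near-ties such as 0.73 and 0.72 are better resolved on the unrounded matrix `r`.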

Q7 Plot correlation for CPS85 in the same way as above. Repeat Q1-Q6.

Hint: The CPS85 data set is from the mosaicData package. Explain wage instead of home price.

# import data
data(CPS85, package="mosaicData")

# select numeric variables
df <- dplyr::select_if(CPS85, is.numeric)

# calculate the correlations
r <- cor(df, use="complete.obs")
round(r,2)
##       wage  educ exper   age
## wage  1.00  0.38  0.09  0.18
## educ  0.38  1.00 -0.35 -0.15
## exper 0.09 -0.35  1.00  0.98
## age   0.18 -0.15  0.98  1.00
library(ggplot2)
library(ggcorrplot)

# visualize the correlations
ggcorrplot(r, 
           hc.order = TRUE,
           type = "lower",
           lab = TRUE)
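
To repeat the questions for wage, it may help to lay out every pair of variables in one sorted table. The sketch below re-enters the rounded CPS85 matrix printed above so it runs without mosaicData; `r_cps`, `wage_cor`, and `pair_table` are illustrative names, not part of the exercise.

```r
vars <- c("wage", "educ", "exper", "age")

# Rounded correlations copied from the CPS85 output above.
r_cps <- matrix(c(1.00,  0.38,  0.09,  0.18,
                  0.38,  1.00, -0.35, -0.15,
                  0.09, -0.35,  1.00,  0.98,
                  0.18, -0.15,  0.98,  1.00),
                nrow = 4, byrow = TRUE, dimnames = list(vars, vars))

# Correlations with wage, excluding wage itself, most positive first.
wage_cor <- sort(r_cps["wage", vars != "wage"], decreasing = TRUE)
print(wage_cor)

# All distinct pairs as a long table, sorted by correlation.
idx <- which(lower.tri(r_cps), arr.ind = TRUE)
pair_table <- data.frame(var1 = rownames(r_cps)[idx[, "row"]],
                         var2 = colnames(r_cps)[idx[, "col"]],
                         cor  = r_cps[idx])
pair_table <- pair_table[order(pair_table$cor, decreasing = TRUE), ]
print(pair_table)
```

The first and last rows of `pair_table` give the most positive and most negative pairs, and `wage_cor` shows which factors move with wage.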

Q8 Hide the messages, the code and its results on the webpage.

Hint: Use the message, echo, and results chunk options. Refer to the R Markdown Reference Guide.
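
As a sketch of one way to apply the hint, the three options can be set once for the whole document with `knitr::opts_chunk$set()`, assuming the page is knit with knitr and that this call sits in a setup chunk near the top of the .Rmd file. The same option names can instead be written in an individual chunk header to affect only that chunk.

```r
# Set default chunk options for every chunk that follows.
knitr::opts_chunk$set(
  message = FALSE,   # suppress messages such as package startup notes
  echo    = FALSE,   # do not show the source code
  results = "hide"   # do not show printed text output
)
```

Note that `results = "hide"` suppresses printed text output only; plots such as the one drawn by `ggcorrplot()` are still displayed.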

Q9 Display the title and your name correctly at the top of the webpage.

Q10 Use the correct slug.