You will use college tuition and diversity data for this quiz. See below for the definition of some of the variables.
Hint: The data file is posted on Moodle under Module 5. It is named “college_tuition.csv”.
data <- read.csv("~/BusStats/Data/college_tuition.csv")
Hint: For the code, refer to one of our textbooks, Data Visualization with R: Chapter 4.2. Map in_state_total to the y-axis and percent_minority to the x-axis.
library(tidyverse)
ggplot(data,
       aes(x = percent_minority,
           y = in_state_total)) +
  geom_point()
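To judge whether the scatterplot looks linear, one option is to overlay a straight-line fit on the points. The sketch below is only an illustration: the `toy` data frame is made up so the code runs without college_tuition.csv, and its values say nothing about the real data. With the quiz data, `data` would take the place of `toy`.

```r
library(ggplot2)

# Made-up stand-in data (an assumption for illustration, not the quiz data)
set.seed(1)
toy <- data.frame(percent_minority = runif(100, 0, 1))
toy$in_state_total <- 40000 - 10000 * toy$percent_minority +
  rnorm(100, sd = 12000)

p <- ggplot(toy,
            aes(x = percent_minority,
                y = in_state_total)) +
  geom_point(alpha = 0.6) +
  # least-squares straight line, drawn without the confidence band
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
p
```

If the points scatter evenly around the line, a linear description is reasonable; a curved pattern around the line would suggest a non-linear relationship.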
Hint: Interpret both the direction and the strength of the correlation.
cor(data$percent_minority, data$in_state_total)
## [1] -0.245447
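One way to turn a coefficient into a direction-and-strength statement is a small helper like the one sketched here. `describe_correlation()` is a hypothetical function written for this illustration, and the 0.3 and 0.7 cutoffs are a common rule of thumb assumed here, not a standard taken from the quiz or the textbook.

```r
# Hypothetical helper: label a correlation by strength and direction.
# The 0.3 / 0.7 cutoffs are a rule-of-thumb assumption.
describe_correlation <- function(r) {
  stopifnot(is.numeric(r), length(r) == 1, !is.na(r), abs(r) <= 1)
  if (r == 0) {
    return("no linear association")
  }
  direction <- if (r > 0) "positive" else "negative"
  strength <- if (abs(r) >= 0.7) {
    "strong"
  } else if (abs(r) >= 0.3) {
    "moderate"
  } else {
    "weak"
  }
  paste(strength, direction)
}

describe_correlation(-0.245447)
## [1] "weak negative"
```

Under those cutoffs, the coefficient above reads as a weak negative correlation.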
Hint: For the code, refer to one of our textbooks, Data Visualization with R: Chapter 8.1.
# select numeric variables
df <- dplyr::select_if(data, is.numeric)
# calculate the correlations
r <- cor(df, use = "complete.obs")
round(r, 2)
##                      room_and_board in_state_tuition in_state_total
## room_and_board                 1.00             0.72           0.78
## in_state_tuition               0.72             1.00           0.98
## in_state_total                 0.78             0.98           1.00
## out_of_state_tuition           0.77             0.95           0.95
## out_of_state_total             0.82             0.93           0.97
## total_enrollment               0.10            -0.17          -0.13
## percent_minority              -0.16            -0.24          -0.25
## percent_foreign                0.34             0.39           0.40
## total_minority                -0.01            -0.22          -0.21
## foreign_enrollment             0.29             0.16           0.20
##                      out_of_state_tuition out_of_state_total total_enrollment
## room_and_board                       0.77               0.82             0.10
## in_state_tuition                     0.95               0.93            -0.17
## in_state_total                       0.95               0.97            -0.13
## out_of_state_tuition                 1.00               0.98            -0.01
## out_of_state_total                   0.98               1.00            -0.01
## total_enrollment                    -0.01              -0.01             1.00
## percent_minority                    -0.25              -0.25             0.13
## percent_foreign                      0.41               0.41             0.09
## total_minority                      -0.11              -0.12             0.85
## foreign_enrollment                   0.28               0.29             0.62
##                      percent_minority percent_foreign total_minority
## room_and_board                  -0.16            0.34          -0.01
## in_state_tuition                -0.24            0.39          -0.22
## in_state_total                  -0.25            0.40          -0.21
## out_of_state_tuition            -0.25            0.41          -0.11
## out_of_state_total              -0.25            0.41          -0.12
## total_enrollment                 0.13            0.09           0.85
## percent_minority                 1.00           -0.09           0.40
## percent_foreign                 -0.09            1.00           0.03
## total_minority                   0.40            0.03           1.00
## foreign_enrollment              -0.01            0.46           0.44
##                      foreign_enrollment
## room_and_board                     0.29
## in_state_tuition                   0.16
## in_state_total                     0.20
## out_of_state_tuition               0.28
## out_of_state_total                 0.29
## total_enrollment                   0.62
## percent_minority                  -0.01
## percent_foreign                    0.46
## total_minority                     0.44
## foreign_enrollment                 1.00
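A full matrix is hard to scan, so it can help to pull out one variable's correlations and sort them. The sketch below copies the rounded percent_minority values from the output above into a named vector so it runs on its own; `r_minority` and `positive` are names chosen for this illustration. With the full matrix in memory, the same vector could be taken from `r[, "percent_minority"]` instead.

```r
# Correlations with percent_minority, copied from the rounded matrix above
# (the variable's correlation with itself, 1.00, is left out)
r_minority <- c(
  room_and_board       = -0.16,
  in_state_tuition     = -0.24,
  in_state_total       = -0.25,
  out_of_state_tuition = -0.25,
  out_of_state_total   = -0.25,
  total_enrollment     =  0.13,
  percent_foreign      = -0.09,
  total_minority       =  0.40,
  foreign_enrollment   = -0.01
)

# Order from most negative to most positive
sort(r_minority)

# Names of the variables with a positive correlation
positive <- names(r_minority)[r_minority > 0]
positive
```

The same pattern works for any other column of the matrix.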
library(ggplot2)
library(ggcorrplot)
ggcorrplot(r)
ggcorrplot(r,
           hc.order = TRUE,
           type = "lower",
           lab = TRUE)
room_and_board was the only variable with a positive association.
Hint: A correct answer must include all of the following: 1) direction and strength of the correlation coefficient, and 2) linear versus non-linear relationship.
I would disagree with that statement. Based on the weak negative linear relationship in the graph (r = -0.25), you can argue the other way.
Hint: Use message, warning, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.
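As a rough illustration of the four options named in the hint, they can be set for every chunk at once with `knitr::opts_chunk$set()` in a setup chunk, or individually in a chunk header such as `{r, message=FALSE}`. The values below are only an example; which ones should be TRUE or FALSE depends on what the question asks to show or hide.

```r
# Illustrative values only (an assumption); adjust to the question.
knitr::opts_chunk$set(
  message = FALSE,    # do not print messages (e.g. package start-up notes)
  warning = FALSE,    # do not print warnings
  echo    = TRUE,     # show the source code
  results = "markup"  # show text output; "hide" would suppress it
)
```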