Illya Mowerman, Ph.D.
It brings statistical significance when describing a relationship between two continuous variables.
Can you see the relationship?
library(ggplot2)  # provides ggplot(), geom_point(), labs()
ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point() +
  labs(title = "Scatter Plot: MPG vs. Horsepower",
       x = "Horsepower",
       y = "Miles per Gallon")
Can you see the relationship better?
ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "red") +  # fitted straight line, no confidence band
  labs(title = "Scatter Plot: MPG vs. Horsepower",
       x = "Horsepower",
       y = "Miles per Gallon")
This is how to test it statistically:
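The call behind the output that follows is not shown here. Judging from its `data:` line, it was presumably base R's `cor.test()` with the default Pearson method. A minimal sketch under that assumption (the name `hp_test` is illustrative):

```r
# Assumed call: Pearson correlation test between MPG and horsepower.
# mtcars ships with base R (datasets package), so no extra packages are needed.
hp_test <- cor.test(mtcars$mpg, mtcars$hp)  # method = "pearson" is the default
print(hp_test)
```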
##
## Pearson's product-moment correlation
##
## data: mtcars$mpg and mtcars$hp
## t = -6.7424, df = 30, p-value = 1.788e-07
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.8852686 -0.5860994
## sample estimates:
## cor
## -0.7761684
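To see where `t`, `df`, and the p-value in the MPG vs. horsepower output come from, they can be recomputed by hand from the correlation coefficient using the standard formula t = r * sqrt(n - 2) / sqrt(1 - r^2). This is only an illustrative sketch; the variable names are not from the original slides.

```r
# Recompute the t statistic of the Pearson test by hand:
#   t = r * sqrt(n - 2) / sqrt(1 - r^2), with df = n - 2
r  <- cor(mtcars$mpg, mtcars$hp)   # sample correlation
n  <- nrow(mtcars)                 # number of cars in the data set
df <- n - 2                        # degrees of freedom
t_stat  <- r * sqrt(df) / sqrt(1 - r^2)
# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df)
c(t = t_stat, df = df, p = p_value)
```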
##
## Pearson's product-moment correlation
##
## data: mtcars$mpg and mtcars$wt
## t = -9.559, df = 30, p-value = 1.294e-10
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.9338264 -0.7440872
## sample estimates:
## cor
## -0.8676594
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  labs(title = "Scatter Plot: MPG vs. Weight",
       x = "Weight (1000 lbs)",
       y = "Miles per Gallon")
Interpretation: We can clearly see that as weight increases, fuel efficiency (MPG) decreases.
##
## Pearson's product-moment correlation
##
## data: mtcars$hp and mtcars$qsec
## t = -5.4946, df = 30, p-value = 5.766e-06
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.8475998 -0.4774331
## sample estimates:
## cor
## -0.7082234
ggplot(mtcars, aes(x = hp, y = qsec)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "blue") +
  labs(title = "Scatter Plot: Quarter-mile Time vs. Horsepower",
       x = "Horsepower",
       y = "Quarter-mile Time (seconds)")
Interpretation: Cars with more horsepower generally complete the quarter-mile in less time.
##        mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
## mpg   1.00 -0.85 -0.85 -0.78  0.68 -0.87  0.42  0.66  0.60  0.48 -0.55
## cyl  -0.85  1.00  0.90  0.83 -0.70  0.78 -0.59 -0.81 -0.52 -0.49  0.53
## disp -0.85  0.90  1.00  0.79 -0.71  0.89 -0.43 -0.71 -0.59 -0.56  0.39
## hp   -0.78  0.83  0.79  1.00 -0.45  0.66 -0.71 -0.72 -0.24 -0.13  0.75
## drat  0.68 -0.70 -0.71 -0.45  1.00 -0.71  0.09  0.44  0.71  0.70 -0.09
## wt   -0.87  0.78  0.89  0.66 -0.71  1.00 -0.17 -0.55 -0.69 -0.58  0.43
## qsec  0.42 -0.59 -0.43 -0.71  0.09 -0.17  1.00  0.74 -0.23 -0.21 -0.66
## vs    0.66 -0.81 -0.71 -0.72  0.44 -0.55  0.74  1.00  0.17  0.21 -0.57
## am    0.60 -0.52 -0.59 -0.24  0.71 -0.69 -0.23  0.17  1.00  0.79  0.06
## gear  0.48 -0.49 -0.56 -0.13  0.70 -0.58 -0.21  0.21  0.79  1.00  0.27
## carb -0.55  0.53  0.39  0.75 -0.09  0.43 -0.66 -0.57  0.06  0.27  1.00
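The code that printed the matrix above, and that defines the `cor_matrix` object passed to `corrplot()`, does not appear in this extract. A plausible reconstruction, assuming base R's `cor()` on the full `mtcars` data frame with rounding to two decimals for display:

```r
# Assumed definition: pairwise Pearson correlations of all mtcars columns.
cor_matrix <- cor(mtcars)   # 11 x 11 matrix; every mtcars column is numeric
round(cor_matrix, 2)        # rounded to two decimals for display
```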
library(corrplot)  # provides corrplot()
corrplot(cor_matrix, method = "color", type = "upper", order = "hclust",
         tl.col = "black", tl.srt = 45)
##
## Pearson's product-moment correlation
##
## data: mtcars$mpg and mtcars$wt
## t = -9.559, df = 30, p-value = 1.294e-10
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.9338264 -0.7440872
## sample estimates:
## cor
## -0.8676594
Interpretation:
- Correlation coefficient: -0.8677 (strong negative correlation)
- P-value: 1.294e-10 (much smaller than 0.05, so statistically significant)
- We can be very confident that this correlation isn’t due to chance
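The figures quoted in this interpretation can also be read programmatically from the object that `cor.test()` returns, rather than copied from the printed output. A short sketch, assuming the same MPG vs. weight test; `wt_test` and `alpha` are illustrative names:

```r
# Pull the numbers quoted in the interpretation out of the test object.
wt_test <- cor.test(mtcars$mpg, mtcars$wt)
r     <- unname(wt_test$estimate)   # correlation coefficient
p     <- wt_test$p.value            # two-sided p-value
ci    <- wt_test$conf.int           # 95% confidence interval for r
alpha <- 0.05                       # conventional significance level
cat(sprintf("r = %.4f, p = %.3e, 95%% CI [%.4f, %.4f]\n", r, p, ci[1], ci[2]))
cat("Significant at alpha = 0.05:", p < alpha, "\n")
```

The confidence interval excludes 0, which agrees with the very small p-value.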
pairs(mtcars[, c("mpg", "disp", "hp", "wt")],
      main = "Scatter Plot Matrix of Selected mtcars Variables")
This matrix shows relationships between multiple variables at once.
Questions? Comments?