We will use the state.x77 dataset available in the base R installation. It provides the following data for the 50 US states in 1977:

* population
* income
* illiteracy rate
* life expectancy
* murder rate
* high school graduation rate

states<- state.x77[,1:6]
library(psych)
describe(states)[, c(1:5, 7:9)] # selected columns
##            vars  n    mean      sd  median     mad     min     max
## Population    1 50 4246.42 4464.49 2838.50 2890.33  365.00 21198.0
## Income        2 50 4435.80  614.47 4519.00  581.18 3098.00  6315.0
## Illiteracy    3 50    1.17    0.61    0.95    0.52    0.50     2.8
## Life Exp      4 50   70.88    1.34   70.67    1.54   67.96    73.6
## Murder        5 50    7.38    3.69    6.85    5.19    1.40    15.1
## HS Grad       6 50   53.11    8.08   53.25    8.60   37.80    67.3
Consider the correlations among the variables in the states data frame.
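Before plotting, it can help to look at the correlations as plain numbers. The following is a minimal sketch using base R's cor(); the object name r is an illustrative choice and does not appear in the original code.

```r
# Rebuild the states subset so this block runs on its own
states <- state.x77[, 1:6]

# Pearson correlation matrix of the six variables (6 x 6, symmetric)
r <- cor(states)

# Round to two decimals for display
round(r, 2)
```

The corrgram that follows displays this same matrix graphically.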
library(corrgram)
# Shaded cells below the diagonal, pie charts above; order=TRUE reorders the variables
corrgram(states, order=TRUE, lower.panel=panel.shade,
         upper.panel=panel.pie, text.panel=panel.txt,
         main="Corrgram of states intercorrelations")
Corrgram of the correlations among the variables in the states data frame. Rows and columns have been reordered using principal components analysis.
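To give a feel for what reordering by principal components can mean, here is a rough sketch that sorts the variables by their angle in the plane of the first two eigenvectors of the correlation matrix. This is an assumption about the general approach, not a copy of the package's internals, and the names e1, e2, alpha, and ord are illustrative. Because the sign of an eigenvector is arbitrary, the order may come out reversed or rotated relative to the plot.

```r
# Rebuild the states subset so this block runs on its own
states <- state.x77[, 1:6]
r <- cor(states)

# First two eigenvectors (principal component directions) of the correlation matrix
e <- eigen(r)$vectors[, 1:2]
e1 <- e[, 1]
e2 <- e[, 2]

# Angle of each variable in the plane spanned by the first two components
alpha <- ifelse(e1 > 0, atan(e2 / e1), atan(e2 / e1) + pi)

# Sort variables by angle so that variables with similar
# correlation patterns end up next to each other
ord <- order(alpha)
colnames(states)[ord]
```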
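To interpret the graph, start with the lower triangle of cells (those below the diagonal). By default, blue shading with hatching that runs from lower left to upper right indicates a positive correlation between the two variables that meet at that cell, and red shading with hatching from upper left to lower right indicates a negative correlation. The darker and more saturated the color, the larger the magnitude of the correlation. The pie charts in the upper triangle show the same information: the size of the filled slice reflects the strength of the correlation, filled clockwise for positive values and counterclockwise for negative ones. A second variation replaces these panels with ellipses and scatter plots: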
corrgram(states, order=TRUE, lower.panel=panel.ellipse,
         upper.panel=panel.pts, text.panel=panel.txt,
         diag.panel=panel.minmax,
         main="Corrgram of states data using scatter plots and ellipses")
Here we are using smoothed fit lines and confidence ellipses in the lower triangle and scatter plots in the upper triangle.
Corrgram of the correlations among the variables in the states data frame. The lower triangle contains smoothed best-fit lines and confidence ellipses, and the upper triangle contains scatter plots. The diagonal panel contains minimum and maximum values. Rows and columns have been reordered using principal components analysis.
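The minimum and maximum values shown on the diagonal can be checked directly. The following is a small sketch using base R; the object name rng is illustrative.

```r
# Rebuild the states subset so this block runs on its own
states <- state.x77[, 1:6]

# range() returns c(min, max); apply it to each column
rng <- apply(states, 2, range)
rownames(rng) <- c("min", "max")
rng
```

These values should agree with the min and max columns of the describe() output shown earlier.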
cols <- colorRampPalette(c("darkgoldenrod4", "burlywood1",
                           "darkkhaki", "darkgreen"))
corrgram(states, order=TRUE, col.regions=cols,
         lower.panel=panel.shade,
         upper.panel=panel.conf, text.panel=panel.txt,
         main="A Corrgram (or Horse) of a Different Color")
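Here we are using a custom color palette, with shading in the lower triangle and correlation coefficients with their confidence intervals in the upper triangle.
Corrgram of the correlations among the variables in the states data frame, drawn with a custom color palette. The lower triangle is shaded to represent the magnitude and direction of the correlations, and the upper triangle prints each correlation with its confidence interval. Rows and columns have been reordered using principal components analysis.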
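The call above passes panel.conf for the upper triangle, which prints each correlation together with a confidence interval. The following is a rough sketch of the same quantities for a single pair of variables, using base R's cor.test(). Whether the panel computes them in exactly this way is an assumption here, and the object name ct is illustrative.

```r
# Rebuild the states subset so this block runs on its own
states <- state.x77[, 1:6]

# Pearson correlation test for one pair of variables
ct <- cor.test(states[, "Murder"], states[, "Illiteracy"])

ct$estimate   # sample correlation coefficient
ct$conf.int   # confidence interval at the default 95% level
```

An interval that excludes zero suggests the correlation is distinguishable from no linear association at that confidence level.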