State dataset

We will use the state.x77 dataset available in the base R installation. It provides the following data for the 50 US states in 1977:

* population
* income
* illiteracy rate
* life expectancy
* murder rate
* high school graduation rate
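Before subsetting, it can help to look at how the dataset is stored. The sketch below uses only base R functions. Note that state.x77 is a numeric matrix and also carries two further columns (Frost and Area), which is why only the first six columns are used in what follows.

```r
# state.x77 ships with base R (package 'datasets'), so nothing needs installing
class(state.x77)           # a numeric matrix, not a data frame
dim(state.x77)             # 50 rows (states) by 8 columns (variables)
colnames(state.x77)        # variable names, including Frost and Area
head(rownames(state.x77))  # state names are stored as row names
```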

Summary of the State dataset

states <- state.x77[, 1:6]       # keep the first six variables
library(psych)
describe(states)[, c(1:5, 7:9)]  # selected columns
##            vars  n    mean      sd  median     mad     min     max
## Population    1 50 4246.42 4464.49 2838.50 2890.33  365.00 21198.0
## Income        2 50 4435.80  614.47 4519.00  581.18 3098.00  6315.0
## Illiteracy    3 50    1.17    0.61    0.95    0.52    0.50     2.8
## Life Exp      4 50   70.88    1.34   70.67    1.54   67.96    73.6
## Murder        5 50    7.38    3.69    6.85    5.19    1.40    15.1
## HS Grad       6 50   53.11    8.08   53.25    8.60   37.80    67.3
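If the psych package is not available, a similar table can be approximated in base R. This is a minimal sketch, not the way describe() is implemented. The helper name summary_stats is made up for illustration, and mad() is used with its default scaling constant.

```r
states <- state.x77[, 1:6]

# One row per variable, one column per statistic
summary_stats <- t(apply(states, 2, function(x) {
  c(n      = length(x),
    mean   = mean(x),
    sd     = sd(x),
    median = median(x),
    mad    = mad(x),
    min    = min(x),
    max    = max(x))
}))

round(summary_stats, 2)
```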

Corrgrams

Consider the correlations among the variables in the states data frame.

library(corrgram)
corrgram(states, order=TRUE, lower.panel=panel.shade,
         upper.panel=panel.pie, text.panel=panel.txt,
         main="Corrgram of states intercorrelations")


Corrgram of the correlations among the variables in the states data frame. Rows and columns have been reordered using principal components analysis.
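The reordering can be sketched by hand. The code below assumes the angular ordering commonly described for corrgrams: each variable is placed by its angle in the plane of the first two eigenvectors of the correlation matrix. It illustrates the idea and may not match the package's internal code line for line. The names eig_vec, alpha and pca_order are chosen here for illustration.

```r
states <- state.x77[, 1:6]
cor_mat <- cor(states)

# First two eigenvectors of the correlation matrix
eig_vec <- eigen(cor_mat)$vectors[, 1:2]
e1 <- eig_vec[, 1]
e2 <- eig_vec[, 2]

# Angle of each variable in the plane spanned by those two eigenvectors
alpha <- ifelse(e1 > 0, atan(e2 / e1), atan(e2 / e1) + pi)

# Sorting by angle puts variables with similar correlation patterns side by side
pca_order <- order(alpha)
colnames(states)[pca_order]
```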

Interpretation of the above corrgram

Start with the lower triangle of the cells:
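A reading of the shaded cells can be checked against the numeric correlation matrix. In the sketch below, the sign of each entry gives the direction of the relationship and the absolute value gives its strength. How those values map to colors depends on the palette in use, so the comparison with the plot is left to the reader.

```r
states <- state.x77[, 1:6]
cor_mat <- cor(states)

# Full correlation matrix, rounded for readability
round(cor_mat, 2)

# Two example cells with opposite signs
cor_mat["Illiteracy", "Murder"]  # positive: the two rates rise together
cor_mat["Life Exp", "Murder"]    # negative: one falls as the other rises
```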

Corrgrams

corrgram(states, order=TRUE, lower.panel=panel.ellipse,
         upper.panel=panel.pts, text.panel=panel.txt,
         diag.panel=panel.minmax,
         main="Corrgram of states data using scatter plots\nand ellipses")

Here we are using smoothed fit lines and confidence ellipses in the lower triangle and the scatter plots in the upper triangle.


Corrgram of the correlations among the variables in the states data frame. The lower triangle contains smoothed best-fit lines and confidence ellipses, and the upper triangle contains scatter plots. The diagonal panel contains minimum and maximum values. Rows and columns have been reordered using principal components analysis.
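The values printed on the diagonal can be reproduced in base R. This is a small sketch for cross-checking the plot. The name min_max is chosen here for illustration.

```r
states <- state.x77[, 1:6]

# range() returns c(min, max); transpose so each variable is one row
min_max <- t(apply(states, 2, range))
colnames(min_max) <- c("min", "max")
min_max
```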

Corrgrams

cols <- colorRampPalette(c("darkgoldenrod4", "burlywood1",
                           "darkkhaki", "darkgreen"))
corrgram(states, order=TRUE, col.regions=cols,
         lower.panel=panel.shade,
         upper.panel=panel.conf, text.panel=panel.txt,
         main="A Corrgram (or Horse) of a Different Color")

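Here we are using a custom color palette, with shading in the lower triangle and correlation coefficients with confidence intervals in the upper triangle. Because order=TRUE is set, the variables are again reordered using principal components analysis.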
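It may help to see what cols actually is. colorRampPalette() returns a function, not a vector of colors, and calling that function with a number n yields n interpolated colors. The sketch below inspects it. The assumption that corrgram maps the first color to correlations near -1 and the last to correlations near +1 should be checked against the installed package version.

```r
cols <- colorRampPalette(c("darkgoldenrod4", "burlywood1",
                           "darkkhaki", "darkgreen"))

is.function(cols)  # the palette is a function of the number of colors wanted

# Seven interpolated colors as hex strings, from the brown end to the green end
cols(7)
```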


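Corrgram of the correlations among the variables in the states data frame, drawn with a custom color palette. The lower triangle is shaded to represent the magnitude and direction of the correlations, and the upper triangle shows the correlation coefficients with their confidence intervals. Rows and columns have been reordered using principal components analysis.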
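The numbers shown by panel.conf in the upper triangle can be cross-checked for a single pair of variables with cor.test() from base R. This sketch assumes the panel reports the usual 95% interval for a Pearson correlation. The pair of variables is an arbitrary choice.

```r
states <- state.x77[, 1:6]

# Pearson correlation between illiteracy and murder rate, with a 95% interval
ct <- cor.test(states[, "Illiteracy"], states[, "Murder"])

ct$estimate  # point estimate of the correlation
ct$conf.int  # lower and upper bounds of the confidence interval
```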