An Introduction to the corrplot Package

Introduction

The corrplot package provides a graphical display of a correlation matrix and its confidence intervals.
It also contains some algorithms for matrix reordering. In addition, corrplot handles the details well, including the choice of colors, text labels, color labels, layout, etc.

Visualization Methods

There are seven visualization methods (parameter method) in the corrplot package: "circle", "square", "ellipse", "number", "shade", "color", and "pie".

library(corrplot)
data(mtcars)
M <- cor(mtcars)
corrplot(M, method = "circle")

corrplot(M, method = "square")

corrplot(M, method = "ellipse")

corrplot(M, method = "number")

corrplot(M, method = "shade")

corrplot(M, method = "color")

corrplot(M, method = "pie")

Layout

There are three layout types (parameter type): "full" (default), "upper", and "lower". They display the full matrix, the upper triangle, or the lower triangle, respectively.

corrplot(M, type = "upper")

corrplot(M, type = "lower")

corrplot.mixed() is a wrapper function for mixed visualization styles: one method is used for the lower triangle and another for the upper triangle.

corrplot.mixed(M)

corrplot.mixed(M, lower = "ellipse", upper = "circle")

corrplot.mixed(M, lower = "square", upper = "circle")

Reorder A Correlation Matrix

Matrix reordering is very important for mining the hidden structure and patterns in the matrix. There are four methods in corrplot (parameter order): "AOE", "FPC", "hclust", and "alphabet". More algorithms can be found in the seriation package.

You can also reorder the matrix "manually" via the function corrMatOrder().
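
As a minimal sketch of manual reordering (assuming the corrplot package is installed): corrMatOrder() returns a permutation of the variable indices, which can then be applied to both rows and columns before plotting. The names ord and M_aoe are illustrative only.

```r
library(corrplot)

M <- cor(mtcars)

# corrMatOrder() returns the variable order as an index vector
ord <- corrMatOrder(M, order = "AOE")

# apply the same permutation to rows and columns to keep the matrix symmetric
M_aoe <- M[ord, ord]

# plot the matrix in the order chosen above
corrplot(M_aoe)
```

Reordering by hand like this is useful when the same variable order should be reused across several plots.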

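"AOE" stands for the angular order of the eigenvectors. It is calculated from the order of the angles \( a_i \):

\[ a_i = \begin{cases} \arctan (e_{i2}/e_{i1}), & \text{if $e_{i1}>0$;} \newline \arctan (e_{i2}/e_{i1}) + \pi, & \text{otherwise.} \end{cases} \]

where \( e_1 \) and \( e_2 \) are the eigenvectors associated with the two largest eigenvalues of the correlation matrix, and \( e_{i1} \) and \( e_{i2} \) are their \( i \)-th components. See Michael Friendly (2002) for details.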
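
The angular ordering can be sketched in base R. This is an illustration rather than the package's own code: it assumes the angle for each variable is the arctangent of the ratio of its components on the first two eigenvectors. The names e1, e2, alpha and aoe_order are illustrative.

```r
M <- cor(mtcars)

# eigen() returns eigenvalues in decreasing order, so the first two
# columns are the eigenvectors for the two largest eigenvalues
vec <- eigen(M)$vectors[, 1:2]
e1 <- vec[, 1]
e2 <- vec[, 2]

# angle of each variable in the plane spanned by the two eigenvectors
alpha <- ifelse(e1 > 0, atan(e2 / e1), atan(e2 / e1) + pi)

# variables sorted by angle
aoe_order <- order(alpha)
colnames(M)[aoe_order]
```

Variables with similar correlation patterns get similar angles, so sorting by angle places them next to each other.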

corrplot(M, order = "AOE")

corrplot(M, order = "hclust")

corrplot(M, order = "FPC")

corrplot(M, order = "alphabet")

When using "hclust", corrplot() can draw rectangles around groups of variables in the correlation matrix plot, based on the results of the hierarchical clustering. The parameter addrect sets the number of rectangles.

corrplot(M, order = "hclust", addrect = 2)

corrplot(M, order = "hclust", addrect = 3)
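
To see which variables would fall inside each rectangle, the clustering can be approximated in base R. This sketch assumes a dissimilarity of 1 - M and complete linkage, which is an assumption about how the clustering is set up and not something stated above. The names hc and groups are illustrative.

```r
M <- cor(mtcars)

# hierarchical clustering on a correlation-based dissimilarity (assumed: 1 - M)
hc <- hclust(as.dist(1 - M), method = "complete")

# cut the tree into k groups, analogous to addrect = k
groups <- cutree(hc, k = 2)

# list the variables in each group
split(names(groups), groups)
```

Changing k shows how the groups split further as more rectangles are requested.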

Using Different Color Spectrum

We can also specify the color scheme (parameter col). colorRampPalette() is very convenient for generating a color spectrum.

col1 <- colorRampPalette(c("#7F0000", "red", "#FF7F00", "yellow", "white", "cyan", 
    "#007FFF", "blue", "#00007F"))
col2 <- colorRampPalette(c("#67001F", "#B2182B", "#D6604D", "#F4A582", "#FDDBC7", 
    "#FFFFFF", "#D1E5F0", "#92C5DE", "#4393C3", "#2166AC", "#053061"))
col3 <- colorRampPalette(c("red", "white", "blue"))
col4 <- colorRampPalette(c("#7F0000", "red", "#FF7F00", "yellow", "#7FFF7F", 
    "cyan", "#007FFF", "blue", "#00007F"))
wb <- c("white", "black")
## use these color spectra
corrplot(M, order = "hclust", addrect = 2, col = col1(100))

corrplot(M, order = "hclust", addrect = 2, col = col2(50))

corrplot(M, order = "hclust", addrect = 2, col = col3(20))

corrplot(M, order = "hclust", addrect = 2, col = col4(10))

corrplot(M, order = "hclust", addrect = 2, col = wb, bg = "gold2")
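
A short base R sketch of what the palette functions above produce: colorRampPalette() returns a function, and calling that function with n yields a character vector of n interpolated hex colors. The variable pal is illustrative.

```r
# build a palette function from three anchor colors
col3 <- colorRampPalette(c("red", "white", "blue"))

# ask for five colors along the red-white-blue gradient
pal <- col3(5)
print(pal)

# a larger n gives a smoother gradient over the plotted value range
length(col3(200))
```

This is why the number passed to col1(), col2(), and so on controls how finely the color scale is graded.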

Color Legend and Text Legend

Parameters named cl.* control the color legend, and parameters named tl.* control the text legend.

Here are some examples.

## remove color legend and text legend
corrplot(M, order = "AOE", cl.pos = "n", tl.pos = "n")

## bottom color legend, diagonal text legend, rotate text label
corrplot(M, order = "AOE", cl.pos = "b", tl.pos = "d", tl.srt = 60)

## a wider color legend with numbers right aligned
corrplot(M, order = "AOE", cl.ratio = 0.2, cl.align.text = "r")

Deal with the Non-correlation Matrix

corrplot(abs(M), order = "AOE", col = col3(200), cl.lim = c(0, 1))

## visualize a matrix in [-100, 100]
ran <- round(matrix(runif(225, -100, 100), 15))
corrplot(ran, is.corr = FALSE, method = "square")

## a beautiful color legend
corrplot(ran, is.corr = FALSE, method = "ellipse", cl.lim = c(-100, 100))

Combine with the Significance Test

## compute p-values and confidence bounds for every pair of columns
cor.mtest <- function(mat, conf.level = 0.95) {
    mat <- as.matrix(mat)
    n <- ncol(mat)
    p.mat <- lowCI.mat <- uppCI.mat <- matrix(NA, n, n)
    ## diagonal: each variable is perfectly correlated with itself
    diag(p.mat) <- 0
    diag(lowCI.mat) <- diag(uppCI.mat) <- 1
    for (i in 1:(n - 1)) {
        for (j in (i + 1):n) {
            tmp <- cor.test(mat[, i], mat[, j], conf.level = conf.level)
            ## the matrices are symmetric, so fill both triangles
            p.mat[i, j] <- p.mat[j, i] <- tmp$p.value
            lowCI.mat[i, j] <- lowCI.mat[j, i] <- tmp$conf.int[1]
            uppCI.mat[i, j] <- uppCI.mat[j, i] <- tmp$conf.int[2]
        }
    }
    ## returned in order: p-values, lower bounds, upper bounds
    return(list(p.mat, lowCI.mat, uppCI.mat))
}
res1 <- cor.mtest(mtcars, 0.95)
res2 <- cor.mtest(mtcars, 0.99)
## mark the insignificant coefficients according to the significance level
corrplot(M, p.mat = res1[[1]], sig.level = 0.2)

corrplot(M, p.mat = res1[[1]], sig.level = 0.05)

corrplot(M, p.mat = res1[[1]], sig.level = 0.01)

## leave insignificant coefficients blank
corrplot(M, p.mat = res1[[1]], insig = "blank")

## add p-values on insignificant coefficients
corrplot(M, p.mat = res1[[1]], insig = "p-value")

## add all p-values
corrplot(M, p.mat = res1[[1]], insig = "p-value", sig.level = -1)

## add a cross on insignificant coefficients
corrplot(M, p.mat = res1[[1]], order = "hclust", insig = "pch", addrect = 3)
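
The p-value matrix used in these plots is built from repeated calls to cor.test(). As a small base R sketch, here is what a single call returns for one pair of mtcars columns; the pair mpg and wt is chosen only for illustration.

```r
# test the correlation between two columns of mtcars
ct <- cor.test(mtcars$mpg, mtcars$wt, conf.level = 0.95)

ct$estimate   # sample correlation coefficient
ct$p.value    # p-value for the null hypothesis of zero correlation
ct$conf.int   # lower and upper bounds of the 95% confidence interval

# a coefficient is drawn as insignificant when its p-value exceeds sig.level
ct$p.value > 0.05
```

The p.value entry fills one cell of the p-value matrix, and the two conf.int entries fill the matching cells of the lower and upper bound matrices.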

Visualize Confidence Interval

## plot 95% confidence intervals with the 'rect' method
corrplot(M, lowCI.mat = res1[[2]], uppCI.mat = res1[[3]], order = "hclust", 
    rect.col = "navy", plotCI = "rect", cl.pos = "n")

corrplot(M, p.mat = res1[[1]], lowCI.mat = res1[[2]], uppCI.mat = res1[[3]], 
    order = "hclust", pch.col = "red", sig.level = 0.01, addrect = 3, 
    rect.col = "navy", plotCI = "rect", cl.pos = "n")

Here is an animation that shows the relation between the significance level and the confidence intervals.

for (i in seq(0.1, 0, -0.005)) {
    tmp <- cor.mtest(mtcars, 1 - i)
    corrplot(M, p.mat = tmp[[1]], lowCI.mat = tmp[[2]], uppCI.mat = tmp[[3]], 
        order = "hclust", pch.col = "red", sig.level = i, plotCI = "rect", 
        cl.pos = "n", mar = c(0, 0, 1, 0), 
        title = substitute(alpha == x, list(x = format(i, digits = 3, nsmall = 3))))
}
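
The relation shown by the animation can also be checked numerically. The sketch below uses base R only, and ci_excludes_zero() is a hypothetical helper introduced for illustration. It tests whether the (1 - alpha) confidence interval excludes zero, which roughly corresponds to the coefficient being significant at level alpha. For Pearson correlations the p-value is t-based while the interval uses Fisher's z transform, so the two can disagree slightly near the boundary.

```r
# does the (1 - alpha) confidence interval for the correlation exclude zero?
ci_excludes_zero <- function(x, y, alpha) {
    ci <- cor.test(x, y, conf.level = 1 - alpha)$conf.int
    ci[1] > 0 || ci[2] < 0
}

alphas <- c(0.1, 0.05, 0.01)

# a strongly correlated pair and a weakly correlated pair from mtcars
strong <- sapply(alphas, function(a) ci_excludes_zero(mtcars$mpg, mtcars$wt, a))
weak <- sapply(alphas, function(a) ci_excludes_zero(mtcars$drat, mtcars$qsec, a))

data.frame(alpha = alphas, mpg_wt = strong, drat_qsec = weak)
```

As alpha shrinks, the intervals widen, so fewer coefficients keep an interval that stays clear of zero.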