The ggcorrplot package makes it easy to visualize a correlation matrix using ggplot2. It provides a solution for reordering the correlation matrix and can display significance levels on the correlogram. It also includes a function for computing a matrix of correlation p-values.
Installation and loading
ggcorrplot can be installed from CRAN as follows:
#install.packages("ggcorrplot")
# Loading
library(ggcorrplot)
## Loading required package: ggplot2
data(mtcars)
str(mtcars)
## 'data.frame': 32 obs. of 11 variables:
## $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
## $ cyl : num 6 6 4 6 8 6 8 4 4 6 ...
## $ disp: num 160 160 108 258 360 ...
## $ hp : num 110 110 93 110 175 105 245 62 95 123 ...
## $ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
## $ wt : num 2.62 2.88 2.32 3.21 3.44 ...
## $ qsec: num 16.5 17 18.6 19.4 17 ...
## $ vs : num 0 0 1 1 0 1 0 1 1 1 ...
## $ am : num 1 1 1 0 0 0 0 0 0 0 ...
## $ gear: num 4 4 4 3 3 3 3 4 4 4 ...
## $ carb: num 4 4 1 1 2 1 4 2 2 4 ...
The mtcars data set will be used in the following R code. The base R function cor() returns the correlation matrix; the cor_pmat() function in the ggcorrplot package computes a matrix of correlation p-values.
# Compute a correlation matrix
corr <- round(cor(mtcars), 1) # rounded to one decimal point
head(corr[, 1:6]) # show first six rows of only the first six columns
## mpg cyl disp hp drat wt
## mpg 1.0 -0.9 -0.8 -0.8 0.7 -0.9
## cyl -0.9 1.0 0.9 0.8 -0.7 0.8
## disp -0.8 0.9 1.0 0.8 -0.7 0.9
## hp -0.8 0.8 0.8 1.0 -0.4 0.7
## drat 0.7 -0.7 -0.7 -0.4 1.0 -0.7
## wt -0.9 0.8 0.9 0.7 -0.7 1.0
# Compute a matrix of correlation p-values
p.mat <- cor_pmat(mtcars)
head(p.mat[, 1:4])
## mpg cyl disp hp
## mpg 0.000000e+00 6.112687e-10 9.380327e-10 1.787835e-07
## cyl 6.112687e-10 0.000000e+00 1.802838e-12 3.477861e-09
## disp 9.380327e-10 1.802838e-12 0.000000e+00 7.142679e-08
## hp 1.787835e-07 3.477861e-09 7.142679e-08 0.000000e+00
## drat 1.776240e-05 8.244636e-06 5.282022e-06 9.988772e-03
## wt 1.293959e-10 1.217567e-07 1.222320e-11 4.145827e-05
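As a quick sanity check, any entry of the p-value matrix can be compared with base R's cor.test(). The sketch below assumes that cor_pmat() relies on the default two-sided Pearson test; the names res and p_mpg_cyl are illustrative only.

```r
# Pearson correlation test between mpg and cyl (base R, stats package)
res <- cor.test(mtcars$mpg, mtcars$cyl, method = "pearson")

# Extract the p-value of the test
p_mpg_cyl <- res$p.value

# Compare this value with the mpg/cyl entry of p.mat printed above
p_mpg_cyl
```

If the two values agree, the matrix is simply the collection of pairwise cor.test() p-values.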
# Visualize the correlation matrix
# --------------------------------
# method = "square" (default)
ggcorrplot(corr)
ggcorrplot(cor(mtcars))
# method = "circle"
ggcorrplot(corr, method = "circle")
#> Warning: `guides(<scale> = FALSE)` is deprecated. Please use `guides(<scale> =
#> "none")` instead.
ggcorrplot(cor(mtcars),
method = "circle",
type = "lower",
outline.color = "black",
lab_size = 6)
# Reordering the correlation matrix
# --------------------------------
# using hierarchical clustering
ggcorrplot(corr, hc.order = TRUE, outline.color = "white")
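To make the idea of reordering by hierarchical clustering more concrete, here is a minimal base R sketch. It assumes 1 - r as the distance and complete linkage; the exact distance and linkage that ggcorrplot uses internally may differ, so treat this as an illustration rather than the package's implementation.

```r
# Correlation matrix, rounded as in the examples above
corr <- round(cor(mtcars), 1)

# Turn correlations into distances: strongly positively correlated
# variables end up "close" to each other (assumed distance: 1 - r)
d <- as.dist(1 - corr)

# Hierarchical clustering on those distances (assumed linkage: complete)
hc <- hclust(d, method = "complete")

# Order of the variables along the dendrogram
ord <- hc$order

# Reorder rows and columns of the correlation matrix accordingly
corr_reordered <- corr[ord, ord]
colnames(corr_reordered)
```

Variables that cluster together become neighbors in the plot, which makes blocks of related variables easier to spot.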
# Types of correlogram layout
# --------------------------------
# Get the lower triangle
ggcorrplot(corr,
hc.order = TRUE,
type = "lower",
outline.color = "white")
# Get the upper triangle
ggcorrplot(corr,
hc.order = TRUE,
type = "upper",
outline.color = "white")
# Change colors and theme
# --------------------------------
# Argument colors
ggcorrplot(
corr,
hc.order = TRUE,
type = "lower",
outline.color = "white",
ggtheme = ggplot2::theme_gray,
colors = c("#6D9EC1", "white", "#E46726")
)
# Add correlation coefficients
# --------------------------------
# argument lab = TRUE
ggcorrplot(corr,
hc.order = TRUE,
type = "lower",
lab = TRUE)
# Add correlation significance level
# --------------------------------
# Argument p.mat
# Cross out the non-significant coefficients
ggcorrplot(corr,
hc.order = TRUE,
type = "lower",
p.mat = p.mat)
ggcorrplot(corr,
method = "circle", # "square" (default) or "circle"
type = "lower", # "full" (default), "lower" or "upper"
hc.order = TRUE, # if TRUE, reorder the matrix using hclust()
colors = c("red", "white", "green"), # low, mid and high colors
lab = TRUE # add correlation coefficients
)
# Leave the non-significant coefficients blank
ggcorrplot(
corr,
p.mat = p.mat,
hc.order = TRUE,
type = "lower",
insig = "blank"
)
#install.packages("corrplot")
library(corrplot)
## corrplot 0.92 loaded
# Load the dataset
data(mtcars)
# Plot the correlation matrix with corrplot()
corrplot(cor(mtcars)) # cor() computes the matrix, corrplot() draws it
Each cell represents the correlation coefficient (r), which measures the
degree of the linear relationship between two variables. The value of r
ranges from -1 to +1. A value of +1 means a perfect positive correlation:
if one variable increases, the other also increases, and if it decreases,
the other also decreases. A value of -1 means a perfect negative
correlation: one variable decreases as the other increases. A value of 0
means no linear relationship. The size of each circle is proportional to
the absolute value of the correlation.
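The three reference values can be reproduced with a small toy example. The vectors below are made up for illustration and are not part of the mtcars analysis.

```r
# Toy data: ten equally spaced values
x <- 1:10

# Perfectly increasing straight line: r is +1
r_pos <- cor(x, 2 * x + 3)

# Perfectly decreasing straight line: r is -1
r_neg <- cor(x, -2 * x + 3)

# Symmetric U-shape: y depends strongly on x, but not linearly,
# so the linear correlation is (numerically) zero
r_none <- cor(x, (x - 5.5)^2)

c(positive = r_pos, negative = r_neg, u_shape = r_none)
```

The last case is a reminder that r close to 0 rules out a linear relationship only, not every kind of dependence.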
You can also set the method to “square”, “circle”, or “number” to change how the correlation matrix is drawn. You can then set the type to show only the upper or the lower section of the matrix (the two are mirror images of each other) or the full matrix, depending on your preference.
corrplot(
cor(mtcars),
method = "square",
type = "upper",
tl.col = "black",
tl.cex = 2,
col = colorRampPalette(c("purple", "dark green"))(200)
)
You can also create a mixed-type matrix using the following code. Here
the upper section uses squares and the lower section uses numbers (the
correlation coefficients).
corrplot.mixed(cor(mtcars),
upper = "square",
lower = "number",
addgrid.col = "black",
tl.col = "black")
## Using the {ggstatsplot} package
An alternative to the correlogram presented above is possible with the ggcorrmat() function from the {ggstatsplot} package:
#install.packages("ggstatsplot")
# load package
library(ggstatsplot)
## You can cite this package as:
## Patil, I. (2021). Visualizations with statistical details: The 'ggstatsplot' approach.
## Journal of Open Source Software, 6(61), 3167, doi:10.21105/joss.03167
# correlogram
ggstatsplot::ggcorrmat(
data = mtcars,
type = "parametric", # parametric for Pearson, nonparametric for Spearman's correlation
colors = c("darkred", "white", "steelblue") # change default colors
)
dat <- mtcars[, c(1, 3:7)]
# correlogram
ggstatsplot::ggcorrmat(
data = dat,
type = "parametric", # parametric for Pearson, nonparametric for Spearman's correlation
colors = c("darkred", "white", "steelblue") # change default colors
)
dat <- mtcars[, c(1, 3:7)]
corrplot2 <- function(data,
method = "pearson",
sig.level = 0.05,
order = "original",
diag = FALSE,
type = "upper",
tl.srt = 90,
number.font = 1,
number.cex = 1,
mar = c(0, 0, 0, 0)) {
library(corrplot)
data_incomplete <- data
data <- data[complete.cases(data), ]
mat <- cor(data, method = method)
cor.mtest <- function(mat, method) {
mat <- as.matrix(mat)
n <- ncol(mat)
p.mat <- matrix(NA, n, n)
diag(p.mat) <- 0
for (i in 1:(n - 1)) {
for (j in (i + 1):n) {
tmp <- cor.test(mat[, i], mat[, j], method = method)
p.mat[i, j] <- p.mat[j, i] <- tmp$p.value
}
}
colnames(p.mat) <- rownames(p.mat) <- colnames(mat)
p.mat
}
p.mat <- cor.mtest(data, method = method)
col <- colorRampPalette(c("#BB4444", "#EE9988", "#FFFFFF", "#77AADD", "#4477AA"))
corrplot(mat,
method = "color", col = col(200), number.font = number.font,
mar = mar, number.cex = number.cex,
type = type, order = order,
addCoef.col = "black", # add correlation coefficient
tl.col = "black", tl.srt = tl.srt, # rotation of text labels
# combine with significance level
p.mat = p.mat, sig.level = sig.level, insig = "blank",
# hide correlation coefficients on the diagonal
diag = diag
)
}
corrplot2(
data = dat,
method = "pearson",
sig.level = 0.05,
order = "original",
diag = FALSE,
type = "upper",
tl.srt = 75
)
The corr_cross() function from the lares package ranks the pairwise correlations and plots them in decreasing order, which is useful for analyzing the most strongly correlated variables.
#install.packages("lares")
library(lares)
corr_cross(mtcars, rm.na = TRUE, max_pvalue = 0.05, top = 15, grid = TRUE)
## Returning only the top 15. You may override with the 'top' argument
## Warning in .font_global(font, quiet = FALSE): Font 'Arial Narrow' is not
## installed, has other name, or can't be found
Negative correlations are represented in red and positive
correlations in blue.
Use the corr_cross() function if you want to compute all correlations and return the highest and significant ones in a plot:
# devtools::install_github("laresbernardo/lares")
library(lares)
corr_cross(dat, # name of dataset
max_pvalue = 0.05, # display only significant correlations (at 5% level)
top = 10 # display top 10 couples of variables (by correlation coefficient)
)
## Returning only the top 10. You may override with the 'top' argument
Negative correlations are represented in red and positive
correlations in blue.
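The ranking idea can also be sketched in base R, without the lares package. This is not how corr_cross() is implemented; it is a simplified illustration that ranks every pair of variables by absolute Pearson correlation and skips the p-value filter. The names corr_full, idx and pairs_ranked are illustrative.

```r
# Full (unrounded) correlation matrix
corr_full <- cor(mtcars)

# Row/column positions of the upper triangle, so each pair appears once
idx <- which(upper.tri(corr_full), arr.ind = TRUE)

# One row per pair of variables, with its correlation coefficient
pairs_ranked <- data.frame(
  var1 = rownames(corr_full)[idx[, "row"]],
  var2 = colnames(corr_full)[idx[, "col"]],
  r = corr_full[idx],
  stringsAsFactors = FALSE
)

# Sort by decreasing absolute correlation and show the top 10 pairs
pairs_ranked <- pairs_ranked[order(-abs(pairs_ranked$r)), ]
head(pairs_ranked, 10)
```

With 11 variables there are 55 distinct pairs, so a ranked table like this is often easier to scan than the full matrix.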
Use the corr_var() function if you want to focus on the correlation of one variable against all others, and return the highest ones in a plot:
corr_var(dat, # name of dataset
mpg, # name of variable to focus on
top = 5 # display top 5 correlations
)
corr_var(mtcars, # name of dataset
mpg, # name of variable to focus on
top = 7 # display top 7 correlations
)
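For comparison, the same question (which variables are most correlated with mpg?) can be answered with a few lines of base R. This is only a rough equivalent under the assumption of plain Pearson correlations; corr_var() may preprocess the data differently. The names r_mpg and top5 are illustrative.

```r
# Correlation of every variable with mpg
r_mpg <- cor(mtcars)[, "mpg"]

# Drop the trivial correlation of mpg with itself
r_mpg <- r_mpg[names(r_mpg) != "mpg"]

# Keep the five variables with the largest absolute correlation
top5 <- r_mpg[order(-abs(r_mpg))][1:5]
top5
```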
corr <- round(cor(mtcars), 2) #Compute a correlation matrix
col <- colorRampPalette(c("blue", "white", "red"))(20)
heatmap(x = corr, col = col, symm = TRUE)
Now, let’s first compute a correlation matrix, and then we will see how to visualize it using ggplot2.
#install.packages("rstatix")
library(rstatix)
##
## Attaching package: 'rstatix'
## The following object is masked from 'package:ggcorrplot':
##
## cor_pmat
## The following object is masked from 'package:stats':
##
## filter
cor_test <- cor_mat(mtcars) #to create the correlation matrix
cor_test
## # A tibble: 11 × 12
## rowname mpg cyl disp hp drat wt qsec vs am gear carb
## * <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 mpg 1 -0.85 -0.85 -0.78 0.68 -0.87 0.42 0.66 0.6 0.48 -0.55
## 2 cyl -0.85 1 0.9 0.83 -0.7 0.78 -0.59 -0.81 -0.52 -0.49 0.53
## 3 disp -0.85 0.9 1 0.79 -0.71 0.89 -0.43 -0.71 -0.59 -0.56 0.39
## 4 hp -0.78 0.83 0.79 1 -0.45 0.66 -0.71 -0.72 -0.24 -0.13 0.75
## 5 drat 0.68 -0.7 -0.71 -0.45 1 -0.71 0.091 0.44 0.71 0.7 -0.091
## 6 wt -0.87 0.78 0.89 0.66 -0.71 1 -0.17 -0.55 -0.69 -0.58 0.43
## 7 qsec 0.42 -0.59 -0.43 -0.71 0.091 -0.17 1 0.74 -0.23 -0.21 -0.66
## 8 vs 0.66 -0.81 -0.71 -0.72 0.44 -0.55 0.74 1 0.17 0.21 -0.57
## 9 am 0.6 -0.52 -0.59 -0.24 0.71 -0.69 -0.23 0.17 1 0.79 0.058
## 10 gear 0.48 -0.49 -0.56 -0.13 0.7 -0.58 -0.21 0.21 0.79 1 0.27
## 11 carb -0.55 0.53 0.39 0.75 -0.091 0.43 -0.66 -0.57 0.058 0.27 1
cor_p <- cor_pmat(mtcars)
cor_p
## # A tibble: 11 × 12
## rowname mpg cyl disp hp drat wt qsec
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 mpg 0 6.11e-10 9.38e-10 0.000000179 0.0000178 1.29e- 10 1.71e-2
## 2 cyl 6.11e-10 0 1.8 e-12 0.00000000348 0.00000824 1.22e- 7 3.66e-4
## 3 disp 9.38e-10 1.8 e-12 0 0.0000000714 0.00000528 1.22e- 11 1.31e-2
## 4 hp 1.79e- 7 3.48e- 9 7.14e- 8 0 0.00999 4.15e- 5 5.77e-6
## 5 drat 1.78e- 5 8.24e- 6 5.28e- 6 0.00999 0 4.78e- 6 6.2 e-1
## 6 wt 1.29e-10 1.22e- 7 1.22e-11 0.0000415 0.00000478 2.27e-236 3.39e-1
## 7 qsec 1.71e- 2 3.66e- 4 1.31e- 2 0.00000577 0.62 3.39e- 1 0
## 8 vs 3.42e- 5 1.84e- 8 5.24e- 6 0.00000294 0.0117 9.8 e- 4 1.03e-6
## 9 am 2.85e- 4 2.15e- 3 3.66e- 4 0.18 0.00000473 1.13e- 5 2.06e-1
## 10 gear 5.4 e- 3 4.17e- 3 9.64e- 4 0.493 0.00000836 4.59e- 4 2.43e-1
## 11 carb 1.08e- 3 1.94e- 3 2.53e- 2 0.000000783 0.621 1.46e- 2 4.54e-5
## # … with 4 more variables: vs <dbl>, am <dbl>, gear <dbl>, carb <dbl>
Now that we have the matrix of r values, we can gather the data into long format, with the variable names (the keys) in one column and the r values in another, using the following code.
df <- cor_test %>% gather(-rowname, key = cor_var, value = r)
df
## # A tibble: 121 × 3
## rowname cor_var r
## <chr> <chr> <dbl>
## 1 mpg mpg 1
## 2 cyl mpg -0.85
## 3 disp mpg -0.85
## 4 hp mpg -0.78
## 5 drat mpg 0.68
## 6 wt mpg -0.87
## 7 qsec mpg 0.42
## 8 vs mpg 0.66
## 9 am mpg 0.6
## 10 gear mpg 0.48
## # … with 111 more rows
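If you prefer not to depend on gather(), a similar long table can be built with base R only. This is an alternative sketch, not part of the rstatix workflow above; the column names are chosen here to mirror the ones in df, and corr_mat and corr_long are illustrative names.

```r
# Correlation matrix rounded to two decimals
corr_mat <- round(cor(mtcars), 2)

# as.table() + as.data.frame() unrolls the matrix into one row per cell
corr_long <- as.data.frame(as.table(corr_mat), stringsAsFactors = FALSE)

# Rename the columns to match the long table used above
names(corr_long) <- c("rowname", "cor_var", "r")

nrow(corr_long)  # 11 variables x 11 variables = 121 rows
head(corr_long)
```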
df %>% ggplot(aes(rowname, cor_var, fill = r)) + geom_tile() +
labs(x = "variables", y = "variables")
Now let’s say we want to customize the plot. You can use the basic ggplot2 functions to do so. For example:
df %>% ggplot(aes(rowname, cor_var, fill = r)) + geom_tile() +
labs(x = "variables", y = "variables") +
scale_fill_gradient(low = "blue", high = "red")
Now let’s say you want to add the actual values to your plot. You can
use the following code:
df %>% ggplot(aes(rowname, cor_var, fill = r)) + geom_tile() +
labs(x = "variables", y = "variables") +
scale_fill_gradient(low = "blue", high = "red") +
geom_text(aes(label = r))
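Because correlations are centered on 0, a diverging color scale with a neutral midpoint can be easier to read than a two-color gradient. The sketch below is one possible variation, assuming ggplot2 is installed; it rebuilds the long table with base R so that it runs on its own, and the names corr_long and p are illustrative.

```r
library(ggplot2)

# Long-format table of correlations (one row per pair of variables)
corr_long <- as.data.frame(as.table(round(cor(mtcars), 2)))
names(corr_long) <- c("rowname", "cor_var", "r")

# Heatmap with a diverging scale: blue for -1, white for 0, red for +1
p <- ggplot(corr_long, aes(rowname, cor_var, fill = r)) +
  geom_tile() +
  geom_text(aes(label = r), size = 3) +
  scale_fill_gradient2(low = "blue", mid = "white", high = "red",
                       midpoint = 0, limits = c(-1, 1)) +
  labs(x = "variables", y = "variables")
p
```

Fixing the limits to c(-1, 1) keeps the colors comparable across different data sets.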
## Another informative package is PerformanceAnalytics, which gives you the p-values, distributions (histograms), and correlation coefficients.
#install.packages("PerformanceAnalytics")
library(PerformanceAnalytics)
## Loading required package: xts
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
##
## Attaching package: 'PerformanceAnalytics'
## The following object is masked from 'package:graphics':
##
## legend
chart.Correlation(mtcars, histogram = TRUE) # pass the raw data, not cor(mtcars)
## Warning in par(usr): argument 1 does not name a graphical parameter
The red stars in the figure indicate the significance level of each
correlation: * for p < 0.05, ** for p < 0.01, and *** for p < 0.001.
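The star convention can be written down as a small helper. The function p_to_stars() below is hypothetical (it is not part of PerformanceAnalytics) and simply encodes the usual thresholds of 0.05, 0.01 and 0.001.

```r
# Hypothetical helper: convert p-values into significance stars
p_to_stars <- function(p) {
  stars <- cut(
    p,
    breaks = c(-Inf, 0.001, 0.01, 0.05, Inf),
    labels = c("***", "**", "*", ""),
    right = FALSE  # intervals are [a, b), so p = 0.05 gets no star
  )
  as.character(stars)
}

# Example with four illustrative p-values
p_to_stars(c(0.0005, 0.005, 0.03, 0.2))
```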
my_data <- mtcars[, c(1,3,4,5,6,7)]
chart.Correlation(my_data, histogram=TRUE, pch=19)
## Warning in par(usr): argument 1 does not name a graphical parameter
# Quick display of two capabilities of GGally, to assess the distribution and correlation of variables
library(GGally)
## Registered S3 method overwritten by 'GGally':
## method from
## +.gg ggplot2
# Check correlations (as scatterplots), distribution and print correlation coefficient
ggpairs(mtcars, columns = 1:8, title="correlogram with ggpairs()")
# Nice visualization of correlations
ggpairs(mtcars, columns = 2:4, ggplot2::aes(colour=as.character(am)))
# From the help page:
data(mtcars)
ggpairs(
mtcars[, c(1, 3, 4, 2)],
upper = list(continuous = "density", combo = "box_no_facet"),
lower = list(continuous = "points", combo = "dot_no_facet")
)
The correlation coefficient is a quantity that measures the strength of the association (or dependence) between two or more variables.
Pearson r: a parametric correlation test, as it depends on the distribution (normal distribution) of the data. It measures the linear dependence between two variables and is the most commonly used method. The plot of y = f(x) is called the linear regression curve.
Kendall tau: a rank-based correlation coefficient (non-parametric method). Recommended if the data do not come from a bivariate normal distribution.
Spearman rho: a rank-based correlation coefficient (non-parametric method). Recommended if the data do not come from a bivariate normal distribution.
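To see what "rank-based" means in practice, note that Spearman's rho is the Pearson correlation computed on the ranks of the data. The short check below uses wt and mpg from mtcars; the names rho_builtin and rho_by_hand are illustrative.

```r
x <- mtcars$wt
y <- mtcars$mpg

# Spearman's rho as computed by cor()
rho_builtin <- cor(x, y, method = "spearman")

# Same quantity by hand: Pearson correlation of the ranks
# (rank() assigns average ranks to ties by default)
rho_by_hand <- cor(rank(x), rank(y), method = "pearson")

all.equal(rho_builtin, rho_by_hand)
```

Working on ranks is what makes the coefficient insensitive to outliers and to any monotonic transformation of the variables.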
Do the data from each of the 2 variables (x, y) follow a normal distribution?
Use the Shapiro-Wilk normality test (R function: shapiro.test()) and look at the normality plot (R function: ggpubr::ggqqplot()).
The Shapiro-Wilk test can be performed as follows:
Null hypothesis: the data are normally distributed
Alternative hypothesis: the data are not normally distributed
#install.packages("ggpubr")
library(ggpubr)
ggscatter(mtcars, x = "mpg", y = "wt",
add = "reg.line",
conf.int = TRUE,
cor.coef = TRUE,
cor.method = "pearson",
xlab = "Miles/(US) gallon",
ylab = "Weight (1000 lbs)")
## `geom_smooth()` using formula = 'y ~ x'
#Shapiro-Wilk normality test for mpg and wt
shapiro.test(mtcars$mpg)
##
## Shapiro-Wilk normality test
##
## data: mtcars$mpg
## W = 0.94756, p-value = 0.1229
shapiro.test(mtcars$wt)
##
## Shapiro-Wilk normality test
##
## data: mtcars$wt
## W = 0.94326, p-value = 0.09265
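The decision rule behind these two tests can be made explicit. The sketch below assumes a 5% significance threshold (alpha is an assumption, not something fixed by the test); the names p_values and normality_not_rejected are illustrative.

```r
# Assumed significance threshold
alpha <- 0.05

# Shapiro-Wilk p-values for both variables
p_values <- c(
  mpg = shapiro.test(mtcars$mpg)$p.value,
  wt  = shapiro.test(mtcars$wt)$p.value
)

# TRUE means the null hypothesis of normality is not rejected at level alpha
normality_not_rejected <- p_values > alpha
normality_not_rejected
```

Both p-values shown above (0.1229 and 0.09265) are larger than 0.05, so the data give no significant evidence against normality, and a Pearson test is a reasonable choice here.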
#Visual inspection of the data normality using Q-Q plots (quantile-quantile plots)
#Q-Q plot draws the correlation between a given sample and the normal distribution.
ggqqplot(mtcars$mpg, ylab = "MPG")
## Warning: The following aesthetics were dropped during statistical transformation: sample
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
## The following aesthetics were dropped during statistical transformation: sample
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
ggqqplot(mtcars$wt, ylab = "WT")
## Warning: The following aesthetics were dropped during statistical transformation: sample
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
## The following aesthetics were dropped during statistical transformation: sample
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
#Pearson correlation test
cor.test(mtcars$wt, mtcars$mpg, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: mtcars$wt and mtcars$mpg
## t = -9.559, df = 30, p-value = 1.294e-10
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.9338264 -0.7440872
## sample estimates:
## cor
## -0.8676594
If the data are not normally distributed, it’s recommended to use a non-parametric correlation, such as the Spearman or Kendall rank-based correlation tests.
#Spearman rank correlation coefficient
cor.test(mtcars$wt, mtcars$mpg, method = "spearman")
## Warning in cor.test.default(mtcars$wt, mtcars$mpg, method = "spearman"): Cannot
## compute exact p-value with ties
##
## Spearman's rank correlation rho
##
## data: mtcars$wt and mtcars$mpg
## S = 10292, p-value = 1.488e-11
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## -0.886422
#Kendall rank correlation test
res <- cor.test(mtcars$wt, mtcars$mpg, method="kendall")
## Warning in cor.test.default(mtcars$wt, mtcars$mpg, method = "kendall"): Cannot
## compute exact p-value with ties
res
##
## Kendall's rank correlation tau
##
## data: mtcars$wt and mtcars$mpg
## z = -5.7981, p-value = 6.706e-09
## alternative hypothesis: true tau is not equal to 0
## sample estimates:
## tau
## -0.7278321
#Extract the p.value and the correlation coefficient
res$p.value
## [1] 6.70577e-09
res$estimate
## tau
## -0.7278321
The value of the correlation coefficient can be negative or positive, and lies in the range [-1, 1]:
-1: perfect negative correlation
0: no association between the two variables (x and y)
1: perfect positive correlation