Program 14

Author

Stephen George

USN

1NT24IS227

Problem Statement

Develop a script in R to calculate and visualize a correlation matrix for a given dataset, with color-coded cells indicating the strength and direction of correlation, using ggplot2’s geom_tile function.

#load the required libraries
library(ggplot2)
library(tidyr)
library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union

Dataset

We use the built-in mtcars dataset, in which every column is numeric, so it can be passed to cor() directly.

#preview the dataset
head(mtcars)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
#use the built-in mtcars dataset
data(mtcars)
#compute correlation matrix
cor_matrix <- cor(mtcars)
#convert matrix to a data frame for plotting
cor_df <- as.data.frame(as.table(cor_matrix))
head(cor_df)
  Var1 Var2       Freq
1  mpg  mpg  1.0000000
2  cyl  mpg -0.8521620
3 disp  mpg -0.8475514
4   hp  mpg -0.7761684
5 drat  mpg  0.6811719
6   wt  mpg -0.8676594
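cor() only accepts numeric input, which mtcars happens to satisfy. For a dataset that mixes numeric and non-numeric columns, one possible approach is to keep only the numeric columns before computing the matrix. The sketch below uses the built-in iris dataset purely as an illustration (it is not part of the original exercise), and the names numeric_only and cor_iris are hypothetical.

```r
library(dplyr)

# iris has four numeric columns and one factor column (Species),
# so cor(iris) would fail; drop the non-numeric column first
numeric_only <- iris %>% select(where(is.numeric))

# correlation matrix of the remaining numeric columns
cor_iris <- cor(numeric_only)
dim(cor_iris)
```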

Explanation

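  • cor(mtcars) computes the pairwise Pearson correlation (the default method) between every pair of columns, giving an 11 x 11 matrix.
  • as.table() followed by as.data.frame() flattens the matrix into a long-format data frame with one row per pair of variables.
  • The result has 3 columns: Var1, Var2 and the correlation value, which is named Freq by default.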
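The same long format can also be produced with tidyr, which is loaded above but not otherwise used. This is a minimal sketch of an alternative, not the method used in the rest of this program; the names cor_long and Correlation are hypothetical choices that avoid the default Freq label.

```r
library(dplyr)
library(tidyr)

cor_matrix <- cor(mtcars)

# keep the row names as a regular column, then pivot every other
# column into (Var2, Correlation) pairs
cor_long <- as.data.frame(cor_matrix) %>%
  mutate(Var1 = rownames(cor_matrix)) %>%
  pivot_longer(cols = -Var1, names_to = "Var2", values_to = "Correlation")

head(cor_long)
```

Note that Var1 and Var2 are character columns here, so ggplot2 would order the axes alphabetically unless they are converted to factors with the desired level order.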

Visualize using ggplot

p<-ggplot(cor_df,aes(x=Var1,y=Var2,fill = Freq))
p

p<-p+  #draw tiles with white borders
  geom_tile(color="white")
p

p<-p+  #diverging color scale centered at 0
  scale_fill_gradient2(
    low="blue",mid="white",high="red",
    midpoint=0,limits = c(-1,1),
    name="Correlation"
  )
p

p<-p+  #show rounded correlation values inside the tiles
  geom_text(aes(label=round(Freq,2)),size=3)
p

p<-p+  #apply a minimal theme
  theme_minimal()
p

p<-p+
  labs(
    title = "Correlation Matrix (mtcars)",
    x = "",y = ""
  )
p

p<-p+
  theme(axis.text.x=element_text(angle = 45,hjust = 1))
p
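For reference, the incremental steps above can be combined into a single expression. The sketch below repeats the same layers and adds coord_fixed(), which is an optional extra not used above, so that the tiles are drawn as squares. The name p_all is hypothetical.

```r
library(ggplot2)

# rebuild the long-format correlation data so the block is self-contained
cor_df <- as.data.frame(as.table(cor(mtcars)))

p_all <- ggplot(cor_df, aes(x = Var1, y = Var2, fill = Freq)) +
  geom_tile(color = "white") +                        # tiles with white borders
  scale_fill_gradient2(
    low = "blue", mid = "white", high = "red",
    midpoint = 0, limits = c(-1, 1),
    name = "Correlation"
  ) +
  geom_text(aes(label = round(Freq, 2)), size = 3) +  # values inside the tiles
  theme_minimal() +
  labs(title = "Correlation Matrix (mtcars)", x = NULL, y = NULL) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  coord_fixed()                                       # square tiles (optional)

p_all
```

If the plot needs to be written to a file, ggsave() can be called with a file name and the plot object.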