PROGRAM 14
Correlation Matrix Visualization using ggplot2
Develop a script in R to calculate and visualize a correlation matrix for a given dataset, with color-coded cells indicating the strength and direction of correlations, using ggplot2’s geom_tile function.
# Load required libraries
library(ggplot2)
library(tidyr)
library(dplyr)
Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
Dataset
We use the built-in mtcars dataset.
# Preview the dataset
head(mtcars)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
Step 1: Compute the Correlation Matrix
# Use built-in mtcars dataset
data(mtcars)
# Compute correlation matrix
cor_matrix <- cor(mtcars)
# Convert matrix to a data frame for plotting
cor_df <- as.data.frame(as.table(cor_matrix))
head(cor_df)
  Var1 Var2       Freq
1  mpg  mpg  1.0000000
2  cyl  mpg -0.8521620
3 disp  mpg -0.8475514
4   hp  mpg -0.7761684
5 drat  mpg  0.6811719
6   wt  mpg -0.8676594
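Explanation:
cor(mtcars) computes the pairwise Pearson correlation (the default method) between every pair of columns.
as.table() followed by as.data.frame() flattens the matrix into a long-format table with one row per pair of variables.
The result has 3 columns: Var1, Var2, and the correlation value (Freq).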
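To make the reshaping step concrete, the sketch below applies the same as.table() and as.data.frame() calls to a tiny matrix. The matrix `toy` and its values are hypothetical and exist only for illustration; they are not part of the original script.

```r
# Toy 2 x 2 correlation matrix (hypothetical values, for illustration only)
toy <- matrix(c(1.0, 0.5,
                0.5, 1.0),
              nrow = 2,
              dimnames = list(c("a", "b"), c("a", "b")))

# as.table() marks the matrix as a table object;
# as.data.frame() then emits one row per (row name, column name) pair
toy_long <- as.data.frame(as.table(toy))
print(toy_long)

# The long table has one row per matrix cell and the three columns
# described above
stopifnot(nrow(toy_long) == length(toy))
stopifnot(identical(names(toy_long), c("Var1", "Var2", "Freq")))
```

Var1 holds the row name and Var2 the column name of each cell, which is why they can be mapped directly to the x and y aesthetics in the plotting step.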
Step 2: Visualize Using ggplot2::geom_tile
ggplot(cor_df, aes(x = Var1, y = Var2, fill = Freq)) +
  geom_tile(color = "white") + # Draw tile borders
  scale_fill_gradient2(
    low = "blue", mid = "white", high = "red",
    midpoint = 0, limits = c(-1, 1),
    name = "Correlation"
  ) +
  geom_text(aes(label = round(Freq, 2)), size = 3) + # Show values
  theme_minimal() +
  labs(
    title = "Correlation Matrix (mtcars)",
    x = "", y = ""
  ) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
| Step | Description |
|---|---|
| cor() | Computes correlation values between numeric variables. |
| as.table() + as.data.frame() | Converts the matrix into a long format suitable for plotting. |
| ggplot() | Initializes the plot using the long-form data. |
| geom_tile() | Creates color-coded tiles based on correlation values. |
| scale_fill_gradient2() | Applies a diverging color scale: red (strong positive), blue (strong negative), white (near zero). |
| geom_text() | Adds correlation values as text in each cell. |
| theme_minimal() | Applies a minimal theme with no background annotations. |
| axis.text.x rotation | Tilts x-axis labels by 45 degrees for better readability. |
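Since the task asks for a script that works on "a given dataset", the steps above can be wrapped in a small function. This is a minimal sketch, not part of the original program: the name `plot_cor_matrix`, its `title` argument, and the choice to silently drop non-numeric columns are all assumptions made for illustration.

```r
library(ggplot2)

# Hypothetical helper (not from the original script): draws a correlation
# heatmap for the numeric columns of any data frame
plot_cor_matrix <- function(df, title = "Correlation Matrix") {
  # cor() needs numeric input, so keep only the numeric columns
  numeric_df <- df[, vapply(df, is.numeric, logical(1)), drop = FALSE]
  stopifnot(ncol(numeric_df) >= 2)

  # Same reshaping as in Step 1: matrix -> long data frame
  cor_df <- as.data.frame(as.table(cor(numeric_df)))

  # Same layers as in Step 2
  ggplot(cor_df, aes(x = Var1, y = Var2, fill = Freq)) +
    geom_tile(color = "white") +
    scale_fill_gradient2(
      low = "blue", mid = "white", high = "red",
      midpoint = 0, limits = c(-1, 1),
      name = "Correlation"
    ) +
    geom_text(aes(label = round(Freq, 2)), size = 3) +
    theme_minimal() +
    labs(title = title, x = "", y = "") +
    theme(axis.text.x = element_text(angle = 45, hjust = 1))
}

# Usage: iris has one factor column (Species), which the helper drops
p <- plot_cor_matrix(iris, title = "Correlation Matrix (iris)")
print(p)
```

Calling `plot_cor_matrix(mtcars, title = "Correlation Matrix (mtcars)")` should reproduce the plot from Step 2, because every column of mtcars is numeric.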