# Load libraries
library(ggplot2)
library(GGally)
# Create plot
ggparcoord(
  iris,
  columns = 1:4,
  groupColumn = "Species",
  scale = "uniminmax"
) +
  theme_minimal() +
  labs(
    title = "Parallel Coordinates Plot of Iris Measurements",
    x = "Variables",
    y = "Scaled Values"
  ) +
  scale_color_brewer(palette = "Set1")
A parallel coordinates plot is useful for comparing multiple numeric variables at the same time. Because each measurement is rescaled to the 0 to 1 range (scale = "uniminmax"), the four variables can share one vertical axis. The plot shows patterns and differences among iris species across flower measurements, and the colored lines make it easy to see how the species cluster.
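The rescaling behind the plot can be reproduced by hand. The sketch below is only an illustration of min-max scaling in base R; `rescale01` and `iris_scaled` are hypothetical names introduced here, not part of GGally.

```r
# Min-max rescaling written by hand, to illustrate what scale = "uniminmax" does.
# rescale01 is a hypothetical helper name, not a GGally function.
rescale01 <- function(x) (x - min(x)) / (max(x) - min(x))

# Rescale the four numeric columns and keep the species label
iris_scaled <- as.data.frame(lapply(iris[, 1:4], rescale01))
iris_scaled$Species <- iris$Species

# Each measurement now runs from its minimum (0) to its maximum (1)
sapply(iris_scaled[, 1:4], range)

# Per-species means on the common scale, a rough numeric view of the line bundles
aggregate(. ~ Species, data = iris_scaled, FUN = mean)
```

Comparing the per-species means gives a quick numeric check on the separation seen in the colored lines.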
# Load libraries
library(ggplot2)
library(maps)
library(dplyr)
# Map data
states_map <- map_data("state")
# Crime data
crime_data <- USArrests %>%
  as.data.frame()
# Add state names
crime_data$region <- tolower(rownames(crime_data))
# Join datasets
map_data_combined <- left_join(states_map, crime_data, by = "region")
# Create choropleth map
ggplot(map_data_combined,
       aes(long, lat, group = group, fill = Murder)) +
  geom_polygon(color = "white") +
  coord_fixed(1.3) +
  theme_void() +
  labs(
    title = "Murder Rates Across the United States",
    # USArrests records murder arrests per 100,000 residents
    fill = "Murder arrests\nper 100,000"
  ) +
  scale_fill_gradient(low = "lightblue", high = "darkred")
A choropleth map is appropriate because it shows geographic differences clearly across states. The color gradient helps identify which states have higher or lower murder rates. Using state boundaries makes the information easier to interpret.
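Because the map and the crime data are joined on lowercase state names, it can help to check which names fail to match. The sketch below is a minimal check using `setdiff()`; the variable names are introduced here for illustration. On a standard installation it should show Alaska and Hawaii missing from the map and the District of Columbia missing from the crime data, but that is an expectation to verify, not a guarantee.

```r
library(ggplot2)  # provides map_data()
library(maps)     # provides the state polygons used by map_data("state")

# Region names on each side of the join
map_regions   <- unique(map_data("state")$region)
crime_regions <- tolower(rownames(USArrests))

# States with crime data but no polygon (they will not be drawn)
missing_from_map <- setdiff(crime_regions, map_regions)

# Map regions with no crime data (their fill will be NA)
missing_from_data <- setdiff(map_regions, crime_regions)

missing_from_map
missing_from_data
```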
# Load libraries
library(ggplot2)
library(ggalluvial)
# Convert Titanic data
titanic_data <- as.data.frame(Titanic)
# Create plot
ggplot(
  titanic_data,
  aes(axis1 = Class, axis2 = Sex, axis3 = Age, axis4 = Survived,
      y = Freq)
) +
  geom_alluvium(aes(fill = Survived)) +
  geom_stratum() +
  geom_text(stat = "stratum", aes(label = after_stat(stratum))) +
  scale_x_discrete(limits = c("Class", "Sex", "Age", "Survived")) +
  labs(
    title = "Titanic Passenger Survival Flow",
    x = "Category",
    y = "Count"
  ) +
  theme_minimal()
A flow diagram is useful for showing relationships between categories. This graph helps visualize how passenger class, sex, age, and survival status are connected. The flowing bands make it easy to follow groups through multiple variables.
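The band widths in the diagram correspond to passenger counts, so the same relationships can be read from a cross-tabulation. The following is a small base R sketch; `counts` and `survival_share` are names chosen here for illustration.

```r
titanic_data <- as.data.frame(Titanic)

# Total passengers for each combination of class and survival outcome
counts <- xtabs(Freq ~ Class + Survived, data = titanic_data)
counts

# Share of each class that did or did not survive (rows sum to 1)
survival_share <- prop.table(counts, margin = 1)
round(survival_share, 2)
```

The row proportions make it easier to compare classes of different sizes than the raw band widths do.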
# Load libraries
library(ggplot2)
library(ggdist)
# Create plot
ggplot(ToothGrowth, aes(x = supp, y = len, fill = supp)) +
  stat_halfeye(
    adjust = 0.5,
    justification = -0.2,
    point_colour = NA
  ) +
  geom_boxplot(
    width = 0.12,
    outlier.color = NA,
    alpha = 0.5
  ) +
  geom_jitter(
    width = 0.05,
    height = 0,  # jitter horizontally only so tooth lengths are not altered
    alpha = 0.5
  ) +
  theme_minimal() +
  labs(
    title = "Raincloud Plot of Tooth Length by Supplement",
    x = "Supplement Type",
    y = "Tooth Length"
  )
A raincloud plot is appropriate because it combines density, boxplots, and raw data points in one visualization. This makes it easy to see the overall distribution and individual observations at the same time. The plot clearly compares tooth growth between supplement groups.
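A numeric summary can back up what the raincloud plot shows. This is a minimal base R sketch; `len_summary` and `group_means` are illustrative names.

```r
# Five-number summary (plus mean) of tooth length for each supplement
len_summary <- tapply(ToothGrowth$len, ToothGrowth$supp, summary)
len_summary

# Group means, for a one-number comparison between OJ and VC
group_means <- tapply(ToothGrowth$len, ToothGrowth$supp, mean)
round(group_means, 2)
```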
library(ggplot2)
ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 3) +
  theme_minimal() +
  labs(
    title = "Car Weight vs Miles Per Gallon",
    x = "Weight (1000 lbs)",
    y = "Miles Per Gallon",
    color = "Cylinders"
  )
This scatterplot shows that heavier cars generally have lower fuel efficiency. Cars with fewer cylinders tend to have better gas mileage. The color grouping helps compare cylinder categories clearly.
A scatterplot is appropriate because it shows the relationship between two numeric variables. It helps identify trends, clusters, and possible correlations between weight and fuel efficiency.
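The trend visible in the scatterplot can be quantified with a correlation and a simple linear fit. The sketch below assumes a straight-line relationship is an adequate first approximation; `wt_mpg_cor` and `fit` are illustrative names.

```r
# Pearson correlation between weight and fuel efficiency
wt_mpg_cor <- cor(mtcars$wt, mtcars$mpg)
round(wt_mpg_cor, 2)

# Simple linear fit: the wt coefficient is the estimated change in mpg
# for each additional 1000 lbs of weight
fit <- lm(mpg ~ wt, data = mtcars)
coef(fit)
```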
ggplot(mtcars, aes(x = factor(cyl), y = hp, fill = factor(cyl))) +
  geom_boxplot() +
  theme_minimal() +
  labs(
    title = "Horsepower by Cylinder Count",
    x = "Cylinders",
    y = "Horsepower",
    fill = "Cylinders"
  )
Cars with more cylinders generally have higher horsepower. The boxplot also shows the spread and variability of horsepower within each cylinder group.
A boxplot is useful for comparing distributions across groups. It clearly shows medians, interquartile ranges, and outliers.
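The quantities a boxplot draws can also be computed directly. This sketch uses base R's `boxplot.stats()`, whose hinge calculation may differ slightly from the quartiles `geom_boxplot()` uses, so treat the numbers as an approximation of the plot. The names `hp_stats`, `hp_medians`, and `hp_outliers` are introduced here for illustration.

```r
# Boxplot statistics for horsepower within each cylinder group
hp_stats <- lapply(split(mtcars$hp, mtcars$cyl), boxplot.stats)

# $stats holds: lower whisker, lower hinge, median, upper hinge, upper whisker
hp_medians <- sapply(hp_stats, function(s) s$stats[3])
hp_medians

# Values beyond the whiskers, which a boxplot draws as individual points
hp_outliers <- lapply(hp_stats, function(s) s$out)
hp_outliers
```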
ggplot(mtcars, aes(x = mpg, fill = factor(cyl))) +
  geom_histogram(bins = 10, alpha = 0.7) +
  theme_minimal() +
  labs(
    title = "Distribution of Miles Per Gallon",
    x = "Miles Per Gallon",
    y = "Count",
    fill = "Cylinders"
  )
Most cars in the dataset fall between roughly 15 and 25 MPG, while only a few exceed 30 MPG. Filling the bars by cylinder count shows how fuel economy differs among vehicles: four-cylinder cars sit at the high end and eight-cylinder cars at the low end.
A histogram is appropriate because it displays the frequency and distribution of a numeric variable. It helps identify patterns and spread in fuel efficiency.
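The binning behind a histogram can be sketched in base R. The example below cuts MPG into 10 equal-width bins over its range; ggplot2 places its bin edges differently, so the counts may not match the plotted bars exactly. `breaks`, `mpg_bin`, and `bin_counts` are illustrative names.

```r
# Ten equal-width bins spanning the observed MPG range
breaks <- seq(min(mtcars$mpg), max(mtcars$mpg), length.out = 11)

# Assign each car to a bin; include.lowest keeps the minimum value in the first bin
mpg_bin <- cut(mtcars$mpg, breaks = breaks, include.lowest = TRUE)

# Cars per bin, split by cylinder count (the stacked segments of the histogram)
bin_counts <- table(mpg_bin, cyl = mtcars$cyl)
bin_counts
```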
ggplot(mtcars, aes(x = factor(am), y = mpg, fill = factor(am))) +
  geom_violin() +
  theme_minimal() +
  labs(
    title = "MPG by Transmission Type",
    x = "Transmission (0 = Automatic, 1 = Manual)",
    y = "Miles Per Gallon",
    fill = "Transmission"
  )
Manual cars tend to have noticeably higher MPG values than automatic cars. The width of each violin shows where observations are concentrated within the group.
A violin plot displays the full shape of each group's distribution, making it useful for comparing groups while also showing spread.
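A violin outline is essentially a mirrored kernel density estimate, which can be approximated in base R. This sketch uses `density()` with its default settings, which may not match the smoothing `geom_violin()` applies exactly; `dens`, `peak_mpg`, and `group_medians` are illustrative names.

```r
# Kernel density estimate of mpg for each transmission type (0 = automatic, 1 = manual)
dens <- lapply(split(mtcars$mpg, mtcars$am), density)

# mpg value where each estimated density peaks (the widest part of each violin)
peak_mpg <- sapply(dens, function(d) d$x[which.max(d$y)])
peak_mpg

# Group medians, for a direct comparison of typical fuel efficiency
group_medians <- tapply(mtcars$mpg, mtcars$am, median)
group_medians
```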
library(reshape2)
# Correlation matrix
cor_matrix <- cor(mtcars)
# Convert to dataframe
cor_data <- melt(cor_matrix)
# Create heatmap
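ggplot(cor_data, aes(x = Var1, y = Var2, fill = value)) +
  geom_tile() +
  # Diverging scale centered at 0 so positive and negative correlations differ in hue
  scale_fill_gradient2(
    low = "steelblue", mid = "white", high = "darkred",
    midpoint = 0, limits = c(-1, 1)
  ) +
  theme_minimal() +
  labs(
    title = "Correlation Heatmap of Car Variables",
    x = "",
    y = "",
    fill = "Correlation"
  )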
The heatmap shows strong positive and negative relationships between car variables. For example, weight and horsepower are positively related (r ≈ 0.66), while MPG and weight are strongly negatively related (r ≈ -0.87).
A heatmap is effective for comparing many variable relationships at once. The color intensity makes correlations easier to identify quickly.
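Specific cells of the heatmap can be read off the correlation matrix directly. The sketch below ranks every variable by its correlation with MPG; `mpg_cors` is an illustrative name.

```r
cor_matrix <- cor(mtcars)

# Correlation of each variable with mpg, sorted from most negative to most positive
mpg_cors <- sort(cor_matrix["mpg", colnames(cor_matrix) != "mpg"])
round(mpg_cors, 2)

# Individual pairs mentioned in the text
round(cor_matrix["wt", "hp"], 2)
round(cor_matrix["mpg", "wt"], 2)
```

Sorting one row this way is a quick complement to the heatmap when a single variable is the focus.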