Parallel Coordinates

# Load libraries
library(ggplot2)
library(GGally)

# Create plot
ggparcoord(
  iris,
  columns = 1:4,
  groupColumn = "Species",
  scale = "uniminmax"
) +
  theme_minimal() +
  labs(
    title = "Parallel Coordinates Plot of Iris Measurements",
    x = "Variables",
    y = "Scaled Values"
  ) +
  scale_color_brewer(palette = "Set1")

Explanation

A parallel coordinates plot is useful for comparing multiple numeric variables at the same time. This plot helps show patterns and differences among iris species across the four flower measurements. Coloring the lines by species makes it easy to see how observations from each species cluster together.
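The plot above passes scale = "uniminmax", which rescales each variable on its own so that its minimum becomes 0 and its maximum becomes 1. The following is a minimal base R sketch of that rescaling, not GGally's internal code; the helper name rescale_minmax is hypothetical and used only for illustration.

```r
# Hypothetical helper: rescale one numeric vector to the [0, 1] range
rescale_minmax <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# Apply the rescaling to the four iris measurement columns
iris_scaled <- as.data.frame(lapply(iris[, 1:4], rescale_minmax))

# Each column should now run from 0 to 1
sapply(iris_scaled, range)
```

Because every axis shares the same 0 to 1 range, a variable with large raw values (such as sepal length) does not visually dominate one with small raw values (such as petal width).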

Map

# Load libraries
library(ggplot2)
library(maps)
library(dplyr)
# Map data
states_map <- map_data("state")

# Crime data
crime_data <- USArrests %>%
  as.data.frame()

# Add state names
crime_data$region <- tolower(rownames(crime_data))

# Join datasets
map_data_combined <- left_join(states_map, crime_data, by = "region")

# Create choropleth map
ggplot(map_data_combined,
       aes(long, lat, group = group, fill = Murder)) +
  geom_polygon(color = "white") +
  coord_fixed(1.3) +
  theme_void() +
  labs(
    title = "Murder Rates Across the United States",
    fill = "Murder Rate"
  ) +
  scale_fill_gradient(low = "lightblue", high = "darkred")

Explanation

A choropleth map is appropriate because it shows geographic differences clearly across states. The color gradient helps identify which states have higher or lower murder rates. Using state boundaries makes the information easier to interpret.
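Since the map depends on a join by state name, it can help to check which regions fail to match before plotting. The sketch below is one possible check, assuming the maps package is installed; the variable names are illustrative. Regions present in the map but absent from the crime data would be drawn with a missing (NA) fill, and states absent from the map would not be drawn at all.

```r
# Join key as built in the map code: lower-case state names
crime_regions <- tolower(rownames(USArrests))

# The key should be unique, otherwise the join would duplicate polygon rows
stopifnot(!anyDuplicated(crime_regions))

# Compare keys only if the maps package is available (assumption: it is installed)
missing_from_crime <- NULL
missing_from_map <- NULL
if (requireNamespace("maps", quietly = TRUE)) {
  # Polygon names can look like "state:subregion"; keep only the state part
  map_names <- maps::map("state", plot = FALSE, namesonly = TRUE)
  map_regions <- unique(sub(":.*$", "", map_names))
  missing_from_crime <- setdiff(map_regions, crime_regions)  # drawn with NA fill
  missing_from_map <- setdiff(crime_regions, map_regions)    # not drawn at all
}

missing_from_crime
missing_from_map
```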

Flow chart

# Load libraries
library(ggplot2)
library(ggalluvial)

# Convert Titanic data
titanic_data <- as.data.frame(Titanic)

# Create plot
ggplot(
  titanic_data,
  aes(axis1 = Class, axis2 = Sex, axis3 = Age, axis4 = Survived,
      y = Freq)
) +
  geom_alluvium(aes(fill = Survived)) +
  geom_stratum() +
  geom_text(stat = "stratum", aes(label = after_stat(stratum))) +
  scale_x_discrete(limits = c("Class", "Sex", "Age", "Survived")) +
  labs(
    title = "Titanic Passenger Survival Flow",
    x = "Category",
    y = "Count"
  ) +
  theme_minimal()

Explanation

A flow diagram is useful for showing relationships between categories. This graph helps visualize how passenger class, sex, age, and survival status are connected. The flowing bands make it easy to follow groups through multiple variables, and the width of each band reflects the number of passengers in that group.
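The counts behind the bands can also be read directly as a table. This is a small base R sketch that cross-tabulates class against survival; it is a companion to the plot, not part of the ggalluvial workflow, and the object name class_by_survival is illustrative.

```r
# Same conversion used for the plot
titanic_data <- as.data.frame(Titanic)

# Passenger counts for every Class x Survived combination
class_by_survival <- xtabs(Freq ~ Class + Survived, data = titanic_data)
class_by_survival

# Share of each class that did or did not survive (rows sum to 1)
survival_share <- prop.table(class_by_survival, margin = 1)
round(survival_share, 2)
```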

Raincloud Plot

# Load libraries
library(ggplot2)
library(ggdist)

# Create plot
ggplot(ToothGrowth, aes(x = supp, y = len, fill = supp)) +
  stat_halfeye(
    adjust = 0.5,
    justification = -0.2,
    point_colour = NA
  ) +
  geom_boxplot(
    width = 0.12,
    outlier.color = NA,
    alpha = 0.5
  ) +
  geom_jitter(
    width = 0.05,
    alpha = 0.5
  ) +
  theme_minimal() +
  labs(
    title = "Raincloud Plot of Tooth Length by Supplement",
    x = "Supplement Type",
    y = "Tooth Length"
  )

Explanation

A raincloud plot is appropriate because it combines a density curve, a boxplot, and the raw data points in one visualization. This makes it easy to see the overall distribution and the individual observations at the same time. The plot clearly compares tooth length between the two supplement groups.
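The boxplot layer of the raincloud summarizes each group by its quartiles. As a rough numeric companion, the base R sketch below computes those quantiles per supplement group; the object name tooth_quantiles is illustrative.

```r
# Split tooth length by supplement type (OJ and VC)
len_by_supp <- split(ToothGrowth$len, ToothGrowth$supp)

# Minimum, quartiles, and maximum for each group, one row per supplement
tooth_quantiles <- t(sapply(len_by_supp, quantile))
tooth_quantiles
```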

STUDENT PORTFOLIO

Central Question: How do car characteristics differ across vehicle types and performance measures?

Plot 1

library(ggplot2)

ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 3) +
  theme_minimal() +
  labs(
    title = "Car Weight vs Miles Per Gallon",
    x = "Weight (1,000 lbs)",
    y = "Miles Per Gallon",
    color = "Cylinders"
  )

Interpretation

This scatterplot shows that heavier cars generally have lower fuel efficiency. Cars with fewer cylinders tend to have better gas mileage. The color grouping helps compare cylinder categories clearly.

Why this plot works

A scatterplot is appropriate because it shows the relationship between two numeric variables. It helps identify trends, clusters, and possible correlations between weight and fuel efficiency.
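The trend seen in the scatterplot can be backed up with two quick numbers. This is a minimal sketch using base R only; the object names are illustrative.

```r
# Pearson correlation between weight and fuel efficiency
weight_mpg_cor <- cor(mtcars$wt, mtcars$mpg)
weight_mpg_cor

# Average miles per gallon within each cylinder group
mean_mpg_by_cyl <- tapply(mtcars$mpg, mtcars$cyl, mean)
round(mean_mpg_by_cyl, 1)
```

A negative correlation matches the downward slope in the plot, and the group means show how mileage changes as the cylinder count rises.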

Plot 2

ggplot(mtcars, aes(x = factor(cyl), y = hp, fill = factor(cyl))) +
  geom_boxplot() +
  theme_minimal() +
  labs(
    title = "Horsepower by Cylinder Count",
    x = "Cylinders",
    y = "Horsepower",
    fill = "Cylinders"
  )

Interpretation

Cars with more cylinders generally have higher horsepower. The boxplot also shows the spread and variability of horsepower within each cylinder group.

Why this plot works

A boxplot is useful for comparing distributions across groups. It clearly shows medians, ranges, and outliers.
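The values a boxplot draws can be inspected directly. The sketch below uses boxplot.stats() from base R's grDevices package; note as an assumption that ggplot2 computes its hinges from quantiles, so the numbers may differ slightly from what geom_boxplot() draws.

```r
# Horsepower values for each cylinder group
hp_by_cyl <- split(mtcars$hp, mtcars$cyl)
hp_stats <- lapply(hp_by_cyl, boxplot.stats)

# Rows: lower whisker, lower hinge, median, upper hinge, upper whisker
hp_five_number <- sapply(hp_stats, function(s) s$stats)
hp_five_number

# Points beyond the whiskers, which a boxplot would mark as outliers
hp_outliers <- lapply(hp_stats, function(s) s$out)
hp_outliers
```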

Plot 3

ggplot(mtcars, aes(x = mpg, fill = factor(cyl))) +
  geom_histogram(bins = 10, alpha = 0.7) +
  theme_minimal() +
  labs(
    title = "Distribution of Miles Per Gallon",
    x = "Miles Per Gallon",
    fill = "Cylinders"
  )

Interpretation

Most cars in the dataset fall in the middle MPG range, while fewer cars have extremely high fuel efficiency. Cylinder groups help show how fuel economy differs among vehicles.

Why this plot works

A histogram is appropriate because it displays the frequency and distribution of a numeric variable. It helps identify patterns and spread in fuel efficiency.
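A histogram works by cutting the variable into intervals and counting the observations in each one. The base R sketch below mimics that idea with 10 equal-width bins; it is an approximation, since ggplot2 positions its bin edges differently, so the counts may not match the plot exactly.

```r
# Ten equal-width bins spanning the observed MPG range (11 edges)
bin_edges <- seq(min(mtcars$mpg), max(mtcars$mpg), length.out = 11)
mpg_bins <- cut(mtcars$mpg, breaks = bin_edges, include.lowest = TRUE)

# Number of cars in each bin
bin_counts <- table(mpg_bins)
bin_counts

# The same counts broken down by cylinder group, like the stacked fill
table(mpg_bins, cyl = mtcars$cyl)
```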

Plot 4

ggplot(mtcars, aes(x = factor(am), y = mpg, fill = factor(am))) +
  geom_violin() +
  theme_minimal() +
  labs(
    title = "MPG by Transmission Type",
    x = "Transmission (0 = Automatic, 1 = Manual)",
    y = "Miles Per Gallon",
    fill = "Transmission"
  )

Interpretation

Manual cars tend to have higher MPG values than automatic cars (a mean of about 24.4 versus 17.1 in this dataset). The violin shape also shows the density of observations within each group.

Why this plot works

A violin plot combines distribution and density information, making it useful for comparing groups while also showing spread.
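The outline of a violin comes from a kernel density estimate computed separately for each group. The sketch below reproduces that step with base R's density(); it is an approximation, since geom_violin() also trims and rescales the curves by default. The object names are illustrative.

```r
# MPG values split by transmission (0 = automatic, 1 = manual)
mpg_by_am <- split(mtcars$mpg, mtcars$am)

# One kernel density estimate per group
mpg_density <- lapply(mpg_by_am, density)

# Group means, to compare the centers of the two distributions
mean_mpg_by_am <- sapply(mpg_by_am, mean)
round(mean_mpg_by_am, 1)

# MPG value where each group's estimated density is highest
sapply(mpg_density, function(d) d$x[which.max(d$y)])
```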

Plot 5

library(reshape2)

# Correlation matrix
cor_matrix <- cor(mtcars)

# Convert to dataframe
cor_data <- melt(cor_matrix)

# Create heatmap
ggplot(cor_data, aes(x = Var1, y = Var2, fill = value)) +
  geom_tile() +
  theme_minimal() +
  labs(
    title = "Correlation Heatmap of Car Variables",
    x = "",
    y = "",
    fill = "Correlation"
  )

Interpretation

The heatmap shows strong positive and negative relationships between car variables. For example, weight and horsepower are positively related, while MPG and weight are negatively related.

Why this plot works

A heatmap is effective for comparing many variable relationships at once. The color intensity makes correlations easier to identify quickly.
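The heatmap needs the correlation matrix in long form, with one row per pair of variables. Plot 5 uses melt() from reshape2 for this; as an alternative sketch that avoids the extra package, base R can produce the same three columns. The object name cor_data_base is illustrative.

```r
# Correlation matrix of all mtcars variables (11 x 11)
cor_matrix <- cor(mtcars)

# Long format: one row per variable pair, with columns Var1, Var2, value
cor_data_base <- as.data.frame(as.table(cor_matrix), responseName = "value")
head(cor_data_base)

# Look up a single pair, for example MPG and weight
mpg_wt_cor <- cor_data_base$value[cor_data_base$Var1 == "mpg" &
                                  cor_data_base$Var2 == "wt"]
mpg_wt_cor
```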