Program 9

Author

1NT23IS080 - Section B - Harsh Deep B Nair

Create multiple histograms using ggplot2::facet_wrap() to visualize how a variable (e.g., Sepal.Length) is distributed across different groups (e.g., Species) in a built-in R dataset.

Step 1: Load the necessary libraries

# Load the ggplot2 package
library(ggplot2)

Step 2: Load and explore the dataset

# Load the iris dataset
data(iris)

# View the first few rows of the dataset
head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
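
Before plotting, a quick group-level check can confirm that the grouping variable is balanced and give a feel for the range of values to be binned. The following is a minimal sketch using base R only; the object names (species_counts, mean_by_species, sepal_range) are illustrative and not part of the original program.

```r
# Sketch: group-level checks on the built-in iris data (base R only)
data(iris)

# Number of observations per species
species_counts <- table(iris$Species)
print(species_counts)

# Mean sepal length per species
mean_by_species <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)
print(mean_by_species)

# Overall range of Sepal.Length, useful when choosing a bin width
sepal_range <- range(iris$Sepal.Length)
print(sepal_range)
```

Each species contributes 50 rows, so differences in bar heights between panels reflect the shape of each distribution rather than unequal group sizes.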

Step 3: Create grouped histograms using facet_wrap

# Create histograms using facet_wrap for grouped data
ggplot(iris, aes(x = Sepal.Length)) +
  geom_histogram(binwidth = 0.3, fill = "skyblue", color = "black") +
  facet_wrap(~ Species) +
  labs(title = "Distribution of Sepal Length by Species",
       x = "Sepal Length (cm)",
       y = "Frequency") +
  theme_minimal()

Step 4: Explanation of each line

Code Line                              Description
-------------------------------------  ------------------------------------------------------------
ggplot(iris, aes(x = Sepal.Length))    Initializes a plot using the iris dataset and maps Sepal.Length to the x-axis.
geom_histogram(binwidth = 0.3, ...)    Adds a histogram layer with a bin width of 0.3.
fill = "skyblue"                       Sets the fill color of the bars.
color = "black"                        Sets the border color of the bars.
facet_wrap(~ Species)                  Creates a separate histogram panel for each species.
labs(...)                              Adds a title and axis labels.
theme_minimal()                        Applies a minimal theme with no background annotations.
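
To see what geom_histogram() actually computes for each facet, the binned data behind the plot can be inspected. The sketch below assumes the same plot as in Step 3 (without labels and theme) and uses ggplot2's layer_data() helper; the object names p, bins, and counts_per_panel are illustrative.

```r
# Sketch: inspect the bins that geom_histogram() computes for each facet
library(ggplot2)

p <- ggplot(iris, aes(x = Sepal.Length)) +
  geom_histogram(binwidth = 0.3, fill = "skyblue", color = "black") +
  facet_wrap(~ Species)

# layer_data() returns the computed data for one layer (here the histogram)
bins <- layer_data(p, i = 1)

# One row per bin per panel: bin edges and the number of observations in the bin
head(bins[, c("PANEL", "xmin", "xmax", "count")])

# Total count per panel; each panel should account for all rows of its species
counts_per_panel <- tapply(bins$count, bins$PANEL, sum)
print(counts_per_panel)
```

The PANEL column identifies the facet, so summing count within each panel gives the number of flowers of that species.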

Output Description

The output will be three side-by-side histograms, each showing the distribution of Sepal Length for one of the following species:

  • setosa

  • versicolor

  • virginica

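Because facet_wrap() uses the same x- and y-axis scales in every panel by default, the three histograms can be compared directly to see how the distribution of Sepal Length differs across the species.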
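
As a variation, the panels can be stacked vertically so that the three distributions line up along a common x-axis. This is a sketch of one possible layout, not part of the original program; it uses the ncol argument of facet_wrap(), and the object name p_stacked is illustrative.

```r
# Sketch: stack the facets in a single column for easier comparison along x
library(ggplot2)

p_stacked <- ggplot(iris, aes(x = Sepal.Length)) +
  geom_histogram(binwidth = 0.3, fill = "skyblue", color = "black") +
  # ncol = 1 places one panel per row; adding scales = "free_y" would
  # instead give each panel its own y-axis range
  facet_wrap(~ Species, ncol = 1) +
  labs(title = "Distribution of Sepal Length by Species",
       x = "Sepal Length (cm)",
       y = "Frequency") +
  theme_minimal()

print(p_stacked)
```

With a single column, shifts in location between species are easier to read off the shared x-axis, at the cost of a taller figure.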

Summary

This exercise demonstrates:

  • How to create grouped visualizations using facet_wrap().

  • How to analyze and compare distributions across categories using histograms.

  • How to build a plot in layers with ggplot2: data and aesthetics, a geom, facets, labels, and a theme.