PROGRAM 9
Objective
Create multiple histograms using ggplot2::facet_wrap() to visualize how a variable (e.g., Sepal.Length) is distributed across different groups (e.g., Species) in a built-in R dataset.
Requirements
# Load the ggplot2 package
library(ggplot2)
Step 1: Load and Explore the Dataset
We’ll use the built-in iris dataset. This dataset contains:
150 rows (observations)
4 numeric columns: Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
1 categorical column: Species (setosa, versicolor, virginica)
Let’s view the first few rows.
# Load the iris dataset
data(iris)
# View the first few rows of the dataset
head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
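The structure described in Step 1 can also be checked directly rather than read off the first six rows. The following is a minimal sketch using only base R functions; it is one way to explore the data, not a required part of the program.

```r
# Load the built-in dataset
data(iris)

# Number of rows and columns
dim(iris)

# Column names and types (four numeric columns, one factor)
str(iris)

# Number of observations in each species group
table(iris$Species)

# Five-number summary and mean of the variable plotted in Step 2
summary(iris$Sepal.Length)
```

The table() call shows that the three species are equally represented, so differences in bar heights between the histograms reflect the shape of each distribution rather than unequal group sizes.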
Step 2: Create Grouped Histograms Using facet_wrap
Let’s now create histograms of Sepal.Length for each Species using ggplot2 and facet_wrap().
# Create histograms using facet_wrap() for grouped data
ggplot(iris, aes(x = Sepal.Length)) +
  geom_histogram(binwidth = 0.3, fill = "skyblue", color = "black") +
  facet_wrap(~ Species) +
  labs(title = "Distribution of Sepal Length by Species",
       x = "Sepal Length (cm)",
       y = "Frequency") +
  theme_minimal()
Step 3: Explanation of Each Line
| Code Line | Description |
|---|---|
| ggplot(iris, aes(x = Sepal.Length)) | Initializes a plot using the iris dataset and maps Sepal.Length to the x-axis. |
| geom_histogram(binwidth = 0.3, ...) | Adds a histogram layer in which each bar covers a range of 0.3 cm. |
| fill = "skyblue" | Sets the fill color of the bars. |
| color = "black" | Sets the border color of the bars. |
| facet_wrap(~ Species) | Creates a separate histogram panel for each species, arranged in a grid layout. |
| labs(...) | Adds a title and axis labels. |
| theme_minimal() | Applies a minimal theme that removes the gray panel background. |
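To see what the histogram layer counts, the binning can be reproduced by hand in base R. This is a rough sketch: the bin edges below are an assumption chosen for illustration, and ggplot2 aligns its bins by its own rules, so individual bar heights in the plot may differ slightly from these counts.

```r
data(iris)

# Illustrative bin edges 0.3 apart that cover the observed range (4.3 to 7.9).
# These edges are an assumption; ggplot2 chooses its own bin alignment.
breaks <- seq(42, 81, by = 3) / 10

# Assign each measurement to a bin, then cross-tabulate bins against species
bins <- cut(iris$Sepal.Length, breaks = breaks, include.lowest = TRUE)
counts <- table(bin = bins, species = iris$Species)

print(counts)

# Each species contributes 50 observations in total
colSums(counts)
```

Each column of the resulting table corresponds to one facet panel, and each row corresponds to one bar within that panel.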
Output Description
The output will be three side-by-side histograms, each showing the distribution of Sepal Length for one of the following species:
setosa
versicolor
virginica
Each histogram allows us to visually compare the distribution of Sepal Length across the species.
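If the side-by-side layout makes it hard to compare where each distribution sits along the x-axis, the panels can be stacked instead. The sketch below assumes ggplot2 is installed and uses the ncol argument of facet_wrap(); the object name p_stacked is chosen here for illustration only.

```r
library(ggplot2)

# Same plot as in Step 2, but with the panels stacked in a single column
# so that all three histograms share one aligned x-axis
p_stacked <- ggplot(iris, aes(x = Sepal.Length)) +
  geom_histogram(binwidth = 0.3, fill = "skyblue", color = "black") +
  facet_wrap(~ Species, ncol = 1) +
  labs(title = "Distribution of Sepal Length by Species",
       x = "Sepal Length (cm)",
       y = "Frequency") +
  theme_minimal()

print(p_stacked)
```

With the panels in one column, a shift of one species toward longer or shorter sepals shows up as a horizontal offset between rows.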
Bonus Tip: Try with Different Variables
You can replace Sepal.Length in the aes(x = ...) part with:
Sepal.Width
Petal.Length
Petal.Width
This lets you explore how other features vary across species!
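Rather than editing the code once per variable, the plot can be wrapped in a small function. The following is a sketch, assuming ggplot2 is installed; facet_hist is a hypothetical helper name introduced here, and the default bin width of 0.3 is simply carried over from Step 2.

```r
library(ggplot2)

# Hypothetical helper: faceted histograms for any numeric column of iris.
# `var` is the column name as a string; `.data[[var]]` looks it up in the data.
facet_hist <- function(var, binwidth = 0.3) {
  ggplot(iris, aes(x = .data[[var]])) +
    geom_histogram(binwidth = binwidth, fill = "skyblue", color = "black") +
    facet_wrap(~ Species) +
    labs(title = paste("Distribution of", var, "by Species"),
         x = paste(var, "(cm)"),
         y = "Frequency") +
    theme_minimal()
}

# Build one plot per remaining numeric variable
vars <- c("Sepal.Width", "Petal.Length", "Petal.Width")
plots <- lapply(vars, facet_hist)
names(plots) <- vars

# Display one of them
print(plots[["Petal.Length"]])
```

A bin width of 0.3 may be too coarse for Petal.Width, whose values span a much narrower range, so a smaller value such as facet_hist("Petal.Width", binwidth = 0.1) is worth trying.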
Summary
This exercise demonstrates:
How to create grouped visualizations using facet_wrap().
How to analyze and compare distributions across categories using histograms.
Use of ggplot2, a widely used R package for data visualization.