Program - 10
- Develop an R function to draw a density curve representing the probability density function of a continuous variable, with separate curves for each group, using ggplot2.
Step 1: Load Required Library
We need the ggplot2 package to create density plots.
# Load ggplot2 for plotting
library(ggplot2)
Step 2: Define the Function
We will create a function called plot_density_by_group() which:
- Accepts a data frame, the name of a continuous variable, and a grouping variable
- Draws density curves by group
- Allows optional custom color schemes
plot_density_by_group <- function(data, continuous_var, group_var, fill_colors = NULL) {
  # Check if the specified columns exist
  if (!(continuous_var %in% names(data)) || !(group_var %in% names(data))) {
    stop("Invalid column names. Make sure both variables exist in the dataset.")
  }
  # Create the ggplot object
  p <- ggplot(data, aes_string(x = continuous_var, color = group_var, fill = group_var)) +
    geom_density(alpha = 0.4) +
    labs(title = paste("Density Plot of", continuous_var, "by", group_var),
         x = continuous_var,
         y = "Density") +
    theme_minimal()
  # Apply custom fill colors if provided
  if (!is.null(fill_colors)) {
    p <- p + scale_fill_manual(values = fill_colors) +
      scale_color_manual(values = fill_colors)
  }
  # Return the plot
  return(p)
}
Step 3: Explanation of Function Components
| Code | Description |
|---|---|
| data | The dataset (e.g., iris) |
| continuous_var | Name of the continuous variable (e.g., "Sepal.Length") |
| group_var | Grouping variable (e.g., "Species") |
| aes_string() | Maps the variables using string names (for flexibility) |
| geom_density(alpha = 0.4) | Draws smoothed density curves with transparency |
| facet_wrap(~ group_var) | Not used here; instead we overlay curves in one plot |
| theme_minimal() | Clean layout with minimal gridlines |
| scale_fill_manual() | Applies custom fill colors if provided |
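The table mentions facet_wrap() as an alternative that the function does not use. For comparison, here is a minimal sketch of what a faceted version might look like, with one panel per group instead of overlaid curves. The variable names below are illustrative and not part of the original function. Because group_var holds a column name as a string, the sketch passes it to facet_wrap() as a character vector rather than as a formula.

```r
library(ggplot2)

# Illustrative inputs (assumed for this sketch)
continuous_var <- "Sepal.Length"
group_var <- "Species"

# One panel per group instead of overlaid curves
p_facet <- ggplot(iris, aes(x = .data[[continuous_var]])) +
  geom_density(fill = "steelblue", alpha = 0.4) +
  facet_wrap(facets = group_var) +   # character vector naming the facet column
  labs(x = continuous_var, y = "Density") +
  theme_minimal()

p_facet
```

Faceting can be easier to read when many groups overlap heavily, while overlaying makes direct comparison of shapes and peaks easier.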
Step 4: Example with Built-in iris Dataset
Let’s draw density plots for Sepal.Length across different Species in the iris dataset.
# Basic usage
plot_density_by_group(iris, "Sepal.Length", "Species")
Warning: `aes_string()` was deprecated in ggplot2 3.0.0.
ℹ Please use tidy evaluation idioms with `aes()`.
ℹ See also `vignette("ggplot2-in-packages")` for more information.
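The warning above suggests replacing aes_string() with tidy evaluation inside aes(). One possible way to do that is the .data pronoun, which looks up a column by its string name. The sketch below is a hypothetical variant (the name plot_density_by_group_tidy is an assumption, not part of the original program) and leaves out the custom color handling for brevity.

```r
library(ggplot2)

# Hypothetical variant of plot_density_by_group() that avoids aes_string()
plot_density_by_group_tidy <- function(data, continuous_var, group_var) {
  # Same column check as the original function
  if (!(continuous_var %in% names(data)) || !(group_var %in% names(data))) {
    stop("Invalid column names. Make sure both variables exist in the dataset.")
  }
  # .data[[name]] maps a column given as a string
  ggplot(data, aes(x = .data[[continuous_var]],
                   color = .data[[group_var]],
                   fill = .data[[group_var]])) +
    geom_density(alpha = 0.4) +
    # Set labels explicitly so the legend and axis show the column names
    labs(title = paste("Density Plot of", continuous_var, "by", group_var),
         x = continuous_var,
         y = "Density",
         color = group_var,
         fill = group_var) +
    theme_minimal()
}

p_tidy <- plot_density_by_group_tidy(iris, "Sepal.Length", "Species")
p_tidy
```

The labels are set explicitly because, depending on the ggplot2 version, the default axis and legend titles may otherwise show the .data[[...]] expression rather than the column name.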
Step 5: Example with Custom Colors
You can customize colors to improve visual appeal or match your theme.
# Define custom colors
custom_colors <- c("setosa" = "steelblue",
                   "versicolor" = "forestgreen",
                   "virginica" = "darkorange")
# Plot with custom colors
plot_density_by_group(iris, "Petal.Length", "Species", fill_colors = custom_colors)
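When a named color vector is passed to a manual scale, ggplot2 matches the names against the group values, so a misspelled or missing name can leave a group without its intended color. A small check before plotting may help catch that. The helper below, missing_color_levels(), is a hypothetical addition for illustration and is not part of the original function.

```r
# Hypothetical helper: list group values that have no entry in a named color vector
missing_color_levels <- function(data, group_var, fill_colors) {
  groups <- unique(as.character(data[[group_var]]))
  setdiff(groups, names(fill_colors))
}

custom_colors <- c("setosa" = "steelblue",
                   "versicolor" = "forestgreen",
                   "virginica" = "darkorange")

# All three species are covered
missing_all <- missing_color_levels(iris, "Species", custom_colors)
missing_all   # character(0)

# Dropping the third color leaves one species uncovered
missing_one <- missing_color_levels(iris, "Species", custom_colors[1:2])
missing_one   # "virginica"
```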
📈 Step 6: Output Description
- The X-axis shows the continuous variable (e.g., Sepal.Length)
- The Y-axis shows the probability density
- Each group (e.g., Species) is represented by a separate curve
- The alpha = 0.4 setting allows curves to overlap transparently
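To make "probability density" on the Y-axis more concrete: the height of a curve is not a probability, and it can exceed 1, but the area under each group's curve should be close to 1. The sketch below checks this numerically with base R's density(), assuming its default Gaussian kernel and bandwidth, which as far as I know is the same estimator geom_density() uses by default.

```r
# Estimate a kernel density for each species and integrate it numerically
areas <- sapply(split(iris$Sepal.Length, iris$Species), function(x) {
  d <- density(x)  # Gaussian kernel with the default bandwidth
  # Trapezoidal rule over the evaluation grid
  sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)
})

# Each area should be approximately 1
round(areas, 2)
```

Because every curve encloses roughly the same area, a tall narrow curve indicates values concentrated in a small range, while a low wide curve indicates more spread.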
Summary
This function is:
- Reusable: Works for any dataset with a numeric and a categorical variable
- Customizable: Supports color schemes
- Effective: Helps visualize distribution patterns across groups
Use it for exploratory data analysis to compare how different categories behave in terms of continuous measurements.