Program 5

Author

chethan

Quarto

Quarto enables you to weave together content and executable code into a finished document. To learn more about Quarto, see https://quarto.org.

Running Code

Implement an R program to create a histogram illustrating the distribution of a continuous variable, with overlays of density curves for each group, using ggplot2.

Overview of Steps

In this program, we will follow these steps:

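  1. Load the required library
  2. Explore the built-in dataset
  3. Build the plot in layers: initialize the plot, add a density-scaled histogram, overlay a density curve for each group, then add labels, a theme, and the legend position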

Step 1: Load Required Library

We first load the ggplot2 package, which is used for data visualization in R.

# Load the ggplot2 package for visualization
library(ggplot2)
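
If ggplot2 is not yet installed, `library(ggplot2)` stops with an error. The following is a minimal sketch of a guarded load, assuming the machine has access to a CRAN mirror; it is one common pattern, not a required step of this program. Printing the version is useful because some arguments, such as `linewidth`, are only available from ggplot2 3.4.0 onward.

```r
# Install ggplot2 only when it is missing (assumes a CRAN mirror is reachable)
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2")
}

# Load the package and report which version is in use
library(ggplot2)
ggplot2_version <- packageVersion("ggplot2")
print(ggplot2_version)
```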

Step 2: Explore the Inbuilt Dataset

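We use the built-in `iris` dataset. In this dataset, `Petal.Length` is a continuous variable and `Species` is a categorical grouping variable with three levels.

# Use the built-in `iris` dataset
# `Petal.Length` is a continuous variable
# `Species` is a categorical grouping variable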
str(iris)
'data.frame':   150 obs. of  5 variables:
 $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
 $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
 $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
 $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
head(iris,n=2)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
tail(iris)
    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
145          6.7         3.3          5.7         2.5 virginica
146          6.7         3.0          5.2         2.3 virginica
147          6.3         2.5          5.0         1.9 virginica
148          6.5         3.0          5.2         2.0 virginica
149          6.2         3.4          5.4         2.3 virginica
150          5.9         3.0          5.1         1.8 virginica
tail(iris,n=2)
    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
149          6.2         3.4          5.4         2.3 virginica
150          5.9         3.0          5.1         1.8 virginica
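
Before plotting, it can help to look at `Petal.Length` per species, since the grouped histogram and density curves visualize exactly these differences. The sketch below uses only base R; the object names `petal_means` and `petal_ranges` are illustrative and not part of the original program.

```r
# Mean petal length for each species (one value per factor level)
petal_means <- tapply(iris$Petal.Length, iris$Species, mean)
print(petal_means)

# Minimum and maximum petal length for each species
petal_ranges <- tapply(iris$Petal.Length, iris$Species, range)
print(petal_ranges)

# Number of observations in each group
print(table(iris$Species))
```

The setosa values sit well below those of the other two species, which is why its distribution appears as a separate peak on the left of the plot.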

Step 3: Create the Histogram with Group-wise Density Curves

Step 3.1: Initialize the Plot

# Start ggplot with the iris dataset
# Map Petal.Length to the x-axis and fill by Species

p <- ggplot(data = iris, aes(x = Petal.Length, fill = Species))
p

Step 3.2: Add the Histogram

# Add a histogram scaled to density so its bars are comparable with density curves
# after_stat(density) replaces the ..density.. notation, deprecated in ggplot2 3.4.0

p <- p + geom_histogram(aes(y = after_stat(density)),
                        alpha = 0.4,
                        position = "identity",
                        bins = 30)
p
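
Density scaling means that, within each group, bar height times bin width sums to 1, which puts the bars on the same scale as a density curve. The sketch below checks this numerically with `layer_data()`, which returns the values ggplot2 computed for a layer. The names `p_hist`, `bars`, and `area_by_group` are illustrative.

```r
library(ggplot2)

# Rebuild the density-scaled histogram so this block runs on its own
p_hist <- ggplot(iris, aes(x = Petal.Length, fill = Species)) +
  geom_histogram(aes(y = after_stat(density)),
                 alpha = 0.4,
                 position = "identity",
                 bins = 30)

# Extract the computed bins for the first (and only) layer
bars <- layer_data(p_hist, 1)

# Area of each bar = density * bin width
bars$area <- bars$density * (bars$xmax - bars$xmin)

# Total bar area per group; each group should sum to 1
area_by_group <- tapply(bars$area, bars$group, sum)
print(area_by_group)
```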

Step 3.3: Overlay Density Curves

# Overlay density curves for each group
# linewidth replaces size for lines, which was deprecated in ggplot2 3.4.0

p <- p +
  geom_density(aes(color = Species), linewidth = 1.2)
p
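
Because `fill = Species` is mapped in the initial `ggplot()` call, `geom_density()` inherits it and draws filled areas, which can cover the histogram bars. If only the curve outlines are wanted, one option is to set `fill = NA` on the density layer. This is a sketch of an alternative, not a change to the program above; `p_outline` is an illustrative name.

```r
library(ggplot2)

# Same histogram as in the main program, built here so the block is self-contained
p_outline <- ggplot(iris, aes(x = Petal.Length, fill = Species)) +
  geom_histogram(aes(y = after_stat(density)),
                 alpha = 0.4,
                 position = "identity",
                 bins = 30) +
  # fill = NA overrides the inherited fill mapping, so only outlines are drawn
  geom_density(aes(color = Species), fill = NA, linewidth = 1.2)

p_outline
```

Setting a small `alpha` on the density layer instead would keep the fill but make it translucent.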

Step 3.4: Add Labels and Theme

# Add a title and axis labels, and apply a clean theme
p <- p + labs(
  title = "Distribution of Petal Length with Group-wise Density Curves",
  x = "Petal Length",
  y = "Density") +
  theme_minimal()

p

# Move the legend to the top of the plot
p <- p + theme(legend.position = "top")
p