Program 2

Author

Manoj

Write an R script that uses ggplot2 to create a scatter plot in which color-coded data points distinguish the groups of a categorical variable.

Step 1: Load necessary libraries

# Load necessary libraries
library(ggplot2)
library(dplyr)

Step 2: Load the Dataset

Explanation:

  • The iris dataset contains 150 samples of iris flowers, 50 from each of three species: setosa, versicolor, and virginica.

  • Each sample has four measurements in centimeters: sepal length, sepal width, petal length, and petal width.

  • head(data) displays the first six rows by default.

# Load the iris dataset
data <- iris

# Display first few rows
head(data)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
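
Because the plot in the next step colors points by Species, it can help to confirm first that this column is a factor with the expected groups. The following is a minimal sketch using only base R; the name species_counts is illustrative and not part of the original script.

```r
# Self-contained sketch: check the categorical column before plotting
data <- iris

# TRUE if Species is stored as a factor (ggplot2 treats factors as discrete)
is.factor(data$Species)

# Group labels, in the order they will appear in the legend
levels(data$Species)

# Number of samples in each species (illustrative variable name)
species_counts <- table(data$Species)
species_counts
```

If Species were stored as plain text instead, ggplot2 would still treat it as discrete, but converting it with factor() gives explicit control over the legend order.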

Step 3: Create a Scatter Plot

X-Axis (Sepal.Length)

  • Represents the length of the flower’s sepal.

Y-Axis (Sepal.Width)

  • Represents the width of the flower’s sepal.

Color (Species)

  • Maps each of the three species to a distinct color.

Customization

  • geom_point(size = 3, alpha = 0.7): Enlarges the points and makes them slightly transparent, so overlapping points remain visible.

  • labs(): Adds a title and axis labels.

  • theme_minimal(): Uses a clean background for readability.

  • theme(legend.position = "top"): Moves the legend to the top.

# Create a scatter plot using ggplot2
ggplot(data, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point(size = 3, alpha = 0.7) +  # Increase point size & transparency
  labs(title = "Scatter Plot of Sepal Dimensions",
       x = "Sepal Length",
       y = "Sepal Width",
       color = "Species") +  # Legend title
  theme_minimal() +  # Clean layout
  theme(legend.position = "top")  # Move legend to the top
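
The scatter plot shows the groups visually. As a possible numeric companion, the sketch below summarizes the two plotted variables per species with dplyr, which is loaded in Step 1. This is one way to extend the categorical analysis, not part of the original script; the names species_summary, mean_sepal_length, and mean_sepal_width are illustrative.

```r
# Self-contained sketch: per-species summary of the plotted variables
library(dplyr)

species_summary <- iris %>%
  group_by(Species) %>%                      # one group per species
  summarise(
    n = n(),                                 # samples in the group
    mean_sepal_length = mean(Sepal.Length),  # average x-axis value
    mean_sepal_width  = mean(Sepal.Width)    # average y-axis value
  )

# One row per species
species_summary
```

Comparing these group means with the positions of the colored clusters in the plot is a quick way to check that the plot is being read correctly.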