Program 2

Author

srushti vh 1nt23is218

Write an R script to create a scatter plot, incorporating categorical analysis through color-coded data points representing different groups, using ggplot2.

Step 1: Load the necessary libraries

library(ggplot2)
library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union

Step 2: Load the dataset

Explanation:

  • The iris dataset contains 150 samples of iris flowers, 50 from each of three species: setosa, versicolor, and virginica.

  • Each sample has four measurements in centimetres: sepal length, sepal width, petal length, and petal width.

  • head(data, n = 10) displays the first 10 rows.
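
The points above can be confirmed directly from the built-in dataset. The following is a minimal sketch using only base R; it assumes nothing beyond the standard iris data frame that ships with R.

```r
# Dimensions of the built-in iris data frame: rows (samples) and columns
dim(iris)

# Species is stored as a factor; its levels are the three groups
levels(iris$Species)

# Column types: four numeric measurements plus the Species factor
str(iris)
```
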

data <- iris
head(data, n=10)
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1           5.1         3.5          1.4         0.2  setosa
2           4.9         3.0          1.4         0.2  setosa
3           4.7         3.2          1.3         0.2  setosa
4           4.6         3.1          1.5         0.2  setosa
5           5.0         3.6          1.4         0.2  setosa
6           5.4         3.9          1.7         0.4  setosa
7           4.6         3.4          1.4         0.3  setosa
8           5.0         3.4          1.5         0.2  setosa
9           4.4         2.9          1.4         0.2  setosa
10          4.9         3.1          1.5         0.1  setosa
# Column names are case-sensitive: the column is Species, not species
table(data$Species)

    setosa versicolor  virginica 
        50         50         50 
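
Since dplyr is loaded in step 1, the three groups can also be checked with a grouped summary before plotting. This is a minimal sketch, not part of the original program; the names species_summary, n, mean_sepal_length, and mean_sepal_width are illustrative choices.

```r
library(dplyr)

data <- iris

# One row per species: sample count and mean sepal dimensions
species_summary <- data %>%
  group_by(Species) %>%
  summarise(
    n = n(),
    mean_sepal_length = mean(Sepal.Length),
    mean_sepal_width = mean(Sepal.Width)
  )

print(species_summary)
```

The summary gives a numeric view of the same grouping that the scatter plot shows through color.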

Step 3: Create a scatter plot

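# Map sepal length and width to the axes and species to point color
ggplot(data, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point(size = 3, alpha = 0.7) +   # semi-transparent points reveal overlaps
  labs(title = "Scatter plot of sepal dimensions",
       x = "Sepal Length",
       y = "Sepal Width",
       color = "Species") +
  theme_minimal() +
  theme(legend.position = "top")        # place the legend above the plot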
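
As a possible variation, the groups can be given fixed colors and distinct point shapes so they stay distinguishable in grayscale printouts. This is a sketch under stated assumptions: the vector species_colors and its hex codes are illustrative choices, not part of the original program, and the plot is stored in a hypothetical object p so it can be reused.

```r
library(ggplot2)

data <- iris

# Hypothetical palette: hex codes chosen for illustration only
species_colors <- c(setosa = "#E69F00",
                    versicolor = "#56B4E9",
                    virginica = "#009E73")

# Map Species to both color and shape; matching legend titles merge the legends
p <- ggplot(data, aes(x = Sepal.Length, y = Sepal.Width,
                      color = Species, shape = Species)) +
  geom_point(size = 3, alpha = 0.7) +
  scale_color_manual(values = species_colors) +
  labs(title = "Scatter plot of sepal dimensions",
       x = "Sepal Length",
       y = "Sepal Width",
       color = "Species",
       shape = "Species") +
  theme_minimal() +
  theme(legend.position = "top")

print(p)
```

Naming the palette entries after the factor levels ties each color to a species regardless of the order in which the levels appear.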