Iris Dataset Visualization

Author

Abby Williamson

Introduction

This report explores the iris dataset using various R plotting techniques. We will look at the relationships between petal lengths and widths, filter for specific species, and customize plots using different symbols, colors, and sizes.

Loading the Data

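First, we load the iris dataset from a CSV file in the local data directory. Setting stringsAsFactors = TRUE reads text columns such as Species as factors, which the later plots rely on.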

iris = read.csv("./data/iris.csv", stringsAsFactors = TRUE)
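A quick structural check after loading can catch a wrong path or unexpected column names early. The sketch below uses R's built-in datasets::iris as a stand-in, on the assumption that ./data/iris.csv has the same five columns. The names iris_check and expected_cols are introduced here for illustration only.

```r
# Stand-in for the CSV: R's built-in copy of the dataset (an assumption;
# the file in ./data/iris.csv may differ).
iris_check <- datasets::iris

# Columns that the plots in this report rely on.
expected_cols <- c("Sepal.Length", "Sepal.Width",
                   "Petal.Length", "Petal.Width", "Species")

# TRUE if every expected column is present.
has_cols <- all(expected_cols %in% names(iris_check))

# Species should be a factor; stringsAsFactors = TRUE does this for a CSV.
species_is_factor <- is.factor(iris_check$Species)

# Print a compact overview of column types and the first few values.
str(iris_check)
```

If has_cols or species_is_factor is FALSE for the real file, the plotting code further down would likely need adjusting.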

Exploratory Data Plots

Plot 1: Petal length vs petal width

We begin with a simple scatter plot of petal length against petal width to see the overall trend across the dataset.

plot(iris$Petal.Length, iris$Petal.Width,
     xlab = "Petal Length (cm)", ylab = "Petal Width (cm)",
     main = "Petal Length vs Width")

Plot 2: Petal length vs petal width for setosa only

Next, we subset the data to isolate the setosa species and look only at its petal dimensions.

setosa_data = subset(iris, Species == "setosa")
plot(setosa_data$Petal.Length, setosa_data$Petal.Width,
     xlab = "Petal Length (cm)", ylab = "Petal Width (cm)",
     main = "Setosa Petal Length vs Width")
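For reference, subset() selects the same rows as logical indexing, and the result still carries all three species levels. The sketch below illustrates both points on R's built-in datasets::iris, which we assume matches the CSV. The names iris_demo, setosa_subset, and setosa_index are hypothetical.

```r
# Stand-in data: R's built-in iris (assumed to match ./data/iris.csv).
iris_demo <- datasets::iris

# Two ways to select the setosa rows.
setosa_subset <- subset(iris_demo, Species == "setosa")
setosa_index  <- iris_demo[iris_demo$Species == "setosa", ]

# Both approaches pick the same rows.
same_rows <- identical(rownames(setosa_subset), rownames(setosa_index))

# The factor keeps its unused levels after subsetting...
levels_before <- levels(setosa_subset$Species)

# ...unless they are dropped explicitly.
levels_after <- levels(droplevels(setosa_subset)$Species)
```

Dropping unused levels is optional here, but it matters if the subset is later used to build a legend from levels().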

Advanced Customization

Plot 3: Different symbols for different species

To distinguish the groups in a single visualization, we map the plotting character (pch) to the numeric factor codes of the flower species.

plot(iris$Petal.Length, iris$Petal.Width,
     xlab = "Petal Length (cm)", ylab = "Petal Width (cm)",
     main = "Species by Symbol Type",
     pch = as.numeric(iris$Species))

legend("topleft", legend = levels(iris$Species), pch = 1:3, title = "Species")
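The mapping works because as.numeric() on a factor returns each value's position in levels(), and levels are sorted alphabetically by default. A minimal sketch with a small hypothetical vector (species_demo is not part of the report's data):

```r
# A small factor with the three species names in arbitrary order.
species_demo <- factor(c("virginica", "setosa", "versicolor", "setosa"))

# Levels are sorted alphabetically: setosa, versicolor, virginica.
species_levels <- levels(species_demo)

# Each value becomes the index of its level: 3, 1, 2, 1.
species_codes <- as.numeric(species_demo)

# These codes are used directly as pch values, so
# setosa -> pch 1 (circle), versicolor -> pch 2 (triangle),
# virginica -> pch 3 (plus sign).
print(data.frame(species = species_demo, pch = species_codes))
```

This is also why pch = 1:3 in the legend lines up with levels(iris$Species): both follow the level order.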

Plot 4: Different colors for different species

Next, we use a custom color palette to clearly distinguish the three species.

species_colors = c("setosa" = "darkorange", "versicolor" = "purple", "virginica" = "deepskyblue")

plot(iris$Petal.Length, iris$Petal.Width,
     xlab = "Petal Length (cm)", ylab = "Petal Width (cm)",
     main = "Species by Custom Colors",
     pch = 16, col = species_colors[iris$Species])

legend("topleft", legend = levels(iris$Species), col = species_colors, pch = 16, title = "Species")
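One detail worth knowing: indexing a vector with a factor uses the factor's integer codes, not its labels. The lookup above is correct because the palette is listed in the same order as the factor levels. The sketch below shows what could go wrong with a reordered palette and how indexing by as.character() avoids it. The names shuffled_colors and species_demo are hypothetical.

```r
# Same palette as in the report.
species_colors <- c("setosa" = "darkorange", "versicolor" = "purple",
                    "virginica" = "deepskyblue")

# Hypothetical palette with the same names in a different order.
shuffled_colors <- species_colors[c("virginica", "setosa", "versicolor")]

# A small factor with the usual three levels.
species_demo <- factor(c("setosa", "virginica"),
                       levels = c("setosa", "versicolor", "virginica"))

# Factor indexing goes by integer code (positions 1 and 3), ignoring names.
by_position <- shuffled_colors[species_demo]

# Character indexing goes by name, so order no longer matters.
by_name <- shuffled_colors[as.character(species_demo)]
```

Here by_name returns the intended colors, while by_position returns whatever sits in positions 1 and 3 of the reordered palette.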

Plot 5: Scaled symbols by sepal width and colored by species

A bubble plot encodes several variables at once: position shows petal length and width, color indicates the species, and symbol size (cex) scales with sepal width.

plot(iris$Petal.Length, iris$Petal.Width,
     xlab = "Petal Length (cm)", ylab = "Petal Width (cm)", 
     main = "Bubble Plot: Size Scaled by Sepal Width", 
     pch = 1, cex = iris$Sepal.Width / 2, col = species_colors[iris$Species])

legend("topleft", legend = levels(iris$Species), col = species_colors, pch = 1, title = "Species")
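Because cex scales the linear size of a symbol, the visible area of each bubble grows roughly with the square of sepal width. As a possible variation, not part of the original analysis, taking a square root makes the area track sepal width instead. The sketch assumes R's built-in datasets::iris matches the CSV. The names cex_linear and cex_area and the factor 2 are illustrative choices.

```r
# Stand-in data: R's built-in iris (assumed to match ./data/iris.csv).
iris_demo <- datasets::iris

# Same palette as in the report.
species_colors <- c("setosa" = "darkorange", "versicolor" = "purple",
                    "virginica" = "deepskyblue")

# Linear scaling, as in the plot above: symbol width grows with sepal width.
cex_linear <- iris_demo$Sepal.Width / 2

# Hypothetical alternative: the square root makes symbol area, rather than
# width, grow in proportion to sepal width. The factor 2 is arbitrary.
cex_area <- 2 * sqrt(iris_demo$Sepal.Width / max(iris_demo$Sepal.Width))

plot(iris_demo$Petal.Length, iris_demo$Petal.Width,
     xlab = "Petal Length (cm)", ylab = "Petal Width (cm)",
     main = "Bubble Plot: Area Scaled by Sepal Width",
     pch = 1, cex = cex_area,
     col = species_colors[as.character(iris_demo$Species)])

legend("topleft", legend = names(species_colors), col = species_colors,
       pch = 1, title = "Species")
```

Which scaling reads better is a judgment call; the linear version exaggerates differences, while the area version is more conservative.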

Plot 6: Petal length vs petal width with a smoothing line

Lastly, we add a LOESS smoothing line to the scatter plot of all observations to visualize the overall trend across the three species.

scatter.smooth(iris$Petal.Length, iris$Petal.Width,
               xlab = "Petal Length (cm)", ylab = "Petal Width (cm)", 
               main = "Petal Dimensions with Smoothing Line")
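To style the curve separately from the points, the smoothed values can be computed with loess.smooth() and drawn with lines(). The sketch below spells out span = 2/3 and degree = 1, which are the documented defaults of scatter.smooth(). It assumes R's built-in datasets::iris matches the CSV, and the color and line width are arbitrary choices.

```r
# Stand-in data: R's built-in iris (assumed to match ./data/iris.csv).
iris_demo <- datasets::iris

# Compute the smoothed curve; returns a list with components x and y.
smooth_fit <- loess.smooth(iris_demo$Petal.Length, iris_demo$Petal.Width,
                           span = 2/3, degree = 1)

# Draw the raw points first.
plot(iris_demo$Petal.Length, iris_demo$Petal.Width,
     xlab = "Petal Length (cm)", ylab = "Petal Width (cm)",
     main = "Petal Dimensions with Smoothing Line")

# Overlay the smoothed curve with its own color and width.
lines(smooth_fit$x, smooth_fit$y, col = "red", lwd = 2)
```

A smaller span follows the data more closely, while a larger one gives a smoother curve.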