Spring 2025
Important points are emphasized / annotated
Axes, symbols, and colors are described
Visual content clarifies (does not distract)
Is accurate, clear, and improves understanding
An “effective graph” communicates clearly
Patterns
Relationships \(\leadsto\) compare & contrast values
Anomalies
Focus / reduction of information
Pie charts are bad at what they are designed to do: it is difficult to judge the relative size of the slices
Inefficient / inflexible use of space
Need many colors and high contrast to make wedges distinct
We’re much worse at estimating area than length — we’re especially bad at perceiving small differences in area
Pie charts make judging trends difficult
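To see the length-versus-area point concretely, the sketch below draws the same values as a bar chart and as a pie chart. The data frame `shares` and its values are hypothetical, chosen to be close together so the difference in readability shows up; the pie is built the usual ggplot2 way, as a stacked bar in polar coordinates.

```r
library(ggplot2)

# Hypothetical shares; the values are deliberately close together
shares = data.frame(
  Category = c("A", "B", "C", "D", "E"),
  Share    = c(22, 21, 20, 19, 18)
)

# Bar chart: values are encoded as lengths on a common baseline
barChart = ggplot(shares, aes(x=Category, y=Share)) +
  geom_bar(stat="identity")

# Pie chart: the same values encoded as angles and areas
# (a single stacked bar drawn in polar coordinates)
pieChart = ggplot(shares, aes(x="", y=Share, fill=Category)) +
  geom_bar(stat="identity", width=1) +
  coord_polar(theta="y")

barChart
pieChart
```

In the bar chart the ordering of the five categories is usually easy to read off; in the pie chart the slices tend to look nearly identical.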
3D effects make graphs harder to read
Are we to judge length? Area? Volume?
Display looks 3D when the angular perspective is offset, which makes referencing values on the axes harder
Display looks 3D when shading is employed, which clutters the graph and makes it harder to read
When you have multiple variables to compare, there are several possibilities:
Plotting 2D values using a scatter plot is easy
If we have a categorical variable, we can sometimes use shading or color to add a third dimension
But if we have another numeric dimension, it’s challenging
Why not use point size (area)?
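One way to explore the point-size question is to draw the same third variable twice, once as point area and once as color. This is only a sketch: it uses the built-in `mtcars` data, and the choice of `wt` as the third variable and the names `bubble` and `shaded` are assumptions made for illustration.

```r
library(ggplot2)

# Third numeric variable (wt) mapped to point area
bubble = ggplot(mtcars, aes(x=mpg, y=hp, size=wt)) +
  geom_point(alpha=0.6) +
  scale_size_area(max_size=8)   # area is proportional to the value

# The same variable mapped to a continuous color scale instead
shaded = ggplot(mtcars, aes(x=mpg, y=hp, color=wt)) +
  geom_point(size=4)

bubble
shaded
```

Both versions encode the same numbers. Comparing them side by side is a quick check on how well small differences in area can actually be read.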
Items compared should have the same baseline for comparison
That baseline should not distort the true data values
Scaling should be set properly for comparison (apples-to-apples)
Scaling should not distort the true data values
Data should always be properly adjusted
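The baseline rules above can be illustrated with a small sketch. The `sales` data frame and its values are hypothetical; the two plots differ only in where the y axis starts. The y-scale expansion is turned off so that the visible baseline sits exactly at the stated limit.

```r
library(ggplot2)

# Hypothetical values that differ by about 4 percent
sales = data.frame(Region=c("North", "South"), Units=c(96, 100))

base = ggplot(sales, aes(x=Region, y=Units)) +
  geom_bar(stat="identity") +
  scale_y_continuous(expand=c(0, 0))   # no padding below the baseline

# Zero baseline: bar lengths are proportional to the values
honest = base + coord_cartesian(ylim=c(0, 100))

# Truncated baseline at 95: only the top of each bar is visible
truncated = base + coord_cartesian(ylim=c(95, 100))

# Ratio of the true values vs. ratio of the visible bar heights above 95
trueRatio    = 100 / 96
visibleRatio = (100 - 95) / (96 - 95)

honest
truncated
```

Here `trueRatio` is about 1.04 while `visibleRatio` is 5, so the truncated plot makes a 4 percent difference look like a fivefold one.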
The ggplot2 library:
Elements you choose to visualize may be set explicitly, or may be mapped to a variable using aes()
Additional elements you choose to visualize may be set explicitly, or may be mapped to a variable using aes()
library(ggplot2)
rd = data.frame(
  Student     = c("Bob", "Sue", "Cat", "Lin"),
  NumberGrade = c(96, 82, 97, 74),
  LetterGrade = factor(c("A","B","A","C"))
)
p = ggplot(rd, aes(y=NumberGrade))
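The difference between setting an element explicitly and mapping it with aes() can be sketched with the same grade data. The plot names `setColor` and `mappedColor` are hypothetical, and color is just one of several aesthetics that could be used here.

```r
library(ggplot2)
rd = data.frame(
  Student     = c("Bob", "Sue", "Cat", "Lin"),
  NumberGrade = c(96, 82, 97, 74),
  LetterGrade = factor(c("A", "B", "A", "C"))
)

# Set explicitly: every point gets the same color and no legend is drawn
setColor = ggplot(rd, aes(x=Student, y=NumberGrade)) +
  geom_point(color="darkblue", size=5)

# Mapped with aes(): color varies with LetterGrade and a legend is added
mappedColor = ggplot(rd, aes(x=Student, y=NumberGrade, color=LetterGrade)) +
  geom_point(size=5)

setColor
mappedColor
```

A value placed outside aes() is a constant applied to the whole layer; a variable placed inside aes() is passed through a scale, which is what produces the legend.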
ggplot2 interprets plot elements using geom objects:
There are other geometry objects: http://docs.ggplot2.org/current/
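As a rough illustration of how geom objects change the interpretation of the same mappings, the sketch below builds one base plot from the built-in `mtcars` data and adds three different geoms to it. The choice of variables and the object names are assumptions made for the example.

```r
library(ggplot2)

# One base plot: cylinders (as categories) against miles per gallon
base = ggplot(mtcars, aes(x=factor(cyl), y=mpg))

asPoints = base + geom_point()     # one point per car
asBoxes  = base + geom_boxplot()   # five-number summary per cylinder group
asViolin = base + geom_violin()    # density shape per cylinder group

asPoints
asBoxes
asViolin
```

The data and the aes() mappings stay the same in all three; only the geom layer differs.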
Aside from geom objects, there are other kinds of layers:
library(ggplot2)
rd = data.frame(
  Student     = c("Bob", "Sue", "Cat", "Lin"),
  NumberGrade = c(96, 82, 97, 74),
  LetterGrade = factor(c("A","B","A","C"))
)
ggplot(rd, aes(x=Student,y=NumberGrade)) +  # Build the plot object
  geom_point(size=5) +                      # Encode visually using points
  xlab("Student Name") +                    # Label the X axis
  ylab("Numeric Grade") +                   # Label the Y axis
  ggtitle("Course Grade Results")           # Give the plot a title
# Same plot, drawn with bars instead of points
library(ggplot2)
rd = data.frame(
  Student     = c("Bob", "Sue", "Cat", "Lin"),
  NumberGrade = c(96, 82, 97, 74),
  LetterGrade = factor(c("A","B","A","C"))
)
ggplot(rd, aes(x=Student,y=NumberGrade)) +
  geom_bar(stat="identity") +               # Only line that changed...
  xlab("Student Name") +
  ylab("Numeric Grade") +
  ggtitle("Course Grade Results")
# Flip the axes so the bars run horizontally
library(ggplot2)
rd = data.frame(
  Student     = c("Bob", "Sue", "Cat", "Lin"),
  NumberGrade = c(96, 82, 97, 74),
  LetterGrade = factor(c("A","B","A","C"))
)
ggplot(rd, aes(x=Student,y=NumberGrade)) +
  geom_bar(stat="identity") +
  coord_flip() +
  xlab("Student Name") +
  ylab("Numeric Grade") +
  ggtitle("Course Grade Results")
# Map the fill color of each bar to the letter grade
library(ggplot2)
rd = data.frame(
  Student     = c("Bob", "Sue", "Cat", "Lin"),
  NumberGrade = c(96, 82, 97, 74),
  LetterGrade = factor(c("A","B","A","C"))
)
ggplot(rd, aes(x=Student,y=NumberGrade,fill=LetterGrade)) +
  geom_bar(stat="identity") +
  xlab("Student Name") +
  ylab("Numeric Grade") +
  ggtitle("Course Grade Results")
# fill is ignored here: the default point shape has no fill, only color
library(ggplot2)
myData = data.frame(Furbletude=rnorm(30), Blehmekness=rnorm(30))
ggplot(myData, aes(x=Furbletude, y=Blehmekness)) +
  geom_point(color="lightblue", fill="darkblue", size=4)
# shape=21 is a filled circle: color sets the outline, fill sets the interior
library(ggplot2)
myData = data.frame(Furbletude=rnorm(30), Blehmekness=rnorm(30))
ggplot(myData, aes(x=Furbletude, y=Blehmekness)) +
  geom_point(color="darkblue", fill="lightblue", size=4, shape=21)
# Set the outline and fill colors of every bar segment explicitly
library(ggplot2)
myData = data.frame(
  Count       = sample(1:10, 30, replace=TRUE),
  Awesomeness = sample(c("CoolThings", "SillyThings", "Meh"), 30, replace=TRUE)
)
ggplot(myData, aes(x=Awesomeness, y=Count)) +
  geom_bar(stat="identity", color="white", fill=rgb(0.12, 0.76, 0.9))
# Map fill to a categorical variable to get stacked, colored segments
library(ggplot2)
myData = data.frame(
  Count       = sample(1:10, 30, replace=TRUE),
  Awesomeness = sample(c("CoolThings", "SillyThings", "Meh"), 30, replace=TRUE),
  TypeOfThing = sample(c("A", "B", "C"), 30, replace=TRUE)
)
ggplot(myData, aes(x=Awesomeness, y=Count, fill=TypeOfThing)) +
  geom_bar(stat="identity", color="black")
# Use a ColorBrewer palette and a black-and-white theme
library(ggplot2)
library(RColorBrewer)
myData = data.frame(
  Count       = sample(1:10, 30, replace=TRUE),
  Awesomeness = sample(c("CoolThings", "SillyThings", "Meh"), 30, replace=TRUE),
  TypeOfThing = sample(c("A", "B", "C"), 30, replace=TRUE)
)
ggplot(myData, aes(x=Awesomeness, y=Count, fill=TypeOfThing)) +
  geom_bar(stat="identity") +
  scale_fill_brewer(palette="Set2") +  # Set2 is a color-blind friendly palette
  theme_bw()                           # White background with light gray grid lines
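# Layer a smoothed trend line under points sized by gear and colored by cyl
library(ggplot2)
ggplot(mtcars, aes(x=mpg,y=hp)) +
  geom_smooth(linewidth=1.5, color="darkgray") +  # use size=1.5 on ggplot2 older than 3.4.0
  geom_point(aes(size=gear,color=cyl)) +
  xlab("Miles per Gallon") +
  ylab("Horsepower")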
# Change the font size and family for all text in the plot
library(ggplot2)
ggplot(mtcars, aes(x=mpg,y=hp)) +
  geom_point(size=4, shape=21, fill="lightblue", color="darkblue") +
  xlab("Miles per Gallon") +
  ylab("Horsepower") +
  theme(text=element_text(size=18, family="Times"))
# Histogram with a custom bin width, colors, labels, and fonts
library(ggplot2)
ggplot(diamonds, aes(carat)) +
  geom_histogram(binwidth=0.5, fill="wheat", color="black") +
  xlab("Carat") +
  ylab("Count") +
  ggtitle("Diamond Carat Distribution") +
  theme(text=element_text(size=18, family="Times"))