By the end of this lesson, you will be able to:
R has powerful plotting tools built directly into the base
language.
Today we will learn how to create two very common types of plots: scatterplots and line plots.
These skills form the foundation for data visualization in R.
High-level plotting functions initiate a new graph in the plotting window. Which method actually runs depends on the class of the input data.
Examples include:
plot()
hist()
boxplot()
barplot()
Arguments can be passed to high-level functions to control appearance.
Common options include:
xlab, ylab – Specify labels for the x and y axes
main – Sets the main title of the plot
type – Character indicating the plot type
  "p" for points
  "l" for lines
  "b" for both
col – Specifies colors
pch – Defines the plotting symbol (e.g., 19 for filled circles)
lwd – Controls line width
lty – Specifies line type (e.g., 1 = solid, 2 = dashed)
xlim, ylim – Define the limits for the x and y axes
Low-level plotting functions add features to the current active plot without starting a new one.
points() – Adds points
lines() – Adds connected line segments
abline() – Adds a straight line
  abline(v = value) for vertical lines
  abline(h = value) for horizontal lines
legend() – Adds a legend
title() – Adds main, subtitle, xlab, and ylab after the plot has been created
par()
The par() function is used to set or query global graphical parameters that affect subsequent plots on the current graphics device until they are changed or reset.
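To make the split between high-level functions, low-level functions, and par() concrete, here is a minimal sketch. The vectors x and y are toy values invented for illustration and are not part of the lesson data.

```r
# Toy data, invented purely for illustration
x <- 1:5
y <- c(2, 4, 3, 6, 5)

# par() returns the previous values of the settings it changes,
# so they can be restored afterwards
old_par <- par(mfrow = c(1, 3))  # layout: 1 row, 3 panels

# High-level calls: each plot() starts a new panel
plot(x, y, type = "p", main = 'type = "p"')
plot(x, y, type = "l", main = 'type = "l"')
plot(x, y, type = "b", main = 'type = "b"')

# Low-level calls: these draw on the most recent panel only
abline(h = mean(y), lty = 2)                             # dashed line at the mean of y
points(x[which.max(y)], max(y), pch = 19, col = "red")   # highlight the largest value

# Restore the previous layout
par(old_par)
```

Saving the return value of par() and passing it back at the end is a common way to keep a layout change from leaking into later plots.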
We will begin using a built-in dataset called
mtcars.
Each row represents a car model.
Each column represents a variable such as miles per gallon, horsepower,
or weight.
head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
In base R:
First variable → x-axis
Second variable → y-axis
plot(mtcars$wt, mtcars$mpg)
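The same scatterplot can also be written with named arguments or with R's formula interface, which makes the axis mapping explicit. A short sketch of the three equivalent calls:

```r
# Positional form: first argument is x, second is y
plot(mtcars$wt, mtcars$mpg)

# Same plot with the arguments named explicitly
plot(x = mtcars$wt, y = mtcars$mpg)

# Formula form: read "mpg ~ wt" as "mpg (y) against wt (x)"
plot(mpg ~ wt, data = mtcars)
```

The points are identical in all three plots; only the default axis labels differ, because the formula form labels the axes with the bare column names.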
plot(
mtcars$wt,
mtcars$mpg,
xlab = "Weight (1000 lbs)",
ylab = "Miles per Gallon",
main = "Fuel Efficiency vs Car Weight"
)
Useful options:
pch → shape
col → color
cex → size
plot(
mtcars$wt,
mtcars$mpg,
pch = 17,
col = "#FFFF00",
cex = 4.2,
xlab = "Weight (1000 lbs)",
ylab = "Miles per Gallon",
main = "Fuel Efficiency vs Car Weight"
)
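The col and pch arguments also accept a vector with one value per point, which can be handy for telling groups apart. The sketch below colors cars by transmission type using the am column (0 = automatic, 1 = manual, per ?mtcars); the color names and the point_cols variable are choices made for this illustration only.

```r
# One color per car: "steelblue" for automatic (am == 0), "orange" for manual
point_cols <- ifelse(mtcars$am == 0, "steelblue", "orange")

plot(
  mtcars$wt,
  mtcars$mpg,
  pch = 19,          # filled circle
  col = point_cols,  # vector of colors, one per point
  cex = 1.2,         # 1 is the default size; larger values enlarge the symbol
  xlab = "Weight (1000 lbs)",
  ylab = "Miles per Gallon",
  main = "Fuel Efficiency vs Car Weight"
)
```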
Make a scatterplot with:
1. Horsepower (hp) on the x-axis
2. Miles per gallon (mpg) on the y-axis
3. A different color and point shape
plot(mtcars$hp, mtcars$mpg,
pch=69, # pch codes 32-127 draw the ASCII character with that code, so 69 plots the letter "E"
col="red",
xlab="Horsepower",
ylab="Miles per gallon"
)
Line plots connect points in order.
They are best for:
Time series
Ordered data
Showing trends
year <- c(2000, 2005, 2010, 2015, 2020)
population <- c(282, 295, 309, 321, 331)
plot(year, population, type = "l")
#type = "l" means line.
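Because the line is drawn in the order the points appear in the vectors, unsorted data produces a zigzag. The sketch below reuses the population figures above but enters them out of order on purpose; the names year_shuffled, pop_shuffled, and idx are made up for this example.

```r
# Same population data as above, deliberately entered out of order
year_shuffled <- c(2010, 2000, 2020, 2005, 2015)
pop_shuffled  <- c(309, 282, 331, 295, 321)

# Lines follow the order of the vectors, so this plot zigzags
plot(year_shuffled, pop_shuffled, type = "l", main = "Unsorted")

# Sort both vectors by year before plotting
idx <- order(year_shuffled)
plot(year_shuffled[idx], pop_shuffled[idx], type = "l", main = "Sorted by year")
```

Indexing both vectors with the same idx keeps each year paired with its own population value.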
plot(
year,
population,
type = "l",
lwd = 2,
col = "darkgreen",
xlab = "Year",
ylab = "Population (millions)",
main = "Population Growth Over Time"
)
# Add point markers on top of the line
points(year, population)
plot(
year,
population,
type = "b",
pch = 19,
col = "purple",
lwd = 2,
xlab = "Year",
ylab = "Population (millions)",
main = "Population Growth Over Time"
)
Use scatterplots when:
Use line plots when:
Create your own line plot using the example data below.
These data represent test scores over five weeks.
Make a line plot that:
week <- c(1, 2, 3, 4, 5)
score <- c(70, 75, 78, 85, 90)
plot(week, score,
type="b",
xlab="Week",
ylab="Score",
main="Week-Score Correlation",
col="blue",
lwd=0.9
)
For this homework, you will create:
You will also use at least ONE new base R plotting feature that we did not practice in class.
Choose at least ONE of the following new features (you can also find a different one if you’d like):
abline() → adds a trend line
grid() → adds background grid lines
legend() → adds a legend to your plot
(You only need to choose ONE, but you may use more if you’d like.)
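For orientation, here is one way the three features might be combined on a single plot. Treating the "trend line" as a least-squares fit from lm() is an assumption of this sketch, since the assignment does not say how the line should be computed; the colors and legend labels are also illustrative choices.

```r
plot(mtcars$wt, mtcars$mpg,
     pch = 19, col = "steelblue",
     xlab = "Weight (1000 lbs)",
     ylab = "Miles per Gallon",
     main = "Fuel Efficiency vs Car Weight")

# Background grid lines at the axis tick positions
grid()

# Assumption: the trend line is a least-squares line fitted with lm()
fit <- lm(mpg ~ wt, data = mtcars)
abline(fit, col = "red", lwd = 2)

# One legend entry for the points, one for the fitted line
legend("topright",
       legend = c("Cars", "Linear fit"),
       col = c("steelblue", "red"),
       pch = c(19, NA),  # symbol only for the points entry
       lty = c(NA, 1),   # line only for the fit entry
       lwd = 2)
```

All three calls are low-level: they draw on the scatterplot created by plot() rather than starting a new one.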
Use the mtcars dataset.
Create a scatterplot that:
Uses at least ONE new feature (abline(), grid(), or legend())
# I could do weight vs. horsepower?
# I want to look at the relationship between weight and horsepower; horsepower will be on the x-axis and weight on the y-axis. Why? I'm not sure; maybe having weight as the vertical variable and power as the horizontal one feels right.
# I should label the units for weight. The data frame itself doesn't show them, but ?mtcars documents wt as weight in 1000 lbs.
# I checked the pch values with this site: https://r-charts.com/base-r/pch-symbols/
head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
plot(mtcars$hp, mtcars$wt,
xlab="Horsepower",
ylab="Weight (1000 lbs)",
main="Car Horsepower vs. Weight",
pch=18,
col="darkred"
)
grid()
Create your own small dataset (at least 5 points). Create a line plot that:
Shows both points and lines (type = "b")
timepassed <- c(5, 20, 23, 34, 44, 52) #This is the x-axis value in minutes
problemsdone <- c(1,2,3,4,5,6) #This is the y-axis value.
plot(timepassed,problemsdone,
type="b",
xlab="",
ylab="",
pch=9,
lwd=0.9
)
# By default plot() uses the variable names as axis labels, so I set xlab and ylab to empty strings in the call above; otherwise the labels added by title() would be drawn on top of them.
title(main="Time Investment In Organic Chemistry Homework",sub="(I completely made this data up!)", xlab="Minutes Passed", ylab="Problems Completed")
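An alternative to passing empty strings, offered here only as a sketch, is ann = FALSE, which asks plot() to skip its default title and axis labels so that title() can supply all of them. The data and label text are copied from the example above.

```r
timepassed   <- c(5, 20, 23, 34, 44, 52)  # minutes
problemsdone <- c(1, 2, 3, 4, 5, 6)

# ann = FALSE suppresses the default title and axis labels
plot(timepassed, problemsdone,
     type = "b", pch = 9, lwd = 0.9,
     ann = FALSE)

# All annotation is then added in one place
title(main = "Time Investment In Organic Chemistry Homework",
      sub  = "(I completely made this data up!)",
      xlab = "Minutes Passed",
      ylab = "Problems Completed")
```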