By the end of this lesson, you will be able to:
R has powerful plotting tools built directly into the base
language.
Today we will learn how to create two very common types of plots: scatterplots and line plots.
These skills form the foundation for data visualization in R.
High-level plotting functions initiate a new graph in the plotting window. For a generic function such as plot(), the specific method called depends on the class of the input data.
Examples include:
plot()
hist()
boxplot()
barplot()
Arguments can be passed to high-level functions to control appearance.
Common options include:
xlab, ylab – Specify labels for the x and y axes
main – Sets the main title of the plot
type – Character indicating the plot type
  "p" for points
  "l" for lines
  "b" for both
col – Specifies colors
pch – Defines the plotting symbol (e.g., 19 for filled circles)
lwd – Controls line width
lty – Specifies line type (e.g., 1 = solid, 2 = dashed)
xlim, ylim – Define the limits for the x and y axes
Low-level plotting functions add features to the current active plot without starting a new one.
points() – Adds points
lines() – Adds connected line segments
abline() – Adds a straight line
  abline(v = value) for vertical lines
  abline(h = value) for horizontal lines
legend() – Adds a legend
title() – Adds a main title, subtitle, and axis labels (main, sub, xlab, ylab) after the plot has been created
par()
The par() function is used to set or query global graphical parameters. A changed setting affects all subsequent plots on the active graphics device until it is changed again or the device is closed.
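As a rough illustration of how these pieces fit together, the sketch below makes one high-level call, adds to it with low-level calls, and uses par() to change and then restore a global setting. The data, colors, reference values, and margin sizes are made up for illustration and are not part of the lesson.

```r
# Illustrative data (made up for this sketch)
x <- 1:10
y <- x^2

# par() returns the previous values of any settings it changes,
# so they can be restored afterwards
old_par <- par(mar = c(5, 4, 4, 2) + 0.5)

# High-level call: starts a new plot
plot(
  x, y,
  type = "b", pch = 19, col = "blue", lwd = 2,
  xlab = "x", ylab = "x squared",
  xlim = c(0, 12), ylim = c(0, 110)
)

# Low-level calls: add to the plot that is already open
abline(h = 50, lty = 2)                          # horizontal reference line
abline(v = 5, lty = 2)                           # vertical reference line
points(7, 49, pch = 17, col = "red", cex = 1.5)  # highlight one point
title(main = "High-level plot with low-level additions")
legend("topleft", legend = c("data", "highlight"),
       pch = c(19, 17), col = c("blue", "red"))

# Restore the earlier graphical parameters
par(old_par)
```

Saving the value returned by par() and passing it back at the end is a common way to avoid leaving changed settings behind for later plots.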
We will begin using a built-in dataset called
mtcars.
Each row represents a car model.
Each column represents a variable such as miles per gallon, horsepower,
or weight.
head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
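To check the rows-and-columns description directly, a few base R helpers report the size and column types of a data frame. This is only a quick sketch using functions that ship with base R.

```r
# Number of rows (car models) and columns (variables)
dim(mtcars)

# Column names
names(mtcars)

# Compact overview: type and first few values of every column
str(mtcars)

# The car model names are stored as row names, not as a column
head(rownames(mtcars), 3)
```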
In base R:
First variable → x-axis
Second variable → y-axis
plot(mtcars$wt, mtcars$mpg)
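The same scatterplot can also be written with R's formula interface, where the variable on the left of ~ goes on the y-axis and the variable on the right goes on the x-axis. This is an alternative spelling, not something the lesson requires; a sketch:

```r
# Positional form: first argument is x, second is y
plot(mtcars$wt, mtcars$mpg)

# Formula form: y ~ x, with the columns looked up in `data`
plot(mpg ~ wt, data = mtcars)

# with() is another way to avoid repeating mtcars$
with(mtcars, plot(wt, mpg))
```

All three calls draw the same points; only the default axis labels differ.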
plot(
  mtcars$wt,
  mtcars$mpg,
  xlab = "Weight (1000 lbs)",
  ylab = "Miles per Gallon",
  main = "Fuel Efficiency vs Car Weight"
)
Useful options:
pch → point shape (each shape has its own number)
col → color
cex → size of the plotted points (1 is the default)
plot(
  mtcars$wt,
  mtcars$mpg,
  pch = 19,
  col = "blue",
  cex = 1.2,
  xlab = "Weight (1000 lbs)",
  ylab = "Miles per Gallon",
  main = "Fuel Efficiency vs Car Weight"
)
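The col and pch arguments also accept a vector with one value per point, which makes it possible to mark groups. The sketch below colors points by number of cylinders (the cyl column of mtcars); the three colors and the legend placement are arbitrary choices for illustration.

```r
# One color per cylinder count (colors chosen arbitrarily)
cyl_colors <- c("4" = "blue", "6" = "orange", "8" = "red")

# Look up a color for every row by its cyl value
point_colors <- cyl_colors[as.character(mtcars$cyl)]

plot(
  mtcars$wt,
  mtcars$mpg,
  pch = 19,
  col = point_colors,
  cex = 1.2,
  xlab = "Weight (1000 lbs)",
  ylab = "Miles per Gallon",
  main = "Fuel Efficiency vs Car Weight"
)

# Explain the colors
legend("topright", legend = names(cyl_colors), col = cyl_colors,
       pch = 19, title = "Cylinders")
```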
Make a scatterplot with:
1. Horsepower (hp) on the x-axis
2. Miles per gallon (mpg) on the y-axis
3. A different color and point shape
plot(
  mtcars$hp,
  mtcars$mpg,
  pch = 18,
  col = "purple",
  cex = 1.2,
  xlab = "Horsepower",
  ylab = "Miles per Gallon",
  main = "Fuel Efficiency vs Car Horsepower"
)
Line plots connect points in order.
They are best for:
Time series
Ordered data
Showing trends
year <- c(2000, 2005, 2010, 2015, 2020)
population <- c(282, 295, 309, 321, 331)
plot(year, population, type = "l")
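# type = "l" draws a line; "b" draws both lines and points; the default, "p", draws points only
plot(year, population, type = "b")
# Add points to an existing line plot with a separate call (base R does not combine layers with `+`)
plot(year, population, type = "l")
points(year, population)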
plot(
  year,
  population,
  type = "l",
  lwd = 2,  # line width
  col = "darkgreen",
  xlab = "Year",
  ylab = "Population (millions)",
  main = "Population Growth Over Time"
)
plot(
  year,
  population,
  type = "b",
  pch = 19,
  col = "purple",
  lwd = 2,
  xlab = "Year",
  ylab = "Population (millions)",
  main = "Population Growth Over Time"
)
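Because a line plot connects points in the order they appear in the data, unsorted x values produce a zigzag. A minimal sketch, reusing the population figures above in a deliberately shuffled order (the shuffling and the variable names are made up for illustration), sorts them with order() before plotting:

```r
# Same values as above, deliberately out of order
year_shuffled <- c(2010, 2000, 2020, 2005, 2015)
pop_shuffled <- c(309, 282, 331, 295, 321)

# Plotting as-is connects 2010 -> 2000 -> 2020 -> ... and zigzags
plot(year_shuffled, pop_shuffled, type = "l", main = "Unsorted")

# order() gives the positions that sort the years in ascending order
idx <- order(year_shuffled)
plot(year_shuffled[idx], pop_shuffled[idx], type = "l", main = "Sorted by year")
```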
Use scatterplots when:
Use line plots when:
Create your own line plot using the example data below.
These data represent test scores over five weeks.
Make a line plot that:
week <- c(1, 2, 3, 4, 5)
score <- c(70, 75, 78, 85, 90)
plot(
  week,
  score,
  type = "b",
  lwd = 2,
  col = "darkgreen",
  xlab = "Week",
  ylab = "Score",
  main = "Week vs. Score"
)
For this homework, you will create a scatterplot and a line plot.
You will also use at least ONE new base R plotting feature that we did not practice in class.
Choose at least ONE of the following new features (you can also find a different one if you’d like):
abline() → adds a trend line
grid() → adds background grid lines
legend() → adds a legend to your plot
(You only need to choose ONE, but you may use more if you’d like.)
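As a hedged sketch of what these three functions do, the example below reuses the week and score vectors from the practice exercise. Fitting the trend with lm() is one common way to get a line for abline(); it is an assumption here, not something the lesson prescribes, and the colors and labels are arbitrary.

```r
week <- c(1, 2, 3, 4, 5)
score <- c(70, 75, 78, 85, 90)

plot(week, score, pch = 19, xlab = "Week", ylab = "Score",
     main = "Test Scores with Trend Line")

# grid() adds background grid lines at the axis tick marks
grid()

# abline() can draw the straight line from a fitted linear model
fit <- lm(score ~ week)
abline(fit, col = "red", lwd = 2)

# legend() labels the points and the line
legend("topleft", legend = c("Observed score", "Linear trend"),
       col = c("black", "red"), pch = c(19, NA), lty = c(NA, 1), lwd = 2)
```

In the legend call, NA switches off the symbol or the line for an entry, so the first entry shows only a point and the second only a line.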
Use the mtcars dataset.
Create a scatterplot that:
Uses at least one new base R plotting feature (abline(), grid(), or legend())
head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
plot(
  mtcars$wt,
  mtcars$hp,
  xlab = "Car Weight (1000 lbs)",
  ylab = "Horsepower",
  main = "Car Weight Compared to Horsepower",
  col = "blue",
  pch = 2,
  cex = 1
)
grid(col = "pink", lwd = 3)
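Calling grid() after plot() draws the grid lines on top of the points. One way to keep the grid in the background is the panel.first argument of plot.default(), which is evaluated after the axes are set up but before the points are drawn. A sketch using the same plot:

```r
plot(
  mtcars$wt,
  mtcars$hp,
  panel.first = grid(col = "pink", lwd = 3),  # drawn before the points
  xlab = "Car Weight (1000 lbs)",
  ylab = "Horsepower",
  main = "Car Weight Compared to Horsepower",
  col = "blue",
  pch = 2,
  cex = 1
)
```

This relies on the x, y form of plot(); with other plot methods the argument may be handled differently.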
Create your own small dataset (at least 5 points). Create a line plot that:
Shows both points and lines (type = "b")
duration <- c(10, 20, 30, 40, 50, 60)
heart_rate <- c(95, 110, 125, 140, 150, 160)
plot(
  duration,
  heart_rate,
  xlab = "Minutes Exercised",
  ylab = "Avg. Heart Rate (bpm)",
  main = "Workout Duration vs. Heart Rate",
  type = "b",
  pch = 16,  # filled circles, so the points match the legend symbol
  lwd = 3,
  lty = "dashed"
)
legend(
  "topleft",
  legend = "Avg. heart rate",
  lty = "dashed",
  lwd = 3,
  pch = 16
)