Learning Objectives

By the end of this lesson, you will be able to:


Introduction

R has powerful plotting tools built directly into the base language.
Today we will learn how to create two very common types of plots:

  • Scatterplots
  • Line plots

These skills form the foundation for data visualization in R.

Key Components of Base R Plotting

High-level functions

These functions start a new graph in the plotting window. plot() is a generic function, so what it draws depends on the class of the input data.

Examples include:

  • plot()
  • hist()
  • boxplot()
  • barplot()
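
As a rough illustration of the functions listed above, the sketch below calls each one on the built-in mtcars dataset. The variable name cyl_counts is our own choice, not part of the lesson.

```r
# Each call below starts a fresh plot; all use the built-in mtcars dataset.

# Histogram: distribution of one numeric variable
hist(mtcars$mpg, main = "Distribution of MPG", xlab = "Miles per Gallon")

# Boxplot: mpg grouped by number of cylinders (formula interface)
boxplot(mpg ~ cyl, data = mtcars,
        xlab = "Cylinders", ylab = "Miles per Gallon")

# Bar plot: number of cars in each cylinder group
cyl_counts <- table(mtcars$cyl)   # cyl_counts is an illustrative name
barplot(cyl_counts, xlab = "Cylinders", ylab = "Number of cars")

# plot() on a data frame with several numeric columns draws a scatterplot matrix
plot(mtcars[, c("mpg", "hp", "wt")])
```

Running the calls one at a time makes it easier to see that each one replaces the previous plot.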

Customization parameters (graphical parameters)

Arguments can be passed to high-level functions to control appearance.

Common options include:

  • xlab, ylab – Specify labels for the x and y axes
  • main – Sets the main title of the plot
  • type – Character indicating the plot type
    • "p" for points
    • "l" for lines
    • "b" for both
  • col – Specifies colors
  • pch – Defines the plotting symbol (e.g., 19 for filled circles)
  • lwd – Controls line width
  • lty – Specifies line type (e.g., 1 = solid, 2 = dashed)
  • xlim, ylim – Define the limits for the x and y axes
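
To see several of these options working together, here is a small sketch. The vectors x and y are made-up example data, not part of the lesson.

```r
# Made-up example data (an assumption for illustration only)
x <- 1:10
y <- x^2

plot(
  x, y,
  type = "b",          # both points and lines
  pch  = 19,           # filled circles
  col  = "steelblue",
  lwd  = 2,            # thicker line
  lty  = 2,            # dashed line
  xlim = c(0, 12),     # widen the x-axis range
  ylim = c(0, 120),    # widen the y-axis range
  xlab = "x",
  ylab = "x squared",
  main = "Graphical Parameters in One Call"
)
```

Try changing one argument at a time to see what each one controls.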

Low-level functions

These functions add features to the current active plot without starting a new one.

  • points() – Adds points
  • lines() – Adds connected line segments
  • abline() – Adds a straight line
    • abline(v = value) for vertical lines
    • abline(h = value) for horizontal lines
  • legend() – Adds a legend
  • title() – Adds main, subtitle, xlab, and ylab after the plot has been created
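
The sketch below shows one way the low-level functions can build on a single high-level call. The names manual and mean_mpg are our own; in mtcars, am == 1 marks a manual transmission.

```r
# High-level call: starts the plot
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per Gallon",
     main = "Fuel Efficiency vs Car Weight")

# Low-level calls: each one adds to the plot above
manual <- mtcars[mtcars$am == 1, ]            # cars with manual transmission
points(manual$wt, manual$mpg, pch = 19, col = "red")

mean_mpg <- mean(mtcars$mpg)
abline(h = mean_mpg, lty = 2)                 # horizontal line at the mean mpg
abline(v = mean(mtcars$wt), lty = 3)          # vertical line at the mean weight

legend("topright",
       legend = c("Automatic", "Manual"),
       pch = c(1, 19), col = c("black", "red"))
```

If a low-level function is called before any plot exists, R reports an error, so the plot() call always comes first.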

Global parameters with par()

The par() function sets or queries global graphical parameters. Settings made with par() affect all subsequent plots on the current graphics device until they are changed again.
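
A common pattern, sketched here as one possible approach, is to save the old settings when changing them and restore them afterwards. The name old_par is our own.

```r
# par() returns the previous values of whatever it changes,
# so they can be restored later
old_par <- par(mfrow = c(1, 2))   # 1 row, 2 columns of plots

plot(mtcars$wt, mtcars$mpg, main = "Weight vs MPG")
hist(mtcars$mpg, main = "MPG")

par(old_par)                      # back to one plot per page
```

Restoring the settings keeps later plots from unexpectedly appearing side by side.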


Part 1: Exploring a Dataset

We will begin using a built-in dataset called mtcars.

Each row represents a car model.
Each column represents a variable such as miles per gallon, horsepower, or weight.

head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

Part 2: Scatterplots

In base R:

  • First variable → x-axis
  • Second variable → y-axis

plot(mtcars$wt, mtcars$mpg)

Adding Labels and Title

plot(
  mtcars$wt,
  mtcars$mpg,
  xlab = "Weight (1000 lbs)",
  ylab = "Miles per Gallon",
  main = "Fuel Efficiency vs Car Weight"
)

Customizing Points

Useful options:

  • pch → shape
  • col → color
  • cex → size

plot(
  mtcars$wt,
  mtcars$mpg,
  pch = 17,
  col = "#FFFF00",
  cex = 4.2,
  xlab = "Weight (1000 lbs)",
  ylab = "Miles per Gallon",
  main = "Fuel Efficiency vs Car Weight"
)

Practice Project: Scatterplots

Make a scatterplot with:

  1. Horsepower (hp) on the x-axis
  2. Miles per gallon (mpg) on the y-axis
  3. A changed color and point shape

plot(mtcars$hp, mtcars$mpg,
     pch = 69,        # codes 32-127 draw ASCII characters; 69 is the letter "E"
     col = "red",
     xlab = "Horsepower",
     ylab = "Miles per gallon"
)

Part 3: Line Plots

Line plots connect points in order.

They are best for:

  • Time series
  • Ordered data
  • Showing trends

year <- c(2000, 2005, 2010, 2015, 2020)
population <- c(282, 295, 309, 321, 331)

Basic Line Plot

plot(year, population, type = "l")

#type = "l" means line.

Adding Labels and Styling

plot(
  year,
  population,
  type = "l",
  lwd = 2,
  col = "darkgreen",
  xlab = "Year",
  ylab = "Population (millions)",
  main = "Population Growth Over Time"
)

# Overlay the data points on the line
points(year, population)

Line Plus Points

plot(
  year,
  population,
  type = "b",
  pch = 19,
  col = "purple",
  lwd = 2,
  xlab = "Year",
  ylab = "Population (millions)",
  main = "Population Growth Over Time"
)

Scatterplots vs Line Plots

Use scatterplots when:

  1. Comparing two variables
  2. Each point is independent

Use line plots when:

  1. Data are ordered
  2. Showing change over time
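
Because a line plot connects points in the order they are stored, data that are out of order produce a zigzag. The sketch below uses made-up vectors (day, value) and sorts them before plotting. It is one possible approach, not the only one.

```r
# Made-up measurements recorded out of order (an assumption for illustration)
day   <- c(3, 1, 5, 2, 4)
value <- c(12, 10, 20, 11, 15)

# order() gives the positions that sort day from smallest to largest
ord <- order(day)

# Apply the same ordering to both vectors so the pairs stay matched
plot(day[ord], value[ord], type = "b",
     xlab = "Day", ylab = "Value", main = "Sorted Before Plotting")
```

For comparison, plot(day, value, type = "b") without sorting draws the same points but connects them in the recorded order.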

Practice: Line Plots

Create your own line plot using the example data below.

These data represent test scores over five weeks.

Make a line plot that:

  1. Uses week on the x-axis
  2. Uses score on the y-axis
  3. Shows BOTH points and lines
  4. Has labeled axes
  5. Includes a title
  6. Changes the color and line width

week <- c(1, 2, 3, 4, 5)
score <- c(70, 75, 78, 85, 90)

plot(week, score,
     type = "b",
     xlab = "Week",
     ylab = "Score",
     main = "Week-Score Correlation",
     col = "blue",
     lwd = 0.9
)

Homework:

For this homework, you will create:

  • One scatterplot (Part A)
  • One line plot (Part B)

You will also use at least ONE new base R plotting feature that we did not practice in class.

Choose at least ONE of the following new features (you can also find a different one if you’d like):

  • abline()
  • grid()
  • legend()

(You only need to choose ONE, but you may use more if you’d like.)

Part A: Scatterplot

Use the mtcars dataset.

Create a scatterplot that:

  • Uses two numeric variables of your choice
  • Includes axis labels
  • Includes a title
  • Customizes point color or shape
  • Includes ONE new plotting feature (abline(), grid(), or legend())

# I could do weight vs. horsepower?

# I want to look at the relationship between weight and horsepower; horsepower will be on the x-axis and weight on the y-axis. Why? I'm not sure, maybe having weight as the vertical variable and power as the horizontal one feels right.
## The dataset itself doesn't show units, but its help page (?mtcars) lists wt as weight in 1000 lbs, so I used that in the axis label.
## https://r-charts.com/base-r/pch-symbols/ I checked the pch values with this site!
head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
plot(mtcars$hp, mtcars$wt,
     xlab = "Horsepower",
     ylab = "Weight (1000 lbs)",
     main = "Car Horsepower vs. Weight",
     pch = 18,
     col = "darkred"
)

grid()
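
grid() is one option. For comparison, here is a sketch of how abline() and legend() could be used on the same scatterplot. The straight-line fit with lm() goes beyond what we covered in class and is only an assumption about what a reference line might show. The name fit is our own.

```r
plot(mtcars$hp, mtcars$wt,
     xlab = "Horsepower", ylab = "Weight (1000 lbs)",
     main = "Car Horsepower vs. Weight",
     pch = 18, col = "darkred")

# Fit weight as a straight-line function of horsepower (not covered in class),
# then draw the fitted line on the existing plot
fit <- lm(wt ~ hp, data = mtcars)
abline(fit, lwd = 2, col = "gray40")

# One legend entry for the points, one for the line
legend("bottomright",
       legend = c("Cars", "Linear fit"),
       pch = c(18, NA), lty = c(NA, 1), lwd = c(NA, 2),
       col = c("darkred", "gray40"))
```

Using NA in pch or lty leaves that symbol out of the corresponding legend entry.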

Part B: Line Plot

Create your own small dataset (at least 5 points). Create a line plot that:

  1. Shows both points and lines (type = "b")
  2. Includes axis labels
  3. Includes a title
  4. Customizes line appearance
  5. Includes ONE new plotting feature (abline(), grid(), or legend())

timepassed <- c(5, 20, 23, 34, 44, 52)  # x-axis values, in minutes
problemsdone <- c(1, 2, 3, 4, 5, 6)     # y-axis values: problems completed

plot(timepassed, problemsdone,
     type = "b",
     xlab = "",
     ylab = "",
     pch = 9,
     lwd = 0.9
)

## I had to pass blank xlab and ylab values to plot(). Otherwise plot() labels the axes with the variable names, and the labels added by title() are drawn on top of them.

title(main = "Time Investment In Organic Chemistry Homework",
      sub = "(I completely made this data up!)",
      xlab = "Minutes Passed",
      ylab = "Problems Completed")
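
Another way to avoid overlapping labels, sketched here as an alternative rather than a required approach, is the ann argument of plot(). Setting ann = FALSE suppresses the default title and axis labels so that title() can supply all of them.

```r
timepassed   <- c(5, 20, 23, 34, 44, 52)  # minutes
problemsdone <- c(1, 2, 3, 4, 5, 6)       # problems completed

# ann = FALSE turns off the default annotation (title and axis labels)
plot(timepassed, problemsdone, type = "b", pch = 9, lwd = 0.9, ann = FALSE)

# title() now adds every label without drawing over anything
title(main = "Time Investment In Organic Chemistry Homework",
      sub  = "(I completely made this data up!)",
      xlab = "Minutes Passed",
      ylab = "Problems Completed")
```

Both approaches give the same plot; ann = FALSE just replaces the two blank label arguments with a single one.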