knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.1     ✔ stringr   1.5.2
## ✔ ggplot2   4.0.0     ✔ tibble    3.3.0
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.1.0     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
  1. Goal of This Code-Through

In this short tutorial, I walk through the basics of ggplot2, the visualization package we have been using in class.

My goal is to show how to build a plot step-by-step, change aesthetics like color and size, and create a quick bar chart.

By the end, someone new to ggplot2 should be able to recreate these plots from scratch.

I use the built-in mpg dataset from ggplot2, which contains information about vehicle fuel efficiency.

glimpse(mpg)
## Rows: 234
## Columns: 11
## $ manufacturer <chr> "audi", "audi", "audi", "audi", "audi", "audi", "audi", "…
## $ model        <chr> "a4", "a4", "a4", "a4", "a4", "a4", "a4", "a4 quattro", "…
## $ displ        <dbl> 1.8, 1.8, 2.0, 2.0, 2.8, 2.8, 3.1, 1.8, 1.8, 2.0, 2.0, 2.…
## $ year         <int> 1999, 1999, 2008, 2008, 1999, 1999, 2008, 1999, 1999, 200…
## $ cyl          <int> 4, 4, 4, 4, 6, 6, 6, 4, 4, 4, 4, 6, 6, 6, 6, 6, 6, 8, 8, …
## $ trans        <chr> "auto(l5)", "manual(m5)", "manual(m6)", "auto(av)", "auto…
## $ drv          <chr> "f", "f", "f", "f", "f", "f", "f", "4", "4", "4", "4", "4…
## $ cty          <int> 18, 21, 20, 21, 16, 18, 18, 18, 16, 20, 19, 15, 17, 17, 1…
## $ hwy          <int> 29, 29, 31, 30, 26, 26, 27, 26, 25, 28, 27, 25, 25, 25, 2…
## $ fl           <chr> "p", "p", "p", "p", "p", "p", "p", "p", "p", "p", "p", "p…
## $ class        <chr> "compact", "compact", "compact", "compact", "compact", "c…
head(mpg)
## # A tibble: 6 × 11
##   manufacturer model displ  year   cyl trans      drv     cty   hwy fl    class 
##   <chr>        <chr> <dbl> <int> <int> <chr>      <chr> <int> <int> <chr> <chr> 
## 1 audi         a4      1.8  1999     4 auto(l5)   f        18    29 p     compa…
## 2 audi         a4      1.8  1999     4 manual(m5) f        21    29 p     compa…
## 3 audi         a4      2    2008     4 manual(m6) f        20    31 p     compa…
## 4 audi         a4      2    2008     4 auto(av)   f        21    30 p     compa…
## 5 audi         a4      2.8  1999     6 auto(l5)   f        16    26 p     compa…
## 6 audi         a4      2.8  1999     6 manual(m5) f        18    26 p     compa…
  2. How ggplot2 Works

A ggplot is built using layers. The basic structure is:

ggplot(data = <DATA>, aes(x = <X>, y = <Y>)) +
  geom_*()

ggplot() tells R which dataset to use

aes() maps variables to visual properties such as position, color, and size

geom_*() adds geometric objects such as points or bars
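
Because each piece is a layer, a plot can be stored in a variable and extended with +. The sketch below illustrates this with the mpg data; the object names base and scatter are placeholders chosen here, not part of ggplot2.

```r
library(ggplot2)

# Step 1: data and aesthetic mappings only; with no geom yet,
# printing this object shows empty axes
base <- ggplot(data = mpg, aes(x = displ, y = hwy))

# Step 2: adding a geom layer returns a new plot object; `base` is unchanged
scatter <- base + geom_point()

length(base$layers)     # number of layers before adding a geom
length(scatter$layers)  # number of layers after adding geom_point()
```

Keeping the base object around makes it easy to try different geoms on the same data and mappings.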

We start with a simple scatterplot.

  3. Scatter Plot: Engine Size vs Highway MPG
ggplot(data = mpg, aes(x = displ, y = hwy)) +
  geom_point()

displ is engine displacement (engine size) in liters

hwy is highway miles per gallon

We already see that as engine size increases, highway mpg tends to decrease.
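
One way to check this visual impression is to compute the correlation and overlay a straight-line fit. This is a rough sketch: the linear fit is an assumption made here for illustration, and the relationship in the plot may not be strictly linear. The names displ_hwy_cor and trend_plot are placeholders.

```r
library(ggplot2)

# Pearson correlation between engine size and highway mpg;
# a negative value agrees with the downward pattern in the scatterplot
displ_hwy_cor <- cor(mpg$displ, mpg$hwy)
displ_hwy_cor

# Same scatterplot with a least-squares line drawn on top
trend_plot <- ggplot(data = mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
trend_plot
```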

  4. Improving the Plot: Color, Labels, and Themes
ggplot(data = mpg, aes(x = displ, y = hwy, color = class)) +
  geom_point(size = 2, alpha = 0.8) +
  labs(
    title = "Engine Size vs Highway MPG by Vehicle Class",
    x = "Engine Displacement (liters)",
    y = "Highway Miles per Gallon",
    color = "Vehicle Class"
  ) +
  theme_minimal()

Updates made:

Colored points by vehicle class

Adjusted point size and transparency

Added readable labels using labs()

Cleaned the background with theme_minimal()
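
A detail worth separating out: color = class sits inside aes() because it maps a variable to color, while size = 2 and alpha = 0.8 sit outside aes() because they are fixed values. The sketch below contrasts the two cases; the object names mapped and fixed are placeholders.

```r
library(ggplot2)

# Mapped: color varies with the `class` variable and produces a legend
mapped <- ggplot(data = mpg, aes(x = displ, y = hwy, color = class)) +
  geom_point()

# Set: every point gets the same fixed color, so there is no legend
fixed <- ggplot(data = mpg, aes(x = displ, y = hwy)) +
  geom_point(color = "steelblue")

names(mapped$mapping)  # includes "colour" (ggplot2 stores the British spelling)
names(fixed$mapping)   # only the x and y mappings
```

A common slip is writing aes(color = "steelblue"), which treats the string as a one-level variable rather than as a color.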

  5. Bar Chart: Counting Cars by Class
ggplot(data = mpg, aes(x = class)) +
  geom_bar() +
  labs(
    title = "Number of Cars in Each Vehicle Class",
    x = "Vehicle Class",
    y = "Count"
  ) +
  theme_minimal()
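
geom_bar() counts the rows in each class for us. As a sketch of what happens behind the scenes, the same chart can be drawn with the counting done explicitly, using dplyr::count() and geom_col(); class_counts is a name chosen here for illustration.

```r
library(ggplot2)
library(dplyr)

# One row per class, with the number of cars in column `n`
class_counts <- count(mpg, class)
class_counts

# geom_col() draws bars from precomputed heights instead of counting rows
ggplot(data = class_counts, aes(x = class, y = n)) +
  geom_col() +
  labs(x = "Vehicle Class", y = "Count") +
  theme_minimal()
```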

Flipping the axes for readability:

ggplot(data = mpg, aes(x = class)) +
  geom_bar(fill = "steelblue") +
  coord_flip() +
  labs(
    title = "Cars by Class (Flipped View)",
    x = "Vehicle Class",
    y = "Count"
  ) +
  theme_minimal()
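
As an alternative sketch, recent ggplot2 releases (3.3.0 and later, as far as I know) also draw horizontal bars when the variable is mapped to y, without coord_flip(). The object name horizontal_bars is a placeholder.

```r
library(ggplot2)

# Mapping class to y makes geom_bar() count along the x axis
horizontal_bars <- ggplot(data = mpg, aes(y = class)) +
  geom_bar(fill = "steelblue") +
  labs(
    title = "Cars by Class (Horizontal Bars)",
    x = "Count",
    y = "Vehicle Class"
  ) +
  theme_minimal()
horizontal_bars
```

With this version the axis labels in labs() match the axes as drawn, which can be less confusing than labeling a flipped plot.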

  6. Faceting: Comparing by Drive Type
ggplot(data = mpg, aes(x = displ, y = hwy, color = class)) +
  geom_point() +
  facet_wrap(~ drv) +
  labs(
    title = "Fuel Efficiency Patterns Across Drive Types",
    x = "Engine Displacement (liters)",
    y = "Highway MPG"
  ) +
  theme_minimal()

Faceting allows us to compare subgroups such as front-wheel, rear-wheel, and 4-wheel drive.
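
The panel titles show the raw codes stored in drv. A sketch for more readable titles uses labeller(); the label wording in drv_labels is my own, based on the codes described on the mpg help page (f = front-wheel, r = rear-wheel, 4 = four-wheel drive).

```r
library(ggplot2)

# Lookup from the codes in `drv` to panel titles (label text is an assumption)
drv_labels <- c(
  "4" = "4-wheel drive",
  "f" = "Front-wheel drive",
  "r" = "Rear-wheel drive"
)

ggplot(data = mpg, aes(x = displ, y = hwy, color = class)) +
  geom_point() +
  facet_wrap(~ drv, labeller = labeller(drv = drv_labels)) +
  theme_minimal()
```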

  7. Summary of Key Ideas

In this code-through, I demonstrated how to:

Build a plot with ggplot() and aes()

Add layers like geom_point() and geom_bar()

Customize visuals with color, size, labels, and themes

Use facet_wrap() to split one plot into multiple panels

These steps form the core of most ggplot2 visualizations we use in community analytics.

  8. Ways I Customized My RMD Document

R Markdown allows a lot of customization. Here are the improvements I used (or could have used) to make my document look better:

8.1. HTML Theme + Syntax Highlighting

In the YAML header I applied:

theme: flatly → gives the whole document a clean, professional look

highlight: zenburn → makes code blocks easier to read

These small changes immediately improve readability.

8.2. Floating Table of Contents

I added:

toc: true

toc_float: true

This gives an interactive sidebar table of contents, useful for navigation.

8.3. Clean Output Using Chunk Options

To avoid clutter, I turned off warnings and messages:

knitr::opts_chunk$set(message = FALSE, warning = FALSE)

This keeps the document clean and focused.
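
A sketch of how this could look in a setup chunk. One caveat, to my understanding: options set with opts_chunk$set() apply to the chunks that come after the call, so a library() call in the same chunk can still print its startup messages. Wrapping it in suppressPackageStartupMessages() is one way around that.

```r
library(knitr)

# Defaults for every later chunk: show code, hide messages and warnings
opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE)

# Confirm what is now stored as the chunk defaults
opts_chunk$get(c("echo", "message", "warning"))

# Silence startup messages for a package loaded in this same chunk
suppressPackageStartupMessages(library(dplyr))
```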

8.4. Consistent Plot Theme

Using theme_minimal() in every plot after the first basic scatterplot kept them visually consistent with the RMD theme.
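
Instead of adding theme_minimal() to each plot, one option is to set it once as the session default with theme_set(). A minimal sketch; old_theme is a placeholder name for the previous default that theme_set() returns.

```r
library(ggplot2)

# Make theme_minimal() the default for all later plots;
# theme_set() invisibly returns the previous default
old_theme <- theme_set(theme_minimal())

# No explicit theme layer is needed now
ggplot(data = mpg, aes(x = displ, y = hwy)) +
  geom_point()
```

Calling theme_set(old_theme) afterwards restores the earlier default.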

8.5. Using Headers, Lists, and Spacing

Readable formatting (headings, bullet points, separators) makes the code-through feel like a tutorial rather than a raw dump of code.

Reflection

I used several formatting options to make this RMD document clear and visually appealing. Choosing a clean HTML theme and syntax-highlighting style made both the text and code easier to follow. I added a floating table of contents for better navigation and used a consistent ggplot theme to match the overall look. Simple spacing, headings, and bullet points also helped the tutorial read more like an explainer than raw code. Overall, these small customizations made the code-through feel more polished and accessible.