Intro

We will focus today on R packages - what they are, how to install and use them, and why you would want to. We’ll look especially closely at a package called R Markdown. You’ll use it to produce your lab exercises, starting with this Friday’s lab exercise. Before we look at packages, though, we’ll circle back to Monday’s data exercise and look at the solutions to the tasks it required.

Why use packages?

As you saw in Monday’s lesson, functions make R do things. Many functions are included in “base R,” the basic version of R you get when you install R on a computer. Other functions, though, have to be added to R by installing a package. Packages contain “add-on” functions that extend R’s capabilities and make doing things in R easier than they would be in base R.

An example

To see what I mean, use RStudio to run this part of the fair market rent script from Monday’s lesson:

# ----------------------------------------------------------
# Install & load required packages
# ----------------------------------------------------------

if (!require("tidyverse"))
  install.packages("tidyverse")
if (!require("gt"))
  install.packages("gt")

library(tidyverse)
library(readxl)
library(gt)

# ----------------------------------------------------------
# Download HUD SAFMR Excel file
# ----------------------------------------------------------

download.file(
  "https://www.huduser.gov/portal/datasets/fmr/fmr2026/fy2026_safmrs.xlsx",
  "rent.xlsx",
  mode = "wb"
)

# ----------------------------------------------------------
# Read Excel data
# ----------------------------------------------------------

FMR <- read_xlsx(path = "rent.xlsx", .name_repair = "universal")

Among other things, the code will produce a data frame called FMR that contains current fair market rent data for all U.S. ZIP codes monitored by HUD’s Small-Area Fair Market Rent program. Click on the data frame in RStudio’s “Environment” area, and you’ll be able to scroll through 18 variables’ worth of rent-related information about 51,895 U.S. ZIP codes.

Filtering the ugly way

But we aren’t interested in all 51,895 ZIP codes. We want just the dozen ZIP codes in Rutherford County. Here’s one way to find those dozen ZIP codes and stash them in a new data frame called FMR_RuCo:

FMR_RuCo <- FMR[FMR$ZIP.Code == "37127" |
                  FMR$ZIP.Code == "37128" |
                  FMR$ZIP.Code == "37129" |
                  FMR$ZIP.Code == "37130" |
                  FMR$ZIP.Code == "37132" |
                  FMR$ZIP.Code == "37085" |
                  FMR$ZIP.Code == "37118" |
                  FMR$ZIP.Code == "37149" |
                  FMR$ZIP.Code == "37037" |
                  FMR$ZIP.Code == "37153" |
                  FMR$ZIP.Code == "37167" |
                  FMR$ZIP.Code == "37086", ]

The code works. Go ahead and run it, in fact. The code is a pain to type, though. The FMR$ZIP.Code == part gets repeated over and over, once for each ZIP code. So does the |, which means “or” in base R. The code means, “Include an FMR row of data in the new FMR_RuCo data frame only if the row’s ZIP.Code value is 37127 OR 37128 OR 37129,” and so on, up to 37086. All that repetition increases the chances of including a typo that will break the code. And what’s with that seemingly random , at the end? It separates the row condition from the column selection: what comes before the comma picks rows, and leaving the spot after it blank means “keep every column.” It’s easy to forget, and the code won’t work unless you remember to include it.
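
The square brackets in that code follow the base R pattern data[rows, columns]. Here is a minimal sketch of the pattern on a made-up three-row data frame. The name toy and its values are invented for illustration and are not part of the HUD data.

```r
# A made-up data frame (illustration only, not HUD data)
toy <- data.frame(
  ZIP.Code = c("37127", "37128", "99999"),
  Rent = c(1360, 1570, 1000),
  stringsAsFactors = FALSE
)

# Rows go before the comma, columns after it.
keep_rows <- toy$ZIP.Code == "37127" | toy$ZIP.Code == "37128"

# A blank after the comma keeps every column.
all_columns <- toy[keep_rows, ]

# Naming a column after the comma keeps only that column
# (a plain data frame hands it back as a simple vector).
rent_only <- toy[keep_rows, "Rent"]

print(all_columns)
print(rent_only)
```

Try deleting the comma from the all_columns line to see what R does without it.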

Filtering the pretty way

Let’s try a different way. To prove that it works, run this code to delete the FMR_RuCo data frame that the ugly code produced:

rm(FMR_RuCo)

… then run this code, which will redo the filtering operation and create the FMR_RuCo data frame, but with far simpler - and more intuitive - syntax.

# ----------------------------------------------------------
# Rutherford County ZIP Codes
# ----------------------------------------------------------

ZIPList <- c(
  "37127", "37128", "37129", "37130", "37132",
  "37085", "37118", "37149", "37037", "37153",
  "37167", "37086"
)

# ----------------------------------------------------------
# Filter to Rutherford County ZIP codes
# ----------------------------------------------------------

FMR_RuCo <- FMR %>%
  filter(ZIP.Code %in% ZIPList)

A package made the difference

The ugly code uses base R. The pretty code uses the filter() function, which is part of the dplyr package, one of the tidyverse packages loaded at the start of the script.
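
To check that the long and short ways of listing ZIP codes really do pick the same rows, here is a small sketch that runs on a made-up data frame (toy and its values are hypothetical). It stays in base R so it will run even on a computer where the tidyverse is not installed yet. The %in% test is the same one that filter() receives in the pretty code.

```r
# Made-up data (illustration only)
toy <- data.frame(
  ZIP.Code = c("37127", "37128", "37129", "10001", "90210"),
  Rent = c(1360, 1570, 1570, 3000, 4000)
)

ZIPList <- c("37127", "37128", "37129")

# The long way: one == comparison per ZIP code, joined with |
long_way <- toy[toy$ZIP.Code == "37127" |
                  toy$ZIP.Code == "37128" |
                  toy$ZIP.Code == "37129", ]

# The short way: %in% asks "is this value anywhere in ZIPList?"
short_way <- toy[toy$ZIP.Code %in% ZIPList, ]

# TRUE if both approaches returned exactly the same rows
same_rows <- identical(long_way, short_way)
print(same_rows)
```

With the tidyverse loaded, the short way could instead be written as toy %>% filter(ZIP.Code %in% ZIPList), which drops the brackets and the trailing comma altogether.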

The R Markdown package

More later about other packages. Let’s look at one you’ll need to start using right away. The R Markdown package is essentially a basic word processor capable of incorporating R code and output into a document, then publishing the document on the Web.

Installing R Markdown

R Markdown comes pre-installed in RStudio. If it isn’t installed for some reason, though, you can install it using the install.packages() function:

install.packages("rmarkdown")
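
If you are not sure whether the package is already there, a quick check along these lines can tell you before you install anything. This is only a sketch, and the variable names pkg and is_installed are made up for the example.

```r
# Name of the package to check
pkg <- "rmarkdown"

# requireNamespace() returns TRUE if the package can be loaded
# and FALSE if it is missing. It does not install anything.
is_installed <- requireNamespace(pkg, quietly = TRUE)

if (is_installed) {
  message(pkg, " is installed (version ",
          as.character(packageVersion(pkg)), ")")
} else {
  message(pkg, " is not installed yet. Run: install.packages(\"", pkg, "\")")
}
```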

Using R Markdown

Watch this R Markdown demo video on YouTube to learn how to create and publish a research report using R Markdown. The video’s URL is: https://youtu.be/nThLoGg8Sdg?si=5ez0d79-Snx6b95U. The code shown in the video, as well as the table produced and displayed in the report, are from a previous semester and won’t match the code and table you will be working with. But the process will be the same.

As the video explains: R Markdown is mostly a point-and-click editor. But you have to use manually typed formatting commands called “code chunk options” to control what R Markdown does with the blocks of code you include. The commands go at the top of each code chunk, inside the {r} command at the top of the chunk, and preceded by a comma.

Basically, there are three code chunk options you will need:

{r, include=FALSE} tells R Markdown to run the code in the block, but wholly in the background, without displaying the code itself or any of the output the code produces. When formatting your R Markdown as a report, this is the code to put in the first code chunk, along with the entire script you want to use. If warnings or messages ever do show up once the document is “knitted” (that is, compiled), you can suppress them by expanding the code chunk option to read: {r, include=FALSE, message=FALSE, warning=FALSE}. Note that FALSE is capitalized.
{r, echo=FALSE} tells R Markdown to run the code in the block, but display only the output from the code instead of showing both the code and the output. So, suppose you want to display a table produced, but not shown, when R Markdown ran the code in the {r, include=FALSE} code chunk. Simply put the name of the table, like FMR_RuCo_table, in a code chunk that begins with this option. The FMR_RuCo_table code will tell R Markdown to show the table, but the {r, echo=FALSE} will tell R Markdown to do it without showing the FMR_RuCo_table code. Here, again, you can suppress any warnings or messages that happen to show up by expanding the code chunk option to read {r, echo=FALSE, message=FALSE, warning=FALSE}.
Finally, {r, eval=FALSE} tells R Markdown to display the code in the chunk but do nothing else with it - that is, don’t show its output, run it, or even check it for errors. When producing a report-style R Markdown document, this is the option you use in the final code chunk, the one that displays the R code for whatever output you displayed in the report.
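
Put together, the three options give a report-style document a predictable skeleton: a hidden chunk that does the work, a chunk that shows only the table, and a final chunk that shows only the code. The sketch below writes such a skeleton to a file from R so you can see how the pieces line up. The file name skeleton.Rmd, the title, and the placeholder comments are assumptions made for illustration, not something the video prescribes.

```r
# Three backticks mark the start and end of a code chunk in R Markdown.
fence <- strrep("`", 3)

skeleton <- c(
  "---",
  "title: \"Rent script\"",
  "output: html_document",
  "---",
  "",
  # Chunk 1: runs the whole script, shows nothing
  paste0(fence, "{r, include=FALSE}"),
  "# Paste the entire script here",
  fence,
  "",
  "Here are fair market rents for Rutherford County ZIP codes.",
  "",
  # Chunk 2: shows the table, hides the code that prints it
  paste0(fence, "{r, echo=FALSE}"),
  "FMR_RuCo_table",
  fence,
  "",
  # Chunk 3: shows the code, does not run it
  paste0(fence, "{r, eval=FALSE}"),
  "# Paste the entire script here again",
  fence
)

# Write to a temporary folder so nothing in your project is overwritten
rmd_path <- file.path(tempdir(), "skeleton.Rmd")
writeLines(skeleton, rmd_path)

# Show the finished skeleton in the console
cat(readLines(rmd_path), sep = "\n")
```

The skeleton will not knit until the real script replaces the first placeholder, because FMR_RuCo_table does not exist until that script has run.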

In-class exercise

First, here is the complete basic fair market rent script from Monday, which you can copy and paste:

# ----------------------------------------------------------
# Install & load required packages
# ----------------------------------------------------------

if (!require("tidyverse"))
  install.packages("tidyverse")
if (!require("gt"))
  install.packages("gt")

library(tidyverse)
library(readxl)
library(gt)

# ----------------------------------------------------------
# Download HUD SAFMR Excel file
# ----------------------------------------------------------

download.file(
  "https://www.huduser.gov/portal/datasets/fmr/fmr2026/fy2026_safmrs.xlsx",
  "rent.xlsx",
  mode = "wb"
)

# ----------------------------------------------------------
# Read Excel data
# ----------------------------------------------------------

FMR <- read_xlsx(path = "rent.xlsx", .name_repair = "universal")

# ----------------------------------------------------------
# Rutherford County ZIP Codes
# ----------------------------------------------------------

ZIPList <- c(
  "37127", "37128", "37129", "37130", "37132",
  "37085", "37118", "37149", "37037", "37153",
  "37167", "37086"
)

# ----------------------------------------------------------
# Filter, select columns, and rename
# ----------------------------------------------------------

FMR_RuCo <- FMR %>%
  filter(ZIP.Code %in% ZIPList) %>%
  select(
    ZIP.Code,
    SAFMR.0BR,
    SAFMR.1BR,
    SAFMR.2BR,
    SAFMR.3BR,
    SAFMR.4BR
  ) %>%
  distinct()

colnames(FMR_RuCo) <- c("ZIP", "Studio", "BR1", "BR2", "BR3", "BR4")

# ----------------------------------------------------------
# Basic GT table
# ----------------------------------------------------------

FMR_RuCo_table <- gt(FMR_RuCo) %>%
  tab_header(title = "Rutherford FMR, by size and ZIP") %>%
  cols_align(align = "left")

FMR_RuCo_table

Use R Markdown, the above script, what you learned from the video, and the include=FALSE, echo=FALSE, and eval=FALSE code chunk options to create an R Markdown document that looks something like what you see below after you knit it. If you can get it to work, then Friday’s graded lab exercise will be a breeze, because all you have to do is publish the report on RPubs.com and send me the URL.

Your R Markdown-produced report should look something like this:

Rent script

(your name)
(the date)

Fair market rent in Rutherford County ZIP codes

Here are fair market rents for various apartment sizes in Rutherford County ZIP codes.

Rutherford FMR, by size and ZIP
ZIP	Studio	BR1	BR2	BR3	BR4
37037	1900	1990	2180	2790	3400
37085	1320	1380	1520	1940	2360
37086	1730	1820	1990	2540	3100
37118	1150	1170	1320	1660	2020
37127	1360	1420	1560	1990	2430
37128	1570	1640	1800	2300	2800
37129	1570	1640	1800	2300	2800
37130	1280	1340	1470	1880	2290
37132	1280	1340	1470	1880	2290
37149	1150	1180	1320	1660	2020
37153	1670	1750	1920	2450	2990
37167	1430	1500	1640	2100	2560