Introduction

The US National Park Service publishes how many people visit national parks per year. With this data, citizens can ask and answer questions such as:

The following report creates a series of charts that visualize the annual number of visitors to national parks as well as the relationship between visitor numbers and park size in an attempt to answer these questions.

The original National Parks Visitation dataset contains visitor data at all 63 national parks from 1979 to 2024. However, this report focuses on the 19 national parks located in the intermountain region from 2015 to 2024. The entire dataset was obtained from the Responsible Datasets in Context website (Walsh and Keyes, 2024).


Data Preparation

Prepare the workspace by loading numerous packages.

library(tidyverse)
library(here)
library(dplyr)
library(scales)

The National Parks Visitation data (NP_RecreationVisits.csv) is loaded here and needs minor cleaning. The column name RecreationVisits is shortened to Visitors for ease of reference. The entire dataset is filtered to only retain the national parks in the intermountain region as well as the most recent ten years of visitor data (i.e., 2015 to 2024). The Year column is changed from a double to a factor data type; this process allows the plots to treat the years as a category rather than as a continuous variable. This cleaned, subset of data is exported as a csv, has had the parks’ areas added to it in Microsoft Excel (with Esri’s area data), and then is loaded back into this report for additional analysis (intermountain_area.csv).

# Load CSV data
parks <- read_csv(here("Data", "NP_RecreationVisits.csv"))

# Shorten column name, filter to intermountain region, filter years 2015 to 2024
# Save changes to a new variable
intermountain <- rename(parks, Visits = RecreationVisits) %>%
  filter(Region == "Intermountain", Year>=2015)

# Change the Year to a factor data type and overwrite the original file
intermountain$Year <- as.factor(intermountain$Year)

# Export clean data subset by removing the comment below (#)
# write_csv(parks, "intermountain_area.csv")

# Add area data in software of choice, then...
# Load cleaned, subsetted data with area data
parkarea <- read_csv(here("Data", "intermountain_area.csv"))

Analysis

Number of Visitors to National Parks

These faceted point charts show the yearly number of visitors to each national park as a function of time (i.e., yearly from 2015 to 2024) in an attempt to answer the first question “How have visitor numbers changed over the past ten years at national parks?”.

# Create point plots showing visit numbers by year 
intermountain %>%
  ggplot(aes(x = Year, y = Visits)) + 
  geom_point() +
  
  # Create individual charts for each park based on name (wrap park name)
  facet_wrap(~ParkName, labeller = label_wrap_gen()) +
  
  # Include comma separators for the visitor count (y-axis)
  # Provides the benefit of standard notation
  scale_y_continuous(labels = scales::label_comma()) +
  
  # Use a simple pre-programmed theme
  theme_bw() +
  
  # Change the x-axis text direction to improve legibility
  theme(axis.text.x = element_text(angle = 90)) +
  
  # Label the chart
  labs(title = "Number of Visitors to National Parks",
       subtitle = "Intermountain Region, 2015 - 2024",
       caption = "Figure 1: Number of visitors to national parks. The annual
       number of visitors to national parks in the intermountain 
       region between 2015 to 2024. Data from Walsh and Keyes, 2024.",
       x = "Years",
       y = "Number of Visitors")

Analyzing the charts reveal that visitor numbers vary throughout the region over time. This data reveals other interesting insights at individual parks as well. For instance:

  • The number of visitors decreased in 2020 in almost all parks. This was likely caused by the pandemic.
  • The Grand Canyon National Park (NP) has a lower number of visitors after 2020 than before 2020. This could be caused by visitor management strategies like timed-entry.
  • Yellowstone NP and Grand Teton NP had dips in visitors in 2022 unlike the other parks. Additional analysis is needed to determine a cause.

Statistics of National Park Visits

This chart attempts to answer the second question “How do visitor numbers compare between national parks?” Presented here is a series of box-and-whisker plots that relay the relative statistics for each national park. A point plot of the yearly number of visitors is overlaid on the respective box-and-whisker plot.

# Create box-and-whisker and point plots of visit levels by park and by year
intermountain %>%
  ggplot(aes(x = ParkName, y = Visits)) + 
  geom_boxplot() +
  
  # Overlay the points on the boxplot, color by year for additional variable
  geom_point(aes(x = ParkName, y = Visits, color = Year)) + 
  
  # Flip axes to improve legibility
  coord_flip()+
  
  # Use a simple pre-programmed theme
  theme_bw() +
  
  # Include comma separators for the visitor count (y-axis)
  scale_y_continuous(labels = scales::label_comma()) +
  
  # Change color scheme to improve legibility
  scale_color_brewer(palette = "Paired") +
  
  # Add labels
  labs(title = "Statistics of National Park Visits",
     subtitle = "Intermountain Region, 2015 - 2024",
     caption = "Figure 2: Statistics of national park visits. The statistics 
     of the annual number of visitors to intermountain national parks 
     between 2015 to 2024. Data from Walsh and Keyes, 2024.",
     x = "National Park",
     y = "Number of Annual Visitors")

# Save this chart by removing the comment below (#)
# ggsave(here("output","NP Visitor Statistics.png"))

Although this chart shows similar data to the prior chart, this one better distinguishes the minimum and maximum number of visitors by year and also facilitates comparison between the parks. For instance:

  • The years 2020 and 2015 tend to have the minimum number of visitors at the majority of these parks. However, Yellowstone NP and Grand Teton NP had the lowest number of visitors in 2022.
  • 2021 is a common year for parks to have their maximum number of visitors.
  • The Grand Canyon NP had the largest range of visitors during the time frame.
  • The number of visitors is relatively stable at some parks, such as Big Bend NP, Black Canyon of the Gunnison NP, and Guadalupe Mountains NP, as evident by their small interquartile ranges.
  • The number of visitors has greater variation at national parks such as Yellowstone NP, Zion NP, and Grand Canyon NP, according to their larger interquartile ranges.

Comparing the Park Area to Its Visitors

The scatterplot attempts to answer the question “Is there a correlation between the park’s area and its average number of visitors?” The parks’ areas are measured in square miles. In the dataset, each park’s area is repeated multiple times (one instance per year), hence the mean calculation in the code below. The average number of visitors is calculated by averaging each park’s visitors per year during the ten-year study. The correlation is modeled with a linear regression line (blue) and its confidence interval (gray). The number of park visitors does not appear correlated to the park’s area because of the lack of data points alongside the linear model or within the confidence interval.

# Create scatter plot of mean visitors and park area
parkarea %>%
  
  # Prepare data by grouping park name and summarizing each park's data
  group_by(ParkName) %>%
  summarise(meanvis = mean(Visits),
            meanarea = mean(Area_mi2)) %>%
  
  # Create scatter plot of mean values
  ggplot(aes(x= meanarea, y = meanvis)) +
  geom_point(aes(color=ParkName))+
  
  # Add linear regression line
  geom_smooth(method = "lm") + 
  
  # Include comma separators for legibility and standard notation
  scale_y_continuous(labels = scales::label_comma()) +
  scale_x_continuous(labels = scales::label_comma()) +
  
  # Use a simple pre-programmed theme
  theme_bw() +
  
  # Add labels
  labs(title = "Comparing the Park's Area to its Average \nNumber of Visitors",
     subtitle = "Intermountain Region, 2015 - 2024",
     caption = "Figure 3: Linear correlation between the park's area and the
     average number of annual visitors. The chart only captures intermountain 
     national parks and their visitors between 2015 to 2024. Area measured is 
     in square miles. Data from Walsh and Keyes, 2024.",
     x = "Area in square miles",
     y = "Average number of visitors")


Source

Esri. (2026). USA Parks. [Data]. Retrieved March 5, 2026 from https://www.arcgis.com/home/item.html?id=f092c20803a047cba81fbf1e30eff0b5.

Walsh, M., and Keyes, Os. 2024. U.S. National Park Visit Data (1979-2024) from Responsible Datasets in Context. [Data]. Retrieved March 16, 2026 from https://www.responsible-datasets-in-context.com/posts/np-data/.