Authors

Li-Neishin Co, Beth Gronski, Linh Pham, and Matthew Shull

Date

Winter 2022

Abstract

Our project analyzes public campgrounds in the United States. This is important because people who want to camp in the United States do not have an easily accessible place to see data on campgrounds. This report includes several data visualizations as well as hypothesis testing and machine learning classification models.

Keywords

campsites, amenities, location, town_distance, phone

Introduction

Our project will be analyzing information from a United States campground dataset through data visualization, hypothesis testing, and classification modeling. Below are some examples of questions that we will be answering.

  1. Are campgrounds with missing data more likely to be smaller or located in more rural areas?
  2. Which states have significantly more campgrounds than others?
  3. Does a certain area have campgrounds with more amenities?
  4. How does the number and type of amenities correlate with distance from town?

There is a large number of parks and campsites in the United States and, along with that, a large group of people who love the outdoors. The questions we chose are meant to inform people who enjoy camping as well as people who are interested in the relationship between a campsite and its location. We also selected these questions based on what we were personally curious about. As of now, the only resource that presents this campsite information was created by the author of the dataset. That resource is a color-coded map, with no analysis of how the variables in the data relate to each other. Overall, our plan to analyze the dataset in multiple ways will help people interested in camping as well as students in the classroom who want to understand the data.

The Dataset

Where did you find the data? Please include a link to the data source

Who collected the data?

How was the data collected or generated?

Why was the data collected?

How many observations (rows) are in your data?

How many features (columns) are in the data?

What, if any, ethical questions or questions of power do you need to consider when working with this data?

What are possible limitations or problems with this data? (at least 200 words)

Implications

Our analysis of how the features in this dataset relate to each other has several possible implications for policymakers, technologists, and designers. Policymakers may use our analysis to estimate how much federal funding should go to campsites, taking into account their location, available amenities, and whether they are in state or national parks. They can also use it to guide development projects that improve campgrounds in states with few campsites and so encourage outdoor activities. Designers can use the dataset and our analysis as a base and collect additional data to produce a more accurate and thorough analysis. Campground designers in particular may use our work when deciding on locations and which amenities to offer. Technologists, on the other hand, may want to build applications or software that alert users to nearby campgrounds and recommend them, especially in states with few available and accessible campsites.

Limitations & Challenges

Examining data about campsites and their amenities can be difficult for various reasons. Two prominent challenges are that little research has been conducted on campsites, and that most research on amenities and campsite quality comes from surveying campers. Because prior research is limited, there is not much of a jumping-off point, so we may find it challenging to identify appropriate, focused questions about camping. Furthermore, because most research on quality is done through surveys, there are concerns about accuracy and coverage. It is hard to determine whether the surveys reached a large portion of the camping population and whether they covered several different campsites. The relatively small number of people surveyed and the small body of existing research can make it difficult to narrow our analysis and provide the best information for people who camp in the US.

Summary Information

This campground data includes 11,408 campgrounds from the United States, categorized into different types. The most common type of campground is National Forest (NF), with 3,823 camps. From this data we are able to learn about the different campground types and how campgrounds differ by state and by city. For example, the state with the most camps is CA with 1,195 camps. The city with the most camps is Stanley, ID with 35 camps. The camp that is farthest from civilization is Lower Chiquito in Fresno, CA.
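
As a rough illustration of how summary figures like these can be derived, the sketch below counts types, states, and cities and finds the campground farthest from a town. The records and the field names (`name`, `type`, `state`, `town`, `town_distance`) are hypothetical stand-ins, not the dataset's actual columns or values.

```python
from collections import Counter

# Hypothetical sample records; the field names and values are assumptions
# made for illustration and are not taken from the real dataset.
campgrounds = [
    {"name": "Camp A", "type": "NF", "state": "CA", "town": "Fresno", "town_distance": 32.5},
    {"name": "Camp B", "type": "NF", "state": "ID", "town": "Stanley", "town_distance": 4.0},
    {"name": "Camp C", "type": "SP", "state": "ID", "town": "Stanley", "town_distance": 1.2},
    {"name": "Camp D", "type": "CP", "state": "CA", "town": "Bishop", "town_distance": 7.8},
    {"name": "Camp E", "type": "NF", "state": "CA", "town": "Oakhurst", "town_distance": 12.0},
    {"name": "Camp F", "type": "NP", "state": "CA", "town": "Mammoth Lakes", "town_distance": 3.1},
]


def summarize(records):
    """Return headline counts for a list of campground records."""
    type_counts = Counter(r["type"] for r in records)
    state_counts = Counter(r["state"] for r in records)
    # Key cities by (town, state) so same-named towns in different states stay separate.
    city_counts = Counter((r["town"], r["state"]) for r in records)
    farthest = max(records, key=lambda r: r["town_distance"])
    return {
        "total": len(records),
        "top_type": type_counts.most_common(1)[0],
        "top_state": state_counts.most_common(1)[0],
        "top_city": city_counts.most_common(1)[0],
        "farthest": farthest["name"],
    }


summary = summarize(campgrounds)
```

Keying cities by both town and state is a small design choice that avoids merging towns that share a name across states.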

Table: Aggregate Campground Data by State

| State | Number of Camps | Majority Type | Number of Majority | Proportion of Majority | Number of Camps with Hookups | Proportion of Camps with Hookups |
|---|---|---|---|---|---|---|
| AK | 84 | State Recreational Area | 41 | 0.49 | 65 | 0.77 |
| AL | 110 | Utility | 2 | 0.02 | 85 | 0.77 |
| AR | 225 | SP | 30 | 0.13 | 159 | 0.71 |
| AZ | 243 | State Recreational Area | 1 | 0.00 | 59 | 0.24 |
| CA | 1195 | Utility | 19 | 0.02 | 227 | 0.19 |
| CO | 489 | State Recreational Area | 1 | 0.00 | 102 | 0.21 |
| CT | 16 | State Forest | 2 | 0.12 | 6 | 0.38 |
| DE | 7 | SP | 5 | 0.71 | 5 | 0.71 |
| FL | 165 | State Forest | 15 | 0.09 | 130 | 0.79 |
| GA | 164 | Utility | 6 | 0.04 | 129 | 0.79 |
| HI | 21 | State Recreational Area | 5 | 0.24 | 21 | 1.00 |
| IA | 493 | State Recreational Area | 7 | 0.01 | 397 | 0.81 |
| ID | 469 | Utility | 7 | 0.01 | 94 | 0.20 |
| IL | 159 | State Recreational Area | 1 | 0.01 | 130 | 0.82 |
| IN | 106 | State Recreational Area | 10 | 0.09 | 89 | 0.84 |
| KS | 212 | SP | 36 | 0.17 | 176 | 0.83 |
| KY | 129 | Utility | 1 | 0.01 | 85 | 0.66 |
| LA | 84 | Utility | 2 | 0.02 | 58 | 0.69 |
| MA | 34 | Utility | 1 | 0.03 | 14 | 0.41 |
| MD | 45 | State Forest | 4 | 0.09 | 30 | 0.67 |
| ME | 32 | SP | 17 | 0.53 | 9 | 0.28 |
| MI | 480 | Utility | 9 | 0.02 | 261 | 0.54 |
| MN | 387 | State Recreational Area | 2 | 0.01 | 261 | 0.67 |
| MO | 176 | State Forest | 13 | 0.07 | 129 | 0.73 |
| MS | 133 | Utility | 3 | 0.02 | 105 | 0.79 |
| MT | 463 | State Forest | 103 | 0.22 | 55 | 0.12 |
| NC | 103 | State Recreational Area | 12 | 0.12 | 48 | 0.47 |
| ND | 239 | State Recreational Area | 3 | 0.01 | 182 | 0.76 |
| NE | 259 | Utility | 3 | 0.01 | 191 | 0.74 |
| NH | 44 | State Forest | 1 | 0.02 | 10 | 0.23 |
| NJ | 26 | State Forest | 9 | 0.35 | 10 | 0.38 |
| NM | 205 | State Forest | 1 | 0.00 | 46 | 0.22 |
| NV | 88 | State Recreational Area | 3 | 0.03 | 23 | 0.26 |
| NY | 146 | SPR | 43 | 0.29 | 67 | 0.46 |
| OH | 108 | Utility | 9 | 0.08 | 86 | 0.80 |
| OK | 239 | State Forest | 1 | 0.00 | 211 | 0.88 |
| OR | 687 | Utility | 3 | 0.00 | 178 | 0.26 |
| PA | 112 | Utility | 9 | 0.08 | 84 | 0.75 |
| RI | 8 | SP | 5 | 0.62 | 4 | 0.50 |
| SC | 68 | State Forest | 1 | 0.01 | 55 | 0.81 |
| SD | 165 | State Recreational Area | 35 | 0.21 | 120 | 0.73 |
| TN | 126 | Utility | 1 | 0.01 | 87 | 0.69 |
| TX | 421 | Utility | 26 | 0.06 | 305 | 0.72 |
| UT | 344 | SP | 39 | 0.11 | 42 | 0.12 |
| VA | 102 | SP | 22 | 0.22 | 69 | 0.68 |
| VT | 43 | SP | 33 | 0.77 | 4 | 0.09 |
| WA | 544 | Utility | 3 | 0.01 | 161 | 0.30 |
| WI | 294 | State Recreational Area | 1 | 0.00 | 185 | 0.63 |
| WV | 90 | State Forest | 24 | 0.27 | 44 | 0.49 |
| WY | 223 | State Recreational Area | 1 | 0.00 | 28 | 0.13 |

This table shows aggregate information on campgrounds in each state. It is ordered alphabetically by state and includes numerical information such as how many campgrounds are in each state and how many of those campgrounds have RV hookups. This is interesting to see because many states have campground counts in the hundreds, while a few have fewer than ten. The table also reports which type of campground is listed as the majority type in each state, how many of the state's campgrounds are that type, and what proportion of the state's campgrounds that represents. We have included this table to display campground statistics by state more easily and to show how they differ from state to state. The table shows that in some states a large share of campgrounds are of the same type, while others have a diverse array of campground types. It also shows that in the majority of states more campgrounds have RV hookups than do not, although in a few states very few campgrounds have them.
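
A per-state table of this shape can be built with a simple group-and-count pass. The sketch below is one possible approach and not necessarily how the table above was produced. It assumes each record carries a state, a type code, and a hypothetical boolean `has_hookups` flag, and it reports the most common type in each state.

```python
from collections import Counter, defaultdict

# Hypothetical (state, type, has_hookups) records used only for illustration.
records = [
    ("DE", "SP", True),
    ("DE", "SP", True),
    ("DE", "CP", False),
    ("VT", "SP", False),
    ("VT", "SP", False),
    ("VT", "SF", True),
    ("VT", "SP", False),
]


def aggregate_by_state(rows):
    """Group campground rows by state and compute per-state counts and proportions."""
    by_state = defaultdict(list)
    for state, camp_type, has_hookups in rows:
        by_state[state].append((camp_type, has_hookups))

    table = []
    for state in sorted(by_state):  # alphabetical order by state
        camps = by_state[state]
        total = len(camps)
        # Most common type in this state and how many camps have it.
        top_type, top_count = Counter(t for t, _ in camps).most_common(1)[0]
        hookups = sum(1 for _, h in camps if h)
        table.append({
            "state": state,
            "camps": total,
            "top_type": top_type,
            "top_count": top_count,
            "top_prop": round(top_count / total, 2),
            "hookups": hookups,
            "hookup_prop": round(hookups / total, 2),
        })
    return table


state_table = aggregate_by_state(records)
```

Each proportion is the corresponding count divided by the state's total number of camps, rounded to two decimals to match the table's format.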

Chart 1: Distance

The violin graph above shows the distribution of distances between each public campground and its nearest town. Each observation in the dataset represents a public campsite, and the distance from the nearest town is recorded for each row as a number in miles, ranging from 0.0 to 50.8. The distribution of these distances was then computed across all public campgrounds. The wider the violin at a given distance, the more data points there are at that distance. The mean is shown as the white dot, and the line through that dot spans one standard deviation above and below the mean.

We include this graph because one of the major appeals of a campground is how accessible it is from a town. This accessibility is not only essential for the safety of campers but may also help improve the economic well-being of the nearby town. As the violin graph illustrates, the majority of public campgrounds in the United States are less than 20 miles from a town or similar urban area. As the distance from the nearest town increases, the number of campsites at that distance decreases. Across all campsites in this dataset, the average distance from the closest town is around 10 miles. However, it is worth remembering that the campgrounds included in this dataset are all public, vehicle-accessible, and family-oriented. Those three factors could influence where the campgrounds are located in terms of accessibility to towns or nearby urban areas.
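
The quantities discussed for this chart (the mean, a band of one standard deviation around it, and the share of campgrounds under a distance threshold) can be computed directly. The sketch below uses hypothetical distances. It also uses the sample standard deviation, which is an assumption, since the report does not say which form the plot used.

```python
import statistics

# Hypothetical town distances in miles; not values from the real dataset.
distances = [0.0, 2.5, 4.0, 6.0, 8.5, 10.0, 12.0, 15.0, 22.0, 50.8]


def describe(values, threshold=20.0):
    """Return the mean, a one-standard-deviation band, and the share below a threshold."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (assumed)
    share_within = sum(1 for v in values if v < threshold) / len(values)
    return {
        "mean": mean,
        "low": mean - sd,    # lower end of the band around the mean
        "high": mean + sd,   # upper end of the band around the mean
        "share_within": share_within,
    }


stats = describe(distances)
```

Because the distances are right-skewed, with a few campgrounds very far from town, the lower end of the band can fall below zero even though no distance is negative.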

Chart 2: Type of Campsites

I chose a bar chart because it clearly shows the number of campsites of each type in the dataset. The question I wanted to answer with this chart was "What is the distribution of the different types of campsites?" Although there are many more complicated charts, a bar chart shows viewers the approximate count for each type while also letting them compare the types of campsites with one another.

The chart shows a large number of National Forest (NF), County Park (CP), and State Park (SP) campsites. Other types of campsites exist, but these three cover the vast majority of campsites in the United States. This makes sense, since they correspond to the federal, local, and state levels, respectively.
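
To check how much of the data the three most common types cover, one could count the type codes and sum the top three. The sketch below does this on a hypothetical list of type codes and also renders a plain-text bar for each type. The codes and counts are illustrative only.

```python
from collections import Counter

# Hypothetical type codes, one per campsite; counts are made up for illustration.
types = ["NF"] * 5 + ["CP"] * 4 + ["SP"] * 3 + ["NP"] + ["COE"]


def top_share(type_list, k=3):
    """Return the k most common types and the fraction of all sites they cover."""
    counts = Counter(type_list)
    top = counts.most_common(k)
    share = sum(n for _, n in top) / len(type_list)
    return top, share


def text_bars(counts):
    """Render (label, count) pairs as simple text bars, one '#' per campsite."""
    return [f"{label:<4}{'#' * n} {n}" for label, n in counts]


top_three, share = top_share(types)
bars = text_bars(top_three)
```

Here `share` is the fraction of all sites that belong to the three most common types, which is the figure behind a claim that a few types cover most of the data.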

Chart 3: Number of Campsites per U.S. State

The heatmap shows the number of campsites in each U.S. state. Each observation (campsite) was placed using its longitude and latitude and grouped by state, so we can see visually where public campsites are concentrated in the US and how states compare. According to the legend, the lighter the shade of blue, the closer the number of campsites is to 160, while a darker shade of blue indicates a smaller number of campsites in that state.
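
The legend described above amounts to a mapping from a state's campsite count to a shade. The sketch below shows one simple way to express such a mapping: a linear scale in which larger counts get lighter values. The function name, the lightness range, and the counts are all hypothetical, and the actual chart may have used a different scale.

```python
def lightness_for_count(count, lo, hi, darkest=0.2, lightest=0.9):
    """Linearly map a count in [lo, hi] to a lightness value; larger counts are lighter."""
    if hi == lo:
        return lightest  # avoid division by zero when all counts are equal
    fraction = (count - lo) / (hi - lo)
    fraction = min(1.0, max(0.0, fraction))  # clamp counts outside the range
    return darkest + fraction * (lightest - darkest)


# Hypothetical per-state counts used only for illustration.
state_counts = {"State A": 10, "State B": 60, "State C": 160}
lo, hi = min(state_counts.values()), max(state_counts.values())
shades = {s: lightness_for_count(n, lo, hi) for s, n in state_counts.items()}
```

Clamping matters when the legend's range is narrower than the data: any state above the top of the scale then receives the lightest shade and cannot be distinguished from other states above it.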

Evaluating the graph, we can see that California has the highest number of campsites of any state. The farther east we go in the US, the darker the states become and the fewer campsites are found, which could be due to a variety of reasons such as urbanization, location, and how family friendly an area is. It is also worth noting that the creator of the dataset, Tom Hillegass, chose to include only public campgrounds in the US and not private campgrounds. Including private campgrounds could change the colors shown on this heatmap and the number of campsites shown for each state.