Import data

# Packages for import, wrangling, and plotting
library(tidyverse)
library(readxl)

# Import the Excel file
data <- read_excel("../00_data/myData.xls")
data
## # A tibble: 10,992 × 11
##     ...1 city          country description location state state_abbrev longitude
##    <dbl> <chr>         <chr>   <chr>       <chr>    <chr> <chr>        <chr>    
##  1     1 Ada           United… "Ada witch… Ada Cem… Mich… MI           -85.5048…
##  2     2 Addison       United… "A little … North A… Mich… MI           -84.3818…
##  3     3 Adrian        United… "If you ta… Ghost T… Mich… MI           -84.0356…
##  4     4 Adrian        United… "In the 19… Siena H… Mich… MI           -84.0175…
##  5     5 Albion        United… "Kappa Del… Albion … Mich… MI           -84.7451…
##  6     6 Albion        United… "A mysteri… Riversi… Mich… MI           -84.7530…
##  7     7 Algoma Towns… United… "On a wind… Hell's … Mich… MI           NA       
##  8     8 Algonac       United… "Morrow Ro… Morrow … Mich… MI           -82.5762…
##  9     9 Allegan       United… "People re… Elks Lo… Mich… MI           -85.8415…
## 10    10 Allegan       United… "Various g… The Gri… Mich… MI           -85.8575…
## # ℹ 10,982 more rows
## # ℹ 3 more variables: latitude <chr>, city_longitude <chr>, city_latitude <chr>

State one question

Are hauntings reported more often in certain states, and does the number of reports vary with the longitude and latitude of the location?

Plot data

ggplot(data = data) + geom_bar(mapping = aes(x = state)) + coord_flip()
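
The ranking can also be read off directly instead of from the bar heights by tabulating and sorting the counts per state. The following is a minimal sketch in base R, assuming one row per reported haunting; `toy` is a made-up stand-in for `data`, not values from the real file.

```r
# Hypothetical stand-in for `data`: one row per reported haunting
toy <- data.frame(
  state = c("Michigan", "Texas", "California", "Texas",
            "California", "California"),
  stringsAsFactors = FALSE
)

# Count rows per state and sort from most to fewest reports
state_counts <- sort(table(toy$state), decreasing = TRUE)
state_counts
```

Replacing `toy$state` with `data$state` should give the same ordering that the bar chart displays.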

data_clean <- data %>%
    # Remove rows with a missing value in any column
    na.omit() %>%
    # Coordinates were imported as character, so convert them to numeric
    mutate(latitude = as.numeric(latitude),
           longitude = as.numeric(longitude)) %>%
    # Keep only points inside a bounding box around the mainland U.S.
    filter(latitude >= 24.396308 & latitude <= 49.384358,
           longitude >= -125.0 & longitude <= -66.93457)
# Packages for spatial data and interactive maps
library(sf)
library(mapview)
# Convert the cleaned data to an sf point object for mapping (CRS 4326 = WGS 84)
mymap <- st_as_sf(data_clean, coords = c("longitude", "latitude"), crs = 4326)
# Create the map using mapview
mapview(mymap,
        cex = 1,     # customize marker size
        alpha = 0.1) # control transparency
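
To make the effect of the bounding-box filter concrete, here is a small sketch in base R. The data frame `toy` and its approximate coordinates are made up for illustration. It shows that points in Hawaii and Alaska, and rows with missing coordinates, fall outside the box and are dropped.

```r
# Hypothetical points; coordinates stored as text, as in the imported data
toy <- data.frame(
  place     = c("Ada, MI", "Honolulu, HI", "Anchorage, AK", "Unknown"),
  latitude  = c("42.96", "21.31", "61.22", NA),
  longitude = c("-85.50", "-157.86", "-149.90", NA),
  stringsAsFactors = FALSE
)

# Convert text to numbers (NA stays NA)
toy$latitude  <- as.numeric(toy$latitude)
toy$longitude <- as.numeric(toy$longitude)

# TRUE only for points with known coordinates inside the bounding box
in_box <- !is.na(toy$latitude) & !is.na(toy$longitude) &
  toy$latitude  >= 24.396308 & toy$latitude  <= 49.384358 &
  toy$longitude >= -125.0    & toy$longitude <= -66.93457

toy_clean <- toy[in_box, ]
toy_clean$place
```

A rectangular box is only an approximation of the mainland U.S. border, so it is a convenient filter rather than an exact one.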

Interpret

Hauntings are reported most often in California and Texas. The bar chart shows California with the highest count and Texas with the second highest. This is most likely because these are the two most populous states, so they would be expected to generate more reports. The map of longitude and latitude also shows reports concentrated in Texas and California, but in general there seem to be more reports on the east coast than on the west coast.
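
One way to check the population explanation would be to compare reports per resident rather than raw counts. The sketch below is only an illustration under stated assumptions. The `reports` vector is made up (it is not taken from the data set), and the populations come from R's built-in `state.x77` matrix, which holds 1975 estimates in thousands, so current figures would be needed for a real analysis.

```r
# Hypothetical report counts per state (NOT taken from the data set)
reports <- c(California = 1000, Texas = 700, Michigan = 500)

# Built-in 1975 population estimates, in thousands
pop_thousands <- state.x77[names(reports), "Population"]

# Reports per 100,000 residents
rate <- reports / (pop_thousands * 1000) * 1e5
sort(rate, decreasing = TRUE)
```

If large states still ranked highest after this adjustment, population alone would not explain the pattern.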