About the data

This dataset, from the National Weather Service, contains weather forecasts and observations for 167 American cities over 16 months, beginning in January 2021, along with information about those cities. Forecasts were recorded at several lead times before each observation was made, but I chose to focus on the 12-hour forecast.

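library(tidyverse)  # read_csv(), dplyr verbs, ggplot2
library(lubridate)  # ymd(), month()
library(ggthemes)   # theme_map(); assumed source (cowplot also exports one)

forecast_cities <- read_csv("data/forecast_cities.csv")
outlook_meanings <- read_csv("data/outlook_meanings.csv")
weather_forecasts <- read_csv("data/weather_forecasts.csv")
states <- map_data("state")  # requires the maps package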

Comparing states by month

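I chose to compare the difference between the high temperature predicted in the 12-hour forecast and the observed high temperature, both across states and for each month of the year. The difference is computed as observed minus forecast, so a positive value means the day was warmer than predicted.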

weather_cities <- weather_forecasts %>%
  mutate(date = ymd(date)) %>%
  # month() drops the year, so a calendar month seen in both years is pooled
  mutate(month = month(date, label = TRUE)) %>%
  left_join(forecast_cities, by = c("city", "state")) %>%
  # keep only 12-hour-ahead forecasts of the daily high
  filter(forecast_hours_before == 12, high_or_low == "high") %>%
  group_by(state, month) %>%
  summarize(
    mean_temp_diff = mean(observed_temp - forecast_temp, na.rm = TRUE),
    .groups = "drop"
  )
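
Because the data span 16 months, grouping on `month(date, label = TRUE)` pools any calendar month that occurs in both years into one group. A minimal sketch of an alternative that keeps each year-month separate, using `lubridate::floor_date()`. The `toy` tibble and its values are made up for illustration; only the column names mirror `weather_forecasts`.

```r
library(dplyr)
library(lubridate)

# Made-up rows for illustration only; not taken from the real data
toy <- tibble(
  date = ymd(c("2021-02-10", "2022-02-10", "2021-07-04")),
  state = c("NC", "NC", "NC"),
  observed_temp = c(50, 58, 90),
  forecast_temp = c(48, 55, 92)
)

# floor_date() maps every date to the first day of its month,
# so February 2021 and February 2022 stay in separate groups
by_year_month <- toy %>%
  mutate(year_month = floor_date(date, unit = "month")) %>%
  group_by(state, year_month) %>%
  summarize(
    mean_temp_diff = mean(observed_temp - forecast_temp, na.rm = TRUE),
    .groups = "drop"
  )
```

Whether pooling or separating is preferable depends on the question: pooling gives one map per calendar month, while separating would show whether the same month behaved differently in the two years.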

# convert state abbreviations to lowercase full names, matching map_data("state")
state_names <- tibble(
  state = state.abb,
  name = tolower(state.name)
)

weather_cities <- weather_cities %>%
  left_join(state_names, by = "state")
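
One thing worth checking after this join: base R's `state.abb` covers only the 50 states, so any other abbreviation in the data (for example a district or territory) would get `name = NA` and would not be drawn on the map. A small sketch of that check with `anti_join()`. The `example_states` tibble is hypothetical, and whether the real data contains such abbreviations is an assumption to verify.

```r
library(dplyr)

# Lookup built from base R; state.abb covers the 50 states only
state_names <- tibble(state = state.abb, name = tolower(state.name))

# Hypothetical abbreviations for illustration; "DC" and "PR" are not in state.abb
example_states <- tibble(state = c("NY", "CA", "DC", "PR"))

# Rows with no match in the lookup would end up with name = NA after left_join()
unmatched <- example_states %>%
  anti_join(state_names, by = "state")
```

Running the same `anti_join()` on the distinct `state` values of `weather_cities` would list any regions silently dropped from the plot.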

ggplot(data = weather_cities) + 
  geom_map(aes(
    map_id = name, 
    fill = mean_temp_diff), 
    map = states,
    color = "white",
    lwd = 0.1) +
  expand_limits(x = states$long, y = states$lat) +
  labs(x = "longitude",
       y = "latitude",
       title = "Mean Difference Between Predicted and Observed High Temperatures",
       fill = "Mean Difference\n(Obs. - Pred.)") +
  colorspace::scale_fill_continuous_diverging(palette = "Blue-Red 3") +
  facet_wrap(vars(month)) +
  coord_map() +
  theme_map() +
  theme(legend.position = "right")

Results

At first glance, observed high temperatures tend to be higher on average than what was predicted. This is true across the country, but particularly so in the interior (i.e., away from the coasts). Colder months like January and February were warmer than expected in the majority of states, as were the months from September to November. Warm months like July were cooler than expected in most states, though this pattern is not as clear in other months.
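
The "majority of states" reading of the maps can be backed by a number. A minimal sketch, assuming a summary table with the same columns as `weather_cities`: for each month, compute the share of states whose mean difference is positive. The `toy_summary` values are made up for illustration and are not results from the data.

```r
library(dplyr)

# Made-up summary rows for illustration; same columns as weather_cities
toy_summary <- tibble(
  state = c("NC", "NC", "KS", "KS"),
  month = c("Jan", "Jul", "Jan", "Jul"),
  mean_temp_diff = c(0.8, -0.3, 1.5, 0.2)
)

# mean() of a logical vector gives the proportion of TRUE values,
# here the share of states that were warmer than forecast
share_warmer <- toy_summary %>%
  group_by(month) %>%
  summarize(
    share_states_warmer = mean(mean_temp_diff > 0),
    median_diff = median(mean_temp_diff),
    .groups = "drop"
  )
```

A share well above 0.5 in a given month would support the claim that most states ran warmer than forecast in that month.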

Differences between coastal and landlocked regions of the U.S. may reflect the more volatile weather patterns of the interior of the country, compared with the more stable patterns of the coasts. In addition, forecasts may be more likely to underestimate the actual temperature as climate change accelerates, though confirming this would require comparison with data from earlier years.

Comparing U.S. Temperature Forecast with Observation by Month

Elena Parkerson
