1 Introduction

The data used in this analysis came from the National Weather Service and includes sixteen months of weather forecasts and observations from 167 cities. Additionally, a separate data set provides geographical and climatic details about these cities, and other American cities. The analysis utilizes data from the following three files: forecast_cities.csv (city-specific information), outlook_meanings.csv (forecast outlook reference codes), weather_forecasts.csv (forecasted and observed weather data). The outlook_meanings.csv file serves as a reference for interpreting forecast outlook codes used in the weather dataset. These three datasets were joined together using dplyr join functions, and temperature error (temp_error) was computed for each observation to quantify forecast accuracy.

library(dplyr)
library(readr)
library(ggplot2)
library(tidyverse)
library(viridis)

weather_forecasts <- read_csv("data/weather_forecasts.csv")
forecast_cities <- read_csv("data/forecast_cities.csv")
outlook_meanings <- read_csv("data/outlook_meanings.csv")

weather_analysis <- weather_forecasts |> 
    left_join(forecast_cities, by = c("city", "state")) |> 
    left_join(outlook_meanings, by = "forecast_outlook") |>
    mutate(temp_error = forecast_temp - observed_temp) |>  # Calculate error
    select(temp_error, forecast_hours_before, high_or_low,
           lon, lat, elevation, distance_to_coast,
           elevation_change_eight, wind, avg_annual_precip, meaning) |> 
    relocate(temp_error) #new element for the new element requirement

2 Geographic Distribution of Forecast Errors

library(maps)
library(dplyr)
library(ggplot2)

# Get state boundaries
states <- map_data("state")

# Function to create map of forecast errors
create_error_plot <- function(data, temp_type) {
  # Filter data for temperature type
  plot_data <- data %>%
    filter(high_or_low == temp_type)
  
  max_error <- max(abs(plot_data$temp_error), na.rm = TRUE)
  
  ggplot() +
    geom_polygon(data = states, 
                aes(x = long, y = lat, group = group),
                fill = NA, color = "black", size = 0.5) +
    geom_point(data = plot_data,
               aes(x = lon, y = lat,
                   color = abs(temp_error),
                   size = abs(temp_error)),
               alpha = 0.6,
               shape = 21,
               fill = NA) +
    # Color scale with limits
    scale_color_gradientn(
      colors = c("lightgreen", "yellow", "darkred"),
      name = "Forecast Error (°F)",
      breaks = seq(0, max_error, length.out = 5),
      labels = scales::label_number(accuracy = 1),
      guide = guide_colorbar(
        direction = "vertical",
        title.position = "top",
        barwidth = 1,
        barheight = 8,
        ticks = TRUE
      )
    ) +
    scale_size_continuous(
      range = c(1, 4),
      name = "Forecast Error (°F)",
      breaks = seq(0, max_error, length.out = 5),
      labels = scales::label_number(accuracy = 1)
    ) +
    coord_map() +
    labs(title = sprintf("Geographic Distribution of %s Temperature Forecast Errors", 
                        ifelse(temp_type == "high", "High", "Low")),
         subtitle = "(Larger points indicate larger errors)",
         x = "Longitude",
         y = "Latitude") +
    theme_minimal() +
    theme(
      plot.title = element_text(size = 12, face = "bold"),
      plot.subtitle = element_text(size = 10),
      legend.position = "right",
      legend.box = "vertical",
      legend.margin = margin(6, 6, 6, 6),
      legend.title = element_text(face = "bold"),
    )
}

low_plot <- create_error_plot(weather_analysis, "low")
high_plot <- create_error_plot(weather_analysis, "high")

print(low_plot)

print(high_plot)

Higher forecast errors are more prevalent in the eastern United States, particularly in the Midwest and Texas, where certain locations have noticeably large discrepancies. The geographic distribution of low and high temperature forecast errors is highly similar, suggesting that errors in low-temperature predictions and high-temperature predictions tend to be similar among each forecast.

3 Investigation into Why Certain Areas Struggle With Weather Prediction

3.1 Comparing the Top 10% of Forecast Errors to the Full Dataset

# Compared all data to high-error locations
error_analysis <- weather_analysis %>%
  mutate(error_category = ifelse(
    abs(temp_error) > quantile(abs(temp_error), 0.9, na.rm = TRUE),
    "high_error", "normal_error"
  )) %>%
  group_by(high_or_low, error_category) %>%
  summarise(
    avg_elevation = mean(elevation, na.rm = TRUE),
    avg_distance_coast = mean(distance_to_coast, na.rm = TRUE),
    avg_elevation_change_eight = mean(elevation_change_eight, na.rm = TRUE),
    avg_wind = mean(wind, na.rm = TRUE),
    avg_annual_precip = mean(avg_annual_precip, na.rm = TRUE),
    n = n(),
    .groups = "drop"
  ) %>%
  arrange(high_or_low, error_category)

print(as.data.frame(error_analysis))
##   high_or_low error_category avg_elevation avg_distance_coast
## 1        high     high_error      385.8698           383.8344
## 2        high   normal_error      345.0003           315.6293
## 3        high           <NA>      345.1939           317.5408
## 4         low     high_error      537.1471           399.7696
## 5         low   normal_error      331.1456           313.4344
## 6         low           <NA>      345.1362           317.5086
##   avg_elevation_change_eight avg_wind avg_annual_precip      n
## 1                   148.8429 3.468456          37.85841  20429
## 2                   142.5122 3.391335          39.19201 267997
## 3                   142.7964 3.389961          39.03318  37558
## 4                   194.7854 3.413888          32.74012  23540
## 5                   138.3727 3.395337          39.66565 266039
## 6                   142.8076 3.389660          39.03367  36405

To better understand the factors influencing forecast errors, the top 10% of temperature forecast errors were compared to the rest of the dataset. The first summary table indicates that, in high-error cases, low-temperature forecasts are associated with higher average elevation and greater terrain complexity than high-temperature forecasts. However, when considering the overall dataset, locations with forecast errors tend to have lower elevation and lower terrain complexity compared to those in the top 10% of errors. Average wind speed was nearly identical between the top 10% of forecast errors and the full dataset for both high and low temperatures. Average annual precipitation was slightly lower in high-temperature forecast errors within the top 10% of errors.

3.2 Effect of Forecast Time Before Observation on Accuracy

#graph 1
ggplot(weather_analysis, 
       aes(x = factor(forecast_hours_before), 
           y = abs(temp_error), 
           fill = high_or_low)) +
  geom_boxplot(
            coef = 1) +              # Reduce whisker length
  coord_cartesian(y = c(0, 10)) +      # Zoom in on boxes
  scale_fill_viridis_d() +
  labs(title = "Forecast Error by Hours Before Observation",
       x = "Hours Before Observation",
       y = "Absolute Temperature Error (°F)",
       fill = "Temperature Type") +
  theme_minimal()

To determine whether forecast lead time (the number of hours before observation) influenced error magnitude, box plots were created comparing absolute temperature error across different forecast lead times (12, 24, 36, and 48 hours before observation). Although the interquartile range was significantly higher for low-temperature forecasts at 36 and 48 hours before observation, the median error remained the same across all lead times. This indicates that forecast lead time does not significantly impact absolute temperature error within the relatively short timeframes analyzed. This matches expectations, as short-term weather forecasts (within 48 hours) are typically stable in accuracy. However, extending the analysis to forecasts made further in advance may reveal a stronger relationship between lead time and error.

3.3 Forecast Outlook and Prediction Errors

# Calculate medians for each meaning and temperature type
error_summary <- weather_analysis %>%
  group_by(meaning, high_or_low) %>%
  summarise(
    median_error = median(abs(temp_error), na.rm = TRUE),
    mean_error = mean(abs(temp_error), na.rm = TRUE),
    .groups = 'drop'
  )

# Create bar plot with medians
ggplot(error_summary, 
       aes(x = factor(meaning), 
           y = median_error, 
           fill = high_or_low)) +
  geom_col(position = "dodge", width = 0.8) +
  scale_fill_manual(values = c("high" = "#5B4B8A", "low" = "#92B4EC"),
                    labels = c("High Temperature", "Low Temperature")) +
  labs(title = "Median Forecast Error by Outlook",
       x = "Outlook",
       y = "Median Absolute Temperature Error (°F)",
       fill = "Temperature Type") +
  theme_minimal() +
  theme(
    legend.position = "top",
    panel.grid.minor = element_blank(),
    axis.text.x = element_text(angle = 45, hjust = 1)  # Rotate x labels for better readability
  )

The median absolute temperature error was generally around 2°F. However, high-temperature forecasts exhibited the largest errors (6°F) when the forecast outlook was missing. This suggests that unusual or complex weather conditions, which were not categorized under a specific outlook, posed greater forecasting challenges. High-temperature predictions were also more difficult during drizzle and fog, with median forecast errors around 3°F. Low-temperature predictions had higher errors during conditions such as blowing snow, dust storms, freezing drizzle, heavy cloud cover, and strong winds, with median errors around 3°F. These findings suggest that certain weather conditions may increase error rates among high and low temperatures.

3.4 Spearman Correlation Analysis

# Function to calculate Spearman correlations by group
get_correlations <- function(data, x_var) {
  # For high temperatures
  high_data <- data[data$high_or_low == "high", ]
  high_spearman <- cor.test(
    high_data[[x_var]],
    abs(high_data$temp_error),  
    method = "spearman"
  )
  
  # For low temperatures
  low_data <- data[data$high_or_low == "low", ]
  low_spearman <- cor.test(
    low_data[[x_var]],
    abs(low_data$temp_error),
    method = "spearman"
  )
  
  return(list(
    high = list(
      spearman = high_spearman
    ),
    low = list(
      spearman = low_spearman
    )
  ))
}

# Calculate correlations for each variable
variables <- c("elevation", "distance_to_coast", "elevation_change_eight", "wind",  "avg_annual_precip")
names(variables) <- c("Elevation", "Distance to Coast", "Elevation Change", "wind", "avg_annual_precip")

# Summary data frame
results <- data.frame(
  Variable = character(),
  Temp_Type = character(),
  Spearman_Rho = numeric(),
  Spearman_P = numeric(),
  stringsAsFactors = FALSE
)

for(var in variables) {
  corr <- get_correlations(weather_analysis, var)
  
  # High temperature results
  results <- rbind(results, data.frame(
    Variable = names(variables)[variables == var],
    Temp_Type = "High",
    Spearman_Rho = corr$high$spearman$estimate,
    Spearman_P = corr$high$spearman$p.value
  ))
  
  # Low temperature results
  results <- rbind(results, data.frame(
    Variable = names(variables)[variables == var],
    Temp_Type = "Low",
    Spearman_Rho = corr$low$spearman$estimate,
    Spearman_P = corr$low$spearman$p.value
  ))
}

# Format p-values and correlations
results$Spearman_Rho <- round(results$Spearman_Rho, 3)
results$Spearman_P <- format.pval(results$Spearman_P, digits = 3)

print(results)
##               Variable Temp_Type Spearman_Rho Spearman_P
## rho          Elevation      High        0.049    < 2e-16
## rho1         Elevation       Low        0.106    < 2e-16
## rho2 Distance to Coast      High        0.080    < 2e-16
## rho3 Distance to Coast       Low        0.077    < 2e-16
## rho4  Elevation Change      High        0.024    < 2e-16
## rho5  Elevation Change       Low        0.091    < 2e-16
## rho6              wind      High        0.047    < 2e-16
## rho7              wind       Low        0.013   2.24e-12
## rho8 avg_annual_precip      High       -0.013   6.05e-12
## rho9 avg_annual_precip       Low       -0.113    < 2e-16

A Spearman correlation test was conducted to assess the influence of various geographical and meteorological factors on absolute temperature error. The results indicate that all correlations were statistically significant (p < 0.05), but their magnitudes were weak. Elevation and average annual precipitation exhibited the strongest correlation with low-temperature forecast errors (ρ = 0.106 and ρ = -0.113 respectively). Distance to the coast, elevation change, and wind speed had even weaker effects on temperature forecast errors. While all correlations were statistically significant, their small magnitudes indicate that forecast errors are likely driven by different factors instead of geographic features.

3.5 ANOVA Results on Weather Outlook Influence

# ANOVA for high temperature errors
high_temp_anova <- aov(abs(temp_error) ~ meaning, 
                      data = subset(weather_analysis, high_or_low == "high"))

# ANOVA for low temperature errors
low_temp_anova <- aov(abs(temp_error) ~ meaning, 
                     data = subset(weather_analysis, high_or_low == "low"))

# Get summary of results
summary(high_temp_anova)
##                 Df  Sum Sq Mean Sq F value Pr(>F)    
## meaning         22   32770  1489.5   340.7 <2e-16 ***
## Residuals   288125 1259523     4.4                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 37836 observations deleted due to missingness
summary(low_temp_anova)
##                 Df  Sum Sq Mean Sq F value Pr(>F)    
## meaning         22   27535  1251.6   276.7 <2e-16 ***
## Residuals   289275 1308498     4.5                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 36686 observations deleted due to missingness

An ANOVA (Analysis of Variance) test was conducted to determine whether weather outlook (forecast conditions) significantly affects forecast errors. The results show that weather outlook has a highly significant impact, with p-values < 2e-16 and large F-values (340.7 and 276.7), confirming that forecast errors vary significantly depending on weather conditions.

However, despite this strong statistical significance, weather outlook alone does not explain most of the forecast error variance. The residual variance remains high (Sum of Squares: 32,770 vs. 1,259,523 and 27,535 vs. 1,308,498), indicating that other factors are contributing to forecast inaccuracies. Additionally, 37,836 and 36,686 observations were removed due to missing data, suggesting that certain weather conditions may have more incomplete records, which could affect the overall analysis.

4 Conclusion

This analysis examined temperature forecast errors across 167 U.S. cities, and considered elevation, distance to coast, elevation change (eight points), wind, average annual precipitation, and outlook. The results indicate that while elevation, distance to coast, elevation change, wind,and average annual precipitation, slightly impact low-temperature forecast errors, their effects are weak. Weather outlook plays a statistically significant role in error magnitude, particularly during complex weather conditions, but does not fully explain the observed variations. The time before observation (12-48 hours) does not significantly influence error rates, suggesting that short-term forecasts maintain consistent accuracy. Overall, the findings suggest that temperature forecast errors arise from a combination of factors related to outlook and other measures, instead of geographic features or wind speed.