Predicting Weather Forecast Accuracy
This report evaluates the spatial determinants of temperature forecast error across the continental United States and Alaska. Using sixteen months of National Weather Service (NWS) data spanning 167 cities, I quantified accuracy as the absolute difference between forecasted and observed temperatures, averaged for each city. The primary objective was to identify geographic patterns, specifically latitude and proximity to the ocean, that correlate with forecast error.
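As a minimal sketch of the error metric described above (not the code used for this report), the mean absolute error per city could be computed as follows. The function name, the record layout, and the sample rows are all assumptions made for illustration; the sample values are made up and are not NWS data.

```python
from collections import defaultdict
from statistics import mean


def mean_absolute_error_by_city(records):
    """Return {city: mean absolute forecast error in °F}.

    `records` is an iterable of (city, forecast_temp_f, observed_temp_f)
    tuples. This layout is a hypothetical stand-in for the real data.
    """
    errors = defaultdict(list)
    for city, forecast, observed in records:
        # Skip rows where either temperature is missing.
        if forecast is None or observed is None:
            continue
        errors[city].append(abs(forecast - observed))
    return {city: mean(values) for city, values in errors.items()}


# Made-up rows for illustration only.
sample = [
    ("City A", 70, 73),
    ("City A", 65, 64),
    ("City B", 40, 45),
    ("City B", 38, None),  # missing observation, ignored
]
city_errors = mean_absolute_error_by_city(sample)
```

Taking the absolute value before averaging keeps over-forecasts and under-forecasts from cancelling each other out, which is why the metric measures accuracy rather than bias.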
I found a distinct positive linear correlation between latitude and mean forecast error: higher-latitude cities consistently exhibit larger forecast errors. Fairbanks is the most significant outlier, with a mean error of more than 4°F. Similar patterns are present in other northern cities such as Casper, Helena, and Missoula, where the average forecast error is above 3°F. This decrease in accuracy suggests that NWS models have more difficulty simulating temperature fluctuations in high-latitude climates.
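One way to put a number on the latitude relationship is a Pearson correlation coefficient together with a least-squares line. The sketch below is an assumption about how that check could be done, not the analysis behind this report; the function names are hypothetical and the toy values are made up (and deliberately perfectly linear so the result is easy to follow).

```python
from math import sqrt


def _centered_sums(xs, ys):
    """Return (mean_x, mean_y, Sxy, Sxx, Syy) for paired samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return mean_x, mean_y, sxy, sxx, syy


def pearson_r(xs, ys):
    """Pearson correlation coefficient between xs and ys."""
    _, _, sxy, sxx, syy = _centered_sums(xs, ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares fit y = a*x + b."""
    mean_x, mean_y, sxy, sxx, _ = _centered_sums(xs, ys)
    if sxx == 0:
        raise ValueError("slope is undefined when all x values are equal")
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up values for illustration only: latitude (degrees N) and mean error (°F).
toy_latitudes = [30.0, 40.0, 50.0, 60.0]
toy_mean_errors = [2.0, 2.5, 3.0, 3.5]

r = pearson_r(toy_latitudes, toy_mean_errors)
slope, intercept = least_squares_line(toy_latitudes, toy_mean_errors)
```

With real data, the slope would read as the change in mean forecast error (°F) per degree of latitude, and a single extreme city such as Fairbanks can pull both the slope and the correlation noticeably.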
Distance from the ocean is an important secondary factor that correlates with higher error. Forecasts for coastal cities are more accurate, likely because the thermoregulating influence of the ocean keeps temperatures more stable. Conversely, cities in the continental interior often show larger temperature forecast errors. As seen in the plot, light blue points (representing close proximity to the ocean) tend to have low average forecast errors, while orange points often have higher average forecast errors. These data indicate that the accuracy of weather forecast models tends to decrease as distance from the coast increases. Inland cities such as Helena are isolated from this oceanic stability and present significant challenges for current predictive models, particularly when they are also at high latitudes.
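A simple way to check the coastal versus inland pattern numerically is to split cities at a distance threshold and compare the group means. The sketch below is only an illustration: the 100 km threshold, the function name, the tuple layout, and the toy values are all assumptions and do not come from the report's data.

```python
from statistics import mean


def mean_error_by_coastal_group(cities, threshold_km=100.0):
    """Return the mean forecast error for coastal and inland cities.

    `cities` is an iterable of (name, distance_to_coast_km, mean_error_f).
    `threshold_km` is an arbitrary cutoff chosen for illustration.
    A group with no cities maps to None.
    """
    groups = {"coastal": [], "inland": []}
    for _name, distance_km, mean_error_f in cities:
        key = "coastal" if distance_km <= threshold_km else "inland"
        groups[key].append(mean_error_f)
    return {key: (mean(values) if values else None) for key, values in groups.items()}


# Made-up values for illustration only.
toy_cities = [
    ("City A", 5.0, 2.0),
    ("City B", 50.0, 2.5),
    ("City C", 800.0, 3.0),
    ("City D", 1200.0, 3.5),
]
group_means = mean_error_by_coastal_group(toy_cities)
```

A binary split discards information, so treating distance as a continuous variable (as the colored scatter plot does) is usually the better view; the grouped means are just a quick summary.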
Portfolio 4: Added Interactivity
This portfolio update adds three interactive components that extend the analysis and improve the user experience of the original report. First, the original ggplot is now an interactive plotly object. Before, I had only four cities labeled; now the user can hover over any point on the plot to see its city, latitude, and mean forecast error. Second, a leaflet map shows the cities on a map of North America, which gives more geographic context when analyzing individual points. The points are colored by mean error so users can spot possible patterns, and hovering over a point shows its specific mean error. Third, a simple data table lets a user look up a specific city, or filter on another variable, and find the observation quickly.
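To illustrate how coloring points by mean error works in principle, the sketch below maps a value onto a two-color gradient by linear interpolation. It does not reproduce the palette or library call used for the map, which this report does not specify; the function name, the endpoint colors, and the 1°F to 4°F range are assumptions.

```python
def error_to_hex(value, low, high,
                 start_rgb=(255, 255, 178), end_rgb=(189, 0, 38)):
    """Map `value` in [low, high] to a hex color between two RGB endpoints.

    Values outside the range are clamped to the nearest endpoint.
    The default endpoint colors are arbitrary choices for illustration.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    # Position of the value within the range, clamped to [0, 1].
    t = min(1.0, max(0.0, (value - low) / (high - low)))
    channels = [round(s + t * (e - s)) for s, e in zip(start_rgb, end_rgb)]
    return "#{:02x}{:02x}{:02x}".format(*channels)


# Assumed error range of 1°F to 4°F, for illustration only.
lowest_color = error_to_hex(1.0, 1.0, 4.0)
highest_color = error_to_hex(4.0, 1.0, 4.0)
clamped_color = error_to_hex(9.0, 1.0, 4.0)
```

A continuous scale like this lets gradual north-south or coast-inland gradients show up on the map, whereas a few discrete bins would make individual outliers easier to pick out.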