Overview

Time Span

2020 - 2024

Avg. Temperature Change

11.57 °C

CO2 Change

-11.89 ppm

Largest Missingness

10 missing

Project Overview

This dashboard presents a compact exploratory analysis of the Climate Change Dataset, focusing on temporal trends, seasonal patterns, and relationships among key environmental variables.

Data Cleaning

Missing and inconsistent values were handled before analysis. Placeholder strings such as “Unknown” and “NAN”, as well as the sentinel value 99999 in Urbanization_Index, were treated as missing. Missing numeric values were imputed with the mean for the same calendar month to preserve seasonal structure, with the overall mean used only as a fallback.

Visualization Design

The design follows principles from both Tufte and Wexler. In line with Tufte, visualizations minimize non-data ink through clean layouts, restrained color use, and direct comparisons. In line with Wexler, chart types are selected to match analytical tasks, ensuring clarity and interpretability.

Why It Matters

Understanding these patterns helps identify long-term climate trends and their potential environmental impacts.

Temperature Pattern Over Time

Data Cleaning & Missing Values

Missingness Before Imputation

Cleaning and Imputation Notes

The assignment asked for careful treatment of missing data rather than dropping rows. That is especially important in a small dataset, where listwise deletion can remove a large share of the available information.
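To make the cost of listwise deletion concrete, the sketch below counts missing values per variable and computes the share of rows that would be dropped if every incomplete row were removed. It is a minimal illustration on made-up rows, not the dashboard's actual code: the column names Avg_Temp and CO2, the helper names, and the toy values are all assumptions, and the sentinel 99999 is checked in every column for simplicity even though the notes only mention it for Urbanization_Index.

```python
# Placeholder values treated as missing (taken from the cleaning notes).
# Simplification: 99999 is checked in every column, not only Urbanization_Index.
PLACEHOLDERS = {"Unknown", "NAN", 99999}


def is_missing(value):
    """Return True if value is None or one of the placeholder codes."""
    return value is None or value in PLACEHOLDERS


def missing_counts(rows, columns):
    """Count missing entries per column."""
    return {c: sum(is_missing(r.get(c)) for r in rows) for c in columns}


def listwise_loss(rows, columns):
    """Share of rows that listwise deletion would drop."""
    dropped = sum(any(is_missing(r.get(c)) for c in columns) for r in rows)
    return dropped / len(rows)


# Hypothetical toy rows; the real dataset's columns and values may differ.
columns = ["Avg_Temp", "CO2", "Urbanization_Index"]
rows = [
    {"Avg_Temp": 14.2, "CO2": 410.1, "Urbanization_Index": 0.4},
    {"Avg_Temp": "Unknown", "CO2": 411.0, "Urbanization_Index": 0.5},
    {"Avg_Temp": 15.0, "CO2": "NAN", "Urbanization_Index": 99999},
    {"Avg_Temp": 15.3, "CO2": 412.2, "Urbanization_Index": None},
]

counts = missing_counts(rows, columns)
loss = listwise_loss(rows, columns)
print(counts)
print(f"Rows lost to listwise deletion: {loss:.0%}")
```

In this toy example only a few cells are missing, yet most rows contain at least one of them, which is the situation where dropping rows is most wasteful.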

Cleaning decisions used here:

  • Text placeholders such as “Unknown” and “NAN” were converted to true missing values.
  • The sentinel value 99999 in Urbanization_Index was treated as missing rather than as a real observation.
  • Missing numeric values were imputed with the mean for the same calendar month across years, which is more defensible than a single grand mean because climate variables often show seasonal structure.
  • If a month had no usable value for a variable, the method fell back to that variable's overall mean.

This approach is simple, transparent, and appropriate for coursework, while also preserving seasonal structure more effectively than global mean imputation.
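The cleaning decisions above can be sketched in a few lines. This is one possible implementation under stated assumptions, not the code used to build the dashboard: the record layout (a "month" key per row), the column name Avg_Temp, and the helper names clean and impute_monthly are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Placeholder values treated as missing (taken from the cleaning notes).
PLACEHOLDERS = {"Unknown", "NAN", 99999}


def clean(value):
    """Map placeholders to None and everything else to float."""
    if value is None or value in PLACEHOLDERS:
        return None
    return float(value)


def impute_monthly(records, column):
    """Fill missing values with the mean of the same calendar month.

    Falls back to the overall mean of the column when a month has no
    usable values. Assumes at least one usable value in the column.
    """
    values = [clean(r[column]) for r in records]

    # Collect the observed values for each calendar month across years.
    by_month = defaultdict(list)
    for record, value in zip(records, values):
        if value is not None:
            by_month[record["month"]].append(value)

    overall = mean(v for v in values if v is not None)

    filled = []
    for record, value in zip(records, values):
        if value is None:
            month_values = by_month.get(record["month"])
            value = mean(month_values) if month_values else overall
        filled.append(value)
    return filled


# Hypothetical toy records; month 2 has no usable value at all.
records = [
    {"year": 2020, "month": 1, "Avg_Temp": 10.0},
    {"year": 2021, "month": 1, "Avg_Temp": "Unknown"},
    {"year": 2022, "month": 1, "Avg_Temp": 12.0},
    {"year": 2020, "month": 7, "Avg_Temp": 24.0},
    {"year": 2021, "month": 2, "Avg_Temp": "NAN"},
]

filled = impute_monthly(records, "Avg_Temp")
print(filled)
```

The missing January value is filled from the other Januaries, while the February value, which has no same-month observations, takes the overall mean of the column.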

Before vs. After Example for Average Temperature

The imputed values follow the overall seasonal pattern closely, suggesting that the monthly mean approach preserves underlying structure without introducing substantial distortion.

Seasonality

Seasonal Pattern by Month

Interpretation

A month-level view is useful because many climate variables are inherently seasonal. Faceting separates variables with different units and scales, which supports a cleaner comparison and follows the visualization principle of avoiding misleading shared axes for unlike measures.

This tab is especially important because it also justifies the imputation strategy: replacing missing values within month preserves the broad seasonal rhythm better than a single global average would.

Temperature and precipitation show clear seasonal peaks toward later months, while humidity and sea surface temperature exhibit more moderate variation across the year.
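A month-level summary like the one described here can be computed by grouping observations by calendar month and pooling across years. The sketch below is illustrative only: the function name monthly_profile, the record layout, and the toy values are assumptions and do not come from the dataset.

```python
from collections import defaultdict
from statistics import mean


def monthly_profile(records, column):
    """Return the mean, min, and max of `column` for each calendar month."""
    groups = defaultdict(list)
    for record in records:
        groups[record["month"]].append(record[column])
    return {
        month: {"mean": mean(vals), "min": min(vals), "max": max(vals)}
        for month, vals in sorted(groups.items())
    }


# Hypothetical toy records covering two months in two years.
records = [
    {"year": 2020, "month": 1, "Avg_Temp": 10.0},
    {"year": 2021, "month": 1, "Avg_Temp": 12.0},
    {"year": 2020, "month": 7, "Avg_Temp": 24.0},
    {"year": 2021, "month": 7, "Avg_Temp": 26.0},
]

profile = monthly_profile(records, "Avg_Temp")
for month, stats in profile.items():
    print(month, stats)
```

Running the same function once per variable gives one panel of summary values per facet, so variables with different units never share an axis.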

Temperature Range by Month

Relationships

CO2 and Temperature

Correlation Heatmap

Why these visual choices work

This section uses two complementary relationship plots rather than overloading one graphic with too much information.

  • The scatterplot supports focused interpretation of one substantively interesting relationship: CO2 and temperature.
  • The correlation heatmap supports pattern-finding across many variables at once.
  • Interactive hover details add information without cluttering the visual field.

These choices reflect the semester’s visualization guidance to prioritize legibility, purposeful encoding, and efficient comparisons.

Together, these visualizations show a modest positive relationship between CO2 and temperature, while temperature-related variables tend to be more strongly correlated with each other than with other environmental factors.
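For readers who want to see what sits behind a correlation heatmap, the sketch below computes pairwise Pearson correlations for a handful of variables. It assumes the heatmap uses Pearson correlation, which the text does not state, and the variable names and values are invented for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Nested dict of pairwise correlations, one entry per heatmap cell."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical toy series; not values from the dataset.
data = {
    "CO2": [410.0, 412.0, 414.0, 416.0],
    "Avg_Temp": [14.0, 14.5, 14.2, 15.0],
    "Humidity": [60.0, 58.0, 61.0, 59.0],
}

matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

The matrix is symmetric with ones on the diagonal, so a heatmap can show either the full grid or just one triangle without losing information.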

References & Best Practices

Connection to Course Readings

This dashboard follows several ideas discussed in class and in the course readings:

  1. Match plot type to question. Time-series plots are used to show change over time, faceting is used for seasonal comparisons, and scatterplots and heatmaps are used to explore relationships.

  2. Reduce non-data ink. The plots use simple themes, clear labels, and minimal styling so that the data remain the focus.

  3. Respect scale and comparability. Variables with different units are shown separately rather than forced onto the same axis.

  4. Make data preparation visible. Missing-value handling is explained clearly rather than hidden.

  5. Support interpretation. Each section includes short explanations to guide what the viewer should notice.

Closing Summary

Overall, the dataset shows clear patterns over time and across seasons, along with some relationships between key variables like CO2 and temperature. The dashboard is designed to make those patterns easy to see while also being clear about how the data were cleaned and prepared.