This dataset was tidied with the help of Claude, Anthropic's LLM.
country_col <- names(raw_data)[1]
total_col <- names(raw_data)[ncol(raw_data)]

tidy_data <- raw_data %>%
  # Reshape: Pivot everything EXCEPT the country and the total sum column
  pivot_longer(
    cols = -c(all_of(country_col), all_of(total_col)),
    names_to = "year",
    values_to = "co2_emissions",
    # FIX: This prevents the 'Can't combine character and double' error
    values_transform = list(co2_emissions = as.character)
  ) %>%
  # Rename for consistency
  rename(country = !!sym(country_col), total_sum = !!sym(total_col)) %>%
  # Clean up data types
  mutate(
    year = as.numeric(year),
    country = str_trim(country),
    # Convert characters back to numbers (the '-' becomes NA automatically)
    co2_emissions = as.numeric(co2_emissions)
  ) %>%
  # Remove the NAs (the old dashes)
  filter(!is.na(co2_emissions)) %>%
  arrange(country, year)
Warning: There was 1 warning in `mutate()`.
ℹ In argument: `co2_emissions = as.numeric(co2_emissions)`.
Caused by warning:
! NAs introduced by coercion
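The coercion warning above comes from `as.numeric()` meeting the `-` placeholder strings. As a minimal sketch of one way to avoid it, the dashes can be turned into `NA` explicitly before the conversion. The `raw_values` vector below is hypothetical sample data, not taken from the real dataset, and this is an illustration rather than the method used in the analysis.

```r
# Hypothetical sample values standing in for the pivoted co2_emissions column
raw_values <- c("1.5", "-", " 2.25", "-")

# Trim whitespace, then replace the dash placeholder with NA before converting,
# so as.numeric() never sees a non-numeric string and emits no coercion warning
cleaned <- trimws(raw_values)
cleaned[cleaned == "-"] <- NA_character_
emissions <- as.numeric(cleaned)

print(emissions)
```

Inside a `mutate()` call, `dplyr::na_if(co2_emissions, "-")` would likely serve the same purpose, assuming the dash is the only non-numeric value in the column.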
top_10_list <- tidy_data %>%
  distinct(country, total_sum) %>%
  slice_max(total_sum, n = 10) %>%
  pull(country)

plot_data <- tidy_data %>%
  filter(country %in% top_10_list)

ggplot(plot_data, aes(x = year, y = co2_emissions, color = country)) +
  geom_line(size = 1) +
  geom_point(size = 1.5, alpha = 0.5) + # Adds dots to see the specific data points
  theme_minimal() +
  labs(
    title = "CO2 Emission Trends: Top 10 Global Emitters",
    subtitle = "Data source: Tracking Global CO2 Emissions (1990-2023)",
    x = "Year",
    y = "Emissions (Metric Tons per Capita)",
    color = "Country"
  ) +
  theme(legend.position = "bottom")
Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
ℹ Please use `linewidth` instead.
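The deprecation message above suggests passing `linewidth` to line geoms. A small sketch of that change follows, assuming ggplot2 3.4.0 or later is installed. The `toy` data frame is hypothetical and only stands in for `plot_data`.

```r
library(ggplot2)

# Hypothetical toy data, not the CO2 dataset
toy <- data.frame(
  year = rep(2000:2002, times = 2),
  value = c(1, 2, 3, 3, 2, 1),
  group = rep(c("A", "B"), each = 3)
)

# linewidth sets the line thickness; size is still the right argument for points
p <- ggplot(toy, aes(x = year, y = value, color = group)) +
  geom_line(linewidth = 1) +
  geom_point(size = 1.5, alpha = 0.5)
```

Only the `geom_line()` argument changes; `geom_point(size = ...)` is unaffected by the deprecation.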