Reshape the population data so that it can be used to generate the desired visualization.
x: year
y: population
color, shape: country_name
population |>
  pivot_longer(
    cols = `2000`:`2023`,
    names_to = "year",
    values_to = "population"
  )
# A tibble: 5,208 × 6
   series_name       series_code country_name country_code year  population
   <chr>             <chr>       <chr>        <chr>        <chr>      <dbl>
 1 Population, total SP.POP.TOTL Afghanistan  AFG          2000    19542982
 2 Population, total SP.POP.TOTL Afghanistan  AFG          2001    19688632
 3 Population, total SP.POP.TOTL Afghanistan  AFG          2002    21000256
 4 Population, total SP.POP.TOTL Afghanistan  AFG          2003    22645130
 5 Population, total SP.POP.TOTL Afghanistan  AFG          2004    23553551
 6 Population, total SP.POP.TOTL Afghanistan  AFG          2005    24411191
 7 Population, total SP.POP.TOTL Afghanistan  AFG          2006    25442944
 8 Population, total SP.POP.TOTL Afghanistan  AFG          2007    25903301
 9 Population, total SP.POP.TOTL Afghanistan  AFG          2008    26427199
10 Population, total SP.POP.TOTL Afghanistan  AFG          2009    27385307
# ℹ 5,198 more rows
The first pivot creates year as a string (chr) variable. Let’s convert it to a numeric value so it can be plotted on a continuous axis.
# create a new data frame by assigning the pivoted result to a new name
population_longer <- population |>
  pivot_longer(
    cols = `2000`:`2023`,
    names_to = "year",
    values_to = "population",
    names_transform = as.numeric
  )
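To see what names_transform does in isolation, here is a minimal sketch on a made-up two-country table. The names toy_wide and toy_long and all of the values are hypothetical and are not part of the World Bank data. It assumes a tidyr version that accepts a single function for names_transform.

```r
library(tidyr)
library(tibble)

# Hypothetical toy data in the same wide layout: one column per year
toy_wide <- tribble(
  ~country_name, ~`2000`, ~`2001`,
  "A",           100,     110,
  "B",           200,     205
)

# Same pivot as above; names_transform converts the former column
# names ("2000", "2001") from character to numeric
toy_long <- toy_wide |>
  pivot_longer(
    cols = `2000`:`2001`,
    names_to = "year",
    values_to = "population",
    names_transform = as.numeric
  )

# Check the resulting column type
is.numeric(toy_long$year)
```

An equivalent form names the column explicitly, for example names_transform = list(year = as.integer), which may be preferable when whole-number years are wanted.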
Visualization
Now we can begin to visualize the transformed data.
population_longer |>
  filter(country_name %in% c("China", "India", "United States")) |>
  ggplot(aes(x = year, y = population, color = country_name)) +
  geom_line() +
  geom_point()
Fixing the visualization
Update x-axis scales
Update y-axis so it’s scaled to millions and uses the same breaks as the goal plot.
Theme
Labels
Placement of legend
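Before wiring the y-axis labels into the plot, it can help to preview them. The sketch below is an illustration rather than part of the original solution: it calls the labelling function from the scales package directly on the breaks used in this section. The names y_breaks, millions_label, and preview are hypothetical.

```r
library(scales)

# Breaks used for the y-axis in this section
y_breaks <- seq(250000000, 1250000000, 250000000)

# label_number() returns a function; calling that function on the
# breaks shows the text that would appear on the axis
millions_label <- label_number(scale = 1 / 1000000, suffix = "mil")
preview <- millions_label(y_breaks)
print(preview)
```

If the printed labels do not match the goal plot, arguments such as suffix or big.mark can be adjusted here before the scale is added to the plot.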
# fix shapes first: each country now gets its own shape
population_longer |>
  filter(country_name %in% c("China", "India", "United States")) |>
  ggplot(aes(x = year, y = population, color = country_name, shape = country_name)) +
  geom_line() +
  geom_point()
# fix the x-axis
population_longer |>
  filter(country_name %in% c("China", "India", "United States")) |>
  ggplot(aes(x = year, y = population, color = country_name, shape = country_name)) +
  geom_line() +
  geom_point() +
  scale_x_continuous(limits = c(2000, 2024), breaks = seq(2000, 2024, 4))
# fix the y-axis
population_longer |>
  filter(country_name %in% c("China", "India", "United States")) |>
  ggplot(aes(x = year, y = population, color = country_name, shape = country_name)) +
  geom_line() +
  geom_point() +
  scale_x_continuous(limits = c(2000, 2024), breaks = seq(2000, 2024, 4)) +
  scale_y_continuous(
    breaks = seq(250000000, 1250000000, 250000000),
    labels = label_number(scale = 1 / 1000000, suffix = "mil")
  ) +
  scale_color_manual(
    values = c(
      "United States" = "#0A3161",
      "China" = "#EE1C25",
      "India" = "#FF671F"
    )
  ) +
  theme_minimal() +
  labs(
    title = "Country Populations Over Time",
    subtitle = "2000 to 2023",
    caption = "Data Source: World Bank",
    x = "Year",
    y = "Population (millions)",
    color = "Country",
    shape = "Country"
  )
# fix the color scheme by adding specific colors: US = #0A3161,
# China = #EE1C25, India = #FF671F
# fix the y-axis, then finish the colors, labels, theme, and legend
population_longer |>
  filter(country_name %in% c("China", "India", "United States")) |>
  ggplot(aes(x = year, y = population, color = country_name, shape = country_name)) +
  geom_line() +
  geom_point() +
  scale_x_continuous(limits = c(2000, 2024), breaks = seq(2000, 2024, 4)) +
  scale_y_continuous(
    breaks = seq(250000000, 1250000000, 250000000),
    labels = label_number(scale = 1 / 1000000, suffix = "mil")
  ) +
  scale_color_manual(
    values = c(
      "United States" = "#0A3161",
      "China" = "#EE1C25",
      "India" = "#FF671F"
    )
  ) +
  labs(
    title = "Country Populations Over Time",
    subtitle = "2000 to 2023",
    caption = "Data Source: World Bank",
    x = "Year",
    y = "Population (millions)",
    # NULL drops the legend title for both aesthetics
    color = NULL,
    shape = NULL
  ) +
  theme_minimal() +
  # place the legend above the plot
  theme(legend.position = "top")