Hello, this is my first data visualisation. I am super excited but also…

Step 1 - Set up

library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(WDI)
library(ggplot2)

Step 2 - Data manipulation

2.1 Datasets

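# PM2.5 air pollution, mean annual exposure (micrograms per cubic metre), 2017 only
wdi_AP=WDI(indicator=c('EN.ATM.PM25.MC.M3')) %>% 
  rename(PM25=EN.ATM.PM25.MC.M3) %>%
  filter(year==2017)

# GDP per capita (current US$), 2017 only
wdi_GDPpc=WDI(indicator=c('NY.GDP.PCAP.CD')) %>% 
  rename(GDPpc=NY.GDP.PCAP.CD) %>%
  filter(year==2017)

# Income group of each country, taken from the metadata shipped with the WDI package
inc_lvl=data.frame(WDI_data$country) %>% 
  select(country, income)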

2.2 Combine data

combined_data=left_join(wdi_AP,wdi_GDPpc)

## Joining, by = c("iso2c", "country", "year")
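The join message above lists the columns dplyr picked as keys. A minimal sketch of the same join with the keys spelled out, assuming toy tables (ap and gdp are hypothetical stand-ins with made-up values, not the downloaded WDI data):

```r
library(dplyr)

# Toy tables; the column names mirror those in the join message above.
# All values are made up for illustration.
ap  <- data.frame(iso2c = c("AA", "BB"), country = c("A", "B"),
                  year = 2017, PM25 = c(10, 50))
gdp <- data.frame(iso2c = c("AA", "BB"), country = c("A", "B"),
                  year = 2017, GDPpc = c(40000, 900))

# Naming the keys makes the join explicit and silences the "Joining, by" message.
joined <- left_join(ap, gdp, by = c("iso2c", "country", "year"))
```

Writing the keys out also guards against an unexpected shared column silently becoming part of the join.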

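# %in% tests every row against the whole set of income groups;
# == would recycle the vector and compare it position by position
combined_data=left_join(combined_data,inc_lvl) %>% 
  filter(income %in% c('High income', 'Low income', 'Lower middle income', 'Upper middle income'))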

## Joining, by = "country"
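When a column is filtered against several allowed values, `==` and `%in%` behave differently. The sketch below shows this on a toy data frame (toy and groups are hypothetical, with made-up values, not the WDI data):

```r
library(dplyr)

# Toy data standing in for the joined table (values are made up).
toy <- data.frame(
  country = c("A", "B", "C", "D", "E", "F"),
  income  = c("Low income", "High income", "High income",
              "Aggregates", "Upper middle income", "Low income")
)

groups <- c("High income", "Low income")

# `==` recycles `groups` along the column and compares position by position,
# so a row is kept only when its value lines up with the recycled vector.
kept_eq <- toy %>% filter(income == groups)

# `%in%` tests each row against the whole set of allowed values.
kept_in <- toy %>% filter(income %in% groups)

nrow(kept_eq)  # fewer rows: only positional matches survive
nrow(kept_in)  # every row whose income is in `groups`
```

With `==`, rows A and B are silently dropped even though their income group is in the list, which is why `%in%` is the safer choice for this kind of filter.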

Step 3 - Create a nice graph

ggplot(combined_data, aes(x=GDPpc, y=PM25))+
  geom_point(aes(colour=income)) +
  geom_smooth(method = 'lm', se=FALSE, colour='orange')+
  theme_minimal()+
  labs(title='Exposure to air pollution vs. GDP per capita, 2017',
       caption = 'Source: World Bank, World Development Indicators',
       x='GDP per capita [current US$]',
       y='PM 2.5 pollution [mean annual exposure, µg/m³]')+
  scale_color_grey()

## `geom_smooth()` using formula 'y ~ x'

## Warning: Removed 7 rows containing non-finite values (stat_smooth).

## Warning: Removed 7 rows containing missing values (geom_point).

The data suggest a negative correlation between GDP per capita and exposure to air pollution: the wealthier the country, the lower its citizens' mean annual exposure to PM 2.5 tends to be.
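The sign of the relationship can also be checked numerically rather than read off the plot. A minimal sketch, assuming a toy data frame with made-up values (not WDI data) and the same column names as above:

```r
# Made-up values, for illustration only; not WDI data.
toy <- data.frame(
  GDPpc = c(500, 1500, 4000, 12000, 45000, NA),
  PM25  = c(60, 45, 38, 20, 9, 30)
)

# Pearson correlation, using only the rows where both values are present.
r_pearson <- cor(toy$GDPpc, toy$PM25, use = "complete.obs")

# Spearman rank correlation is less sensitive to the strong skew
# typical of GDP per capita.
r_spearman <- cor(toy$GDPpc, toy$PM25, use = "complete.obs",
                  method = "spearman")
```

The same two calls could be pointed at combined_data; `use = "complete.obs"` drops the rows with missing values, much as ggplot2 does in the warnings above.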

However, as every good data magician knows, correlation does not imply causation.

Major sources of particulate matter (PM) include agriculture, industrial processes, and the combustion of fossil fuels. These activities are more likely to take place in low income countries than in their richer counterparts.

THE END.

Air pollution vs. GDP per capita
