library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
WA <- read.csv("/Users/rupeshswarnakar/Desktop/washdash-download.csv")
summary(WA)
##      Type              Region          Residence.Type     Service.Type
##  Length:260         Length:260         Length:260         Length:260
##  Class :character   Class :character   Class :character   Class :character
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character
##
##
##
##       Year         Coverage         Population        Service.level
##  Min.   :2022   Min.   :  0.000   Min.   :0.000e+00   Length:260
##  1st Qu.:2022   1st Qu.:  1.803   1st Qu.:3.524e+06   Class :character
##  Median :2022   Median : 10.250   Median :2.916e+07   Mode  :character
##  Mean   :2022   Mean   : 22.571   Mean   :1.660e+08
##  3rd Qu.:2022   3rd Qu.: 33.408   3rd Qu.:1.748e+08
##  Max.   :2022   Max.   :100.000   Max.   :2.173e+09
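summary() reports only length and class for the character columns, so the categories inside Region, Service.Type and Service.level stay hidden. The following is a minimal sketch of one way to expose them, assuming a small invented data frame `toy` in place of the real CSV (its rows are illustrative and not taken from the download): converting character columns to factors makes summary() count each category.

```r
# Hypothetical stand-in for WA; every value here is invented for illustration.
toy <- data.frame(
  Region        = c("Oceania", "Oceania", "Sub-Saharan Africa"),
  Service.Type  = c("Drinking water", "Sanitation", "Drinking water"),
  Service.level = c("Unimproved", "Open defecation", "Surface water"),
  Coverage      = c(10, 5, 8),
  stringsAsFactors = FALSE
)

# Convert every character column to a factor so summary() tabulates it.
is_chr <- vapply(toy, is.character, logical(1))
toy[is_chr] <- lapply(toy[is_chr], factor)

summary(toy)

# Row counts for a single column, e.g. rows per service type.
type_counts <- table(toy$Service.Type)
type_counts
```

The same two steps should apply to WA unchanged, since they only depend on which columns are of class character.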
Q.1 Which Region is comparatively behind on progress in household drinking water?
Q.2 Is there a relationship between a country's level of development and its progress on drinking water?
Q.3 Does population size affect drinking water, sanitation, and hygiene? A larger population can strain the availability of drinking water, sanitation, and hygiene services.
Let’s take a look at Regions that are relatively behind on development, such as Sub-Saharan Africa and Oceania:
WA |>
  filter(Region == 'Sub-Saharan Africa') |>
  filter(Service.Type == 'Drinking water') |>
  aggregate(Coverage ~ Service.level, mean)
## Service.level Coverage
## 1 Basic service 33.60807
## 2 Limited service 13.35015
## 3 Safely managed service 33.17985
## 4 Surface water 5.82792
## 5 Unimproved 14.03401
WA |>
  filter(Region == 'Oceania') |>
  filter(Service.Type == 'Drinking water') |>
  aggregate(Coverage ~ Service.level, mean)
## Service.level Coverage
## 1 At least basic 55.36870
## 2 Basic service 37.50377
## 3 Limited service 1.76193
## 4 Safely managed service 55.32068
## 5 Surface water 13.61583
## 6 Unimproved 16.76829
Let’s compare the data above with those of a comparatively developed Region, Europe and Northern America:
WA |>
  filter(Region == 'Europe and Northern America') |>
  filter(Service.Type == 'Drinking water') |>
  aggregate(Coverage ~ Service.level, mean)
## Service.level Coverage
## 1 Basic service 6.3636600
## 2 Limited service 0.3317933
## 3 Safely managed service 92.1247667
## 4 Surface water 0.0279600
## 5 Unimproved 1.1518233
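The three region-by-region summaries above repeat the same pipeline with a different filter each time. As a hedged alternative, the sketch below computes the mean coverage for every Region and Service.level pair in a single table with base R's tapply(). It runs on an invented data frame `toy` (the coverage values are made up, not drawn from WA), so only the shape of the result carries over to the real data.

```r
# Hypothetical stand-in for WA; coverage values are invented.
toy <- data.frame(
  Region        = rep(c("Oceania", "Europe and Northern America"), each = 4),
  Service.Type  = "Drinking water",
  Service.level = rep(c("Safely managed service", "Unimproved"), times = 4),
  Coverage      = c(50, 20, 60, 10, 90, 2, 94, 0),
  stringsAsFactors = FALSE
)

# Keep only the drinking water rows, as in the pipelines above.
water <- subset(toy, Service.Type == "Drinking water")

# Mean coverage for every Region x Service.level pair in one table:
# rows are Regions, columns are service levels.
by_region <- tapply(water$Coverage,
                    list(water$Region, water$Service.level),
                    mean)
round(by_region, 2)
```

A wide table like this puts the Regions side by side, which makes the kind of comparison drawn in the text easier to read off than three separate outputs.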
We can also compare the ‘Safely managed service’ and ‘Unimproved’ service levels across all Regions to see which Regions rank higher or lower:
WA |>
  filter(Service.Type == 'Drinking water') |>
  filter(Service.level == 'Safely managed service') |>
  aggregate(Coverage ~ Region, mean)
## Region Coverage
## 1 Australia and New Zealand 99.53387
## 2 Central and Southern Asia 67.47019
## 3 Eastern and South-Eastern Asia 76.75868
## 4 Europe and Northern America 92.12477
## 5 Latin America and the Caribbean 69.51487
## 6 Northern Africa and Western Asia 78.96726
## 7 Oceania 55.32068
## 8 Sub-Saharan Africa 33.17985
WA |>
  filter(Service.Type == 'Drinking water') |>
  filter(Service.level == 'Unimproved') |>
  aggregate(Coverage ~ Region, mean)
## Region Coverage
## 1 Australia and New Zealand 0.018150
## 2 Central and Southern Asia 2.044047
## 3 Eastern and South-Eastern Asia 2.350480
## 4 Europe and Northern America 1.151823
## 5 Latin America and the Caribbean 1.488903
## 6 Northern Africa and Western Asia 2.290723
## 7 Oceania 16.768293
## 8 Sub-Saharan Africa 14.034013
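From the two tables above, we can see that Sub-Saharan Africa and Oceania have made comparatively less progress on drinking water: they have the lowest mean coverage of safely managed services (33.2% and 55.3%) and the highest mean coverage of unimproved sources (14.0% and 16.8%).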
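To make that ranking explicit, the sketch below re-enters the regional means printed in the two tables above and sorts them. The column names `safely_managed` and `unimproved` are labels chosen here for illustration; the numbers are copied from the output, not recomputed from the CSV.

```r
# Regional means copied from the two tables above (drinking water, 2022).
means <- data.frame(
  Region = c("Australia and New Zealand", "Central and Southern Asia",
             "Eastern and South-Eastern Asia", "Europe and Northern America",
             "Latin America and the Caribbean",
             "Northern Africa and Western Asia",
             "Oceania", "Sub-Saharan Africa"),
  safely_managed = c(99.53387, 67.47019, 76.75868, 92.12477,
                     69.51487, 78.96726, 55.32068, 33.17985),
  unimproved = c(0.018150, 2.044047, 2.350480, 1.151823,
                 1.488903, 2.290723, 16.768293, 14.034013),
  stringsAsFactors = FALSE
)

# Sort from lowest to highest safely managed coverage.
ranked <- means[order(means$safely_managed), ]
ranked

# The two Regions with the lowest safely managed coverage.
lowest_safe <- head(ranked$Region, 2)
lowest_safe

# The two Regions with the highest unimproved coverage.
highest_unimproved <- head(means$Region[order(-means$unimproved)], 2)
highest_unimproved
```

Both rankings single out the same two Regions, Sub-Saharan Africa and Oceania, although in opposite order.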
ggplot(WA, aes(x = Service.Type,
               y = Population,
               fill = Region)) +
  geom_boxplot() +
  labs(x = "Different Types of Services",
       y = "Population using Services",
       title = "Population vs Types of Services in Different SDG Regions") +
  # Region is mapped to fill, so the palette must be set on the fill scale
  scale_fill_brewer(palette = 'Dark2')
ggplot(WA, aes(x = Service.level,
               y = Coverage,
               fill = Region)) +
  geom_boxplot() +
  labs(x = "Quality of Different Services",
       y = "% Coverage using Services",
       title = "% Coverage of Population vs Quality of Services in Different SDG Regions") +
  # Region is mapped to fill, so the palette must be set on the fill scale
  scale_fill_brewer(palette = 'Dark2')
The visualization of coverage by service level shows that Sub-Saharan Africa and Oceania have higher coverage at the poorest service levels, such as unimproved, open defecation, and no hand-washing facility.
It also shows that Sub-Saharan Africa and Oceania are lowest in terms of safely managed services.
In addition, Sub-Saharan Africa and Oceania rely on surface water more than any other Region. This could mean that people there use it for drinking or for sanitary purposes.
Together, these visualizations suggest that Regions such as Sub-Saharan Africa and Oceania, which lag behind other Regions in development, also lag in drinking water services.
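Because Service.level mixes drinking water, sanitation and hygiene categories on one axis, the second boxplot can be hard to read. One possible alternative, sketched here on an invented data frame `toy` (not the real WA data), draws one panel per Service.Type with facet_wrap(), so each panel only shows the service levels that belong to it.

```r
library(ggplot2)

# Hypothetical stand-in for WA; coverage values are invented.
toy <- data.frame(
  Region        = rep(c("Oceania", "Sub-Saharan Africa"), each = 4),
  Service.Type  = rep(c("Drinking water", "Sanitation"), times = 4),
  Service.level = rep(c("Unimproved", "Open defecation"), times = 4),
  Coverage      = c(17, 9, 15, 11, 14, 18, 12, 20),
  stringsAsFactors = FALSE
)

p <- ggplot(toy, aes(x = Service.level, y = Coverage, fill = Region)) +
  geom_boxplot() +
  # One panel per service type; free_x drops levels unused in a panel.
  facet_wrap(~ Service.Type, scales = "free_x") +
  # Region is mapped to fill, so the palette goes on the fill scale.
  scale_fill_brewer(palette = "Dark2") +
  labs(x = "Service level", y = "% coverage")

p
```

Whether faceting reads better than a single panel depends on how many service levels each type has in the real data, so this is a layout to try, not a required change.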
This analysis opens the door to further questions, such as how mortality rates compare between developed and developing regions, or how urbanization and unsanitary practices each affect health.