According to Wikipedia: “Gender equality is the state of equal ease of access to resources and opportunities regardless of gender, including economic participation and decision-making; and the state of valuing different behaviors, aspirations and needs equally, regardless of gender.”
Although the fight for gender equality dates back centuries, it has recently gained a more cohesive momentum, amplifying the voices of those, primarily women, who have long been put at a disadvantage in many aspects of life: housing, economic opportunities, and essential rights such as voting and education that some of us, as men, take for granted.
The Women’s March of 2017 signaled a turning point in this fight, when millions around the world united in protest against the suppression of equal rights. The March was fueled primarily by repeated public remarks undermining women at all levels of society, made by Donald J. Trump while he was a candidate for the presidency of the United States, and its start was marked by his win in the 2016 election.
Although many voices have been heard, and women around the world appear to be taking back their right to excel, the response of governments across the world to this movement shows there is still much work to do. This document examines the most recent Gender Statistics data from the World Bank, reviewing some of the factors that to date continue to limit the opportunities for women to contribute to the progress of our society.
Agency: an individual’s capacity to make decisions about their own life and act on them to achieve a desired outcome, free of violence, retribution, or fear.
Education: a group of qualitative and quantitative variables that represent the degree of schooling of males and females and their respective opportunities in a given country.
Per Capita Gross National Income - GNI: the total amount of money earned by a nation’s people and businesses, including investment income, regardless of where it was earned, plus money received from abroad such as foreign investment and economic development aid, divided by the country’s population.
Human Capital Index - HCI: an indicator of a country’s ability to mobilize the economic and professional potential of its citizens. It measures how much capital a given country loses through the lack of education and health. The HCI ranges between 0 and 1, where 1 indicates that the maximum potential for a given country has been reached.
The data for this study were obtained from the World Bank. They contain quantitative and qualitative variables pertaining to Agency, Economic and Social Context, Economic Opportunities, Education, Health, and Public Life and Decision Making, for females and males across 217 nations for the years 1960-2020 (last update: March 2021).
The data is published annually, and it was first released in July 2021. It is collected through API harvesting methods on a quarterly basis by the World Bank’s Gender Group and the Development Economics Data Group. Multiple sources of data are combined for this particular report, including World Bank datasets and international gender data portals.
Although the data set was compiled by a reputable institution, it may carry a certain degree of bias, because the World Bank relies on nations to report their own figures. As a result, some nations may report statistics in a way that favors them, hides national conflict, or avoids portraying a negative image of the nation.
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.3 ✓ purrr 0.3.4
## ✓ tibble 3.1.0 ✓ dplyr 1.0.2
## ✓ tidyr 1.1.2 ✓ stringr 1.4.0
## ✓ readr 1.3.1 ✓ forcats 0.5.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
##
## Attaching package: 'reshape'
## The following object is masked from 'package:dplyr':
##
## rename
## The following objects are masked from 'package:tidyr':
##
## expand, smiths
##
## Attaching package: 'plotly'
## The following object is masked from 'package:reshape':
##
## rename
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
## Linking to GEOS 3.8.1, GDAL 3.1.4, PROJ 6.3.1
##
## Attaching package: 'rio'
## The following object is masked from 'package:plotly':
##
## export
##
## Attaching package: 'scales'
## The following object is masked from 'package:purrr':
##
## discard
## The following object is masked from 'package:readr':
##
## col_factor
##
## Attaching package: 'raster'
## The following object is masked from 'package:patchwork':
##
## area
## The following object is masked from 'package:plotly':
##
## select
## The following object is masked from 'package:dplyr':
##
## select
## The following object is masked from 'package:tidyr':
##
## extract
## rgdal: version: 1.5-23, (SVN revision 1121)
## Geospatial Data Abstraction Library extensions to R successfully loaded
## Loaded GDAL runtime: GDAL 3.1.4, released 2020/10/20
## Path to GDAL shared files: /Library/Frameworks/R.framework/Versions/4.0/Resources/library/rgdal/gdal
## GDAL binary built with GEOS: TRUE
## Loaded PROJ runtime: Rel. 6.3.1, February 10th, 2020, [PJ_VERSION: 631]
## Path to PROJ shared files: /Library/Frameworks/R.framework/Versions/4.0/Resources/library/rgdal/proj
## Linking to sp version:1.4-5
## To mute warnings of possible GDAL/OSR exportToProj4() degradation,
## use options("rgdal_show_exportToProj4_warnings"="none") before loading rgdal.
Gender_Data_Raw<- read.csv("/Users/cruz-diazgroup/Desktop/1. William/1. School/DATA SCIENCE/2. DATA 110 Vis. & Com./3. Projects/Will_Lopez_Project2_Subject_Data_approval/Gender_StatsData.csv", check.names = FALSE, header = TRUE, sep = ",")
Gender_Data <- Gender_Data_Raw
#View(Gender_Data)
#str(Gender_Data)
#head(Gender_Data)
#tail(Gender_Data)
which(colnames(Gender_Data) == '')
## [1] 66
Gender_Data<- Gender_Data[-c(66)]
which(colnames(Gender_Data) == "Indicator Name")
## [1] 3
Gender_Data<- Gender_Data[-c(3)]
#head(Gender_Data)
which(colnames(Gender_Data) == "1960")
## [1] 4
which(colnames(Gender_Data) == "1989")
## [1] 33
Gender_Data<- Gender_Data[-c(4:33)]
#head(Gender_Data)
Gender_Data <-Gender_Data %>%
filter(!`Country Code` %in% c("ARB","CSS","CEB","EAR","EAS","EAP","TEA","EMU","ECS","ECA","TEC","EUU","FCS","HPC","HIC","IBD","IBT","IDB","IDX","IDA","LTE","LCN","LAC","TLA","LDC","LMY","LIC","LMC","MEA","MNA","TMN","MIC","NAC","OED","OSS","PSS","PST","PRE","SST","SAS","TSA","SSF","SSA","TSS","UMC","WLD"))
#head(Gender_Data)
Gender_Data<-Gender_Data %>%
mutate(`2020` = coalesce(`2020`,`2019`))
# Check that missing 2020 values were filled with the 2019 figures, using GNI per capita (NY.GNP.PCAP.CD) as an example
#Gender_Data%>%
#filter(`Indicator Code`=="NY.GNP.PCAP.CD")
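The fill step above relies on `dplyr::coalesce()`, which returns the first non-missing value across its arguments, element by element. The following is a minimal sketch on a made-up data frame; the country labels and numbers are hypothetical and are not World Bank figures.

```r
library(dplyr)

# Hypothetical toy data: one row per country, one column per year
toy <- data.frame(country = c("A", "B", "C"),
                  `2019` = c(100, 200, NA),
                  `2020` = c(NA, 250, NA),
                  check.names = FALSE)

# Where 2020 is missing, fall back to the 2019 value; otherwise keep 2020
toy_filled <- toy %>%
  mutate(`2020` = coalesce(`2020`, `2019`))

toy_filled$`2020`
```

One consequence worth keeping in mind is that, after this step, the 2020 column mixes genuine 2020 observations with values carried forward from 2019.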
Gender_Data_Long <- Gender_Data %>%
pivot_longer('1990':'2020', names_to = "year", values_to = "value")
#head(Gender_Data_Long)
#tail(Gender_Data_Long)
Income levels are assigned from per capita GNI (USD), using the following brackets:
Low Income: countries with annual income below $1,026
Lower-Middle Income: countries with annual income between $1,026 and $3,995
Upper-Middle Income: countries with annual income between $3,996 and $12,375
High Income: countries with annual income of $12,376 or more
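As an illustration of how brackets like these can be applied without leaving gaps at the boundaries, here is a small base R sketch using `cut()` with left-closed intervals. The input values are hypothetical, and treating each break as the lowest value of the next bracket is an assumption made for this sketch, not necessarily the rule used elsewhere in this report.

```r
# Hypothetical per capita GNI values in USD (not real country data)
gni <- c(800, 1026, 3995.5, 12375, 12376, NA)

# Left-closed intervals [a, b): each break is the lowest value of the next bracket
income_level <- cut(gni,
                    breaks = c(-Inf, 1026, 3996, 12376, Inf),
                    labels = c("Low", "Lower-Middle", "Upper-Middle", "High"),
                    right = FALSE)

as.character(income_level)
```

Because every finite value falls into exactly one interval, a value such as 3995.5 cannot slip between two brackets, and missing inputs stay missing.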
# Classify each country by its 2019 GNI per capita; case_when() leaves no gaps between brackets
Country_GNI_Level <- Gender_Data_Long %>%
filter(year == 2019, `Indicator Code` == "NY.GNP.PCAP.CD") %>%
mutate(`Income Level` = case_when(value < 1026 ~ "Low", value < 3996 ~ "Lower-Middle", value < 12376 ~ "Upper-Middle", value >= 12376 ~ "High"))
#head(Country_GNI_Level)
which(colnames(Country_GNI_Level) == "Country Name")
## [1] 1
which(colnames(Country_GNI_Level) == "Income Level")
## [1] 6
Country_GNI_Level <- Country_GNI_Level[c(1,6)]
#head(Country_GNI_Level)
Gender_Data_Long <-left_join(Gender_Data_Long, Country_GNI_Level, by = "Country Name")
#head(Gender_Data_Long)
Gender_Data_Wide <-Gender_Data_Long %>%
pivot_wider(names_from = `Indicator Code`, values_from = value)
#head(Gender_Data_Wide)
Gender_Ref <- read.csv("/Users/cruz-diazgroup/Desktop/1. William/1. School/DATA SCIENCE/2. DATA 110 Vis. & Com./3. Projects/Will_Lopez_Project2_Subject_Data_approval/Gender_Stats_csv/Gender_Ref.csv", check.names = FALSE, header = TRUE, sep = ",")
#head(Gender_Ref)
Gender_Data_Long1 <-left_join(Gender_Data_Long, Gender_Ref, by = "Indicator Code")
#head(Gender_Data_Long1)
#tail(Gender_Data_Long1)
A general look at the data reveals that GDP per capita is strongly right-skewed and that incomes differ widely among the grouped nations, with many outliers, particularly among high-income nations. The vast majority of nations fall within the low, lower-middle, and upper-middle income groups.
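Right skew can also be checked numerically rather than only by eye. The sketch below compares mean and median and computes a moment-based skewness on a small made-up vector; the values are illustrative only, and the helper `skewness()` is defined here for the example rather than taken from a package.

```r
# Hypothetical right-skewed values (illustrative only, not World Bank data)
x <- c(500, 800, 1200, 2000, 3500, 6000, 15000, 45000, 110000)

# Moment-based skewness: third central moment divided by the 1.5 power of the second
skewness <- function(v) {
  v <- v[!is.na(v)]
  m <- mean(v)
  mean((v - m)^3) / mean((v - m)^2)^1.5
}

c(mean = mean(x), median = median(x))  # a mean well above the median suggests right skew
skewness(x)                            # positive for a right-skewed sample
skewness(log10(x))                     # closer to zero after a log transform
```

A log scale on the x axis is one common way to make distributions like this easier to read in histograms and boxplots.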
h1<-Gender_Data_Wide %>%
filter(!is.na(NY.GDP.PCAP.CD), year == 2019,!is.na(`Income Level`))%>% ggplot(
mapping = aes(x = NY.GDP.PCAP.CD)) +
geom_histogram(bins = 20, fill = "slategray1") +
theme(axis.text = element_text(angle = 45)) +
xlab("USD") + ylab("Freq.")
b1<-Gender_Data_Wide%>%
filter(!is.na(NY.GDP.PCAP.CD), year == 2019,!is.na(`Income Level`))%>% ggplot(
mapping = aes(x = NY.GDP.PCAP.CD)) +
geom_boxplot(fill = "slategray1") +
theme(axis.text = element_text(angle = 45)) +
xlab("USD") + ylab("")
q1<-Gender_Data_Wide%>%
filter(!is.na(NY.GDP.PCAP.CD), year == 2019,!is.na(`Income Level`))%>%
ggplot(aes(sample = NY.GDP.PCAP.CD)) +
geom_line(stat = "qq") +
theme(axis.text = element_text(angle = 45)) +
xlab("Theoretical Quantiles") + ylab("Sample Quantiles")
(h1 + b1 + q1 +
plot_layout(ncol = 3, nrow = 1) +
plot_annotation(title = "GDP per capita all income countries",
subtitle = "Year 2019",
caption = "Global Gender Statistics 2020 - World Bank"))
hist1<-Gender_Data_Wide %>%
filter(!is.na(NY.GDP.PCAP.CD), year == 2019, `Income Level` == "Low")%>% ggplot(
mapping = aes(x = NY.GDP.PCAP.CD)) +
geom_histogram(bins = 10, fill = "aquamarine3") +
theme(axis.text = element_text(angle = 45)) +
xlab("USD") + ylab("Freq.")
hist2<-Gender_Data_Wide %>%
filter(!is.na(NY.GDP.PCAP.CD), year == 2019, `Income Level` == "Lower-Middle")%>% ggplot(
mapping = aes(x = NY.GDP.PCAP.CD)) +
geom_histogram(bins = 10, fill = "darkseagreen") +
theme(axis.text = element_text(angle = 45)) +
xlab("USD") + ylab("Freq.")
hist3<-Gender_Data_Wide %>%
filter(!is.na(NY.GDP.PCAP.CD), year == 2019, `Income Level` == "Upper-Middle")%>% ggplot(
mapping = aes(x = NY.GDP.PCAP.CD)) +
geom_histogram(bins = 10, fill = "deepskyblue3") +
theme(axis.text = element_text(angle = 45)) +
xlab("USD") + ylab("Freq.")
hist4<-Gender_Data_Wide %>%
filter(!is.na(NY.GDP.PCAP.CD), year == 2019, `Income Level` == "High")%>% ggplot(
mapping = aes(x = NY.GDP.PCAP.CD)) +
geom_histogram(bins = 10, fill = "deepskyblue4") +
theme(axis.text = element_text(angle = 45)) +
xlab("USD") + ylab("Freq.")
box1<-Gender_Data_Wide%>%
filter(!is.na(NY.GDP.PCAP.CD), year == 2019, `Income Level`== "Low")%>% ggplot(
mapping = aes(x = NY.GDP.PCAP.CD)) +
geom_boxplot(fill = "aquamarine3") +
theme(axis.text = element_text(angle = 45)) +
xlab("USD") + ylab("")
box2<-Gender_Data_Wide%>%
filter(!is.na(NY.GDP.PCAP.CD), year == 2019, `Income Level`== "Lower-Middle")%>% ggplot(
mapping = aes(x = NY.GDP.PCAP.CD)) +
geom_boxplot(fill = "darkseagreen") +
theme(axis.text = element_text(angle = 45)) +
xlab("USD") + ylab("")
box3<-Gender_Data_Wide%>%
filter(!is.na(NY.GDP.PCAP.CD), year == 2019, `Income Level`== "Upper-Middle")%>% ggplot(
mapping = aes(x = NY.GDP.PCAP.CD)) +
geom_boxplot(fill = "deepskyblue3") +
theme(axis.text = element_text(angle = 45)) +
xlab("USD") + ylab("")
box4<-Gender_Data_Wide%>%
filter(!is.na(NY.GDP.PCAP.CD), year == 2019, `Income Level`== "High")%>% ggplot(
mapping = aes(x = NY.GDP.PCAP.CD)) +
geom_boxplot(fill = "deepskyblue4") +
theme(axis.text = element_text(angle = 45)) +
xlab("USD") + ylab("")
# Q-Q plots restricted to 2019 and to non-missing values, matching the histograms and boxplots above
qq1<-Gender_Data_Wide%>%
filter(!is.na(NY.GDP.PCAP.CD), year == 2019, `Income Level`== "Low") %>%
ggplot(aes(sample = NY.GDP.PCAP.CD)) +
geom_line(stat = "qq") +
theme(axis.text = element_text(angle = 45)) +
xlab("Theoretical Quantiles") + ylab("Sample Quantiles")
qq2<-Gender_Data_Wide%>%
filter(!is.na(NY.GDP.PCAP.CD), year == 2019, `Income Level`== "Lower-Middle") %>%
ggplot(aes(sample = NY.GDP.PCAP.CD)) +
geom_line(stat = "qq") +
theme(axis.text = element_text(angle = 45)) +
xlab("Theoretical Quantiles") + ylab("Sample Quantiles")
qq3<-Gender_Data_Wide%>%
filter(!is.na(NY.GDP.PCAP.CD), year == 2019, `Income Level`== "Upper-Middle") %>%
ggplot(aes(sample = NY.GDP.PCAP.CD)) +
geom_line(stat = "qq") +
theme(axis.text = element_text(angle = 45)) +
xlab("Theoretical Quantiles") + ylab("Sample Quantiles")
qq4<-Gender_Data_Wide%>%
filter(!is.na(NY.GDP.PCAP.CD), year == 2019, `Income Level`== "High") %>%
ggplot(aes(sample = NY.GDP.PCAP.CD)) +
geom_line(stat = "qq") +
theme(axis.text = element_text(angle = 45)) +
xlab("Theoretical Quantiles") + ylab("Sample Quantiles")
(hist1 + hist2 + hist3 + hist4 +
box1 + box2 + box3 + box4 +
qq1 + qq2 + qq3 + qq4 +
plot_layout(ncol = 4, nrow = 3) +
plot_annotation(title = "GDP per capita for Low, Lower-Middle, Upper-Middle, and High income countries",
subtitle = "Year 2019",
caption = "Global Gender Statistics 2020 - World Bank"))
# Keep only the 2020 observations
Gender_Data_Wide1<-Gender_Data_Wide %>%
filter(year == 2020)
As defined earlier, “Agency” refers to an individual’s capacity to make decisions about their own life and act on them to achieve a desired outcome, free of violence, retribution, or fear. We looked at some variables pertaining to travel equality between females and males, and found that there are still nations where females do not have the same rights as males to obtain a passport, to travel outside the home, or to travel outside the country.
# Add labels under the tops of bars
Gender_Data_Long1 %>%
filter(!is.na(`Income Level`), `Indicator Code` %in% c("SG.APL.PSPT.EQ", "SG.HME.TRVL.EQ","SG.CTR.TRVL.EQ"), year == 2020, !is.na(value)) %>%
group_by(`Indicator Code`)%>%
ggplot(aes(x=`Indicator Code`, y=`value`, fill=`Income Level`)) +
geom_bar(stat="identity") +
geom_text(aes(label=`value`), vjust=3.5, colour="black", size=3.5) +
labs(title = "Female Travel Equality",
subtitle = "All Income Nations - 2020",
x = "Obtaining a Passport - Out of Home Travel - Out of Country Travel", y = "Frequencies",
caption = "data: world bank",
fill = "") +
scale_fill_brewer(palette = "Pastel2")
Gender_Data_Wide1%>%
dplyr::select(`Income Level`, `SG.APL.PSPT.EQ`,`SG.HME.TRVL.EQ`,`SG.CTR.TRVL.EQ`,`Country Name`)%>%
filter(!is.na(`Income Level`))%>%
group_by(`Income Level`) %>%
summarise("Passport Equality %Y" = round((sum(`SG.APL.PSPT.EQ` == "1", na.rm = TRUE) / n_distinct(`Country Name`)*100),digits = 0),"Out-of-home Equality %Y" = round((sum(`SG.HME.TRVL.EQ` == "1", na.rm = TRUE) / n_distinct(`Country Name`)*100),digits = 0), "Out-of-Country Equality %Y" = round((sum(`SG.CTR.TRVL.EQ` == "1", na.rm = TRUE) / n_distinct(`Country Name`)*100),digits = 0),"TTL Reported (each group)" = n_distinct(`Country Name`)) %>%
arrange(`TTL Reported (each group)`)
## `summarise()` ungrouping output (override with `.groups` argument)
## # A tibble: 4 x 5
## `Income Level` `Passport Equality… `Out-of-home Equalit… `Out-of-Country Equa…
## <chr> <dbl> <dbl> <dbl>
## 1 Low 81 90 95
## 2 Lower-Middle 76 98 100
## 3 Upper-Middle 78 93 94
## 4 High 87 88 92
## # … with 1 more variable: TTL Reported (each group) <int>
Gender_Data_Wide1%>%
dplyr::select(`Income Level`, `SG.APL.PSPT.EQ`,`SG.HME.TRVL.EQ`,`SG.CTR.TRVL.EQ`,`Country Name`)%>%
filter(!is.na(`Income Level`))%>%
group_by(`Income Level`) %>%
summarise("Passport Equality %N" = round((sum(`SG.APL.PSPT.EQ` == "0", na.rm = TRUE) / n_distinct(`Country Name`)*100),digits = 0),"Out-of-home Equality %N" = round((sum(`SG.HME.TRVL.EQ` == "0", na.rm = TRUE) / n_distinct(`Country Name`)*100),digits = 0), "Out-of-Country Equality %N" = round((sum(`SG.CTR.TRVL.EQ` == "0", na.rm = TRUE) / n_distinct(`Country Name`)*100),digits = 0),"TTL Reported (each group)" = n_distinct(`Country Name`)) %>%
arrange(`TTL Reported (each group)`)
## `summarise()` ungrouping output (override with `.groups` argument)
## # A tibble: 4 x 5
## `Income Level` `Passport Equality… `Out-of-home Equalit… `Out-of-Country Equa…
## <chr> <dbl> <dbl> <dbl>
## 1 Low 19 10 5
## 2 Lower-Middle 24 2 0
## 3 Upper-Middle 20 6 4
## 4 High 8 7 3
## # … with 1 more variable: TTL Reported (each group) <int>
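The percentages in the two tables above are computed over every country in an income group, including countries with no reported value, which is why the "%Y" and "%N" columns do not always add up to 100. The sketch below shows the effect of the denominator on a made-up indicator vector; the values and variable names are hypothetical.

```r
# Hypothetical indicator values for five countries in one income group:
# 1 = equal rights, 0 = not equal, NA = not reported
passport_eq <- c(1, 1, 0, NA, 1)

n_all      <- length(passport_eq)        # every country in the group
n_reported <- sum(!is.na(passport_eq))   # only countries with a reported value

# Shares over all countries (the approach used in the tables above)
pct_yes_all <- round(sum(passport_eq == 1, na.rm = TRUE) / n_all * 100)
pct_no_all  <- round(sum(passport_eq == 0, na.rm = TRUE) / n_all * 100)

# Shares over reporting countries only; these always sum to 100
pct_yes_reported <- round(sum(passport_eq == 1, na.rm = TRUE) / n_reported * 100)
pct_no_reported  <- round(sum(passport_eq == 0, na.rm = TRUE) / n_reported * 100)
```

Either denominator can be defensible; the point is only that the choice should be stated next to the table.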
The Human Capital Index is an indicator of a country’s ability to mobilize the economic and professional potential of its citizens. It measures how much capital a given country loses through the lack of education and health. The HCI ranges between 0 and 1, where 1 indicates that the maximum potential for a given country has been reached. In general, citizens of wealthy nations tend to fare better in test scores than citizens of poorer countries. The graph below shows that test scores go hand in hand with a nation’s investment in its citizens’ education and wellbeing. A few outliers, though, show that a high GDP per capita does not necessarily result in high test scores in the absence of a high HCI. Qatar, the United Arab Emirates, Kuwait, and Luxembourg, for instance, have a much higher per capita GDP than Spain and Italy; nevertheless, Spain and Italy have higher HCI values and test results. The same holds true for the United States compared with Estonia and Poland.
fig<-plot_ly(
Gender_Data_Wide1, x = ~HD.HCI.OVRL, y = ~HD.HCI.HLOS,
text = ~paste("Country: ",`Country Name`,"<br>Female scores:", HD.HCI.HLOS.FE, "<br>Male scores:", HD.HCI.HLOS.MA, "<br>Expected years of schooling (f):",HD.HCI.EYRS.FE, "<br>Expected years of schooling (m):",HD.HCI.EYRS.MA),
color = ~HD.HCI.HLOS, size = ~NY.GDP.PCAP.CD
)
fig <- fig %>% layout(
title = '2020 Harmonized Test Scores vs The Human Capital Index',
xaxis = list(
title = 'Human Capital Index - HCI'
),
yaxis = list(
title = 'Harmonized Test Scores'
)
)
fig
## No trace type specified:
## Based on info supplied, a 'scatter' trace seems appropriate.
## Read more about this trace type -> https://plot.ly/r/reference/#scatter
## No scatter mode specifed:
## Setting the mode to markers
## Read more about this attribute -> https://plot.ly/r/reference/#scatter-mode
## Warning: Ignoring 43 observations
## Warning: `arrange_()` was deprecated in dplyr 0.7.0.
## Please use `arrange()` instead.
## See vignette('programming') for more help
## Warning: `line.width` does not currently support multiple values.
Another important insight from this review concerns expected years of schooling and test results by gender. Expected years of education appear to be associated with test performance for males and females, although these summaries alone do not establish causation. Across all income levels except low-income countries, females are expected to complete more years of education than males, and in those same groups the median and mean female test scores are higher than those for males. The boxplots also show that female test scores generally reach higher maximum values than male scores.
# Distribution of female test scores by country income level (ordered from Low to High)
Gender_Data_Wide1$`Income Level` <- factor(Gender_Data_Wide1$`Income Level`, levels = c("Low", "Lower-Middle", "Upper-Middle","High"))
fig2 <- plot_ly(
data=Gender_Data_Wide1,
x = ~`Income Level`,
y = ~HD.HCI.HLOS.FE,
color = ~`Income Level`,
type = "box",
showlegend = FALSE,
text = ~paste("Country: ",`Country Name`,"<br>Female scores:", HD.HCI.HLOS.FE,
"<br>Male scores:", HD.HCI.HLOS.MA,
"<br>Expected years of schooling (f):", HD.HCI.EYRS.FE,
"<br>Expected years of schooling (m):", HD.HCI.EYRS.MA)
)
fig2 <- fig2 %>% layout(
title = '2020 Harmonized Female Test Scores Distribution',
xaxis = list(
title = '"Per Income" Country Categories'
),
yaxis = list(
title = 'Harmonized Test Scores'
)
)
# Distribution of male test scores by country income level
# (`Income Level` was already converted to an ordered factor for the female plot)
fig3 <- plot_ly(
data=Gender_Data_Wide1,
x = ~`Income Level`,
y = ~HD.HCI.HLOS.MA,
color = ~`Income Level`,
colors = "BrBG",
type = "box",
showlegend = FALSE,
text = ~paste("Country: ",`Country Name`,"<br>Female scores:", HD.HCI.HLOS.FE,
"<br>Male scores:", HD.HCI.HLOS.MA,
"<br>Expected years of schooling (f):", HD.HCI.EYRS.FE,
"<br>Expected years of schooling (m):", HD.HCI.EYRS.MA)
)
fig3 <- fig3 %>% layout(
title = '2020 Harmonized Male Test Scores Distribution',
xaxis = list(
title = '"Per Income" Country Categories'
),
yaxis = list(
title = 'Harmonized Test Scores'
)
)
figA <- subplot(fig2, fig3, nrows = 1)
## Warning: Ignoring 66 observations
## Warning: Ignoring 66 observations
figA <- figA %>% layout(title = "2020 Harmonized Test Scores Distribution by Income Categories (Females & Males)",
yaxis = list(title = 'Harmonized Test Scores'))
figA
# Mean test scores and expected years of education for both females & males
Gender_Data_Wide1%>%
dplyr::select(`Income Level`,`HD.HCI.HLOS.FE`,`HD.HCI.HLOS.MA`,`HD.HCI.EYRS.FE`,`HD.HCI.EYRS.MA`)%>%
filter(!is.na(`Income Level`))%>%
group_by(`Income Level`) %>%
summarise(Females = mean(`HD.HCI.HLOS.FE`, na.rm = TRUE), "Exp yrs (f)" = mean(`HD.HCI.EYRS.FE`,na.rm=TRUE), Males =mean(`HD.HCI.HLOS.MA`,na.rm = TRUE), "Exp yrs (m)" = mean(`HD.HCI.EYRS.MA`,na.rm=TRUE))
## `summarise()` ungrouping output (override with `.groups` argument)
## # A tibble: 4 x 5
## `Income Level` Females `Exp yrs (f)` Males `Exp yrs (m)`
## <fct> <dbl> <dbl> <dbl> <dbl>
## 1 Low 354. 7.10 355. 7.61
## 2 Lower-Middle 386. 10.3 381. 10.2
## 3 Upper-Middle 418. 12.1 405. 11.8
## 4 High 495. 13.3 481. 13.1
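To make the comparison in the summary table easier to read, the female-minus-male gaps can be computed directly. The sketch below re-enters the group means as printed above (so they carry the rounding of the printed table), and the column names are chosen for this example only.

```r
# Group means as printed in the summary table above (rounded by the tibble printer)
scores <- data.frame(
  income_level = c("Low", "Lower-Middle", "Upper-Middle", "High"),
  females   = c(354, 386, 418, 495),
  males     = c(355, 381, 405, 481),
  exp_yrs_f = c(7.10, 10.3, 12.1, 13.3),
  exp_yrs_m = c(7.61, 10.2, 11.8, 13.1)
)

# Positive gaps mean females are ahead of males
scores$score_gap <- scores$females - scores$males
scores$years_gap <- round(scores$exp_yrs_f - scores$exp_yrs_m, 2)

scores[, c("income_level", "score_gap", "years_gap")]
```

In these rounded figures the two gaps share the same sign in every income group: both are negative for low-income countries and positive elsewhere.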
It is notable that, to date, wealthy nations do not fully support infant development and women’s well-being when it comes to maternity leave. The map below shows a glaring disconnect between nations’ per capita GDP and the length of their paid maternity leave.
We load the “rnaturalearth” and “rnaturalearthdata” libraries to get the world vector map and the country polygons used to map our data insights.
library(rnaturalearth)
library(rnaturalearthdata)
world <- ne_countries(scale = "medium", returnclass = "sf")
class(world)
## [1] "sf" "data.frame"
We create a new data frame, mapGDP, with the GDP, maternity leave, and retirement age variables to map.
mapGDP <-dplyr::select(Gender_Data_Wide1, year, `Country Name`,`Country Code`, "Per Capita GDP" = NY.GDP.PCAP.CD, "Days of Paid Maternity" = SH.MMR.LEVE, "Retirement Age with Benefits (f)" = SG.AGE.RTRE.FL.FE, "Retirement Age with Benefits (m)" =SG.AGE.RTRE.FL.MA, `Income Level`)%>%
filter(year == 2020)
We add the variables we wish to map from the “mapGDP” data frame to the world map, matching on the three-letter country code.
world <- left_join(world, mapGDP, by = c("adm0_a3" = "Country Code"))
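A join on country codes silently leaves `NA` values wherever the two code lists disagree, so it can be worth listing the codes that appear on only one side. The sketch below uses base R `setdiff()` on two made-up tables; the codes "AAA" to "DDD" are placeholders, not real country codes.

```r
# Hypothetical stand-ins for the map layer and the indicator table
map_codes  <- data.frame(adm0_a3 = c("AAA", "BBB", "CCC"))
data_codes <- data.frame(`Country Code` = c("AAA", "BBB", "DDD"),
                         value = c(1, 2, 3),
                         check.names = FALSE)

# Codes on the map with no data: these polygons would be drawn as missing
missing_from_data <- setdiff(map_codes$adm0_a3, data_codes$`Country Code`)

# Codes in the data with no polygon: these rows are dropped by a left join from the map
missing_from_map <- setdiff(data_codes$`Country Code`, map_codes$adm0_a3)

# With the objects in this report, the same check would compare
# world$adm0_a3 against mapGDP$`Country Code`
```

Any codes returned by either call are candidates for manual recoding before mapping.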
We map the world data set with the variables “Per Capita GDP” and “Days of Paid Maternity” side by side.
tmap_mode("view")
## tmap mode set to interactive viewing
tm_shape(world) +
tm_polygons(c("Per Capita GDP", "Days of Paid Maternity")) +
tm_facets(sync = TRUE, ncol = 2)
Gender Statistics sources of data:
https://datacatalog.worldbank.org/dataset/gender-statistics
https://www.worldbank.org/en/data/datatopics/gender/data-resources
Gender Statistics indicators: https://www.worldbank.org/en/data/datatopics/gender/indicators
https://www.investopedia.com/terms/g/gross-national-income-gni.asp
https://www.r-spatial.org/r/2018/10/25/ggplot2-sf.html