How has the proportion of men and women in the workforce differed over time?
The World Development Indicators (WDI) Dataset is provided by the
United Nations Educational, Scientific and Cultural Organization
(UNESCO). It contains national, regional, and global estimates of
development and population data.
Source: https://data.unesco.org/explore/dataset/wdi001/information/
Key Variables:
-Year
-Regional Group
-Labor Force (Male, Female, Total)
-Population (Male, Female)
This dataset includes 6326 observations and 25 variables.
library(tidyverse)
library(RColorBrewer)
countries <- read_csv("wdi001.csv")
I will filter the dataset to only include countries in Asia, group by year, and summarize to find the mean population information for men and women. I will also find the mean proportions of men and women in the workforce. With this, I will create a supporting line plot to illustrate the difference in these proportions over time.
#str(countries)
head(countries)
## # A tibble: 6 × 25
## Year `Regional Group` Country `GDP Total` `GDP Growth Rate` `GDP Per Capita`
## <dbl> <chr> <chr> <dbl> <dbl> <dbl>
## 1 1964 Europe and North… Andorra NA NA NA
## 2 1968 Europe and North… Andorra NA NA NA
## 3 1970 Europe and North… Andorra 78617711. NA 3935.
## 4 1972 Europe and North… Andorra 113414397. 8.15 4940.
## 5 1974 Europe and North… Andorra 186557082. 5.62 7140.
## 6 1977 Europe and North… Andorra 253997897. 2.84 8168.
## # ℹ 19 more variables: `Youth Literacy Rate` <dbl>,
## # `Adult Literacy Rate` <dbl>, `Primary School Enrollment` <dbl>,
## # `Secondary School Enrollment` <dbl>, `Tertiary School Enrollment` <dbl>,
## # `Labor Force Female` <dbl>, `Labor Force Male` <dbl>,
## # `Labor Force Total` <dbl>, `Unemployment Rate` <dbl>,
## # `Life Expectancy` <dbl>, `Population Aged 0-14` <dbl>,
## # `Population Aged 15-64` <dbl>, `Population Aged 65-up` <dbl>, …
names(countries) <- gsub(" ", "_", names(countries)) # replace spaces with underscores
names(countries) <- tolower(names(countries)) # make variable names lowercase
countries <- countries |>
  filter(!is.na(labor_force_total))
countriesasia <- countries |>
  select(year,
         country,
         regional_group,
         labor_force_female,
         labor_force_male,
         labor_force_total,
         female_population,
         male_population,
         population_total) |>
  filter(regional_group %in% c("Asia and the Pacific",
                               "Arab States",
                               "Arab States,Asia and the Pacific"))
# Find male/female population counts
countriesasia <- countriesasia |>
  mutate(population_female = (female_population / 100) * population_total,
         population_male = (male_population / 100) * population_total)
# Find mean of labor percentages across Asia, by year.
countriesasia2 <- countriesasia |>
  group_by(year) |>
  summarise(mean_total_labor = mean(labor_force_total),
            mean_female_labor = mean(labor_force_female),
            mean_male_labor = mean(labor_force_male),
            mean_female_pop = mean(population_female),
            mean_male_pop = mean(population_male)) |>
  pivot_longer(cols = c(mean_total_labor, # separate percentages for graphing
                        mean_female_labor,
                        mean_male_labor),
               names_to = "Category",
               values_to = "Percentage")
ggplot(countriesasia2, aes(x = year, y = Percentage, color = Category)) +
  geom_line(linewidth = 0.9) + # `linewidth` replaces the deprecated `size` for lines (ggplot2 >= 3.4.0)
  geom_point() +
  labs(title = "Avg. % Population in Asia's Labor Force over Time (1990-2024)",
       x = "Year",
       y = "Avg. % Population in Labor Force",
       caption = "Source: UNESCO") +
  scale_color_brewer(palette = "Dark2", name = "",
                     labels = c("Female", "Male", "Total")) +
  ylim(40, 80) +
  theme_bw()
Two Proportions Z-Test
Is the proportion of males in the workforce higher than the proportion of females in the workforce in Asia, in both 1990 and 2024?
\(H_0\): \(p_1 = p_2\)
\(H_a\): \(p_1 > p_2\)
Where:
\(p_1\) = proportion of the male population in the workforce
\(p_2\) = proportion of the female population in the workforce
head(countriesasia2, 3)
## # A tibble: 3 × 5
## year mean_female_pop mean_male_pop Category Percentage
## <dbl> <dbl> <dbl> <chr> <dbl>
## 1 1990 30306549. 31488399. mean_total_labor 61.4
## 2 1990 30306549. 31488399. mean_female_labor 43.1
## 3 1990 30306549. 31488399. mean_male_labor 77.7
tail(countriesasia2, 3)
## # A tibble: 3 × 5
## year mean_female_pop mean_male_pop Category Percentage
## <dbl> <dbl> <dbl> <chr> <dbl>
## 1 2024 47608549. 49355122. mean_total_labor 60.5
## 2 2024 47608549. 49355122. mean_female_labor 46.0
## 3 2024 47608549. 49355122. mean_male_labor 73.2
#1990: mean male population * male labor percentage
31488399 * .7768706
## [1] 24462411
#Mean female population * female labor percentage
30306549 * .4306110
## [1] 13050333
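The two products above can also be computed from a small table rather than retyped by hand. The following is a minimal sketch, assuming the 1990 values shown in this report; the object `labor_1990` and its column names are hypothetical and are not part of the original analysis.

```r
library(tidyverse)

# Illustrative copy of the 1990 values used above.
# `labor_1990` and its column names are hypothetical.
labor_1990 <- tibble(
  sex        = c("male", "female"),
  mean_pop   = c(31488399, 30306549),   # mean population per country
  labor_rate = c(0.7768706, 0.4306110)  # mean labor force participation rate
)

# Estimated people in the labor force = mean population * participation rate
labor_1990 <- labor_1990 |>
  mutate(in_labor_force = round(mean_pop * labor_rate))

labor_1990
```

The `in_labor_force` column can then be passed to `prop.test()` as the vector of successes, with `mean_pop` as the vector of trials.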
prop.test(c(24462411, 13050333), c(31488399, 30306549), alternative = "greater")
##
## 2-sample test for equality of proportions with continuity correction
##
## data: c(24462411, 13050333) out of c(31488399, 30306549)
## X-squared = 7762054, df = 1, p-value < 2.2e-16
## alternative hypothesis: greater
## 95 percent confidence interval:
## 0.3460678 1.0000000
## sample estimates:
## prop 1 prop 2
## 0.7768706 0.4306110
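As a cross-check on the output above, the pooled two-proportion z statistic can be computed by hand. This is a sketch of the textbook formula using the same 1990 counts; the variable names are illustrative. Note that `prop.test()` applies a continuity correction by default, so the comparison below turns it off with `correct = FALSE`.

```r
# Counts and group sizes for 1990, copied from the prop.test() call above
x <- c(24462411, 13050333)  # estimated people in the labor force (male, female)
n <- c(31488399, 30306549)  # mean population sizes (male, female)

p_hat  <- x / n             # sample proportions p1 and p2
p_pool <- sum(x) / sum(n)   # pooled proportion under H0: p1 = p2

# Standard error of (p1 - p2) under H0, then the z statistic
se <- sqrt(p_pool * (1 - p_pool) * sum(1 / n))
z  <- (p_hat[1] - p_hat[2]) / se

# One-sided p-value for Ha: p1 > p2
p_value <- pnorm(z, lower.tail = FALSE)

# Without the continuity correction, the X-squared statistic from
# prop.test() should equal z squared
chk <- prop.test(x, n, alternative = "greater", correct = FALSE)
all.equal(z^2, unname(chk$statistic))
```

Because the group sizes here are in the tens of millions, the standard error is tiny and the statistic is very large, which is why the reported p-value sits at the smallest value R prints.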
#2024: mean male population * male labor percentage
49355122 * .7316760
## [1] 36111958
#Mean female population * female labor percentage
47608549 * .4599562
## [1] 21897847
prop.test(c(36111958, 21897847), c(49355122, 47608549), alternative = "greater")
##
## 2-sample test for equality of proportions with continuity correction
##
## data: c(36111958, 21897847) out of c(49355122, 47608549)
## X-squared = 7444178, df = 1, p-value < 2.2e-16
## alternative hypothesis: greater
## 95 percent confidence interval:
## 0.2715621 1.0000000
## sample estimates:
## prop 1 prop 2
## 0.7316760 0.4599562
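P-value: < 2.2e-16 (for both the 1990 and the 2024 test)
α = 0.05
Because the p-value is below α in both years, we reject \(H_0\). There is strong evidence to support the alternative hypothesis that the proportion of males in the workforce is higher than the proportion of females in the workforce in Asia.
95% one-sided CI for the difference \(p_1 - p_2\): (0.346, 1.000) in 1990 and (0.272, 1.000) in 2024
Both intervals are entirely above 0, showing there is a higher proportion
of males in the workforce than females in Asia in both years.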
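The upper bound of 1.000 is a consequence of the one-sided alternative, which only bounds the difference from below. If an interval with both a lower and an upper bound is wanted, one option is to rerun the test with the default two-sided alternative. The sketch below does this for the 2024 counts; `two_sided` is an illustrative name.

```r
# Two-sided version of the 2024 test, using the same counts as above.
# The default alternative ("two.sided") yields a confidence interval with
# both a lower and an upper bound for p1 - p2.
two_sided <- prop.test(c(36111958, 21897847), c(49355122, 47608549))

two_sided$conf.int   # 95% confidence interval for the difference
two_sided$estimate   # the two sample proportions
```

The hypothesis test conclusion is unchanged; only the form of the interval differs.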
In my analysis, I found that the average proportion of males in the workforce is higher than the average proportion of females in the workforce in Asian countries. I also found that while the percentage of males in the workforce has decreased since 1990 (77.7% to 73.2%), the percentage of females in the workforce has increased (43.1% to 46.0%). This answers my question of how the proportion differs between men and women and how it has changed over time. The results show a statistically significant gender gap in labor force participation across countries in Asia. This gap may partly reflect traditional cultural norms; however, the data also show that more women are entering the workforce over time. In the future, I would like to further explore the rise in women entering the workforce by noting which jobs are most popular among the population, and I would also like to examine this growth in individual countries rather than across the region as a whole.
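The change in the gender gap can be made explicit with a short calculation. This sketch uses the mean participation percentages that were fed into the two tests; the vector names are illustrative.

```r
# Mean labor force participation (%) from the values used in the tests above
male   <- c(`1990` = 77.68706, `2024` = 73.16760)
female <- c(`1990` = 43.06110, `2024` = 45.99562)

# Gender gap in percentage points for each year
gap <- male - female
gap

# Change in the gap between 1990 and 2024 (negative means the gap narrowed)
gap[["2024"]] - gap[["1990"]]
```

The gap is roughly 34.6 percentage points in 1990 and roughly 27.2 in 2024, a narrowing of about 7.5 points.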
Dataset: https://data.unesco.org/explore/dataset/wdi001/information/
RColorBrewer library from DATA110.