Objectives

In this paper, we briefly examine the employment status of women in the United States and China.

library(magrittr)
library(dplyr)
library(tidyr)
library(ggmap)
library(ggplot2)
data<-read.csv("WDIData_filter.csv",stringsAsFactors = FALSE)
##                                                                                            Indicator.Name
## 1                                                            Labor force, female (% of total labor force)
## 2                       Ratio of female to male labor force participation rate (%) (modeled ILO estimate)
## 3                          Ratio of female to male labor force participation rate (%) (national estimate)
## 4          Unemployment with advanced education, female (% of female labor force with advanced education)
## 5                Unemployment with basic education, female (% of female labor force with basic education)
## 6  Unemployment with intermediate education, female (% of female labor force with intermediate education)
## 7                                   Unemployment, female (% of female labor force) (modeled ILO estimate)
## 8                                      Unemployment, female (% of female labor force) (national estimate)
## 9                  Unemployment, youth female (% of female labor force ages 15-24) (modeled ILO estimate)
## 10                    Unemployment, youth female (% of female labor force ages 15-24) (national estimate)
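
The code in the following sections expects columns named Indicator (a year label such as X2016) and Value. The step that produces them is not shown in this report. If the CSV is in a wide layout with one X-prefixed column per year (an assumption), the reshaping might look like the minimal sketch below. The toy data frame and its values are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for a wide table (values are made up, not WDI data)
wide <- data.frame(
  X...Country.Name = c("China", "United States"),
  Indicator.Name   = "Labor force, female (% of total labor force)",
  X1990 = c(1.0, 2.0),
  X1991 = c(1.5, 2.5),
  stringsAsFactors = FALSE
)

# Stack the year columns (X1990, X1991, ...) into Indicator/Value pairs,
# then derive an integer Year from the "X1990"-style label
long <- wide %>%
  gather(Indicator, Value, matches("^X[0-9]{4}$")) %>%
  mutate(Year = as.integer(substr(Indicator, 2, 5)))

print(long)
```

The regular expression matches only names of the form X followed by four digits, so X...Country.Name is left untouched.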

Female Population: China and The United States

First, we look at the share of women in the total population. The chart below shows that, averaged over the years in the data, women make up about 50.75% of the US population and about 48.65% of China's population. The difference is about 2.1 percentage points.
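
As a quick check on the arithmetic, the gap between the two averages quoted above can be computed directly. The sketch below uses a hypothetical helper, mean_by_country, that averages the Value column per country in a long table. The stand-in input holds only the two quoted averages, not the underlying yearly data.

```r
library(dplyr)

# Hypothetical helper: per-country mean of a long-format indicator table
mean_by_country <- function(long) {
  long %>%
    group_by(X...Country.Name) %>%
    summarise(Mean = mean(Value, na.rm = TRUE))
}

# Stand-in input: one row per country holding the averages quoted in the text
quoted <- data.frame(
  X...Country.Name = c("United States", "China"),
  Value = c(50.75, 48.65),
  stringsAsFactors = FALSE
)

means <- mean_by_country(quoted)
gap <- means$Mean[means$X...Country.Name == "United States"] -
  means$Mean[means$X...Country.Name == "China"]
print(round(gap, 2))
```

With the quoted figures, the gap comes to about 2.1 percentage points.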

This tells us that both China and the US have a fairly balanced gender ratio when we look only at the share of the total population. The share fluctuated in both countries, and the US shows the clearer trend: the female share rose from 1960 to 1980, stayed at a relatively high level from 1980 to 1990, and has been declining since around 1990.

# Female share of the population by year, one line per country
data %>%
  filter(stringr::str_detect(Indicator.Name, "Population, female")) %>%
  mutate(Year = as.integer(substr(Indicator, 2, 5))) %>%  # "X1960" -> 1960
  select(X...Country.Name, Year, Value) %>%
  ggplot(aes(x = Year, y = Value, color = X...Country.Name)) +
  geom_line(aes(group = X...Country.Name)) +
  theme_minimal()

Population By Age Cohort

Second, having looked at the overall total, we break the female population down by age, a factor closely tied to employment status.

In the chart below, the US age distribution is more even than China's. From ages 0-24 and from age 60 upward, the two countries have almost the same shape. However, China's tallest bars fall in the 25-29 and 45-49 cohorts, while the bars in between, covering ages 30-44, are relatively low. If we do the math, those aged 30-44 in 2016 were born between roughly 1972 and 1986, largely around the 1980s, when the birth control policy was at its peak. As a result, the overall birth rate fell, and the female birth rate fell with it.
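
The birth-year arithmetic above can be sketched as follows. It assumes the cohort data refer to 2016, the year column used in the plotting code, and that ages 30-44 are covered by three five-year cohorts. It ignores where birthdays fall within the year, so the bounds are approximate to within a year.

```r
# Five-year cohorts covering ages 30-44 (assumed labels)
cohorts <- c("30-34", "35-39", "40-44")
data_year <- 2016L  # assumption: the X2016 column used for the chart

# Split each label such as "30-34" into its lower and upper age bounds
bounds <- do.call(rbind, strsplit(cohorts, "-", fixed = TRUE))
age_lo <- as.integer(bounds[, 1])
age_hi <- as.integer(bounds[, 2])

# The oldest members of a cohort were born earliest
birth_years <- data.frame(
  Age.cohort = cohorts,
  born_from  = data_year - age_hi,
  born_to    = data_year - age_lo,
  stringsAsFactors = FALSE
)
print(birth_years)
```

Under these assumptions the three cohorts span birth years from about 1972 to 1986.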

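# Female population by age cohort, one column per year (e.g. X2016)
age <- data %>%
  filter(stringr::str_detect(Indicator.Name, "female")) %>%
  filter(stringr::str_detect(Indicator.Name, "Population ages")) %>%
  # Pull the cohort label (e.g. "25-29"); names without an "a-b" range give NA
  mutate(Age.cohort = stringr::str_match(Indicator.Name, "([0-9]+)-([0-9]+)")[, 1]) %>%
  select(X...Country.Name, Age.cohort, Indicator, Value) %>%
  spread(Indicator, Value)

# Reorder the cohort levels so they plot in age order
age$Age.cohort <- factor(age$Age.cohort)
age$Age.cohort <- factor(age$Age.cohort, levels(age$Age.cohort)[c(1, 10, 2:9, 11:35)])

# Mirror China's values to the other side of the axis and drop rows without a cohort
age2 <- age %>%
  mutate(X2016 = ifelse(X...Country.Name == "China", -1.0 * X2016, X2016)) %>%
  filter(!is.na(Age.cohort))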

age2 %>%
  ggplot(aes(x = Age.cohort, color = X...Country.Name)) +
  geom_linerange(data = filter(age2, X...Country.Name == "United States"),
                 aes(ymax = -0.3 + X2016), ymin = -0.3, size = 3.5, alpha = 0.8) +
  geom_linerange(data = filter(age2, X...Country.Name == "China"),
                 aes(ymin = -0.3, ymax = -0.3 + X2016), size = 3.5, alpha = 0.8) +
  geom_label(aes(x = Age.cohort, y = 0, label = Age.cohort), inherit.aes = FALSE,
             size = 3.5, label.padding = unit(0.0, "lines"), label.size = 0,
             label.r = unit(0.0, "lines"), fill = "#EFF2F4", alpha = 0.9, color = "#5D646F") +
  coord_flip()
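
The level reordering in the code above relies on a hard-coded index vector, which depends on exactly which cohort labels are present in the data. As an alternative, the sketch below orders the labels by their numeric bounds. The helper name order_cohorts is hypothetical, and the example labels are for illustration only.

```r
# Hypothetical helper: turn "a-b" cohort labels into a factor ordered by age
order_cohorts <- function(x) {
  x <- as.character(x)
  labs <- unique(x[!is.na(x)])
  parts <- strsplit(labs, "-", fixed = TRUE)
  lo <- as.integer(vapply(parts, `[`, character(1), 1))  # lower age bound
  hi <- as.integer(vapply(parts, `[`, character(1), 2))  # upper age bound
  # Sort by lower bound, breaking ties by upper bound
  factor(x, levels = labs[order(lo, hi)])
}

example <- c("15-19", "00-04", "10-14", NA, "05-09")
ordered <- order_cohorts(example)
print(levels(ordered))
```

Missing labels stay NA in the result, so they can still be dropped afterwards with is.na().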

Labor Force: Women In Labor Force

Third, we look at women's share of the labor force in China and the US. The chart shows that before 1994, women made up a larger share of the labor force in China than in the US. Since 1994, however, the share in China has fallen from 45.25% to 43.55%, while the share in the US has risen from 44.4% to around 46%, peaking around 2010. Several factors may explain this:
1. More women in the US have been encouraged to keep working even after marriage.
2. The US went through a financial crisis in 2008-2010, when more women had to go out to work, which matches the peak.
3. In China, most companies have historically accepted female employees only within a narrower age range.

# Women's share of the total labor force since 1990
data %>%
  filter(stringr::str_detect(Indicator.Name, "Labor force, female")) %>%
  mutate(Year = as.integer(substr(Indicator, 2, 5))) %>%
  select(X...Country.Name, Year, Value) %>%
  filter(Year >= 1990) %>%
  ggplot(aes(x = Year, y = Value, color = X...Country.Name)) +
  geom_line() +
  ggtitle("Labor Participation: % Female in labor force")
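
The crossover described above can also be located programmatically. The sketch below finds the first year in which the US value exceeds China's, given a long table with the same column names as the code above. The helper name first_year_us_leads is hypothetical, and the toy values are made up for illustration, not taken from the WDI data.

```r
library(dplyr)

# Hypothetical helper: first year in which the US value is above China's
first_year_us_leads <- function(long) {
  long %>%
    group_by(Year) %>%
    summarise(
      us = Value[X...Country.Name == "United States"][1],
      cn = Value[X...Country.Name == "China"][1]
    ) %>%
    filter(us > cn) %>%   # years with a missing value are dropped here
    pull(Year) %>%
    min()
}

# Toy input with deliberately artificial years and values
toy <- data.frame(
  X...Country.Name = rep(c("China", "United States"), each = 3),
  Year  = rep(2001:2003, times = 2),
  Value = c(3, 2, 1, 1, 2, 3),
  stringsAsFactors = FALSE
)
print(first_year_us_leads(toy))
```

If the US never leads, min() is applied to an empty vector and returns Inf with a warning, so a real analysis would want to handle that case explicitly.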

Health: Mortality Rate

Last but not least, we look at the adult female mortality rate in China and the US. The US rate has declined steadily but slowly over the period. China's rate has also declined continuously, but by a much larger margin, because it started from a high of more than 400 according to the data set. In recent years it has dropped to a far lower level. China's high early female mortality rate may be due to historical events such as the severe famine of the early 1960s. The chart suggests that living conditions have improved in both China and the US. Improvements in medical care have also contributed substantially, as more advanced technologies have come into use in treatment.

# Adult female mortality rate by year, one line per country
data %>%
  filter(stringr::str_detect(Indicator.Name, "female")) %>%
  filter(stringr::str_detect(Indicator.Name, "Mortality rate, adult")) %>%
  mutate(Year = as.integer(substr(Indicator, 2, 5))) %>%
  select(Indicator.Name, X...Country.Name, Year, Value) %>%
  # filter(Year >= 1999, Year <= 2015) %>%  # optional: restrict the year range
  ggplot(aes(x = Year, y = Value, color = X...Country.Name, group = X...Country.Name)) +
  geom_line() +
  ggtitle("Adult Female Mortality Rate (per 1,000 Female Adults)")