Explore the data and come up with a question to guide your analysis.
Use dplyr’s summarize, count, and
group_by functions to produce a report that describes two
key insights that answer your question. Then, use ggplot2’s functions to
visualize one of your insights.
library(dplyr)
library(ggplot2)
library(readr)
gender <- read_csv("Gender_StatsCSV.csv")
countryseries <- read_csv("Gender_Statscountry-series.csv") # info about how the data was extracted
country.sheet <- read_csv("Gender_StatsCountry.csv") # info about the country
footnote <- read_csv("Gender_Statsfootnote.csv") # footnotes (purpose not verified; not used in this report)
series.time <- read_csv("Gender_Statsseries-time.csv")
series <- read_csv("Gender_Statsseries.csv") # information about the series like unit of measure
A review of the data shows a mixture of topics, such as
Assets,
Economic Policy & Debt: National accounts: Growth rates,
Education: Efficiency,
Norms and Decision-making, and others. This report
covers the fertility rates of Japan and the USA, measured in births per woman,
between 1960 and 2023:
series.info <- series[series$`Indicator Name`=="Fertility rate, total (births per woman)",]
series.info$`Long definition`
## [1] "Total fertility rate represents the number of children that would be born to a woman if she were to live to the end of her childbearing years and bear children in accordance with age-specific fertility rates of the specified year."
# create and clean a subset with only the fertility rates of Japan and the United States
fertility <- subset(gender,`Country Code` %in% c("JPN","USA")) |> subset(`Indicator Name`=="Fertility rate, total (births per woman)")
fertility <- na.omit(tidyr::gather(fertility,`1960`:`2023`,key="year",value="ratio"))
fertility$year <- as.numeric(fertility$year)
fertility <- fertility %>% rename(`country`=`Country Name`,`ctry.code`=`Country Code`,`indicator`=`Indicator Name`,`ind.code`=`Indicator Code`)
fertility
## # A tibble: 126 × 6
## country ctry.code indicator ind.code year ratio
## <chr> <chr> <chr> <chr> <dbl> <dbl>
## 1 Japan JPN Fertility rate, total (births p… SP.DYN.… 1960 2
## 2 United States USA Fertility rate, total (births p… SP.DYN.… 1960 3.65
## 3 Japan JPN Fertility rate, total (births p… SP.DYN.… 1961 1.96
## 4 United States USA Fertility rate, total (births p… SP.DYN.… 1961 3.62
## 5 Japan JPN Fertility rate, total (births p… SP.DYN.… 1962 1.98
## 6 United States USA Fertility rate, total (births p… SP.DYN.… 1962 3.46
## 7 Japan JPN Fertility rate, total (births p… SP.DYN.… 1963 2
## 8 United States USA Fertility rate, total (births p… SP.DYN.… 1963 3.32
## 9 Japan JPN Fertility rate, total (births p… SP.DYN.… 1964 2.05
## 10 United States USA Fertility rate, total (births p… SP.DYN.… 1964 3.19
## # ℹ 116 more rows
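The reshaping step above relies on tidyr's gather, which the tidyr documentation marks as superseded by pivot_longer. As a minimal sketch of the same wide-to-long idea, assuming a small made-up table (the `wide` data and its values are hypothetical, not taken from the dataset), the following reshapes one column per year into one row per country-year and then uses count to check how many observations each country keeps:

```r
library(dplyr)
library(tidyr)

# Toy wide table (made-up values, not taken from the dataset)
wide <- tibble(
  country = c("A", "B"),
  `1960` = c(2.0, 3.5),
  `1961` = c(1.9, NA),
  `1962` = c(1.8, 3.3)
)

# Reshape one column per year into one row per country-year,
# dropping missing values along the way
long <- wide |>
  pivot_longer(`1960`:`1962`, names_to = "year", values_to = "ratio",
               values_drop_na = TRUE) |>
  mutate(year = as.numeric(year))

# Count the remaining observations per country
counts <- count(long, country)
counts
```

Note that pivot_longer keeps each country's rows together, so the row order differs from the gather output shown above, which is ordered by year first.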
fertility %>% filter(country=="Japan") %>% ggplot(aes(`year`,`ratio`)) + geom_point(aes(color=country))
Per Suzuki and Kashiwase on World Bank Blogs (The
curse of the Fire-Horse: How superstition impacted fertility rates in
Japan):
“Many Japanese families chose not to have children in 1966 due to their superstition of “Hinoe-Uma (Fire-Horse)”. Fire-Horse is the 43rd combination of the sexagenary cycle, which happens every 60 years. The superstition is that women born in this year of the “Fire-Horse” have a bad personality and will kill their future husband.”
The graph of Japan’s fertility rate shows a dramatic drop in 1966,
which aligns with the article’s explanation of the Hinoeuma. The writers
believe that the superstition of Hinoeuma, whose next occurrence is in 2026,
won’t have much of an impact this time.
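One way to locate such a drop without reading it off the plot is to compute the year-over-year change. The sketch below assumes a made-up single-country series (the `toy` values are hypothetical, not taken from the dataset) and picks the year with the steepest single-year decline:

```r
library(dplyr)

# Toy series (made-up values, not taken from the dataset)
toy <- tibble(
  year  = 1964:1968,
  ratio = c(2.0, 2.1, 1.6, 2.2, 2.1)
)

# Year-over-year change; the first year has no predecessor, so it is NA
changes <- toy |>
  arrange(year) |>
  mutate(change = ratio - dplyr::lag(ratio))

# The year with the steepest single-year decline
steepest <- changes |> slice_min(change, n = 1)
steepest
```

The same steps could be applied to the Japan rows of the fertility table to check whether 1966 stands out numerically as well as visually.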
It’s well known that Japan has a very low birth rate. Anecdotally, many attribute this to the high expectation of doing very well at school in order to get a very good job, which leaves little time for dating. It would be hard to isolate the effect of the superstition on the birth rate, since the academic and employment culture also takes a toll.
ggplot(fertility, aes(`year`,`ratio`)) + geom_point(aes(color=country)) + geom_smooth()
Japan’s birth rate has generally been lower than that of the US, except for a period during the 1970s.
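A comparison like this can be checked directly by placing the two countries side by side. The following is a minimal sketch on made-up values (the `toy` table is hypothetical, not taken from the dataset) that spreads the countries into columns and keeps only the years in which Japan’s rate is higher:

```r
library(dplyr)
library(tidyr)

# Toy long table (made-up values, not taken from the dataset)
toy <- tibble(
  ctry.code = rep(c("JPN", "USA"), times = 3),
  year      = rep(c(1965, 1975, 1985), each = 2),
  ratio     = c(2.1, 2.9, 1.9, 1.8, 1.7, 1.8)
)

# One column per country, then keep the years where Japan is higher
jpn.higher <- toy |>
  pivot_wider(names_from = ctry.code, values_from = ratio) |>
  filter(JPN > USA)

jpn.higher$year
```

Run against the real fertility table, the resulting years would show exactly when the exception occurs.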
Per Population Reference Bureau,
“The social setting was, of course, quite different from the 1930s. Feminism was emerging, new methods of contraception had become available, and the national legalization of abortion took place in 1973. For many of these reasons, the late 1960s and 1970s represented a departure from the “baby boom” that had just preceded it when a large proportion of births were unplanned.”
These reasons suggest that the US birth rate decline in the 1970s was probably not directly related to Japan’s. As mentioned before, Japan’s birth rate was decreasing steadily through the decades, so it isn’t too surprising that it had no apparent bearing on the US dip.
Speculating from the quote above, the dramatic drop may have been associated with women’s right to say “no.”
fertility.mean <- fertility %>% group_by(year) %>% summarise(
total.mean=mean(ratio)
)
fertility.mean
## # A tibble: 63 × 2
## year total.mean
## <dbl> <dbl>
## 1 1960 2.83
## 2 1961 2.79
## 3 1962 2.72
## 4 1963 2.66
## 5 1964 2.62
## 6 1965 2.53
## 7 1966 2.15
## 8 1967 2.39
## 9 1968 2.30
## 10 1969 2.29
## # ℹ 53 more rows
ggplot(fertility.mean, aes(`year`,`total.mean`)) + geom_point()
The graph from the last question largely makes this point already, but there are some further insights to consider for both of these first-world countries.
In general, the average birth rate of Japan and the US declined drastically between the 1960s and 1970s. It’s not obvious from the data that the 1960s were part of a hill; there was actually a dramatic jump right before the data started. The generations born around this time are called the “Baby Boomers” and “Gen X.”
In the two decades just before the 2020s, the average suggests that birth rates were steady. However, the second graph, which shows both Japan and the US, reveals a large, fairly even gap between each country and the average. In the bigger picture, the birth rates do not appear to fluctuate much. But when the scope is narrowed to just Japan or just the US, the trend sits well below or well above that average, respectively.
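The distance between each country and the yearly average can also be computed rather than estimated from the plots. This is a minimal sketch on made-up values (the `toy` table is hypothetical, not taken from the dataset) that attaches the yearly mean to every row and subtracts it:

```r
library(dplyr)

# Toy long table (made-up values, not taken from the dataset)
toy <- tibble(
  ctry.code = rep(c("JPN", "USA"), times = 2),
  year      = rep(c(2000, 2010), each = 2),
  ratio     = c(1.4, 2.0, 1.4, 1.9)
)

# Attach the yearly mean to each row and measure each country's gap from it
gaps <- toy |>
  group_by(year) |>
  mutate(total.mean = mean(ratio), gap = ratio - total.mean) |>
  ungroup()

gaps
```

With only two countries, the two gaps in any year are equal in size and opposite in sign by construction, so the even distance from the average is simply half the difference between the two rates.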