China's well-known birth policy, the “One Child Policy,” began in 1980 and ended in 2016. Many people were concerned about its long-term impact, and most of the literature has found that under the policy infanticide increased, predominantly of female children, and that many men now struggle to find partners in a male-majority China. I look at annual birth rates from 1960 to 2020 to observe how the One Child Policy affected China’s birth rates and what happened to birth rates soon after it ended. The shaded area on the graph marks the years the policy was in effect (1980 to 2016).
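library(wbstats); library(ggplot2)
search<-wb_search("birth")  # look up birth-related World Bank indicators
# Crude birth rate (per 1,000 people); country='all' keeps aggregates such as World ('1W')
brates<-wb_data(indicator='SP.DYN.CBRT.IN', country='all', start_date=1960, end_date=2020)
bratesSub<-subset(brates, iso2c %in% c('1W','US','CN','IN','MX','RU'))
# Shade the policy years with a single rectangle drawn underneath the points
bratesplot<-ggplot(bratesSub, aes(date,SP.DYN.CBRT.IN, color=country))+ annotate("rect", xmin=1980, xmax=2016, ymin=-Inf, ymax=Inf, fill="gray", alpha=0.5)+geom_point()+labs(x='Year', y='Births per 1,000 people', color='Country')+theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust=1))
bratesplot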
From 1960 to 1980, China’s birth rate declines steeply, so birth rates were already approaching those of the US before the One Child Policy began. This early drop is likely explained by the spike in reported deaths during the same period. Soon after the policy was initiated there is a small increase in the birth rate, followed by a steady decline. After the policy ended in 2016, birth rates remain low, with the slope decreasing slightly. From this data, it appears that the One Child Policy may have changed cultural norms enough that birth rates stayed low even once births were no longer regulated. To counter this, China is now incentivizing young families to have children.
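The slopes described above can be put into numbers by fitting a separate linear trend before, during, and after the policy. This is a minimal sketch rather than part of the original analysis: `period_slopes` and its column names `date` and `rate` are assumptions, and the toy series is synthetic, chosen only so the slopes are easy to check by eye.

```r
# Average yearly change in the birth rate within each policy period.
# `df` needs a numeric `date` (year) column and a numeric `rate` column;
# both column names are assumptions made for this sketch.
period_slopes <- function(df, start = 1980, end = 2016) {
  # Periods: [-Inf, start) = before, [start, end) = during, [end, Inf) = after
  df$period <- cut(df$date,
                   breaks = c(-Inf, start, end, Inf),
                   labels = c("before", "during", "after"),
                   right = FALSE)
  sapply(split(df, df$period), function(d) {
    if (nrow(d) < 2) return(NA_real_)
    unname(coef(lm(rate ~ date, data = d))["date"])
  })
}

# Toy series (synthetic, not real birth rates): three straight segments
years <- 1960:2020
toy <- data.frame(
  date = years,
  rate = ifelse(years < 1980, 40 - 1.00 * (years - 1960),
         ifelse(years < 2016, 20 - 0.25 * (years - 1980),
                              11 - 0.50 * (years - 2016)))
)
slopes <- period_slopes(toy)
slopes
```

With the World Bank data, the same function could be applied to the China rows, for example `cn <- subset(bratesSub, iso2c == 'CN')` followed by `period_slopes(data.frame(date = cn$date, rate = cn$SP.DYN.CBRT.IN))`.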
A key way to combat the rise of COVID-19 infections was to predict spikes in outbreaks before they occurred. Using Google Trends data, I select search terms associated with COVID-19, “covid” and “no smell,” and plot them against the daily change in confirmed COVID-19 cases by state. The shaded areas mark the month when each new variant was discovered: Beta (May 2020), Alpha (September 2020), Delta (October 2020), Gamma (November 2020), and Omicron (November 2021).
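The variant months listed above can also be kept in one small table, so that every plot shades exactly the same windows. This is a sketch under assumptions: `variants` and `variant_layer` are hypothetical names that the code below does not use, and the plotting layer is only built if ggplot2 is installed.

```r
# One row per variant: the shaded window runs from the first day of the
# discovery month to the first day of the following month.
variants <- data.frame(
  variant = c("Beta", "Alpha", "Delta", "Gamma", "Omicron"),
  xmin = as.Date(c("2020-05-01", "2020-09-01", "2020-10-01",
                   "2020-11-01", "2021-11-01")),
  xmax = as.Date(c("2020-06-01", "2020-10-01", "2020-11-01",
                   "2020-12-01", "2021-12-01"))
)

# A single reusable layer that draws all five windows at once.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  variant_layer <- ggplot2::geom_rect(
    data = variants,
    ggplot2::aes(xmin = xmin, xmax = xmax, ymin = -Inf, ymax = Inf),
    inherit.aes = FALSE, fill = "gray", alpha = 0.5
  )
}
```

Adding `variant_layer` to a plot with a date x-axis would then replace the repeated rectangle calls, and a change to a variant date would only need to be made in one place.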
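library(gtrendsR)    # gtrends()
library(data.table)  # as.data.table(), shift()
library(ggplot2)
# CDC cumulative cases and deaths by state over time
covid<-read.csv('G:/My Drive/Seans Drive/PhD/Classes/AQME/HW/gtrends/United_States_COVID-19_Cases_and_Deaths_by_State_over_Time.csv')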
# States to compare (the same eight states that are kept in the case data)
states<-c('NY','FL','GA','MN','MI','WI','MA','TX')
# Download Google Trends interest over time for one keyword in one state
getTrends<-function(keyword, st){
  res<-gtrends(keyword, time = "2020-01-01 2022-03-20",
               gprop = "web", geo = paste0("US-", st), onlyInterest=FALSE)$interest_over_time
  res$state<-st
  res
}
# Stack the per-state results: one table for "covid", one for "no smell"
gtrendsCovid<-do.call(rbind, lapply(states, function(st) getTrends("covid", st)))
gtrendsCovids<-do.call(rbind, lapply(states, function(st) getTrends("no smell", st)))
CovidData<-subset(covid, state %in% c('NY','FL','GA','MA','MI','WI','MN','TX'))
CovidData$submission_date<-as.Date(CovidData$submission_date, '%m/%d/%Y')
# Sort within state so each row's predecessor is the previous report date
CovidData<-as.data.table(CovidData)
CovidData<-CovidData[order(state,submission_date)]
# Daily new cases in thousands: difference of the cumulative total within each state
CovidData[, cases:=(tot_cases-shift(tot_cases,1))/1000, by=state]
setnames(CovidData, 'submission_date', 'date')
# gtrends returns date-times; keep only the calendar date
gtrendsCovid$date<-as.Date(gtrendsCovid$date)
gtrendsCovids$date<-as.Date(gtrendsCovids$date)
plot1<-ggplot()+
  annotate("rect", xmin=as.Date('2020-05-01'), xmax=as.Date('2020-06-01'), ymin=-Inf, ymax=Inf, fill="gray", alpha=0.5)+ # Beta
  annotate("rect", xmin=as.Date('2020-09-01'), xmax=as.Date('2020-10-01'), ymin=-Inf, ymax=Inf, fill="gray", alpha=0.5)+ # Alpha
  annotate("rect", xmin=as.Date('2020-10-01'), xmax=as.Date('2020-11-01'), ymin=-Inf, ymax=Inf, fill="gray", alpha=0.5)+ # Delta
  annotate("rect", xmin=as.Date('2020-11-01'), xmax=as.Date('2020-12-01'), ymin=-Inf, ymax=Inf, fill="gray", alpha=0.5)+ # Gamma
  annotate("rect", xmin=as.Date('2021-11-01'), xmax=as.Date('2021-12-01'), ymin=-Inf, ymax=Inf, fill="gray", alpha=0.5)+ # Omicron
  geom_point(data=gtrendsCovid, aes(x=date,y=hits,color=state))+  # search interest (0 to 100 index)
  geom_line(data=CovidData, aes(x=date,y=cases,color=state))+     # daily new cases, in thousands
  scale_y_continuous(breaks = seq(0, 100, by = 20))+
  ylab('')+ggtitle('Daily COVID Cases (Thousands) and Searches for "covid"')
plot1
Google searches for “covid” closely follow the daily COVID-19 cases in the US. Surprisingly, there appears to be little to no lag between the Google Trends data and the case counts. This is likely because people search for “covid tests” or search for covid after testing positive. We then look at searches for a COVID-19 symptom, “no smell,” to see whether individuals search for symptoms before testing positive. We would expect an increase in symptom searches to precede confirmed cases, giving researchers and health officials an early indicator of new spikes in COVID-19 cases.
plot2<-ggplot()+
  annotate("rect", xmin=as.Date('2020-05-01'), xmax=as.Date('2020-06-01'), ymin=-Inf, ymax=Inf, fill="gray", alpha=0.5)+ # Beta
  annotate("rect", xmin=as.Date('2020-09-01'), xmax=as.Date('2020-10-01'), ymin=-Inf, ymax=Inf, fill="gray", alpha=0.5)+ # Alpha
  annotate("rect", xmin=as.Date('2020-10-01'), xmax=as.Date('2020-11-01'), ymin=-Inf, ymax=Inf, fill="gray", alpha=0.5)+ # Delta
  annotate("rect", xmin=as.Date('2020-11-01'), xmax=as.Date('2020-12-01'), ymin=-Inf, ymax=Inf, fill="gray", alpha=0.5)+ # Gamma
  annotate("rect", xmin=as.Date('2021-11-01'), xmax=as.Date('2021-12-01'), ymin=-Inf, ymax=Inf, fill="gray", alpha=0.5)+ # Omicron
  geom_point(data=gtrendsCovids, aes(x=date,y=hits,color=state))+ # search interest (0 to 100 index)
  geom_line(data=CovidData, aes(x=date,y=cases,color=state))+     # daily new cases, in thousands
  scale_y_continuous(breaks = seq(0, 100, by = 20))+
  ylab('')+ggtitle('Daily COVID Cases (Thousands) and Searches for "no smell"')
plot2
The data is much noisier, but as with “covid,” the increase in searches for “no smell” does not precede the increase in daily COVID-19 cases; instead, it closely tracks the confirmed cases. Once again, this could be because the symptom appears later in the infection or after a test, making it a poor indicator of future spikes in COVID-19 cases in the US.
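The visual impression that searches do not lead cases can be checked numerically with a cross-correlation. This is a minimal sketch, assuming both series are complete and on the same time step (for example, daily cases summed to weeks if the Google Trends points are weekly). `best_lag` is a hypothetical helper, and the toy series is synthetic rather than taken from the data above.

```r
# Lag (in time steps) at which `search` correlates most strongly with `cases`.
# ccf(x, y) at lag k compares x[t + k] with y[t], so a negative result
# means searches move first and a positive one means they trail cases.
best_lag <- function(search, cases, max_lag = 8) {
  cc <- ccf(search, cases, lag.max = max_lag, plot = FALSE)
  as.numeric(cc$lag)[which.max(as.numeric(cc$acf))]
}

# Toy series (synthetic, not real data): one bump in searches at step 20
# and the same bump in cases two steps later.
steps  <- 1:52
search <- dnorm(steps, mean = 20, sd = 2)
cases  <- dnorm(steps, mean = 22, sd = 2)
lead   <- best_lag(search, cases)
lead
```

For a search term to work as an early-warning signal, the best lag for a state would need to be clearly negative. A value near zero would match what the plots suggest.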