In this assignment, I present some interesting cases using the World Bank's World Development Indicators data and Google Trends data.

TASK 1: Using World Bank Data

I am interested in the effect of chemical fertilizer use on national cereal production at the aggregate level. Note that this is a very simple example and should not be interpreted as an actual analysis. The variables I need are:

  1. Cereal production (metric tons): Production data on cereals relate to crops harvested for dry grain only.

  2. Fertilizer consumption (% of fertilizer production)

  3. Tax revenue (% of GDP): GDP per capita would be the usual proxy for the level of development, but because it has a lot of missing values, I use the total tax share in GDP as a proxy instead.
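As a rough sketch of how these three variables fit together, the two-way fixed-effects regression estimated further down can be written as follows. The notation is my own and does not appear in the code: i indexes countries, t indexes years, \alpha_i is a country effect, \gamma_t is a year effect, and "tax" stands for the tax share in GDP.

```latex
\ln(\text{production}_{it}) = \beta_1 \ln(\text{fertilizer}_{it}) + \beta_2 \ln(\text{tax}_{it}) + \alpha_i + \gamma_t + \varepsilon_{it}
```

Because both sides are in logs, \beta_1 can be read as an elasticity: a 1% increase in fertilizer consumption is associated with an approximate \beta_1 % change in cereal production, holding the other terms fixed.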

Using the World Bank API, I can download the data and conduct the analysis as follows:

#Downloading data from World Bank:
library(wbstats)
str(wb_cachelist, max.level = 1)
## List of 8
##  $ countries    : tibble [304 × 18] (S3: tbl_df/tbl/data.frame)
##  $ indicators   : tibble [16,649 × 8] (S3: tbl_df/tbl/data.frame)
##  $ sources      : tibble [63 × 9] (S3: tbl_df/tbl/data.frame)
##  $ topics       : tibble [21 × 3] (S3: tbl_df/tbl/data.frame)
##  $ regions      : tibble [48 × 4] (S3: tbl_df/tbl/data.frame)
##  $ income_levels: tibble [7 × 3] (S3: tbl_df/tbl/data.frame)
##  $ lending_types: tibble [4 × 3] (S3: tbl_df/tbl/data.frame)
##  $ languages    : tibble [23 × 3] (S3: tbl_df/tbl/data.frame)
# let's search for the available variables using regular expressions:
income_vars <- wb_search(pattern = "GDP")
income_vars2 <- wb_search(pattern = "production")

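# I am interested in GC.TAX.TOTL.GD.ZS (tax revenue, % of GDP),
# AG.PRD.CREL.MT (cereal production, metric tons) and
# AG.CON.FERT.PT.ZS (fertilizer consumption, % of fertilizer production)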
GDPdata <- wb_data("GC.TAX.TOTL.GD.ZS",
                start_date = 2000, end_date = 2020,
                return_wide = FALSE)
fertilizerdata <- wb_data("AG.CON.FERT.PT.ZS",
                start_date = 2000, end_date = 2020,
                return_wide = FALSE)
productiondata <- wb_data("AG.PRD.CREL.MT",
                start_date = 2000, end_date = 2020,
                return_wide = FALSE)

#joining the dataset with multiple joins
library(dplyr)
temp <- full_join(productiondata, fertilizerdata, by = c("country", "date"))
dataset2use <- full_join(temp, GDPdata, by = c("country", "date"))

# value.x = cereal production, value.y = fertilizer consumption,
# value = tax revenue (% of GDP), the proxy for economic development
dataset2use <- subset(dataset2use, select = c(country, date, value.x, value.y, value))
dataset2use$lnproduction <- log(dataset2use$value.x)
dataset2use$lnfertilizer <- log(dataset2use$value.y)
dataset2use$economy <- log(dataset2use$value)

# let's plot the relationship and fit a smooth (loess) curve first
library(lattice)
xyplot(lnproduction ~ lnfertilizer, data = dataset2use, type = c("smooth", "p"),
       main = "Figure 1: lnProduction vs. lnFertilizer",
       xlab = "log of fertilizer consumption",
       ylab = "log of cereal production")

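# panel regression: two-way (country and year) fixed effects
library(plm)
# index is given explicitly; it matches the first two columns (country, date)
model <- plm(lnproduction ~ lnfertilizer + economy, data = dataset2use,
             index = c("country", "date"), model = "within", effect = "twoways")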

#tabulate the regression results:
library(stargazer)
stargazer(model,
           type="text",
           align=TRUE,
           no.space=TRUE,
           column.labels=c("log of cereal production"),
           covariate.labels = c("lnfertilizer", "Economic Development"),
           title="Table 1: Impact of Fertilizer Use in Cereal Production Across Nations (2000-2020)")
## 
## Table 1: Impact of Fertilizer Use in Cereal Production Across Nations (2000-2020)
## ================================================
##                          Dependent variable:    
##                      ---------------------------
##                             lnproduction        
##                       log of cereal production  
## ------------------------------------------------
## lnfertilizer                  0.075***          
##                                (0.012)          
## Economic Development          -0.093**          
##                                (0.040)          
## ------------------------------------------------
## Observations                    1,176           
## R2                              0.039           
## Adjusted R2                    -0.048           
## F Statistic           21.903*** (df = 2; 1077)  
## ================================================
## Note:                *p<0.1; **p<0.05; ***p<0.01

These results show that, in a two-way fixed-effects regression, a 1% increase in fertilizer consumption is associated with a 0.075% increase in cereal production, which is significant at the 1% level. Likewise, cereal production decreases as the tax share of GDP, my proxy for economic development, rises: a 1% increase in the proxy is associated with a 0.093% decrease, significant at the 5% level. One possible reading of the latter result is that, at higher levels of economic development, a country either produces a wider variety of agricultural goods or focuses more on non-agricultural production. The simple regression in Table 1 is broadly consistent with the pattern shown in Figure 1, although the fitted curve in the figure is non-linear.
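To make the "two-way fixed effects" idea a bit more concrete, here is a minimal sketch in base R on a small synthetic, balanced panel. Everything in it is an assumption for illustration: the countries, years, true slope of 0.5 and the helper `demean2` are made up and are not the World Bank data or plm's internals. It shows that removing country means and year means before running OLS gives the same slope as a regression with country and year dummies.

```r
# Synthetic balanced panel: 5 hypothetical countries observed over 6 years.
set.seed(1)
countries <- paste0("C", 1:5)
years <- 2000:2005
panel <- expand.grid(country = countries, year = years,
                     stringsAsFactors = FALSE)
n <- nrow(panel)

# Unobserved country and year effects (illustrative values only).
country_effect <- rnorm(length(countries))[match(panel$country, countries)]
year_effect <- rnorm(length(years))[match(panel$year, years)]

# x is correlated with the country effect; the true slope on x is 0.5.
panel$x <- rnorm(n) + country_effect
panel$y <- 0.5 * panel$x + country_effect + year_effect + rnorm(n, sd = 0.1)

# Two-way within transformation (this simple form assumes a balanced panel):
# subtract country means and year means, then add back the grand mean.
demean2 <- function(v, g1, g2) v - ave(v, g1) - ave(v, g2) + mean(v)
panel$y_w <- demean2(panel$y, panel$country, panel$year)
panel$x_w <- demean2(panel$x, panel$country, panel$year)

# OLS on the transformed data, without an intercept.
beta_within <- coef(lm(y_w ~ x_w - 1, data = panel))[["x_w"]]

# Equivalent regression with explicit country and year dummies.
beta_dummy <- coef(lm(y ~ x + factor(country) + factor(year),
                      data = panel))[["x"]]

print(c(within = beta_within, dummies = beta_dummy))
```

The two printed slopes should agree up to floating-point error. The panel used in Table 1 is unbalanced, so this simple demeaning formula would not carry over exactly; there, the estimate comes from plm itself.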