library(ggplot2)
library(markdown)
library(rmarkdown)
library(tidyr)
library(tidyselect)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✔ tibble 3.1.6 ✔ dplyr 1.0.7
## ✔ readr 2.1.2 ✔ stringr 1.4.0
## ✔ purrr 0.3.4 ✔ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(readxl)
library(stats)
I am interested in examining the relationship between exports from China to the US and the increase in CO2 emissions over the years. China is the highest CO2-emitting country, followed by the US. However, by manufacturing a wealth of goods for export, China carries some of the carbon weight for the United States. Research suggests that economic activity in the form of international trade accounts for a significant portion of Chinese CO2 emissions. Initial reports suggest that exports could account for up to 34% of total carbon emissions from China, which indicates that more research in this area is needed. Specifically, it is relevant to determine the impact of exports to specific countries, with the goal of guiding international policy on trade and environmental impact.
I am comparing export data with CO2 emissions for three of China’s top export consumers: the USA, Hong Kong, and Japan. My goal is to determine 1) whether there is a relationship between export values and carbon emissions, and 2) how that relationship differs across these three consumers. This is an initial probe into the problem; further analyses would be needed to consider differences in the items exported and their respective CO2 emissions. For example, while the US is the top export consumer, Japan may consume items that account for a greater amount of CO2 emissions.
Four initial datasets have been pulled for this study. The export data covers exports from China to the US, Japan, and Hong Kong, with dates ranging from 1992 to 2019. The overall dataset has 3 columns, each containing 30 rows. This dataset was chosen because it covers a relatively long sample range. The data was pulled from the United Nations COMTRADE database on commerce and trade.
export_USA <- read_excel("~/Documents/export data.xlsx")
export_japan <- read_excel("~/Downloads/comtrade_historical_CHNJPN00002.xls")
export_hk <- read_excel("~/Downloads/comtrade_historical_CHNHKG00002.xls")
summarise(export_USA)
## # A tibble: 1 × 0
summarise(export_japan)
## # A tibble: 1 × 0
summarise(export_hk)
## # A tibble: 1 × 0
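The `1 × 0` tibbles above are what `dplyr::summarise()` returns when it is called with no summary expressions: one row and no columns. A minimal sketch of two alternatives on a toy tibble follows. The `toy_exports` name and its values are made up for illustration, and the `Date` and `Value` column names are assumed from the plotting code later in the document.

```r
library(dplyr)

# Hypothetical stand-in for one export table; values are invented.
toy_exports <- tibble(
  Date  = c(1992, 1993, 1994),
  Value = c(10, 20, 60)
)

# summarise() needs named summary expressions to return anything useful.
toy_summary <- summarise(
  toy_exports,
  n_years    = n(),
  min_value  = min(Value),
  mean_value = mean(Value),
  max_value  = max(Value)
)
toy_summary

# Base R summary() reports per-column quantiles without extra arguments.
summary(toy_exports)
```

Either form gives a quick sanity check on the number of years and the range of export values before any modelling.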
Carbon data was pulled to show the change in carbon emissions from China between 2010 and 2020. It is not yet clear whether this dataset will be used in the final draft, as it excludes a number of the years covered by the export data. These gaps in years may lead to weaker analyses. For now, this data will be considered.
library(readxl)
carbon_data <- read_excel("~/Documents/Carbon Data.xlsx")
summarise(carbon_data)
## # A tibble: 1 × 0
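Because the carbon series may cover fewer years than the export series, it can help to check the overlap before joining. A minimal sketch on toy data follows. The tibbles and year ranges are illustrative only, and a `Date` column holding the year as a number is an assumption.

```r
library(dplyr)

# Hypothetical tables with partially overlapping years; values are invented.
toy_exports <- tibble(Date = 2008:2012, Value = c(5, 6, 7, 8, 9))
toy_carbon  <- tibble(Date = 2010:2013, Emissions = c(1, 2, 3, 4))

# Years present in the export data but missing from the carbon data.
missing_in_carbon <- anti_join(toy_exports, toy_carbon, by = "Date")
missing_in_carbon$Date

# Years shared by both tables; only these rows have both variables observed.
shared_years <- intersect(toy_exports$Date, toy_carbon$Date)
shared_years
```

Rows outside the shared years end up with `NA` after a `full_join()`, and `lm()` drops them by default, so the effective sample can be smaller than the export data alone suggests.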
ggplot(export_USA, aes(x = Date, y = Value)) + geom_col() + labs(title = "China Exports to USA")
ggplot(export_japan, aes(x = Date, y = Value)) + geom_col() + labs(title = "China Exports to Japan")
ggplot(export_hk, aes(x = Date, y = Value)) + geom_col() + labs(title = "China Exports to Hong Kong")
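These datasets need to be merged into a single set for analysis. Here, all four datasets were joined by Date to create a single dataset containing export values and carbon emissions. Because each export table has a column named Value, the join disambiguates them with suffixes: Value.x holds exports to the USA, Value.y holds exports to Hong Kong, and the unsuffixed Value holds exports to Japan.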
all_china <- list(export_USA, export_hk, export_japan, carbon_data)
all_china1 <- all_china %>% reduce(full_join, by = "Date")
view(all_china1)
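Joining tables that share a `Value` column produces the suffixed names `Value.x` and `Value.y`, which are easy to mix up. One alternative, sketched here on toy data, is to rename each `Value` column before joining. The `exports_*` column names and all values are hypothetical, and this is not the approach used in the analysis itself.

```r
library(dplyr)
library(purrr)

# Hypothetical stand-ins for the four tables; values are invented.
toy_usa    <- tibble(Date = 2010:2012, Value = c(10, 11, 12))
toy_hk     <- tibble(Date = 2010:2012, Value = c(20, 21, 22))
toy_japan  <- tibble(Date = 2010:2012, Value = c(30, 31, 32))
toy_carbon <- tibble(Date = 2010:2012, Emissions = c(1, 2, 3))

# Give each export column a descriptive name so the join adds no suffixes.
toy_joined <- list(
  rename(toy_usa,   exports_usa   = Value),
  rename(toy_hk,    exports_hk    = Value),
  rename(toy_japan, exports_japan = Value),
  toy_carbon
) %>%
  reduce(function(x, y) full_join(x, y, by = "Date"))

names(toy_joined)
```

With descriptive names, a model formula such as `Emissions ~ exports_usa` states which trading partner it refers to without needing to trace the join order.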
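usa_reg <- lm(Emissions ~ Value.x, data = all_china1)  # Value.x: exports to USA
hk_reg <- lm(Emissions ~ Value.y, data = all_china1)   # Value.y: exports to Hong Kong
jp_reg <- lm(Emissions ~ Value, data = all_china1)     # Value: exports to Japan
# Joint model with all three export series as predictors
all_reg <- lm(Emissions ~ Value + Value.x + Value.y, data = all_china1)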
summary(usa_reg)
##
## Call:
## lm(formula = Emissions ~ Value.x, data = all_china1)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -1090596  -212490   -25167   181380   978675
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.577e+06  1.211e+05   21.29   <2e-16 ***
## Value.x     1.881e-05  4.789e-07   39.28   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 399500 on 26 degrees of freedom
## Multiple R-squared: 0.9834, Adjusted R-squared: 0.9828
## F-statistic: 1543 on 1 and 26 DF, p-value: < 2.2e-16
summary(hk_reg)
##
## Call:
## lm(formula = Emissions ~ Value.y, data = all_china1)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -1761173  -327620   -31856   364807  1521943
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.429e+06  2.268e+05   10.71 4.97e-11 ***
## Value.y     2.423e-05  1.136e-06   21.33  < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 721600 on 26 degrees of freedom
## Multiple R-squared: 0.9459, Adjusted R-squared: 0.9439
## F-statistic: 455 on 1 and 26 DF, p-value: < 2.2e-16
summary(jp_reg)
##
## Call:
## lm(formula = Emissions ~ Value, data = all_china1)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -958479 -368672  -16869  465440  935227
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.245e+06  1.943e+05   6.408 8.68e-07 ***
## Value       5.953e-05  1.978e-06  30.094  < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 518500 on 26 degrees of freedom
## Multiple R-squared: 0.9721, Adjusted R-squared: 0.971
## F-statistic: 905.7 on 1 and 26 DF, p-value: < 2.2e-16
summary(all_reg)
##
## Call:
## lm(formula = Emissions ~ Value + Value.x + Value.y, data = all_china1)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -486684 -136543  -44185   75062  853091
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.001e+06  1.449e+05  13.806 6.51e-13 ***
## Value       2.305e-05  5.236e-06   4.402  0.00019 ***
## Value.x     1.115e-05  1.592e-06   7.001 3.07e-07 ***
## Value.y     8.537e-07  2.006e-06   0.426  0.67422
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 282000 on 24 degrees of freedom
## Multiple R-squared: 0.9924, Adjusted R-squared: 0.9914
## F-statistic: 1042 on 3 and 24 DF, p-value: < 2.2e-16
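The model summaries above can be easier to compare when the slope, standard error, p-value, and R-squared of each single-predictor model sit in one table. A minimal sketch using base R's `coef(summary(...))` on simulated data follows. The data, the `toy_*` names, and the `slope_row()` helper are all hypothetical, and the numbers it produces do not reproduce the results above.

```r
set.seed(1)

# Simulated stand-in for the joined table; not the real data.
toy <- data.frame(x1 = 1:20, x2 = (1:20)^1.2 + rnorm(20))
toy$y <- 3 + 2 * toy$x1 + rnorm(20)

toy_models <- list(
  x1_only = lm(y ~ x1, data = toy),
  x2_only = lm(y ~ x2, data = toy)
)

# Hypothetical helper: pull the slope row and R-squared from one model.
slope_row <- function(model) {
  s <- summary(model)
  est <- coef(s)[2, ]  # second row holds the single predictor's slope
  data.frame(
    estimate  = est[["Estimate"]],
    std_error = est[["Std. Error"]],
    p_value   = est[["Pr(>|t|)"]],
    r_squared = s$r.squared
  )
}

# One row per model, stacked into a single comparison table.
comparison <- do.call(rbind, lapply(toy_models, slope_row))
comparison
```

The same helper could be applied to `usa_reg`, `hk_reg`, and `jp_reg`, since each has exactly one predictor; the joint model would need a row per coefficient instead.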
jp_plot <- ggplot(data = all_china1, aes(x = Value, y = Emissions)) +
  geom_point() +
  geom_smooth(method = lm)
jp_plot
## `geom_smooth()` using formula 'y ~ x'
hk_plot <- ggplot(data = all_china1, aes(x = Value.y, y = Emissions)) +
  geom_point() +
  geom_smooth(method = lm)
hk_plot
## `geom_smooth()` using formula 'y ~ x'
usa_plot <- ggplot(data = all_china1, aes(x = Value.x, y = Emissions)) +
  geom_point() +
  geom_smooth(method = lm)
usa_plot
## `geom_smooth()` using formula 'y ~ x'
### Diagnostics
par(mfrow = c(2,3)); plot(usa_reg, which = 1:6)
par(mfrow = c(2,3)); plot(hk_reg, which = 1:6)
par(mfrow = c(2,3)); plot(jp_reg, which = 1:6)
par(mfrow = c(2,3)); plot(all_reg, which = 1:6)
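The diagnostic plots above can be complemented with a numeric check for influential observations. A minimal sketch using base R's `cooks.distance()` on simulated data follows. The `4 / n` cutoff is a common rule of thumb rather than a formal test, and the data and object names are hypothetical.

```r
set.seed(42)

# Simulated data with one deliberately extreme observation at the end.
toy <- data.frame(x = 1:20)
toy$y <- 1 + 0.5 * toy$x + rnorm(20, sd = 0.3)
toy$y[20] <- toy$y[20] + 10

toy_fit <- lm(y ~ x, data = toy)

# Cook's distance measures how much each point shifts the fitted values.
cooks <- cooks.distance(toy_fit)

# Flag observations above the rule-of-thumb cutoff of 4 / n.
flagged <- which(cooks > 4 / nrow(toy))
flagged
```

Applied to a model such as `usa_reg`, the flagged indices would point to the years that pull hardest on the fitted line, which is useful with only 28 observations.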