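Covariance: Measures the direction of the linear relationship between two random variables. Its magnitude depends on the units of the variables.
Correlation: Measures both the direction and the strength of the linear relationship between two random variables, on a unit-free scale from -1 to 1.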
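To make the distinction concrete, here is a minimal sketch with made-up numbers (the vectors `x` and `y` are hypothetical and are not taken from the data sets used below). It suggests that rescaling one variable changes the covariance but leaves the correlation unchanged.

```r
# Toy vectors, invented for illustration only
x <- c(32, 64, 128, 256)
y <- c(55, 60, 58, 70)

# Covariance depends on the units of measurement
cov_original <- cov(x, y)
cov_rescaled <- cov(x * 1000, y)   # same data, x expressed in smaller units

# Correlation is unit-free
cor_original <- cor(x, y)
cor_rescaled <- cor(x * 1000, y)

# Correlation equals covariance divided by the product of the standard deviations
cor_manual <- cov_original / (sd(x) * sd(y))

print(c(cov_original, cov_rescaled))
print(c(cor_original, cor_rescaled, cor_manual))
```

The two covariances differ by a factor of 1000, while the three correlation values coincide.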
# Load packages
library(dplyr)      # provides inner_join()
library(stargazer)  # provides stargazer()
# Load data
greenhouse_gas <- read.csv("/Users/pin.lyu/Desktop/greenhouse_gas_emissions.csv")
carbon_footprint <- read.csv("/Users/pin.lyu/Desktop/carbon_footprint_by_product.csv")
colnames(carbon_footprint) <- c("Year", "Product", "Base_Storage", "Carbon_Footprint")
colnames(greenhouse_gas) <- c("Year", "Category", "Type", "Scope", "Description", "Emissions")
# Combine the two data sets on Year (Year is not unique in either table,
# so every matching pair of rows is returned)
Emissions <- inner_join(greenhouse_gas, carbon_footprint, by = "Year")
## Warning in inner_join(greenhouse_gas, carbon_footprint, by = "Year"): Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 86 of `x` matches multiple rows in `y`.
## ℹ Row 2 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
## "many-to-many"` to silence this warning.
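The warning above means that `Year` repeats in both tables, so the join returns every pairing of rows that share a year. A minimal sketch of the effect, using two small made-up tables (`x` and `y` are hypothetical; the `relationship` argument assumes dplyr 1.1.0 or later, which is the version family that emits this warning):

```r
library(dplyr)

# Toy tables, invented for illustration: Year repeats in both
x <- data.frame(Year = c(2015, 2015, 2016),
                Emissions = c(10, 20, 30))
y <- data.frame(Year = c(2015, 2015, 2015, 2016, 2016),
                Carbon_Footprint = c(54, 60, 61, 70, 72))

# How many rows share each key in each table
print(count(x, Year))
print(count(y, Year))

# Each 2015 row in x pairs with every 2015 row in y: 2 * 3 + 1 * 2 = 8 rows
joined <- inner_join(x, y, by = "Year", relationship = "many-to-many")
print(nrow(joined))
```

Because rows are repeated in the joined table, summary statistics computed on it weight some observations more than once, which is worth keeping in mind when reading the results that follow.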
# summary table
stargazer(Emissions, type = 'text', title = 'Emission Data Summary Statistics')
##
## Emission Data Summary Statistics
## ====================================================================
## Statistic         N      Mean        St. Dev.      Min       Max
## --------------------------------------------------------------------
## Year             153   2,018.333       2.218      2,015     2,022
## Emissions        119 2,002,429.000 5,406,279.000 -500,000 29,600,000
## Base_Storage     153    71.111        33.081       32        128
## Carbon_Footprint 153    64.778         8.037       54        79
## --------------------------------------------------------------------
# The formula y ~ x puts Base_Storage on the x-axis and Carbon_Footprint on the y-axis
plot(Carbon_Footprint ~ Base_Storage,
     data = Emissions,
     main = "Carbon Footprint vs. Base Storage (2015 - 2022)",
     xlab = "Base Storage",
     ylab = "Carbon Footprint"
)
# Correlation value
cor(Emissions$Carbon_Footprint, Emissions$Base_Storage, use = "complete.obs")
## [1] 0.140592
Comments: The correlation between the two selected variables is 0.141, which, consistent with the graph, indicates only a weak positive linear relationship between them.
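A correlation coefficient on its own does not say whether the relationship is statistically significant. One way to check this is `cor.test()` from base R. The sketch below runs it on simulated toy data (the vectors `x` and `y` are hypothetical and not from the emissions data), only to show which pieces of the result are relevant.

```r
set.seed(1)  # toy data, simulated for illustration only
x <- rnorm(50)
y <- 0.15 * x + rnorm(50)

test <- cor.test(x, y)   # Pearson correlation test by default

print(test$estimate)     # sample correlation
print(test$p.value)      # p-value for H0: the true correlation is 0
print(test$conf.int)     # 95% confidence interval for the correlation
```

For the data in this report the corresponding call would be `cor.test(Emissions$Carbon_Footprint, Emissions$Base_Storage)`. Since the many-to-many join repeats rows, the observations are not independent, so any p-value obtained this way should be read with caution.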
# Covariance value
cov(Emissions$Carbon_Footprint, Emissions$Base_Storage, use = "complete.obs")
## [1] 37.38012
Comments: The covariance is 37.38. Its positive sign suggests that the two variables tend to move in the same direction. However, covariance is unbounded and its magnitude depends on the units of the variables, so the value 37.38 by itself says nothing about how strong the relationship is. The strength is given by the correlation (0.141), which indicates that the positive relationship is weak.
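As a quick consistency check, the covariance can be converted to a correlation by dividing it by the product of the two standard deviations. The sketch below uses only the values printed in the output above (the variable names are chosen here for illustration):

```r
# Values copied from the output above
cov_xy       <- 37.38012  # cov(Carbon_Footprint, Base_Storage)
sd_storage   <- 33.081    # St. Dev. of Base_Storage (summary table)
sd_footprint <- 8.037     # St. Dev. of Carbon_Footprint (summary table)

# Correlation = covariance / (sd_x * sd_y)
r <- cov_xy / (sd_storage * sd_footprint)
print(round(r, 3))  # 0.141
```

The result agrees with the value returned by `cor()` up to the rounding of the standard deviations in the summary table.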