## [1] "Year"
## [2] "Ontario GHG ID"
## [3] "Facility Owner"
## [4] "Facility Name"
## [5] "Facility City"
## [6] "Facility Primary NAICS Code"
## [7] "Carbon dioxide (CO2) from non-biomass in CO2e (t)"
## [8] "Carbon dioxide (CO2) from biomass in CO2e (t)"
## [9] "Methane (CH4) in CO2e (t)"
## [10] "Nitrous oxide (N2O) in CO2e (t)"
## [11] "Sulphur hexafluoride (SF6) in CO2e (t)"
## [12] "Hydrofluorocarbons (HFCs) in CO2e (t)"
## [13] "Perfluorocarbons (PFCs) in CO2e (t)"
## [14] "Nitrogen Trifluoride (NF3) in CO2e (t)"
## [15] "Total CO2e from all sources in CO2e (t)"
## [16] "Reporting Amount in CO2e (t)"
## [17] "Verification Amount in CO2e (t)"
## [18] "Accredited Verification Body"
Year: The year when the facility’s greenhouse gas emissions were assessed.
Ontario GHG ID: A unique identifier assigned by the Ministry to each reporting facility.
Facility Owner: The legal name of the corporation/entity that owns the facility.
Facility Name: The specific name of the facility where emissions were recorded.
Facility City: The city where the facility is located.
Facility Primary NAICS Code: A standardized classification code (North American Industry Classification System) used to categorize the type of business activity.
Carbon Dioxide (\(CO_2\)) from Non-Biomass in CO2e (t): Total emissions from fossil fuel sources (e.g., coal, oil, gas) in CO2e (t).
Carbon Dioxide (\(CO_2\)) from Biomass in CO2e (t): Total emissions from biomass sources (e.g., wood, crops, organic waste) in CO2e (t).
Methane (\(CH_4\)) in CO2e (t): Total emissions from methane measured in CO2e (t).
Nitrous Oxide (\(N_2O\)) in CO2e (t): Total emissions from nitrous oxide in CO2e (t).
Sulphur Hexafluoride (\(SF_6\)) in CO2e (t): Total emissions from sulphur hexafluoride in CO2e (t).
Hydrofluorocarbons (HFCs) in CO2e (t): Total emissions from HFCs, used in refrigeration, air conditioning, etc., measured in CO2e (t).
Perfluorocarbons (PFCs) in CO2e (t): Total emissions from PFCs, produced by industrial sources such as aluminium production, measured in CO2e (t).
Nitrogen Trifluoride (\(NF_3\)) in CO2e (t): Total emissions from nitrogen trifluoride, often used in electronics manufacturing, measured in CO2e (t).
Total CO2e from All Sources in CO2e (t): The sum of all greenhouse gases emitted by the facility (biomass, non-biomass, and other gases) in CO2e (t).
Reporting Amount in CO2e (t): The total emissions value officially reported by the facility in CO2e (t).
Verification Amount in CO2e (t): The portion of reported emissions independently verified by an accredited third party in CO2e (t).
Accredited Verification Body: The name of the external organization responsible for verifying the facility’s reported emissions.
Note: CO2e (carbon dioxide equivalent) standardizes all greenhouse gases based on their global warming potential relative to CO2, expressed in metric tonnes (t).
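As a minimal sketch of how such a conversion works, the CO2e of a gas is its emitted mass in tonnes multiplied by its global warming potential (GWP). The GWP values below are the 100-year values from the IPCC Fourth Assessment Report and are an assumption made for illustration; the guideline behind this dataset may prescribe different values. The masses are invented.

```r
# Illustrative 100-year GWP values (IPCC AR4); an assumption, not taken from the dataset.
gwp <- c(CO2 = 1, CH4 = 25, N2O = 298)

# Hypothetical emitted masses, in tonnes of each gas.
mass_t <- c(CO2 = 1000, CH4 = 10, N2O = 1)

# CO2e (t) = mass (t) x GWP, gas by gas.
co2e_t <- mass_t * gwp[names(mass_t)]

# Total CO2e (t) across the gases.
total_co2e_t <- sum(co2e_t)

print(co2e_t)
print(total_co2e_t)
```

In this toy case, 10 t of methane counts as 250 t of CO2e, which is why a small mass of a high-GWP gas can matter in the totals.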
The Specified Greenhouse Gases Activities from 2010 to 2021 data were collected by the Government of Ontario, specifically by the Ministry of the Environment, Conservation and Parks. They were collected using the quantification methods in the incorporated Guideline for Quantification, Reporting and Verification of Greenhouse Gas Emissions. The dataset is used as a baseline to understand emissions profiles and to manage and reduce greenhouse gas emissions.
This report explores the relationships among various factors in greenhouse gas emissions in Ontario from 2010 to 2021, based on the Specified Greenhouse Gases Activities from 2010 to 2021 data. In particular, the key research questions are:
What is the trend in greenhouse gas emissions data from 2010 to 2021?
Which facility owner in Ontario produced the highest amount of total greenhouse gas emissions (in CO2e (t)) from 2010 to 2021?
Which city in Ontario emitted the highest amount of total greenhouse gas emissions (in CO2e (t)) from 2010 to 2021?
Which year recorded the highest average greenhouse gas emissions (in CO2e (t)) in Ontario?
How do the reported emissions data compare to the actual verified emissions from facilities in Ontario?
How can we forecast the future growth of average greenhouse gas emissions in Ontario?
| Year | CO2 | CH4 | N2O | SF6 | HFCs | PFCs | NF3 | Mean CO2e |
|---|---|---|---|---|---|---|---|---|
| 2010 | 58718651 | 240624.7 | 397213.0 | 0.00 | 18393.00 | 0 | 0 | 398489.6 |
| 2011 | 52638649 | 241876.8 | 356883.6 | 0.00 | 181.14 | 0 | 0 | 352567.3 |
| 2012 | 52624766 | 225215.5 | 302992.0 | 170.38 | 352.79 | 0 | 0 | 354357.1 |
| 2013 | 47726844 | 222279.3 | 292826.9 | 0.00 | 500.32 | 0 | 0 | 313263.1 |
| 2014 | 44982436 | 240587.5 | 254573.0 | 0.00 | 1646.91 | 0 | 0 | 293415.0 |
| 2015 | 45565600 | 238669.9 | 287930.7 | 0.00 | 1348.97 | 0 | 0 | 193670.9 |
| 2016 | 46048303 | 821491.5 | 276829.9 | 87723.37 | 287.93 | 0 | 0 | 171762.8 |
| 2017 | 42731567 | 847137.9 | 264599.3 | 105695.36 | 318.75 | 0 | 0 | 156869.6 |
| 2018 | 45501375 | 764380.4 | 268592.9 | 111119.27 | 940.75 | 0 | 0 | 165825.0 |
| 2019 | 13986535 | 3708263.4 | 255851.2 | 174880.51 | 2367.05 | 0 | 0 | 138253.3 |
| 2020 | 13158262 | 3627471.3 | 240969.1 | 166137.95 | 2653.43 | 0 | 0 | 126251.8 |
| 2021 | 13375289 | 4032305.9 | 289023.1 | 182290.90 | 5135.17 | 0 | 0 | 133722.9 |
The table shows the amount of greenhouse gases produced each year from 2010 to 2021, based on the Specified Greenhouse Gases Activities from 2010 to 2021 data. It lists the total Carbon Dioxide (\(CO_2\)), Methane (\(CH_4\)), Nitrous Oxide (\(N_2O\)), Sulphur Hexafluoride (\(SF_6\)), Hydrofluorocarbons (HFCs), Perfluorocarbons (PFCs), and Nitrogen Trifluoride (\(NF_3\)) produced, in CO2e (t). The values for Sulphur Hexafluoride (\(SF_6\)) and Hydrofluorocarbons (HFCs) are rounded to 2 decimal places. The rightmost column, Mean CO2e, shows the average total greenhouse gas emissions in CO2e (t) per facility with available data. This average declines overall, from 398489.6 in 2010 to 133722.9 in 2021, although not in every year: it rises slightly in 2012, 2018, and 2021.
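A yearly summary of this kind can be sketched in base R as follows. The data frame `ghg` and its shortened column names (`Year`, `CH4`, `Total`) are hypothetical stand-ins for the facility-level data, and the values are invented; this is not necessarily how the table above was produced.

```r
# Toy stand-in for the facility-level data; names and values are hypothetical.
ghg <- data.frame(
  Year  = c(2010, 2010, 2011, 2011, 2011),
  CH4   = c(10, 30, 5, NA, 15),
  Total = c(100, 300, 50, 150, 250)
)

# Yearly total of one gas, ignoring missing values.
ch4_by_year <- tapply(ghg$CH4, ghg$Year, sum, na.rm = TRUE)

# Yearly mean of total CO2e per facility (the idea behind the Mean CO2e column).
mean_by_year <- tapply(ghg$Total, ghg$Year, mean, na.rm = TRUE)

# Assemble one row per year.
yearly <- data.frame(
  Year      = as.integer(names(ch4_by_year)),
  CH4       = as.numeric(ch4_by_year),
  Mean_CO2e = as.numeric(mean_by_year)
)
print(yearly)
```

Using `na.rm = TRUE` means facilities with a missing value for a gas are skipped for that gas rather than turning the whole yearly figure into `NA`.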
| Facility Owner | Total CO2e |
|---|---|
| Imperial Oil | 37945995 |
| Stelco | 37914837 |
| Ontario Power Generation | 27343473 |
| ArcelorMittal Dofasco | 25546697 |
| Essar Steel Algoma | 20781568 |
| Domtar | 19739183 |
| ArcelorMittal Dofasco G.P. | 19279341 |
| Algoma Steel | 16436029 |
| St. Marys Cement | 15322028 |
| Northland Power | 12562646 |
The table displays the top 10 facility owners by total CO2e produced from all sources from 2010 to 2021, based on the dataset. The names of the facility owners are listed under the column “Facility Owner” and the corresponding totals are listed under “Total CO2e”, with CO2e (t) as the unit.
The table is sorted in descending order by total CO2e, with Imperial Oil as the largest contributor, followed by Stelco and Ontario Power Generation.
This table can help policymakers, officials, and analysts identify major greenhouse gas contributors and guide regulatory action. It can also serve as a benchmark for companies aiming to reduce their emissions.
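A ranking like this can be sketched in base R by summing per group, sorting in descending order, and keeping the first rows. The data frame `ghg`, its columns `Owner` and `Total`, and all values are hypothetical; the same pattern applies when grouping by city instead of owner.

```r
# Toy stand-in for the facility-level data; owner names and values are invented.
ghg <- data.frame(
  Owner = c("A", "B", "A", "C", "B", "C"),
  Total = c(5, 20, 7, 1, 30, 2)
)

# Sum total CO2e per owner across all years.
totals <- aggregate(Total ~ Owner, data = ghg, FUN = sum)

# Sort in descending order of total and keep the top 2 (use n = 10 for a top 10).
top_owners <- head(totals[order(-totals$Total), ], n = 2)
print(top_owners)
```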
| Facility City | Total CO2e |
|---|---|
| Hamilton | 69432908 |
| Sault Ste. Marie | 53440464 |
| Sarnia | 52581271 |
| Haldimand County | 38208158 |
| Corunna | 34334863 |
| Courtright | 28827147 |
| Nanticoke | 27926277 |
| Mississauga | 22567536 |
| Bowmanville | 16911617 |
| Thunder Bay | 16740678 |
The table shows the top 10 facility cities by total CO2e produced from all sources from 2010 to 2021, based on the dataset. The names of the facility cities are listed under the column “Facility City” and the corresponding totals are listed under “Total CO2e”, with CO2e (t) as the unit.
The table is sorted in descending order by total CO2e, with Hamilton as the largest contributor, followed by Sault Ste. Marie and Sarnia.
This table can help policymakers, city officials, and analysts identify major greenhouse gas contributors and guide regulatory action. It can also serve as a benchmark for cities aiming to reduce their emissions and improve their environmental policies.
| Facility Owner | Difference | Reported CO2e | Verified CO2e |
|---|---|---|---|
| Domtar | 17087994 | 2651189 | 19739183 |
| Produits forestiers Résolu | 8175408 | 1488808 | 9664216 |
| AV Terrace Bay | 7100097 | 1393026 | 8493123 |
| Resolute FP Canada | 3898188 | 772200 | 4670388 |
| AbiBow Canada | 3723549 | 446374 | 4169923 |
| Northland Power | 3340330 | 9222316 | 12562646 |
| Anthony Forest Products Company | 2581197 | 537676 | 3118873 |
| Tembec | 1568336 | 219235 | 1787571 |
| Ontario Power Generation | 1193820 | 26149653 | 27343473 |
| Atlantic Power LP | 1126288 | 2834339 | 3960627 |
The table shows the top 10 facility owners by the difference between verified and reported CO2e from all sources from 2010 to 2021, based on the dataset. The difference is the verified CO2e minus the reported CO2e, that is, the additional CO2e found relative to what the facility owners reported. The names of the facility owners are listed under the column “Facility Owner” and the corresponding additional CO2e is listed under “Difference”, with CO2e (t) as the unit.
This table is sorted in descending order by the column “Difference”, with Domtar having the largest discrepancy between reported and verified CO2e, followed by Produits forestiers Résolu and AV Terrace Bay.
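The information presented in this table can help regulators ensure compliance with environmental laws, as it highlights discrepancies and helps to identify facility owners that may be underreporting greenhouse gas emissions. It can also highlight gaps in reports, driving future improvements. Note, however, that for the owners that also appear in the facility-owner table (Domtar, Northland Power, and Ontario Power Generation), the “Verified CO2e” values equal their total CO2e from all sources, so part of each gap may reflect emissions such as CO2 from biomass rather than underreporting.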
The bar graph illustrates the top 10 facility cities by total greenhouse gases produced from 2010 to 2021, in CO2e (t), based on the dataset. Each bar represents the total greenhouse gases produced by one city.
The bars are sorted in descending order. Hamilton produced the most emissions, approximately 69 million CO2e (t), followed by Sault Ste. Marie with around 53.4 million CO2e (t) and Sarnia with roughly 52.6 million CO2e (t).
The line graph shows the overall emissions of each greenhouse gas, excluding carbon dioxide, from 2010 to 2021, in CO2e (t).
Based on the graph, methane (\(CH_4\)) increased sharply from 2015 to 2016, from roughly 240 000 CO2e (t) to approximately 820 000 CO2e (t). It then stayed between roughly 760 000 and 850 000 CO2e (t) through 2018 before jumping to roughly 3.7 million CO2e (t) in 2019. The other gases were relatively stable over this period.
The bar graph illustrates the total carbon dioxide emissions from both non-biomass and biomass sources from 2010 to 2021, measured in CO2e (t).
Based on the graph, carbon dioxide from non-biomass sources accounts for more emissions than carbon dioxide from biomass. Overall, carbon dioxide emissions trend downward over the years, from approximately 58 million CO2e (t) in 2010 to roughly 45 million CO2e (t) in 2021.
The density plot represents the distributions of reported and actual (verified) emissions, measured in CO2e (t).
Based on the graph, the actual emissions are higher than the reported emissions. The distribution of reported emissions peaks at approximately 41.5 million CO2e (t), whereas the distribution of actual emissions peaks at roughly 46 million CO2e (t).
Hypothesis testing, assuming the variances of the two groups are not equal.
\(H_0\): The average reported emission in CO₂e (\(\mu_r\)) is equal to the average verified emission in CO₂e (\(\mu_v\)); that is,
\(H_0\): \(\mu_r = \mu_v\)
\(H_a\): The average reported emission in CO₂e (\(\mu_r\)) is not equal to the average verified emission in CO₂e (\(\mu_v\)); that is,
\(H_a\): \(\mu_r \neq \mu_v\)
To evaluate this hypothesis, we apply Welch's two-sample t-test, which does not assume equal variances, to compare the mean of reported emissions with the mean of verified emissions. This involves splitting the data into two groups: one for reported emissions and one for verified emissions.
We then calculate a t-statistic and the corresponding p-value, which represents the probability of observing a difference in sample means as extreme (or more extreme) than the one we calculate, under the assumption that the null hypothesis is true. If the p-value is less than our significance threshold (commonly 0.05), we can reject the null hypothesis and conclude that there is a statistically significant difference between the reported and verified emissions.
In addition, we can compute a confidence interval for the difference in means to estimate the likely range of values in which the true difference between the reported and verified emissions lies.
Overall, this approach allows us to assess whether discrepancies between reported and verified CO₂e emissions are statistically significant, and helps quantify the size of that difference if it exists.
##
## Welch Two Sample t-test
##
## data: Greenhouse$`Reporting Amount in CO2e (t)` and Greenhouse$`Verification Amount in CO2e (t)`
## t = 1.1044, df = 5743.9, p-value = 0.2695
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -11348.04 40632.38
## sample estimates:
## mean of x mean of y
## 179946.1 165303.9
Since the p-value is 0.2695 > 0.05, we fail to reject the null hypothesis. The 95% confidence interval for the difference in means, (-11348.04, 40632.38), includes zero, which further suggests that there is no statistically significant difference between the reported amount and the verified amount in CO2e (t).
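The testing procedure described above can be sketched with base R's `t.test`. The two vectors below are simulated stand-ins, not the real reported and verified columns, so the numbers will not match the output shown in this report.

```r
set.seed(1)

# Simulated stand-ins for the two emission columns; not the real data.
reported <- rnorm(200, mean = 100, sd = 20)
verified <- rnorm(200, mean = 100, sd = 35)

# var.equal = FALSE (the default) requests Welch's unequal-variance t-test.
res <- t.test(reported, verified, var.equal = FALSE)

print(res$p.value)   # two-sided p-value
print(res$conf.int)  # 95% confidence interval for the difference in means

# Decision at the 5% level: reject H0 only if the p-value is below 0.05.
reject_h0 <- res$p.value < 0.05
print(reject_h0)
```

For a two-sided test at the 5% level, rejecting the null hypothesis is equivalent to the 95% confidence interval excluding zero, which is why the two checks in the text agree.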
By using our Greenhouse data, the calculated mean of total CO2e produced by each facility is as follows:
## [1] 200631.6
We can use bootstrapping to estimate the sampling distribution of the mean of total CO2e produced by each facility and compute the confidence interval.
## 2.5% 97.5%
## 114514.9 316146.1
Based on the calculation, we get a 95% confidence interval of (114514.9, 316146.1). This means that we are 95% confident that the true mean total CO2e from all sources in CO2e (t) falls within this range. The bootstrap sampling distribution is illustrated as follows:
We want to check the relationship between the year and the average CO2e emission. We first fit a quadratic (second-degree polynomial) regression model, check the coefficients, and then plot it.
##
## Call:
## lm(formula = emission_avg ~ Year + I(Year^2), data = emission_by_year)
##
## Residuals:
## Min 1Q Median 3Q Max
## -30316 -17086 -2607 14940 38880
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.908e+09 2.761e+09 2.865 0.0186 *
## Year -7.821e+06 2.740e+06 -2.855 0.0189 *
## I(Year^2) 1.934e+03 6.796e+02 2.845 0.0192 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 24830 on 9 degrees of freedom
## Multiple R-squared: 0.9506, Adjusted R-squared: 0.9396
## F-statistic: 86.61 on 2 and 9 DF, p-value: 1.323e-06
Multiple R-Squared: 95%
Adjusted R-Squared: 94%
Since both are greater than 90%, the regression model explains more than 90% of the variance in the yearly average emission, indicating that it is a good fit.
Note that the p-values for all terms (intercept, Year, Year^2) are below 0.05, meaning that each term is statistically significant in explaining the relationship between the year and the average emission. The regression parameters can be interpreted as follows:
Intercept (7.908e+09): The predicted average emission at Year = 0. Because Year = 0 lies far outside the data (2010 to 2021), this value has no practical meaning on its own.
Year (-7.821e+06): The linear effect of year. Because the model also contains Year^2, this coefficient cannot be read in isolation as a decrease of 7.82 million tonnes per year. The change in emission_avg per one-year increase depends on the year and equals the Year coefficient plus 2 × the Year^2 coefficient × Year.
I(Year^2) (1.934e+03): The quadratic effect of year. Since it is positive, the relationship between year and emission_avg is curvilinear and convex: the decline in average emission slows down over time.
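To make the year-dependent slope of a quadratic model concrete, the sketch below refits the same formula and evaluates the slope at two years. It assumes that the Mean CO2e column of the yearly table in this report is the `emission_avg` variable used in the model; the helper `slope_at` is hypothetical and introduced only for illustration.

```r
# Year and mean CO2e per facility, copied from the yearly table in this report.
# Assumption: these are the values behind emission_avg in the fitted model.
emission_by_year <- data.frame(
  Year = 2010:2021,
  emission_avg = c(398489.6, 352567.3, 354357.1, 313263.1, 293415.0, 193670.9,
                   171762.8, 156869.6, 165825.0, 138253.3, 126251.8, 133722.9)
)

fit <- lm(emission_avg ~ Year + I(Year^2), data = emission_by_year)
b <- coef(fit)

# For y = b0 + b1*Year + b2*Year^2, the slope at a given year is b1 + 2*b2*Year.
slope_at <- function(year) unname(b["Year"] + 2 * b["I(Year^2)"] * year)

print(slope_at(2010))  # change in mean CO2e per year around 2010
print(slope_at(2021))  # change in mean CO2e per year around 2021
```

The two slopes are far smaller in magnitude than the raw Year coefficient, and the slope at 2021 is closer to zero than the slope at 2010, which is what a positive Year^2 coefficient implies.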
Interpreting the graph: we can see from the fitted quadratic regression model that the trend for the “Average Emission” is decreasing over the years.
We will use cross-validation to check the previous regression model, which relates the year to the emission average in tonnes.
The idea is to split the data into k folds (parts). We use k-1 folds as the training set and the remaining fold as the testing set, and we repeat this process k times so that each fold is used as the testing set exactly once.
## [1] 0.1138045
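We obtain an average Mean Squared Error (MSE) of 0.1138045. This value is small, which suggests that the regression model predicts held-out years well and therefore fits the data well. Because the fitted model has a residual standard error of 24830, this MSE is presumably computed on a rescaled response and should be interpreted on that scale. Hence, there is likely a relationship between the year and the emission average.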
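The k-fold procedure described above can be sketched in base R as follows. The number of folds `k = 4` and the random seed are arbitrary choices, and the data are the Mean CO2e values from the yearly table in this report, assumed to correspond to `emission_avg`. The MSE here is in squared original units, so it is not comparable to the figure reported above.

```r
# Year and mean CO2e per facility, copied from the yearly table in this report.
emission_by_year <- data.frame(
  Year = 2010:2021,
  emission_avg = c(398489.6, 352567.3, 354357.1, 313263.1, 293415.0, 193670.9,
                   171762.8, 156869.6, 165825.0, 138253.3, 126251.8, 133722.9)
)

set.seed(123)
k <- 4                                           # number of folds (arbitrary)
n <- nrow(emission_by_year)
fold <- sample(rep(seq_len(k), length.out = n))  # random fold label per row

# Fit on k-1 folds, predict the held-out fold, and record its MSE.
fold_mse <- sapply(seq_len(k), function(i) {
  train <- emission_by_year[fold != i, ]
  test  <- emission_by_year[fold == i, ]
  fit   <- lm(emission_avg ~ Year + I(Year^2), data = train)
  pred  <- predict(fit, newdata = test)
  mean((test$emission_avg - pred)^2)
})

cv_mse <- mean(fold_mse)  # average MSE across the k folds
print(fold_mse)
print(cv_mse)
```

With only 12 yearly observations each fold holds 3 years, so the estimate is noisy; comparing `sqrt(cv_mse)` with the typical size of `emission_avg` gives a more interpretable sense of the error.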
Based on this report, these are the key findings from the greenhouse gas emissions by facility data:
The mean of total CO2e produced by each facility in 2010-2021 is 200631.6 CO2e (t), with the yearly average per facility declining overall from 2010 to 2021. Based on our regression analysis, whose fitted curve flattens toward 2021, we could expect at most a slight further decrease in average greenhouse gas emissions in the near future.
The major greenhouse gas is CO2, which is produced in much larger quantities than other gases. While the emissions of other gases remain relatively stable each year, methane (\(CH_4\)) rises sharply in 2019.
From 2010 to 2021, the facility owner in Ontario that produced the highest amount of total greenhouse gas emissions (in CO2e) is Imperial Oil.
From 2010 to 2021, the city in Ontario that emitted the highest amount of total greenhouse gas emissions (in CO2e) is Hamilton.
2010 marks the year that recorded the highest average greenhouse gas emissions (in CO2e) in Ontario.
There are some discrepancies between the reported emissions data compared to the actual verified emissions from facilities in Ontario. However, the difference is not significant, as shown in our test of hypothesis.
These findings highlight the need to address emissions from major facilities and cities, while also improving the accuracy of emissions reporting.