In this exploratory data analysis (EDA), we examine the data collected by Carleton University BIOL 5512 students in the Fall of 2022 for their project on the Department of Fisheries and Oceans standards regarding the impact of human development activities on fish habitats. This data is currently unpublished; however, the graduate students involved in the project (including myself) are working towards completing the paper for publication.
In 2019, the Department of Fisheries and Oceans (DFO) updated the Fisheries Act to ensure that development activity that may cause the harmful alteration, disruption, or destruction (HADD) of fish habitats is met with offsetting measures that achieve no net loss (NNL). To assess compliance with the updated Fisheries Act, we acquired 109 authorizations issued by the DFO in 2020 and examined them along with any supplementary documentation. Offsetting is a widespread global standard when assessing and authorizing development projects, yet Canada appears to have recognized its significance only in recent years. Despite this recognition, no research has been conducted to evaluate the DFO’s ability to uphold this new standard.
HADD authorizations and associated offsetting plans are intended to be accessible; however, of the 109 authorizations, we received offsetting plans for only nine, and a further ten were listed as “to be developed”. In addition, seven authorizations stated that no offsetting plan existed and two did not mention an offsetting plan. Remarkably, this leaves 81 offsetting plans omitted by DFO from the initial Access to Information and Privacy (ATIP) request. Several unsuccessful attempts to determine the cause of this discrepancy have been made since the Fall of 2022, pointing to a lack of transparency on DFO’s part when it comes to HADD offsetting.
For the purpose of this EDA, we will proceed with the current data. Despite a lack of specific offsetting details, many authorizations provided details on requirements for the offsetting plans. Therefore, we will proceed under the assumption that offsetting plans submitted by developers meet the “expectations” set by DFO, in the hope of determining whether DFO is requiring developers to achieve NNL when causing harmful alteration, disruption, or destruction of fish habitats, as per the 2019 Fisheries Act.
The data used in this EDA is the data compiled by graduate students of Carleton University for a project in their BIOL 5512 class. The total dataset includes 20 variables of information extracted from 109 HADD Authorizations issued by the DFO in 2020.
The 20 variables included in the dataset are described below.
Variable descriptions:
| ID | PROVINCE | AUTHORIZATION_TYPE | DATE_ISSUED | END_DATE | HADD_DESCRIPTION | DEVELOPMENT_ACTIVITY | CONSTRUCTION | HABITAT_VALUE | HABITAT_TYPE | HABITAT_LOSS | OFFSET_AREA | RATIO | TECHNIQUE | PRE_CLASS | POST_CLASS | DURATION | NNL | logHABITAT_LOSS | logOFFSET_AREA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 70 | NB | 1 | 08-10-2020 | 11-15-0202 | Destruction, disruption | Roads and highways | Roadbuilding, culvert | NA | Off-channel | 2293 | 2293 | 1.00 | 3 | NA | Basic | NA | Y | 3.360404 | 3.360404 |
| 20 | QU | 1 | 07-10-2020 | 09-15-0202 | Destruction, disruption | Habitat restoration | Bank stabalization | Critical | Riparian | 4251 | 4251 | 1.00 | 1 | Basic | Basic | 5 | Y | 3.628491 | 3.628491 |
| 21 | QU | 1 | 07-02-2020 | 12-31-2020 | Destruction, disruption | Habitat restoration | Bank stabalization | Critical | Riparian | 2250 | 200 | 0.09 | 6 | Basic | Basic | 5 | N | 3.352183 | 2.301030 |
| 22 | QU | 1 | 01-17-2020 | 08-15-2020 | NA | On-water development | Dredging | Important | Marine | 4990 | NA | NA | NA | Basic | Type 1 | 5 | U | 3.698100 | NA |
Before proceeding with the analysis, we will perform several variable checks to clean up the data. ID, DATE_ISSUED, and END_DATE are not relevant to this EDA and therefore no data hygiene will be performed on them. In addition, CONSTRUCTION will not be examined as it is only used to determine DEVELOPMENT_ACTIVITY.
PROVINCE
unique(dat$PROVINCE) # 14 unique codes, with Newfoundland listed as both NL and NFLD
## [1] "NB" "QU" "NS" "ON" "BC" "NT" "NL" "NV" "AB" "NFLD"
## [11] "PEI" "MB" "YT" "SK"
By examining the unique terms, it is clear that Newfoundland is listed twice, as both NL and NFLD. The correct abbreviation is NL, so NFLD will be replaced with NL.
dat$PROVINCE[dat$PROVINCE=="NFLD"]<-"NL" # replace NFLD with NL
unique(dat$PROVINCE) # no remaining errors in the province column; 13 provinces and territories are listed
## [1] "NB" "QU" "NS" "ON" "BC" "NT" "NL" "NV" "AB" "PEI" "MB" "YT"
## [13] "SK"
HADD_DESCRIPTION
unique(dat$HADD_DESCRIPTION) # 10 options with different combinations of 3 different terms
## [1] "Destruction, disruption" NA
## [3] "Destruction, alteration" "Destruction"
## [5] "Destruction, disruption, alteration" "Disruption, alteration"
## [7] "Alteration, destruction" "Alteration"
## [9] "Disruption" "Alteratiion, destruction"
By examining the unique terms, it is clear that there are several combinations of HADDs separated with commas, with some terms capitalized. In addition, alteration is spelled two different ways. To fix this issue, HADD_DESCRIPTION will be separated into three columns corresponding to a first, second, and third type of HADD. We will then transform the data into long format and examine the unique terms once again.
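library(tidyverse) # loads tidyr, dplyr, stringr, forcats, and ggplot2, which the code in this report relies on
dat <- separate(dat, col = HADD_DESCRIPTION, into = c('HADD_description1', 'HADD_description2', 'HADD_description3'), sep = ", ") # separate into 3 columns for 3 potential HADDs and add to dataframe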
dat$HADD_description1 <- ifelse(dat$HADD_description1 == 'Destruction', 'destruction',
ifelse(dat$HADD_description1 == 'Disruption', 'disruption',
ifelse(dat$HADD_description1 == 'Alteration' | dat$HADD_description1 == 'Alteratiion', 'alteration', NA))) # rename in all lowercase and with alteration corrected
HADD_DESCRIPTION_DATA <- select(dat, c(HADD_description1, HADD_description2, HADD_description3))
HADD_DESCRIPTION_long <- HADD_DESCRIPTION_DATA %>%
pivot_longer(cols = everything(), names_to = "HADD_description_number", values_to = "description") # pivot data longer
unique(HADD_DESCRIPTION_long$description)
## [1] "destruction" "disruption" NA "alteration"
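The nested `ifelse()` calls above can also be written as a lowercase-then-lookup step that is applied to every split term rather than only the first column. The following is a minimal base R sketch on a toy vector, not the code used to produce this report; the `fix_spelling` lookup and the `clean_term()` helper are hypothetical names introduced for illustration.

```r
# Toy values mirroring the raw HADD_DESCRIPTION entries (illustrative only)
raw <- c("Destruction, disruption", NA, "Alteratiion, destruction", "Disruption")

# Hypothetical lookup of known misspellings -> corrected term
fix_spelling <- c(alteratiion = "alteration")

# Hypothetical helper: lowercase, trim whitespace, then correct known misspellings
clean_term <- function(x) {
  x <- trimws(tolower(x))
  hit <- x %in% names(fix_spelling)
  x[hit] <- fix_spelling[x[hit]]
  x
}

# Split on commas and clean each piece; NA entries stay NA
terms <- lapply(strsplit(raw, ","), clean_term)
all_terms <- unique(unlist(terms))
print(all_terms)
```

A lookup of this kind keeps every correction in one place, so a newly discovered misspelling only needs one more entry.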
DEVELOPMENT_ACTIVITY
unique(dat$DEVELOPMENT_ACTIVITY) # confirm no errors or repeats in unique terms
## [1] "Roads and highways" "Habitat restoration" "On-water development"
## [4] NA "Other" "Rural development"
## [7] "Urban development" "Industrial" "Mining"
## [10] "Shoreline developmemt" "Railways" "Agriculture"
## [13] "Oil and gas" "Forestry"
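By examining the unique terms, we can confirm there are no repeated categories. One label, “Shoreline developmemt”, is misspelled, but because it appears only in this form it does not split a category, so DEVELOPMENT_ACTIVITY is left unchanged. All 13 development activities and NA appear in the data.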
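If the misspelled label “Shoreline developmemt” in the output above should be corrected before plotting, the same replacement pattern used for PROVINCE applies. This is a minimal sketch on a toy vector; `activity` is a hypothetical stand-in for `dat$DEVELOPMENT_ACTIVITY`.

```r
# Toy stand-in for dat$DEVELOPMENT_ACTIVITY (illustrative values only)
activity <- c("Mining", "Shoreline developmemt", NA, "Railways")

# Replace the misspelled label; which() drops the NA comparisons
activity[which(activity == "Shoreline developmemt")] <- "Shoreline development"

print(unique(activity))
```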
AUTHORIZATION_TYPE
unique(dat$AUTHORIZATION_TYPE) # confirm no errors or repeats in unique terms
## [1] 1
By examining the unique terms, we can confirm there are no errors or repeats and therefore no hygiene is required for AUTHORIZATION_TYPE. From this it appears that all authorizations were classified as type 1.
HABITAT_VALUE
unique(dat$HABITAT_VALUE) # confirm no errors or repeats in unique terms
## [1] NA "Critical" "Important" "Marginal"
By examining the unique terms, we can confirm there are no errors or repeats and therefore no hygiene is required for HABITAT_VALUE as all are classified as either marginal, important, critical, or NA.
HABITAT_TYPE
unique(dat$HABITAT_TYPE) # confirm no errors or repeats in unique terms
## [1] "Off-channel" "Riparian"
## [3] "Marine" "In-channel"
## [5] "Lacustrine" "Estuarine"
## [7] "In-channel \nLacustrine" "In-channel \nRiparian"
## [9] "In-channel\nRiparian" "Lacustrine, off-channel"
## [11] "In-channel, riparian" "Off-channel, riparian"
## [13] "In-channel, off-channel" "Esturarine"
## [15] "Estuarine, marine" "In-channel, ripirian"
By examining the unique terms, it is clear that there are several combinations of terms separated by three different separators. There is also a combination of capitalized terms and lowercase terms. Therefore, we must first replace all separators with a comma so we can separate the terms before creating two columns for those that have two habitat types listed. Then, we can remove capitalization and fix the spelling of the terms in both columns before transforming the data into long format to view the unique terms once again.
dat <- dat %>%
mutate(HABITAT_TYPE = str_replace_all(HABITAT_TYPE, " \n", ", ")) %>%
mutate(HABITAT_TYPE = str_replace_all(HABITAT_TYPE, "\n", ', ')) # replace " \n" and "\n" with ", " so we can separate
dat <- separate(dat, col = HABITAT_TYPE, into = c('habitat1', 'habitat2'), sep = ", ") # separate into 2 columns for 2 potential habitat types and add to dataframe
dat$habitat1 <- ifelse(dat$habitat1 == 'Estuarine'|dat$habitat1 == 'Esturarine', 'estuarine',
ifelse(dat$habitat1 == "Off-channel", 'off-channel',
ifelse(dat$habitat1 == 'Riparian', 'riparian',
ifelse(dat$habitat1 == 'Marine', 'marine',
ifelse(dat$habitat1 == 'In-channel', 'in-channel',
ifelse(dat$habitat1 == 'Lacustrine', 'lacustrine', NA)))))) # change all to lowercase and fix estuarine
dat$habitat2 <- ifelse(dat$habitat2 == 'Lacustrine', 'lacustrine',
ifelse(dat$habitat2 == 'riparian' | dat$habitat2 == 'ripirian' | dat$habitat2 == 'Riparian', 'riparian',
ifelse(dat$habitat2 == 'off-channel', 'off-channel',
ifelse(dat$habitat2 == 'marine', 'marine', NA)))) # change all to lowercase and fix riparian
HABITAT_DATA <- select(dat, c(habitat1, habitat2)) # new data frame with only the habitat type data
HABITAT_DATA_LONG <- HABITAT_DATA %>%
pivot_longer(cols = everything(), names_to = "Habitat_number", values_to = "Habitat_type") # pivot data longer
unique(HABITAT_DATA_LONG$Habitat_type) # confirm no errors or repeats in unique terms
## [1] "off-channel" NA "riparian" "marine" "in-channel"
## [6] "lacustrine" "estuarine"
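The separator replacement, splitting, and spelling fixes above can be condensed into one split on a regular expression followed by a lookup. This is a minimal base R sketch on toy values, offered as an alternative rather than the method used here; `fix_spelling` is a hypothetical lookup introduced for illustration.

```r
# Toy values mirroring the three separator styles seen in HABITAT_TYPE
raw <- c("In-channel \nLacustrine", "In-channel\nRiparian",
         "Estuarine, marine", "Esturarine", "In-channel, ripirian")

# Hypothetical lookup of known misspellings -> corrected term
fix_spelling <- c(esturarine = "estuarine", ripirian = "riparian")

# Split on a newline or comma together with any surrounding whitespace
parts <- strsplit(tolower(raw), "\\s*[\n,]\\s*", perl = TRUE)

# Correct misspellings within each authorization's list of habitat types
parts <- lapply(parts, function(x) {
  hit <- x %in% names(fix_spelling)
  x[hit] <- fix_spelling[x[hit]]
  x
})

habitat_types <- sort(unique(unlist(parts)))
print(habitat_types)
```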
HABITAT_LOSS
range(dat$HABITAT_LOSS, na.rm = T) # no negative numbers and therefore no obvious errors
## [1] 65 2299895
By viewing the range of habitat loss areas, there do not appear to be any obvious errors (e.g., negative numbers). Therefore, no hygiene is required for HABITAT_LOSS.
OFFSET_AREA
range(dat$OFFSET_AREA, na.rm = T) # no negative numbers and therefore no obvious errors
## [1] 100 531522
By viewing the range of offset areas, there do not appear to be any obvious errors (e.g., negative numbers). Therefore, no hygiene is required for OFFSET_AREA.
RATIO
range(dat$RATIO, na.rm = T) # no negative numbers and therefore no obvious errors
## [1] 0.02 28.56
By viewing the range of ratios, there do not appear to be any obvious errors (e.g., negative numbers). Therefore, no hygiene is required for RATIO.
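A range check only catches impossible values. Assuming RATIO is OFFSET_AREA divided by HABITAT_LOSS, rounded to two decimals (an assumption inferred from the data preview, where 200 / 2250 is recorded as 0.09), a stronger check recomputes the ratio and flags any disagreement. The sketch below uses three rows copied from the preview; `ratio_check` and `mismatch` are hypothetical column names.

```r
# Three rows copied from the data preview above
chk <- data.frame(
  ID = c(70, 20, 21),
  HABITAT_LOSS = c(2293, 4251, 2250),
  OFFSET_AREA = c(2293, 4251, 200),
  RATIO = c(1.00, 1.00, 0.09)
)

# Assumption: RATIO is OFFSET_AREA / HABITAT_LOSS rounded to two decimals
chk$ratio_check <- round(chk$OFFSET_AREA / chk$HABITAT_LOSS, 2)

# Flag rows where the recorded ratio differs from the recomputed one
chk$mismatch <- abs(chk$RATIO - chk$ratio_check) > 1e-8
print(chk)
```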
TECHNIQUE
unique(dat$TECHNIQUE) # multiples separated with "," with a max of 2 potential types
## [1] "3" "1" "6" NA "2" "5" "4" "1, 3" "8" "1,4"
## [11] "1,2" "2,3"
By examining the unique terms, it is clear that there are several combinations of offsetting techniques separated with commas, with a maximum of two techniques. To fix this issue, TECHNIQUE will be separated into two columns for the authorizations with multiple techniques and the terms will be confirmed. We will then transform the data into long format and examine the unique terms once again.
dat <- separate(dat, col = TECHNIQUE, into = c('technique1', 'technique2'), sep = ",") # separate into 2 columns for 2 potential technique types
dat$technique2[dat$technique2==" 3"]<-"3" # replace " 3" with "3"
TECHNIQUE_DATA <- select(dat, c(technique1, technique2))
TECHNIQUE_DATA_LONG <- TECHNIQUE_DATA %>%
pivot_longer(cols = everything(), names_to = "technique_number", values_to = "technique_type")
unique(TECHNIQUE_DATA_LONG$technique_type)
## [1] "3" NA "1" "6" "2" "5" "4" "8"
PRE_CLASS
unique(dat$PRE_CLASS) # confirm no errors or repeats in unique terms
## [1] NA "Basic" "Type 2" "Type 1"
By examining the unique terms, we can confirm there are no errors or repeats and therefore no hygiene is required for PRE_CLASS. From this it appears that all authorizations had basic, type 1, type 2, or no pre-impact monitoring.
POST_CLASS
unique(dat$POST_CLASS) # confirm no errors or repeats in unique terms
## [1] "Basic" "Type 1" "Type 2" NA "Research"
By examining the unique terms, we can confirm there are no errors or repeats and therefore no hygiene is required for POST_CLASS. From this it appears that all authorizations had basic, type 1, type 2, research, or no post-construction monitoring.
DURATION
range(dat$DURATION, na.rm = T) # no negative numbers and therefore no obvious errors
## [1] 1 22
By viewing the range of post-construction monitoring durations, there do not appear to be any obvious errors (e.g., negative numbers). Therefore, no hygiene is required for DURATION.
NNL
unique(dat$NNL) # Y listed as "y" and "Y"
## [1] "Y" "N" "U" NA "y"
By examining the unique terms, it is clear that achievement of NNL is listed as both “Y” and “y”. To correct this, “y” must be replaced by “Y”.
dat$NNL[dat$NNL == "y"]<-"Y" # replace "y" with "Y"
unique(dat$NNL) # confirm no errors or repeats in unique terms
## [1] "Y" "N" "U" NA
logHABITAT_LOSS
range(dat$logHABITAT_LOSS, na.rm = T) # no negative numbers and therefore no obvious errors
## [1] 1.812913 6.361708
By viewing the range of the log (base 10) of habitat loss areas, there do not appear to be any obvious errors (e.g., negative values, which would imply areas below one meter-squared). Therefore, no hygiene is required for logHABITAT_LOSS.
logOFFSET_AREA
range(dat$logOFFSET_AREA, na.rm = T) # no negative numbers and therefore no obvious errors
## [1] 2.000000 5.725521
By viewing the range of the log (base 10) of offset areas, there do not appear to be any obvious errors (e.g., negative values, which would imply areas below one meter-squared). Therefore, no hygiene is required for logOFFSET_AREA.
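A sign check says little about a log-transformed column, because a base 10 logarithm is negative only for areas below one meter-squared. A more direct check, sketched here under the assumption that the log columns are simply `log10()` of the raw areas, recomputes the logarithm and compares it with the stored value. The values are copied from the data preview and the range outputs above; `stored_log` and `ok` are hypothetical names.

```r
# Raw areas and their stored logs, copied from the preview and range() outputs above
area       <- c(2293, 4251, 2250, 65, 2299895, 200, 100, 531522)
stored_log <- c(3.360404, 3.628491, 3.352183, 1.812913, 6.361708,
                2.301030, 2.000000, 5.725521)

# Recompute the base 10 logarithm and compare at the printed precision
recomputed <- log10(area)
ok <- abs(recomputed - stored_log) < 1e-5
print(data.frame(area, stored_log, recomputed, ok))
```

On the full dataset the same comparison could be run column against column, with rows where `ok` is `FALSE` inspected by hand.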
Clean data
| ID | PROVINCE | AUTHORIZATION_TYPE | DATE_ISSUED | END_DATE | HADD_description1 | HADD_description2 | HADD_description3 | DEVELOPMENT_ACTIVITY | CONSTRUCTION | HABITAT_VALUE | habitat1 | habitat2 | HABITAT_LOSS | OFFSET_AREA | RATIO | technique1 | technique2 | PRE_CLASS | POST_CLASS | DURATION | NNL | logHABITAT_LOSS | logOFFSET_AREA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 70 | NB | 1 | 08-10-2020 | 11-15-0202 | destruction | disruption | NA | Roads and highways | Roadbuilding, culvert | NA | off-channel | NA | 2293 | 2293 | 1.00 | 3 | NA | NA | Basic | NA | Y | 3.360404 | 3.360404 |
| 20 | QU | 1 | 07-10-2020 | 09-15-0202 | destruction | disruption | NA | Habitat restoration | Bank stabalization | Critical | riparian | NA | 4251 | 4251 | 1.00 | 1 | NA | Basic | Basic | 5 | Y | 3.628491 | 3.628491 |
| 21 | QU | 1 | 07-02-2020 | 12-31-2020 | destruction | disruption | NA | Habitat restoration | Bank stabalization | Critical | riparian | NA | 2250 | 200 | 0.09 | 6 | NA | Basic | Basic | 5 | N | 3.352183 | 2.301030 |
| 22 | QU | 1 | 01-17-2020 | 08-15-2020 | NA | NA | NA | On-water development | Dredging | Important | marine | NA | 4990 | NA | NA | NA | NA | Basic | Type 1 | 5 | U | 3.698100 | NA |
Province
109 authorizations were issued across 10 provinces and 3 territories. The most authorizations were issued in Quebec accounting for 24.77% of all authorizations followed by Ontario and British Columbia accounting for 17.43% and 16.51% of all authorizations, respectively.
p1 <- ggplot(dat, aes(x = fct_infreq(PROVINCE))) +
geom_bar(fill = 'lightgrey', color = 'black') +
xlab('Province or territory') +
ylab('Number of authorizations') +
theme_classic() # provincial authorization frequency plot
p1 # view p1
Figure 1: Number of authorizations issued by province or territory
| PROVINCE | freq |
|---|---|
| QU | 27 |
| ON | 19 |
| BC | 18 |
| NB | 16 |
| NS | 9 |
| AB | 5 |
| PEI | 5 |
| YT | 3 |
| MB | 2 |
| NL | 2 |
| NT | 1 |
| NV | 1 |
| SK | 1 |
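The percentages quoted in this section can be reproduced directly from the frequency table. The following base R sketch uses the counts copied from the table above; `freq` and `pct` are hypothetical names.

```r
# Counts copied from the province frequency table above
freq <- c(QU = 27, ON = 19, BC = 18, NB = 16, NS = 9, AB = 5, PEI = 5,
          YT = 3, MB = 2, NL = 2, NT = 1, NV = 1, SK = 1)

# Share of all authorizations, as a percentage rounded to two decimals
pct <- round(100 * prop.table(freq), 2)

print(sum(freq))  # total number of authorizations
print(pct[1:3])   # the three most frequent provinces
```

Recomputing shares from the counts in this way is a quick guard against transcription slips when percentages are typed into the text by hand.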
HADD description
HADD description is classified as habitat destruction, disruption, and/or alteration. Of the 109 authorizations, two did not include HADD descriptions, 46.79% contained two forms of HADD, 33.95% contained one form, and 15.60% contained all three forms. Complete destruction of fish habitat was the most common, occurring in 82.57% of authorizations, followed by alteration and disruption in 61.47% and 33.94% of authorizations, respectively. From this it appears that the more extreme impacts on fish habitat are more common, emphasizing the importance of achieving no net loss to ensure net destruction does not occur.
p2 <- ggplot(p2_no_na, aes(x = description)) +
geom_bar(fill = 'lightgrey', color = 'black') +
xlab('HADD Description') +
ylab('Number of authorizations') +
theme_classic() # HADD description authorization frequency plot
p2 # view p2. Therefore, the most common form of HADD description was destruction followed by alteration and finally disruption. Therefore it seems more extreme impacts on the aquatic environments are more common
Figure 2: Number of authorizations issued by HADD description
| description | freq |
|---|---|
| destruction | 90 |
| alteration | 67 |
| disruption | 37 |
Development activity
The 109 authorizations encompassed 13 development activities, with only one authorization missing development activity information. The most common development activity was on-water development, accounting for 30.28% of authorizations, followed by roads and highways and urban development, accounting for 25.69% and 12.84% of authorizations, respectively.
p3 <- ggplot(dat, aes(x = fct_infreq(DEVELOPMENT_ACTIVITY))) +
geom_bar(fill = 'lightgrey', color = 'black') +
xlab('Development activity') +
ylab('Number of authorizations') +
theme_classic()+
theme(axis.text.x = element_text(angle = 45, hjust = 1)) # Development activity authorization frequency plot
p3 # view p3.
Figure 3: Number of authorizations issued by development activity
| DEVELOPMENT_ACTIVITY | freq |
|---|---|
| On-water development | 33 |
| Roads and highways | 28 |
| Urban development | 14 |
| Habitat restoration | 9 |
| Industrial | 8 |
| Oil and gas | 5 |
| Mining | 4 |
| Railways | 2 |
| Agriculture | 1 |
| Forestry | 1 |
| Other | 1 |
| Rural development | 1 |
| Shoreline developmemt | 1 |
| NA | 1 |
Authorization type
All 109 authorizations are listed as being of authorization type 1.
| AUTHORIZATION_TYPE | freq |
|---|---|
| 1 | 109 |
Habitat value
Habitat value is to be classified as critical, important, or marginal. Of the 109 authorizations, 55.05% did not include enough information to assess the habitat value, which is alarming. Of those that could be assessed, critical habitats were impacted by far the most, accounting for 36.70% of all authorizations.
p4 <- ggplot(dat, aes(x = fct_infreq(HABITAT_VALUE))) +
geom_bar(fill = 'lightgrey', color = 'black') +
xlab('Habitat value') +
ylab('Number of authorizations') +
theme_classic() # Habitat value authorization frequency plot
p4 # view p4
Figure 4: Number of authorizations issued by habitat value
| HABITAT_VALUE | freq |
|---|---|
| NA | 60 |
| Critical | 40 |
| Important | 6 |
| Marginal | 3 |
Habitat type
The authorizations included development activities in six different habitat types: in-channel, marine, riparian, estuarine, off-channel, and lacustrine. Of the 109 authorizations, 14.68% included two different habitat types. The most common habitat affected was in-channel, in 47.71% of authorizations, followed by marine and riparian in 20.18% and 14.68% of authorizations, respectively.
p5 <- ggplot(p5_no_na, aes(x= fct_infreq(habitat_type))) +
geom_bar(fill = 'lightgrey', color = 'black') +
xlab('Habitat type') +
ylab('Number of authorizations') +
theme_classic() # Habitat type authorization frequency plot
p5 # view p5
Figure 5: Number of authorizations issued by habitat type
| habitat_type | freq |
|---|---|
| in-channel | 52 |
| marine | 22 |
| riparian | 16 |
| estuarine | 13 |
| off-channel | 13 |
| lacustrine | 9 |
The 14.68% of authorizations that include two habitat types rarely specify the habitat loss area and offset area for each habitat type. Therefore, we will list these as “multiple” and repeat the last analysis. With this modification, the most common habitat affected was in-channel with 36.70% of authorizations, followed by marine and multiple, accounting for 19.27% and 14.68% of authorizations, respectively.
p6 <- ggplot(dat5, aes(x= fct_infreq(HABITAT_TYPE))) +
geom_bar(fill = 'lightgrey', color = 'black') +
xlab('Habitat type') +
ylab('Number of authorizations') +
theme_classic() # Habitat type authorization frequency plot with multiples listed
p6 # view modified p6
Figure 6: Number of authorizations issued by habitat type, with authorizations listing two habitat types grouped as “multiple”
| HABITAT_TYPE | freq |
|---|---|
| in-channel | 40 |
| marine | 21 |
| multiple | 16 |
| estuarine | 12 |
| off-channel | 9 |
| lacustrine | 7 |
| riparian | 4 |
Habitat loss area and offset area
For the purpose of this analysis, any authorization listing multiple habitat types was classified as “multiple”; it is therefore unsurprising that this category accounted for the largest habitat loss, at 2,408,649 meters-squared. The authorizations within this category also accounted for the largest net deficit of area, at 2,114,925 meters-squared. Of the seven habitat type categories, in-channel was the only one in which required offsetting exceeded habitat loss (by 28,063 meters-squared), thus achieving no net loss. In all other categories, habitat loss exceeded the offsetting required to compensate for it. Alarmingly, across the 109 authorizations there was a net 2,544,844 meters-squared of habitat loss that was not to be compensated for, despite the offsetting standard of no net loss.
p7 <- ggplot(data = area_summary_log_w, aes(x = HABITAT_TYPE, y = area_, fill = loss_or_offset)) +
geom_bar(stat = "identity", position = 'stack', colour = 'black') +
xlab('Habitat type') +
ylab(expression('Log'[10]*'(Area (m)'^'2'*')')) +
guides(fill=guide_legend(title="Area type"))+
theme_classic() +
scale_fill_grey(start = 0.6, end = 0.8, labels = c('Habitat loss area', 'Offset area')) +
scale_x_discrete(limits = c('in-channel', 'multiple', 'marine', 'estuarine', 'off-channel', 'lacustrine', 'riparian')) +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) # new plot with the loss/offset areas for each habitat type indicated as a single sum value of all authorizations that apply
p7 # view p7
Figure 7: Habitat loss and offsetting areas in meters-squared for each habitat type
| HABITAT_TYPE | HABITAT_LOSS | OFFSET_AREA | NET_AREA |
|---|---|---|---|
| in-channel | 885027 | 913090 | 28063 |
| multiple | 2408649 | 293724 | -2114925 |
| marine | 556006 | 456497 | -99509 |
| estuarine | 358424 | 150809 | -207615 |
| off-channel | 97284 | 7383 | -89901 |
| lacustrine | 61731 | 4590 | -57141 |
| riparian | 8267 | 4451 | -3816 |
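The net areas and the overall balance can be recomputed from the per-habitat totals. The sketch below copies the loss and offset columns from the table above and derives the rest; `area` and `total_net` are hypothetical names.

```r
# Totals copied from the habitat loss and offset area table above
area <- data.frame(
  HABITAT_TYPE = c("in-channel", "multiple", "marine", "estuarine",
                   "off-channel", "lacustrine", "riparian"),
  HABITAT_LOSS = c(885027, 2408649, 556006, 358424, 97284, 61731, 8267),
  OFFSET_AREA  = c(913090, 293724, 456497, 150809, 7383, 4590, 4451)
)

# Net area: positive means more offsetting than loss, negative means a deficit
area$NET_AREA <- area$OFFSET_AREA - area$HABITAT_LOSS

# Overall balance across all habitat types
total_net <- sum(area$NET_AREA)
print(area)
print(total_net)
```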
There is a clear positive linear relationship between the log-transformed habitat loss area and offset area. The linear model of this relationship has a p-value below 0.05, indicating statistical significance. However, a multiple R-squared value of 0.5351 means that nearly half of the variation in offset area is not explained by habitat loss area. This variation is unsurprising, considering that the findings thus far in this EDA suggest major inconsistencies in offsetting requirements.
p12 <- ggplot(dat, aes(x = logHABITAT_LOSS, y = logOFFSET_AREA)) +
geom_point() +
xlab(expression('Log'[10]*'(Habitat loss area (m)'^2*')')) +
ylab(expression('Log'[10]*'(Offset area (m)'^2*')')) +
geom_abline(aes(intercept = 0, slope = 1), colour = "black", linetype = "dashed") +
geom_smooth(method = 'lm', colour = 'black')+
theme_classic() # scatterplot of habitat loss area vs offset area (log transformed) with linear model and confidence intervals present
p12 # view p12
Figure 8: Relationship between habitat loss area and offset area (meters-squared), both log base 10 transformed. The dashed line represents NNL achievement, with points on or above (to the left of) the line indicating sufficient offsetting. The solid line is the linear model for the relationship, with the shaded area representing its confidence interval. The regression statistics for the linear model follow.
lm_p12 = lm(formula = logOFFSET_AREA ~ logHABITAT_LOSS, data = dat)
summary(lm_p12) # p-value <0.05 suggesting the LM on p12 is statistically significant and there is correlation between the habitat loss area and offset area
##
## Call:
## lm(formula = logOFFSET_AREA ~ logHABITAT_LOSS, data = dat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.69665 -0.21686 0.04896 0.30874 1.43681
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.77239 0.33202 2.326 0.023 *
## logHABITAT_LOSS 0.79495 0.08985 8.848 6.36e-13 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5525 on 68 degrees of freedom
## (39 observations deleted due to missingness)
## Multiple R-squared: 0.5351, Adjusted R-squared: 0.5283
## F-statistic: 78.28 on 1 and 68 DF, p-value: 6.364e-13
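Because both variables are log base 10 transformed, the fitted line corresponds to a power relationship on the original scale: predicted offset area equals 10 raised to the intercept, multiplied by habitat loss area raised to the slope. With a slope below 1, the predicted offset-to-loss ratio shrinks as the loss area grows. The sketch below works this through using the coefficients printed above; `predict_offset()` and `break_even` are hypothetical names, and the numbers are point estimates that ignore the uncertainty in the fit.

```r
# Coefficients copied from the summary(lm_p12) output above
a <- 0.77239  # intercept
b <- 0.79495  # slope on log10(habitat loss area)

# Hypothetical helper: predicted offset area (m^2) for a given loss area (m^2)
predict_offset <- function(loss) 10^(a + b * log10(loss))

# Predicted offset-to-loss ratio at a few loss areas
loss <- c(100, 1000, 10000, 100000)
print(data.frame(loss, predicted_ratio = predict_offset(loss) / loss))

# Loss area at which the fitted line crosses the 1:1 (NNL) line:
# a + b * x = x  =>  x = a / (1 - b) on the log10 scale
break_even <- 10^(a / (1 - b))
print(break_even)
```

Under this fit, authorizations with loss areas above the break-even value are predicted to require less offsetting than the area lost.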
Ratio
Of the 109 authorizations, only 70 include both habitat loss and offset areas; therefore, only 64.22% of authorizations contain the information needed to calculate a ratio. The mean ratio is 2.40, suggesting that, on average, authorizations that provide both areas require more habitat to be offset than is lost, although individual ratios range from 0.02 to 28.56. Of the remaining authorizations, 33 do not include offsetting requirements, accounting for 2,722,419 meters-squared of lost habitat. This is a likely explanation of the discrepancy between the habitat loss and offset areas seen in Table 10, but suggests that DFO standards regarding authorization completeness are often not being met.
Compensation technique
The authorizations included seven compensation techniques: (1) create in-kind habitat, (2) increase in-kind (habitat) productivity, (3) create out-of-kind habitat, (4) increase out-of-kind habitat productivity, (5) create or increase habitat in a different ecological unit (same species), (6) create or increase habitat in a different ecological unit (different species), and (8) other. Of the 109 authorizations, only four contained multiple compensation techniques. The most common compensation technique was the creation of in-kind habitat, in 31.19% of authorizations, followed by creation of out-of-kind habitat and increase of in-kind productivity, with 18.35% and 11.01%, respectively. Since the techniques are ranked in order of preference, it is a good sign that the majority of offsetting measures use the most preferred techniques.
p8 <- ggplot(tech_no_na, aes(x= fct_infreq(technique_type))) +
geom_bar(fill = 'lightgrey', color = 'black') +
xlab('Technique') +
ylab('Number of authorizations') +
theme_classic() # Offsetting technique authorization frequency plot with multiples listed
p8 # view p8
Figure 9: Number of authorizations by compensation technique from: (1) create in-kind habitat, (2) increase in-kind (habitat) productivity, (3) create out-of-kind habitat, (4) increase out-of-kind habitat productivity, (5) create or increase habitat in a different ecological unit (same species), (6) create or increase habitat in a different ecological unit (different species), (7) artificial propagation, (8) other, (9) none
| TECHNIQUE | freq |
|---|---|
| 1 | 34 |
| 3 | 20 |
| 2 | 12 |
| 4 | 11 |
| 8 | 7 |
| 5 | 3 |
| 6 | 2 |
Pre-impact and post-construction monitoring
Pre-impact and post-construction monitoring techniques can be classified as basic, type 1, type 2, or research. Only 44.95% of the 109 authorizations list both a pre-impact and a post-construction monitoring class, once again suggesting authorization incompleteness. Of the authorizations, 51.38% do not list pre-impact assessments, whereas 10.09% do not list post-construction monitoring requirements. This may suggest a degree of leniency from the DFO when requiring pre-impact assessments. From Figure 10 it is clear that almost all pre-impact assessments are non-existent or basic, with very few being type 1 or type 2. Conversely, post-construction monitoring classifications appear to be authorization-specific, with no clear relationship to pre-impact assessment classification.
p9 <- ggplot(class, aes(x = PRE_CLASS, y = POST_CLASS))+
geom_point(position = position_jitter(w = 0.1, h = 0.1), colour = 'black', size = 3, shape = 1)+
xlab('Pre-Impact assessment class') +
ylab('Post-Construction monitoring class') +
theme_classic() # plot showing the frequency of each combination of pre and post assessment classes.
p9 # view p9. Pre are mainly basic or NA. The post assessments are much more varied with most being basic, type 2, or type 1. From this plot, there does not appear to be a significant relationship between the type of pre and post assessment classes.
Figure 10: Scatter-plot showing the relationship between pre-impact and post-construction assessment class
| PRE_CLASS | POST_CLASS | freq |
|---|---|---|
| Basic | Basic | 20 |
| NA | Type 2 | 19 |
| NA | Basic | 15 |
| NA | Type 1 | 14 |
| Basic | Type 2 | 14 |
| Basic | Type 1 | 8 |
| NA | NA | 7 |
| Type 2 | Type 2 | 5 |
| Basic | NA | 4 |
| NA | Research | 1 |
| Type 1 | Type 1 | 1 |
| Type 2 | Type 1 | 1 |
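The completeness shares for monitoring can be recomputed from the combination counts. This base R sketch copies the table above and tallies the authorizations that are missing each class and those that list both; `mon` and the derived names are hypothetical.

```r
# Combination counts copied from the pre/post monitoring table above
mon <- data.frame(
  PRE_CLASS  = c("Basic", NA, NA, NA, "Basic", "Basic", NA, "Type 2",
                 "Basic", NA, "Type 1", "Type 2"),
  POST_CLASS = c("Basic", "Type 2", "Basic", "Type 1", "Type 2", "Type 1",
                 NA, "Type 2", NA, "Research", "Type 1", "Type 1"),
  freq = c(20, 19, 15, 14, 14, 8, 7, 5, 4, 1, 1, 1)
)

n <- sum(mon$freq)
pre_missing  <- sum(mon$freq[is.na(mon$PRE_CLASS)])
post_missing <- sum(mon$freq[is.na(mon$POST_CLASS)])
both_present <- sum(mon$freq[!is.na(mon$PRE_CLASS) & !is.na(mon$POST_CLASS)])

# Percentages of all authorizations, rounded to two decimals
print(round(100 * c(pre_missing = pre_missing, post_missing = post_missing,
                    both_present = both_present) / n, 2))
```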
Post-construction monitoring duration
Post-construction monitoring duration varies widely among the 109 authorizations, ranging from one to 22 years. Of the authorizations, 28 (25.69%) do not include a post-construction monitoring duration. Only eight of these also lack a post-construction monitoring classification, indicating that 20 authorizations require post-construction monitoring and yet do not list a duration. This is alarming, as it would once again indicate that DFO has provided incomplete authorizations. The most common post-construction monitoring duration is five years, which is required for 43.12% of authorizations.
p10 <- ggplot(dat, aes(x = DURATION)) +
geom_bar(fill = 'lightgrey', color = 'black') +
xlab('Post-monitoring duration') +
ylab('Number of authorizations') +
theme_classic() # frequency of post monitoring durations
p10 # view p10
Figure 11: Number of authorizations by post-construction monitoring duration (years)
| DURATION | freq |
|---|---|
| 5 | 47 |
| NA | 28 |
| 3 | 18 |
| 1 | 4 |
| 4 | 4 |
| 10 | 3 |
| 2 | 2 |
| 6 | 2 |
| 22 | 1 |
No net loss requirement status
NNL requirement status could be determined for only 81.65% of authorizations. Of the authorizations for which it could be determined, only 50.56% are required by DFO to achieve NNL. Considering that no net loss is the ultimate goal, this is extremely concerning.
p11 <- ggplot(dat, aes(x = fct_infreq(NNL))) +
geom_bar(fill = 'lightgrey', color = 'black') +
xlab('NNL Status') +
ylab('Number of authorizations') +
scale_x_discrete(labels = c('Achieved', 'Not achieved', 'Undetermined', "NA")) +
theme_classic() # frequency of NNL status
p11 # view p11
Figure 12: Number of authorizations by no net loss requirement status
| NNL | freq |
|---|---|
| Y | 45 |
| N | 44 |
| U | 15 |
| NA | 5 |
This EDA explored data extracted from 109 HADD authorizations issued by the DFO in 2020. By doing so, we hoped to analyze DFO compliance with the 2019 updated Fisheries Act standard of ensuring development activity that may cause the harmful alteration, disruption, or destruction (HADD) of fish habitats is met with offsetting measures that achieve no net loss (NNL).
Unfortunately, the lack of DFO cooperation and transparency makes this task difficult. However, the DFO has been contacted on several occasions with requests to provide additional documentation and was informed that conclusions will be made regardless of their cooperation. As of now, DFO has not provided the additional offsetting plans and conclusions will be made based on the information provided.
Repeatedly within this EDA, it was apparent that DFO standards on consistency and completeness of authorizations are severely lacking, leading one to question not only the accuracy of the information required but also the true reason behind DFO’s lack of cooperation with this study. Regarding our objective, this EDA determined that DFO is requiring developers to achieve NNL when causing harmful alteration, disruption, or destruction of fish habitats only approximately 50% of the time (not including instances in which this could not be assessed). Unfortunately, the lack of DFO cooperation makes it impossible to determine the reasoning behind this finding. Possibilities include a lack of resources during the COVID-19 pandemic, inadequate training upon the introduction of the amended Fisheries Act, and disregard for the importance of fish habitat preservation. For the future of our fisheries, this is alarming, and steps must be taken to ensure such environmental injustices do not continue.