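There are four main sectors of the sablefish daily trip limit (DTL) fishery on the West Coast, defined by permit type and by position relative to 36° N. lat.: limited entry north (LEN), limited entry south (LES), open access north (OAN), and open access south (OAS).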
The limited entry sablefish fishery north of 36° N. lat. includes both the primary tier fishery, in which vessels may hold up to three stacked permits and fish up to the associated limits throughout the entire season, and the DTL fishery, in which vessels can fish up to the weekly and/or bimonthly trip limits specified in federal regulations. Although the fishery is called the daily trip limit fishery, daily trip limits have been removed from all four sectors and only weekly or bimonthly trip limits remain. The sablefish trip limit (STL) model (or “daily trip limit model”) independently projects fleetwide landings in the LEN and OAN sectors. In the past, models were also used to predict fleetwide landings in the LES and OAS sectors, but because participation has decreased significantly, modeling landings in the two southern sectors no longer seems appropriate. In their respective sections of this report, the GMT summarizes effort data for the southern fleets to demonstrate the issue but does not provide models for this review.
The GMT uses the STL model to project annual landings estimates in the biennial harvest specifications and management measures process, as well as inseason landings estimates. In both cases, each sector’s trip limits are adjusted as needed to keep landings within the sector-specific landings target, which is buffered under the sector-specific allocation to account for discard mortality estimates. In November 2022, the Council chose to conduct a methodology review of the STL model due to unrealistic output projections identified by the GMT, which are described in Agenda Item G.4.a, Supplemental GMT Report 1, September 2022. Due to those issues, the Scientific and Statistical Committee (SSC) approved the adjustments to the model recommended by the GMT, which are described in Appendix B of Agenda Item H.4.a, Supplemental GMT Report 3, November 2022. Apart from those adjustments, the STL model has never undergone a full methodology review by the SSC.
The key takeaways from the analysis in this report are:
LEN
OAN
This report is written in R Markdown and includes code chunks embedded within the text to highlight key steps in running the model. However, to maintain confidentiality and to simplify the report, not all code necessary to run the model is included here. The following report is divided into the four different sectors, but as described above, model descriptions and analysis are only included for the LEN and OAN sectors.
The first step in running the model is loading the 2011-2022 historical fish ticket data from PacFIN, which is done using an ROracle connection to the online comprehensive fish ticket database. The data values loaded include vessel number, sablefish price per pound, and landed weight of sablefish. The modeler checks the price data for major outliers, since they could skew the inputs to the model. There is currently one outlier, from 2014, in which the sablefish price per pound exceeds $20; it is removed from the data (code not included). Inflation-adjusted price per pound data for both sablefish and Dungeness crab are calculated by dividing the total PacFIN adjusted for inflation (AFI) exvessel revenue by the total landed weight (lbs.) of the respective species for each bimonthly period.
library(dplyr)   # %>%, tbl(), select(), filter(), collect()
library(dbplyr)  # in_schema()
# `con` is the ROracle connection to PacFIN (connection code not included).
ft <- con %>%
  tbl(in_schema("PACFIN_MARTS", "COMPREHENSIVE_FT"))
# Pull sablefish DTL fish tickets from 2011 onward that have a positive price.
RAWTIX <- ft %>%
  select(LANDING_MONTH,
         GMT_SABLEFISH_CODE,
         AFI_EXVESSEL_REVENUE,
         NOMINAL_TO_ACTUAL_PACFIN_SPECIES_NAME,
         COUNCIL_CODE,
         FTID,
         VESSEL_NUM,
         PACFIN_YEAR,
         LANDED_WEIGHT_LBS,
         PRICE_PER_POUND,
         AFI_PRICE_PER_POUND,
         IOPAC_PORT_GROUP,
         AGENCY_CODE) %>%
  filter(PACFIN_YEAR > 2010,
         GMT_SABLEFISH_CODE %in% c("LEN", "LES", "OAN", "OAS"),
         NOMINAL_TO_ACTUAL_PACFIN_SPECIES_NAME == "SABLEFISH",
         COUNCIL_CODE == "P",
         PRICE_PER_POUND > 0) %>%
  collect()
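The outlier screen described above is not shown in the report. The following is a minimal sketch of one way such a check could look, assuming a simple threshold rule at the $20 per lb. level quoted in the text. The data frame `tix` and its values are made up for illustration and are not PacFIN data.

```r
# Hypothetical stand-in for a few fish ticket rows (made-up values).
tix <- data.frame(
  FTID = c("A1", "A2", "A3", "A4"),
  PACFIN_YEAR = c(2013, 2014, 2014, 2015),
  PRICE_PER_POUND = c(2.75, 3.10, 24.00, 2.40)
)

# Assumed rule: flag any ticket priced above $20 per lb. for review.
price_cap <- 20
flagged <- tix[tix$PRICE_PER_POUND > price_cap, ]
tix_clean <- tix[tix$PRICE_PER_POUND <= price_cap, ]

print(flagged)
nrow(tix_clean)
```

Keeping the flagged rows in a separate object makes it easy to report how many tickets were dropped and why.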
The data are then summarized by vessel number, year, and sector (i.e., LEN, OAN, LES, OAS). The landings data for each year are also divided into six bimonthly periods within the year. Fish tickets with unknown vessel numbers are removed; these accounted for 0.3% of all sablefish DTL fish tickets between 2011 and 2022 and 0.5% of sablefish poundage landed.
# Summarize landings, revenue, and maximum price for each fish ticket.
sabl_input <- RAWTIX %>%
  group_by(VESSEL_NUM,
           PACFIN_YEAR,
           LANDING_MONTH,
           GMT_SABLEFISH_CODE,
           FTID) %>%
  summarize(LBS = sum(LANDED_WEIGHT_LBS),
            REV = sum(AFI_EXVESSEL_REVENUE),
            max_price = max(AFI_PRICE_PER_POUND))
# Count the tickets and pounds with and without a known vessel number.
unknown <- RAWTIX %>%
  mutate(unknown = VESSEL_NUM == "UNKNOWN") %>%
  group_by(unknown) %>%
  summarize(tickets = length(unique(FTID)),
            sable_lbs = sum(LANDED_WEIGHT_LBS))
# Drop unknown vessels and summarize by year, bimonthly period, and sector.
DTL <- sabl_input %>%
  filter(VESSEL_NUM != "UNKNOWN") %>%
  mutate(PERIOD = ceiling(LANDING_MONTH / 2)) %>%  # months 1-12 -> periods 1-6
  group_by(PACFIN_YEAR,
           PERIOD,
           GMT_SABLEFISH_CODE) %>%
  summarize(LBS = sum(LBS),
            REV = sum(REV),
            VES_NUM = length(unique(VESSEL_NUM)),
            max_price = max(max_price)) %>%
  mutate(ADJ_PRICE = REV / LBS,     # inflation-adjusted price per lb.
         AVG_LB = LBS / VES_NUM,    # average lbs. landed per vessel
         ACT_MT = LBS / 2204.6)     # landings in metric tons
# Split the summarized data by sector.
LEN_tix <- DTL %>% filter(GMT_SABLEFISH_CODE == "LEN")
OAN_tix <- DTL %>% filter(GMT_SABLEFISH_CODE == "OAN")
LES_tix <- DTL %>% filter(GMT_SABLEFISH_CODE == "LES")
OAS_tix <- DTL %>% filter(GMT_SABLEFISH_CODE == "OAS")
keycols <- c("PACFIN_YEAR", "PERIOD")
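Two small steps in the code above can be checked by hand: the month-to-period mapping `ceiling(LANDING_MONTH / 2)` and the period-level price calculation (total revenue divided by total pounds). The sketch below uses made-up revenue and weight values purely for illustration.

```r
# Calendar months 1-12 map to bimonthly periods 1-6 (Jan-Feb = 1, ..., Nov-Dec = 6).
months <- 1:12
period <- ceiling(months / 2)
data.frame(month = month.abb, period = period)

# Price per lb. for a period is total AFI revenue over total landed lbs.
# The two revenue and weight values below are hypothetical tickets.
rev <- c(1200, 800)
lbs <- c(400, 350)
adj_price <- sum(rev) / sum(lbs)
adj_price
```

Dividing summed revenue by summed pounds gives a landings-weighted price, which differs from the simple average of ticket-level prices when ticket sizes vary.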
Table 1 below shows total annual landings since 2011 in the four sablefish DTL sectors. Landings in all sectors dropped in 2020 due to COVID-related impacts to the fleet, and while the LEN and OAN sectors seemed to rebound in 2021 and 2022 under high sablefish allocations, the LES and OAS sectors either stagnated or continued to decline, respectively. California Department of Fish and Wildlife representatives on the GMT noted that market and infrastructure issues south of 36° N. lat. have generally prevented those southern sectors from maintaining historical effort.
Table 1: Annual sablefish landings (mt) by DTL sector, 2011-2022
| Year | LEN | LES | OAN | OAS |
|---|---|---|---|---|
| 2011 | 412 | 560 | 374 | 166 |
| 2012 | 232 | 375 | 217 | 73 |
| 2013 | 174 | 455 | 121 | 60 |
| 2014 | 138 | 421 | 224 | 33 |
| 2015 | 190 | 370 | 365 | 29 |
| 2016 | 222 | 369 | 344 | 21 |
| 2017 | 257 | 319 | 392 | 25 |
| 2018 | 229 | 386 | 341 | 20 |
| 2019 | 178 | 344 | 327 | 13 |
| 2020 | 155 | 258 | 169 | 4 |
| 2021 | 170 | 173 | 245 | 3 |
| 2022 | 294 | 180 | 538 | 2 |
Figures 1 and 2 below demonstrate the apparent decline in sablefish landings and participation in the DTL sectors south of 36° N. lat., which lies between Monterey and Morro Bay. Both landings and participation in Morro Bay have steadily declined since 2011, and the number of vessels in the fishery has declined notably in the ports of Los Angeles and San Diego. Confidential data are hidden in the figures.
Figure 1. DTL sablefish landings (lbs.) by IOPAC port group; larger circles represent a larger scale of landings.
Figure 2. Number of vessels that made DTL sablefish landings by IOPAC port group; larger circles represent more vessels.
Since 2012, roughly 10 to 50 vessels participated in the LEN fishery in any one bimonthly period, with up to 70 vessels participating in 2011 (Figure 3). Fleetwide sablefish landings have remained fairly steady since 2011, although vessel participation varies across bimonthly periods within a single year due to trip limits, participation in other fisheries, weather, sablefish prices, and other seasonal factors. Inflation adjusted price per pound of sablefish landed by the LEN sector has steadily declined from up to $6 in 2011 down to nearly $2 in 2022.
Figure 3. LEN participation, sablefish landings (lbs.), and inflation-adjusted price per pound, 2011-2022.
Given that the LEN sector requires a federal permit to participate, unlike the OAN sector, participation tends to involve roughly the same vessels year after year, and therefore, predicting the number of vessels in the model is generally easier than making the same prediction for the OAN sector. However, unprecedentedly low sablefish prices and high allocations can still complicate the LEN model.
The following sections are divided into 1) the current model used by the GMT for harvest specifications and inseason landings projections and 2) various potential model improvements the GMT explored as part of this methodology review. None of the potential improvements have been used for management to date but may be used in future management actions if approved for use by the SSC.
In the following section, the distribution assumptions of the dependent variables in the model are described, and the steps to run the model currently used for management actions are outlined, as well as model performance and diagnostics. Fleetwide landings are predicted for each bimonthly period by multiplying the outputs of two separate linear regression models that predict 1) average landings per vessel and 2) number of vessels in the fleet.
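In symbols, the two-part structure described above can be written as follows. The notation is introduced here for illustration only and is not taken from the GMT's model code: $\hat{\ell}_{p}$ is the predicted average landings per vessel in bimonthly period $p$, $\hat{V}_{p}$ is the predicted number of vessels, and $\hat{L}_{p}$ is the predicted fleetwide landings.

$$
\hat{L}_{p} = \hat{\ell}_{p} \times \hat{V}_{p}
$$

As an illustrative example with made-up numbers, a period with a predicted 2,000 lbs. per vessel and 25 predicted vessels gives a fleetwide projection of 50,000 lbs., or about 22.7 mt at 2,204.6 lbs. per mt.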
To determine whether the two dependent variables are normally distributed, or if data transformation is warranted or an alternative distribution assumption necessary, the following subsections use Shapiro-Wilk normality tests on both the un-transformed and transformed datasets. In addition, the skewness and kurtosis are examined.
Based on the results of the Shapiro-Wilk normality test shown in Table 2, it is apparent that the historical data for the average landings (lbs.) of sablefish per vessel in the LEN sector is not normally distributed (i.e., the p-value is less than 0.05). All three data transformations still result in non-normally distributed data, but log-transformation provides the distribution closest to normal. This is also demonstrated in Figure 4, where the histogram of the un-transformed data is heavily skewed right, but the log-transformed data reduces the skewness, more so than the other data transformations. Log-transforming the data also improves the normal q-q plot, which should more closely hug the dashed line as normality increases, as shown in Figure 5.
Table 2: Shapiro-Wilk normality test for LEN Avg. lbs. per vessel dataset
| Value | Un-transformed Data | Log-transformed Data | Square Root-transformed Data | Cube Root-transformed Data |
|---|---|---|---|---|
| W statistic | 0.825 | 0.953 | 0.901 | 0.921 |
| p value | 0.000 | 0.009 | 0.000 | 0.000 |
Figure 4. Top left panel: histogram of the untransformed average lbs. per vessel dataset. Remaining panels: histograms of the log, square root, and cube root transformed datasets.
Figure 5. Normal Q-Q plots for the untransformed (left) and log-transformed (right) average lbs. per vessel dataset
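The code behind the normality tests is not included in the report. A minimal sketch of how the four versions of a variable could be compared with `shapiro.test()` is shown below. The input is a simulated right-skewed vector, not the LEN data, so the printed values will not match Table 2.

```r
set.seed(1)
# Simulated right-skewed stand-in for average lbs. per vessel (not PacFIN data).
avg_lb <- rlnorm(71, meanlog = 7.5, sdlog = 0.5)

transforms <- list(
  untransformed = avg_lb,
  log = log(avg_lb),
  sqrt = sqrt(avg_lb),
  cube_root = avg_lb^(1 / 3)
)

# shapiro.test() returns the W statistic and p-value for each version of the data.
sw <- t(sapply(transforms, function(x) {
  res <- shapiro.test(x)
  c(W = unname(res$statistic), p_value = res$p.value)
}))
round(sw, 3)
```

A W statistic closer to 1 and a larger p-value both indicate a distribution closer to normal.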
Skewness
As described above, the un-transformed data for average landings per vessel appears heavily skewed right, which is confirmed using the following function to calculate the skewness value, the t-value, and the p-value, where a p-value less than 0.05 means the data are significantly skewed. The p-value for the un-transformed data is less than 0.05, and the skewness value is positive, so the data are heavily skewed right. The log-transformed data are still significantly skewed right, but the skewness value is the lowest, and the p-value is closest to 0.05.
# Sample skewness: third central moment divided by the cubed standard deviation.
skew <- function(AVG_LB){
  m3 <- sum((AVG_LB - mean(AVG_LB))^3) / length(AVG_LB)
  s3 <- sqrt(var(AVG_LB))^3
  m3 / s3
}
skew_value <- skew(AVG_LB)
t_value <- skew_value / sqrt(6 / length(AVG_LB))
p_value <- 1 - pt(t_value, 68)  # one-sided p-value with 68 degrees of freedom
Table 3: Skewness values for the un-transformed average lbs. per vessel dataset and each of the transformed datasets, along with the t value and p value used to determine whether the skewness is significantly different from that of a normal distribution.
| Value | Un-transformed Data | Log-transformed Data | Square Root-transformed Data | Cube Root-transformed Data |
|---|---|---|---|---|
| skewness | 1.820 | 0.666 | 1.244 | 1.053 |
| t value | 6.349 | 2.325 | 4.338 | 3.672 |
| p value | 0.000 | 0.012 | 0.000 | 0.000 |
Kurtosis
Kurtosis measures how prone a distribution is to producing outliers (i.e., how heavy its tails are); the standard kurtosis value of a normal distribution is 3. The method of calculating kurtosis used here is the “excess kurtosis”, which subtracts 3 from the kurtosis value so that the value for a normal distribution is zero and easier to interpret. A high, positive kurtosis value indicates many outliers, whereas a negative kurtosis indicates a lack of outliers. As in the skewness section above, the t-value and p-value are also calculated and displayed in the table, where a p-value of less than 0.05 indicates that the kurtosis is significantly different from that of a normal distribution. Log-transforming the data brings the kurtosis value closest to zero (normal distribution), and the p-value of 0.104 (Table 4) indicates that the kurtosis of the log-transformed data is not significantly different from that of a normal distribution.
The following function is used to calculate the kurtosis value.
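# Excess kurtosis: fourth central moment divided by the squared variance, minus 3.
kurtosis <- function(AVG_LB){
  m4 <- sum((AVG_LB - mean(AVG_LB))^4) / length(AVG_LB)
  s4 <- var(AVG_LB)^2
  m4 / s4 - 3
}
kurt_value <- kurtosis(AVG_LB)
t_value <- kurt_value / sqrt(24 / length(AVG_LB))
p_value <- 1 - pt(t_value, 68)  # one-sided p-value with 68 degrees of freedom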
Table 4: Kurtosis values for the un-transformed average lbs. per vessel dataset and each of the transformed datasets, along with the t value and p value used to determine whether the kurtosis is significantly different from that of a normal distribution.
| Value | Un-transformed Data | Log-transformed Data | Square Root-transformed Data | Cube Root-transformed Data |
|---|---|---|---|---|
| kurtosis | 3.994 | 0.728 | 1.866 | 1.382 |
| t value | 6.966 | 1.270 | 3.255 | 2.410 |
| p value | 0.000 | 0.104 | 0.001 | 0.009 |
Based on the results of the Shapiro-Wilk normality test shown in Table 5, the data on number of vessels per bimonthly period appears to not be normally distributed (p-value is less than 0.05). All three data transformations normalize the data. The same functions used to calculate skewness and kurtosis for the average landings per vessel are also applied here. The normal q-q plot of the data on number of vessels appears to have an S-shaped curve at the tails but mostly hugs the dashed line (Figure 7).
Table 5: Shapiro-Wilk normality test for LEN number of vessels dataset.
| Value | Un-transformed Data | Log-transformed Data | Square Root-transformed Data | Cube Root-transformed Data |
|---|---|---|---|---|
| W statistic | 0.964 | 0.977 | 0.991 | 0.991 |
| p value | 0.034 | 0.214 | 0.877 | 0.904 |
Figure 6. Top left panel: histogram of the untransformed number of vessels dataset. Remaining panels: histograms of the log, square root, and cube root transformed datasets.
Figure 7. Normal Q-Q plot for the untransformed number of vessels dataset
Skewness
The data for number of vessels per bimonthly period is significantly skewed (p-value is less than 0.05; Table 6), and all three data transformations normalize the data.
Table 6: Skewness values for the un-transformed number of vessels dataset and each of the transformed datasets, along with the t value and p value used to determine whether the skewness is significantly different from that of a normal distribution.
| Value | Un-transformed Data | Log-transformed Data | Square Root-transformed Data | Cube Root-transformed Data |
|---|---|---|---|---|
| skewness | 0.636 | -0.496 | 0.095 | -0.092 |
| t value | 2.218 | -1.729 | 0.331 | -0.323 |
| p value | 0.015 | 0.956 | 0.371 | 0.626 |
Kurtosis
The kurtosis of the un-transformed vessel number data does not appear to be high, meaning there are not major outliers that could skew the data, and the p-value is greater than 0.05 (Table 7).
Table 7: Kurtosis values for the un-transformed number of vessels dataset and each of the transformed datasets, along with the t value and p value used to determine whether the kurtosis is significantly different from that of a normal distribution.
| Value | Un-transformed Data | Log-transformed Data | Square Root-transformed Data | Cube Root-transformed Data |
|---|---|---|---|---|
| kurtosis | 0.146 | -0.029 | -0.464 | -0.453 |
| t value | 0.255 | -0.050 | -0.809 | -0.790 |
| p value | 0.400 | 0.520 | 0.789 | 0.784 |
Non-Normal Distributions of Count Data
The number of vessels is considered count data, and a Poisson distribution is often used for non-normal count data. However, Figure 8 below shows that the observed distribution of the vessel number data differs markedly from the expected Poisson distribution. Additionally, a Poisson distribution is generally not considered appropriate for data in which zeros are highly unlikely or unobserved, as is the case for the vessel number data. Alternatively, a negative binomial distribution more closely resembles the observed distribution of the data and can be used for count data in which zeros are unlikely. For this reason, we explore the use of a negative binomial distribution assumption in Section 2.2.6 - Generalized Linear Model.
Figure 8. Distributions of the vessel number observed data, Poisson expected, and negative binomial expected.
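The code used to produce the expected distributions is not included in the report. As a rough sketch of how such a comparison could be set up, the example below uses a method-of-moments fit and a variance-to-mean check for overdispersion. This is a common diagnostic and not necessarily the approach used by the GMT. The vessel counts are made up for illustration.

```r
# Made-up vessel counts per bimonthly period (illustration only, not PacFIN data).
ves_num <- c(12, 18, 25, 31, 40, 22, 15, 28, 35, 47, 19, 26)

m <- mean(ves_num)
v <- var(ves_num)

# Poisson assumes variance == mean; a ratio well above 1 indicates overdispersion.
dispersion <- v / m

# Method-of-moments negative binomial size parameter, from var = mu + mu^2 / size.
nb_size <- m^2 / (v - m)

# Expected probabilities over the observed range under each distribution.
k <- min(ves_num):max(ves_num)
pois_p <- dpois(k, lambda = m)
nb_p <- dnbinom(k, size = nb_size, mu = m)

c(mean = m, variance = v, dispersion = dispersion, nb_size = nb_size)
```

Plotting `pois_p` and `nb_p` against a histogram of the observed counts gives a comparison of the kind described for Figure 8.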
Figure 9 plots the relationships between three independent variables and the average landings per vessel. The three independent variables are weekly trip limits, bimonthly trip limits, and average inflation-adjusted sablefish price per pound; these are the only variables currently used in the model for management purposes. Average landings per vessel are clearly influenced by the weekly and bimonthly trip limits, with higher trip limits resulting in higher average landings per vessel. This is not a surprising relationship. There does not appear to be a linear relationship with average sablefish price per pound.
Figure 9. Relationships between average landings per vessel and three independent variables.
As shown in Table 8, the strongest model using the two trip limit variables appears to be the one that uses both variables as covariates. There does not appear to be an interaction between the two trip limit variables, even though both trip limits are often increased or decreased at the same time in management actions.
Table 8: Comparison of linear regression models predicting average lbs. landed per vessel, using status quo independent variables.
| | Weekly | Bimonthly | Wkly + Bimon | Wkly + Bimon + Wkly:Bimon |
|---|---|---|---|---|
| (Intercept) | 1190.48 *** | 745.91 *** | 798.69 *** | 622.17 |
| | (141.06) | (165.47) | (153.16) | (361.61) |
| TL.WEEKLY | 1.02 *** | | 0.50 *** | 0.65 * |
| | (0.08) | | (0.14) | (0.31) |
| TL.BIMON | | 0.49 *** | 0.29 *** | 0.31 *** |
| | | (0.04) | (0.06) | (0.08) |
| TL.WEEKLY:TL.BIMON | | | | -0.00 |
| | | | | (0.00) |
| N | 71 | 71 | 71 | 71 |
| R2 | 0.68 | 0.71 | 0.75 | 0.76 |
| *** p < 0.001; ** p < 0.01; * p < 0.05. | | | | |
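The code that fits the candidate models in Table 8 is not shown in the report. A minimal sketch of how the additive and interaction models could be compared is given below, using simulated trip limit data rather than the LEN dataset. The data frame `sim` and its coefficients are assumptions made for illustration.

```r
set.seed(42)
n <- 71
# Simulated trip limits and landings (illustration only, not PacFIN data).
sim <- data.frame(TL.WEEKLY = runif(n, 500, 3000))
sim$TL.BIMON <- 2 * sim$TL.WEEKLY + rnorm(n, 0, 400)
sim$AVG_LB <- 800 + 0.5 * sim$TL.WEEKLY + 0.3 * sim$TL.BIMON + rnorm(n, 0, 450)

m_add <- lm(AVG_LB ~ TL.WEEKLY + TL.BIMON, data = sim)
m_int <- lm(AVG_LB ~ TL.WEEKLY * TL.BIMON, data = sim)

# F-test of whether the interaction term improves on the additive model.
cmp <- anova(m_add, m_int)
cmp

# Adjusted R-squared penalizes the extra term, unlike the plain R-squared.
c(additive = summary(m_add)$adj.r.squared,
  interaction = summary(m_int)$adj.r.squared)
```

Because plain R-squared never decreases when a term is added, the F-test or adjusted R-squared is a more useful guide to whether the interaction term earns its place.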
Below is the model currently used to predict average landings per vessel and the summary output for the model. 74.7% of the variance in the average landings per vessel data can be explained by the weekly and bimonthly trip limits.
LEN_land_mod <- lm(AVG_LB ~ TL.WEEKLY + TL.BIMON, data = LEN)
summary(LEN_land_mod)
##
## Call:
## lm(formula = AVG_LB ~ TL.WEEKLY + TL.BIMON, data = LEN)
##
## Residuals:
## Min 1Q Median 3Q Max
## -945.24 -271.15 -58.84 141.74 1784.14
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 798.69376 153.15936 5.215 0.00000188 ***
## TL.WEEKLY 0.50295 0.13809 3.642 0.000523 ***
## TL.BIMON 0.28713 0.06472 4.436 0.00003433 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 472.4 on 68 degrees of freedom
## Multiple R-squared: 0.7546, Adjusted R-squared: 0.7474
## F-statistic: 104.6 on 2 and 68 DF, p-value: < 0.00000000000000022
Figure 10 shows the model diagnostic plots for the model using both weekly and bimonthly trip limits. The residuals vs. fitted plot does not appear to indicate heteroskedasticity (i.e., funnel shape or curvature), but the normal q-q plot has a heavy tail on the right, likely due to high outliers in recent years under exceptionally high sablefish allocations. However, it would not be reasonable to exclude those outliers, because sablefish allocations will likely continue to be exceptionally high in the coming years, and including these outliers informs the model of fishery behavior under those exceptional conditions.
Figure 10. Model diagnostic plots of model predicting average landings per vessel using weekly and bimonthly trip limit covariates.
The following code extracts the intercept and coefficients from the linear regression and calculates the predicted landings per vessel by applying the extracted coefficients to the input data. Note that a different set of coefficients is extracted for each bimonthly period, because trip limits and effort vary by period throughout a single year. Similar code to extract and apply coefficients is used in all other model types in this report, except the generalized linear model, so the code will not be shown again.
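# Fit a separate regression for each bimonthly period and keep its coefficients.
# Coefficient order follows the formula: (Intercept), TL.BIMON, TL.WEEKLY.
LEN_land_lm <- LEN %>%
  group_by(PERIOD) %>%
  group_modify(~ bind_rows(coefficients(lm(AVG_LB ~ TL.BIMON + TL.WEEKLY,
                                           data = .))))
setnames(LEN_land_lm, 2:4, c("INT_LAND", "BIMO_COEF", "WKLY_COEF_LAND"))
LEN <- merge(LEN, LEN_land_lm, by = "PERIOD")
# Apply the period-specific coefficients to the trip limits.
LEN <- LEN %>%
  mutate(pred_catch = WKLY_COEF_LAND * TL.WEEKLY + BIMO_COEF * TL.BIMON + INT_LAND)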
Figure 11 below plots the predicted landings per vessel against the actual (i.e., observed) landings per vessel in the historical data. The blue line mostly overlaps the black 1:1 ratio line, which means the predicted values closely match the observed values.
Figure 11. Predicted vs. actual landings per vessel.
Figure 12 plots the relationships between the same three independent variables and the number of vessels per bimonthly period. Vessel participation is influenced by inflation adjusted price per pound of sablefish, with no meaningful relationship to trip limits. This is also not surprising, because market factors tend to influence fishery participation, whereas trip limits directly influence the amount participating vessels tend to land.
Figure 12. Relationships between number of vessels per bimonthly period and three independent variables.
Below is the model to predict number of vessels per bimonthly period and the summary output for the model. 51.2% of the variance in the number of vessels that participate each bimonthly period can be explained by average inflation-adjusted sablefish price per pound (adjusted R-squared of 0.5118).
LEN_ves_mod <- lm(VES_NUM ~ ADJ_PRICE, data = LEN)
summary(LEN_ves_mod)
##
## Call:
## lm(formula = VES_NUM ~ ADJ_PRICE, data = LEN)
##
## Residuals:
## Min 1Q Median 3Q Max
## -17.116 -6.108 -1.271 3.979 28.534
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -10.013 4.792 -2.090 0.0403 *
## ADJ_PRICE 12.120 1.405 8.624 0.00000000000144 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.32 on 69 degrees of freedom
## Multiple R-squared: 0.5188, Adjusted R-squared: 0.5118
## F-statistic: 74.38 on 1 and 69 DF, p-value: 0.000000000001444
Figure 13 shows the model diagnostic plots for the model predicting number of vessels based on average sablefish prices. The residuals vs. fitted plot does not appear to indicate heteroskedasticity, but the normal q-q plot has a heavy tail on the right, likely due to high outliers in recent years under exceptionally high sablefish allocations.
Figure 13. Model diagnostic plots of model predicting number of vessels based on inflation-adjusted price per pound.
Figure 14 below shows that the predicted values for number of vessels per bimonthly period closely match the actual number of vessels in the historical data.
Figure 14. Plot of predicted vs. actual number of vessels in the LEN fleet.
The last step in the model is to multiply the predicted average landings per vessel by the predicted number of vessels in the fleet to get the fleetwide predicted landings by period. Figure 15 shows a comparison of the predicted (red) and actual (black) historical fleetwide landings. The current LEN model used in management generally performs well in predicting historical trends in fleetwide landings but over-predicts landings in 2020 and 2021, which is not surprising given the anomalously low effort during those years due to impacts from the COVID-19 pandemic. It also under-predicts 2022, which is likely due to unprecedented LEN trip limits as well as impacts to other fisheries that sablefish vessels participate in (e.g., Dungeness crab and salmon).
Figure 15. Historical comparison of predicted fleetwide landings (red) and actual fleetwide landings (black).
LEN_compare <- lm(LBS ~ predict, data = LEN_predictions)
summary(LEN_compare)
##
## Call:
## lm(formula = LBS ~ predict, data = LEN_predictions)
##
## Residuals:
## Min 1Q Median 3Q Max
## -68873 -12145 -2659 12142 89290
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -4680.3528 6924.0439 -0.676 0.501
## predict 1.0652 0.0766 13.905 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 25020 on 69 degrees of freedom
## Multiple R-squared: 0.737, Adjusted R-squared: 0.7332
## F-statistic: 193.4 on 1 and 69 DF, p-value: < 0.00000000000000022
Figure 16 below demonstrates a typical inseason management projection to keep catch within the sector-specific landings targets, with a separate panel for each action alternative. Projections are made under three alternative price scenarios (low, average, high), and expected future prices are estimated based on recent prices with a 10% buffer above and below the average to estimate the high and low price scenarios, respectively. In the case of Figure 16, trip limit options 2 and 3 are progressively lower than the status quo trip limits, which is why the landings projections are also lower than status quo.
Figure 16. Example of inseason management landings projections using the current LEN model.
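The projection code is not included in the report. The sketch below shows how the three price scenarios could feed into a projection. For simplicity it uses the pooled vessel model coefficients from the `LEN_ves_mod` summary output rather than period-specific coefficients. The recent price and the predicted landings per vessel are made-up values.

```r
# Assumed recent average inflation-adjusted price per lb. (made-up value).
recent_price <- 2.50
buffer <- 0.10  # 10% buffer above and below the average, as described in the text

price_scenarios <- c(
  low = recent_price * (1 - buffer),
  average = recent_price,
  high = recent_price * (1 + buffer)
)

# Pooled intercept and slope from the LEN_ves_mod summary output.
pred_vessels <- -10.013 + 12.120 * price_scenarios

# Hypothetical predicted average landings per vessel for one period (lbs.).
pred_avg_lb <- 2000
fleet_lbs <- pred_vessels * pred_avg_lb
round(fleet_lbs)
```

Since price enters only the vessel model, the spread between the low and high scenarios comes entirely from the predicted number of vessels.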
The following sections analyze potential improvements to the LEN sector model, such as adding variables or using a generalized linear model. For the LEN sector, the GMT explored:
In summary, the GMT concluded that fuel and Dungeness crab prices are not significantly influential for predicting LEN sablefish landings, but adding maximum AFI sablefish price as a predictor of vessel participation significantly improves the model in both the linear regression and generalized linear models. Up-weighting the most recent year of data and log-transforming the response variable only improves the prediction of average landings per vessel but not the number of vessels.
Table 9 below compares model summaries between a model that does not use data weights at all (“SQ Model”) and one that up-weights the most recent year of data (i.e., 2022), which improves the fit to the data. The GMT chose to up-weight only the most recent year of data for simplicity and because any other weighting scheme would likely be subjective to the modeler. Figure 17 plots the predicted landings per vessel against the observed, or actual, landings per vessel in the data.
Table 9: Comparison of linear regression models predicting average lbs. landed per vessel, using the status quo approach and adding data weights (i.e., upweighting most recent year).
| | SQ Model | Data Weighting |
|---|---|---|
| (Intercept) | 798.69 *** | 777.28 *** |
| | (153.16) | (141.59) |
| TL.WEEKLY | 0.50 *** | 0.42 *** |
| | (0.14) | (0.12) |
| TL.BIMON | 0.29 *** | 0.32 *** |
| | (0.06) | (0.06) |
| N | 71 | 71 |
| R2 | 0.75 | 0.86 |
| *** p < 0.001; ** p < 0.01; * p < 0.05. | | |
Figure 17. Plot of predicted vs. actual landings per vessel using data weights.
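The report does not state the weight values used. As a sketch of how up-weighting the most recent year could be implemented with the `weights` argument of `lm()`, the example below assumes the most recent year counts double. The data are simulated, and the `WEIGHT` values are an assumption made for illustration.

```r
set.seed(7)
# Simulated data: 12 years x 6 bimonthly periods (illustration only).
d <- expand.grid(PERIOD = 1:6, PACFIN_YEAR = 2011:2022)
d$TL.WEEKLY <- runif(nrow(d), 500, 3000)
d$TL.BIMON <- 2 * d$TL.WEEKLY + rnorm(nrow(d), 0, 300)
d$AVG_LB <- 800 + 0.5 * d$TL.WEEKLY + 0.3 * d$TL.BIMON + rnorm(nrow(d), 0, 450)

# Assumed weighting scheme: the most recent year gets weight 2, all others 1.
d$WEIGHT <- ifelse(d$PACFIN_YEAR == max(d$PACFIN_YEAR), 2, 1)

m_sq <- lm(AVG_LB ~ TL.WEEKLY + TL.BIMON, data = d)
m_wt <- lm(AVG_LB ~ TL.WEEKLY + TL.BIMON, data = d, weights = WEIGHT)

rbind(status_quo = coef(m_sq), weighted = coef(m_wt))
```

With `weights`, `lm()` minimizes the weighted sum of squared residuals, so observations from the up-weighted year pull the fitted coefficients toward recent fishery behavior.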
As shown in Table 10 below, up-weighting the most recent year of data does not improve the fit to the data when predicting number of vessels in the fleet. This could be caused by a number of factors, which could include the lower than average sablefish prices in 2022, as previously shown in Figure 3.
Table 10: Comparison of linear regression models predicting number of vessels, using the status quo approach and adding data weights (i.e., upweighting most recent year).
| | SQ Model | Data Weighting |
|---|---|---|
| (Intercept) | -10.01 * | -2.73 |
| | (4.79) | (5.17) |
| ADJ_PRICE | 12.12 *** | 10.15 *** |
| | (1.41) | (1.62) |
| N | 71 | 71 |
| R2 | 0.52 | 0.36 |
| *** p < 0.001; ** p < 0.01; * p < 0.05. | | |
Figure 18 compares the predicted and actual fleetwide LEN landings when only up-weighting the most recent year of data in the model that predicts landings per vessel, and the following output indicates that there is a 73.2% fit to the actual data (adjusted R-squared of 0.7323). This is a minuscule difference compared to a fit of 73.3% without data weights (adjusted R-squared of 0.7332), but up-weighting the most recent year provides assurance that recent fishery behavior is more informative to the model in light of anomalous events such as the recent COVID-19 pandemic or extremely high sablefish abundance.
Figure 18. Historical comparison of predicted fleetwide landings (red) and actual fleetwide landings (black), with data weighted in the model used to predict average lbs. per vessel.
LEN_compare <- lm(LBS ~ predict, data = LEN_predictions)
summary(LEN_compare)
##
## Call:
## lm(formula = LBS ~ predict, data = LEN_predictions)
##
## Residuals:
## Min 1Q Median 3Q Max
## -71536 -11570 -2745 11874 86714
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -4764.2790 6944.6477 -0.686 0.495
## predict 1.0696 0.0771 13.873 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 25060 on 69 degrees of freedom
## Multiple R-squared: 0.7361, Adjusted R-squared: 0.7323
## F-statistic: 192.5 on 1 and 69 DF, p-value: < 0.00000000000000022
Section 2.1.1 above discusses the distributions of the two response variables in the model. The data for average landings per vessel is not normally distributed, and while log-transforming the data does not completely normalize the data, it brings the skewness and kurtosis closer to that of a normal distribution, improves the model diagnostics compared to status quo, and slightly improves the fit of predicted to actual fleetwide landings data, as demonstrated below. Figure 19 compares the influence of weekly and bimonthly trip limits on average landings per vessel with no data transformation, log-transforming only the response variable, and log-transforming both the predictor and the response variable. Visually, the closest relationships appear to be between the log-transformed response variable and the log-transformed weekly limit as well as the un-transformed bimonthly limit. Log-transforming the bimonthly limit appears to distort the relationship.
Figure 19. Comparison of linear relationships when transforming the response and predictor variables in the model predicting landings per vessel.
Table 11 compares the model outputs between the status quo model (un-transformed data) and models in which response and/or predictor variables are log-transformed. Log-transforming the response variable (landings per vessel) and the weekly trip limit provides the best fit compared to the other data transformations, but it is still slightly weaker than the status quo model fit. However, Figure 20 shows that log-transforming the response variable and the weekly limit improves the model diagnostics, specifically the normal q-q plot. For all log transformation models (and the “status quo” model), the most recent year is up-weighted given the value in using this approach, as described in the previous section.
Table 11: Comparison of linear regression models predicting average lbs. landed per vessel, using the status quo approach and log-transforming the dependent variable.
| | SQ | log(y) | log(y) & log(wkly) | log(y) & log(wkly) + log(bimo) |
|---|---|---|---|---|
| (Intercept) | 777.28 *** | 7.28 *** | 5.23 *** | 2.43 *** |
| | (141.59) | (0.05) | (0.52) | (0.36) |
| TL.WEEKLY | 0.42 *** | 0.00 * | | |
| | (0.12) | (0.00) | | |
| TL.BIMON | 0.32 *** | 0.00 *** | 0.00 *** | |
| | (0.06) | (0.00) | (0.00) | |
| log(TL.WEEKLY) | | | 0.31 *** | 0.40 *** |
| | | | (0.08) | (0.07) |
| log(TL.BIMON) | | | | 0.31 *** |
| | | | | (0.08) |
| N | 71 | 71 | 71 | 71 |
| R2 | 0.86 | 0.79 | 0.82 | 0.81 |
| *** p < 0.001; ** p < 0.01; * p < 0.05. | | | | |
Figure 20. Diagnostic plots for the model predicting landings per vessel using log-transformed variables.
Since the response variable and one predictor variable are log-transformed in the model, prediction values need to be back-transformed in order to make interpretations and use them for management. The following code is used to extract the sigma value from each period-specific regression and then calculate the back-transformed landings predictions using those sigma values. Figure 21 plots the predicted landings per vessel using log transformations against the actual landings per vessel in the data.
# Residual standard error (sigma) of each period-specific regression
period_sigma <- sapply(split(LEN, LEN$PERIOD), function(d) {
  summary(lm(log(AVG_LB) ~ log(TL.WEEKLY) + TL.BIMON,
             data = d,
             weights = WEIGHT))$sigma
})
LEN <- LEN %>%
  mutate(sigma = unname(period_sigma[as.character(PERIOD)]))
# Back-transform the log-scale predictions. The 0.5 * sigma^2 term belongs
# inside exp() (lognormal bias correction) so that pred_catch estimates the
# mean, rather than the median, of landings per vessel.
LEN <- LEN %>%
  mutate(ln_catch = WEEKLY_COEF * log(TL.WEEKLY) + BIMO_COEF * TL.BIMON + INT_LAND,
         pred_catch = exp(ln_catch + 0.5 * sigma^2),
         time = as.numeric(paste(PACFIN_YEAR, PERIOD, sep = ".")))
Figure 21. Predicted vs. actual landings per boat using log-transformed avg. lbs. dependent variable and log-transformed weekly limit independent variable.
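The effect of the back-transformation can be illustrated with a minimal sketch on simulated data. All values below are hypothetical stand-ins, not fishery data, and the sketch assumes that residuals on the log scale are approximately normal, in which case the mean on the original scale is exp(prediction + 0.5 * sigma^2).

```r
# Simulated stand-in for one period of data (all values hypothetical)
set.seed(1)
n <- 5000
sim <- data.frame(TL.WEEKLY = runif(n, 1000, 3000),
                  TL.BIMON  = runif(n, 3000, 9000))
sim$AVG_LB <- exp(2 + 0.4 * log(sim$TL.WEEKLY) + 0.0001 * sim$TL.BIMON +
                    rnorm(n, sd = 0.5))

fit       <- lm(log(AVG_LB) ~ log(TL.WEEKLY) + TL.BIMON, data = sim)
sigma_hat <- summary(fit)$sigma   # residual standard error on the log scale
ln_catch  <- predict(fit)         # fitted values on the log scale

naive     <- exp(ln_catch)                        # estimates the median
corrected <- exp(ln_catch + 0.5 * sigma_hat^2)    # estimates the mean

c(actual = mean(sim$AVG_LB), naive = mean(naive), corrected = mean(corrected))
```

In this sketch the uncorrected predictions average below the simulated landings, while the corrected predictions track the simulated mean more closely.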
Log-transforming either the response variable or the predictor variable in the model that predicts number of vessels does not improve the model (Table 12) and, in fact, transformation worsens the model diagnostics by introducing heteroskedasticity (i.e., the residual plot resembles a funnel) and creates an S-shaped curve in the normal q-q plot (Figure 22).
Table 12: Comparison of linear regression models predicting number of vessels, using the status quo approach and log-transforming the variables.
| | SQ | log(y) | log(y) & log(avg price) |
|---|---|---|---|
| (Intercept) | -10.01 * | 1.98 *** | 1.62 *** |
| | (4.79) | (0.19) | (0.23) |
| ADJ_PRICE | 12.12 *** | 0.40 *** | |
| | (1.41) | (0.06) | |
| log(ADJ_PRICE) | | | 1.43 *** |
| | | | (0.20) |
| N | 71 | 71 | 71 |
| R2 | 0.52 | 0.43 | 0.44 |
| *** p < 0.001; ** p < 0.01; * p < 0.05. | | | |
Figure 22. Vessel number model plots using the non-transformed and transformed variables.
Figure 23 and the following model output show that log-transforming the data and up-weighting the most recent year in the model that predicts landings per vessel give a slightly lower fit to the actual fleetwide landings data (adjusted R-squared of 0.725) than the model currently used in management, but log-transforming the landings per vessel data improves the model diagnostics. In all other potential model improvements that follow, average landings per vessel are predicted using the log-transformed response variable and the log-transformed weekly trip limit, and the most recent year of data is up-weighted.
Figure 23. Historical comparison of predicted fleetwide landings (red) and actual fleetwide landings (black), with data weighted in the model used to predict average lbs. per vessel and log-transformation of the average lbs. dataset.
LEN_compare <- lm(LBS ~ predict, data = LEN_predictions)
summary(LEN_compare)
##
## Call:
## lm(formula = LBS ~ predict, data = LEN_predictions)
##
## Residuals:
## Min 1Q Median 3Q Max
## -81349 -12886 -2271 12063 77175
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -3647.71083 6989.70790 -0.522 0.603
## predict 1.07619 0.07897 13.628 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 25390 on 69 degrees of freedom
## Multiple R-squared: 0.7291, Adjusted R-squared: 0.7252
## F-statistic: 185.7 on 1 and 69 DF, p-value: < 0.00000000000000022
As described above, average prices currently used in the model are calculated based on inflation-adjusted ex-vessel revenue and landed weight. The GMT considered whether to use PacFIN’s inflation-adjusted price per pound (hereafter “AFI price” when referring to the PacFIN variable) in lieu of the calculated inflation-adjusted price per pound. The GMT concluded that continuing to calculate the price per pound was more appropriate, because averaging the AFI price gives landings of different sizes (e.g., 10,000 lbs. vs. 1,000 lbs.) the same weight (Table 13). Calculating the price per pound in Table 13 results in an average price ($3.18) that is lower than the average of the two fish ticket prices ($4.00), because fish ticket A landed 10 times as much sablefish but at a lower price. The blue line in Figure 24 below represents the trend in the bimonthly average of AFI prices, and the orange line represents the trend in calculated average prices.
The GMT did explore, however, the alternative use of median AFI prices as well as adding maximum and/or minimum AFI prices, since those values are still single data points. The green line in Figure 24 represents the trend in maximum AFI prices, the purple line represents the trend in median AFI prices, and the red line represents minimum AFI prices. All zero prices were removed from the data, because vessels may list $0 as the price per pound when selling to themselves or for some other reason where a typical sale was not actually made.
Table 13: Alternative methods of calculating average price per pound.
| | Landings (lbs.) | Revenue | Price per Lb. | Price Re-Calculated |
|---|---|---|---|---|
| Fish Ticket A | 10,000 | $30,000 | $3.00 | - |
| Fish Ticket B | 1,000 | $5,000 | $5.00 | - |
| Combined | 11,000 | $35,000 | $4.00 | $3.18 |
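The arithmetic behind Table 13 can be reproduced with a short sketch. The two fish tickets are the illustrative values from the table; the object names are hypothetical.

```r
# The two example fish tickets from Table 13
tickets <- data.frame(ticket   = c("A", "B"),
                      landings = c(10000, 1000),   # lbs.
                      revenue  = c(30000, 5000))   # dollars
tickets$price <- tickets$revenue / tickets$landings

# Averaging ticket prices gives each ticket equal weight
mean_of_prices <- mean(tickets$price)

# Dividing total revenue by total pounds weights each ticket by its landings
calculated_price <- sum(tickets$revenue) / sum(tickets$landings)

round(c(mean_of_prices = mean_of_prices, calculated_price = calculated_price), 2)
```

The first value corresponds to the $4.00 in the "Price per Lb." column and the second to the $3.18 in the "Price Re-Calculated" column.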
Figure 24. Minimum, median, average, and maximum inflation-adjusted sablefish price per pound in the LEN sector, 2011-2023. Zero prices were removed.
Figure 25 below plots the relationships between the number of vessels in the fleet and the minimum, median, and maximum AFI prices, alongside the calculated average inflation-adjusted price. All variables show a linear relationship with the number of vessels except the minimum AFI price, so the minimum price is not considered any further.
Figure 25. Relationships between number of LEN vessels by period and calculated average, minimum, median, and maximum sablefish prices.
Because the minimum, median, maximum, and average are all different statistics of the same data, the GMT explored whether there may be a correlation between the price variables (Figure 26). There does appear to be a strong linear relationship between median and average price, which is not surprising, but less so between average/median and maximum price, which has a more curved relationship. When considering correlations at an annual scale, however, median and average prices do not present a consistent pattern of positive or negative correlation with maximum price (Figure 27). There appears to be no correlation between minimum price and any other price variables.
Figure 26. Correlation check across sablefish price data.
Figure 27. Relationship of average and median prices with maximum price on an annual scale.
Table 14 below indicates that adding maximum AFI price to the model predicting number of vessels, in addition to the currently used calculated average price, provides a better fit. Additionally, using the median or maximum AFI price instead of the average price does not improve the model.
Table 14: Comparison of linear regression models predicting number of vessels, using the status quo approach and 4 alternative approaches with PacFIN AFI prices.
library(jtools)  # provides export_summs()
LEN_modsq <- lm(VES_NUM ~ ADJ_PRICE, data = LEN)
LEN_mod1 <- lm(VES_NUM ~ max_afi_price, data = LEN)
LEN_mod2 <- lm(VES_NUM ~ ADJ_PRICE + max_afi_price, data = LEN)
LEN_mod3 <- lm(VES_NUM ~ med_afi_price, data = LEN)
LEN_mod4 <- lm(VES_NUM ~ med_afi_price + max_afi_price, data = LEN)
# Side-by-side coefficient table for the five models
export_summs(LEN_modsq, LEN_mod1, LEN_mod2, LEN_mod3, LEN_mod4,
             model.names = c("Model SQ", "Max AFI Price",
                             "Avg+Max AFI Price", "Med AFI Price",
                             "Med+Max AFI Price"),
             statistics = NULL)
| | Model SQ | Max AFI Price | Avg+Max AFI Price | Med AFI Price | Med+Max AFI Price |
|---|---|---|---|---|---|
| (Intercept) | -10.01 * | -13.16 * | -22.48 *** | -7.50 | -20.40 *** |
| | (4.79) | (5.12) | (4.74) | (4.58) | (4.70) |
| ADJ_PRICE | 12.12 *** | | 7.61 *** | | |
| | (1.41) | | (1.48) | | |
| max_afi_price | | 5.11 *** | 3.23 *** | | 3.24 *** |
| | | (0.59) | (0.62) | | (0.64) |
| med_afi_price | | | | 11.16 *** | 6.85 *** |
| | | | | (1.32) | (1.42) |
| N | 71 | 71 | 71 | 71 | 71 |
| R2 | 0.52 | 0.52 | 0.66 | 0.51 | 0.64 |
| *** p < 0.001; ** p < 0.01; * p < 0.05. | | | | | |
The GMT used a likelihood ratio test to determine whether adding maximum AFI prices significantly improves the model (Table 15). A p-value of less than 0.05 indicates that the difference between the full model and the nested model is statistically significant and supports including the additional variable. In this case, the p-value is less than 0.05, supporting the inclusion of maximum AFI prices to predict the number of vessels.
The diagnostic plots of the status quo model and the model using both calculated average and maximum AFI sablefish prices to predict number of vessels are shown in Figure 28. Adding maximum AFI prices improves the residual vs. fitted plot but does not appear to greatly alter the normal q-q plot.
Table 15: Likelihood ratio test comparing a full model with calculated average and maximum AFI sablefish prices to a nested model without maximum AFI prices.
| #Df | LogLik | Df | Chisq | Pr(>Chisq) |
|---|---|---|---|---|
| 4 | -246 | | | |
| 3 | -258 | -1 | 23.4 | 1.34e-06 |
Figure 28. Model plots for the model predicting vessel number using both average and maximum sablefish prices.
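For readers who want to see the mechanics of the likelihood ratio test, the following is a minimal base-R sketch. The helper `lr_test()` and the simulated data are hypothetical illustrations and are not the code used to produce Table 15; the last line simply recomputes the p-value implied by the chi-square statistic reported in that table.

```r
# Likelihood ratio test for two nested lm fits, using base R only
lr_test <- function(nested, full) {
  chisq <- as.numeric(2 * (logLik(full) - logLik(nested)))
  df    <- attr(logLik(full), "df") - attr(logLik(nested), "df")
  c(Chisq = chisq, Df = df, p = pchisq(chisq, df = df, lower.tail = FALSE))
}

# Simulated stand-in data (hypothetical values, 71 rows to mirror N)
set.seed(42)
sim <- data.frame(ADJ_PRICE     = runif(71, 2, 5),
                  max_afi_price = runif(71, 4, 10))
sim$VES_NUM <- -20 + 8 * sim$ADJ_PRICE + 3 * sim$max_afi_price +
  rnorm(71, sd = 5)

nested <- lm(VES_NUM ~ ADJ_PRICE, data = sim)
full   <- lm(VES_NUM ~ ADJ_PRICE + max_afi_price, data = sim)
res    <- lr_test(nested, full)
res

# p-value implied by the Table 15 statistic (Chisq = 23.4 on 1 df)
p_table15 <- pchisq(23.4, df = 1, lower.tail = FALSE)
p_table15
```

The statistic is twice the difference in log-likelihoods, and the degrees of freedom equal the number of parameters added in the full model (one here).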
The overall fit of the predicted data to the actual fleetwide landings data improves when maximum AFI sablefish price is added to the model predicting number of vessels, as shown in Figure 29 and the following output. The fit of the status quo model is 73.2%, whereas the fit when maximum AFI price is added is 75.4% (adjusted R-squared of 0.754 in the output below). The model used to make the predictions in Figure 29 also log-transforms the landings per vessel and weekly trip limit data and back-transforms the landings per vessel predictions, as outlined in Section 2.2.2.
Figure 29. Historical comparison of predicted fleetwide landings (red) and actual fleetwide landings (black), with data weights, log-transformation of average lbs. data, and maximum AFI prices added to the model predicting number of vessels.
LEN_compare <- lm(LBS ~ predict, data = LEN_predictions)
summary(LEN_compare)
##
## Call:
## lm(formula = LBS ~ predict, data = LEN_predictions)
##
## Residuals:
## Min 1Q Median 3Q Max
## -95363 -12252 316 11346 77373
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -3683.12176 6514.16964 -0.565 0.574
## predict 1.07444 0.07319 14.680 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 24030 on 69 degrees of freedom
## Multiple R-squared: 0.7575, Adjusted R-squared: 0.754
## F-statistic: 215.5 on 1 and 69 DF, p-value: < 0.00000000000000022
Hindcast
# Join fish ticket data to trip limits and keep only years before 2022
LEN <- merge(LEN_tix, LEN_TL, by = keycols)
LEN <- LEN %>%
  select(-V7) %>%
  mutate(year_period = paste(PACFIN_YEAR, PERIOD, sep = "_")) %>%
  filter(PACFIN_YEAR < 2022)
# Attach the bimonthly AFI price statistics for the LEN sector
LEN_prices <- price_trend %>%
  filter(GMT_SABLEFISH_CODE == "LEN")
LEN <- LEN_prices %>%
  select(PACFIN_YEAR,
         PERIOD,
         avg_afi_price,
         min_afi_price,
         max_afi_price,
         med_afi_price) %>%
  full_join(LEN, by = c("PACFIN_YEAR", "PERIOD"))
Vessels that participate in the sablefish fishery also tend to participate in other fisheries, such as Dungeness crab, salmon, and Alaska sablefish. The Dungeness crab fishery in particular can be heavily market-driven, and higher crab prices can entice vessels to prioritize crab fishing. For that reason, the GMT explored whether Dungeness crab prices would influence vessel participation in the DTL sectors enough to consider including it as a variable in the model. Dungeness crab data are pulled from the PacFIN comprehensive fish ticket database using Thomson Fishery Code 01 and PacFIN Species Code “DCRB”.
As expected, higher Dungeness crab prices linearly correlate with lower vessel participation in the LEN sector (Figure 30). However, the GMT cannot rule out the possibility that these are unrelated events. In other words, Dungeness crab prices have been increasing over time while sablefish participation has declined in recent years, but those may be concurrent trends caused by different factors. Insight from participants in the fishery may elucidate any connection. Maximum Dungeness crab prices may have some relationship with participation, although not a linear one.
Figure 30. Relationship between inflation-adjusted Dungeness crab price per pound and three dependent variables in the LEN data.
The following code returns Table 16, which indicates that there is very little improvement to the model when adding either average or maximum Dungeness crab prices. This is also confirmed with a likelihood ratio test, where the p-value of 0.116 indicates that adding average Dungeness crab prices does not significantly improve the model (Table 17).
Table 16: Comparison of linear regression models predicting number of vessels, using sablefish average and maximum prices combined and 4 alternatives that add or substitute Dungeness crab inflation-adjusted prices per pound.
LEN_crabmod1 <- lm(VES_NUM ~ ADJ_PRICE + max_afi_price, data = LEN_crab)
LEN_crabmod2 <- lm(VES_NUM ~ adj_crab_price + ADJ_PRICE + max_afi_price, data = LEN_crab)
LEN_crabmod3 <- lm(VES_NUM ~ max_crab_price + ADJ_PRICE + max_afi_price, data = LEN_crab)
LEN_crabmod4 <- lm(VES_NUM ~ adj_crab_price, data = LEN_crab)
LEN_crabmod5 <- lm(VES_NUM ~ max_crab_price, data = LEN_crab)
# Side-by-side coefficient table for the five models
export_summs(LEN_crabmod1, LEN_crabmod2, LEN_crabmod3, LEN_crabmod4, LEN_crabmod5,
             model.names = c("Avg+Max Sablefish Prices", "+ Avg. Crab Price",
                             "+ Max Crab Price",
                             "Avg Crab Price", "Max Crab Price"),
             statistics = NULL)
| | Avg+Max Sablefish Prices | + Avg. Crab Price | + Max Crab Price | Avg Crab Price | Max Crab Price |
|---|---|---|---|---|---|
| (Intercept) | -22.48 *** | -11.67 | -19.00 ** | 55.81 *** | 41.28 *** |
| | (4.74) | (8.44) | (6.37) | (5.61) | (4.72) |
| ADJ_PRICE | 7.61 *** | 7.42 *** | 7.41 *** | | |
| | (1.48) | (1.47) | (1.50) | | |
| max_afi_price | 3.23 *** | 2.81 *** | 3.18 *** | | |
| | (0.62) | (0.67) | (0.63) | | |
| adj_crab_price | | -1.27 | | -4.93 *** | |
| | | (0.83) | | (1.05) | |
| max_crab_price | | | -0.22 | | -1.04 * |
| | | | (0.27) | | (0.42) |
| N | 71 | 71 | 71 | 71 | 71 |
| R2 | 0.66 | 0.67 | 0.66 | 0.24 | 0.08 |
| *** p < 0.001; ** p < 0.01; * p < 0.05. | | | | | |
Table 17: Likelihood ratio test of nested models. A p-value < 0.05 would indicate that including Dungeness crab prices significantly improves the model.
| #Df | LogLik | Df | Chisq | Pr(>Chisq) |
|---|---|---|---|---|
| 5 | -245 | | | |
| 4 | -246 | -1 | 2.47 | 0.116 |
The cost of dockside fuel can impact profit margins for fishery participants and influence the decision to participate and in which fishery. The GMT explored whether adding dockside fuel prices (adjusted for inflation) improves the model’s ability to predict the number of vessels in the fishery. 2011-2022 data for pre-tax dockside price per gallon are pulled from the Fisheries Economic Data Program’s EFIN Monthly Marine Fuel Prices database, managed by Pacific States Marine Fisheries Commission. The data are provided for each state separately and are broken out into each IOPAC port along the West Coast. For the purposes of the LEN and OAN models, the GMT used only California data from ports north of 36° N. lat. Additionally, OR and WA fuel price data were combined into one variable, because there is very little difference between OR and WA prices, whereas northern California fuel prices were notably higher (Figures 31 and 32).
All zero prices were filtered out of the data, as well as prices that included tax (since the vast majority of data are pre-tax). For 2011-2015, the removed data makes up 0% to 10% of each year’s data, but between 2016 and 2022, the amount of data removed constitutes an average of 23%, up to 33% in 2020, of the total annual fuel price data.
Figure 31. Trend in average bimonthly dockside fuel prices in WA, OR, northern CA, and southern CA, 2011-2022.
Figure 32. Distribution of average bimonthly fuel prices in WA, OR, northern CA, and southern CA, 2011-2022. Each panel is a bimonthly period (1-6).
Figure 33 indicates that fuel prices have very little influence on LEN participation. Although there seems to be some positive linear relationship between OR/WA fuel prices and number of vessels, it is unlikely to be causal, because higher fuel prices would not logically entice more vessels into the fishery.
Figure 33. Relationships between regional dockside fuel prices and LEN model response variables.
Tables 18 and 19 show that adding fuel prices from either OR/WA or northern CA does not improve the fit of the model predicting number of vessels. The “nested” model in the likelihood ratio test includes average calculated sablefish price and maximum sablefish AFI price, whereas the “full” model adds combined OR/WA fuel prices. The p-value of the likelihood ratio test indicates that the model with OR/WA fuel prices included is not significantly different from the simpler model.
Table 18: Comparison of linear regression models predicting number of vessels, using the sablefish average and maximum prices and dockside fuel prices from OR/WA combined and northern CA.
| | Avg+Max Sablefish Prices | OR+WA Fuel | CA North Fuel | Avg+Max Sable & OR+WA Fuel | Avg+Max Sable & CA N. Fuel | Avg+Max Sable & OR+WA & CA N. Fuel |
|---|---|---|---|---|---|---|
| (Intercept) | -22.48 *** | 13.97 * | 23.59 * | -24.00 *** | -26.45 *** | -30.65 * |
| | (4.74) | (6.79) | (10.19) | (5.58) | (7.55) | (12.23) |
| ADJ_PRICE | 7.61 *** | | | 7.38 *** | 7.52 *** | 8.03 *** |
| | (1.48) | | | (1.56) | (1.49) | (1.89) |
| max_afi_price | 3.23 *** | | | 3.24 *** | 3.26 *** | 3.30 *** |
| | (0.62) | | | (0.63) | (0.63) | (0.64) |
| adj_fuel_OR_WA | | 4.35 * | | 0.61 | | -1.81 |
| | | (1.77) | | (1.17) | | (4.13) |
| adj_fuel_CAN | | | 1.49 | | 0.91 | 2.92 |
| | | | (2.26) | | (1.35) | (4.77) |
| N | 71 | 71 | 71 | 71 | 71 | 71 |
| R2 | 0.66 | 0.08 | 0.01 | 0.66 | 0.66 | 0.66 |
| *** p < 0.001; ** p < 0.01; * p < 0.05. | | | | | | |
Table 19: Likelihood ratio test comparing a full model with average and maximum sablefish prices and OR/WA fuel prices to a nested model without OR/WA fuel prices.
| #Df | LogLik | Df | Chisq | Pr(>Chisq) |
|---|---|---|---|---|
| 5 | -246 | | | |
| 4 | -246 | -1 | 0.287 | 0.592 |
Generalized linear models (GLMs) can be useful when the response variable does not follow a normal distribution and therefore a linear regression could predict negative values by assuming a normal distribution. This appears to be the case for the number of vessels used in the DTL model. In the past, using trip limits as a predictor of vessel number has resulted in the model predicting a negative number of vessels in the fishery, which is largely why trip limits were no longer used as a predictor of vessel number as of Fall 2022. The log-link in a GLM forces the model to always lead to positive predictions, and as discussed in Section 2.1.1, vessel number data appear to fit a negative binomial distribution.
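The contrast between the two approaches can be sketched on simulated count data. The values below are hypothetical and chosen only so that the linear regression extrapolates below zero at low prices; `glm.nb()` comes from the MASS package, which ships with standard R installations.

```r
library(MASS)  # glm.nb()

# Simulated counts with a log-linear mean (all values hypothetical)
set.seed(7)
sim <- data.frame(price = seq(0, 6, length.out = 300))
sim$VES_NUM <- rnbinom(300, size = 20, mu = exp(1 + 0.5 * sim$price))

lin_fit <- lm(VES_NUM ~ price, data = sim)
nb_fit  <- glm.nb(VES_NUM ~ price, data = sim)   # log link by default

# Predict vessel counts at the lowest price in the simulated range
low_price <- data.frame(price = 0)
lin_pred  <- predict(lin_fit, newdata = low_price)
nb_pred   <- predict(nb_fit, newdata = low_price, type = "response")

c(linear = unname(lin_pred), neg_binomial = unname(nb_pred))
```

Because the negative binomial prediction is the exponential of the linear predictor, it cannot be negative, whereas the straight-line fit is free to cross zero.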
The following code uses a formula to find the model with the best fit out of a suite of predictor variables, based on each model’s Akaike Information Criterion (AIC) score. The response variable is number of vessels, and the predictor variables used in the formula are:
Table 20 is the first six rows of the output, with each row representing a separate model ranked from lowest (best) to highest AIC score. Note that all six of the highest ranked models shown in the table have nearly equal AIC scores (delta AIC below 1), which means they are supported by the data about equally well. All but the sixth model include, at a minimum, average sablefish price and maximum sablefish price, and all models include period as a fixed effect.
Likelihood ratio tests were used to determine whether adding average crab price or fuel prices from either region significantly improved the model compared to only using average and maximum sablefish prices (i.e., nested model; Table 21). All p-values are greater than 0.05, which means the nested model is sufficient.
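library(MASS)  # provides glm.nb()
# Full negative binomial model with all covariates; the leading 0 in the
# formula removes the intercept
model.full <- glm.nb(as.formula(
                       paste("VES_NUM",
                             paste(0, "+", paste(covars, collapse = " + ")),
                             sep = " ~ ")),
                     data = LEN,
                     na.action = "na.fail")
# Fit every combination of covariates, always keeping PERIOD, and rank by AIC
model.suite <- MuMIn::dredge(model.full,
                             rank = "AIC",
                             fixed = c("PERIOD"))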
Table 20: First six rows of the ranked model.suite output.
| adj_crab_price | adj_fuel_CAN | adj_fuel_OR_WA | ADJ_PRICE | max_afi_price | max_crab_price | med_afi_price | PERIOD | df | logLik | AIC | delta | weight |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.213 | 0.104 | + | 9 | -228 | 475 | 0 | 0.206483465365749 | |||||
| -0.267 | 0.238 | 0.155 | 0.0975 | + | 11 | -226 | 475 | 0.076 | 0.198781142501648 | |||
| -0.0366 | 0.205 | 0.0963 | + | 10 | -228 | 475 | 0.301 | 0.177662832608923 | ||||
| -0.0327 | -0.257 | 0.229 | 0.15 | 0.0911 | + | 12 | -226 | 476 | 0.683 | 0.146769478329992 | ||
| 0.207 | 0.102 | -0.00863 | + | 10 | -228 | 476 | 0.805 | 0.13808644631333 | ||||
| -0.374 | 0.339 | 0.0937 | 0.124 | + | 11 | -227 | 476 | 0.892 | 0.132216634880358 |
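The weight column in Table 20 can be related to the delta column with a short calculation. The sketch below applies the standard Akaike weight formula to the six delta values shown in the table; normalizing over only these six rows is an assumption, although the results agree with the reported weights to about three decimal places.

```r
# Delta AIC values as reported in the first six rows of Table 20
delta <- c(0, 0.076, 0.301, 0.683, 0.805, 0.892)

# Akaike weight: relative likelihood exp(-delta / 2), normalized over the set
rel_lik <- exp(-delta / 2)
weight  <- rel_lik / sum(rel_lik)

round(weight, 3)
```

The weights decline slowly down the table, which is another way of seeing that no single model in the top six is clearly preferred.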
Table 21: Likelihood ratio test results comparing highest ranked models from model.suite.
| Full Model | Nested Model | P-value |
|---|---|---|
| avg sable price + max sable price + avg crab price | avg sable price + max sable price | 0.1963 |
| avg sable price + max sable price + CA fuel + OR/WA fuel | avg sable price + max sable price | 0.1380 |
| avg sable price + max sable price + CA fuel | avg sable price + max sable price | 0.9043 |
Using a generalized linear model to predict the number of vessels with average and maximum sablefish prices, compared to a linear regression using the same variables, improves the model diagnostics, particularly the normal q-q plot (Figure 34).
Figure 34. Model diagnostic plots of a GLM and a linear regression, both using average and maximum sablefish prices to predict number of vessels.
Figure 35. Historical comparison of predicted fleetwide landings and actual fleetwide landings using a GLM to predict number of vessels based on average and maximum sablefish prices, as well as log-transforming data in the average landings per vessel model.
Overall, using a generalized linear model, compared to a linear regression, improves the model diagnostics but does not improve the fit to actual historical data. Recall that the adjusted R-squared using a linear regression with average and maximum sablefish price as covariates is 0.754, compared with 0.748 for the GLM in the output below.
##
## Call:
## lm(formula = LBS ~ predict, data = LEN_predictions)
##
## Residuals:
## Min 1Q Median 3Q Max
## -73227 -12887 -1721 11651 93926
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -8254.8169 6891.8148 -1.198 0.235
## predict 1.1341 0.0784 14.466 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 24300 on 69 degrees of freedom
## Multiple R-squared: 0.752, Adjusted R-squared: 0.7484
## F-statistic: 209.3 on 1 and 69 DF, p-value: < 0.00000000000000022
The OAN sector tends to be more volatile than the LEN sector, because a permit is not required to participate. Anywhere from 25 up to 150 vessels have participated in a single bimonthly period since 2011, excluding period 6 of 2015 during which the sector closed after exceeding the catch share (Figure 36). There is considerable overlap with other fisheries and industries as well. Vessels tend to be smaller, and trip lengths tend to be shorter for this sector, compared to the LEN sector. Given this difficulty in predicting fleet dynamics, and other issues with managing the sector, the Council recently added the development of an OA registration program to the workload prioritization list (Agenda Item F.8.a, NMFS Report 1, March 2023). In the future, such a registration program may be helpful for predicting OAN participation in the DTL model. Similar to the LEN sector, sablefish prices have been slowly declining since 2011, reaching an unprecedented low in 2022 (Figure 36).
Figure 36. OAN participation, sablefish landings (lbs.), and inflation-adjusted price per pound, 2011-2022.
The GMT assessed the OAN distribution assumptions using the same methods as the LEN model, outlined in Section 2.1.1.1 above.
The average landings per vessel data from the OAN sector are not normally distributed (Table 22), likely due to two data points that exceed 4,000 lbs., both from the latter half of 2022 when trip limits were the highest they have ever been for this sector. Removing those two data points would not be appropriate for making landings predictions, because future trip limits are likely to be as high, if not higher, due to high sablefish ACLs. Log-transforming the data is the only data transformation that provides a normal distribution (Table 22). Log-transforming the data also improves the normal q-q plot for average landings data (Figure 38).
Table 22: Shapiro-Wilk normality test for OAN avg. lbs. landed per vessel dataset
| Value | raw_data | log_data | sqrt_data | cube_data |
|---|---|---|---|---|
| W statistic | 0.77924 | 0.96759 | 0.89691 | 0.92643 |
| p value | 0.00000 | 0.06325 | 0.00003 | 0.00046 |
Figure 37. Top left panel: histogram of untransformed average landings per vessel. Remaining panels: histograms of the log, square root, and cube root transformed data.
Figure 38. Normal q-q plots of the untransformed and log-transformed OAN data on average landings per vessel.
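A comparison like the one in Table 22 can be sketched with base R's `shapiro.test()`. The data below are simulated, right-skewed stand-ins (hypothetical values, not OAN landings), and the column names simply mirror the table.

```r
# Simulated right-skewed stand-in for average lbs. per vessel (hypothetical)
set.seed(123)
avg_lb <- rlnorm(71, meanlog = 6.5, sdlog = 1)

transformed <- list(raw_data  = avg_lb,
                    log_data  = log(avg_lb),
                    sqrt_data = sqrt(avg_lb),
                    cube_data = avg_lb^(1 / 3))   # cube root

# One column per transformation: W statistic and p value
sw <- sapply(transformed, function(x) {
  test <- shapiro.test(x)
  c(W = unname(test$statistic), p_value = test$p.value)
})
round(sw, 5)
```

A W statistic closer to 1 and a p value above 0.05 indicate that the test does not reject normality for that transformation.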
Skewness
As shown in Table 23, the data transformation that reduces skewness the most is a log transformation, and the p-value is closest to 0.05 with a log-transformation.
Table 23: Skewness values for the non-transformed average lbs. per vessel dataset and each of the transformed datasets, along with the t value and p value used to determine whether the skewness is significantly different from that of a normal distribution.
| Value | raw_data | log_data | sqrt_data | cube_data |
|---|---|---|---|---|
| skewness | 2.445 | 0.568 | 1.439 | 1.132 |
| t value | 8.411 | 1.952 | 4.951 | 3.894 |
| p value | 0.000 | 0.027 | 0.000 | 0.000 |
Kurtosis
The OAN data for average landings per vessel have a high excess kurtosis (8.5, with a t value of 14.6), where zero is the excess kurtosis of a normal distribution (Table 24). Log-transforming the OAN average landings data substantially reduces the kurtosis and brings it closest to that of a normal distribution.
Table 24: Kurtosis values for the non-transformed average lbs. per vessel dataset and each of the transformed datasets, along with the t value used to determine whether the kurtosis is significantly different from that of a normal distribution.
| Value | raw_data | log_data | sqrt_data | cube_data |
|---|---|---|---|---|
| kurtosis | 8.517 | 1.097 | 3.664 | 2.553 |
| t value | 14.649 | 1.886 | 6.302 | 4.391 |
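The t values in Tables 23 and 24 are consistent with dividing each statistic by its large-sample standard error under normality, sqrt(6/n) for skewness and sqrt(24/n) for excess kurtosis, with n = 71. The report does not state the formula, so the sketch below is an inference from the tabulated values, and the degrees of freedom used for the one-sided p value are an assumption.

```r
n <- 71   # number of bimonthly observations (assumed from N in the regression tables)

skewness <- c(raw_data = 2.445, log_data = 0.568,
              sqrt_data = 1.439, cube_data = 1.132)
kurtosis <- c(raw_data = 8.517, log_data = 1.097,
              sqrt_data = 3.664, cube_data = 2.553)

# Large-sample standard errors under normality
t_skew <- skewness / sqrt(6 / n)
t_kurt <- kurtosis / sqrt(24 / n)

# One-sided p value for skewness; n - 1 degrees of freedom is an assumption
p_skew <- pt(t_skew, df = n - 1, lower.tail = FALSE)

round(rbind(t_skew, p_skew, t_kurt), 3)
```

Small differences from the tabulated t values arise because the skewness and kurtosis inputs here are already rounded to three decimals.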
The number of vessels participating in the OAN sector is not normally distributed according to the results of the Shapiro-Wilk normality test in Table 25, but log-transforming the data normalizes it (p-value > 0.05), as do all other data transformations.
Table 25: Shapiro-Wilk normality test for OAN number of vessels dataset
| Value | raw_data | log_data | sqrt_data | cube_data |
|---|---|---|---|---|
| W statistic | 0.949 | 0.967 | 0.969 | 0.971 |
| p value | 0.006 | 0.061 | 0.079 | 0.103 |
Figure 39. Top left panel: histogram of untransformed number of vessels dataset. Remaining panels: histograms of the log, square root, and cube root transformed datasets.
Skewness
The OAN number of vessels is skewed right but not significantly different from a normal distribution (p-value > 0.05; Table 26). Log-transforming the data normalizes the skew even more but in the opposite direction (left). All data transformations provide a skewness that is not significantly different from that of a normal distribution.
Table 26: Skewness values for the non-transformed number of vessels dataset and each of the transformed datasets, along with the t value and p value used to determine whether the skewness is significantly different from that of a normal distribution.
| Value | raw_data | log_data | sqrt_data | cube_data |
|---|---|---|---|---|
| skewness | 0.466 | -0.256 | 0.117 | -0.005 |
| t value | 1.602 | -0.879 | 0.404 | -0.017 |
| p value | 0.057 | 0.809 | 0.344 | 0.507 |
Kurtosis
The un-transformed vessel number data, as well as all transformed data, have low excess kurtosis values that are close to zero, the excess kurtosis of a normal distribution (Table 27).
Table 27: Kurtosis values for the non-transformed number of vessels dataset and each of the transformed datasets, along with the t value used to determine whether the kurtosis is significantly different from that of a normal distribution.
| Value | raw_data | log_data | sqrt_data | cube_data |
|---|---|---|---|---|
| kurtosis | -0.778 | -0.865 | -0.949 | -0.953 |
| t value | -1.338 | -1.487 | -1.632 | -1.639 |
Non-Normal Distributions of Count Data
Similar to the vessel number data for the LEN sector, the distribution of OAN vessel numbers closely resembles the expected negative binomial distribution but not the expected Poisson distribution. The use of a negative binomial distribution is explored in Section 3.2.5 on using a GLM for the OAN model.
Figure 40. Distributions of the vessel number observed data, Poisson expected, and negative binomial expected.
Unlike the LEN model, data weighting has historically been used for the OAN model in making predictions for management decisions, and therefore, the GMT does not provide that as a potential improvement. Rather, every OAN model run inherently up-weights the most recent year. Figure 41 plots the relationships between average landings per vessel in the OAN sector against OAN weekly trip limits, bimonthly trip limits, and calculated inflation-adjusted price per pound. Similar to the LEN sector, average landings per vessel in the OAN fleet are heavily influenced by trip limits with a potentially non-linear relationship with average price per pound.
Figure 41. Relationships between average landings per vessel and three independent variables.
Using the three independent variables in Figure 41, there is very little difference between using the weekly limit only, the bimonthly limit only, or using both the weekly and bimonthly limit (Table 28). This is most likely because, unlike the LEN sector, the OAN weekly trip limits have been exactly half the bimonthly trip limits since 2012. Thus, using both variables in the model is duplicative. Since weekly trip limits can be more constraining than bimonthly limits, the weekly trip limit is used as the sole predictor of average landings per vessel. 94.8% of the variance in average landings per vessel can be explained by weekly trip limits (model summary output).
Table 28: Comparison of linear regression models predicting average lbs. landed per vessel, using status quo independent variables.
| | Weekly | Bimonthly | Wkly + Bimon | Wkly + Bimon + Wkly:Bimon |
|---|---|---|---|---|
| (Intercept) | 201.14 ** | 193.63 ** | 196.11 ** | 357.49 * |
| | (64.34) | (64.91) | (64.95) | (144.04) |
| TL.WEEKLY | 0.97 *** | | 0.57 | 0.49 |
| | (0.04) | | (0.57) | (0.57) |
| TL.BIMON | | 0.49 *** | 0.20 | 0.14 |
| | | (0.02) | (0.28) | (0.29) |
| TL.WEEKLY:TL.BIMON | | | | 0.00 |
| | | | | (0.00) |
| N | 71 | 71 | 71 | 71 |
| R2 | 0.87 | 0.87 | 0.88 | 0.88 |
| *** p < 0.001; ** p < 0.01; * p < 0.05. | | | | |
##
## Call:
## lm(formula = AVG_LB ~ TL.WEEKLY, data = OAN, weights = WEIGHT)
##
## Weighted Residuals:
## Min 1Q Median 3Q Max
## -958.45 -125.84 6.46 125.14 835.96
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 112.59656 56.11239 2.007 0.0487 *
## TL.WEEKLY 1.05544 0.02959 35.675 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 271.4 on 69 degrees of freedom
## Multiple R-squared: 0.9486, Adjusted R-squared: 0.9478
## F-statistic: 1273 on 1 and 69 DF, p-value: < 0.00000000000000022
Using weekly trip limit as the predictor of average landings per vessel, the model diagnostics show clear heteroskedasticity: the residuals are clustered on the left-hand side of the residuals vs. fitted plot, and there is a diagonal trend line in the scale-location plot (Figure 42). The normal q-q plot is also heavy-tailed on both ends. Even so, the predictions of landings per vessel are very close to the actual landings per vessel (Figure 43).
Figure 42. Model diagnostic plots for the model predicting average landings per vessel based on weekly trip limits.
Figure 43. Plot of predicted vs. actual average landings per vessel in the OAN sector, using weekly limit as a predictor.
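The visual heteroskedasticity check described above can be complemented with a numeric one. The sketch below is illustrative only: it uses simulated data (not GMT data) with a larger sample than the report's 71 observations, no weights, and a hand-rolled studentized Breusch-Pagan statistic (n times the R-squared of an auxiliary regression of squared residuals on fitted values). The helper `bp_test` and all variable names are hypothetical, and the source does not indicate that the GMT ran such a test.

```r
# Simulated data (NOT GMT data): spread of landings grows with the trip limit
set.seed(42)
n <- 400
tl_weekly <- runif(n, 300, 3000)
avg_lb <- tl_weekly * exp(rnorm(n, sd = 0.25))  # multiplicative noise

# Hypothetical helper: studentized Breusch-Pagan statistic for a
# single-predictor linear model (auxiliary regression on fitted values)
bp_test <- function(fit) {
  sq_res   <- residuals(fit)^2
  fit_vals <- fitted(fit)
  aux  <- lm(sq_res ~ fit_vals)
  stat <- length(sq_res) * summary(aux)$r.squared
  c(statistic = stat,
    p.value = pchisq(stat, df = 1, lower.tail = FALSE))
}

fit_raw <- lm(avg_lb ~ tl_weekly)            # un-transformed model
fit_log <- lm(log(avg_lb) ~ log(tl_weekly))  # log-log model

bp_raw <- bp_test(fit_raw)
bp_log <- bp_test(fit_log)
print(rbind(raw = bp_raw, loglog = bp_log))
```

A small p-value for the un-transformed model indicates that the residual variance changes with the fitted values, which is the pattern the residuals vs. fitted and scale-location plots show visually.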
For the OAN model that predicts the number of vessels in the fleet, a “period 4 peak adjuster”, developed by Dr. Sean Matson (NOAA; former GMT member), is used as a covariate instead of creating regression coefficients for each period. This is because OAN participation peaks in period 4, and the further away from period 4, the fewer vessels tend to participate in the fishery. The following scores are given to each of the six periods:
Figure 44 shows that relationship, in which a value of 0 represents period 4 with the most vessels and a value of -3 represents period 1 with the fewest vessels. The figure also demonstrates a positive linear relationship with average sablefish price per pound but no obvious relationship with weekly trip limits. The period and the average sablefish price per pound explain 42.9% of the variance in OAN vessel participation (adjusted R-squared in the model summary output). The relationship of predicted to actual values for the number of vessels is not as strong as that of average landings per vessel (Figure 45).
Figure 44. Relationships between number of vessels per bimonthly period and three independent variables.
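One way to construct a score with the properties described above is sketched below. The text only states that period 4 scores 0 and period 1 scores -3; the scores for the other periods here are an assumption (the negative distance from period 4), not a confirmed description of the GMT's adjuster.

```r
# ASSUMPTION: the adjuster equals the negative distance from period 4.
# Only the values for period 4 (0) and period 1 (-3) are stated in the text.
period <- 1:6
per_4_peak <- -abs(period - 4)

# Lookup table that could be joined to bimonthly data by PERIOD
peak_scores <- data.frame(PERIOD = period, PER.4.PEAK = per_4_peak)
print(peak_scores)
```

Under this assumption, periods on either side of the peak receive the same score (for example, periods 3 and 5), so a single coefficient captures the rise and fall in participation around period 4.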
OAN_ves_mod <- lm(VES_NUM ~ PER.4.PEAK + ADJ_PRICE, data = OAN, weights = WEIGHT)
summary(OAN_ves_mod)
##
## Call:
## lm(formula = VES_NUM ~ PER.4.PEAK + ADJ_PRICE, data = OAN, weights = WEIGHT)
##
## Weighted Residuals:
## Min 1Q Median 3Q Max
## -59.10 -24.62 1.08 19.60 69.76
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 28.776 16.319 1.763 0.0823 .
## PER.4.PEAK 15.401 3.025 5.091 0.00000302 ***
## ADJ_PRICE 22.816 4.894 4.662 0.00001510 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 28.1 on 68 degrees of freedom
## Multiple R-squared: 0.4455, Adjusted R-squared: 0.4292
## F-statistic: 27.32 on 2 and 68 DF, p-value: 0.000000001959
Figure 45. Plot of predicted vs. actual number of vessels in the OAN fleet, using the period 4 peak adjuster and average sablefish prices as predictors.
Comparing the retrospective predictions to the actual historical data shows that the OAN model struggles to capture the annual fluctuation in fleetwide landings, largely due to the weakness of the model predicting number of vessels (Figure 46). The predictions explain only 40.4% of the variance in the actual data (adjusted R-squared). For this reason, incorporating other market factors that contribute to a willingness to participate, such as prices from other fisheries, may be more helpful for predicting participation in the OAN sector than in the LEN sector.
Figure 46. Historical comparison of predicted fleetwide landings (red) and actual fleetwide landings (black).
OAN_compare <- lm(LBS ~ predict, data = OAN_predictions)
summary(OAN_compare)
##
## Call:
## lm(formula = LBS ~ predict, data = OAN_predictions)
##
## Residuals:
## Min 1Q Median 3Q Max
## -112049 -34038 76 27674 156059
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 11335.9223 15925.2120 0.712 0.479
## predict 0.9031 0.1299 6.954 0.00000000161 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 51650 on 69 degrees of freedom
## Multiple R-squared: 0.412, Adjusted R-squared: 0.4035
## F-statistic: 48.36 on 1 and 69 DF, p-value: 0.000000001613
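The retrospective fit reported above regresses actual fleetwide landings on predicted fleetwide landings. The sketch below shows one way such a comparison could be assembled, assuming the fleetwide prediction is the product of predicted average landings per vessel and predicted number of vessels. The data are simulated (not GMT data), the weights used in the report are omitted, and `OAN_sim`, `lbs_mod`, `ves_mod`, and `retro_fit` are hypothetical names.

```r
# Simulated bimonthly data (NOT GMT data)
set.seed(7)
n <- 71
OAN_sim <- data.frame(
  TL.WEEKLY  = runif(n, 300, 3000),
  PER.4.PEAK = -abs(rep_len(1:6, n) - 4),  # assumed peak-adjuster scores
  ADJ_PRICE  = runif(n, 1.5, 4)
)
OAN_sim$AVG_LB <- pmax(50, 100 + OAN_sim$TL.WEEKLY + rnorm(n, sd = 150))
OAN_sim$VES_NUM <- pmax(1, round(30 + 15 * OAN_sim$PER.4.PEAK +
                                   20 * OAN_sim$ADJ_PRICE +
                                   rnorm(n, sd = 25)))
OAN_sim$LBS <- OAN_sim$AVG_LB * OAN_sim$VES_NUM  # "actual" fleetwide landings

# Two sub-models: landings per vessel and number of vessels
lbs_mod <- lm(AVG_LB ~ TL.WEEKLY, data = OAN_sim)
ves_mod <- lm(VES_NUM ~ PER.4.PEAK + ADJ_PRICE, data = OAN_sim)

# ASSUMPTION: fleetwide prediction = predicted lbs per vessel x predicted vessels
OAN_sim$predict <- fitted(lbs_mod) * fitted(ves_mod)

# Retrospective fit: adjusted R-squared of actual vs. predicted landings
compare   <- lm(LBS ~ predict, data = OAN_sim)
retro_fit <- summary(compare)$adj.r.squared
print(retro_fit)
```

A slope near 1 and an intercept near 0 in `compare` would indicate unbiased predictions, while the adjusted R-squared summarizes how much of the period-to-period variation the combined model captures.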
The GMT explored the following potential improvements to the OAN model:
As previously discussed, the average landings per vessel data are not normally distributed, and the model using un-transformed data exhibits heteroskedasticity. Log-transforming the response variable improves normality. However, as shown in Figures 47 and 48 below, log-transforming both the response variable and the predictor variable (weekly trip limit) provides the greatest improvement to both the linear relationship and the model diagnostics, with less of an impact on fit than transforming only the response variable (Table 29).
Figure 47. Relationship plots between average landings per vessel and weekly limit, using raw data, log-transforming only the response variable, or log-transforming both the response variable and the predictor variable.
Table 29: Comparison of linear regression models predicting average landings per vessel, using raw data, transforming only the response variable, or transforming both the response and the predictor variables.
| | SQ | log(y) | log(x) & log(y) |
|---|---|---|---|
| (Intercept) | 112.60 * | 6.61 *** | 0.59 * |
| | (56.11) | (0.04) | (0.25) |
| TL.WEEKLY | 1.06 *** | 0.00 *** | |
| | (0.03) | (0.00) | |
| log(TL.WEEKLY) | | | 0.94 *** |
| | | | (0.03) |
| N | 71 | 71 | 71 |
| R2 | 0.95 | 0.86 | 0.91 |
| *** p < 0.001; ** p < 0.01; * p < 0.05. | | | |
Figure 48. Model diagnostics plots for the un-transformed landings per vessel data (left) compared to log-transforming the response variable (middle) or log-transforming both the response variable and the predictor variable (right).
Figure 49 plots the predicted values using a model with log-transformed response and predictor variables against the actual landings per vessel in the data and demonstrates that a log transformation model predicts landings per vessel well. Log-transforming the data results in very little difference for the historical fleetwide landings predictions (Figure 50), but removing heteroskedasticity is important for making predictions with confidence to inform management decisions. In all other potential model improvements that follow, average landings per vessel are predicted using log-transformed data of both the response and predictor variables.
Figure 49. Plot of predicted vs. actual landings per vessel using a linear model with log-transformed response and predictor variables.
Figure 50. Historical comparison of predicted fleetwide landings (red) and actual fleetwide landings (black), using a landings per vessel model that log-transforms the response and predictor variables.
##
## Call:
## lm(formula = LBS ~ predict, data = OAN_predictions)
##
## Residuals:
## Min 1Q Median 3Q Max
## -111121 -33548 -1375 27338 154599
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12506.3902 15866.5311 0.788 0.433
## predict 0.8982 0.1300 6.907 0.00000000196 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 51790 on 69 degrees of freedom
## Multiple R-squared: 0.4088, Adjusted R-squared: 0.4002
## F-statistic: 47.71 on 1 and 69 DF, p-value: 0.000000001957
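Because the log-log model predicts on the log scale, its output has to be back-transformed before it can be combined with vessel numbers. The following minimal sketch assumes simulated data (not GMT data) generated with coefficients loosely based on Table 29, and assumes a simple `exp()` back-transformation; the source does not state how the GMT back-transforms predictions. Under log-normal errors, `exp()` of a log-scale prediction estimates the median rather than the mean of landings per vessel.

```r
# Simulated data (NOT GMT data), coefficients loosely based on Table 29
set.seed(3)
n <- 71
sim <- data.frame(TL.WEEKLY = runif(n, 300, 3000))
sim$AVG_LB <- exp(0.6 + 0.94 * log(sim$TL.WEEKLY) + rnorm(n, sd = 0.2))

# Log-transform both the response and the predictor
loglog_mod <- lm(log(AVG_LB) ~ log(TL.WEEKLY), data = sim)

# Predict for candidate weekly trip limits (hypothetical values)
new_limits <- data.frame(TL.WEEKLY = c(1000, 2000, 3000))
pred_log <- predict(loglog_mod, newdata = new_limits)

# ASSUMPTION: simple back-transformation to pounds per vessel
pred_lb <- exp(pred_log)

# The slope is an elasticity: a 1% increase in the weekly limit is
# associated with roughly `elasticity` percent more landings per vessel
elasticity <- coef(loglog_mod)[["log(TL.WEEKLY)"]]
print(data.frame(new_limits, pred_lb))
print(elasticity)
```

Read this way, the 0.94 coefficient on log(TL.WEEKLY) in Table 29 suggests that average landings per vessel rise slightly less than proportionally with the weekly trip limit.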
The same minimum, median, and maximum price variables that were explored for the LEN model were explored for predicting the number of vessels in the OAN model, but the price values used for OAN are pulled from OAN fish tickets. Figure 51 shows the trend in prices and demonstrates that there tends to be less variation in the maximum price compared to the LEN sector but a similar trend in average and median prices.
Figure 51. Minimum, median, average, and maximum inflation-adjusted price per pound in the OAN sector, 2011-2023. Zero prices were removed.
The number of vessels in the OAN fleet per bimonthly period is influenced by the average, median, and maximum sablefish prices, but there does not appear to be a relationship to the minimum prices (Figure 52). Similar to the LEN sector, there does appear to be a strong correlation between median and average prices and potentially some non-linear relationship between median/average prices and maximum price (Figure 53). However, the relationship varies year-to-year and does not show a clear pattern across years (Figure 54).
Figure 52. Relationships between number of OAN vessels and calculated average, minimum, median, and maximum sablefish prices.
Figure 53. Correlation check across sablefish price data.
Figure 54. Relationship of average and median prices with maximum price on an annual scale.
Using various price values as predictors of the number of OAN vessels per bimonthly period, the model with the best fit to data includes both average and maximum sablefish prices (Table 30). A likelihood ratio test indicates that adding maximum sablefish prices to the model, along with average prices, significantly improves the model fit (Table 31).
Table 30: Comparison of linear regression models predicting number of vessels, using the status quo approach and 5 alternative approaches with PacFIN AFI prices.
| | Model SQ | Max AFI Price | Avg+Max AFI Price | Med AFI Price | Med+Max AFI Price |
|---|---|---|---|---|---|
| (Intercept) | 28.78 | -55.96 ** | -59.59 ** | 34.49 * | -57.27 ** |
| | (16.32) | (20.24) | (20.17) | (14.82) | (20.00) |
| ADJ_PRICE | 22.82 *** | | 7.48 | | |
| | (4.89) | | (4.79) | | |
| PER.4.PEAK | 15.40 *** | 11.85 *** | 11.94 *** | 14.96 *** | 11.82 *** |
| | (3.02) | (2.57) | (2.55) | (3.02) | (2.54) |
| max_afi_price | | 16.81 *** | 14.67 *** | | 14.51 *** |
| | | (2.12) | (2.51) | | (2.51) |
| med_afi_price | | | | 18.14 *** | 6.25 |
| | | | | (3.79) | (3.74) |
| N | 71 | 71 | 71 | 71 | 71 |
| R2 | 0.45 | 0.62 | 0.63 | 0.45 | 0.64 |
| *** p < 0.001; ** p < 0.01; * p < 0.05. | | | | | |
Table 31: Likelihood ratio test of nested models. A p-value < 0.05 means including maximum prices significantly improves the model.
| #Df | LogLik | Df | Chisq | Pr(>Chisq) |
|---|---|---|---|---|
| 4 | -331 | |||
| 5 | -317 | 1 | 29.3 | 6.12e-08 |
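The likelihood ratio statistic in Table 31 is twice the difference in log-likelihood between the full and nested models, compared against a chi-squared distribution with degrees of freedom equal to the number of added parameters. The sketch below reproduces that calculation by hand in base R on simulated data (not GMT data); the source does not show which function the GMT called, so the variable names and the manual approach are assumptions.

```r
# Simulated data (NOT GMT data)
set.seed(11)
n <- 71
sim <- data.frame(
  PER.4.PEAK    = -abs(rep_len(1:6, n) - 4),  # assumed peak-adjuster scores
  ADJ_PRICE     = runif(n, 1.5, 4),
  max_afi_price = runif(n, 4, 10)
)
sim$VES_NUM <- -55 + 12 * sim$PER.4.PEAK + 7 * sim$ADJ_PRICE +
  15 * sim$max_afi_price + rnorm(n, sd = 20)

# Nested model (average price only) and full model (adds maximum price)
nested <- lm(VES_NUM ~ PER.4.PEAK + ADJ_PRICE, data = sim)
full   <- lm(VES_NUM ~ PER.4.PEAK + ADJ_PRICE + max_afi_price, data = sim)

# Likelihood ratio statistic: 2 * (logLik(full) - logLik(nested))
lr_stat <- as.numeric(2 * (logLik(full) - logLik(nested)))

# Degrees of freedom: difference in estimated parameters
# (logLik counts the residual variance, hence 4 vs. 5 for these models)
df_diff <- attr(logLik(full), "df") - attr(logLik(nested), "df")

p_value <- pchisq(lr_stat, df = df_diff, lower.tail = FALSE)
print(c(lr_stat = lr_stat, df = df_diff, p_value = p_value))
```

Applied to the rounded log-likelihoods in Table 31, the same arithmetic gives 2 × (−317 − (−331)) = 28, which agrees with the reported 29.3 up to rounding of the log-likelihood values.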
Adding maximum sablefish prices appears to add some curvature to the residuals plot, and the residuals appear slightly less random than in the model using only average sablefish prices. However, including maximum prices improves the normal q-q plot by smoothing the bottom tail (Figure 55).
Figure 55. Model plots for the model predicting vessel number using both average and maximum sablefish prices.
Adding maximum sablefish price to the model to predict the number of vessels in the OAN fleet does not improve the fit to actual historical data. Recall that the fit for the status quo model is 0.4035 and the fit for the log-transformation model is 0.4002, virtually identical to the fit when both log-transformation and maximum sablefish prices are used. The next section will demonstrate that maximum Dungeness crab prices may be a better predictor than maximum sablefish prices, when used alongside average sablefish prices.
Figure 56. Historical comparison of predicted fleetwide (red) and actual fleetwide landings (black), using a model that log-transforms the average landings and weekly trip limit data and adds maximum sablefish prices to predict number of vessels.
##
## Call:
## lm(formula = LBS ~ predict, data = OAN_predictions)
##
## Residuals:
## Min 1Q Median 3Q Max
## -111121 -33548 -1375 27338 154599
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12506.3902 15866.5311 0.788 0.433
## predict 0.8982 0.1300 6.907 0.00000000196 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 51790 on 69 degrees of freedom
## Multiple R-squared: 0.4088, Adjusted R-squared: 0.4002
## F-statistic: 47.71 on 1 and 69 DF, p-value: 0.000000001957
The only relationship that OAN vessel participation appears to have with Dungeness crab prices is with maximum crab price (Figure 57), and that relationship is inverse. This is expected, because it suggests that higher crab prices will tend to lead vessels to prioritize the Dungeness crab fishery over the sablefish fishery, especially given that Dungeness crab prices have risen since 2011 while sablefish prices have declined. Particularly for the OA fishery, portfolio diversity is only likely to increase with climate change and the need for adaptability, and projection models are likely to benefit from considering the crossover between fisheries.
Figure 57. Relationship plots between OAN number of vessels and average, minimum, and maximum Dungeness crab prices.
Table 32 shows a slight improvement in the fit of the model when maximum Dungeness crab prices are included in addition to average sablefish prices to predict the number of OAN vessels. A likelihood ratio test also indicates that the addition of maximum crab prices significantly improves the model, compared to only using sablefish prices (Table 33). In all cases, the period 4 adjuster is used as well.
Table 32: Comparison of linear regression models using the status quo approach and adding maximum Dungeness crab prices.
| | Model SQ | Max Crab Price | Avg Sable + Max Crab Prices |
|---|---|---|---|
| (Intercept) | 28.34 | 136.02 *** | 61.77 ** |
| | (16.35) | (12.33) | (22.02) |
| ADJ_PRICE | 22.97 *** | | 19.61 *** |
| | (4.90) | | (5.01) |
| PER.4.PEAK | 15.24 *** | 13.81 *** | 13.39 *** |
| | (3.03) | (3.38) | (3.07) |
| max_crab_price | | -3.77 ** | -2.44 * |
| | | (1.17) | (1.11) |
| N | 70 | 70 | 70 |
| R2 | 0.45 | 0.37 | 0.48 |
| *** p < 0.001; ** p < 0.01; * p < 0.05. | | | |
Table 33: Likelihood ratio test comparing two models with (full) and without (nested) Dungeness crab prices to predict number of vessels.
| #Df | LogLik | Df | Chisq | Pr(>Chisq) |
|---|---|---|---|---|
| 5 | -324 | |||
| 4 | -327 | -1 | 4.93 | 0.0264 |
The model diagnostics also improve when maximum Dungeness crab prices are added to the model to predict OAN number of vessels. The normal q-q plot hugs the line more closely in the full model, and the residuals appear more randomly scattered. Recall that the inclusion of maximum sablefish prices worsened the model diagnostics, compared to the status quo model.
Figure 58. Model plots for the model predicting vessel number using both average sablefish and maximum Dungeness crab prices.
In Figure 59 below, note that the model that includes maximum Dungeness crab prices better predicts the extremely high sablefish landings in 2022 than comparable models that do not (Figures 50 and 56). Additionally, the fit of the model (0.4553) is higher than that of the status quo model or the model that includes maximum sablefish prices. However, the model still struggles to capture the annual variation in OAN landings.
Figure 59. Historical comparison of predicted fleetwide landings (red) and actual fleetwide landings (black), using a model that log-transforms the average landings and weekly limit data and adds maximum Dungeness crab prices to predict number of vessels.
##
## Call:
## lm(formula = LBS ~ predict, data = OAN_predictions)
##
## Residuals:
## Min 1Q Median 3Q Max
## -103502 -30392 -1605 26672 151073
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9730.5057 14906.3748 0.653 0.516
## predict 0.9251 0.1208 7.660 0.0000000000906 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 49230 on 68 degrees of freedom
## Multiple R-squared: 0.4632, Adjusted R-squared: 0.4553
## F-statistic: 58.68 on 1 and 68 DF, p-value: 0.00000000009063
Neither the average landings per vessel nor the number of vessels in the OAN fleet appear to be heavily influenced by fuel prices from either region (Figure 60). There may be some relationship between average landings per vessel and northern CA fuel prices, but it is unclear why higher fuel prices would lead to higher landings per vessel. There does appear to be a slight inverse relationship between northern CA fuel prices and the number of OAN vessels, although the plot’s scatter appears mostly random.
Table 34 demonstrates that adding dockside fuel prices from OR/WA combined and northern CA, in addition to average sablefish and maximum Dungeness crab prices, improves the model fit. However, there is very little difference between the model that only adds northern CA fuel prices and the model that adds all fuel prices. When all fuel prices are included, the fuel predictors are not statistically significant, whereas when only northern CA fuel prices are added, the one fuel variable is statistically significant. A likelihood ratio test indicates that there is no statistically significant difference between the model that includes all fuel prices and the model that includes only northern CA fuel prices (Table 35). Another likelihood ratio test indicates that, compared to only using sablefish and Dungeness crab prices, adding northern CA fuel prices significantly improves the model (Table 36).
Figure 60. Relationships between regional fuel prices and landings per vessel or number of vessels in the OAN sector.
Table 34: Comparison of linear regression models using the status quo approach and adding maximum crab and dockside fuel prices as predictors.
| | SQ Model | Sable & Crab Prices | OR+WA Fuel | CA North Fuel | Sable & Crab & OR+WA Fuel | Sable & Crab & CA N. Fuel | Sable & Crab & OR+WA & CA N. Fuel |
|---|---|---|---|---|---|---|---|
| (Intercept) | 28.34 | 61.77 ** | 119.39 *** | 151.39 *** | 86.37 ** | 107.97 *** | 125.95 *** |
| | (16.35) | (22.02) | (17.02) | (19.17) | (26.98) | (30.47) | (33.32) |
| ADJ_PRICE | 22.97 *** | 19.61 *** | | | 19.10 *** | 16.65 ** | 12.90 * |
| | (4.90) | (5.01) | | | (4.97) | (5.07) | (5.81) |
| PER.4.PEAK | 15.24 *** | 13.39 *** | 18.02 *** | 19.05 *** | 14.28 *** | 14.84 *** | 15.12 *** |
| | (3.03) | (3.07) | (3.52) | (3.36) | (3.09) | (3.07) | (3.06) |
| max_crab_price | | -2.44 * | | | -2.66 * | -2.61 * | -2.38 * |
| | | (1.11) | | | (1.11) | (1.09) | (1.10) |
| adj_fuel_OR_WA | | | -4.35 | | -4.88 | | 11.85 |
| | | | (3.70) | | (3.15) | | (9.12) |
| adj_fuel_CAN | | | | -9.98 ** | | -6.94 * | -18.61 |
| | | | | (3.60) | | (3.25) | (9.55) |
| N | 70 | 70 | 70 | 70 | 70 | 70 | 70 |
| R2 | 0.45 | 0.48 | 0.28 | 0.34 | 0.50 | 0.52 | 0.53 |
| *** p < 0.001; ** p < 0.01; * p < 0.05. | | | | | | | |
Table 35: Likelihood ratio test comparing a model with all fuel prices (full model) and a model with only northern CA fuel prices (nested model).
| #Df | LogLik | Df | Chisq | Pr(>Chisq) |
|---|---|---|---|---|
| 7 | -321 | |||
| 6 | -322 | -1 | 1.82 | 0.177 |
Table 36: Likelihood ratio test comparing a model that includes northern CA fuel prices (full model) and one that does not (nested model). Both models include average sablefish and maximum crab prices.
| #Df | LogLik | Df | Chisq | Pr(>Chisq) |
|---|---|---|---|---|
| 6 | -322 | |||
| 5 | -324 | -1 | 4.75 | 0.0294 |
Figure 61 compares the residual and normal q-q plots for the models with and without northern CA fuel prices. Adding northern CA fuel prices does not greatly impact the model assumptions.
Figure 61. Model diagnostics plots for the model using only average sablefish and maximum D. crab prices (left) and the model that also includes northern CA fuel prices (right).
The retrospective fit is not as strong when northern CA fuel prices are added, compared to only including sablefish and Dungeness crab prices as predictors of OAN vessel participation. This may be due to the high variability of fuel prices since the COVID-19 pandemic began in 2020. At this time, the GMT is proposing to not include fuel prices as a covariate but to keep them in the tool belt for potential future inclusion when retrospective fits, especially in recent years, appear stronger.
Figure 62. Historical comparison of predicted fleetwide landings (red) and actual fleetwide landings (black), using a model that includes northern CA dockside fuel prices in addition to average sablefish and maximum crab prices.
##
## Call:
## lm(formula = LBS ~ predict, data = OAN_predictions)
##
## Residuals:
## Min 1Q Median 3Q Max
## -113038 -28050 -2010 29984 141959
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5971.1158 15944.9256 0.374 0.709
## predict 0.9375 0.1275 7.355 0.000000000324 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 50150 on 68 degrees of freedom
## Multiple R-squared: 0.4431, Adjusted R-squared: 0.4349
## F-statistic: 54.1 on 1 and 68 DF, p-value: 0.0000000003241
As in Section 2.2.6 on the LEN GLMs, the following code fits a full model with the covariates provided (“covars”) and uses MuMIn::dredge to build model.suite, which ranks the “best fit” models by AIC score. As with LEN, the OAN GLMs assume a negative binomial distribution. Instead of bimonthly period, the period 4 peak adjuster is provided as a fixed effect. Table 37 shows the first six rows of the ranked list of models. The model with the lowest AIC score (best model) is one that includes maximum sablefish, maximum Dungeness crab, northern CA fuel prices, and the period 4 peak adjuster as predictors. Note that model.suite did not include average sablefish prices in the top five ranked models but instead included maximum sablefish price in all top six models.
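library(dplyr)  # provides %>% and mutate()

# Treat bimonthly period and the period 4 peak adjuster as categorical
OAN <- OAN %>%
  mutate(PERIOD = as.factor(PERIOD),
         PER.4.PEAK = as.factor(PER.4.PEAK))

# Candidate covariates for the vessel-number GLM
covars <- c("ADJ_PRICE", "med_afi_price", "max_afi_price",
            "adj_crab_price", "max_crab_price", "adj_fuel_OR_WA",
            "adj_fuel_CAN", "PER.4.PEAK")

library(MASS)  # provides glm.nb()

# Full negative binomial GLM with no intercept (0 +) and all covariates;
# na.action = "na.fail" is required by MuMIn::dredge()
model.full <- glm.nb(as.formula(
                       paste("VES_NUM",
                             paste(0, "+", paste(covars, collapse = " + ")),
                             sep = " ~ ")),
                     data = OAN,
                     weights = WEIGHT,
                     na.action = "na.fail")

# Fit all covariate subsets that retain PER.4.PEAK and rank them by AIC
model.suite <- MuMIn::dredge(model.full,
                             rank = "AIC",
                             fixed = c("PER.4.PEAK"))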
Table 37: First six rows of the model.suite output ranking OAN GLMs by AIC score (i.e., best fit).
| adj_crab_price | adj_fuel_CAN | adj_fuel_OR_WA | ADJ_PRICE | max_afi_price | max_crab_price | med_afi_price | PER.4.PEAK | df | logLik | AIC | delta | weight |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| -0.0514 | 0.218 | -0.0159 | + | 8 | -399 | 815 | 0 | 0.27375340025816 | ||||
| -0.0402 | 0.228 | -0.016 | + | 8 | -400 | 816 | 1.04 | 0.162470294713297 | ||||
| 0.231 | -0.0152 | + | 7 | -401 | 816 | 1.25 | 0.146234313955052 | |||||
| -0.0602 | 0.231 | -0.0162 | -0.0344 | + | 9 | -399 | 816 | 1.28 | 0.1442526975406 | |||
| -0.0489 | 0.233 | + | 7 | -401 | 816 | 1.36 | 0.138358796057569 | |||||
| -0.0543 | -0.0372 | 0.23 | -0.0162 | + | 9 | -399 | 816 | 1.41 | 0.134930497475322 |
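The delta and weight columns in Table 37 are related through the standard Akaike weight formula: each model's relative likelihood is exp(−delta / 2), normalized so the weights sum to 1. The short sketch below applies that formula to the six delta values shown in the table. Normalizing over only these six rows is an assumption, although it appears consistent with the reported weights.

```r
# AIC differences (delta) for the six models shown in Table 37
delta <- c(0, 1.04, 1.25, 1.28, 1.36, 1.41)

# Relative likelihood of each model given the data
rel_lik <- exp(-delta / 2)

# Akaike weights, normalized over the six rows shown (ASSUMPTION)
akaike_weight <- rel_lik / sum(rel_lik)
print(round(akaike_weight, 3))
```

Because all six deltas are below 2, no single model clearly dominates, which is consistent with the report's decision to follow the ranking with likelihood ratio tests and diagnostics rather than relying on AIC alone.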
Although model.suite ranked the GLM model with northern CA fuel, maximum sablefish, and maximum Dungeness crab prices highest, a likelihood ratio test indicates that there is no statistically significant difference between that model and a GLM that includes only maximum sablefish and crab prices as predictors (Table 38).
Table 38: Likelihood ratio test results to assess inclusion of fuel prices in the OAN GLM predicting number of vessels.
| Full Model | Nested Model | P-value |
|---|---|---|
| CA fuel + max sable price + max crab price | max sable price + max crab price | 0.07125 |
| OR/WA fuel + max sable price + max crab price | max sable price + max crab price | 0.13710 |
The normal q-q plot for the linear regression that includes average sablefish and maximum Dungeness crab prices appears better than those of the first- and third-highest-ranked GLMs, and there is very little difference among the residual plots (Figure 63). There appears to be a slight improvement in the normal q-q plot when northern CA fuel prices are included in the GLM, compared to not including them. Given that northern CA fuel prices were in the highest ranked model, the p-value of the likelihood ratio test was 0.07 (indicating a nearly significant difference), and there is a slight improvement in the model diagnostics, the following retrospective fleetwide predictions are done using maximum sablefish, maximum Dungeness crab, and northern CA fuel prices as predictors.
Figure 63. Model diagnostics plots for two GLMs and a linear regression that predict number of vessels using sablefish, Dungeness crab, and/or northern CA fuel prices.
Using a GLM and assuming a negative binomial distribution, the model fit to historical data is 0.61, much higher than either the status quo linear regression model (0.4035) or the linear regression that adds maximum Dungeness crab prices (0.4553). Note that the GLM was only applied to the model that predicts the number of vessels, and the model that predicts average landings per vessel is still a linear regression that log-transforms the data. The GLM is better able to capture and predict the annual variation in OAN landings, whereas the linear regressions generally do not capture the variability (Figure 64).
Figure 64. Historical comparison of predicted fleetwide landings (red) and actual fleetwide landings (black), using a GLM to predict the number of vessels based on maximum sablefish and Dungeness crab prices, along with northern CA fuel prices.
##
## Call:
## lm(formula = LBS ~ predict, data = OAN_predictions)
##
## Residuals:
## Min 1Q Median 3Q Max
## -100975 -22835 -3781 19002 162581
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1172.74569 11985.45603 -0.098 0.922
## predict 1.01245 0.09631 10.512 0.000000000000000571 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 41760 on 69 degrees of freedom
## Multiple R-squared: 0.6156, Adjusted R-squared: 0.61
## F-statistic: 110.5 on 1 and 69 DF, p-value: 0.0000000000000005714
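For readers who want to see how predictions might be drawn from a negative binomial GLM of vessel counts, the following is a minimal sketch using `MASS::glm.nb` on simulated data (not GMT data). The covariate values, coefficients, and the object names `sim`, `nb_mod`, and `pred_ves` are hypothetical, the peak-adjuster scores are assumed, and the weights and fuel-price covariate used in the report are omitted for brevity.

```r
library(MASS)  # provides glm.nb()

# Simulated data (NOT GMT data)
set.seed(5)
n <- 71
score <- -abs(rep_len(1:6, n) - 4)  # assumed peak-adjuster scores
sim <- data.frame(
  PER.4.PEAK     = factor(score),
  max_afi_price  = runif(n, 4, 10),
  max_crab_price = runif(n, 5, 40)
)
eta <- 4 + 0.2 * score + 0.15 * sim$max_afi_price -
  0.015 * sim$max_crab_price
sim$VES_NUM <- rnbinom(n, mu = exp(eta), size = 8)

# Negative binomial GLM with a log link and no intercept, mirroring the
# "0 +" structure of the formula used in the report
nb_mod <- glm.nb(VES_NUM ~ 0 + PER.4.PEAK + max_afi_price + max_crab_price,
                 data = sim)

# type = "response" returns expected vessel counts rather than log counts
pred_ves <- predict(nb_mod, type = "response")
print(head(round(pred_ves, 1)))
```

Predictions on the response scale are always positive, which is one practical advantage of the GLM over a linear regression that can return negative vessel counts.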
In Figure 65 below, the dark blue dots represent the weekly trip limit for each bimonthly period between 2011 and 2022, while the dots of varying colors represent the number of LES vessels making landings each period (each unique color represents a different year). As the figure shows, participation dropped precipitously in 2019 and has continued declining since then. This is despite the fact that the weekly trip limit has increased since 2019 due to high sablefish allocations. These data lead the model to conclude that high trip limits cause low participation, but the decline in participation is related to market and infrastructure constraints, not to trip limits.
Figure 65. Trend in LES participation (number of vessels; colored dots) and weekly trip limit (dark blue dots), 2011-2022.
Declines in sablefish price, particularly maximum price per pound, are a potential driver of low participation in the fleet. Figure LES_prices below shows the minimum (red), average (blue), and maximum (green) sablefish price per pound, adjusted for inflation, between 2011 and 2022. The average and low prices per pound have been relatively stable since 2011, whereas there tends to be more variation in the maximum price per pound. While the maximum price does not exceed $7 after 2019, this is likely due to COVID-related impacts to markets in 2020 and occurs only after the drop in participation in 2019.
Similar to the LES sector, declines in sablefish price, particularly maximum price per pound, are a potential driver of low participation in the OAS fleet. In Figure 66 below, the bars show the declining number of vessels over the years, while the blue line shows the increase in trip limits through time. Due to the limited effort data, the former model lost the ability to accurately predict participation, particularly because it assumed that increasing trip limits resulted in declining participation despite the high likelihood that those are unrelated trends. Therefore, this model is no longer used to predict effort in the OAS sector, which is now very low.
Figure 66. Trend in OAS participation (number of vessels; bar) and bimonthly trip limit (blue line), 2011-2022.