1 Introduction

There are four main sectors of the sablefish daily trip limit (DTL) fishery on the West Coast:

Limited entry north of 36° N. lat. (LEN)
Open access north of 36° N. lat. (OAN)
Limited entry south of 36° N. lat. (LES)
Open access south of 36° N. lat. (OAS)

The limited entry sablefish fishery north of 36° N. lat. includes both the primary tier fishery, in which vessels may stack up to three permits and fish up to the associated limits throughout the entire season, and the DTL fishery, in which vessels can fish up to the weekly and/or bimonthly trip limits specified in federal regulations. Although the fishery is called the daily trip limit fishery, daily trip limits have been removed from all four sectors and only weekly or bimonthly trip limits remain. The sablefish trip limit (STL) model (or “daily trip limit model”) independently projects fleetwide landings in the LEN and OAN sectors. In the past, models have been used to predict fleetwide landings in the LES and OAS sectors, but due to significant decreases in participation, the use of models to predict landings in the two southern sectors does not seem appropriate. In their respective sections of this report, the GMT summarizes effort data in the southern fleets to demonstrate the issue but does not provide models for this review.

The GMT uses the STL model to project annual landings in the biennial harvest specifications and management measures process, as well as inseason landings. In both cases, each sector’s trip limits are adjusted as needed to keep landings within the sector-specific landings target, which is buffered under the sector-specific catch share (i.e., allocation) to account for discard mortality estimates. In November 2022, the Council chose to conduct a methodology review of the STL model due to unrealistic output projections identified by the GMT, which are described in Agenda Item G.4.a, Supplemental GMT Report 1, September 2022. In response to those issues, the Scientific and Statistical Committee (SSC) approved the adjustments to the model recommended by the GMT, which are described in Appendix B of Agenda Item H.4.a, Supplemental GMT Report 3, November 2022. Prior to this review, however, the full STL model had never been reviewed by the SSC.

The key takeaways from the analysis in this report are:

LEN

Up-weighting the most recent year in the average landings per vessel model results in very little difference in the model fit but provides flexibility and responsiveness to recent trends.
Log-transforming the response variable in the average landings per vessel model slightly improves the model diagnostics.
Adding maximum sablefish price per pound to the model predicting number of vessels improves the model fit and model diagnostics.
Fuel prices and Dungeness crab prices are not significant or meaningful predictor variables.

OAN

Log-transforming both the average landings per vessel and weekly trip limit variables in the average landings model improves both the model fit and model diagnostics.
Adding maximum Dungeness crab prices provides a slight improvement to the model fit for the model predicting the number of vessels.
A Generalized Linear Model that uses maximum sablefish and potentially northern California fuel prices improves the fit to observed data and is better able to capture the OAN fleet’s volatility.

This report is written in R Markdown and includes code chunks embedded within the text to highlight key steps in running the model. However, to maintain confidentiality and to simplify the report, not all code necessary to run the model is included here. The following report is divided into the four different sectors, but as described above, model descriptions and analysis are only included for the LEN and OAN sectors.

The first step in running the model is loading the 2011-2022 historical fish ticket data from PacFIN through an ROracle connection to the online comprehensive fish ticket database. The loaded fields include vessel number, sablefish price per pound, and landed weight of sablefish. The modeler checks for any major outliers in the price data, since they could skew the inputs to the model. There is currently one outlier from 2014 in which the sablefish price per pound exceeds $20, which is removed from the data (code not included). Inflation-adjusted price per pound for both sablefish and Dungeness crab is calculated by dividing the total PacFIN adjusted-for-inflation (AFI) exvessel revenue by the total landed weight (lbs.) of the respective species for each bimonthly period.

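# dplyr supplies the pipe and data verbs; dbplyr supplies in_schema().
# `con` is the ROracle connection to PacFIN (connection code not included).
library(dplyr)
library(dbplyr)

ft <- con %>% 
  tbl(in_schema("PACFIN_MARTS", "COMPREHENSIVE_FT"))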

RAWTIX <- ft %>% 
  select(LANDING_MONTH,
         GMT_SABLEFISH_CODE,
         AFI_EXVESSEL_REVENUE,
         NOMINAL_TO_ACTUAL_PACFIN_SPECIES_NAME,
         COUNCIL_CODE,
         FTID,
         VESSEL_NUM,
         PACFIN_YEAR,
         LANDED_WEIGHT_LBS,
         PRICE_PER_POUND,
         AFI_PRICE_PER_POUND,
         IOPAC_PORT_GROUP,
         AGENCY_CODE) %>% 
  filter(PACFIN_YEAR > 2010,
         GMT_SABLEFISH_CODE %in% c("LEN", "LES", "OAN", "OAS"),
         NOMINAL_TO_ACTUAL_PACFIN_SPECIES_NAME == "SABLEFISH",
         COUNCIL_CODE == "P",
         PRICE_PER_POUND > 0) %>% 
  collect()
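The outlier screen described above is not included in the report. As a minimal sketch only, assuming a toy data frame in place of the confidential fish tickets and a $20 per pound threshold taken from the outlier mentioned in the text, the check could look like the following. Base R is used here so the example runs without a database connection; `tix`, `price_cap`, and the values are illustrative assumptions.

```r
# Toy fish-ticket records; values are illustrative, not PacFIN data.
tix <- data.frame(
  FTID = c("A1", "A2", "A3", "A4"),
  PACFIN_YEAR = c(2013, 2014, 2014, 2015),
  PRICE_PER_POUND = c(2.85, 3.10, 24.00, 2.40)
)

# Assumed screening threshold of $20 per pound (the outlier noted in the text exceeds it).
price_cap <- 20

# Inspect flagged records before dropping them.
outliers <- tix[tix$PRICE_PER_POUND > price_cap, ]
print(outliers)

# Keep only records at or below the threshold.
tix_clean <- tix[tix$PRICE_PER_POUND <= price_cap, ]
print(tix_clean)
```

Printing the flagged rows before removing them keeps the decision visible to the modeler rather than silently filtering the data.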

The data are then summarized by vessel number, year, and sector (i.e., LEN, OAN, LES, OAS), and the landings for each year are divided into six bimonthly periods. Fish tickets with unknown vessel numbers are removed: between 2011 and 2022, 0.3% of all sablefish DTL fish tickets were not associated with a vessel number, and these tickets accounted for 0.5% of the sablefish poundage landed.

sabl_input <- RAWTIX %>% 
  group_by(VESSEL_NUM,
           PACFIN_YEAR,
           LANDING_MONTH,
           GMT_SABLEFISH_CODE,
           FTID) %>% 
  summarize(LBS = sum(LANDED_WEIGHT_LBS),
            REV = sum(AFI_EXVESSEL_REVENUE),
            max_price = max(AFI_PRICE_PER_POUND))

# Tally tickets and pounds with and without a known vessel number.
unknown <- RAWTIX %>% 
  mutate(unknown = VESSEL_NUM == "UNKNOWN") %>% 
  group_by(unknown) %>% 
  summarize(tickets = n_distinct(FTID),
            sable_lbs = sum(LANDED_WEIGHT_LBS))

DTL <- sabl_input %>% 
  filter(VESSEL_NUM != "UNKNOWN") %>% 
  mutate(PERIOD = ceiling(LANDING_MONTH/2)) %>% 
  group_by(PACFIN_YEAR,
           PERIOD,
           GMT_SABLEFISH_CODE) %>% 
  summarize(LBS = sum(LBS),
            REV = sum(REV),
            VES_NUM = length(unique(VESSEL_NUM)),
            max_price = max(max_price)) %>% 
  mutate(ADJ_PRICE = REV/LBS,
         AVG_LB = LBS/VES_NUM,
         ACT_MT = LBS/2204.6)

LEN_tix <- DTL %>% filter(GMT_SABLEFISH_CODE == "LEN")
OAN_tix <- DTL %>% filter(GMT_SABLEFISH_CODE == "OAN")
LES_tix <- DTL %>% filter(GMT_SABLEFISH_CODE == "LES")
OAS_tix <- DTL %>% filter(GMT_SABLEFISH_CODE == "OAS")

keycols <- c("PACFIN_YEAR", "PERIOD")
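To illustrate the period assignment and the price calculation used in the chunk above, the following sketch works through a small made-up example in base R. The landings values are hypothetical. It shows that `ADJ_PRICE = REV/LBS` is a landings-weighted price for the period rather than a simple average of ticket prices.

```r
# Toy landings for one sector; values are illustrative only.
landings <- data.frame(
  LANDING_MONTH = c(1, 2, 3, 4, 11, 12),
  LBS = c(1000, 3000, 500, 1500, 2000, 2000),
  REV = c(3000, 6000, 2000, 3000, 5000, 7000)
)

# Months 1-2 map to period 1, months 3-4 to period 2, ..., months 11-12 to period 6.
landings$PERIOD <- ceiling(landings$LANDING_MONTH / 2)

# Sum pounds and revenue within each period, then divide revenue by pounds.
by_period <- aggregate(cbind(LBS, REV) ~ PERIOD, data = landings, FUN = sum)
by_period$ADJ_PRICE <- by_period$REV / by_period$LBS

# Convert pounds to metric tons using the same factor as the report's code.
by_period$ACT_MT <- by_period$LBS / 2204.6
print(by_period)
```

In period 1 of this toy example, the two tickets are priced at $3.00 and $2.00 per pound, but the larger landing dominates, so the weighted price falls below the simple average of $2.50.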

Table 1 below shows total annual landings since 2011 in the four sablefish DTL sectors. Landings in all sectors dropped in 2020 due to COVID-related impacts to the fleet, and while the LEN and OAN sectors seemed to rebound in 2021 and 2022 under high sablefish allocations, the LES and OAS sectors either stagnated or continued to decline, respectively. California Department of Fish and Wildlife representatives on the GMT noted that market and infrastructure issues south of 36° N. lat. have generally prevented those southern sectors from maintaining historical effort.

Table 1: Annual sablefish landings (mt) by DTL sector, 2011-2022

Year	LEN	LES	OAN	OAS
2011	412	560	374	166
2012	232	375	217	73
2013	174	455	121	60
2014	138	421	224	33
2015	190	370	365	29
2016	222	369	344	21
2017	257	319	392	25
2018	229	386	341	20
2019	178	344	327	13
2020	155	258	169	4
2021	170	173	245	3
2022	294	180	538	2

Figures 1 and 2 below demonstrate the apparent decline in sablefish landings and participation in the DTL sectors south of 36° N. lat., which lies between Monterey and Morro Bay. Both landings and participation in Morro Bay have steadily declined since 2011, and the number of vessels in the fishery has declined notably in the ports of Los Angeles and San Diego. Confidential data are hidden in the figures.

Figure 1. DTL sablefish landings (lbs.) by IOPAC port group; larger circles represent a larger scale of landings.

Figure 2. Number of vessels that made DTL sablefish landings by IOPAC port group; larger circles represent more vessels.

2 Limited Entry North (LEN)

Since 2012, roughly 10 to 50 vessels have participated in the LEN fishery in any one bimonthly period, with up to 70 vessels participating in 2011 (Figure 3). Fleetwide sablefish landings have remained fairly steady since 2011, although vessel participation varies across bimonthly periods within a single year due to trip limits, participation in other fisheries, weather, sablefish prices, and other seasonal factors. The inflation-adjusted price per pound of sablefish landed by the LEN sector has steadily declined from up to $6 in 2011 to nearly $2 in 2022.

Figure 3. LEN participation, sablefish landings (lbs.), and inflation-adjusted price per pound, 2011-2022.

Given that the LEN sector requires a federal permit to participate, unlike the OAN sector, participation tends to involve roughly the same vessels year after year, and therefore, predicting the number of vessels in the model is generally easier than making the same prediction for the OAN sector. However, unprecedentedly low sablefish prices and high allocations can still complicate the LEN model.

The following sections are divided into 1) the current model used by the GMT for harvest specifications and inseason landings projections and 2) various potential model improvements the GMT explored as part of this methodology review. None of the potential improvements have been used for management to date but may be used in future management actions if approved for use by the SSC.

2.1 LEN - Current Model

In the following section, the distribution assumptions of the dependent variables in the model are described, and the steps to run the model currently used for management actions are outlined, as well as model performance and diagnostics. Fleetwide landings are predicted for each bimonthly period by multiplying the outputs of two separate linear regression models that predict 1) average landings per vessel and 2) number of vessels in the fleet.

To determine whether the two dependent variables are normally distributed, or if data transformation is warranted or an alternative distribution assumption necessary, the following subsections use Shapiro-Wilk normality tests on both the un-transformed and transformed datasets. In addition, the skewness and kurtosis are examined.

2.1.1 LEN - Distribution Assumptions

2.1.1.1 LEN - Average Pounds per Vessel

Based on the results of the Shapiro-Wilk normality test shown in Table 2, it is apparent that the historical data for the average landings (lbs.) of sablefish per vessel in the LEN sector is not normally distributed (i.e., the p-value is less than 0.05). All three data transformations still result in non-normally distributed data, but log-transformation provides the distribution closest to normal. This is also demonstrated in Figure 4, where the histogram of the un-transformed data is heavily skewed right, but the log-transformed data reduces the skewness, more so than the other data transformations. Log-transforming the data also improves the normal q-q plot, which should more closely hug the dashed line as normality increases, as shown in Figure 5.

Table 2: Shapiro-Wilk normality test for LEN Avg. lbs. per vessel dataset

Value	Un-transformed Data	Log-transformed Data	Square Root-transformed Data	Cube Root-transformed Data
W statistic	0.825	0.953	0.901	0.921
p value	0.000	0.009	0.000	0.000

Figure 4. Top left panel: histogram of the untransformed average lbs. per vessel dataset. Remaining panels: histograms of the log, square root, and cube root transformed datasets.

Figure 5. Normal Q-Q plots for the untransformed (left) and log-transformed (right) average lbs. per vessel dataset
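The report does not show the code behind Table 2. A minimal sketch of how such a comparison could be produced with `shapiro.test()` from base R is given below. The data are simulated from a log-normal distribution as a stand-in for average lbs. per vessel, so the numbers will not match the table; `avg_lb` and `transforms` are illustrative names.

```r
set.seed(1)
# Simulated right-skewed stand-in for average lbs. per vessel (not the report's data).
avg_lb <- rlnorm(71, meanlog = 7.5, sdlog = 0.7)

# The un-transformed data and the three transformations compared in the text.
transforms <- list(
  untransformed = identity,
  log = log,
  sqrt = sqrt,
  cube_root = function(x) x^(1/3)
)

# Run the Shapiro-Wilk test on each version and collect W and the p-value.
sw <- sapply(transforms, function(f) {
  res <- shapiro.test(f(avg_lb))
  c(W = unname(res$statistic), p_value = res$p.value)
})
print(round(sw, 3))
```

A W statistic closer to 1 indicates a distribution closer to normal, which is the basis for comparing the transformations.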

Skewness

As described above, the un-transformed data for average landings per vessel appears heavily skewed right, which is confirmed using the following function to calculate the skewness value, the t-value, and the p-value, where a p-value less than 0.05 means the data are significantly skewed. The p-value for the un-transformed data is less than 0.05, and the skewness value is positive, so the data are heavily skewed right. The log-transformed data are still significantly skewed right, but the skewness value is the lowest, and the p-value is closest to 0.05.

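# Sample skewness: third central moment divided by the cubed standard deviation.
skew <- function(AVG_LB){
  m3 <- sum((AVG_LB-mean(AVG_LB))^3)/length(AVG_LB)
  s3 <- sqrt(var(AVG_LB))^3
  m3/s3
}

skew_value <- skew(AVG_LB)
# The approximate standard error of skewness is sqrt(6/n).
t_value <- skew_value/sqrt(6/length(AVG_LB))
# One-sided (upper-tail) p-value with 68 degrees of freedom, i.e., a test for right skew.
p_value <- 1-pt(t_value, 68)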

Table 3: Skewness values for the non-transformed average lbs. per vessel dataset and each of the transformed datasets, along with the t value and p value used to determine whether the skewness is significantly different from that of a normal distribution.

Value	Un-transformed Data	Log-transformed Data	Square Root-transformed Data	Cube Root-transformed Data
skewness	1.820	0.666	1.244	1.053
t value	6.349	2.325	4.338	3.672
p value	0.000	0.012	0.000	0.000

Kurtosis

Kurtosis measures how prone a distribution is to producing outliers (i.e., how heavy its tails are); a normal distribution has a kurtosis of 3. The measure calculated here is the “excess kurtosis”, which subtracts 3 from the kurtosis value so that the value for a normal distribution is zero and the result is easier to interpret. A high, positive kurtosis value indicates many outliers, whereas a negative kurtosis indicates a lack of outliers. As in the skewness section above, the t-value and p-value are also calculated and displayed in the table, where a p-value of less than 0.05 indicates that the kurtosis is significantly different from that of a normal distribution. Log-transforming the data brings the kurtosis value closest to zero (normal distribution), and the p-value of 0.104 (Table 4) indicates that the kurtosis of the log-transformed data is not significantly different from that of a normal distribution.

The following function is used to calculate the kurtosis value.

kurtosis <- function(AVG_LB){
  m4 <- sum((AVG_LB-mean(AVG_LB))^4)/length(AVG_LB)
  s4 <- var(AVG_LB)^2
  m4/s4-3
}

kurt_value <- kurtosis(AVG_LB)
t_value <- kurtosis(AVG_LB)/sqrt(24/length(AVG_LB))
p_value <- 1-pt(t_value, 68)

Table 4: Kurtosis values for the non-transformed average lbs. per vessel dataset and each of the transformed datasets, along with the t value and p value used to determine whether the kurtosis is significantly different from that of a normal distribution.

Value	Un-transformed Data	Log-transformed Data	Square Root-transformed Data	Cube Root-transformed Data
kurtosis	3.994	0.728	1.866	1.382
t value	6.966	1.270	3.255	2.410
p value	0.000	0.104	0.001	0.009
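As a sketch of how Tables 3 and 4 could be assembled, the skewness and kurtosis functions can be applied to each transformation in one pass. The data below are simulated, and `moment_test()` is a hypothetical helper that is not part of the report; its default of 68 degrees of freedom simply follows the value hard-coded in the report's code.

```r
set.seed(2)
# Simulated right-skewed stand-in for average lbs. per vessel (not the report's data).
x <- rlnorm(71, meanlog = 7.5, sdlog = 0.7)

# Same moment definitions as the report's skew() and kurtosis() functions.
skew <- function(v) sum((v - mean(v))^3) / length(v) / sqrt(var(v))^3
kurtosis <- function(v) sum((v - mean(v))^4) / length(v) / var(v)^2 - 3

# Hypothetical helper: statistic, t-value, and one-sided (upper-tail) p-value.
# se_const is 6 for skewness and 24 for excess kurtosis.
moment_test <- function(v, stat, se_const, df = 68) {
  value <- stat(v)
  t_value <- value / sqrt(se_const / length(v))
  c(value = value, t_value = t_value, p_value = 1 - pt(t_value, df))
}

transforms <- list(untransformed = identity, log = log, sqrt = sqrt,
                   cube_root = function(v) v^(1/3))

skew_tab <- sapply(transforms, function(f) moment_test(f(x), skew, 6))
kurt_tab <- sapply(transforms, function(f) moment_test(f(x), kurtosis, 24))
print(round(skew_tab, 3))
print(round(kurt_tab, 3))
```

Because the p-value is taken from the upper tail only, a clearly negative statistic produces a p-value near 1; the test as written therefore flags right skew or heavy tails, not left skew or light tails.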

2.1.1.2 LEN - Number of Vessels

Based on the results of the Shapiro-Wilk normality test shown in Table 5, the data on number of vessels per bimonthly period do not appear to be normally distributed (p-value is less than 0.05), whereas all three data transformations normalize the data. The same functions used to calculate skewness and kurtosis for the average landings per vessel are also applied here. The normal Q-Q plot of the data on number of vessels appears to have an S-shaped curve at the tails but mostly hugs the dashed line (Figure 7).

Table 5: Shapiro-Wilk normality test for LEN number of vessels dataset.

Value	Un-transformed Data	Log-transformed Data	Square Root-transformed Data	Cube Root-transformed Data
W statistic	0.964	0.977	0.991	0.991
p value	0.034	0.214	0.877	0.904

Figure 6. Top left panel: histogram of the untransformed number of vessels dataset. Remaining panels: histograms of the log, square root, and cube root transformed datasets.

Figure 7. Normal Q-Q plot for the untransformed number of vessels dataset

Skewness

The un-transformed data for number of vessels per bimonthly period are significantly skewed right (p-value is less than 0.05; Table 6), whereas the skewness is no longer significant after any of the three data transformations.

Table 6: Skewness values for the non-transformed number of vessels dataset and each of the transformed datasets, along with the t value and p value used to determine whether the skewness is significantly different from that of a normal distribution.

Value	Un-transformed Data	Log-transformed Data	Square Root-transformed Data	Cube Root-transformed Data
skewness	0.636	-0.496	0.095	-0.092
t value	2.218	-1.729	0.331	-0.323
p value	0.015	0.956	0.371	0.626

Kurtosis

The kurtosis of the un-transformed vessel number data does not appear to be high, meaning there are not major outliers that could skew the data, and the p-value is greater than 0.05 (Table 7).

Table 7: Kurtosis values for the non-transformed number of vessels dataset and each of the transformed datasets, along with the t value and p value used to determine whether the kurtosis is significantly different from that of a normal distribution.

Value	Un-transformed Data	Log-transformed Data	Square Root-transformed Data	Cube Root-transformed Data
kurtosis	0.146	-0.029	-0.464	-0.453
t value	0.255	-0.050	-0.809	-0.790
p value	0.400	0.520	0.789	0.784

Non-Normal Distributions of Count Data

The number of vessels is considered count data, and a Poisson distribution is often used for non-normal count data. However, Figure 8 below shows that the observed distribution of the vessel number data differs markedly from the expected Poisson distribution. Additionally, a Poisson distribution is generally not considered appropriate for data in which zeros are highly unlikely or unobserved, as is the case for the vessel number data. Alternatively, a negative binomial distribution more closely resembles the observed distribution of the data and can be used for count data in which zeros are unlikely. For this reason, the GMT explores the use of a negative binomial distribution assumption in Section 2.2.6 - Generalized Linear Model.

Figure 8. Distributions of the vessel number observed data, Poisson expected, and negative binomial expected.
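The comparison behind Figure 8 is not shown in the report. One way to sketch it, assuming simulated overdispersed counts in place of the vessel number data and a simple method-of-moments fit for the negative binomial, is shown below. The fitting method is an assumption chosen for brevity and is not necessarily what the GMT used.

```r
set.seed(3)
# Simulated overdispersed vessel counts (stand-in for VES_NUM; not the report's data).
ves_num <- rnbinom(71, mu = 30, size = 6)

mu_hat <- mean(ves_num)
var_hat <- var(ves_num)

# Method-of-moments size for the negative binomial, from var = mu + mu^2 / size.
size_hat <- mu_hat^2 / (var_hat - mu_hat)

# Expected probabilities under each distribution, for plotting against a histogram.
counts <- 0:max(ves_num)
expected <- data.frame(
  count = counts,
  poisson = dpois(counts, lambda = mu_hat),
  neg_binom = dnbinom(counts, mu = mu_hat, size = size_hat)
)

# Log-likelihood of the observed counts under each distribution.
ll_pois <- sum(dpois(ves_num, lambda = mu_hat, log = TRUE))
ll_nb <- sum(dnbinom(ves_num, mu = mu_hat, size = size_hat, log = TRUE))
print(c(poisson = ll_pois, neg_binom = ll_nb))
```

The method-of-moments estimate only works when the sample variance exceeds the sample mean, which is the overdispersed case in which a negative binomial is worth considering over a Poisson.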

2.1.2 LEN - Model Run

Figure 9 plots the relationships between three independent variables and average landings per vessel. The three independent variables are weekly trip limits, bimonthly trip limits, and average inflation-adjusted sablefish price per pound, and these are the only variables currently used in the model for management purposes. Average landings per vessel are clearly influenced by the weekly and bimonthly trip limits, with higher trip limits resulting in higher average landings per vessel, which is not surprising. There does not appear to be a linear relationship with average sablefish price per pound.

Figure 9. Relationships between average landings per vessel and three independent variables.

As shown in Table 8, the strongest model using the two trip limit variables appears to be the one that uses both variables as covariates. There does not appear to be an interaction between the two trip limit variables, even though both trip limits are often increased or decreased at the same time in management actions.

Table 8: Comparison of linear regression models predicting average lbs. landed per vessel, using status quo independent variables.

	Weekly	Bimonthly	Wkly + Bimon	Wkly + Bimon + Wkly:Bimon
(Intercept)	1190.48 ***	745.91 ***	798.69 ***	622.17
	(141.06)	(165.47)	(153.16)	(361.61)
TL.WEEKLY	1.02 ***		0.50 ***	0.65 *
	(0.08)		(0.14)	(0.31)
TL.BIMON		0.49 ***	0.29 ***	0.31 ***
		(0.04)	(0.06)	(0.08)
TL.WEEKLY:TL.BIMON				-0.00
				(0.00)
N	71	71	71	71
R2	0.68	0.71	0.75	0.76
*** p < 0.001; ** p < 0.01; * p < 0.05.
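A comparison like the one in Table 8 can be sketched by fitting the additive and interaction models and testing one against the other. The data below are simulated with correlated trip limits and no true interaction, so the output is illustrative only; the variable names mirror those in the report.

```r
set.seed(4)
n <- 71
# Simulated trip limits (lbs.) and landings; coefficients are illustrative only.
dat <- data.frame(TL.WEEKLY = runif(n, 1000, 3000))
dat$TL.BIMON <- 2 * dat$TL.WEEKLY + rnorm(n, 0, 400)
dat$AVG_LB <- 800 + 0.5 * dat$TL.WEEKLY + 0.3 * dat$TL.BIMON + rnorm(n, 0, 450)

# Additive model versus the model that adds the weekly-by-bimonthly interaction.
additive <- lm(AVG_LB ~ TL.WEEKLY + TL.BIMON, data = dat)
interact <- lm(AVG_LB ~ TL.WEEKLY * TL.BIMON, data = dat)

# F-test of the additive model against the interaction model.
cmp <- anova(additive, interact)
print(cmp)

# Adjusted R-squared penalizes the extra term, unlike multiple R-squared.
print(c(additive = summary(additive)$adj.r.squared,
        interact = summary(interact)$adj.r.squared))
```

Multiple R-squared can never decrease when a term is added, so a small gain from the interaction model does not by itself indicate a meaningful interaction; the F-test and adjusted R-squared are the more informative checks.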

Below is the model currently used to predict average landings per vessel and the summary output for the model. The weekly and bimonthly trip limits explain 74.7% of the variance in average landings per vessel (adjusted R-squared).

LEN_land_mod <- lm(AVG_LB ~ TL.WEEKLY + TL.BIMON, data = LEN)
summary(LEN_land_mod)

## 
## Call:
## lm(formula = AVG_LB ~ TL.WEEKLY + TL.BIMON, data = LEN)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -945.24 -271.15  -58.84  141.74 1784.14 
## 
## Coefficients:
##              Estimate Std. Error t value   Pr(>|t|)    
## (Intercept) 798.69376  153.15936   5.215 0.00000188 ***
## TL.WEEKLY     0.50295    0.13809   3.642   0.000523 ***
## TL.BIMON      0.28713    0.06472   4.436 0.00003433 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 472.4 on 68 degrees of freedom
## Multiple R-squared:  0.7546, Adjusted R-squared:  0.7474 
## F-statistic: 104.6 on 2 and 68 DF,  p-value: < 0.00000000000000022

Figure 10 shows the model diagnostic plots for the model using both weekly and bimonthly trip limits. The residuals vs. fitted plot does not appear to indicate heteroskedasticity (i.e., funnel shape or curvature), but the normal q-q plot has a heavy tail on the right, likely due to high outliers in recent years under exceptionally high sablefish allocations. However, it would not be reasonable to exclude those outliers, because sablefish allocations will likely continue to be exceptionally high in the coming years, and including these outliers informs the data of fishery behavior under those exceptional conditions.

Figure 10. Model diagnostic plots of model predicting average landings per vessel using weekly and bimonthly trip limit covariates.

The following code extracts the intercept and coefficients from the linear regression and calculates the predicted landings per vessel by applying the extracted coefficients to the input data. Note that a different set of coefficients is extracted for each bimonthly period, because trip limits and effort vary by period throughout a single year. Similar code to extract and apply coefficients is used in all other model types in this report, except the generalized linear model, so the code will not be shown again.

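# Fit a separate regression for each bimonthly period and keep its coefficients.
LEN_land_lm <- LEN %>% 
  group_by(PERIOD) %>% 
  group_modify(~ bind_rows(coefficients(lm(AVG_LB ~ TL.BIMON + TL.WEEKLY, 
                                           data = .))))

# Columns 2:4 hold the intercept, TL.BIMON, and TL.WEEKLY coefficients, in formula order.
# setnames() is from the data.table package.
setnames(LEN_land_lm, 2:4, c("INT_LAND", "BIMO_COEF", "WKLY_COEF_LAND"))
LEN <- merge(LEN, LEN_land_lm, by = "PERIOD")

# Apply the period-specific coefficients to the trip limits in each row.
LEN <- LEN %>% 
  mutate(pred_catch = WKLY_COEF_LAND * TL.WEEKLY + BIMO_COEF * TL.BIMON + INT_LAND)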

Figure 11 below plots the predicted landings per vessel against the actual (i.e., observed) landings per vessel in the historical data. The blue line mostly overlaps the black 1:1 ratio line, which means the predicted values closely match the observed values.

Figure 11. Predicted vs. actual landings per vessel.

Figure 12 plots the relationships between the same three independent variables and the number of vessels per bimonthly period. Vessel participation is influenced by inflation adjusted price per pound of sablefish, with no meaningful relationship to trip limits. This is also not surprising, because market factors tend to influence fishery participation, whereas trip limits directly influence the amount participating vessels tend to land.

Figure 12. Relationships between number of vessels per bimonthly period and three independent variables.

Below is the model to predict number of vessels per bimonthly period and the summary output for the model. The average inflation-adjusted sablefish price per pound explains 51.2% of the variance in the number of vessels that participate each bimonthly period (adjusted R-squared).

LEN_ves_mod <- lm(VES_NUM ~ ADJ_PRICE, data = LEN)
summary(LEN_ves_mod)

## 
## Call:
## lm(formula = VES_NUM ~ ADJ_PRICE, data = LEN)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -17.116  -6.108  -1.271   3.979  28.534 
## 
## Coefficients:
##             Estimate Std. Error t value         Pr(>|t|)    
## (Intercept)  -10.013      4.792  -2.090           0.0403 *  
## ADJ_PRICE     12.120      1.405   8.624 0.00000000000144 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9.32 on 69 degrees of freedom
## Multiple R-squared:  0.5188, Adjusted R-squared:  0.5118 
## F-statistic: 74.38 on 1 and 69 DF,  p-value: 0.000000000001444

Figure 13 shows the model diagnostic plots for the model predicting number of vessels based on average sablefish prices. The residuals vs. fitted plot does not appear to indicate heteroskedasticity, but the normal q-q plot has a heavy tail on the right, likely due to high outliers in recent years under exceptionally high sablefish allocations.

Figure 13. Model diagnostic plots of model predicting number of vessels based on inflation-adjusted price per pound.

Figure 14 below shows that the predicted values for number of vessels per bimonthly period closely match the actual number of vessels in the historical data.

Figure 14. Plot of predicted vs. actual number of vessels in the LEN fleet.

The last step in the model is to multiply the predicted average landings per vessel by the predicted number of vessels in the fleet to get the fleetwide predicted landings by period. Figure 15 shows a comparison of the predicted (red) and actual (black) historical fleetwide landings. The current LEN model used in management generally performs well in predicting historical trends in fleetwide landings but over-predicts landings in 2020 and 2021, which is not surprising given the anomalously low effort during those years due to impacts from the COVID-19 pandemic. It also under-predicts landings in 2022, which is likely due to unprecedented LEN trip limits as well as impacts to other fisheries in which sablefish vessels participate (e.g., Dungeness crab and salmon).

Figure 15. Historical comparison of predicted fleetwide landings (red) and actual fleetwide landings (black).

LEN_compare <- lm(LBS ~ predict, data = LEN_predictions)
summary(LEN_compare)

## 
## Call:
## lm(formula = LBS ~ predict, data = LEN_predictions)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -68873 -12145  -2659  12142  89290 
## 
## Coefficients:
##               Estimate Std. Error t value            Pr(>|t|)    
## (Intercept) -4680.3528  6924.0439  -0.676               0.501    
## predict         1.0652     0.0766  13.905 <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 25020 on 69 degrees of freedom
## Multiple R-squared:  0.737,  Adjusted R-squared:  0.7332 
## F-statistic: 193.4 on 1 and 69 DF,  p-value: < 0.00000000000000022
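The multiplication step and the observed-versus-predicted regression above can be sketched end to end on simulated data. Everything in the data frame below is made up for illustration, and the pooled (not period-specific) regressions are a simplification of the approach described in the report.

```r
set.seed(5)
n <- 71
# Simulated bimonthly inputs; not the report's data.
LEN_sim <- data.frame(
  TL.WEEKLY = runif(n, 1000, 3000),
  TL.BIMON = runif(n, 2000, 6000),
  ADJ_PRICE = runif(n, 2, 6)
)
LEN_sim$AVG_LB <- 800 + 0.5 * LEN_sim$TL.WEEKLY + 0.3 * LEN_sim$TL.BIMON +
  rnorm(n, 0, 400)
LEN_sim$VES_NUM <- pmax(1, round(-10 + 12 * LEN_sim$ADJ_PRICE + rnorm(n, 0, 9)))
LEN_sim$LBS <- LEN_sim$AVG_LB * LEN_sim$VES_NUM

# The two component regressions: landings per vessel and number of vessels.
land_mod <- lm(AVG_LB ~ TL.WEEKLY + TL.BIMON, data = LEN_sim)
ves_mod <- lm(VES_NUM ~ ADJ_PRICE, data = LEN_sim)

# Fleetwide prediction = predicted lbs. per vessel x predicted number of vessels.
LEN_sim$predict <- predict(land_mod) * predict(ves_mod)

# Regress observed on predicted; slope near 1 and intercept near 0 indicate agreement.
compare <- lm(LBS ~ predict, data = LEN_sim)
print(coef(compare))
print(summary(compare)$adj.r.squared)
```

Regressing observed on predicted landings summarizes overall agreement in a single R-squared value, but it does not show which years are over- or under-predicted, which is why the time series comparison in Figure 15 remains useful.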

Figure 16 below demonstrates a typical inseason management projection to keep catch within the sector-specific landings targets, with a separate panel for each action alternative. Projections are made under three alternative price scenarios (low, average, high), and expected future prices are estimated based on recent prices with a 10% buffer above and below the average to estimate the high and low price scenarios, respectively. In the case of Figure 16, trip limit options 2 and 3 are progressively lower than the status quo trip limits, which is why the landings projections are also lower than status quo.

Figure 16. Example of inseason management landings projections using the current LEN model.
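As a worked illustration of the three price scenarios, the sketch below applies the pooled coefficients reported in the two model summaries in Section 2.1.2 to a hypothetical bimonthly period. The recent price and trip limits are assumptions chosen for the example, and the actual projections use period-specific coefficients, so this shows only the arithmetic.

```r
# Pooled coefficients copied from the two model summaries in Section 2.1.2.
ves_coef <- c(intercept = -10.013, price = 12.120)
land_coef <- c(intercept = 798.69376, weekly = 0.50295, bimon = 0.28713)

# Hypothetical inputs for one bimonthly period.
recent_price <- 2.50  # assumed recent average AFI price per pound
tl_weekly <- 2000     # assumed weekly trip limit (lbs.)
tl_bimon <- 4000      # assumed bimonthly trip limit (lbs.)

# Low and high scenarios are 10% below and above the recent average price.
prices <- recent_price * c(low = 0.9, average = 1.0, high = 1.1)

# Predicted number of vessels under each price scenario.
vessels <- ves_coef[["intercept"]] + ves_coef[["price"]] * prices

# Predicted lbs. per vessel depends on trip limits only, so it is the same in all scenarios.
lbs_per_vessel <- land_coef[["intercept"]] + land_coef[["weekly"]] * tl_weekly +
  land_coef[["bimon"]] * tl_bimon

# Fleetwide landings by scenario, converted from lbs. to metric tons.
fleet_mt <- vessels * lbs_per_vessel / 2204.6
print(round(fleet_mt, 1))
```

In this structure the price scenarios change only the predicted number of vessels, while the trip limit options change only the predicted landings per vessel.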

2.2 LEN - Potential Model Improvements

The following sections analyze potential improvements to the LEN sector model, such as adding variables or using a generalized linear model. For the LEN sector, the GMT explored:

up-weighting the most recent year in the model,
log-transforming the data,
adding minimum, median, and maximum sablefish prices as predictors,
adding minimum, average, and maximum Dungeness crab prices as predictors,
adding price per gallon of fuel as a predictor, and
using a generalized linear model with a negative binomial distribution.

In summary, the GMT concluded that fuel and Dungeness crab prices are not significantly influential for predicting LEN sablefish landings, but adding maximum AFI sablefish price as a predictor of vessel participation significantly improves the model in both the linear regression and generalized linear models. Up-weighting the most recent year of data and log-transforming the response variable improve only the prediction of average landings per vessel, not the number of vessels.

2.2.1 LEN - Data Weights

Table 9 below compares model summaries between a model that does not use data weights at all (“SQ Model”) and one that up-weights the most recent year of data (i.e., 2022), which improves the fit to the data. The GMT chose to up-weight only the most recent year of data for simplicity and because any other weighting scheme would likely be subjective to the modeler. Figure 17 plots the predicted landings per vessel against the observed, or actual, landings per vessel in the data.

Table 9: Comparison of linear regression models predicting average lbs. landed per vessel, using the status quo approach and adding data weights (i.e., upweighting most recent year).

	SQ Model	Data Weighting
(Intercept)	798.69 ***	777.28 ***
	(153.16)	(141.59)
TL.WEEKLY	0.50 ***	0.42 ***
	(0.14)	(0.12)
TL.BIMON	0.29 ***	0.32 ***
	(0.06)	(0.06)
N	71	71
R2	0.75	0.86
*** p < 0.001; ** p < 0.01; * p < 0.05.

Figure 17. Plot of predicted vs. actual landings per vessel using data weights.
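The report does not state the weight applied to the most recent year. As a sketch only, assuming a weight of 2 for 2022 and 1 for all earlier years, up-weighting can be done through the `weights` argument of `lm()`. The data are simulated, and `recent_weight` is a hypothetical value.

```r
set.seed(6)
# Simulated bimonthly records for 2011-2022 (six periods per year); not the report's data.
dat <- expand.grid(PERIOD = 1:6, PACFIN_YEAR = 2011:2022)
dat$TL.WEEKLY <- runif(nrow(dat), 1000, 3000)
dat$TL.BIMON <- runif(nrow(dat), 2000, 6000)
dat$AVG_LB <- 800 + 0.5 * dat$TL.WEEKLY + 0.3 * dat$TL.BIMON +
  rnorm(nrow(dat), 0, 450)

# Assumed weighting scheme: the most recent year counts twice as much as earlier years.
recent_weight <- 2
dat$w <- ifelse(dat$PACFIN_YEAR == max(dat$PACFIN_YEAR), recent_weight, 1)

# Status quo (unweighted) and weighted fits of the same formula.
sq_mod <- lm(AVG_LB ~ TL.WEEKLY + TL.BIMON, data = dat)
wt_mod <- lm(AVG_LB ~ TL.WEEKLY + TL.BIMON, data = dat, weights = w)

print(rbind(status_quo = coef(sq_mod), weighted = coef(wt_mod)))
```

One caution when reading a table like Table 9: the R-squared of a weighted fit is computed from weighted sums of squares, so it is not measured on quite the same basis as the unweighted value.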

As shown in Table 10 below, up-weighting the most recent year of data does not improve the fit to the data when predicting number of vessels in the fleet. This could be caused by a number of factors, which could include the lower than average sablefish prices in 2022, as previously shown in Figure 3.

Table 10: Comparison of linear regression models predicting number of vessels, using the status quo approach and adding data weights (i.e., upweighting most recent year).

	SQ Model	Data Weighting
(Intercept)	-10.01 *	-2.73
	(4.79)	(5.17)
ADJ_PRICE	12.12 ***	10.15 ***
	(1.41)	(1.62)
N	71	71
R2	0.52	0.36
*** p < 0.001; ** p < 0.01; * p < 0.05.

Figure 18 compares the predicted and actual fleetwide LEN landings when only up-weighting the most recent year of data in the model that predicts landings per vessel, and the following output indicates a 73.2% fit to the actual data (adjusted R-squared of 0.7323). This is a minuscule difference compared to a fit of 73.3% without data weights (adjusted R-squared of 0.7332), but up-weighting the most recent year provides assurance that recent fishery behavior is more informative to the model in light of anomalous events such as the recent COVID-19 pandemic or extremely high sablefish abundance.

Figure 18. Historical comparison of predicted fleetwide landings (red) and actual fleetwide landings (black), with data weighted in the model used to predict average lbs. per vessel.

LEN_compare <- lm(LBS ~ predict, data = LEN_predictions)
summary(LEN_compare)

## 
## Call:
## lm(formula = LBS ~ predict, data = LEN_predictions)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -71536 -11570  -2745  11874  86714 
## 
## Coefficients:
##               Estimate Std. Error t value            Pr(>|t|)    
## (Intercept) -4764.2790  6944.6477  -0.686               0.495    
## predict         1.0696     0.0771  13.873 <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 25060 on 69 degrees of freedom
## Multiple R-squared:  0.7361, Adjusted R-squared:  0.7323 
## F-statistic: 192.5 on 1 and 69 DF,  p-value: < 0.00000000000000022

2.2.2 LEN - Log Transformation

Section 2.1.1 above discusses the distributions of the two response variables in the model. The data for average landings per vessel is not normally distributed, and while log-transforming the data does not completely normalize the data, it brings the skewness and kurtosis closer to that of a normal distribution, improves the model diagnostics compared to status quo, and slightly improves the fit of predicted to actual fleetwide landings data, as demonstrated below. Figure 19 compares the influence of weekly and bimonthly trip limits on average landings per vessel with no data transformation, log-transforming only the response variable, and log-transforming both the predictor and the response variable. Visually, the closest relationships appear to be between the log-transformed response variable and the log-transformed weekly limit as well as the un-transformed bimonthly limit. Log-transforming the bimonthly limit appears to distort the relationship.

Figure 19. Comparison of linear relationships when transforming the response and predictor variables in the model predicting landings per vessel.

Table 11 compares the model outputs between the status quo model (un-transformed data) and models in which response and/or predictor variables are log-transformed. Log-transforming the response variable (landings per vessel) and the weekly trip limit provides the best fit compared to the other data transformations, but it is still slightly weaker than the status quo model fit. However, Figure 20 shows that log-transforming the response variable and the weekly limit improves the model diagnostics, specifically the normal q-q plot. For all log transformation models (and the “status quo” model), the most recent year is up-weighted given the value in using this approach, as described in the previous section.

Table 11: Comparison of linear regression models predicting average lbs. landed per vessel, using the status quo approach and log-transforming the dependent variable.

	SQ	log(y)	log(y) & log(wkly)	log(y) & log(wkly) + log(bimo)
(Intercept)	777.28 ***	7.28 ***	5.23 ***	2.43 ***
	(141.59)	(0.05)	(0.52)	(0.36)
TL.WEEKLY	0.42 ***	0.00 *
	(0.12)	(0.00)
TL.BIMON	0.32 ***	0.00 ***	0.00 ***
	(0.06)	(0.00)	(0.00)
log(TL.WEEKLY)			0.31 ***	0.40 ***
			(0.08)	(0.07)
log(TL.BIMON)				0.31 ***
				(0.08)
N	71	71	71	71
R2	0.86	0.79	0.82	0.81
*** p < 0.001; ** p < 0.01; * p < 0.05.

Figure 20. Diagnostic plots for the model predicting landings per vessel using log-transformed variables.

Since the response variable and one predictor variable are log-transformed in the model, prediction values need to be back-transformed in order to make interpretations and use them for management. The following code is used to extract the sigma value from each period-specific regression and then calculate the back-transformed landings predictions using those sigma values. Figure 21 plots the predicted landings per vessel using log transformations against the actual landings per vessel in the data.

LEN <- LEN %>% 
  mutate(sigma = ifelse(PERIOD == 1, summary(lm(log(AVG_LB) ~ log(TL.WEEKLY) + TL.BIMON, 
                                     data = LEN %>% filter(PERIOD == 1),
                                        weights = WEIGHT))$sigma,
                 ifelse(PERIOD == 2, summary(lm(log(AVG_LB) ~ log(TL.WEEKLY) + TL.BIMON, 
                                     data = LEN %>% filter(PERIOD == 2),
                                        weights = WEIGHT))$sigma,
                 ifelse(PERIOD == 3, summary(lm(log(AVG_LB) ~ log(TL.WEEKLY) + TL.BIMON, 
                                     data = LEN %>% filter(PERIOD == 3),
                                        weights = WEIGHT))$sigma,
                 ifelse(PERIOD == 4, summary(lm(log(AVG_LB) ~ log(TL.WEEKLY) + TL.BIMON, 
                                     data = LEN %>% filter(PERIOD == 4),
                                        weights = WEIGHT))$sigma,
                 ifelse(PERIOD == 5, summary(lm(log(AVG_LB) ~ log(TL.WEEKLY) + TL.BIMON, 
                                     data = LEN %>% filter(PERIOD == 5),
                                        weights = WEIGHT))$sigma,
                                     summary(lm(log(AVG_LB) ~ log(TL.WEEKLY) + TL.BIMON, 
                                     data = LEN %>% filter(PERIOD == 6),
                                        weights = WEIGHT))$sigma))))))

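# Predict log landings per vessel from the period-specific coefficients, then
# back-transform to pounds with the lognormal bias correction exp(sigma^2 / 2)
LEN <- LEN %>% 
  mutate(ln_catch = WEEKLY_COEF * log(TL.WEEKLY) + BIMO_COEF * TL.BIMON + INT_LAND,
         pred_catch = exp(ln_catch + 0.5 * sigma^2),
         time = as.numeric(paste(PACFIN_YEAR, PERIOD, sep = ".")))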
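The role of sigma in the back-transformation can be sketched with simulated data. For a lognormal response, the textbook result is that the mean on the original scale equals exp(mu + sigma^2 / 2), so simply exponentiating the predicted log value estimates the median and tends to under-predict the mean. All values and variable names below are hypothetical and only illustrate the general technique.

```r
set.seed(1)
n <- 5000
x <- runif(n, 1000, 4000)                        # hypothetical weekly limit
y <- exp(2 + 0.6 * log(x) + rnorm(n, sd = 0.8))  # simulated lognormal response

fit     <- lm(log(y) ~ log(x))
sigma   <- summary(fit)$sigma                    # residual standard error
ln_pred <- fitted(fit)

naive     <- exp(ln_pred)                        # approximates the median
corrected <- exp(ln_pred + 0.5 * sigma^2)        # approximates the mean

c(actual = mean(y), naive = mean(naive), corrected = mean(corrected))
```

In this simulation the corrected predictions average much closer to the actual mean than the naive ones, and the gap grows with sigma.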

Figure 21. Predicted vs. actual landings per boat using log-transformed avg. lbs. dependent variable and log-transformed weekly limit independent variable.

Log-transforming either the response variable or the predictor variable in the model that predicts number of vessels does not improve the model (Table 12) and, in fact, transformation worsens the model diagnostics by introducing heteroskedasticity (i.e., the residual plot resembles a funnel) and creates an S-shaped curve in the normal q-q plot (Figure 22).

Table 12: Comparison of linear regression models predicting number of vessels, using the status quo approach and log-transforming the variables.

	SQ	log(y)	log(y) & log(avg price)
(Intercept)	-10.01 *	1.98 ***	1.62 ***
	(4.79)	(0.19)	(0.23)
ADJ_PRICE	12.12 ***	0.40 ***
	(1.41)	(0.06)
log(ADJ_PRICE)			1.43 ***
			(0.20)
N	71	71	71
R2	0.52	0.43	0.44
*** p < 0.001; ** p < 0.01; * p < 0.05.

Figure 22. Vessel number model plots using the non-transformed and transformed variables.

Figure 23 and the following model output show that log-transforming the data and up-weighting the most recent year in the model that predicts landings per vessel give a slightly lower fit to the actual fleetwide landings data (72.4%) than the model currently used in management, but log-transforming the landings per vessel data improves the model diagnostics. In all other potential model improvements that follow, average landings per vessel are predicted using the log-transformed response variable and the log-transformed weekly trip limit, and the most recent year of data is up-weighted.

Figure 23. Historical comparison of predicted fleetwide landings (red) and actual fleetwide landings (black), with data weighted in the model used to predict average lbs. per vessel and log-transformation of the average lbs. dataset.

LEN_compare <- lm(LBS ~ predict, data = LEN_predictions)
summary(LEN_compare)

## 
## Call:
## lm(formula = LBS ~ predict, data = LEN_predictions)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -81349 -12886  -2271  12063  77175 
## 
## Coefficients:
##                Estimate  Std. Error t value            Pr(>|t|)    
## (Intercept) -3647.71083  6989.70790  -0.522               0.603    
## predict         1.07619     0.07897  13.628 <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 25390 on 69 degrees of freedom
## Multiple R-squared:  0.7291, Adjusted R-squared:  0.7252 
## F-statistic: 185.7 on 1 and 69 DF,  p-value: < 0.00000000000000022

2.2.3 LEN - AFI Prices

As described above, average prices currently used in the model are calculated based on inflation-adjusted ex-vessel revenue and landed weight. The GMT considered whether to use PacFIN’s inflation-adjusted price per pound (hereafter “AFI price” when referring to the PacFIN variable) in lieu of the calculated inflation-adjusted price per pound. The GMT concluded that continuing to calculate the price per pound was more appropriate, because landings of different sizes (e.g., 10,000 lbs. vs. 1,000 lbs.) would carry the same weight when averaging the AFI price (Table 13). Calculating the price per pound in Table 13 results in an average price ($3.18) that is lower than the average of the individual fish ticket prices ($4.00), because fish ticket A landed 10 times as much sablefish but at a lower price. The blue line in Figure 24 below represents the trend in the bimonthly average of AFI prices, and the orange line represents the trend in calculated average prices.

The GMT did explore, however, the alternative use of median AFI prices as well as adding maximum and/or minimum AFI prices, since those values are still single data points. The green line in Figure 24 represents the trend in maximum AFI prices, the purple line represents the trend in median AFI prices, and the red line represents minimum AFI prices. All zero prices were removed from the data, because vessels may list $0 as the price per pound when selling to themselves or for some other reason where a typical sale was not actually made.

Table 13: Alternative methods of calculating average price per pound.

	Landings (lbs.)	Revenue	Price per Lb.	Price Re-Calculated
Fish Ticket A	10,000	$30,000	$3.00	-
Fish Ticket B	1,000	$5,000	$5.00	-
Combined	11,000	$35,000	$4.00	$3.18
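The arithmetic behind Table 13 can be reproduced in a few lines. This is only an illustrative sketch using the two example fish tickets from the table; the data frame and column names are made up for the example.

```r
# The two example fish tickets from Table 13
tickets <- data.frame(ticket  = c("A", "B"),
                      lbs     = c(10000, 1000),
                      revenue = c(30000, 5000))
tickets$price <- tickets$revenue / tickets$lbs   # price per lb. on each ticket

# Averaging ticket prices gives each ticket equal weight
mean_of_prices <- mean(tickets$price)

# Re-calculating from totals weights each ticket by its landed pounds
calculated_price <- sum(tickets$revenue) / sum(tickets$lbs)

# Equivalent: a pounds-weighted mean of the ticket prices
weighted_price <- weighted.mean(tickets$price, w = tickets$lbs)

round(c(mean_of_prices, calculated_price, weighted_price), 2)
```

The re-calculated price is the same quantity as a pounds-weighted average, which is why large landings dominate it.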

Figure 24. Minimum, median, average, and maximum inflation-adjusted sablefish price per pound in the LEN sector, 2011-2023. Zero prices were removed.

Figure 25 below plots the relationships between the number of vessels in the fleet and the minimum, median, and maximum AFI prices, alongside the calculated average inflation-adjusted price. All variables show a linear relationship with the number of vessels except the minimum AFI price, so the minimum price is not considered any further.

Figure 25. Relationships between number of LEN vessels by period and calculated average, minimum, median, and maximum sablefish prices.

Because the minimum, median, maximum, and average are all different statistics of the same data, the GMT explored whether there may be a correlation between the price variables (Figure 26). There does appear to be a strong linear relationship between median and average price, which is not surprising, but less so between average/median and maximum price, which has a more curved relationship. When considering correlations at an annual scale, however, median and average prices do not present a consistent pattern of positive or negative correlation with maximum price (Figure 27). There appears to be no correlation between minimum price and any other price variables.

Figure 26. Correlation check across sablefish price data.

Figure 27. Relationship of average and median prices with maximum price on an annual scale.

Table 14 below indicates that adding maximum AFI price to the model predicting number of vessels, in addition to the currently used calculated average price, provides a better fit. Additionally, using the median or maximum AFI price instead of the average price does not improve the model.

Table 14: Comparison of linear regression models predicting number of vessels, using the status quo approach and 4 alternative approaches with PacFIN AFI prices.

library(jtools)  # provides export_summs()

# Status quo model (calculated average price) and four AFI price alternatives
LEN_modsq <- lm(VES_NUM ~ ADJ_PRICE, data = LEN)
LEN_mod1 <- lm(VES_NUM ~ max_afi_price, data = LEN)
LEN_mod2 <- lm(VES_NUM ~ ADJ_PRICE + max_afi_price, data = LEN)
LEN_mod3 <- lm(VES_NUM ~ med_afi_price, data = LEN)
LEN_mod4 <- lm(VES_NUM ~ med_afi_price + max_afi_price, data = LEN)

# Side-by-side coefficient table of all five models
export_summs(LEN_modsq, LEN_mod1, LEN_mod2, LEN_mod3, LEN_mod4,
               model.names = c("Model SQ", "Max AFI Price", 
                               "Avg+Max AFI Price", "Med AFI Price",
                               "Med+Max AFI Price"), statistics = NULL)

	Model SQ	Max AFI Price	Avg+Max AFI Price	Med AFI Price	Med+Max AFI Price
(Intercept)	-10.01 *	-13.16 *	-22.48 ***	-7.50	-20.40 ***
	(4.79)	(5.12)	(4.74)	(4.58)	(4.70)
ADJ_PRICE	12.12 ***		7.61 ***
	(1.41)		(1.48)
max_afi_price		5.11 ***	3.23 ***		3.24 ***
		(0.59)	(0.62)		(0.64)
med_afi_price				11.16 ***	6.85 ***
				(1.32)	(1.42)
N	71	71	71	71	71
R2	0.52	0.52	0.66	0.51	0.64
*** p < 0.001; ** p < 0.01; * p < 0.05.

The GMT used a likelihood ratio test to determine whether adding maximum AFI prices significantly improves the model (Table 15). A p-value of less than 0.05 indicates that the difference between the full model and the nested model is statistically significant and supports including the additional variable. In this case, the p-value is less than 0.05, supporting the inclusion of maximum AFI prices to predict the number of vessels.

The diagnostic plots of the status quo model and the model using both calculated average and maximum AFI sablefish prices to predict number of vessels are shown in Figure 28. Adding maximum AFI prices improves the residual vs. fitted plot but does not appear to greatly alter the normal q-q plot.

Table 15: Likelihood ratio test comparing a full model with calculated average and maximum AFI sablefish prices to a nested model with calculated average price only.

#Df	LogLik	Df	Chisq	Pr(>Chisq)
4	-246
3	-258	-1	23.4	1.34e-06
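For readers unfamiliar with the test, a likelihood ratio test of nested linear models can be sketched by hand in base R. The statistic is twice the difference in log-likelihoods, compared against a chi-squared distribution with degrees of freedom equal to the number of added parameters. The data below are simulated and the variable names are hypothetical; this is not the GMT's code, which may have used a packaged function.

```r
set.seed(1)
n <- 71
avg_price <- runif(n, 2, 5)                        # hypothetical average price
max_price <- avg_price + runif(n, 0.5, 4)          # hypothetical maximum price
ves_num   <- -20 + 8 * avg_price + 3 * max_price + rnorm(n, sd = 6)

nested <- lm(ves_num ~ avg_price)
full   <- lm(ves_num ~ avg_price + max_price)

# Test statistic and degrees of freedom
chisq   <- 2 * (as.numeric(logLik(full)) - as.numeric(logLik(nested)))
df_diff <- attr(logLik(full), "df") - attr(logLik(nested), "df")
p_value <- pchisq(chisq, df = df_diff, lower.tail = FALSE)

# The same tail calculation applied to the rounded statistic in Table 15
p_table15 <- pchisq(23.4, df = 1, lower.tail = FALSE)
```

The last line returns a value close to the p-value reported in Table 15; small differences are expected because the tabled statistic is rounded.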

Figure 28. Model plots for the model predicting vessel number using both average and maximum sablefish prices.

The overall fit of the predicted data to the actual fleetwide landings data improves when maximum AFI sablefish price is added to the model predicting number of vessels, as shown in Figure 29 and the following output. The fit of the status quo model is 73.2% whereas the fit when maximum AFI price is added is 75.2%. The model used to make the predictions in Figure 29 also log-transforms the landings per vessel and weekly trip limit data and back-transforms the landings per vessel predictions, as outlined in Section 2.2.2.

Figure 29. Historical comparison of predicted fleetwide landings (red) and actual fleetwide landings (black), with data weights, log-transformation of average lbs. data, and maximum AFI prices added to the model predicting number of vessels.

LEN_compare <- lm(LBS ~ predict, data = LEN_predictions)
summary(LEN_compare)

## 
## Call:
## lm(formula = LBS ~ predict, data = LEN_predictions)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -95363 -12252    316  11346  77373 
## 
## Coefficients:
##                Estimate  Std. Error t value            Pr(>|t|)    
## (Intercept) -3683.12176  6514.16964  -0.565               0.574    
## predict         1.07444     0.07319  14.680 <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 24030 on 69 degrees of freedom
## Multiple R-squared:  0.7575, Adjusted R-squared:  0.754 
## F-statistic: 215.5 on 1 and 69 DF,  p-value: < 0.00000000000000022

Hindcast

LEN <- merge(LEN_tix, LEN_TL, by = keycols)
LEN <- LEN %>% 
  select(-V7) %>% 
  mutate(year_period = paste(PACFIN_YEAR, PERIOD, sep = "_")) %>% 
  filter(PACFIN_YEAR < 2022)

LEN_prices <- price_trend %>% 
  filter(GMT_SABLEFISH_CODE == "LEN")

LEN <- LEN_prices %>% 
  select(PACFIN_YEAR,
         PERIOD,
         avg_afi_price,
         min_afi_price,
         max_afi_price,
         med_afi_price) %>% 
  full_join(LEN, by = c("PACFIN_YEAR", "PERIOD"))

2.2.4 LEN - Dungeness Crab Prices

Vessels that participate in the sablefish fishery also tend to participate in other fisheries, such as Dungeness crab, salmon, and Alaska sablefish. The Dungeness crab fishery in particular can be heavily market-driven, and higher crab prices can entice vessels to prioritize crab fishing. For that reason, the GMT explored whether Dungeness crab prices would influence vessel participation in the DTL sectors enough to consider including it as a variable in the model. Dungeness crab data are pulled from the PacFIN comprehensive fish ticket database using Thomson Fishery Code 01 and PacFIN Species Code “DCRB”.

As expected, higher Dungeness crab prices linearly correlate with lower vessel participation in the LEN sector (Figure 30). However, the GMT cannot rule out the possibility that these events are unrelated. In other words, Dungeness crab prices have been increasing over time while sablefish participation has declined in recent years, but those may be concurrent trends caused by different factors. Insight from participants in the fishery may elucidate any connection. Maximum Dungeness crab prices may have some relationship with participation, although not a linear one.

Figure 30. Relationship between inflation-adjusted Dungeness crab price per pound and three dependent variables in the LEN data.

The following code returns Table 16, which indicates that there is very little improvement to the model when adding either average or maximum Dungeness crab prices. This is also confirmed with a likelihood ratio test, where the p-value of 0.116 indicates that adding average Dungeness crab prices does not significantly improve the model (Table 17).

Table 16: Comparison of linear regression models predicting number of vessels, using sablefish average and maximum prices combined and 4 combinations of sablefish and/or crab inflation-adjusted prices per pound.

LEN_crabmod1 <- lm(VES_NUM ~ ADJ_PRICE + max_afi_price, data = LEN_crab)
LEN_crabmod2 <- lm(VES_NUM ~ adj_crab_price + ADJ_PRICE + max_afi_price, data = LEN_crab)
LEN_crabmod3 <- lm(VES_NUM ~ max_crab_price + ADJ_PRICE + max_afi_price, data = LEN_crab)
LEN_crabmod4 <- lm(VES_NUM ~ adj_crab_price, data = LEN_crab)
LEN_crabmod5 <- lm(VES_NUM ~ max_crab_price, data = LEN_crab)

export_summs(LEN_crabmod1, LEN_crabmod2, LEN_crabmod3, LEN_crabmod4, LEN_crabmod5,
               model.names = c("Avg+Max Sablefish Prices", "+ Avg. Crab Price", 
                               "+ Max Crab Price", 
                               "Avg Crab Price", "Max Crab Price"),
             statistics = NULL)

	Avg+Max Sablefish Prices	+ Avg. Crab Price	+ Max Crab Price	Avg Crab Price	Max Crab Price
(Intercept)	-22.48 ***	-11.67	-19.00 **	55.81 ***	41.28 ***
	(4.74)	(8.44)	(6.37)	(5.61)	(4.72)
ADJ_PRICE	7.61 ***	7.42 ***	7.41 ***
	(1.48)	(1.47)	(1.50)
max_afi_price	3.23 ***	2.81 ***	3.18 ***
	(0.62)	(0.67)	(0.63)
adj_crab_price		-1.27		-4.93 ***
		(0.83)		(1.05)
max_crab_price			-0.22		-1.04 *
			(0.27)		(0.42)
N	71	71	71	71	71
R2	0.66	0.67	0.66	0.24	0.08
*** p < 0.001; ** p < 0.01; * p < 0.05.

Table 17: Likelihood ratio test of nested models. A p-value < 0.05 means including Dungeness crab prices significantly improves the model.

#Df	LogLik	Df	Chisq	Pr(>Chisq)
5	-245
4	-246	-1	2.47	0.116

2.2.5 LEN - Fuel Prices

The cost of dockside fuel can impact profit margins for fishery participants and influence whether, and in which fishery, a vessel participates. The GMT explored whether adding dockside fuel prices (adjusted for inflation) improves the model’s ability to predict the number of vessels in the fishery. Data for pre-tax dockside price per gallon for 2011-2022 are pulled from the Fisheries Economic Data Program’s EFIN Monthly Marine Fuel Prices database, managed by Pacific States Marine Fisheries Commission. The data are provided for each state separately and are broken out into each IOPAC port along the West Coast. For the purposes of the LEN and OAN models, the GMT used only California data from ports north of 36° N. lat. Additionally, OR and WA fuel price data were combined into one variable, because there is very little difference between OR and WA prices, whereas northern California fuel prices were notably higher (Figures 31 and 32).

All zero prices were filtered out of the data, as well as prices that included tax (since the vast majority of data are pre-tax). For 2011-2015, the removed data make up 0% to 10% of each year’s data, but between 2016 and 2022, the removed data constitute an average of 23%, and up to 33% in 2020, of the total annual fuel price data.

Figure 31. Trend in average bimonthly dockside fuel prices in WA, OR, northern CA, and southern CA, 2011-2022.

Figure 32. Distribution of average bimonthly fuel prices in WA, OR, northern CA, and southern CA, 2011-2022. Each panel is a bimonthly period (1-6).

Figure 33 indicates that fuel prices have very little influence on LEN participation. Although there seems to be a weak positive linear relationship between OR/WA fuel prices and number of vessels, it is unlikely to be causal, because higher fuel prices would not logically entice more vessels into the fishery.

Figure 33. Relationships between regional dockside fuel prices and LEN model response variables.

Tables 18 and 19 show that adding fuel prices from either OR/WA or northern CA does not improve the fit of the model predicting number of vessels. The “nested” model in the likelihood ratio test includes average calculated sablefish price and maximum sablefish AFI price, whereas the “full” model adds combined OR/WA fuel prices. The p-value of the likelihood ratio test indicates that the model with OR/WA fuel prices included is not significantly different from the simpler model.

Table 18: Comparison of linear regression models predicting number of vessels, using the sablefish average and maximum prices and dockside fuel prices from OR/WA combined and northern CA.

	Avg+Max Sablefish Prices	OR+WA Fuel	CA North Fuel	Avg+Max Sable & OR+WA Fuel	Avg+Max Sable & CA N. Fuel	Avg+Max Sable & OR+WA & CA N. Fuel
(Intercept)	-22.48 ***	13.97 *	23.59 *	-24.00 ***	-26.45 ***	-30.65 *
	(4.74)	(6.79)	(10.19)	(5.58)	(7.55)	(12.23)
ADJ_PRICE	7.61 ***			7.38 ***	7.52 ***	8.03 ***
	(1.48)			(1.56)	(1.49)	(1.89)
max_afi_price	3.23 ***			3.24 ***	3.26 ***	3.30 ***
	(0.62)			(0.63)	(0.63)	(0.64)
adj_fuel_OR_WA		4.35 *		0.61		-1.81
		(1.77)		(1.17)		(4.13)
adj_fuel_CAN			1.49		0.91	2.92
			(2.26)		(1.35)	(4.77)
N	71	71	71	71	71	71
R2	0.66	0.08	0.01	0.66	0.66	0.66
*** p < 0.001; ** p < 0.01; * p < 0.05.

Table 19: Likelihood ratio test comparing a full model with average and maximum sablefish prices and OR/WA fuel prices to a nested model without OR/WA fuel prices.

#Df	LogLik	Df	Chisq	Pr(>Chisq)
5	-246
4	-246	-1	0.287	0.592

2.2.6 LEN - Generalized Linear Model (GLM)

Generalized linear models (GLMs) can be useful when the response variable does not follow a normal distribution; in that case, a linear regression, which assumes normally distributed errors, can predict impossible values such as negative counts. This appears to be the case for the number of vessels used in the DTL model. In the past, using trip limits as a predictor of vessel number has resulted in the model predicting a negative number of vessels in the fishery, which is largely why trip limits were no longer used as a predictor of vessel number as of Fall 2022. The log link in a GLM constrains predictions to be positive, and as discussed in Section 2.1.1, vessel number data appear to fit a negative binomial distribution.
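The effect of the log link can be illustrated with a small simulation. The sketch below fits a linear regression and a negative binomial GLM (via `MASS::glm.nb`, which uses a log link by default) to simulated count data and predicts at a price below the observed range. The data, coefficients, and variable names are hypothetical and chosen only to show the behavior.

```r
library(MASS)  # provides glm.nb(); MASS ships with standard R installations

set.seed(7)
price   <- runif(71, 1.5, 5)                        # hypothetical price per lb.
vessels <- rnbinom(71, size = 10, mu = exp(1 + 0.6 * price))
dat     <- data.frame(price, vessels)

fit_lm <- lm(vessels ~ price, data = dat)
fit_nb <- glm.nb(vessels ~ price, data = dat)       # log link by default

# Predict at a price lower than anything in the simulated data
low_price <- data.frame(price = 0.5)
lm_pred <- predict(fit_lm, newdata = low_price)
nb_pred <- predict(fit_nb, newdata = low_price, type = "response")

c(linear = unname(lm_pred), neg_binomial = unname(nb_pred))
```

Because the GLM prediction is the exponential of a linear predictor, it stays positive at any price, whereas nothing prevents the straight-line fit from extrapolating below zero.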

The following code uses a formula to find the model with the best fit out of a suite of predictor variables, based on each model’s Akaike Information Criterion (AIC) score. The response variable is number of vessels, and the predictor variables used in the formula are:

average inflation-adjusted sablefish price per pound
median AFI sablefish price per pound
maximum AFI sablefish price per pound
bimonthly period (fixed effect)
average inflation-adjusted Dungeness crab price
maximum Dungeness crab price
OR/WA dockside fuel price
CA dockside fuel price in ports north of 36° N. lat.

Table 20 shows the first six rows of the output, with each row representing a separate model, ranked from lowest (best) to highest AIC score. Note that the six highest-ranked models shown in the table have nearly equal AIC scores (delta < 1), which means the data support them about equally well. All but the sixth model include, at a minimum, average sablefish price and maximum sablefish price, and all models include period as a fixed effect.

Likelihood ratio tests were used to determine whether adding average crab price or fuel prices from either region significantly improved the model compared to only using average and maximum sablefish prices (i.e., nested model; Table 21). All p-values are greater than 0.05, which means the nested model is sufficient.

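library(MASS)  # provides glm.nb()

# Global negative binomial GLM with all candidate covariates (named in the
# character vector `covars`) and no intercept
model.full <- glm.nb(as.formula(
  paste("VES_NUM",
        paste(0, "+", paste(covars, collapse = " + ")),
        sep = " ~ ")),
  data = LEN,
  na.action = "na.fail")

# Fit every covariate subset that retains PERIOD and rank the fits by AIC
model.suite <- MuMIn::dredge(model.full,
                             rank = "AIC",
                             fixed = c("PERIOD"))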

Table 20: First six rows of the ranked model.suite output.

adj_crab_price	adj_fuel_CAN	adj_fuel_OR_WA	ADJ_PRICE	max_afi_price	max_crab_price	med_afi_price	PERIOD	df	logLik	AIC	delta	weight
			0.213	0.104			+	9	-228	475	0	0.206483465365749
	-0.267	0.238	0.155	0.0975			+	11	-226	475	0.076	0.198781142501648
-0.0366			0.205	0.0963			+	10	-228	475	0.301	0.177662832608923
-0.0327	-0.257	0.229	0.15	0.0911			+	12	-226	476	0.683	0.146769478329992
			0.207	0.102	-0.00863		+	10	-228	476	0.805	0.13808644631333
	-0.374	0.339		0.0937		0.124	+	11	-227	476	0.892	0.132216634880358
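The `delta` and `weight` columns in Table 20 are linked by the usual Akaike weight formula: each model's relative likelihood is exp(-delta / 2), and the weights are those values normalized to sum to one. The sketch below applies that formula to the six rounded delta values shown in the table; normalizing over only these six rows is an assumption based on the displayed weights summing to one.

```r
# Delta values copied from the six rows of Table 20
delta <- c(0, 0.076, 0.301, 0.683, 0.805, 0.892)

rel_lik       <- exp(-0.5 * delta)          # relative likelihood of each model
akaike_weight <- rel_lik / sum(rel_lik)     # normalized over the rows shown

round(akaike_weight, 3)
```

The resulting weights agree with the tabled values to about three decimal places, with small differences attributable to rounding of the delta column.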

Table 21: Likelihood ratio test results comparing highest ranked models from model.suite.

Full Model	Nested Model	P-value
avg sable price + max sable price + avg crab price	avg sable price + max sable price	0.1963
avg sable price + max sable price + CA fuel + OR/WA fuel	avg sable price + max sable price	0.1380
avg sable price + max sable price + CA fuel	avg sable price + max sable price	0.9043

Using a generalized linear model to predict the number of vessels with average and maximum sablefish prices, compared to a linear regression using the same variables, improves the model diagnostics, particularly the normal q-q plot (Figure 34).

Figure 34. Model diagnostic plots of a GLM and a linear regression, both using average and maximum sablefish prices to predict number of vessels.

Figure 35. Historical comparison of predicted fleetwide landings and actual fleetwide landings using a GLM to predict number of vessels based on average and maximum sablefish prices, as well as log-transforming data in the average landings per vessel model.

Overall, using a generalized linear model, compared to a linear regression, improves the model diagnostics, but does not improve the fit to actual historical data. Recall that the adjusted R-squared using a linear regression with average and maximum sablefish price as covariates is 0.752.

## 
## Call:
## lm(formula = LBS ~ predict, data = LEN_predictions)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -73227 -12887  -1721  11651  93926 
## 
## Coefficients:
##               Estimate Std. Error t value            Pr(>|t|)    
## (Intercept) -8254.8169  6891.8148  -1.198               0.235    
## predict         1.1341     0.0784  14.466 <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 24300 on 69 degrees of freedom
## Multiple R-squared:  0.752,  Adjusted R-squared:  0.7484 
## F-statistic: 209.3 on 1 and 69 DF,  p-value: < 0.00000000000000022

3 Open Access North (OAN)

The OAN sector tends to be more volatile than the LEN sector, because a permit is not required to participate. Anywhere from 25 to 150 vessels have participated in a single bimonthly period since 2011, excluding period 6 of 2015 during which the sector closed after exceeding the catch share (Figure 36). There is considerable overlap with other fisheries and industries as well. Vessels tend to be smaller, and trip lengths tend to be shorter for this sector, compared to the LEN sector. Given this difficulty in predicting fleet dynamics, and other issues with managing the sector, the Council recently added the development of an OA registration program to the workload prioritization list (Agenda Item F.8.a, NMFS Report 1, March 2023). In the future, such a registration program may be helpful for predicting OAN participation in the DTL model. Similar to the LEN sector, sablefish prices have been slowly declining since 2011, reaching an unprecedented low in 2022 (Figure 36).

Figure 36. OAN participation, sablefish landings (lbs.), and inflation-adjusted price per pound, 2011-2022.

3.1 OAN - Current Model

3.1.1 OAN - Distribution Assumptions

The GMT assessed the OAN distribution assumptions using the same methods as the LEN model, outlined in Section 2.1.1.1 above.

3.1.1.1 OAN - Average Pounds per Vessel

The average landings per vessel data from the OAN sector are not normally distributed (Table 22), likely due to two data points that exceed 4,000 lbs., both from the latter half of 2022 when trip limits were the highest they have ever been for this sector. Removing those two data points would not be appropriate for making landings predictions, because future trip limits are likely to be as high, if not higher, due to high sablefish ACLs. Log-transforming the data is the only data transformation that provides a normal distribution (Table 22). Log-transforming the data also improves the normal q-q plot for average landings data (Figure 38).

Table 22: Shapiro-Wilk normality test for OAN avg. lbs. landed per vessel dataset

	Value	raw_data	log_data	sqrt_data	cube_data
W	W statistic	0.77924	0.96759	0.89691	0.92643
	p value	0.00000	0.06325	0.00003	0.00046
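A comparison like Table 22 can be produced with base R's `shapiro.test`. The sketch below runs the test on simulated right-skewed data and on its log, square root, and cube root transformations; the simulated values are hypothetical and are not the OAN landings data.

```r
set.seed(3)
avg_lb <- rlnorm(71, meanlog = 6.5, sdlog = 0.6)  # simulated right-skewed data

transforms <- list(raw_data  = identity,
                   log_data  = log,
                   sqrt_data = sqrt,
                   cube_data = function(v) v^(1 / 3))

# W statistic and p-value for each transformation
sw <- sapply(transforms, function(f) {
  test <- shapiro.test(f(avg_lb))
  c(W = unname(test$statistic), p = test$p.value)
})

round(sw, 5)
```

A p-value above 0.05 means the test does not reject normality for that transformation, which is the criterion applied in Table 22.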

Figure 37. Top left panel: histogram of untransformed average landings per vessel. Remaining panels: histograms of the log, square root, and cube root transformed data.

Figure 38. Normal q-q plots of the untransformed and log-transformed OAN data on average landings per vessel.

Skewness

As shown in Table 23, the log transformation reduces skewness the most and yields the p-value closest to 0.05, although the skewness of the log-transformed data remains significantly different from that of a normal distribution (p = 0.027).

Table 23: Skewness values for the non-transformed average lbs. per vessel dataset and each of the transformed datasets, along with the t value and p value used to determine whether the skewness is significantly different from that of a normal distribution.

Value	raw_data	log_data	sqrt_data	cube_data
skewness	2.445	0.568	1.439	1.132
t value	8.411	1.952	4.951	3.894
p value	0.000	0.027	0.000	0.000

Kurtosis

The OAN data for average landings per vessel have a high excess kurtosis (8.5), where zero is the excess kurtosis of a normal distribution (Table 24). Log-transforming the OAN average landings data substantially reduces the kurtosis and brings it closest to that of a normal distribution.

Table 24: Kurtosis values for the non-transformed average lbs. per vessel dataset and each of the transformed datasets, along with the t value and p value used to determine whether the kurtosis is significantly different from that of a normal distribution.

Value	raw_data	log_data	sqrt_data	cube_data
kurtosis	8.517	1.097	3.664	2.553
t value	14.649	1.886	6.302	4.391
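The t values in Tables 23 and 24 are consistent with dividing each statistic by its approximate large-sample standard error, sqrt(6/n) for skewness and sqrt(24/n) for excess kurtosis, with n = 71. The sketch below reproduces that arithmetic. The moment-based estimator functions and the one-sided t test with n - 1 degrees of freedom are assumptions for illustration; the source does not state which estimators or reference distribution were used.

```r
# Moment-based estimators (assumed; other estimators differ slightly)
skewness <- function(v) {
  m <- mean(v)
  mean((v - m)^3) / mean((v - m)^2)^1.5
}
excess_kurtosis <- function(v) {
  m <- mean(v)
  mean((v - m)^4) / mean((v - m)^2)^2 - 3
}

n <- 71
se_skew <- sqrt(6 / n)    # approximate standard error of skewness
se_kurt <- sqrt(24 / n)   # approximate standard error of excess kurtosis

# Raw-data statistics from Tables 23 and 24
t_skew <- 2.445 / se_skew
t_kurt <- 8.517 / se_kurt

# One-sided p-value for the log-transformed skewness t value (assumed df)
p_skew_log <- pt(1.952, df = n - 1, lower.tail = FALSE)

round(c(t_skew = t_skew, t_kurt = t_kurt, p_skew_log = p_skew_log), 3)
```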

3.1.1.2 OAN - Number of Vessels

The number of vessels participating in the OAN sector is not normally distributed according to the results of the Shapiro-Wilk normality test in Table 25, but log-transforming the data normalizes it (p-value > 0.05), as do all other data transformations.

Table 25: Shapiro-Wilk normality test for OAN number of vessels dataset

	Value	raw_data	log_data	sqrt_data	cube_data
W	W statistic	0.949	0.967	0.969	0.971
	p value	0.006	0.061	0.079	0.103

Figure 39. Top left panel: histogram of untransformed number of vessels dataset. Remaining panels: histograms of the log, square root, and cube root transformed datasets.

Skewness

The OAN number of vessels is skewed right but not significantly different from a normal distribution (p-value > 0.05; Table 26). Log-transforming the data normalizes the skew even more but in the opposite direction (left). All data transformations provide a skewness that is not significantly different from that of a normal distribution.

Table 26: Skewness values for the non-transformed number of vessels dataset and each of the transformed datasets, along with the t value and p value used to determine whether the skewness is significantly different from that of a normal distribution.

Value	raw_data	log_data	sqrt_data	cube_data
skewness	0.466	-0.256	0.117	-0.005
t value	1.602	-0.879	0.404	-0.017
p value	0.057	0.809	0.344	0.507

Kurtosis

The un-transformed vessel number data, as well as all transformed data, have low excess kurtosis values that are close to zero, the excess kurtosis of a normal distribution (Table 27).

Table 27: Kurtosis values for the non-transformed number of vessels dataset and each of the transformed datasets, along with the t value and p value used to determine whether the kurtosis is significantly different from that of a normal distribution.

Value	raw_data	log_data	sqrt_data	cube_data
kurtosis	-0.778	-0.865	-0.949	-0.953
t value	-1.338	-1.487	-1.632	-1.639

Non-Normal Distributions of Count Data

Similar to the vessel number data for the LEN sector, the distribution appears to closely resemble the expected negative binomial distribution but not the expected Poisson distribution. The use of a negative binomial distribution is explored in Section 3.2.5 on using a GLM for the OAN model.

Figure 40. Distributions of the vessel number observed data, Poisson expected, and negative binomial expected.
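One simple way to check whether count data look more like a negative binomial than a Poisson distribution is to compare the variance to the mean and then compare log-likelihoods under each distribution. The sketch below does this with simulated counts and a method-of-moments estimate of the negative binomial size parameter; the data and the choice of estimator are assumptions for illustration and are not the GMT's procedure.

```r
set.seed(11)
ves_num <- rnbinom(71, size = 5, mu = 80)   # simulated overdispersed counts

m <- mean(ves_num)
v <- var(ves_num)
dispersion <- v / m                          # close to 1 under a Poisson model

# Method-of-moments size parameter (only defined when v > m)
size_mom <- m^2 / (v - m)

# Log-likelihood of the observed counts under each distribution
ll_pois <- sum(dpois(ves_num, lambda = m, log = TRUE))
ll_nb   <- sum(dnbinom(ves_num, size = size_mom, mu = m, log = TRUE))

c(dispersion = dispersion, ll_pois = ll_pois, ll_nb = ll_nb)
```

A variance-to-mean ratio well above one, together with a higher negative binomial log-likelihood, points to overdispersion that a Poisson model cannot capture.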

3.1.2 OAN - Model Run

Unlike the LEN model, data weighting has historically been used for the OAN model in making predictions for management decisions, and therefore, the GMT does not provide that as a potential improvement. Rather, every OAN model run inherently up-weights the most recent year. Figure 41 plots the relationships between average landings per vessel in the OAN sector against OAN weekly trip limits, bimonthly trip limits, and calculated inflation-adjusted price per pound. Similar to the LEN sector, average landings per vessel in the OAN fleet are heavily influenced by trip limits with a potentially non-linear relationship with average price per pound.

Figure 41. Relationships between average landings per vessel and three independent variables.

Using the three independent variables in Figure 41, there is very little difference between using the weekly limit only, the bimonthly limit only, or using both the weekly and bimonthly limit (Table 28). This is most likely because, unlike the LEN sector, the OAN weekly trip limits have been exactly half the bimonthly trip limits since 2012. Thus, using both variables in the model is duplicative. Since weekly trip limits can be more constraining than bimonthly limits, the weekly trip limit is used as the sole predictor of average landings per vessel. 94.8% of the variance in average landings per vessel can be explained by weekly trip limits (model summary output).

Table 28: Comparison of linear regression models predicting average lbs. landed per vessel, using status quo independent variables.

	Weekly	Bimonthly	Wkly + Bimon	Wkly + Bimon + Wkly:Bimon
(Intercept)	201.14 **	193.63 **	196.11 **	357.49 *
	(64.34)	(64.91)	(64.95)	(144.04)
TL.WEEKLY	0.97 ***		0.57	0.49
	(0.04)		(0.57)	(0.57)
TL.BIMON		0.49 ***	0.20	0.14
		(0.02)	(0.28)	(0.29)
TL.WEEKLY:TL.BIMON				0.00
				(0.00)
N	71	71	71	71
R2	0.87	0.87	0.88	0.88
*** p < 0.001; ** p < 0.01; * p < 0.05.

## 
## Call:
## lm(formula = AVG_LB ~ TL.WEEKLY, data = OAN, weights = WEIGHT)
## 
## Weighted Residuals:
##     Min      1Q  Median      3Q     Max 
## -958.45 -125.84    6.46  125.14  835.96 
## 
## Coefficients:
##              Estimate Std. Error t value            Pr(>|t|)    
## (Intercept) 112.59656   56.11239   2.007              0.0487 *  
## TL.WEEKLY     1.05544    0.02959  35.675 <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 271.4 on 69 degrees of freedom
## Multiple R-squared:  0.9486, Adjusted R-squared:  0.9478 
## F-statistic:  1273 on 1 and 69 DF,  p-value: < 0.00000000000000022
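
The weighted fit above can be illustrated with a small sketch. The actual values in the WEIGHT column are not shown in this report, so the example assumes a simple scheme in which the most recent year counts double; that scheme, the simulated oan_sim data frame, and the coefficients used to generate it are all hypothetical.

```r
set.seed(10)
# Hypothetical stand-in for the OAN data: 12 years x 6 bimonthly periods
oan_sim <- expand.grid(PERIOD = 1:6, YEAR = 2012:2023)
oan_sim$TL.WEEKLY <- sample(c(300, 600, 1000, 1500, 2000, 3000),
                            nrow(oan_sim), replace = TRUE)
oan_sim$AVG_LB <- 110 + 1.05 * oan_sim$TL.WEEKLY +
  rnorm(nrow(oan_sim), sd = 150)

# Assumed weighting scheme: the most recent year counts double
oan_sim$WEIGHT <- ifelse(oan_sim$YEAR == max(oan_sim$YEAR), 2, 1)

# Weighted least squares, mirroring the lm() call shown above
wt_mod   <- lm(AVG_LB ~ TL.WEEKLY, data = oan_sim, weights = WEIGHT)
unwt_mod <- lm(AVG_LB ~ TL.WEEKLY, data = oan_sim)

# Side-by-side coefficients show how much the weights shift the fit
print(rbind(weighted = coef(wt_mod), unweighted = coef(unwt_mod)))
```

With weights, each observation's squared residual is multiplied by its weight, so periods in the up-weighted year pull the fitted line toward themselves more strongly than earlier periods do.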

Using weekly trip limit as the predictor of average landings per vessel, the model diagnostics show clear heteroskedasticity, because the residuals are scattered together in the left-hand side of the residuals vs. fitted plot, and there is a diagonal trend line in the scale-location plot (Figure 42). The normal q-q plot is also heavily tailed on both ends. Even so, the predictions of landings per vessel are very close to the actual landings per vessel (Figure 43).

Figure 42. Model diagnostic plots for the model predicting average landings per vessel based on weekly trip limits.

Figure 43. Plot of predicted vs. actual average landings per vessel in the OAN sector, using weekly limit as a predictor.
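
The report assesses heteroskedasticity visually through the diagnostic plots. As a numeric complement (a swapped-in technique, not something the GMT reports), the sketch below applies a Breusch-Pagan style test in its studentized (Koenker) form: squared residuals are regressed on the predictor and n times the auxiliary R-squared is compared with a chi-square distribution. The data are simulated so that the error spread grows with the trip limit; all names are hypothetical.

```r
set.seed(7)
n <- 71
# Simulated data with error spread that grows with the trip limit
tl_weekly <- runif(n, 300, 3000)
avg_lb    <- 100 + tl_weekly + rnorm(n, sd = 0.15 * tl_weekly)

fit <- lm(avg_lb ~ tl_weekly)

# Auxiliary regression: do the squared residuals depend on the predictor?
aux <- lm(residuals(fit)^2 ~ tl_weekly)

# Studentized Breusch-Pagan statistic: n * R^2, chi-square with 1 df
bp_stat <- n * summary(aux)$r.squared
bp_p    <- pchisq(bp_stat, df = 1, lower.tail = FALSE)
print(c(statistic = bp_stat, p_value = bp_p))
```

A small p value would support the visual impression of non-constant variance, which is the motivation for the log transformation explored in Section 3.2.1.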

For the OAN model that predicts the number of vessels in the fleet, a “period 4 peak adjuster”, developed by Dr. Sean Matson (NOAA; former GMT member) is used as a covariate instead of creating regression coefficients for each period. This is because OAN participation peaks in period 4, and the further away from period 4, the fewer vessels tend to participate in the fishery. The following scores are given to each of the six periods:

Period 1 = -3
Period 2 = -2
Period 3 = -1
Period 4 = 0
Period 5 = -1
Period 6 = -2
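
The scores listed above equal the negative distance of each period from period 4. A minimal sketch follows; the function name per4_peak is hypothetical and is not taken from the GMT's code.

```r
# Period 4 peak adjuster: 0 in period 4, dropping by 1 for each
# bimonthly period away from the peak
per4_peak <- function(period) -abs(period - 4)

scores <- setNames(per4_peak(1:6), paste("Period", 1:6))
print(scores)
```

Because the score enters the vessel model as a single numeric covariate, one slope describes the drop in participation per period away from the peak, rather than a separate coefficient for each period.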

Figure 44 shows that relationship, in which a value of 0 represents period 4 with the most vessels, and a value of -3 represents period 1 with the fewest vessels. The figure also demonstrates a positive linear relationship with average sablefish price per pound but no obvious relationship with weekly trip limits. 42.9% of the variance in OAN vessel participation can be explained by the period and the average sablefish price per pound (model summary output). The relationship of predicted to actual values for the number of vessels is not as strong as that of average landings per vessel (Figure 45).

Figure 44. Relationships between number of vessels per bimonthly period and three independent variables.

OAN_ves_mod <- lm(VES_NUM ~ PER.4.PEAK + ADJ_PRICE, data = OAN, weights = WEIGHT)
summary(OAN_ves_mod)

## 
## Call:
## lm(formula = VES_NUM ~ PER.4.PEAK + ADJ_PRICE, data = OAN, weights = WEIGHT)
## 
## Weighted Residuals:
##    Min     1Q Median     3Q    Max 
## -59.10 -24.62   1.08  19.60  69.76 
## 
## Coefficients:
##             Estimate Std. Error t value   Pr(>|t|)    
## (Intercept)   28.776     16.319   1.763     0.0823 .  
## PER.4.PEAK    15.401      3.025   5.091 0.00000302 ***
## ADJ_PRICE     22.816      4.894   4.662 0.00001510 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 28.1 on 68 degrees of freedom
## Multiple R-squared:  0.4455, Adjusted R-squared:  0.4292 
## F-statistic: 27.32 on 2 and 68 DF,  p-value: 0.000000001959

Figure 45. Plot of predicted vs. actual number of vessels in the OAN fleet, using the period 4 peak adjuster and average sablefish prices as predictors.

Comparing the retrospective predictions to the actual historical data shows that the OAN model struggles to capture the annual fluctuation in fleetwide landings, largely due to the weakness of the model predicting number of vessels (Figure 46). Predicted landings explain only 40.4% of the variance in actual landings (adjusted R-squared of 0.4035). For this reason, incorporating other market factors that affect willingness to participate, such as prices from other fisheries, may be more helpful for predicting participation in the OAN sector than in the LEN sector.

Figure 46. Historical comparison of predicted fleetwide landings (red) and actual fleetwide landings (black).

OAN_compare <- lm(LBS ~ predict, data = OAN_predictions)
summary(OAN_compare)

## 
## Call:
## lm(formula = LBS ~ predict, data = OAN_predictions)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -112049  -34038      76   27674  156059 
## 
## Coefficients:
##               Estimate Std. Error t value      Pr(>|t|)    
## (Intercept) 11335.9223 15925.2120   0.712         0.479    
## predict         0.9031     0.1299   6.954 0.00000000161 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 51650 on 69 degrees of freedom
## Multiple R-squared:  0.412,  Adjusted R-squared:  0.4035 
## F-statistic: 48.36 on 1 and 69 DF,  p-value: 0.000000001613
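
The retrospective comparison can be sketched as follows, assuming that the fleetwide prediction for each bimonthly period is the predicted average landings per vessel multiplied by the predicted number of vessels. That combination step is an assumption about the workflow rather than code from this report, and the data below are simulated.

```r
set.seed(3)
n <- 71
# Hypothetical per-period data (not the OAN data set)
sim <- data.frame(TL.WEEKLY  = runif(n, 300, 3000),
                  PER.4.PEAK = sample(c(-3, -2, -1, 0, -1, -2), n,
                                      replace = TRUE),
                  ADJ_PRICE  = runif(n, 1.5, 4))
sim$AVG_LB  <- 110 + 1.05 * sim$TL.WEEKLY + rnorm(n, sd = 150)
sim$VES_NUM <- round(pmax(1, 30 + 15 * sim$PER.4.PEAK +
                            23 * sim$ADJ_PRICE + rnorm(n, sd = 25)))
sim$LBS <- sim$AVG_LB * sim$VES_NUM      # "actual" fleetwide landings

# The two component models
lb_mod  <- lm(AVG_LB ~ TL.WEEKLY, data = sim)
ves_mod <- lm(VES_NUM ~ PER.4.PEAK + ADJ_PRICE, data = sim)

# Assumed combination: landings per vessel x number of vessels
sim$predict <- fitted(lb_mod) * fitted(ves_mod)

# Regress actual on predicted; adjusted R-squared summarizes the fit
compare <- lm(LBS ~ predict, data = sim)
adj_r2  <- summary(compare)$adj.r.squared
print(round(adj_r2, 3))
```

In a comparison of this kind, a slope near 1 and an intercept near 0 indicate unbiased predictions, while the adjusted R-squared measures how much of the period-to-period variation is captured.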

3.2 OAN - Potential Model Improvements

The GMT explored the following potential improvements to the OAN model:

log-transforming the data,
adding minimum, median, and maximum sablefish prices as predictors,
adding minimum, average, and maximum Dungeness crab prices as predictors,
adding price per gallon of fuel as a predictor, and
using a generalized linear model with a negative binomial distribution.

3.2.1 OAN - Log Transformation

As previously discussed, the average landings per vessel data are not normally distributed, and the model using un-transformed data includes heteroskedasticity. Log-transforming the response variable data improves the normality, but as shown in Figures 47 and 48 below, log-transforming both the response variable and the predictor variable, weekly trip limit, provides the greatest improvement to both the linear relationship and the model diagnostics, with less of an impact to fit compared to only transforming the response variable (Table 29).

Figure 47. Relationship plots between average landings per vessel and weekly limit, using raw data, log-transforming only the response variable, or log-transforming both the response variable and the predictor variable.

Table 29: Comparison of linear regression models predicting average landings per vessel, using raw data, transforming only the response variable, or transforming both the response and the predictor variables.

	SQ	log(y)	log(x) & log(y)
(Intercept)	112.60 *	6.61 ***	0.59 *
	(56.11)	(0.04)	(0.25)
TL.WEEKLY	1.06 ***	0.00 ***
	(0.03)	(0.00)
log(TL.WEEKLY)			0.94 ***
			(0.03)
N	71	71	71
R2	0.95	0.86	0.91
*** p < 0.001; ** p < 0.01; * p < 0.05.

Figure 48. Model diagnostics plots for the un-transformed landings per vessel data (left) compared to log-transforming the response variable (middle) or log-transforming both the response variable and the predictor variable (right).
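
A log-log model of this form can be sketched as below. The data are simulated with a multiplicative error, and the variable names are hypothetical. Predictions are returned to pounds with exp(); whether the GMT applies any bias correction when back-transforming is not stated in this report, so none is applied here.

```r
set.seed(5)
n <- 71
# Simulated data with multiplicative (log-scale) error
tl_weekly <- runif(n, 300, 3000)
avg_lb    <- exp(0.6 + 0.94 * log(tl_weekly) + rnorm(n, sd = 0.15))

# Log-transform both the response and the predictor
loglog_mod <- lm(log(avg_lb) ~ log(tl_weekly))

# Predict on the log scale, then back-transform to pounds.
# Note: exp() of a log-scale prediction estimates the conditional
# median rather than the mean; no correction is applied in this sketch.
new_limits <- data.frame(tl_weekly = c(500, 1000, 2000))
pred_lb    <- exp(predict(loglog_mod, newdata = new_limits))
print(round(pred_lb))
```

In a log-log model the slope is an elasticity, so the 0.94 coefficient in Table 29 implies that a 1% increase in the weekly trip limit is associated with roughly a 0.94% increase in average landings per vessel.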

Figure 49 plots the predicted values using a model with log-transformed response and predictor variables against the actual landings per vessel in the data and demonstrates that a log transformation model predicts landings per vessel well. Log-transforming the data results in very little difference for the historical fleetwide landings predictions (Figure 50), but removing heteroskedasticity is important for making predictions with confidence to inform management decisions. In all other potential model improvements that follow, average landings per vessel are predicted using log-transformed data of both the response and predictor variables.

Figure 49. Plot of predicted vs. actual landings per vessel using a linear model with log-transformed response and predictor variables.

Figure 50. Historical comparison of predicted fleetwide landings (red) and actual fleetwide landings (black), using a landings per vessel model that log-transforms the response and predictor variables.

## 
## Call:
## lm(formula = LBS ~ predict, data = OAN_predictions)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -111121  -33548   -1375   27338  154599 
## 
## Coefficients:
##               Estimate Std. Error t value      Pr(>|t|)    
## (Intercept) 12506.3902 15866.5311   0.788         0.433    
## predict         0.8982     0.1300   6.907 0.00000000196 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 51790 on 69 degrees of freedom
## Multiple R-squared:  0.4088, Adjusted R-squared:  0.4002 
## F-statistic: 47.71 on 1 and 69 DF,  p-value: 0.000000001957

3.2.2 OAN - AFI Prices

Similar variables of minimum, median, and maximum prices were explored for predicting the number of vessels in the OAN model as were explored in the LEN model, but the price values used for OAN are pulled from OAN fish tickets. Figure 51 shows the trend in prices and demonstrates that there tends to be less variation in the maximum price compared to the LEN sector but a similar trend in average and median prices.

Figure 51. Minimum, median, average, and maximum inflation-adjusted price per pound in the OAN sector, 2011-2023. Zero prices were removed.

The number of vessels in the OAN fleet per bimonthly period is influenced by the average, median, and maximum sablefish prices, but there does not appear to be a relationship to the minimum prices (Figure 52). Similar to the LEN sector, there does appear to be a strong correlation between median and average prices and potentially some non-linear relationship between median/average prices and maximum price (Figure 53). However, the relationship varies year-to-year and does not show a clear pattern across years (Figure 54).

Figure 52. Relationships between number of OAN vessels and calculated average, minimum, median, and maximum sablefish prices.

Figure 53. Correlation check across sablefish price data.

Figure 54. Relationship of average and median prices with maximum price on an annual scale.

Using various price values as predictors of the number of OAN vessels per bimonthly period, the model with the best fit to data includes both average and maximum sablefish prices (Table 30). A likelihood ratio test indicates that adding maximum sablefish prices to the model, along with average prices, significantly improves the model fit (Table 31).

Table 30: Comparison of linear regression models predicting number of vessels, using the status quo approach and 5 alternative approaches with PacFIN AFI prices.

	Model SQ	Max AFI Price	Avg+Max AFI Price	Med AFI Price	Med+Max AFI Price
(Intercept)	28.78	-55.96 **	-59.59 **	34.49 *	-57.27 **
	(16.32)	(20.24)	(20.17)	(14.82)	(20.00)
ADJ_PRICE	22.82 ***		7.48
	(4.89)		(4.79)
PER.4.PEAK	15.40 ***	11.85 ***	11.94 ***	14.96 ***	11.82 ***
	(3.02)	(2.57)	(2.55)	(3.02)	(2.54)
max_afi_price		16.81 ***	14.67 ***		14.51 ***
		(2.12)	(2.51)		(2.51)
med_afi_price				18.14 ***	6.25
				(3.79)	(3.74)
N	71	71	71	71	71
R2	0.45	0.62	0.63	0.45	0.64
*** p < 0.001; ** p < 0.01; * p < 0.05.

Table 31: Likelihood ratio test of nested models. A p value < 0.05 means including maximum prices significantly improves the model.

#Df	LogLik	Df	Chisq	Pr(>Chisq)
4	-331
5	-317	1	29.3	6.12e-08
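
The likelihood ratio statistic is twice the difference in log-likelihood between the full and nested models, compared with a chi-square distribution whose degrees of freedom equal the number of added parameters. For instance, the rounded log-likelihoods in Table 31 give 2 x (-317 - (-331)) = 28, close to the reported 29.3, with the gap presumably due to rounding. The sketch below computes the test by hand on simulated data; the data frame and the coefficients used to generate it are hypothetical.

```r
set.seed(11)
n <- 71
# Simulated stand-in for the OAN vessel data
sim <- data.frame(PER.4.PEAK    = sample(c(-3, -2, -1, 0, -1, -2), n,
                                         replace = TRUE),
                  ADJ_PRICE     = runif(n, 1.5, 4),
                  max_afi_price = runif(n, 4, 9))
sim$VES_NUM <- -55 + 12 * sim$PER.4.PEAK + 7 * sim$ADJ_PRICE +
  15 * sim$max_afi_price + rnorm(n, sd = 22)

nested <- lm(VES_NUM ~ PER.4.PEAK + ADJ_PRICE, data = sim)
full   <- lm(VES_NUM ~ PER.4.PEAK + ADJ_PRICE + max_afi_price, data = sim)

# Likelihood ratio statistic and its chi-square p value
lr_stat <- as.numeric(2 * (logLik(full) - logLik(nested)))
df_diff <- attr(logLik(full), "df") - attr(logLik(nested), "df")
p_value <- pchisq(lr_stat, df = df_diff, lower.tail = FALSE)
print(c(chisq = lr_stat, df = df_diff, p = p_value))
```

The parameter counts reported by logLik() include the residual variance, which is why a model with three regression coefficients shows 4 degrees of freedom, as in the #Df column of Table 31.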

Adding maximum sablefish prices does appear to add some curvature to the residuals plot and is slightly less random than the model using only average sablefish prices, but including maximum prices improves the normal q-q plot by smoothing the bottom tail.

Figure 55. Model plots for the model predicting vessel number using both average and maximum sablefish prices.

Adding maximum sablefish price to the model to predict the number of vessels in the OAN fleet does not improve the fit to actual historical data. Recall that the fit for the status quo model is 0.4035 and the fit for the log-transformation model is 0.4002, virtually identical to the fit when both log-transformation and maximum sablefish prices are used. The next section will demonstrate that maximum Dungeness crab prices may be a better predictor than maximum sablefish prices, when used alongside average sablefish prices.

Figure 56. Historical comparison of predicted fleetwide (red) and actual fleetwide landings (black), using a model that log-transforms the average landings and weekly trip limit data and adds maximum sablefish prices to predict number of vessels.

## 
## Call:
## lm(formula = LBS ~ predict, data = OAN_predictions)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -111121  -33548   -1375   27338  154599 
## 
## Coefficients:
##               Estimate Std. Error t value      Pr(>|t|)    
## (Intercept) 12506.3902 15866.5311   0.788         0.433    
## predict         0.8982     0.1300   6.907 0.00000000196 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 51790 on 69 degrees of freedom
## Multiple R-squared:  0.4088, Adjusted R-squared:  0.4002 
## F-statistic: 47.71 on 1 and 69 DF,  p-value: 0.000000001957

3.2.3 OAN - Dungeness Crab Prices

The only relationship that OAN vessel participation has with Dungeness crab prices appears to be maximum crab price (Figure 57), which exhibits an inverse relationship. This is expected, because it suggests that higher crab prices will tend to lead vessels to prioritize the Dungeness crab fishery over the sablefish fishery, especially given that Dungeness crab prices have risen since 2011 while sablefish prices have declined. Particularly for the OA fishery, portfolio diversity is only likely to increase with climate change and the need for adaptability, and projection models are likely to benefit from considering the crossover between fisheries.

Figure 57. Relationship plots between OAN number of vessels and average, minimum, and maximum Dungeness crab prices.

Table 32 shows a slight improvement in the fit of the model when maximum Dungeness crab prices are included in addition to average sablefish prices to predict the number of OAN vessels. A likelihood ratio test also indicates that the addition of maximum crab prices significantly improves the model, compared to only using sablefish prices (Table 33). In all cases, the period 4 adjuster is used as well.

Table 32: Comparison of linear regression models using the status quo approach and adding maximum Dungeness crab prices.

	Model SQ	Max Crab Price	Avg Sable + Max Crab Prices
(Intercept)	28.34	136.02 ***	61.77 **
	(16.35)	(12.33)	(22.02)
ADJ_PRICE	22.97 ***		19.61 ***
	(4.90)		(5.01)
PER.4.PEAK	15.24 ***	13.81 ***	13.39 ***
	(3.03)	(3.38)	(3.07)
max_crab_price		-3.77 **	-2.44 *
		(1.17)	(1.11)
N	70	70	70
R2	0.45	0.37	0.48
*** p < 0.001; ** p < 0.01; * p < 0.05.

Table 33: Likelihood ratio test comparing two models with (full) and without (nested) Dungeness crab prices to predict number of vessels.

#Df	LogLik	Df	Chisq	Pr(>Chisq)
5	-324
4	-327	-1	4.93	0.0264

The model diagnostics also improve when maximum Dungeness crab prices are added to the model to predict OAN number of vessels. The normal q-q plot hugs the line more closely in the full model, and the residuals appear more randomly scattered. Recall that the inclusion of maximum sablefish prices worsened the model diagnostics, compared to the status quo model.

Figure 58. Model plots for the model predicting vessel number using both average sablefish and maximum Dungeness crab prices.

In Figure 59 below, note that the model that includes maximum Dungeness crab prices better predicts the extremely high sablefish landings in 2022 than a comparable model that does not (Figures 50 & 56). Additionally, the fit to the model (0.4553) is higher than the status quo model or the model that includes maximum sablefish prices. However, the model still struggles to capture the annual variation in OAN landings.

Figure 59. Historical comparison of predicted fleetwide landings (red) and actual fleetwide landings (black), using a model that log-transforms the average landings and weekly limit data and adds maximum Dungeness crab prices to predict number of vessels.

## 
## Call:
## lm(formula = LBS ~ predict, data = OAN_predictions)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -103502  -30392   -1605   26672  151073 
## 
## Coefficients:
##               Estimate Std. Error t value        Pr(>|t|)    
## (Intercept)  9730.5057 14906.3748   0.653           0.516    
## predict         0.9251     0.1208   7.660 0.0000000000906 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 49230 on 68 degrees of freedom
## Multiple R-squared:  0.4632, Adjusted R-squared:  0.4553 
## F-statistic: 58.68 on 1 and 68 DF,  p-value: 0.00000000009063

3.2.4 OAN - Fuel Prices

Neither the average landings per vessel nor the number of vessels in the OAN fleet appear to be heavily influenced by fuel prices from either region (Figure 60). There may be some relationship between average landings per vessel and northern CA fuel prices, but it is unclear why higher fuel prices would lead to higher landings per vessel. There does appear to be a slight inverse relationship between northern CA fuel prices and the number of OAN vessels, although the plot’s scatter appears mostly random.

Table 34 demonstrates that adding dockside fuel prices from OR/WA combined and northern CA, in addition to average sablefish and maximum Dungeness crab prices, improves the model fit. However, there is very little difference between the model that only adds northern CA fuel prices and the model that adds all fuel prices. When all fuel prices are included, the fuel predictors are not statistically significant, whereas when only northern CA fuel prices are added, the one fuel variable is statistically significant. A likelihood ratio test indicates that there is no statistically significant difference between the model that includes all fuel prices and the model that includes only northern CA fuel prices (Table 35). Another likelihood ratio test indicates that, compared to only using sablefish and Dungeness crab prices, adding northern CA fuel prices significantly improves the model (Table 36).

Figure 60. Relationships between regional fuel prices and landings per vessel or number of vessels in the OAN sector.

Table 34: Comparison of linear regression models using the status quo approach and adding maximum crab and dockside fuel prices as predictors.

	SQ Model	Sable & Crab Prices	OR+WA Fuel	CA North Fuel	Sable & Crab & OR+WA Fuel	Sable & Crab & CA N. Fuel	Sable & Crab & OR+WA & CA N. Fuel
(Intercept)	28.34	61.77 **	119.39 ***	151.39 ***	86.37 **	107.97 ***	125.95 ***
	(16.35)	(22.02)	(17.02)	(19.17)	(26.98)	(30.47)	(33.32)
ADJ_PRICE	22.97 ***	19.61 ***			19.10 ***	16.65 **	12.90 *
	(4.90)	(5.01)			(4.97)	(5.07)	(5.81)
PER.4.PEAK	15.24 ***	13.39 ***	18.02 ***	19.05 ***	14.28 ***	14.84 ***	15.12 ***
	(3.03)	(3.07)	(3.52)	(3.36)	(3.09)	(3.07)	(3.06)
max_crab_price		-2.44 *			-2.66 *	-2.61 *	-2.38 *
		(1.11)			(1.11)	(1.09)	(1.10)
adj_fuel_OR_WA			-4.35		-4.88		11.85
			(3.70)		(3.15)		(9.12)
adj_fuel_CAN				-9.98 **		-6.94 *	-18.61
				(3.60)		(3.25)	(9.55)
N	70	70	70	70	70	70	70
R2	0.45	0.48	0.28	0.34	0.50	0.52	0.53
*** p < 0.001; ** p < 0.01; * p < 0.05.

Table 35: Likelihood ratio test comparing a model with all fuel prices (full model) and a model with only northern CA fuel prices (nested model).

#Df	LogLik	Df	Chisq	Pr(>Chisq)
7	-321
6	-322	-1	1.82	0.177

Table 36: Likelihood ratio test comparing a model that includes northern CA fuel prices (full model) and one that does not (nested model). Both models include average sablefish and maximum crab prices.

#Df	LogLik	Df	Chisq	Pr(>Chisq)
6	-322
5	-324	-1	4.75	0.0294

Figure 61 compares the residual and normal q-q plots for the models with and without northern CA fuel prices. Adding northern CA fuel prices does not greatly impact the model assumptions.

Figure 61. Model diagnostics plots for the model using only average sablefish and maximum D. crab prices (left) and the model that also includes northern CA fuel prices (right).

The retrospective fit is not as strong when northern CA fuel prices are added, compared to only including sablefish and Dungeness crab prices as predictors of OAN vessel participation. This may be due to the high variability of fuel prices since the COVID-19 pandemic began in 2020. At this time, the GMT is proposing to not include fuel prices as a covariate but to keep it in the tool belt for potential future inclusion when retrospective fits, especially in recent years, appear stronger.

Figure 62. Historical comparison of predicted fleetwide landings (red) and actual fleetwide landings (black), using a model that includes northern CA dockside fuel prices in addition to average sablefish and maximum crab prices.

## 
## Call:
## lm(formula = LBS ~ predict, data = OAN_predictions)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -113038  -28050   -2010   29984  141959 
## 
## Coefficients:
##               Estimate Std. Error t value       Pr(>|t|)    
## (Intercept)  5971.1158 15944.9256   0.374          0.709    
## predict         0.9375     0.1275   7.355 0.000000000324 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 50150 on 68 degrees of freedom
## Multiple R-squared:  0.4431, Adjusted R-squared:  0.4349 
## F-statistic:  54.1 on 1 and 68 DF,  p-value: 0.0000000003241

3.2.5 OAN - Generalized Linear Model (GLM)

Similar to Section 2.2.6 on the LEN GLMs, the following code uses MuMIn::dredge to build model.suite, which ranks the “best fit” models by AIC score using the covariates provided (“covars”). Similar to LEN, the following OAN GLMs assume a negative binomial distribution. Instead of bimonthly period, the period 4 peak adjuster is provided as a fixed effect. Table 37 shows the first six rows of the ranked list of models. The model with the lowest AIC score (best model) is one that includes maximum sablefish, maximum Dungeness crab, northern CA fuel prices, and the period 4 peak adjuster as predictors. Note that model.suite did not include average sablefish prices in the top five ranked models but instead included maximum sablefish price in all top six models.

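library(dplyr)  # %>% and mutate()
library(MASS)   # glm.nb()

# Treat bimonthly period and the period 4 peak adjuster as categorical
OAN <- OAN %>%
  mutate(PERIOD = as.factor(PERIOD),
         PER.4.PEAK = as.factor(PER.4.PEAK))

# Candidate predictors for the number of vessels
covars <- c("ADJ_PRICE", "med_afi_price", "max_afi_price",
            "adj_crab_price", "max_crab_price", "adj_fuel_OR_WA",
            "adj_fuel_CAN", "PER.4.PEAK")

# Full negative binomial GLM with every candidate covariate and no
# intercept (the "0 +" term); na.action = "na.fail" is required by
# MuMIn::dredge()
model.full <- glm.nb(as.formula(
  paste("VES_NUM",
        paste(0, "+", paste(covars, collapse = " + ")),
        sep = " ~ ")),
  data = OAN,
  weights = WEIGHT,
  na.action = "na.fail")

# Fit all covariate subsets, always keeping PER.4.PEAK, and rank by AIC
model.suite <- MuMIn::dredge(model.full,
                             rank = "AIC",
                             fixed = c("PER.4.PEAK"))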

Table 37: First six rows of the model.suite output ranking OAN GLMs by AIC score (i.e., best fit).

adj_fuel_CAN	adj_fuel_OR_WA	ADJ_PRICE	max_afi_price	max_crab_price	med_afi_price	PER.4.PEAK	df	logLik	AIC	delta	weight
-0.0514			0.218	-0.0159		+	8	-399	815	0	0.27375340025816
	-0.0402		0.228	-0.016		+	8	-400	816	1.04	0.162470294713297
			0.231	-0.0152		+	7	-401	816	1.25	0.146234313955052
-0.0602			0.231	-0.0162	-0.0344	+	9	-399	816	1.28	0.1442526975406
-0.0489			0.233			+	7	-401	816	1.36	0.138358796057569
-0.0543		-0.0372	0.23	-0.0162		+	9	-399	816	1.41	0.134930497475322
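
The delta and weight columns in Table 37 can be related with a short calculation. The weight values appear to be Akaike weights normalized over the six rows shown (they sum to 1); that reading is an inference from the table rather than something stated in the report. The sketch below recomputes the weights from the rounded delta values.

```r
# AIC differences (delta) as reported in Table 37
delta <- c(0, 1.04, 1.25, 1.28, 1.36, 1.41)

# Relative likelihood of each model given the data
rel_lik <- exp(-delta / 2)

# Akaike weights: relative likelihoods normalized to sum to 1
akaike_wt <- rel_lik / sum(rel_lik)
print(round(akaike_wt, 3))
```

Because all six delta values are below 2, the weights are spread fairly evenly across the candidate models, which is consistent with the likelihood ratio tests finding no significant difference when fuel prices are dropped.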

Although model.suite ranked the GLM model with northern CA fuel, maximum sablefish, and maximum Dungeness crab prices highest, a likelihood ratio test indicates that there is no statistically significant difference between that model and a GLM that includes only maximum sablefish and crab prices as predictors (Table 38).

Table 38: Likelihood ratio test results to assess inclusion of fuel prices in the OAN GLM predicting number of vessels.

Full Model	Nested Model	P-value
CA fuel + max sable price + max crab price	max sable price + max crab price	0.07125
OR/WA fuel + max sable price + max crab price	max sable price + max crab price	0.13710

The normal q-q plot for the linear regression that includes average sablefish and maximum Dungeness crab prices appears better than that of either the first or the third highest ranked GLM, and there is very little difference among the residual plots (Figure 63). There appears to be a slight improvement in the normal q-q plot when northern CA fuel prices are included in the GLM, compared to not including them. Given that northern CA fuel prices were in the highest ranked model, the p-value of the likelihood ratio test was 0.07 (indicating a nearly significant difference), and there is a slight improvement in the model diagnostics, the following retrospective fleetwide predictions are done using maximum sablefish, maximum Dungeness crab, and northern CA fuel prices as predictors.

Figure 63. Model diagnostics plots for two GLMs and a linear regression that predict number of vessels using sablefish, Dungeness crab, and/or northern CA fuel prices.

Using a GLM and assuming a negative binomial distribution, the model fit to historical data is 0.61, much higher than either the status quo linear regression model (0.4035) or the linear regression that adds maximum Dungeness crab prices (0.4553). Note that the GLM was only applied to the model that predicts the number of vessels, and the model that predicts average landings per vessel is still a linear regression that log-transforms the data. The GLM is better able to capture and predict the annual variation in OAN landings, whereas the linear regressions generally do not capture the variability (Figure 64).

Figure 64. Historical comparison of predicted fleetwide landings (red) and actual fleetwide landings (black), using a GLM to predict the number of vessels based on maximum sablefish and Dungeness crab prices, along with northern CA fuel prices.

## 
## Call:
## lm(formula = LBS ~ predict, data = OAN_predictions)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -100975  -22835   -3781   19002  162581 
## 
## Coefficients:
##                Estimate  Std. Error t value             Pr(>|t|)    
## (Intercept) -1172.74569 11985.45603  -0.098                0.922    
## predict         1.01245     0.09631  10.512 0.000000000000000571 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 41760 on 69 degrees of freedom
## Multiple R-squared:  0.6156, Adjusted R-squared:   0.61 
## F-statistic: 110.5 on 1 and 69 DF,  p-value: 0.0000000000000005714
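
For readers unfamiliar with negative binomial GLMs, the sketch below fits one with MASS::glm.nb and returns predictions on the count scale. It is a simplified illustration on simulated data: the period 4 peak adjuster is left numeric here (the code in this section converts it to a factor), the intercept is retained, no weights are used, and the coefficients used to simulate the data are hypothetical.

```r
library(MASS)  # glm.nb()

set.seed(21)
n <- 71
# Simulated predictors (hypothetical; not the OAN data)
sim <- data.frame(PER.4.PEAK     = sample(c(-3, -2, -1, 0, -1, -2), n,
                                          replace = TRUE),
                  max_afi_price  = runif(n, 4, 9),
                  max_crab_price = runif(n, 3, 15))

# Counts generated on the log scale, as a negative binomial GLM assumes
mu <- exp(3.5 + 0.2 * sim$PER.4.PEAK + 0.2 * sim$max_afi_price -
            0.02 * sim$max_crab_price)
sim$VES_NUM <- rnbinom(n, mu = mu, size = 15)

nb_mod <- glm.nb(VES_NUM ~ PER.4.PEAK + max_afi_price + max_crab_price,
                 data = sim)

# type = "response" back-transforms from the log link to vessel counts
sim$pred_vessels <- predict(nb_mod, type = "response")
print(round(coef(nb_mod), 3))
```

Coefficients are on the log scale, so each one describes a proportional change in expected vessel numbers, and the predicted counts are always positive, unlike those from a linear regression.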

4 Limited Entry South (LES)

In Figure 65 below, the dark blue dots represent the weekly trip limit for each bimonthly period between 2011 and 2022, while the dots of varying colors represent the number of LES vessels making landings each period (each unique color represents a different year). As the figure shows, participation dropped precipitously in 2019 and has continued declining since then. This is despite the fact that the weekly trip limit has increased since 2019 due to high sablefish allocations. These data lead the model to conclude that high trip limits cause low participation, but the decline in participation is related to market and infrastructure constraints, not to trip limits.

Figure 65. Trend in LES participation (number of vessels; colored dots) and weekly trip limit (dark blue dots), 2011-2022.

Declines in sablefish price, particularly maximum price per pound, are a potential driver of low participation in the fleet. Figure LES_prices below shows the minimum (red), average (blue), and maximum (green) sablefish price per pound, adjusted for inflation, between 2011 and 2022. The average and low prices per pound have been relatively stable since 2011, whereas there tends to be more variation in the maximum price per pound. While the maximum price does not exceed $7 after 2019, this is likely due to COVID-related impacts to markets in 2020 and occurs only after the drop in participation in 2019.

5 Open Access South (OAS)

Similar to the LES sector, declines in sablefish price, particularly maximum price per pound, are a potential driver of low participation in the fleet. In Figure 66 below, the bars show the declining number of vessels over the years, while the blue line shows the increase in trip limits through time. Due to the limited effort data, the former model lost the ability to accurately predict participation, particularly because it assumed that increasing trip limits resulted in declining participation despite the high likelihood that those are unrelated trends. Therefore, this model is no longer used to predict what is a very low effort in the OAS sector.

Figure 66. Trend in OAS participation (number of vessels; bar) and bimonthly trip limit (blue line), 2011-2022.

Sablefish Daily Trip Limit Model Methodology Review 2023

Whitney Roberts, Groundfish Management Team

2023-04-25
