Introduction

Fisheries scientists often compute proportional size distribution (PSD; formerly called proportional stock density) and relative weight (Wr) indices (see here and here for more details). PSD indices require the construction of a variable that contains the Gabelhouse length categories (i.e., “stock”, “quality”, “preferred”, “memorable”, “trophy”) derived from observed lengths. Computing relative weights requires constructing a variable that contains the standard weight for each fish. The Gabelhouse length categories and standard weight equations are species-specific and, thus, past implementations required that the categories or standard weight equations be applied on a species-by-species basis. In other words, a fisheries scientist who wanted to create these variables for several species in a data.frame would have to subset the data.frame to a single species, apply the categories or equation to that subset, and then repeat the process for each species.

I recently modified or added two functions to the FSA package that, when coupled with functions in the dplyr package, can efficiently create the length categorization and relative weight variables for all species in a data.frame without having to subset that data.frame. These two functions are illustrated below.

First, I loaded the FSA and dplyr packages for use below.

library(FSA)
library(dplyr)

Some Data

The InchLake2 data.frame in the FSAdata package contains the observed lengths and weights for several species of fish from an inland lake in Wisconsin.

library(FSAdata)
data(InchLake2)
str(InchLake2)
## 'data.frame':    516 obs. of  6 variables:
##  $ netID  : int  206 205 205 205 205 205 205 205 205 205 ...
##  $ fishID : int  501 502 503 504 505 506 507 508 509 510 ...
##  $ species: Factor w/ 9 levels "Black Crappie",..: 2 2 2 2 2 2 2 2 2 2 ...
##  $ length : num  1.5 1.7 2.2 2.1 1.5 1.9 1.9 1.4 1.2 1.4 ...
##  $ weight : num  0.7 1.4 1.5 1.4 1 1.8 1.4 0.6 0.3 0.8 ...
##  $ year   : int  2008 2008 2008 2008 2008 2008 2008 2008 2008 2008 ...

The lengths were measured in inches and the weights were recorded in grams. The lengths were converted to mm so that the two measures would be in the same type of units. For display purposes below, I also sorted the data.frame by species and length and removed variables that were not needed for this post.

InchLake2 <- InchLake2 %>%
  mutate(tl=round(length*25.4,0)) %>%
  arrange(species,tl) %>%
  select(-netID,-fishID,-year,-length)
head(InchLake2)
##         species weight  tl
## 1 Black Crappie     37 142
## 2 Black Crappie     37 147
## 3 Black Crappie     46 152
## 4 Black Crappie     47 155
## 5 Black Crappie    102 188
## 6 Black Crappie    275 267

The New Functions

The new function that adds the Gabelhouse length categories is psdAdd(). When used with mutate() from dplyr, psdAdd() takes vectors of the length and species names as the first two arguments.

InchLake2 <- mutate(InchLake2,gcat=psdAdd(tl,species))
InchLake2[seq(1,501,50),] # Examine every 50th row
##              species weight  tl      gcat
## 1      Black Crappie   37.0 142     stock
## 51          Bluegill    1.0  41      zero
## 101         Bluegill   41.0 137     stock
## 151         Bluegill   98.0 170   quality
## 201         Bluegill  170.0 203 preferred
## 251 Bluntnose Minnow    0.9  53      <NA>
## 301 Bluntnose Minnow    1.4  64      <NA>
## 351      Iowa Darter    0.9  46      <NA>
## 401  Largemouth Bass  333.0 297     stock
## 451  Largemouth Bass  624.0 363   quality
## 501     Yellow Perch  150.0 239   quality

These results show that the new variable, gcat, contains the length category names for all species for which Gabelhouse length categories exist. An NA appears for species for which Gabelhouse length categories do not exist (e.g., Bluntnose Minnow and Iowa Darter), and the category name is “zero” if the fish’s length is below the stock length.
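To make the categorization concrete, the following base R sketch assigns the categories by hand for a single species. It is not the FSA implementation. The helper name gabel_cat() is hypothetical, and the Bluegill breakpoints (80, 150, 200, 250, and 300 mm) are assumed here rather than read from FSA, although they are consistent with the Bluegill rows shown above.

```r
# Hypothetical helper: assign Gabelhouse categories for ONE species by hand.
# The default breakpoints are assumed metric (mm) Bluegill values.
gabel_cat <- function(len,
                      breaks = c(stock = 80, quality = 150, preferred = 200,
                                 memorable = 250, trophy = 300)) {
  # Intervals are closed on the left, so an 80-mm fish is "stock"
  cut(len, breaks = c(0, unname(breaks), Inf),
      labels = c("zero", names(breaks)), right = FALSE)
}

tl <- c(41, 137, 170, 203)   # Bluegill lengths from the rows shown above
data.frame(tl, gcat = gabel_cat(tl))
```

Doing this for every species would require one set of breakpoints per species, which is the bookkeeping that psdAdd() handles internally.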

psdAdd() assumes that metric (mm and g) units are used and that the user wants names for the Gabelhouse categories (e.g., “stock”, “quality”, etc.) in the resulting variable. If the data were recorded in English units (inches and lbs) then use units='English'. The minimum values for the categories, illustrated below, will be used if use.names=FALSE.

InchLake2 <- mutate(InchLake2,gcatv=psdAdd(tl,species,use.names=FALSE))
InchLake2[seq(1,501,50),] # Examine every 50th row
##              species weight  tl      gcat gcatv
## 1      Black Crappie   37.0 142     stock   130
## 51          Bluegill    1.0  41      zero     0
## 101         Bluegill   41.0 137     stock    80
## 151         Bluegill   98.0 170   quality   150
## 201         Bluegill  170.0 203 preferred   200
## 251 Bluntnose Minnow    0.9  53      <NA>    NA
## 301 Bluntnose Minnow    1.4  64      <NA>    NA
## 351      Iowa Darter    0.9  46      <NA>    NA
## 401  Largemouth Bass  333.0 297     stock   200
## 451  Largemouth Bass  624.0 363   quality   300
## 501     Yellow Perch  150.0 239   quality   200

The wrAdd() function is used to add a variable with the relative weights for each species. When used with mutate(), it requires vectors of observed weights, lengths, and species names as the first three arguments. As with psdAdd(), wrAdd() assumes metric units (mm and g) but can use English units (inches and lbs) with units="English".

InchLake2 <- mutate(InchLake2,Wr=wrAdd(weight,tl,species))
InchLake2[seq(1,501,50),] # Examine every 50th row
##              species weight  tl      gcat gcatv    Wr
## 1      Black Crappie   37.0 142     stock   130 97.01
## 51          Bluegill    1.0  41      zero     0    NA
## 101         Bluegill   41.0 137     stock    80 79.69
## 151         Bluegill   98.0 170   quality   150 93.12
## 201         Bluegill  170.0 203 preferred   200 89.70
## 251 Bluntnose Minnow    0.9  53      <NA>    NA    NA
## 301 Bluntnose Minnow    1.4  64      <NA>    NA    NA
## 351      Iowa Darter    0.9  46      <NA>    NA    NA
## 401  Largemouth Bass  333.0 297     stock   200 90.60
## 451  Largemouth Bass  624.0 363   quality   300 88.03
## 501     Yellow Perch  150.0 239   quality   200 75.83

wrAdd() will not compute a relative weight if the length is less than the minimum or more than the maximum length for which the standard weight should be applied for a species. This can be seen on the second line above where the relative weight for the 41 mm Bluegill is shown as NA. The relative weight value for species without a known standard weight equation will also be NA.
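The underlying calculation can be sketched in base R for a single species. Relative weight is 100 times the observed weight divided by the standard weight, where the standard weight comes from a log10-linear equation in length. The helper wr_manual() is hypothetical, and the Bluegill intercept, slope, and minimum length used as defaults are assumptions (they are not read from FSA), though they give values close to the Bluegill rows in the output above.

```r
# Hypothetical helper: relative weight for ONE species, assuming a standard
# weight equation of the form log10(Ws) = int + slope * log10(length).
# Defaults are assumed metric (mm, g) Bluegill values, not read from FSA.
wr_manual <- function(wt, len, int = -5.374, slope = 3.316, min.len = 80) {
  ws <- 10^(int + slope * log10(len))   # standard weight (g)
  wr <- 100 * wt / ws                   # relative weight (%)
  wr[len < min.len] <- NA               # equation not applied below min.len
  wr
}

# The 41-, 137-, and 170-mm Bluegill from the output above
round(wr_manual(wt = c(1, 41, 98), len = c(41, 137, 170)), 2)
```

The first value is NA because 41 mm is below the assumed minimum length, which mirrors the behavior of wrAdd() described above.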

Limitations

These functions are both general and new and, thus, have some limitations. Both functions require that the species names be recorded as they appear in PSDlit and WSlit, the databases of Gabelhouse lengths and standard weight equations in FSA. The species names in these databases are full names and, thus, the species codes commonly used by many agencies cannot be used directly. However, recodeF() in FSA or recode() in car can be used to efficiently create a new variable that converts “non-standard” species names to names accepted by psdAdd() and wrAdd().
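As one possible alternative to those recoding functions, a named-vector lookup in base R can also translate species codes to full names. This is only a sketch: the codes (BLG, LMB, YEP, XXX) and the data.frame df are made up for illustration and do not come from any agency standard.

```r
# Hypothetical agency codes mapped to the full species names expected by
# psdAdd() and wrAdd(); both the codes and the data are made up.
code2name <- c(BLG = "Bluegill", LMB = "Largemouth Bass", YEP = "Yellow Perch")

df <- data.frame(spcode = c("LMB", "BLG", "BLG", "XXX"),
                 tl = c(297, 137, 170, 50))

# Named-vector lookup; codes without a match become NA
df$species <- unname(code2name[as.character(df$spcode)])
df
```

Unmatched codes are returned as NA, which makes it easy to spot species that still need a full name before the new variable is passed to psdAdd() or wrAdd().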

Currently, psdAdd() can only be used to add Gabelhouse length categories. Thus, the user currently cannot use a length interval different than what has been defined in the literature for a particular species.

Currently, wrAdd() only uses standard weight equations that have been constructed for the 75th percentile of mean weights. While the vast majority of standard weight equations use the 75th percentile, some recent equations for some species include other percentiles. These other percentiles cannot be used with wrAdd().

Simple Summaries

The incremental and traditional PSD calculations for all species are constructed most efficiently by first removing all of the sub-stock fish (labeled with “zero”) and all individuals where the length categorization value is NA (i.e., removes species without Gabelhouse length categories). Use droplevels() to remove “zero” and the unused species from the list of possible levels.

Inch4PSD <- InchLake2 %>%
  filter(gcat!="zero") %>%
  mutate(gcat=droplevels(gcat)) %>%
  filter(!is.na(gcat)) %>%
  mutate(species=droplevels(species))

The incremental PSD values are then computed, for all species where these calculations are possible, as follows.

freq <- xtabs(~species+gcat,data=Inch4PSD)
iPSDs <- prop.table(freq,margin=1)*100
round(iPSDs,1)
##                  gcat
## species           stock quality preferred memorable
##   Black Crappie    20.0     0.0      28.0      52.0
##   Bluegill         27.5    43.8      28.7       0.0
##   Largemouth Bass  31.0    61.9       7.1       0.0
##   Pumpkinseed      12.5    62.5      25.0       0.0
##   Yellow Perch      0.0    52.2      43.5       4.3

Thus, for example, 20% of stock-length Black Crappies are between stock- and quality-length (this is PSD S-Q) and 25% of stock-length Pumpkinseeds are between preferred- and memorable-length (this is PSD P-M).

The traditional PSD values are then computed as follows.

PSDs <- t(apply(iPSDs,MARGIN=1,FUN=rcumsum))
round(PSDs,1)
##                  
## species           stock quality preferred memorable
##   Black Crappie     100    80.0      80.0      52.0
##   Bluegill          100    72.5      28.8       0.0
##   Largemouth Bass   100    69.0       7.1       0.0
##   Pumpkinseed       100    87.5      25.0       0.0
##   Yellow Perch      100   100.0      47.8       4.3

Thus, for example, 80% of stock-length Black Crappies are quality-length or greater (this is the PSD) and 25% of stock-length Pumpkinseeds are preferred-length or greater (this is PSD-P).
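The step from incremental to traditional PSD values is a reverse cumulative sum across the length categories. A minimal base R sketch of that operation, using a hypothetical helper named rev_cumsum() and the Black Crappie incremental values from the table above, is:

```r
# Base R sketch of a reverse cumulative sum: each entry is the sum of
# itself and all entries to its right.
rev_cumsum <- function(x) rev(cumsum(rev(x)))

# Incremental PSD values for Black Crappie from the table above
bc <- c(stock = 20, quality = 0, preferred = 28, memorable = 52)
rev_cumsum(bc)
```

Each traditional PSD value is therefore the percentage of stock-length fish in that category or any larger category.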

Descriptive statistics for the relative weights of each species can be quickly computed with Summarize() from FSA.

Summarize(Wr~species,data=InchLake2,digits=1)
##           species   n mean   sd  min   Q1 median    Q3   max percZero
## 1   Black Crappie  25 90.2  6.0 75.6 86.4   90.0  94.4 105.0        0
## 2        Bluegill 160 86.2 13.0 42.8 82.3   88.6  94.6 117.0        0
## 3 Largemouth Bass  87 83.8  8.0 63.1 80.2   83.8  87.5 133.0        0
## 4     Pumpkinseed  12 95.3 23.6 43.4 85.0  103.0 111.0 127.0        0
## 5    Yellow Perch  26 70.9  6.9 60.1 66.0   70.5  75.3  90.8        0

Descriptive statistics for the relative weights of each species by Gabelhouse length category can be computed with Summarize().

Summarize(Wr~gcat*species,data=InchLake2,digits=1)
##         gcat         species  n  mean   sd   min    Q1 median    Q3   max
## 1      stock   Black Crappie  5  95.2  6.7  86.4  91.9   96.0  97.0 105.0
## 2  preferred   Black Crappie  7  88.0  6.7  75.6  85.7   87.2  93.6  94.4
## 3  memorable   Black Crappie 13  89.5  4.5  82.5  87.5   89.7  91.7  97.8
## 4      stock        Bluegill 44  76.0 18.7  42.8  56.9   79.6  87.9 117.0
## 5    quality        Bluegill 70  88.1  7.1  71.1  84.0   87.9  93.4 102.0
## 6  preferred        Bluegill 46  92.9  5.5  82.3  89.2   91.2  97.8 104.0
## 7       zero Largemouth Bass  3  87.6  3.5  84.3  85.7   87.2  89.3  91.3
## 8      stock Largemouth Bass 26  86.4 10.5  75.3  81.7   85.1  87.8 133.0
## 9    quality Largemouth Bass 52  83.4  5.3  68.5  80.4   83.1  86.9  92.5
## 10 preferred Largemouth Bass  6  74.3  9.7  63.1  65.9   76.3  79.5  87.4
## 11      zero     Pumpkinseed  4  85.5 13.7  67.2  82.1   87.2  90.6 100.0
## 12     stock     Pumpkinseed  1  43.4   NA  43.4  43.4   43.4  43.4  43.4
## 13   quality     Pumpkinseed  5 107.1 18.1  78.8 105.0  108.0 117.0 127.0
## 14 preferred     Pumpkinseed  2 111.1  0.1 111.0 111.0  111.0 111.0 111.0
## 15      zero    Yellow Perch  3  73.3  6.9  65.4  71.1   76.7  77.3  77.9
## 16   quality    Yellow Perch 12  71.4  7.4  61.5  68.0   70.7  73.6  90.8
## 17 preferred    Yellow Perch 10  68.8  6.6  60.1  64.4   67.4  71.7  81.9
## 18 memorable    Yellow Perch  1  76.6   NA  76.6  76.6   76.6  76.6  76.6
##    percZero
## 1         0
## 2         0
## 3         0
## 4         0
## 5         0
## 6         0
## 7         0
## 8         0
## 9         0
## 10        0
## 11        0
## 12        0
## 13        0
## 14        0
## 15        0
## 16        0
## 17        0
## 18        0