Fisheries scientists often compute proportional size distribution (PSD) and relative weight (Wr) indices (see here and here for more details). PSD indices require the construction of a variable that contains the Gabelhouse length categories (i.e., “stock”, “quality”, “preferred”, “memorable”, “trophy”) derived from observed lengths. Computing relative weights requires the construction of a variable that contains the standard weight for each fish. The Gabelhouse length categories and standard weight equations are species-specific and, thus, past implementations required that the categories or standard weight equations be applied on a species-by-species basis. In other words, a fisheries scientist who wanted to create these variables for several species in a data.frame had to subset the data.frame to a single species, apply the categories or equation to that subset, and then repeat the process for each species.
I recently modified or added two functions to the FSA package that, when coupled with functions in the dplyr package, can efficiently create the length categorization and relative weight variables for all species in a data.frame without having to subset that data.frame. These two functions are illustrated below.
First, I loaded the FSA and dplyr packages for use below.
library(FSA)
library(dplyr)
The InchLake2 data.frame in the FSAdata package contains the observed lengths and weights for several species of fish from an inland lake in Wisconsin.
library(FSAdata)
data(InchLake2)
str(InchLake2)
## 'data.frame': 516 obs. of 6 variables:
## $ netID : int 206 205 205 205 205 205 205 205 205 205 ...
## $ fishID : int 501 502 503 504 505 506 507 508 509 510 ...
## $ species: Factor w/ 9 levels "Black Crappie",..: 2 2 2 2 2 2 2 2 2 2 ...
## $ length : num 1.5 1.7 2.2 2.1 1.5 1.9 1.9 1.4 1.2 1.4 ...
## $ weight : num 0.7 1.4 1.5 1.4 1 1.8 1.4 0.6 0.3 0.8 ...
## $ year : int 2008 2008 2008 2008 2008 2008 2008 2008 2008 2008 ...
The lengths were measured in inches and the weights were recorded in grams. I converted the lengths to mm so that both measurements would be in metric units. For display purposes below, I also sorted the data.frame by species and length and removed variables that were not needed for this post.
InchLake2 <- InchLake2 %>%
  mutate(tl=round(length*25.4,0)) %>%
  arrange(species,tl) %>%
  select(-netID,-fishID,-year,-length)
head(InchLake2)
## species weight tl
## 1 Black Crappie 37 142
## 2 Black Crappie 37 147
## 3 Black Crappie 46 152
## 4 Black Crappie 47 155
## 5 Black Crappie 102 188
## 6 Black Crappie 275 267
The new function that adds the Gabelhouse length categories is psdAdd(). When used with mutate() from dplyr, psdAdd() takes vectors of the length and species names as the first two arguments.
InchLake2 <- mutate(InchLake2,gcat=psdAdd(tl,species))
InchLake2[seq(1,501,50),] # Examine every 50th row
## species weight tl gcat
## 1 Black Crappie 37.0 142 stock
## 51 Bluegill 1.0 41 zero
## 101 Bluegill 41.0 137 stock
## 151 Bluegill 98.0 170 quality
## 201 Bluegill 170.0 203 preferred
## 251 Bluntnose Minnow 0.9 53 <NA>
## 301 Bluntnose Minnow 1.4 64 <NA>
## 351 Iowa Darter 0.9 46 <NA>
## 401 Largemouth Bass 333.0 297 stock
## 451 Largemouth Bass 624.0 363 quality
## 501 Yellow Perch 150.0 239 quality
These results show that the new variable, gcat, contains the length category names for all species for which Gabelhouse length categories exist. An NA appears for species without Gabelhouse length categories (e.g., Bluntnose Minnow and Iowa Darter), and the category name is “zero” if the fish’s length is below the stock value.
psdAdd() assumes that metric (mm and g) units are used and that the user wants names for the Gabelhouse categories (e.g., “stock”, “quality”, etc.) in the resulting variable. If the data were recorded in English units (inches and lbs) then use units='English'. The minimum values for the categories, illustrated below, will be used if use.names=FALSE.
InchLake2 <- mutate(InchLake2,gcatv=psdAdd(tl,species,use.names=FALSE))
InchLake2[seq(1,501,50),] # Examine every 50th row
## species weight tl gcat gcatv
## 1 Black Crappie 37.0 142 stock 130
## 51 Bluegill 1.0 41 zero 0
## 101 Bluegill 41.0 137 stock 80
## 151 Bluegill 98.0 170 quality 150
## 201 Bluegill 170.0 203 preferred 200
## 251 Bluntnose Minnow 0.9 53 <NA> NA
## 301 Bluntnose Minnow 1.4 64 <NA> NA
## 351 Iowa Darter 0.9 46 <NA> NA
## 401 Largemouth Bass 333.0 297 stock 200
## 451 Largemouth Bass 624.0 363 quality 300
## 501 Yellow Perch 150.0 239 quality 200
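The link between the names in gcat and the minimum values in gcatv can be mimicked for a single species in base R. The following is a minimal sketch, not how psdAdd() is implemented: bg_breaks is a hypothetical vector holding only the Bluegill minimums visible in the output above, so any larger categories are deliberately left out.

```r
# Hypothetical, truncated breaks: only the Bluegill minimums (mm) shown above.
# Larger categories are omitted, so very large fish would be labeled
# "preferred" by this sketch.
bg_breaks <- c(zero=0, stock=80, quality=150, preferred=200)

# Lengths (mm) of the four Bluegill rows displayed above
tl <- c(41, 137, 170, 203)

# Index of the largest break that does not exceed each length
idx <- findInterval(tl, bg_breaks)

# Minimum value of the category (analogous to use.names=FALSE)
gcatv <- unname(bg_breaks[idx])

# Category name (analogous to the default use.names=TRUE)
gcat <- factor(names(bg_breaks)[idx], levels=names(bg_breaks))

data.frame(tl, gcat, gcatv)
```

Because findInterval() returns the position of the largest break at or below each length, the 41 mm fish falls below the stock minimum and is labeled “zero”.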
The wrAdd() function adds a variable with the relative weights for each species. When used with mutate(), it requires vectors of observed weights, lengths, and species names as the first three arguments. As with psdAdd(), wrAdd() assumes metric units (mm and g) but can use English units (inches and lbs) with units="English".
InchLake2 <- mutate(InchLake2,Wr=wrAdd(weight,tl,species))
InchLake2[seq(1,501,50),] # Examine every 50th row
## species weight tl gcat gcatv Wr
## 1 Black Crappie 37.0 142 stock 130 97.01
## 51 Bluegill 1.0 41 zero 0 NA
## 101 Bluegill 41.0 137 stock 80 79.69
## 151 Bluegill 98.0 170 quality 150 93.12
## 201 Bluegill 170.0 203 preferred 200 89.70
## 251 Bluntnose Minnow 0.9 53 <NA> NA NA
## 301 Bluntnose Minnow 1.4 64 <NA> NA NA
## 351 Iowa Darter 0.9 46 <NA> NA NA
## 401 Largemouth Bass 333.0 297 stock 200 90.60
## 451 Largemouth Bass 624.0 363 quality 300 88.03
## 501 Yellow Perch 150.0 239 quality 200 75.83
wrAdd() will not compute a relative weight if the length is less than the minimum or more than the maximum length for which the standard weight should be applied for a species. This can be seen on the second line above where the relative weight for the 41 mm Bluegill is shown as NA. The relative weight value for species without a known standard weight equation will also be NA.
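The relative weight is 100 times the observed weight divided by the standard weight, where the standard weight is predicted from length on the log10 scale. The sketch below works through that arithmetic for the four Bluegill rows shown above. The intercept (-5.374), slope (3.316), and 80 mm minimum length are assumptions typed in by hand for illustration rather than values read from FSA, and wr_sketch() is a hypothetical helper, not a function in any package.

```r
# Hypothetical helper: relative weight from a log10-log10 standard weight
# equation. The default coefficients and minimum length are assumed values
# for Bluegill, used here only for illustration.
wr_sketch <- function(wt, len, int=-5.374, slp=3.316, min_len=80) {
  ws <- 10^(int + slp*log10(len))   # standard weight (g) at each length
  wr <- 100*wt/ws                   # relative weight
  ifelse(len < min_len, NA_real_, wr)  # no Wr below the minimum length
}

# Weights (g) and lengths (mm) of the four Bluegill rows displayed above
wt <- c(1, 41, 98, 170)
tl <- c(41, 137, 170, 203)

Wr <- wr_sketch(wt, tl)
round(Wr, 2)
```

Under these assumed coefficients the sketch should return NA for the 41 mm fish and values that agree with the Wr column above for the other three fish.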
These functions are both general and new and, thus, have some limitations. Both functions require that the species names be recorded exactly as they appear in PSDlit and WRlit, the databases of Gabelhouse lengths and standard weight equations in FSA. The species names in these databases are full names and, thus, the species codes commonly used by many agencies cannot be used directly. However, recodeF() in FSA or recode() in car can be used to efficiently create a new variable that converts “non-standard” species names to names accepted by psdAdd() and wrAdd().
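Converting agency codes to full names can also be done with a named-vector lookup in base R. This is only a sketch of the idea: the three-letter codes below are hypothetical, and the full names are spelled as they appear in the output above.

```r
# Hypothetical agency codes for a handful of fish
codes <- c("BLG", "LMB", "BLG", "YEP")

# Lookup table: names are the (made-up) codes, values are the full names
lookup <- c(BLG="Bluegill", LMB="Largemouth Bass", YEP="Yellow Perch")

# Index the lookup by code; unname() drops the codes from the result
species <- unname(lookup[codes])
species
```

A code that is missing from the lookup table returns NA, which makes unmatched codes easy to find before the species names are passed on.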
Currently, psdAdd() can only be used to add Gabelhouse length categories. Thus, the user currently cannot use a length interval different than what has been defined in the literature for a particular species.
Currently, wrAdd() only uses standard weight equations that have been constructed for the 75th percentile of mean weights. While the vast majority of standard weight equations use the 75th percentile, some recent equations for some species use other percentiles. These other percentiles cannot be used with wrAdd().
The incremental and traditional PSD calculations for all species are constructed most efficiently by first removing all of the sub-stock fish (labeled with “zero”) and all individuals whose length category is NA (i.e., fish of species without Gabelhouse length categories). droplevels() is then used to remove “zero” and the unused species from the lists of possible levels.
Inch4PSD <- InchLake2 %>%
  filter(gcat!="zero") %>%
  mutate(gcat=droplevels(gcat)) %>%
  filter(!is.na(gcat)) %>%
  mutate(species=droplevels(species))
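Two details of this step are easy to miss: comparing NA to "zero" yields NA rather than TRUE or FALSE, and removing rows does not by itself remove levels from a factor. A base R sketch of the same logic on a hypothetical toy vector (toy_gcat is made up for illustration):

```r
# Hypothetical toy factor with one sub-stock fish and one missing category
toy_gcat <- factor(c("zero", "stock", NA, "quality", "stock"),
                   levels=c("zero", "stock", "quality"))

# The comparison is NA (not FALSE) where the category is missing
toy_gcat != "zero"

# Keep only fish with a known category that is not "zero"
keep <- !is.na(toy_gcat) & toy_gcat != "zero"
kept <- toy_gcat[keep]

# "zero" is still listed as a level even though no such fish remain
levels(kept)

# droplevels() removes the levels that are no longer used
kept <- droplevels(kept)
levels(kept)
```

Without droplevels(), a later cross-tabulation would still carry an empty “zero” column.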
The incremental PSD values are then computed, for all species where these calculations are possible, as follows.
freq <- xtabs(~species+gcat,data=Inch4PSD)
iPSDs <- prop.table(freq,margin=1)*100
round(iPSDs,1)
## gcat
## species stock quality preferred memorable
## Black Crappie 20.0 0.0 28.0 52.0
## Bluegill 27.5 43.8 28.7 0.0
## Largemouth Bass 31.0 61.9 7.1 0.0
## Pumpkinseed 12.5 62.5 25.0 0.0
## Yellow Perch 0.0 52.2 43.5 4.3
Thus, for example, 20% of stock-length Black Crappies are between stock- and quality-length (this is PSD S-Q) and 25% of stock-length Pumpkinseeds are between preferred- and memorable-length (this is PSD P-M).
The traditional PSD values are then computed as follows.
PSDs <- t(apply(iPSDs,MARGIN=1,FUN=rcumsum))
round(PSDs,1)
##
## species stock quality preferred memorable
## Black Crappie 100 80.0 80.0 52.0
## Bluegill 100 72.5 28.8 0.0
## Largemouth Bass 100 69.0 7.1 0.0
## Pumpkinseed 100 87.5 25.0 0.0
## Yellow Perch 100 100.0 47.8 4.3
Thus, for example, 80% of stock-length Black Crappies are quality-length or greater (this is the PSD) and 25% of stock-length Pumpkinseeds are preferred-length or greater (this is PSD-P).
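The traditional values are reverse cumulative sums of the incremental values: each category is added to every category above it. A minimal base R sketch using the Black Crappie row from the incremental table above; rev_cumsum() is a hypothetical stand-in written for illustration, not the FSA function itself.

```r
# Hypothetical stand-in for a reverse cumulative sum:
# reverse the vector, accumulate, then restore the original order
rev_cumsum <- function(x) rev(cumsum(rev(x)))

# Incremental PSD values for Black Crappie, copied from the table above
bc <- c(stock=20, quality=0, preferred=28, memorable=52)

rev_cumsum(bc)
```

The first value is always 100 because every fish retained for the PSD calculations is at least stock-length.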
Descriptive statistics for the relative weights of each species can be quickly computed with Summarize() from FSA.
Summarize(Wr~species,data=InchLake2,digits=1)
## species n mean sd min Q1 median Q3 max percZero
## 1 Black Crappie 25 90.2 6.0 75.6 86.4 90.0 94.4 105.0 0
## 2 Bluegill 160 86.2 13.0 42.8 82.3 88.6 94.6 117.0 0
## 3 Largemouth Bass 87 83.8 8.0 63.1 80.2 83.8 87.5 133.0 0
## 4 Pumpkinseed 12 95.3 23.6 43.4 85.0 103.0 111.0 127.0 0
## 5 Yellow Perch 26 70.9 6.9 60.1 66.0 70.5 75.3 90.8 0
Descriptive statistics for the relative weights of each species by Gabelhouse length category can be computed by adding gcat to the right-hand side of the formula given to Summarize().
Summarize(Wr~gcat*species,data=InchLake2,digits=1)
## gcat species n mean sd min Q1 median Q3 max
## 1 stock Black Crappie 5 95.2 6.7 86.4 91.9 96.0 97.0 105.0
## 2 preferred Black Crappie 7 88.0 6.7 75.6 85.7 87.2 93.6 94.4
## 3 memorable Black Crappie 13 89.5 4.5 82.5 87.5 89.7 91.7 97.8
## 4 stock Bluegill 44 76.0 18.7 42.8 56.9 79.6 87.9 117.0
## 5 quality Bluegill 70 88.1 7.1 71.1 84.0 87.9 93.4 102.0
## 6 preferred Bluegill 46 92.9 5.5 82.3 89.2 91.2 97.8 104.0
## 7 zero Largemouth Bass 3 87.6 3.5 84.3 85.7 87.2 89.3 91.3
## 8 stock Largemouth Bass 26 86.4 10.5 75.3 81.7 85.1 87.8 133.0
## 9 quality Largemouth Bass 52 83.4 5.3 68.5 80.4 83.1 86.9 92.5
## 10 preferred Largemouth Bass 6 74.3 9.7 63.1 65.9 76.3 79.5 87.4
## 11 zero Pumpkinseed 4 85.5 13.7 67.2 82.1 87.2 90.6 100.0
## 12 stock Pumpkinseed 1 43.4 NA 43.4 43.4 43.4 43.4 43.4
## 13 quality Pumpkinseed 5 107.1 18.1 78.8 105.0 108.0 117.0 127.0
## 14 preferred Pumpkinseed 2 111.1 0.1 111.0 111.0 111.0 111.0 111.0
## 15 zero Yellow Perch 3 73.3 6.9 65.4 71.1 76.7 77.3 77.9
## 16 quality Yellow Perch 12 71.4 7.4 61.5 68.0 70.7 73.6 90.8
## 17 preferred Yellow Perch 10 68.8 6.6 60.1 64.4 67.4 71.7 81.9
## 18 memorable Yellow Perch 1 76.6 NA 76.6 76.6 76.6 76.6 76.6
## percZero
## 1 0
## 2 0
## 3 0
## 4 0
## 5 0
## 6 0
## 7 0
## 8 0
## 9 0
## 10 0
## 11 0
## 12 0
## 13 0
## 14 0
## 15 0
## 16 0
## 17 0
## 18 0
In earlier posts, I briefly introduced the “verbs” in the dplyr package and showed how to use lencat() from FSA with mutate() from dplyr.