Abstract

This paper examines data on lobsters caught in the state of Maine between 1950 and 2016. The data was obtained from The Data and Story Library website. The relationships of Pounds in Millions vs. Year and Value in Millions vs. Year were evaluated using polynomial regression models and their summary outputs. This made it possible to identify trends between the variables and to establish relationships where the data supported them.

Introduction

Lobsters are an important part of Maine. Ask anyone what they know about Maine, and they will likely say one of two things: blueberries or lobsters. This is especially notable considering that lobster went from being prison food to the food of the wealthy. Our group wanted to look at how the lobstering business has changed from the 1950s to today. Lobstering has increased over time and has become a Maine staple: it brings money into the state, is one of its main tourist attractions, and gives Maine something to be known for.

Lobsters are crustaceans that live in the ocean. This paper looks specifically at the Atlantic lobster, also known as the American lobster (Homarus americanus), which is found along the coast of Maine and has an extended range from the Canadian Maritimes to North Carolina (Lobsters). The American lobster can be characterized by its two strong claws: a big-toothed crusher claw for pulverizing shells, and a finer-edged ripper claw for tearing into food (Lobsters). The relationships investigated are Pounds in Millions vs. Year and Value in Millions vs. Year, which were evaluated using polynomial regression models and summary outputs.

Data

The data for this paper was found on The Data and Story Library website. The data set covers lobster landings in the state of Maine between 1950 and 2016. It was derived from a larger data set, published on the State of Maine website, that records commercial lobster fishing landings dating back to 1880. For each year, both data sets give the amount of lobster caught in metric tons, pounds, and pounds in millions; the value and value in millions; the price per pound; the number of license holders; the number of traps in millions; and the temperature at Boothbay Harbor Station in Celsius. The variables used for this assignment were Year, Pounds in Millions, and Value in Millions.

The following code was used to get a summary output for Pounds in Millions vs. Year:

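# Read the landings data (the path is specific to the author's machine)
lobster <- read.csv("C:/Users/nparker97/Desktop/Lobster.csv")
# Fit a quadratic (second-degree polynomial) model of Pounds_M on Year
model1 <- lm(Pounds_M ~ Year + I(Year^2), data = lobster, model = TRUE)
summary(model1)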
## 
## Call:
## lm(formula = Pounds_M ~ Year + I(Year^2), data = lobster, model = TRUE)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -18.202  -5.559   0.720   5.449  23.245 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.069e+05  1.159e+04   17.85   <2e-16 ***
## Year        -2.100e+02  1.169e+01  -17.96   <2e-16 ***
## I(Year^2)    5.327e-02  2.947e-03   18.08   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.067 on 64 degrees of freedom
## Multiple R-squared:  0.9391, Adjusted R-squared:  0.9372 
## F-statistic: 493.8 on 2 and 64 DF,  p-value: < 2.2e-16
model1[1]
## $coefficients
##   (Intercept)          Year     I(Year^2) 
##  2.069017e+05 -2.099653e+02  5.327167e-02

Since the relationship between Pounds in Millions and Year had not been determined in advance, a null and an alternative hypothesis were set up. The null hypothesis is that there is no relationship between Pounds in Millions and Year. The alternative hypothesis is that there is a relationship between Pounds in Millions and Year.
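The overall F-test in the summary output is the usual test of this kind of null hypothesis for a regression model. As a rough consistency check, the sketch below rebuilds the adjusted R-squared, the F-statistic, and its p-value from the reported R-squared alone. It assumes n = 67 observations (one per year from 1950 to 2016, which agrees with the 64 residual degrees of freedom); the variable names are ours and are not part of the original analysis.

```r
# Values copied from the summary(model1) output above.
r2 <- 0.9391   # Multiple R-squared
n  <- 67       # assumed: one observation per year, 1950-2016 inclusive
p  <- 2        # predictors in the model: Year and Year^2

df_resid <- n - p - 1   # residual degrees of freedom

# Adjusted R-squared penalizes R-squared for the number of predictors.
adj_r2 <- 1 - (1 - r2) * (n - 1) / df_resid

# Overall F-statistic, written in terms of R-squared.
f_stat <- (r2 / p) / ((1 - r2) / df_resid)

# Upper-tail probability of the F distribution with (p, df_resid) df.
p_value <- pf(f_stat, df1 = p, df2 = df_resid, lower.tail = FALSE)

print(c(adj_r2 = adj_r2, f_stat = f_stat, p_value = p_value))
```

Because the reported R-squared is itself rounded to four decimals, the rebuilt F-statistic should only be expected to land near the reported 493.8, not match it exactly.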

The following code was used to create a scatter plot for Pounds in Millions vs. Year:

lobster <- read.csv("C:/Users/nparker97/Desktop/Lobster.csv")
library(ggplot2)
# Scatter plot of landings by year, with a centered title
ggplot(data = lobster) +
  geom_point(mapping = aes(x = Year, y = Pounds_M)) +
  ggtitle("Pounds in Millions vs. Year") +
  theme(plot.title = element_text(hjust = 0.5))

FIG. 1. Scatter plot of Pounds in Millions vs. Year. Pounds are given in millions of lbs.

The following code was used to get a summary output for Value in Millions vs. Year:

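# Read the landings data (the path is specific to the author's machine)
lobster <- read.csv("C:/Users/nparker97/Desktop/Lobster.csv")
# Fit a quadratic (second-degree polynomial) model of Value_M on Year
model2 <- lm(Value_M ~ Year + I(Year^2), data = lobster, model = TRUE)
summary(model2)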
## 
## Call:
## lm(formula = Value_M ~ Year + I(Year^2), data = lobster, model = TRUE)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -80.804 -22.500   0.976  20.890  96.605 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  7.221e+05  4.588e+04   15.74   <2e-16 ***
## Year        -7.341e+02  4.627e+01  -15.87   <2e-16 ***
## I(Year^2)    1.866e-01  1.167e-02   15.99   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 31.94 on 64 degrees of freedom
## Multiple R-squared:  0.9451, Adjusted R-squared:  0.9434 
## F-statistic: 551.1 on 2 and 64 DF,  p-value: < 2.2e-16
model2[1]
## $coefficients
##   (Intercept)          Year     I(Year^2) 
##  7.221135e+05 -7.341338e+02  1.865869e-01

Since the relationship between Value in Millions and Year had not been determined in advance, a null and an alternative hypothesis were set up. The null hypothesis is that there is no relationship between Value in Millions and Year. The alternative hypothesis is that there is a relationship between Value in Millions and Year.
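The coefficient table for Value_M has an intercept near 7.2e+05 and a Year coefficient near -7.3e+02, which are hard to interpret because they describe the curve at Year = 0. One common alternative, shown here only as a sketch, is to center Year before fitting. The data below is synthetic: it is generated from the reported coefficients plus random noise and is NOT the lobster data, and the names sim, Year_c, fit_raw, and fit_ctr are ours. Centering on 1983 (the midpoint of 1950 to 2016) is an illustrative choice.

```r
# Synthetic stand-in for the lobster data (NOT the real values):
# a quadratic trend built from the reported coefficients, plus noise.
set.seed(2)
sim <- data.frame(Year = 1950:2016)
sim$Value_M <- 722113.5 - 734.1338 * sim$Year +
  0.1865869 * sim$Year^2 + rnorm(nrow(sim), sd = 32)

# Same model form as in the paper, using the raw calendar year.
fit_raw <- lm(Value_M ~ Year + I(Year^2), data = sim)

# Centered version: Year_c = 0 corresponds to 1983.
sim$Year_c <- sim$Year - 1983
fit_ctr <- lm(Value_M ~ Year_c + I(Year_c^2), data = sim)

# Both parameterizations describe the same curve, so the fitted
# values should agree up to floating-point error.
max_diff <- max(abs(fitted(fit_raw) - fitted(fit_ctr)))

print(coef(fit_raw))
print(coef(fit_ctr))
print(max_diff)
```

In the centered fit, the intercept is the predicted value in 1983 and the Year_c coefficient is the slope in that year, while the fitted curve and R-squared are unchanged.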

The following code was used to create a scatter plot for Value in Millions vs. Year:

lobster <- read.csv("C:/Users/nparker97/Desktop/Lobster.csv")
library(ggplot2)
# Scatter plot of landed value by year, with a centered title
ggplot(data = lobster) +
  geom_point(mapping = aes(x = Year, y = Value_M)) +
  ggtitle("Value in Millions vs. Year") +
  theme(plot.title = element_text(hjust = 0.5))

FIG. 2. Scatter plot of Value in Millions vs. Year. Value is given in millions of dollars.

Results

FIG. 1 depicts the poundage of lobsters caught over the years. Lobsters were caught in smaller quantities from the 1950s until the 1980s. During the 1990s, however, landings began to grow, and the increase since then has been substantial. The value of lobsters went up over the same period, which would be consistent with demand for lobster outpacing the poundage caught. The p-values for all three coefficients were less than 2e-16, which means that each term in the model was statistically significant. Due to this, the null hypothesis was rejected in favor of the alternative hypothesis, which stated that there was a relationship between the variables. The R-squared was 0.9391, meaning that the quadratic model in Year accounted for 93.91% of the variability in Pounds_M. This R-squared value is very high, close to 1, which indicates that the model fits the data closely and that Year was able to effectively predict Pounds_M within the observed range. The fitted model is Predicted Pounds in Millions = 2.069e+05 - 2.100e+02(Year) + 5.327e-02(Year^2).

FIG. 2 depicts the value of lobsters caught over the years. Value has increased since the 1990s, which matches the growth in the amount of lobster caught. The two graphs are closely connected, and both show rapid, accelerating growth in value and pounds caught from the 1990s to the end of the data in 2016. The p-values for all three coefficients were again less than 2e-16, which means that each term in the model was statistically significant. Due to this, the null hypothesis was rejected in favor of the alternative hypothesis, which stated that there was a relationship between the variables. The R-squared was 0.9451, meaning that the quadratic model in Year accounted for 94.51% of the variability in Value_M. This R-squared value is also very high, close to 1, which indicates that the model fits the data closely and that Year was able to effectively predict Value_M within the observed range. The fitted model is Predicted Value in Millions = 7.221e+05 - 7.341e+02(Year) + 1.866e-01(Year^2).

The residual medians tell a similar story for both models. Each median was close to 0 relative to the spread of the residuals (0.720 for Pounds in Millions and 0.976 for Value in Millions), which suggests that the residuals were roughly symmetrically distributed in both models.
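When a fitted equation of this form is evaluated by hand, the precision of the coefficients matters, because the three terms are each in the hundreds of thousands and largely cancel. The sketch below evaluates the Pounds in Millions equation for 2016 twice: once with the seven-significant-figure coefficients printed by model1[1], and once with the four-significant-figure values from the summary table. The helper name predict_pounds is ours and is only for illustration.

```r
# Evaluate a quadratic in Year from an intercept and two slopes.
predict_pounds <- function(year, b0, b1, b2) {
  b0 + b1 * year + b2 * year^2
}

year <- 2016

# Coefficients as printed by model1[1] (seven significant figures).
full <- predict_pounds(year, 2.069017e+05, -2.099653e+02, 5.327167e-02)

# The same coefficients rounded to four significant figures,
# as they appear in the summary(model1) coefficient table.
rounded <- predict_pounds(year, 2.069e+05, -2.100e+02, 5.327e-02)

print(c(full = full, rounded = rounded, difference = full - rounded))
```

The two results differ by tens of millions of pounds, so the rounded equation is better read as a description of the model's form than as a calculator. For actual predictions, calling predict() on the fitted model object avoids the rounding problem entirely.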

Conclusions

Through our models, we were able to show a clear connection between Year and both the value and the poundage of lobsters caught. Both increased over time, and the value increased along with the amount caught. This is consistent with supply and demand: a high demand for a product usually goes along with that product being scarce relative to demand. The data and its results showed that the value of lobster landings increased rapidly within the last 20 years or so of the data, and the amount of lobster caught increased as well. The results therefore support a relationship between Year and Pounds in Millions, and between Year and Value in Millions.

While this data only reported numbers, it should also be noted that lobsters are being impacted by climate change. Due to this, the trends presented by these models might not remain relevant in the upcoming years, since climate change could alter the relationships between the variables examined here. The data also only went as far as 2016, so the past three years are not included, and the analysis is only valid up to the last year of data. Using the models to estimate results for the past three years would be extrapolation, which assumes that conditions, such as the relationship between the variables, remain the same. There is also the possibility of overfishing, which could lead to a decline in the population and even to the species becoming endangered.
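One way to see the cost of extrapolation is to compare prediction intervals inside and outside the observed range of years. The sketch below does this on synthetic data, generated from the reported Pounds in Millions coefficients plus random noise; it is NOT the lobster data, and the names sim, fit, new_years, and result are ours. The interval widths it prints only illustrate the general pattern and say nothing about actual landings after 2016.

```r
# Synthetic stand-in for the lobster data (NOT the real landings):
# a quadratic trend built from the reported coefficients, plus noise.
set.seed(1)
sim <- data.frame(Year = 1950:2016)
sim$Pounds_M <- 206901.7 - 209.9653 * sim$Year +
  0.05327167 * sim$Year^2 + rnorm(nrow(sim), sd = 8)

fit <- lm(Pounds_M ~ Year + I(Year^2), data = sim)

# One year in the middle of the observed range and three beyond it.
new_years <- data.frame(Year = c(1983, 2017, 2018, 2019))
pred <- predict(fit, newdata = new_years,
                interval = "prediction", level = 0.95)

# Width of each 95% prediction interval.
result <- cbind(new_years, pred, width = pred[, "upr"] - pred[, "lwr"])
print(result)
```

The intervals widen as the year moves past the last observation, and even these wider intervals still assume that the quadratic relationship continues to hold, which is the assumption that climate change or overfishing could break.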

Sources

“Lobsters.” Gulf of Maine Research Institute: Kinds of Lobsters, Gulf of Maine Research Institute, 27 Feb. 2012, www.gma.org/lobsters/allaboutlobsters/species.html.