Loading Packages
This section loads the packages used in this data analysis project:
# The following code chunk identifies the packages used in the data analysis and loads them into the R session.
library(tidyverse) # The tidyverse package is crucial as it contains tidyr, ggplot2, dplyr, and more, used throughout this code.
library(RColorBrewer) # Optional package: used to apply colour changes to graphs.
library(PairedData) # Used for visualising and summarising paired data.
library(ggpubr) # Provides publication-ready plotting helpers built on ggplot2.
library(car) # Companion to Applied Regression; provides regression diagnostics.
library(patchwork) # Used to combine multiple ggplot2 graphs into a single figure.
Importing Data and Data Wrangling
This section is used to load in the data and perform data wrangling.
# Loading in the data.
<- read.csv("C:\\Users\\jacob\\OneDrive\\Uni\\Masters\\Sharks_Data\\sharks.csv")
sharks
<- read.csv("C:\\Users\\jacob\\OneDrive\\Uni\\Masters\\Sharks_Data\\sharksub.csv") sharksub
Looking at the data:
str(sharks) # provides an overview of the structure of the data
'data.frame': 500 obs. of 10 variables:
$ ID : chr "SH001" "SH002" "SH003" "SH004" ...
$ sex : chr "Female" "Female" "Female" "Male" ...
$ blotch: num 37.2 34.5 36.3 35.3 37.4 ...
$ BPM : int 148 158 125 161 138 126 166 135 132 127 ...
$ weight: num 74.7 73.4 71.8 104.6 67.1 ...
$ length: num 187 189 284 171 264 ...
$ air : num 37.7 35.7 34.8 36.2 33.6 ...
$ water : num 23.4 21.4 20.1 21.6 21.8 ...
$ meta : num 64.1 73.7 54.4 86.3 108 ...
$ depth : num 53.2 49.6 49.4 50.3 49 ...
str(sharksub) # provides an overview of the structure of the data
'data.frame': 50 obs. of 4 variables:
$ ID : chr "SH269" "SH163" "SH008" "SH239" ...
$ sex : chr "Female" "Female" "Female" "Female" ...
$ blotch1: num 36.1 33.4 36.3 35 35.7 ...
$ blotch2: num 37.2 34.4 36.5 36 36.8 ...
Checking for Null Values:
colSums(is.na(sharks)) # Function to check data for na or null values
ID sex blotch BPM weight length air water meta depth
0 0 0 0 0 0 0 0 0 0
colSums(is.na(sharksub)) # Function to check data for na or null values
ID sex blotch1 blotch2
0 0 0 0
Preparing the Data
# The sharksub data is reshaped from a wide format to a long format so that a boxplot can be created. The result is saved as Blotching and contains one row per blotch1/blotch2 observation with its associated value.
Blotching <- sharksub %>% # Identifying the data to use; the pipe operator (%>%) passes it into the following function.
  pivot_longer( # pivot_longer is used to change the data from a wide to a long format.
    cols = c(blotch1, blotch2), # The columns to be gathered into the long format.
    names_to = "Observation", # Naming the new column that holds the original column names.
    values_to = "BlotchingTime" # Naming the new column that holds the associated values.
  )
Data Analysis
Summary of Data Sets
The data can now be explored. First, the summary function is used to summarise the data and its variables. This summary is helpful in determining the minimum, maximum, mean, median, and quartiles of the data.
summary(sharks) # Creates a summary of the data.
ID sex blotch BPM
Length:500 Length:500 Min. :30.78 Min. :119.0
Class :character Class :character 1st Qu.:34.16 1st Qu.:129.0
Mode :character Mode :character Median :35.05 Median :142.0
Mean :35.13 Mean :141.8
3rd Qu.:36.05 3rd Qu.:153.2
Max. :40.08 Max. :166.0
weight length air water
Min. : 65.10 Min. :128.3 Min. :33.00 Min. :20.01
1st Qu.: 75.68 1st Qu.:172.0 1st Qu.:34.42 1st Qu.:21.55
Median : 87.82 Median :211.1 Median :35.43 Median :23.11
Mean : 87.94 Mean :211.0 Mean :35.54 Mean :23.02
3rd Qu.:100.40 3rd Qu.:251.8 3rd Qu.:36.71 3rd Qu.:24.37
Max. :110.94 Max. :291.0 Max. :38.00 Max. :25.99
meta depth
Min. : 50.03 Min. :44.64
1st Qu.: 67.39 1st Qu.:48.90
Median : 82.45 Median :50.14
Mean : 82.04 Mean :50.14
3rd Qu.: 95.97 3rd Qu.:51.35
Max. :112.45 Max. :56.83
summary(sharksub) # Creates a summary of the data.
ID sex blotch1 blotch2
Length:50 Length:50 Min. :32.49 Min. :33.47
Class :character Class :character 1st Qu.:34.38 1st Qu.:35.31
Mode :character Mode :character Median :34.94 Median :35.94
Mean :35.03 Mean :35.96
3rd Qu.:35.90 3rd Qu.:36.78
Max. :37.07 Max. :38.18
Identifying the number of males and females:
table(sharks$sex) # Table of number of males and females from first capture event
Female Male
236 264
table(sharksub$sex) # obtaining the number of males and females of those recorded twice
Female Male
25 25
Looking at the data by sex:
# Creating a summary table by sex for the first capture event.
sharks.by.sex <- sharks %>% # Creation of a new dataset using the sharks data
  group_by(sex) %>% # Grouping the data together by sex
  summarize(
    mean_blotch = mean(blotch),
    sd_blotch = sd(blotch),
    mean_weight = mean(weight),
    mean_length = mean(length),
    mean_BPM = mean(BPM),
    mean_meta = mean(meta),
    mean_depth = mean(depth) # Summarising the variables by sex as means (plus the sd of blotch)
  ) %>%
  ungroup() # Ungrouping the data, crucial to ensure future code is not grouped together.
# Renaming the column headers:
sharks.by.sex <- sharks.by.sex %>%
  rename(
    `Sex` = sex,
    `Mean Blotch` = mean_blotch,
    `SD Blotch` = sd_blotch,
    `Mean Weight` = mean_weight,
    `Mean Length` = mean_length,
    `Mean BPM` = mean_BPM,
    `Mean Cortisol` = mean_meta,
    `Mean Depth` = mean_depth
  )
print(sharks.by.sex) # Viewing the data
# A tibble: 2 × 8
Sex `Mean Blotch` `SD Blotch` `Mean Weight` `Mean Length` `Mean BPM`
<chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Female 34.9 1.39 88.1 212. 142.
2 Male 35.3 1.44 87.8 210. 142.
# ℹ 2 more variables: `Mean Cortisol` <dbl>, `Mean Depth` <dbl>
# Creating a table of the second capture event
sharksub.by.sex <- sharksub %>% # Creation of a new dataset using the sharksub data
  group_by(sex) %>% # Grouping the data together by sex
  summarize(
    mean_blotch1 = mean(blotch1),
    sd_blotch1 = sd(blotch1),
    mean_blotch2 = mean(blotch2),
    sd_blotch2 = sd(blotch2) # Summarising blotch 1 and 2 by sex, with their means and standard deviations
  ) %>%
  ungroup() # Ungrouping the data
# Renaming the column headers:
sharksub.by.sex <- sharksub.by.sex %>%
  rename(
    `Mean Blotch 1` = mean_blotch1,
    `SD Blotch 1` = sd_blotch1,
    `Mean Blotch 2` = mean_blotch2,
    `SD Blotch 2` = sd_blotch2
  )
print(sharksub.by.sex) # Viewing the data
# A tibble: 2 × 5
sex `Mean Blotch 1` `SD Blotch 1` `Mean Blotch 2` `SD Blotch 2`
<chr> <dbl> <dbl> <dbl> <dbl>
1 Female 34.8 1.08 35.8 1.11
2 Male 35.3 1.09 36.2 1.20
Statistical Analysis
1. Correlation Between Air and Water Temperature
Visualising the Data
As blotch, air, and water are quantitative variables, a scatter plot is the best graph to use here:
# Creating a scatterplot for Air temp against Blotch time
# Saving graph to air.plot data set to combine with the water.plot below
air.plot <- sharks %>% # Calling in the data
  ggplot(aes(
    x = air, # Defining the x-axis variable
    y = blotch # Defining the y-axis variable
  )) +
  geom_point(alpha = 1, size = 1) + # Creating a scatter plot
  geom_smooth(method = "lm", se = FALSE, color = "darkgrey", linewidth = 1) + # Applying a line of best fit using the linear model for a smooth line.
  theme_minimal(base_size = 13) + # Applying the minimal theme to the graph
  labs(
    x = "Air Temperature (°C)", # Labelling of the x-axis
    y = "Blotch Time (s)" # Labelling of the y-axis
  )
# Creating a scatterplot for blotch time against water temp.
# Saving graph to water.plot data set to combine with the air.plot below
water.plot <- sharks %>%
  ggplot(aes(
    x = water,
    y = blotch
  )) +
  geom_point(alpha = 1, size = 1) +
  geom_smooth(method = "lm", se = FALSE, color = "darkgrey", linewidth = 1) +
  theme_minimal(base_size = 13) +
  labs(
    x = "Water Temperature (°C)",
    y = "Blotch Time (s)"
  )
# Using patchwork package, the 2 graphs can be combined into a single graph:
air.water <- water.plot + air.plot + plot_layout(ncol = 1)
air.water
`geom_smooth()` using formula = 'y ~ x'
`geom_smooth()` using formula = 'y ~ x'
Checking for normality
Before a correlation test can be performed, the data needs to be checked for normality to determine whether Pearson’s or Spearman’s Correlation is used.
# Performing Shapiro-Wilk on variables for normality
shapiro.test(sharks$air)
Shapiro-Wilk normality test
data: sharks$air
W = 0.95885, p-value = 1.338e-10
shapiro.test(sharks$water)
Shapiro-Wilk normality test
data: sharks$water
W = 0.96035, p-value = 2.371e-10
shapiro.test(sharks$blotch)
Shapiro-Wilk normality test
data: sharks$blotch
W = 0.99695, p-value = 0.4769
As water and air are not normally distributed, Spearman's correlation is used.
Spearman’s Correlation Test
cor.test(sharks$air, sharks$blotch, method = "spearman") # Performing the correlation test using Spearman's method.
Spearman's rank correlation rho
data: sharks$air and sharks$blotch
S = 21469324, p-value = 0.4956
alternative hypothesis: true rho is not equal to 0
sample estimates:
rho
-0.03053167
cor.test(sharks$water, sharks$blotch, method = "spearman") # Performing the correlation test using Spearman's method.
Spearman's rank correlation rho
data: sharks$water and sharks$blotch
S = 21758692, p-value = 0.3214
alternative hypothesis: true rho is not equal to 0
sample estimates:
rho
-0.04442139
Both correlations are slightly negative but very close to 0, and neither is statistically significant (p = 0.50 for air, p = 0.32 for water), so any relationship between temperature and blotch time is very limited.
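As an optional visual check, the ggpubr package loaded earlier can annotate Spearman's rho directly onto a scatter plot; the sketch below is a hypothetical extra step, not part of the original analysis, and assumes the sharks data loaded above.
# Optional: scatter plot with Spearman's rho and p-value printed on the graph
ggscatter(sharks, x = "water", y = "blotch",
          add = "reg.line", # Adds a regression line to the plot
          cor.coef = TRUE, cor.method = "spearman", # Annotates rho and its p-value
          xlab = "Water Temperature (°C)", ylab = "Blotch Time (s)")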
2. Effect of Multiple Captures on Blotching Time
Research Questions
To check for the impact of multiple captures on the blotching time, a Paired T-test will be used if the data is normally distributed, or a Wilcoxon test if the data is not normal.
First, however, the research question needs to be established.
I want to test whether the mean blotching time on the second capture is significantly different from the first blotch time.
The null hypothesis is that there is no significant difference between the two blotch times.
The alternative hypothesis is that the two blotch times are significantly different.
Preparing and Visualising the Data
Summarising the Data:
Blotching %>%
  group_by(Observation) %>% # Grouping this data by blotch1 and blotch2
  summarise(
    count = n(), # Calculating the number of observations in blotch1 and blotch2
    mean = mean(BlotchingTime), # Calculating the mean time of blotching
    sd = sd(BlotchingTime) # Calculating the standard deviation of blotching
  )
# A tibble: 2 × 4
Observation count mean sd
<chr> <int> <dbl> <dbl>
1 blotch1 50 35.0 1.10
2 blotch2 50 36.0 1.16
# Output is a summary table.
Visualising the Data
# Creating a boxplot for the blotching of capture 1 and 2.
box.plot <- Blotching %>% # saving as new dataset for presentation later
  ggplot(aes(x = Observation, y = BlotchingTime, fill = Observation)) +
  geom_boxplot() + # Creating a boxplot graph with the ggplot variables
  labs(
    x = "Capture Event",
    y = "Blotching Time (seconds)"
  ) +
  theme_minimal(base_size = 13) + # Changes the graph aesthetics
  theme(legend.position = "none") + # Removing the legend
  scale_fill_brewer(palette = "Greys") # Setting colour to journal style
This can be put into a paired data plot:
paired.plot <- Blotching %>% # saving as new dataset for presentation later
  ggplot(aes(x = Observation, y = BlotchingTime, group = ID, color = Observation)) +
  geom_line(linewidth = 1, color = "darkgrey", alpha = 0.7) + # Adding the lines to the scatter plot
  geom_point(size = 1.8) +
  theme_minimal(base_size = 13) +
  theme(legend.position = "none") +
  labs(x = "Capture Event",
       y = "Blotching Time (seconds)") +
  scale_color_grey(start = 0.6, end = 0.3)
# Using patchwork package, the 2 graphs can be combined into a single graph:
combined <- box.plot + paired.plot + plot_layout(ncol = 2)
combined
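As an aside, the PairedData package loaded at the start offers a ready-made profile plot of the same before/after comparison; the sketch below is optional and assumes the sharksub columns described above.
# Optional alternative: paired profile plot using the PairedData package
pd <- paired(sharksub$blotch1, sharksub$blotch2) # Builds a paired-data object from the two capture events
plot(pd, type = "profile") + theme_minimal(base_size = 13) # Profile plot; the result is a ggplot, so a theme can be added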
Preliminary test to check paired t-test assumptions:
- Are the 2 samples paired? Yes, both are blotching times measured on the same individual shark.
- Is this a large sample? No (n = 50), so the data will be tested for normality.
Visualising the distribution of the paired differences as a histogram:
sharksub2 <- sharksub %>% # Creating a new data set with a new column containing the difference between blotch 1 and 2.
  mutate(Difference = blotch2 - blotch1)
sharksub2 %>%
  ggplot(aes(x = Difference)) +
  geom_histogram(binwidth = 0.1, fill = "lightgrey", color = "black") + # Creating a histogram
  theme_minimal(base_size = 13) +
  labs(x = "Difference",
       y = "Count")
The histogram hints that the differences may not be normally distributed; this will be checked with the Shapiro-Wilk test.
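A Q-Q plot offers a complementary visual check before running the formal test; the sketch below is an optional extra using ggqqplot() from the ggpubr package loaded earlier, and assumes the Difference column created above.
# Optional: Q-Q plot of the paired differences to visually assess normality
ggqqplot(sharksub2$Difference, ylab = "Difference in Blotching Time (s)")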
# This code creates a new data set with the difference between blotch1 and blotch2, which is then tested for normality in the Shapiro-Wilk test.
difference <- sharksub$blotch1 - sharksub$blotch2
shapiro.test(difference)
Shapiro-Wilk normality test
data: difference
W = 0.43157, p-value = 1.212e-12
As the p-value is below our significance level of 0.05, we conclude that the differences deviate significantly from normality.
Therefore, the Wilcoxon signed-rank test is used.
Wilcoxon Signed-rank Test
# Running of the Wilcoxon Signed-rank test
wilcox.test(sharksub$blotch1, sharksub$blotch2, paired = TRUE)
Wilcoxon signed rank test with continuity correction
data: sharksub$blotch1 and sharksub$blotch2
V = 12, p-value = 1.606e-09
alternative hypothesis: true location shift is not equal to 0
As the p-value is far below 0.05, the null hypothesis is rejected: blotching times differ significantly between the first and second capture events. The blotching times can also be visualised by sex and capture event:
Blotching %>%
  ggplot(aes(x = sex, y = BlotchingTime, fill = Observation)) + # Defining the axes
  geom_boxplot() + # Creating the boxplot
  theme_minimal(base_size = 13) + # Setting the graph theme
  labs(x = "Sex",
       y = "Blotching Time (s)") + # Labelling of the axes
  scale_fill_brewer(palette = "Greys") # Applying a colour scheme
3. Predicting Blotching Time
Setting Up the Model
This question will be answered with a linear regression model, so the response and predictor variables need to be identified.
For this model, the blotch variable from the sharks data is the response variable, as it is what we want to predict.
The predictor variables are the remaining variables in the data: sex, BPM, weight, length, air, water, meta, and depth.
Visualising the Data and Trends
This section was used to visualise if any variables had a relationship with blotching.
Continuous Data
Weight
sharks %>%
  ggplot(aes(x = weight, y = blotch)) + # Defining the variables on the axes
  geom_point(alpha = 0.6) + # Creating a scatter plot
  geom_smooth(method = "lm", se = FALSE) + # Adding a line of best fit to the plot
  theme_minimal(base_size = 13) + # Setting the theme of the graph
  labs(x = "Weight (kg)",
       y = "Blotch Time (seconds)") # Labelling of the axes
`geom_smooth()` using formula = 'y ~ x'
Length
sharks %>%
  ggplot(aes(x = length, y = blotch)) + # Defining the variables on the axes
  geom_point(alpha = 0.6) + # Creating a scatter plot
  geom_smooth(method = "lm", se = FALSE) + # Adding a line of best fit to the plot
  theme_minimal(base_size = 13) + # Setting the theme of the graph
  labs(x = "Length (cm)",
       y = "Blotch Time (seconds)") # Labelling of the axes
`geom_smooth()` using formula = 'y ~ x'
Air and Water Temperature
These variables were plotted against blotch time in question one and were found to have no relationship; therefore, another scatter plot is not made for them.
Meta
sharks %>%
  ggplot(aes(x = meta, y = blotch)) + # Defining the variables on the axes
  geom_point(alpha = 0.6) + # Creating a scatter plot
  geom_smooth(method = "lm", se = FALSE) + # Adding a line of best fit to the plot
  theme_minimal(base_size = 13) + # Setting the theme of the graph
  labs(x = "Cortisol levels (mcg/dl)",
       y = "Blotch Time (seconds)") # Labelling of the axes
`geom_smooth()` using formula = 'y ~ x'
Depth
sharks %>%
  ggplot(aes(x = depth, y = blotch)) + # Defining the variables on the axes
  geom_point(alpha = 0.6) + # Creating a scatter plot
  geom_smooth(method = "lm", se = FALSE, linewidth = 1.2, color = "darkgrey") + # Adding a line of best fit to the plot
  theme_minimal(base_size = 13) + # Setting the theme of the graph
  labs(x = "Depth (m)",
       y = "Blotch Time (seconds)") # Labelling of the axes
`geom_smooth()` using formula = 'y ~ x'
The steep slope of the line of best fit suggests that depth is potentially a good indicator of blotch time, with sharks at greater depths taking longer to blotch than those at shallower depths; a correlation test needs to be run to confirm this.
shapiro.test(sharks$blotch) # Running normality test on Blotch data
Shapiro-Wilk normality test
data: sharks$blotch
W = 0.99695, p-value = 0.4769
shapiro.test(sharks$depth) # Running normality on depth data.
Shapiro-Wilk normality test
data: sharks$depth
W = 0.99746, p-value = 0.6485
cor.test(sharks$depth, sharks$blotch, method = "pearson") # Testing the correlation of two normally distributed variables using Pearson's method
Pearson's product-moment correlation
data: sharks$depth and sharks$blotch
t = 22.772, df = 498, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.6683963 0.7546509
sample estimates:
cor
0.7142247
Categorical data
Sex
Visualising categorical data:
sharks %>%
  ggplot(aes(x = sex, y = blotch, fill = sex)) + # Defining the variable for each axis
  geom_boxplot() + # Creating a boxplot
  labs(x = "Sex",
       y = "Blotch Time (seconds)") + # Labelling of the x- and y-axis
  theme_minimal(base_size = 13) + # Setting the theme of the boxplot
  scale_fill_brewer(palette = "Greys") # Applying a colour scheme
To see whether sex is a reliable predictor of blotching time, the blotching times of males and females need to be tested for a significant difference; this can be done with a t-test if the data are normally distributed, or a Wilcoxon rank-sum test if not:
# Performing Shapiro-Wilk on the male and female data for normality
shapiro.test(sharks$blotch[sharks$sex == "Male"])
Shapiro-Wilk normality test
data: sharks$blotch[sharks$sex == "Male"]
W = 0.99209, p-value = 0.1701
shapiro.test(sharks$blotch[sharks$sex == "Female"])
Shapiro-Wilk normality test
data: sharks$blotch[sharks$sex == "Female"]
W = 0.99527, p-value = 0.682
As the male and female blotch times are normally distributed, a t-test is used:
t.test(blotch ~ sex, data = sharks) # Performing a T-Test
Welch Two Sample t-test
data: blotch by sex
t = -3.0282, df = 494.67, p-value = 0.002589
alternative hypothesis: true difference in means between group Female and group Male is not equal to 0
95 percent confidence interval:
-0.6322714 -0.1346620
sample estimates:
mean in group Female mean in group Male
34.92294 35.30641
Running LM Model
To get an idea of the predictors for blotching time, a simple LM will be run first:
blotch.model <- lm(blotch ~ sex + BPM + weight + length + air + water + meta + depth, data = sharks) # Creating a linear model with blotch time as the response variable, with all the other variables as the predictor variables.
summary(blotch.model) # Viewing the results of the model.
Call:
lm(formula = blotch ~ sex + BPM + weight + length + air + water +
meta + depth, data = sharks)
Residuals:
Min 1Q Median 3Q Max
-2.97715 -0.66193 -0.00841 0.64123 2.90395
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 11.1179728 1.8749828 5.930 5.73e-09 ***
sexMale 0.3088617 0.0890602 3.468 0.00057 ***
BPM -0.0020791 0.0031540 -0.659 0.51009
weight 0.0017281 0.0033143 0.521 0.60231
length 0.0013042 0.0009606 1.358 0.17517
air -0.0310068 0.0315302 -0.983 0.32590
water -0.0143878 0.0268112 -0.537 0.59176
meta -0.0011610 0.0025671 -0.452 0.65127
depth 0.5034077 0.0220870 22.792 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.9912 on 491 degrees of freedom
Multiple R-squared: 0.5256, Adjusted R-squared: 0.5178
F-statistic: 67.99 on 8 and 491 DF, p-value: < 2.2e-16
To reduce overfitting, a model containing only the significant predictors will be used:
blotch.clean <- lm(blotch ~ depth + sex, data = sharks) # Performing a linear model with only the significant predictors
summary(blotch.clean)
Call:
lm(formula = blotch ~ depth + sex, data = sharks)
Residuals:
Min 1Q Median 3Q Max
-2.96224 -0.64833 -0.01317 0.58882 2.98829
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 9.80938 1.10029 8.915 < 2e-16 ***
depth 0.50172 0.02194 22.864 < 2e-16 ***
sexMale 0.30379 0.08871 3.424 0.000667 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.9895 on 497 degrees of freedom
Multiple R-squared: 0.5214, Adjusted R-squared: 0.5195
F-statistic: 270.7 on 2 and 497 DF, p-value: < 2.2e-16
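To check that dropping the non-significant predictors does not cost explanatory power, the full and reduced models can be compared; the sketch below is an optional step using base R functions and assumes the blotch.model and blotch.clean objects created above.
# Optional model comparison: a lower AIC indicates a better trade-off between fit and complexity
AIC(blotch.model, blotch.clean)
# Nested-model ANOVA: tests whether the extra predictors in the full model significantly improve the fit
anova(blotch.clean, blotch.model)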
# Performing an lm model for just sex, to see this variable's R-squared value
sexlm <- lm(blotch ~ sex, data = sharks)
summary(sexlm)
Call:
lm(formula = blotch ~ sex, data = sharks)
Residuals:
Min 1Q Median 3Q Max
-4.1471 -0.9653 -0.0222 0.9889 4.7772
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 34.92294 0.09217 378.885 < 2e-16 ***
sexMale 0.38347 0.12685 3.023 0.00263 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.416 on 498 degrees of freedom
Multiple R-squared: 0.01802, Adjusted R-squared: 0.01605
F-statistic: 9.139 on 1 and 498 DF, p-value: 0.002632
# Performing lm on just the depth variable to see the influence it has on blotch time.
depthlm <- lm(blotch ~ depth, data = sharks)
summary(depthlm)
Call:
lm(formula = blotch ~ depth, data = sharks)
Residuals:
Min 1Q Median 3Q Max
-2.81869 -0.65427 -0.01035 0.58825 2.83116
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 9.82178 1.11207 8.832 <2e-16 ***
depth 0.50467 0.02216 22.772 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1 on 498 degrees of freedom
Multiple R-squared: 0.5101, Adjusted R-squared: 0.5091
F-statistic: 518.6 on 1 and 498 DF, p-value: < 2.2e-16
Checking the assumptions and performance of the LM
Histogram:
# Creating a histogram for the residuals of the model.
hist(blotch.clean$residuals,
breaks = 20, # Setting number of bins
col = "lightgrey", # Fill colour
border = "black", # Boarder colour
xlab = "Residual Values",
ylab = "Frequency", # Labelling of axis
main = "") # Removing preset title
Shapiro-Wilk:
Testing for data normality in the residuals
shapiro.test(blotch.clean$residuals) # Running code to test for normality
Shapiro-Wilk normality test
data: blotch.clean$residuals
W = 0.9978, p-value = 0.7661
The residual plots aid in checking the model's validity, showing whether the residuals are normally distributed and consistent with a linear model.
plot(blotch.clean) # Plotting the residuals
The residuals vs fitted plot shows a reasonably flat, straight trend, which indicates that the residuals are consistent with a linear model and that this is an appropriate test to use. The Q-Q (quantile-quantile) plot identifies that the data follow normality, with the points not straying from the line; this is supported by the Shapiro-Wilk test.
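The car package loaded at the start can provide further, optional diagnostics for the model; the sketch below was not part of the original checks and assumes the blotch.clean model fitted above.
# Optional extra diagnostics from the car package
ncvTest(blotch.clean) # Non-constant variance test for homoscedasticity of the residuals
vif(blotch.clean) # Variance inflation factors to check for multicollinearity between the predictors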
ANOVA
Testing for the overall significance of the model
blotch.anova <- anova(blotch.clean) # Running the ANOVA test on the linear model
blotch.anova # Displaying the results of the ANOVA test
Analysis of Variance Table
Response: blotch
Df Sum Sq Mean Sq F value Pr(>F)
depth 1 518.70 518.70 529.739 < 2.2e-16 ***
sex 1 11.48 11.48 11.727 0.0006671 ***
Residuals 497 486.64 0.98
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
This ANOVA test identifies that depth has the most notable impact on shark blotching time, producing a far larger F value than sex. However, sex remains a significant, if much weaker, predictor of blotching time.
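Finally, to illustrate how the reduced model could be used to predict blotching time for a new shark, here is a minimal sketch with hypothetical input values chosen from within the observed range of the data.
# Hypothetical example: predicted blotch time (with a 95% prediction interval) for a male shark at 50 m depth
predict(blotch.clean,
        newdata = data.frame(depth = 50, sex = "Male"),
        interval = "prediction")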