# Scatterplot of total plant species against island area, with the default loess smoother
ggplot(ChannelIslands, aes(x = Area, y = Total)) +
  geom_point(aes(color = Area, size = Area)) +
  geom_smooth() +
  xlab("Island Area (km²)") +
  ylab("Total Plant Species") +
  ggtitle("Total Species on Land Area")
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
The plot suggests a positive correlation between island land area and total plant species. As island area increases, the number of plant species tends to increase as well.
Santa Barbara has the fewest species. It is also the smallest island.
Santa Cruz has the most species. It is also the largest island.
The smoothed (loess) line trends upward, indicating a positive and roughly linear relationship between the two variables.
The slope coefficient is 1.2376, meaning each additional km² of land area is associated with about 1.24 more native species. It can be concluded that larger land area correlates with more native species.
The model predicts around 125 native species on an island with a size of 0.0 km² because the intercept is 124.8303. This is an extrapolation of the fitted line rather than a realistic expectation.
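To make the slope and intercept concrete, here is a minimal sketch that turns the two reported coefficients into predictions. The helper name `predict_native` and the example areas are hypothetical and chosen only for illustration; they are not islands from the data set.

```r
# Coefficients as reported in the text (model of Native on Area).
intercept <- 124.8303
slope <- 1.2376

# Hypothetical helper: predicted native species count for an area in km².
predict_native <- function(area_km2) {
  intercept + slope * area_km2
}

# Illustrative areas only; these values are assumptions, not observed islands.
example_areas <- c(0, 10, 100)
predictions <- predict_native(example_areas)

print(data.frame(area_km2 = example_areas, predicted_native = predictions))
```

With the fitted model object available, `predict(mymodel, newdata = data.frame(Area = example_areas))` should give the same values up to rounding of the coefficients.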
The t-value for the slope is positive, which is consistent with a positive relationship between land area and number of species. The Pr(>|t|) is less than 0.05, meaning the null hypothesis that the slope is zero (no linear relationship between land area and number of species) can be rejected.
The R-squared value is 0.9033, meaning 90.33% of the variation in native species richness is explained by land area and the rest is left to other factors.
The total sum of squares is 142434.9.
The value of the error sum of squares is 13767.48.
The proportionate reduction of SSE relative to SSY is 0.9033419.
The ratio of the regression mean square to the error mean square is 56.07448, which corresponds to the model's F-statistic.
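The sums of squares above can be checked with a few lines of arithmetic. This is a sketch using only the values reported in the text; the sample size `n <- 8` is an assumption inferred from the 6 residual degrees of freedom of a two-parameter model.

```r
# Sums of squares as reported in the text.
ssy <- 142434.9   # total sum of squares
sse <- 13767.48   # error (residual) sum of squares

# Assumption: 8 islands, inferred from 6 residual degrees of freedom.
n <- 8

# Proportionate reduction in error, which equals R-squared.
r_squared <- (ssy - sse) / ssy

# F-statistic: regression mean square (1 df) over error mean square (n - 2 df).
f_stat <- ((ssy - sse) / 1) / (sse / (n - 2))

print(round(r_squared, 4))
print(round(f_stat, 2))
```

Both results should agree with the R-squared and F-statistic discussed above, up to rounding of the reported sums of squares.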
# Native (green squares), endemic (blue circles), and exotic (red triangles) richness against area
ggplot(data=ChannelIslands) +
  geom_point(mapping=aes(x=Area, y=Native), color="forestgreen", shape=15, size=2.5) +
  geom_smooth(mapping=aes(x=Area, y=Native), color="forestgreen") +
  geom_point(mapping=aes(x=Area, y=Endemic), color="dodgerblue", shape=16, size=2.5) +
  geom_smooth(mapping=aes(x=Area, y=Endemic), color="dodgerblue") +
  geom_point(mapping=aes(x=Area, y=Exotic), color="firebrick1", shape=17, size=2.5) +
  geom_smooth(mapping=aes(x=Area, y=Exotic), color="firebrick1") +
  theme_gray()
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
Native species have the steepest positive slope, exotic species have a more moderate positive slope, and endemic species have the shallowest slope.
Residuals <- mymodel$residuals
Distance <- ChannelIslands$Dist
Island <- ChannelIslands$Island
ResidData <- data.frame(Island, Residuals, Distance)
ggplot(data=ResidData, aes(x=Distance, y=Residuals)) +
  geom_point() +
  geom_smooth(method="lm")
## `geom_smooth()` using formula = 'y ~ x'
The plot shows the relationship between the residuals from the model of Native species on Area and each island's distance from the mainland. The negative slope shows that as an island gets farther from the mainland, it has fewer native species than expected based on the size of the island. The t value and p-value suggest a statistically significant relationship, and the R-squared reveals that distance from the mainland explains about 71.51% of the variation in the residuals.
The model suggests a negative relationship between the residuals and island distance from the mainland. As distance from the mainland increases, observed native richness declines relative to the richness predicted by the model of Native species on Area.
resid_model <- lm(Residuals ~ Distance, data=ResidData)
summary(resid_model)
##
## Call:
## lm(formula = Residuals ~ Distance, data = ResidData)
##
## Residuals:
## Min 1Q Median 3Q Max
## -37.849 -18.320 8.098 15.904 29.724
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 71.3151 20.4815 3.482 0.01311 *
## Distance -1.4052 0.3621 -3.880 0.00817 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 25.57 on 6 degrees of freedom
## Multiple R-squared: 0.7151, Adjusted R-squared: 0.6676
## F-statistic: 15.06 on 1 and 6 DF, p-value: 0.008167
The t value is -3.880 and the Pr(>|t|) is 0.00817. The t value and Pr(>|t|) suggest a statistically significant relationship, and the R-squared reveals that distance from the mainland explains about 71.51% of the variation in the residuals. The slope is -1.4052, showing a negative relationship between the two variables.
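The figures in the summary table are tied together by simple relationships, which the sketch below reproduces from the rounded values printed above. The variable names are illustrative, and `break_even` (the distance at which the fitted residual is zero) is an added quantity that is not part of the original output; it is expressed in whatever units `Dist` uses.

```r
# Values copied from the printed coefficient table (rounded as shown there).
intercept <- 71.3151
estimate  <- -1.4052   # slope for Distance
std_error <- 0.3621
resid_df  <- 6         # residual degrees of freedom from the summary

# t value is the estimate divided by its standard error.
t_value <- estimate / std_error

# Two-sided p-value from the t distribution.
p_value <- 2 * pt(-abs(t_value), df = resid_df)

# With a single predictor, the F-statistic equals the squared t value.
f_from_t <- t_value^2

# Distance at which the fitted line crosses zero, i.e. where the area-only
# model neither over- nor under-predicts on average.
break_even <- -intercept / estimate

print(c(t = t_value, p = p_value, F = f_from_t, break_even = break_even))
```

Small differences from the printed summary are expected because the inputs here are already rounded.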
90.33% of total variance in Native Richness was explained by Area and 71.51% of the remaining variance was explained by Distance.
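As a rough check on how the two steps combine, the arithmetic can be written out. This assumes the two R-squared values are chained (area first, then distance fitted to the leftover variation), which is not necessarily the same as the R-squared of a single multiple regression on both predictors.

```latex
\[
R^2_{\text{combined}} \approx 0.9033 + 0.7151 \times (1 - 0.9033)
                      \approx 0.9033 + 0.0692
                      \approx 0.972
\]
```

Under that assumption, area and distance together account for roughly 97% of the variation in native richness.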