Predicting Volume of a Tree

A Project for 'Developing Data Products' Course

Avinayan

Predicting Tree Volume

  • The aim of this exercise is to predict the volume of a tree from its girth and its height.
  • If a tree were a perfect cone, one could use the geometric formula for a cone to calculate its volume. But real trees are not perfect cones.
  • In this exercise, we use the trees dataset from the R 'datasets' package. This dataset contains 31 observations of trees with their girth, height and volume.
  • Pitch:
    1. Let's see if building a linear model on this dataset gives us a better predictor of tree volume than the cone formula.
    2. Once we have the model, let's see if we can convert it into an interactive application that predicts the tree volume using girth and height.
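
As a rough illustration of the cone baseline mentioned above, the sketch below applies the cone formula V = (1/3) * pi * r^2 * h to the trees data. It assumes the units given in the dataset's help page (?trees): Girth is a diameter in inches, Height is in feet and Volume is in cubic feet. The names radius_ft, cone_volume, comparison and cone_rmse are illustrative and not part of the original project.

```r
# Cone baseline for the built-in trees data.
# Assumption (from ?trees): Girth is a diameter in inches, Height is in feet,
# Volume is in cubic feet.
treesdata <- trees

# Convert the diameter in inches to a radius in feet: halve it, then divide by 12.
radius_ft <- treesdata$Girth / 2 / 12

# Volume of a cone: (1/3) * pi * r^2 * h.
cone_volume <- (1 / 3) * pi * radius_ft^2 * treesdata$Height

# Put the geometric estimate next to the measured volume.
comparison <- data.frame(
  measured = treesdata$Volume,
  cone     = round(cone_volume, 1)
)
head(comparison)

# Root mean squared error of the cone estimate, for later comparison
# with the residual standard error of a fitted model.
cone_rmse <- sqrt(mean((treesdata$Volume - cone_volume)^2))
cone_rmse
```

The error of this baseline gives a reference point for judging whether a fitted model predicts volume better than pure geometry.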

Linear Model for Tree Volume

  • From the geometric formula we know that the volume of the tree is proportional to:
    1. The square of the radius, which is represented here by the girth.
    2. The height of the tree.
  • So we fit a linear model with the squared girth and the height as the explanatory variables.
  • The model fit results are shown on the next slide.
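# Work on a copy of the built-in trees dataset
treesdata <- trees
# Add the squared girth as a new column
treesdata$GirthSquared <- treesdata$Girth^2
# Fit volume as a linear function of squared girth and height
modelfit <- lm(Volume ~ GirthSquared + Height, data = treesdata)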

Linear Model - Results

summary(modelfit)
## 
## Call:
## lm(formula = Volume ~ GirthSquared + Height, data = treesdata)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.8844 -2.2105  0.1196  2.6134  4.2404 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -27.511603   6.557697  -4.195 0.000248 ***
## GirthSquared   0.168458   0.006679  25.222  < 2e-16 ***
## Height         0.348809   0.093152   3.744 0.000830 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.799 on 28 degrees of freedom
## Multiple R-squared:  0.9729, Adjusted R-squared:  0.971 
## F-statistic: 503.2 on 2 and 28 DF,  p-value: < 2.2e-16
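
The key figures in the summary above can also be read from the summary object rather than from the printed text. The sketch below refits the same model so that it runs on its own. The names fit_summary, r_squared, adj_r_squared, residual_se and p_values are illustrative.

```r
# Refit the model from the previous slide so this block is self-contained.
treesdata <- trees
treesdata$GirthSquared <- treesdata$Girth^2
modelfit <- lm(Volume ~ GirthSquared + Height, data = treesdata)

# summary() on an lm fit returns a list whose components can be accessed directly.
fit_summary <- summary(modelfit)

r_squared     <- fit_summary$r.squared      # multiple R-squared
adj_r_squared <- fit_summary$adj.r.squared  # adjusted R-squared
residual_se   <- fit_summary$sigma          # residual standard error

# The coefficient table is a matrix; pick out the p-value column.
p_values <- coef(fit_summary)[, "Pr(>|t|)"]

round(c(r_squared = r_squared, adj_r_squared = adj_r_squared,
        residual_se = residual_se), 4)
p_values
```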

Next we move to the conclusions

Conclusion

  • As the R-squared value of the model shows, about 97% of the variance in tree volume is explained. Both explanatory variables also have low p-values (below 0.001).
  • Hence, overall the linear model is a good predictor of tree volume from the girth and the height of the tree.
  • We can then use the predict function in R to predict the tree volume for different values of girth and height.
  • A Shiny application that performs this prediction interactively was built and is available here
  • Interacting with this tool shows that the girth has a bigger impact on the volume of the tree than the height does.
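
The prediction step and the girth-versus-height comparison described above might look like the following sketch. It refits the model so that it runs on its own. The helper predict_volume, the function run_tree_app and the slider ranges are hypothetical and chosen for illustration. This is not the code of the Shiny application referred to above, only one way such an app could be wired up, and it assumes the shiny package is installed if run_tree_app() is called.

```r
# Refit the model from the slides so this block is self-contained.
treesdata <- trees
treesdata$GirthSquared <- treesdata$Girth^2
modelfit <- lm(Volume ~ GirthSquared + Height, data = treesdata)

# Hypothetical helper: predict volume from raw girth and height values.
# The model expects the squared girth, so it is computed here.
predict_volume <- function(girth, height) {
  newdata <- data.frame(GirthSquared = girth^2, Height = height)
  unname(predict(modelfit, newdata = newdata))
}

# Example prediction for a tree inside the observed range of the data.
predict_volume(girth = 12, height = 75)

# Change in predicted volume when one variable moves across its observed
# range while the other is held at its median.
girth_effect  <- diff(predict_volume(range(treesdata$Girth),
                                     median(treesdata$Height)))
height_effect <- diff(predict_volume(median(treesdata$Girth),
                                     range(treesdata$Height)))
c(girth = girth_effect, height = height_effect)

# Sketch of an interactive front end. It is only defined here, not run,
# and needs the shiny package when called.
run_tree_app <- function() {
  ui <- shiny::fluidPage(
    shiny::sliderInput("girth", "Girth", min = 8, max = 21,
                       value = 12, step = 0.1),
    shiny::sliderInput("height", "Height", min = 60, max = 90, value = 75),
    shiny::textOutput("volume")
  )
  server <- function(input, output) {
    output$volume <- shiny::renderText({
      paste("Predicted volume:",
            round(predict_volume(input$girth, input$height), 1))
    })
  }
  shiny::shinyApp(ui = ui, server = server)
}
```

Calling run_tree_app() in an interactive R session would launch the sketch. Comparing girth_effect with height_effect gives a numeric counterpart to what the sliders show.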