Analysis of Orange Tree Circumference

Anindito De
6-Feb-2019

Problem Statement

The Orange dataset in R's datasets package provides 35 observations of the age (in days) and circumference (in mm) of orange trees. In this application we build a linear regression model to predict the circumference of an orange tree from its age.
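
Before modelling, it can help to confirm the structure of the data. The lines below are a minimal sketch using only base R; they are an illustration and not part of the original analysis.

```r
# Orange ships with base R (datasets package), so no extra packages are needed.
str(Orange)                                   # columns: Tree, age, circumference
n_obs <- nrow(Orange)                         # number of observations
n_obs
summary(Orange[, c("age", "circumference")])  # ranges of the two variables used
```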

The approach has the following steps:

  • Exploratory analysis to find patterns in the data
  • Building a linear model
  • Validating the model

Exploratory Analysis

Here we plot the circumference against the age.

[Figure: plot of circumference against age]
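
The code behind the figure is not shown in the slides. A comparable plot could be produced along the following lines, assuming a base-graphics scatter plot; the axis labels and title are our own choices, not the author's.

```r
# Assumption: a simple base-graphics scatter plot; the original chunk is not shown.
plot(circumference ~ age, data = Orange,
     xlab = "Age (days)",
     ylab = "Circumference (mm)",
     main = "Orange tree circumference by age")
```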

Fitting Regression Model

Next we determine the correlation coefficient between age and circumference. Then we fit a linear model.

cor(Orange$circumference, Orange$age)
[1] 0.9135189
fit <- lm(circumference ~ age, data = Orange)
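
Once the model is fitted, it can be used for the prediction described in the problem statement. The sketch below assumes a hypothetical tree aged 1000 days; the name new_tree and the chosen age are illustrative only.

```r
fit <- lm(circumference ~ age, data = Orange)

# Estimated intercept (mm) and slope (mm of circumference per day of age).
coef(fit)

# Hypothetical input: a tree that is 1000 days old.
new_tree <- data.frame(age = 1000)

# Point prediction with a 95% prediction interval (columns fit, lwr, upr).
pred <- predict(fit, newdata = new_tree, interval = "prediction")
pred
```

A prediction interval is used rather than a confidence interval because the goal is the circumference of an individual tree, not the mean circumference at that age.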

Validating the Model

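Here we validate the model with an analysis of variance (ANOVA) table. The F statistic of 166.42 on 1 and 33 degrees of freedom has a p-value of about 1.9e-14, so age explains a statistically significant share of the variation in circumference.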

anova(fit)
Analysis of Variance Table

Response: circumference
          Df Sum Sq Mean Sq F value    Pr(>F)    
age        1  93772   93772  166.42 1.931e-14 ***
Residuals 33  18595     563                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
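
The sums of squares in the table also give the proportion of variance explained: 93772 / (93772 + 18595) ≈ 0.8345, which for a single-predictor model equals the squared correlation coefficient. The following sketch recomputes this from the fitted model; it is an added check, not part of the original slides.

```r
fit <- lm(circumference ~ age, data = Orange)
tab <- anova(fit)

# R-squared = regression sum of squares / total sum of squares.
ss <- tab[["Sum Sq"]]
r2 <- ss[1] / sum(ss)
r2

# Both comparisons should agree with the value above.
all.equal(r2, summary(fit)$r.squared)
all.equal(r2, cor(Orange$circumference, Orange$age)^2)
```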