First, the data are plotted.
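# data() only returns the dataset name, so its result is not worth assigning;
# the call itself loads Loblolly into the workspace
data("Loblolly")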
plot(height~age,data=Loblolly)
lm.fit=lm(height~age,data=Loblolly)
lm.fit
##
## Call:
## lm(formula = height ~ age, data = Loblolly)
##
## Coefficients:
## (Intercept) age
## -1.312 2.591
summary(lm.fit)
##
## Call:
## lm(formula = height ~ age, data = Loblolly)
##
## Residuals:
## Min 1Q Median 3Q Max
## -7.0207 -2.1672 -0.4391 2.0539 6.8545
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.31240 0.62183 -2.111 0.0379 *
## age 2.59052 0.04094 63.272 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.947 on 82 degrees of freedom
## Multiple R-squared: 0.9799, Adjusted R-squared: 0.9797
## F-statistic: 4003 on 1 and 82 DF, p-value: < 2.2e-16
plot(height~age,data=Loblolly)
# draw the fitted line straight from the model instead of retyping rounded coefficients
abline(lm.fit, col='purple')
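As a cross-check on the coefficients reported above, the simple-regression estimates can be reproduced from the closed-form least-squares formulas (slope = cov(x, y) / var(x), intercept = mean(y) - slope * mean(x)). This is only an illustrative sketch: the names `fit`, `b0` and `b1` are introduced here and are not part of the original analysis, and `datasets::Loblolly` is used so the block runs on the unmodified data.

```r
# Closed-form ordinary least squares for one predictor (illustrative sketch)
x <- datasets::Loblolly$age
y <- datasets::Loblolly$height

b1 <- cov(x, y) / var(x)        # slope
b0 <- mean(y) - b1 * mean(x)    # intercept

fit <- lm(height ~ age, data = datasets::Loblolly)

# Compare the hand-computed values with what lm() reports
print(c(intercept = b0, slope = b1))
print(coef(fit))
```

Both printouts should agree with the `(Intercept)` and `age` estimates shown in the summary above.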
To predict the height for the 5 new records, the predict.lm function is used:
x<-c(7,12,18,23,28)
predict.lm(lm.fit,data.frame(age=x))
## 1 2 3 4 5
## 16.82127 29.77388 45.31702 58.26964 71.22225
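The point predictions above carry no measure of uncertainty. A possible extension, sketched here under the assumption that a 95% level is wanted, is to ask `predict()` for prediction intervals through its `interval` argument. The names `fit`, `new_ages` and `pred` are introduced only for this example. Note that age 28 lies beyond the largest age in the dataset (25), so that prediction is an extrapolation.

```r
# Prediction intervals for the five new ages (illustrative sketch)
fit <- lm(height ~ age, data = datasets::Loblolly)
new_ages <- data.frame(age = c(7, 12, 18, 23, 28))

# Returns a matrix with columns fit (point prediction), lwr and upr (interval bounds)
pred <- predict(fit, newdata = new_ages, interval = "prediction", level = 0.95)
print(pred)
```

The `fit` column should match the values printed by `predict.lm` above; `lwr` and `upr` bound where a single new observation at each age would be expected to fall.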
y = c(16.82127, 29.77388, 45.31702, 58.26964, 71.22225)
newAge = c(Loblolly$age, x)
newHeight = c(Loblolly$height, y)
plot(newHeight~newAge)
lm.fit=lm(newHeight~newAge)
lm.fit
##
## Call:
## lm(formula = newHeight ~ newAge)
##
## Coefficients:
## (Intercept) newAge
## -1.312 2.591
summary(lm.fit)
##
## Call:
## lm(formula = newHeight ~ newAge)
##
## Residuals:
## Min 1Q Median 3Q Max
## -7.0207 -2.1392 -0.2702 1.9472 6.8545
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.31240 0.59217 -2.216 0.0293 *
## newAge 2.59052 0.03836 67.527 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.861 on 87 degrees of freedom
## Multiple R-squared: 0.9813, Adjusted R-squared: 0.9811
## F-statistic: 4560 on 1 and 87 DF, p-value: < 2.2e-16
plot(newHeight~newAge)
# draw the refitted line straight from the model object
abline(lm.fit, col='purple')
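The coefficients do not change (-1.312 and 2.591 in both fits) because the five added points were generated from the fitted line itself: they lie on it, contribute essentially zero residual, and so the refit reproduces the same line. Only the uncertainty estimates move slightly: the residual standard error drops from 2.947 to 2.861 and the standard errors of the coefficients shrink, since the same residual variation is now spread over 87 instead of 82 degrees of freedom.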
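The argument above can be checked numerically. The following sketch refits the model after appending points taken from the fitted line and compares coefficients and residual sums of squares. The names `d`, `fit`, `new_pts` and `fit2` are introduced here for illustration, and the predicted heights are computed rather than typed in by hand.

```r
# Adding points that lie on the fitted line should leave the fit unchanged (sketch)
d <- data.frame(age = datasets::Loblolly$age, height = datasets::Loblolly$height)
fit <- lm(height ~ age, data = d)

new_age <- c(7, 12, 18, 23, 28)
new_pts <- data.frame(
  age = new_age,
  height = unname(predict(fit, data.frame(age = new_age)))  # points on the line
)

fit2 <- lm(height ~ age, data = rbind(d, new_pts))

print(all.equal(coef(fit), coef(fit2)))      # same intercept and slope
print(c(sum(resid(fit)^2), sum(resid(fit2)^2)))  # same residual sum of squares
print(c(df.residual(fit), df.residual(fit2)))    # but more degrees of freedom
```

Because the residual sum of squares is unchanged while the degrees of freedom grow, the residual standard error of the second fit comes out smaller.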
Loblolly$height[44] = Loblolly$height[44] * 100
Loblolly$age[44] = Loblolly$age[44] * 100
plot(height~age,data=Loblolly)
lm.fit=lm(height~age,data=Loblolly)
lm.fit
##
## Call:
## lm(formula = height ~ age, data = Loblolly)
##
## Coefficients:
## (Intercept) age
## 4.878 2.114
summary(lm.fit)
##
## Call:
## lm(formula = height ~ age, data = Loblolly)
##
## Residuals:
## Min 1Q Median 3Q Max
## -7.761 -4.994 1.302 3.611 8.657
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.878332 0.555407 8.783 1.94e-13 ***
## age 2.114216 0.009808 215.555 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.799 on 82 degrees of freedom
## Multiple R-squared: 0.9982, Adjusted R-squared: 0.9982
## F-statistic: 4.646e+04 on 1 and 82 DF, p-value: < 2.2e-16
plot(height~age,data=Loblolly)
# draw the line fitted to the data containing the exaggerated observation
abline(lm.fit, col='purple')
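Both coefficients change substantially, although not in the same direction: the intercept rises from -1.312 to 4.878, while the slope falls from 2.591 to 2.114. This is due to the exaggeration of a single observation in the dataset (row 44, whose height and age were both multiplied by 100), which lies far from the rest of the data and pulls the fitted line toward itself. The residual standard error also grows, from 2.947 to 4.799.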
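One way to make the influence of the exaggerated row visible is to inspect leverage values. The sketch below rebuilds the modified data on a copy (so the block is self-contained) and uses `hatvalues()` to locate the most influential position. The names `d_out`, `fit_out`, `lev` and `fit_drop` are introduced here for illustration only.

```r
# Leverage diagnostic for the exaggerated observation (illustrative sketch)
d_out <- data.frame(age = datasets::Loblolly$age, height = datasets::Loblolly$height)
d_out$height[44] <- d_out$height[44] * 100
d_out$age[44] <- d_out$age[44] * 100

fit_out <- lm(height ~ age, data = d_out)

# Leverage measures how far each observation's age is from the bulk of the ages
lev <- hatvalues(fit_out)
print(which.max(lev))   # position of the highest-leverage observation
print(max(lev))

# Refit without that row to see how much it alone drives the coefficients
fit_drop <- lm(height ~ age, data = d_out[-44, ])
print(rbind(with_row_44 = coef(fit_out), without_row_44 = coef(fit_drop)))
```

A leverage close to 1 means the fitted line is almost forced through that point, which is consistent with the shift in coefficients described above.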