library(TeachingDemos)
x=c(2,2,4,5,6,7,8,9,10)
y=c(7,8,6,7,4,6,4,6,3)
plot(x, y)
Equation of the least-squares line: y = -0.44x + 8.25 (obtained with put.points.demo)
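As a cross-check on the equation read from put.points.demo, the same line can be fitted non-interactively with lm(). This is only a minimal sketch; the name fit_ls is introduced here for illustration and is not part of the original work.

```r
# Original nine points (same values as above)
x <- c(2, 2, 4, 5, 6, 7, 8, 9, 10)
y <- c(7, 8, 6, 7, 4, 6, 4, 6, 3)

# Ordinary least-squares fit of y on x
# (fit_ls is a hypothetical name used only in this sketch)
fit_ls <- lm(y ~ x)

# Intercept and slope rounded to two decimals,
# for comparison with y = -0.44x + 8.25
round(coef(fit_ls), 2)
```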
Point I added: (15,20)
x=c(2,2,4,5,6,7,8,9,10,15)
y=c(7,8,6,7,4,6,4,6,3,20)
plot(x, y)
New least-squares fit: y = 0.62x + 2.86 (obtained with put.points.demo)
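The new equation can also be checked by hand with the closed-form least-squares formulas, slope = Sxy / Sxx and intercept = mean(y) - slope * mean(x). The sketch below applies them to the ten points; the names x_all, y_all, sxy, sxx, slope and intercept are assumptions made for this illustration.

```r
# Ten points: the original nine plus the added point (15, 20)
# (x_all and y_all are hypothetical names used only in this sketch)
x_all <- c(2, 2, 4, 5, 6, 7, 8, 9, 10, 15)
y_all <- c(7, 8, 6, 7, 4, 6, 4, 6, 3, 20)

# Sums of cross-products and squares about the means
sxy <- sum((x_all - mean(x_all)) * (y_all - mean(y_all)))
sxx <- sum((x_all - mean(x_all))^2)

# Closed-form least-squares estimates
slope <- sxy / sxx
intercept <- mean(y_all) - slope * mean(x_all)

# Rounded for comparison with y = 0.62x + 2.86
round(c(intercept = intercept, slope = slope), 2)
```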
x <- c(2,2,4,5,6,7,8,9,10)
y <- c(7,8,6,7,4,6,4,6,3)
x_new <- c(x, 15)
y_new <- c(y, 20)
plot(x_new, y_new,
     pch=19, col="black",
     xlab="x", ylab="y",
     main="Effect of Outlier on Least-Squares Line")
abline(a=8.25, b=-0.44, col="blue", lwd=2)
abline(a=2.86, b=0.62, col="red", lwd=2)
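The two lines above are drawn from hand-typed coefficients. A possible alternative, sketched here under the assumption that both lines are ordinary least-squares fits, is to pass the fitted lm objects to abline() and label the lines with a legend. The names fit_orig and fit_out are hypothetical.

```r
# Data without and with the added point (15, 20)
x <- c(2, 2, 4, 5, 6, 7, 8, 9, 10)
y <- c(7, 8, 6, 7, 4, 6, 4, 6, 3)
x_new <- c(x, 15)
y_new <- c(y, 20)

# Least-squares fits (fit_orig and fit_out are hypothetical names)
fit_orig <- lm(y ~ x)          # original nine points
fit_out  <- lm(y_new ~ x_new)  # ten points, including the outlier

plot(x_new, y_new, pch = 19, xlab = "x", ylab = "y",
     main = "Effect of Outlier on Least-Squares Line")

# abline() accepts a fitted model and draws its regression line
abline(fit_orig, col = "blue", lwd = 2)
abline(fit_out, col = "red", lwd = 2)
legend("topleft",
       legend = c("Without (15, 20)", "With (15, 20)"),
       col = c("blue", "red"), lwd = 2)
```

Drawing from the model objects keeps the plot in step with the data if the points change.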
The least-squares line changed dramatically after adding the outlier: the slope went from -0.44 to +0.62. This shows that the least-squares line, unlike a resistant line, is strongly influenced by a single extreme point.
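In the plot, the blue line is the least-squares fit to the original nine points and the red line is the least-squares fit after adding (15, 20). The outlier pulls the red line upward, away from the main cluster of points, which the blue line still follows. A resistant line is designed to stay close to the main cluster in this situation, which is why resistant fitting is less sensitive to outliers.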
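The resistant line itself is not computed in the code above. One way to obtain it, assuming "resistant line" refers to Tukey's resistant line as implemented by line() in the stats package, is sketched below. The names fit_ls and fit_res are hypothetical.

```r
# Ten points, including the added point (15, 20)
x_new <- c(2, 2, 4, 5, 6, 7, 8, 9, 10, 15)
y_new <- c(7, 8, 6, 7, 4, 6, 4, 6, 3, 20)

# Least-squares fit, for reference
fit_ls <- lm(y_new ~ x_new)

# Tukey's resistant line from the stats package
# (assumed here to be the "resistant line" the text refers to)
fit_res <- line(x_new, y_new)

# Intercept and slope of each fit
coef(fit_ls)
coef(fit_res)
```

If a plot is already open, abline(coef(fit_res)) adds the resistant line to it.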
library(MASS)
fit_rlm <- rlm(y_new ~ x_new)
fit_lqs <- lqs(y_new ~ x_new)
summary(fit_rlm)
##
## Call: rlm(formula = y_new ~ x_new)
## Residuals:
##     Min      1Q  Median      3Q     Max
## -2.7003 -1.3976  0.1681  0.9215 14.6286
##
## Coefficients:
##             Value   Std. Error t value
## (Intercept)  6.3581  1.3597     4.6760
## x_new       -0.0658  0.1750    -0.3760
##
## Residual standard error: 2.035 on 8 degrees of freedom
summary(fit_lqs)
##               Length Class      Mode
## crit           1     -none-     numeric
## sing           1     -none-     character
## coefficients   2     -none-     numeric
## bestone        2     -none-     numeric
## fitted.values 10     -none-     numeric
## residuals     10     -none-     numeric
## scale          2     -none-     numeric
## terms          3     terms      call
## call           2     -none-     call
## xlevels        0     -none-     list
## model          2     data.frame list
When fitted to the new data, both rlm and lqs give results similar to the resistant line and very different from the least-squares line; for example, rlm estimates a slope of about -0.07, compared with +0.62 for least squares. This demonstrates that all three methods (resistant line, rlm, lqs) are robust against outliers.
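Because summary() on an lqs fit only lists the components of the object, the comparison is easier to read if the coefficients are extracted directly with coef(). The sketch below puts the three fits side by side; the names fit_ls and coefs are assumptions made for this illustration, and the seed is set only in case lqs falls back to random resampling.

```r
library(MASS)

# Ten points, including the added point (15, 20)
x_new <- c(2, 2, 4, 5, 6, 7, 8, 9, 10, 15)
y_new <- c(7, 8, 6, 7, 4, 6, 4, 6, 3, 20)

# Fix the seed so repeated runs agree if lqs uses random resampling
set.seed(1)

fit_ls  <- lm(y_new ~ x_new)   # least squares (hypothetical name)
fit_rlm <- rlm(y_new ~ x_new)  # robust M-estimation
fit_lqs <- lqs(y_new ~ x_new)  # resistant regression

# One row per method: intercept and slope
coefs <- rbind(lm  = coef(fit_ls),
               rlm = coef(fit_rlm),
               lqs = coef(fit_lqs))
coefs
```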