Weighted least squares (WLS) regression is used to model data in which the variance of the response is not constant, for example when the spread of the response grows as the predictor increases. Such data produce a “fan-shaped” scatter plot, as shown below.
In R, this is handled by passing one weight per case to lm() through its weights argument.
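As a rough sketch of what the weights do (the symbols are notation introduced here for illustration, not taken from the original text): with a weight $w_i$ on case $i$, the fit minimizes a weighted sum of squared residuals. A common choice, and the one used in the small example further down via `w = 1/sd^2`, is inverse-variance weighting, where $\sigma_i$ is the assumed standard deviation of observation $i$.

```latex
(\hat{\beta}_0, \hat{\beta}_1)
  = \arg\min_{\beta_0,\,\beta_1} \sum_{i=1}^{n} w_i \,\bigl(y_i - \beta_0 - \beta_1 x_i\bigr)^2,
\qquad w_i = \frac{1}{\sigma_i^2}
```

Setting every $w_i = 1$ recovers ordinary least squares, so noisier observations (large $\sigma_i$) simply pull on the line less than precise ones.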
# Example data in which the spread of y widens as x increases
x = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21)
y = c(1, 2, 3, 4, 6, 7, 7, 7, 11, 9, 13, 14, 14, 14, 12, 18, 16, 17, 21, 16, 22, 10, 11, 11, 17, 14, 16, 15, 17, 16, 24, 21, 8, 12, 16, 12, 18, 14, 13, 26, 18, 17, 12, 15)
plot(y ~ x, cex = 1.3, main = 'Fan Shaped Scatter Plot')
Reference: https://www.youtube.com/watch?v=TIHSZPdLpa4
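The fan shape can also be checked numerically rather than by eye. The following is a minimal sketch, not part of the original post: it fits an ordinary regression to the fan-shaped data and compares the residual variance in the lower and upper parts of the x range. The names `fit`, `res`, `split_at`, `var_low` and `var_high` are introduced here for illustration, and splitting at the median of x is an arbitrary choice.

```r
# Same data as the fan-shaped scatter plot
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21)
y <- c(1, 2, 3, 4, 6, 7, 7, 7, 11, 9, 13, 14, 14, 14, 12, 18, 16, 17, 21, 16, 22, 10, 11, 11, 17, 14, 16, 15, 17, 16, 24, 21, 8, 12, 16, 12, 18, 14, 13, 26, 18, 17, 12, 15)

fit <- lm(y ~ x)   # ordinary (unweighted) fit
res <- resid(fit)

# Compare the residual spread below and above an illustrative cut point
split_at <- median(x)
var_low  <- var(res[x <= split_at])
var_high <- var(res[x >  split_at])
c(low = var_low, high = var_high)

# Residuals vs. fitted values: a widening band suggests non-constant variance
plot(fitted(fit), res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
```

A clearly larger variance in the upper group is the numerical counterpart of the fan shape, and it is the situation in which weighting the cases becomes worth considering.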
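# Five observations, each with its own standard deviation
x = 1:5
y = c(1.1, 2.5, 3.4, 3.8, 7)
sd = c(0.3, 0.2, 0.2, 0.1, 0.5)
# Inverse-variance weights: more precise observations count for more
w = 1/sd^2
plot(x, y)
model1 = lm(y ~ x)               # ordinary least squares
model2 = lm(y ~ x, weights = w)  # weighted least squares
plot(y ~ x, cex = 1.3, main = 'Weighted vs. Simple Linear regression')
lines(x, predict(model1), col = "red", lwd = 2)   # unweighted fit
lines(x, predict(model2), col = "blue", lwd = 2)  # weighted fit
# Model summaries
summary(model1)
summary(model2)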
##
## Call:
## lm(formula = y ~ x)
##
## Residuals:
##     1     2     3     4     5
##  0.16  0.25 -0.16 -1.07  0.82
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  -0.3700     0.8414  -0.440   0.6899
## x             1.3100     0.2537   5.163   0.0141 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.8023 on 3 degrees of freedom
## Multiple R-squared: 0.8989, Adjusted R-squared: 0.8651
## F-statistic: 26.66 on 1 and 3 DF, p-value: 0.01409
##
## Call:
## lm(formula = y ~ x, weights = w)
##
## Weighted Residuals:
##      1      2      3      4      5
## -1.014  1.184  1.389 -1.812  4.320
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   0.5453     0.8142   0.670   0.5510
## x             0.8590     0.2319   3.705   0.0342 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.961 on 3 degrees of freedom
## Multiple R-squared: 0.8206, Adjusted R-squared: 0.7608
## F-statistic: 13.72 on 1 and 3 DF, p-value: 0.03417
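The weighted coefficients and the "Weighted Residuals" reported above can be reproduced by hand, which makes it clearer what the weights argument is doing. This is a minimal sketch, assuming the weights act through the usual weighted normal equations. The names `sd_obs` (called `sd` in the example), `X`, `W`, `beta` and `wres` are introduced here for illustration.

```r
# Data and weights from the five-point example
x <- 1:5
y <- c(1.1, 2.5, 3.4, 3.8, 7)
sd_obs <- c(0.3, 0.2, 0.2, 0.1, 0.5)  # per-observation standard deviations
w <- 1 / sd_obs^2

X <- cbind(Intercept = 1, x = x)  # design matrix
W <- diag(w)                      # diagonal weight matrix

# Weighted normal equations: (X' W X) beta = X' W y
beta <- solve(t(X) %*% W %*% X, t(X) %*% W %*% y)

# Compare with lm()
fit <- lm(y ~ x, weights = w)
all.equal(as.numeric(beta), as.numeric(coef(fit)))

# Weighted residuals are the raw residuals scaled by sqrt(w)
wres <- sqrt(w) * (y - as.numeric(X %*% beta))
round(wres, 3)
```

Because the residuals are scaled by sqrt(w), the weighted model's residual standard error (2.961) is on a different scale from the unweighted one (0.8023), so the two figures are not directly comparable.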