Homoskedasticity is a term in statistics indicating that the variance of the errors is constant across a sample. In other words, in a homoskedastic sample the variance of the errors does not increase (or decrease) as the explanatory variable increases.

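\(\operatorname{Var}(A_i \mid X_i) = \sigma^2\)

Here \(A_i\) is the error term for observation \(i\), \(X_i\) is the value of the explanatory variable for that observation, and \(\sigma^2\) is a constant that does not depend on \(X_i\).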

For example, suppose you took a sample of cars and recorded their mileage, some with very high mileage and others with very low mileage. For the sample to be considered homoskedastic, the size of the error of each observation relative to the line of best fit would need to be about the same, regardless of how high or low its mileage is.
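The difference can be illustrated with a small simulation. This is only a sketch on made-up data: the sample size, the seed, and the two standard deviations are arbitrary assumptions chosen for illustration, not values from the example in this note. It draws one set of errors with a constant spread and one whose spread grows with x, then compares the error variance in the upper half of x with the variance in the lower half.

```r
# Simulated data (hypothetical values, chosen only for illustration)
set.seed(1)
n <- 1000
x <- runif(n, min = 1, max = 10)

# Homoskedastic errors: the standard deviation is the same for every x
e_homo <- rnorm(n, mean = 0, sd = 2)

# Heteroskedastic errors: the standard deviation grows with x
e_hetero <- rnorm(n, mean = 0, sd = 0.5 * x)

# Split the sample at the median of x and compare the error variances
low <- x < median(x)
ratio_homo   <- var(e_homo[!low])   / var(e_homo[low])
ratio_hetero <- var(e_hetero[!low]) / var(e_hetero[low])

# A ratio near 1 is what homoskedasticity looks like;
# a ratio well above 1 means the variance increases with x
print(c(homoskedastic = ratio_homo, heteroskedastic = ratio_hetero))
```

For the homoskedastic errors the ratio should come out close to 1, while for the heteroskedastic errors it should be clearly larger than 1.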

Here is an example of a homoskedastic graph, using R's built-in cars dataset:

summary(cars)
##      speed           dist       
##  Min.   : 4.0   Min.   :  2.00  
##  1st Qu.:12.0   1st Qu.: 26.00  
##  Median :15.0   Median : 36.00  
##  Mean   :15.4   Mean   : 42.98  
##  3rd Qu.:19.0   3rd Qu.: 56.00  
##  Max.   :25.0   Max.   :120.00

## 
## Call:
## lm(formula = dist ~ log(speed), data = cars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -26.501 -11.273  -4.466   8.593  53.020 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -80.822     16.004  -5.050 6.79e-06 ***
## log(speed)    46.507      5.942   7.827 4.02e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 17.26 on 48 degrees of freedom
## Multiple R-squared:  0.5607, Adjusted R-squared:  0.5516 
## F-statistic: 61.27 on 1 and 48 DF,  p-value: 4.02e-10
## 
##  studentized Breusch-Pagan test
## 
## data:  dist ~ speed
## BP = 3.2149, df = 1, p-value = 0.07297

As you can see, the points above and below the fit line are spread about equally along its whole length. The opposite of homoskedastic is heteroskedastic, which is when the variance of the error terms is not constant but changes with the value of one or more of the independent variables.
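One common way to look at this spread is to plot the residuals against the fitted values. The plotting code is not shown in this note, so the following is a sketch that assumes the model from the regression output above (dist ~ log(speed)); the axis labels and title are my own.

```r
# Refit the model shown in the regression output above
fit <- lm(dist ~ log(speed), data = cars)

# Residuals: observed dist minus the value predicted by the fit line
res <- resid(fit)

# Residuals against fitted values; under homoskedasticity the vertical
# spread around the zero line should look roughly the same from left to right
plot(fitted(fit), res,
     xlab = "Fitted values of dist",
     ylab = "Residuals",
     main = "Residuals vs fitted values")
abline(h = 0, lty = 2)
```

A funnel shape in this plot, with residuals fanning out as the fitted values grow, would point towards heteroskedasticity.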

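To check for homoskedasticity we simply run the Breusch-Pagan test, using the bptest() function from the lmtest package. The null hypothesis of this test is that the errors are homoskedastic. As you can see in the output above, the p-value (0.07297) is greater than 0.05, so we fail to reject the null hypothesis at the 5% level. This means the test finds no significant evidence of heteroskedasticity; it is not proof of homoskedasticity, and a p-value this close to 0.05 is a borderline result. Note also that the test shown was run on dist ~ speed, whereas the regression summary above uses log(speed).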
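The calls that produce the regression summary and the test output are not shown in this note. The sketch below reconstructs them from the formulas printed in the output (dist ~ log(speed) for the regression, dist ~ speed for the test); it assumes the lmtest package is installed, and the object names fit and bp are my own.

```r
# bptest() lives in the lmtest package (install.packages("lmtest") if needed)
library(lmtest)

# Regression as printed in the summary output: dist on log(speed)
fit <- lm(dist ~ log(speed), data = cars)
print(summary(fit))

# Studentized Breusch-Pagan test, using the formula printed in the test output
bp <- bptest(dist ~ speed, data = cars)
print(bp)

# TRUE means we fail to reject homoskedasticity at the 5% level
print(bp$p.value > 0.05)
```

To test the log model itself, the fitted object can be passed directly, as in bptest(fit).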

A violation of homoskedasticity may result in overestimating the goodness of fit as measured by the Pearson coefficient, which is a measure of the linear correlation between two variables X and Y. It takes a value between +1 and -1, where +1 is total positive correlation, 0 is no linear correlation, and -1 is total negative correlation.
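As a short illustration of how the Pearson coefficient relates to goodness of fit, the sketch below computes it for the two columns of the cars dataset. It uses the plain linear model dist ~ speed rather than the log model above, because in a simple linear regression with one predictor the R-squared equals the square of the Pearson coefficient. The variable names r and r2 are my own.

```r
# Pearson correlation between speed and stopping distance
r <- cor(cars$speed, cars$dist, method = "pearson")

# R-squared of the simple linear regression of dist on speed
fit_lin <- lm(dist ~ speed, data = cars)
r2 <- summary(fit_lin)$r.squared

# With a single predictor, R-squared is the squared Pearson coefficient
print(r)
print(all.equal(r^2, r2))
```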