Dispersion of Data: Introducing Variance

Consider the three data sets \(X\), \(Y\) and \(Z\):

\[X=\{900,925,950,975,1025,1050,1075,1100\}\]
\[Y=\{900,905,910,920,1080,1090,1095,1100\}\]
\[Z=\{900,985,990,995,1005,1010,1015,1100\}\]

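For each of the data sets, the following statements can be verified: the mean is 1000, the median is 1000, the minimum is 900 and the maximum is 1100 (so the range is 200).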
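
As a minimal sketch, the summary statistics of the three data sets can be checked directly in R. The names `sets` and `summary_table` are introduced here for illustration only and do not appear elsewhere in the text.

```r
X <- c(900, 925, 950, 975, 1025, 1050, 1075, 1100)
Y <- c(900, 905, 910, 920, 1080, 1090, 1095, 1100)
Z <- c(900, 985, 990, 995, 1005, 1010, 1015, 1100)

# Hypothetical helper list so the same statistics can be applied to each set
sets <- list(X = X, Y = Y, Z = Z)

# One column per data set, one row per statistic
summary_table <- sapply(sets, function(s) {
  c(mean = mean(s), median = median(s), min = min(s), max = max(s))
})
print(summary_table)
```

Every column of the resulting table is the same (mean 1000, median 1000, minimum 900, maximum 1100), so these statistics alone cannot tell the three data sets apart.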

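The plot below shows how differently the three data sets are dispersed around their common mean of 1000 (marked by the dotted vertical line): the values of \(Y\) cluster near the two extremes, the values of \(Z\) cluster near the mean apart from its two end points, and the values of \(X\) are spread evenly in between.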

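            # The three data sets
            X = c(900,925,950,975,1025,1050,1075,1100)
            Y = c(900,905,910,920,1080,1090,1095,1100)
            Z = c(900,985,990,995,1005,1010,1015,1100)
            # Constant y-values so that each data set is drawn on its own horizontal row
            Z.y = rep(3, length(Z))
            Y.y = rep(4, length(Y))
            X.y = rep(5, length(X))

            # Plot Z first to set up the axes; label the rows instead of showing 3, 4, 5
            plot(Z, Z.y, pch=16, col="red", ylim=c(2.5,5.5), main="Variance",
                 xlab="Value", ylab="", yaxt="n")
            axis(2, at=c(3,4,5), labels=c("Z","Y","X"), las=1)

            # Add Y and X on their own rows
            points(Y, Y.y, pch=16, col="blue")
            points(X, X.y, pch=16, col="green")
            # Mark the common mean (1000) on each row and draw a dotted reference line through it
            points(c(1000,1000,1000), c(3,4,5), pch=18, cex=1.2)
            lines(c(1000,1000), c(2.75,5.25), lty=3)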
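
For reference, the quantities computed next are taken here to be the usual sample variance and sample standard deviation, with the \(n-1\) denominator that R's `var` and `sd` use. The symbols \(x_1,\dots,x_n\) (the observations), \(\bar{x}\) (their mean) and \(s\) are notation assumed for this note and are not defined in the text above.

\[s^2=\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2, \qquad s=\sqrt{s^2}\]

For \(X\), for example, the squared deviations from 1000 sum to 37500, so \(s^2 = 37500/7 \approx 5357.143\).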

var(X)
## [1] 5357.143
sd(X)
## [1] 73.19251
var(Y)
## [1] 9578.571
sd(Y)
## [1] 97.87018
var(Z)
## [1] 2957.143
sd(Z)
## [1] 54.37962
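
To see what `var` and `sd` compute, here is a minimal sketch that reproduces the result for `X` by hand, using the sample variance (sum of squared deviations from the mean, divided by \(n-1\)). The names `dev`, `manual_var` and `manual_sd` are illustrative only.

```r
X <- c(900, 925, 950, 975, 1025, 1050, 1075, 1100)

n   <- length(X)      # number of observations
dev <- X - mean(X)    # deviations from the mean

# Sample variance: sum of squared deviations divided by n - 1
manual_var <- sum(dev^2) / (n - 1)
# Standard deviation is the square root of the variance
manual_sd  <- sqrt(manual_var)

# Compare with R's built-in functions
print(all.equal(manual_var, var(X)))
print(all.equal(manual_sd, sd(X)))
```

The same steps applied to `Y` and `Z` give the larger and smaller values shown above, which matches the ordering of spread visible in the plot: \(Y\) is the most dispersed and \(Z\) the least.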