RESEARCH QUESTION:

Notations

– Let i index the values X_i within a group and j index the groups (1, 2, 3) of random vectors gathered into one big random vector. Here j = 1:3, since three group vectors are constructed in total, each representing a sample of a random variable (RV).

MEAN WITH BALANCED GROUP

Usually the mean estimator is the sum of the x's (or X's, as an RV) divided by the total count of x's, where each x counts once (n1=1, n2=1, …). Let's create three vectors xi, xj, xk as RV vectors:

xi=c(2,3,5,6,5,5)#6 values
xj=c(2,3,3,4,7,5)#6 values
xk=c(4,4,5,2,8,4)#6 values
xi
## [1] 2 3 5 6 5 5
##create a mega vector of all x's
xijk=c(xi,xj,xk)
xijk
##  [1] 2 3 5 6 5 5 2 3 3 4 7 5 4 4 5 2 8 4
summary(xijk)#GRAND MEAN is 4.278
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   2.000   3.000   4.000   4.278   5.000   8.000
length(xijk)#3x6=18 values counted
## [1] 18
#proof from the mathematical definition of the mean
GRANDMEAN=sum(xi,xj,xk)/18
GRANDMEAN
## [1] 4.277778

Taking the mean of the group means as a grand mean

  • OK only if the groups are balanced, that is, each mean has a weight of one: \[\tilde{X_G}=\frac{1}{J}\sum_{j=1}^{J}\bar{X}_j\] where J is the number of groups (here J = 3).
mean(xi)
## [1] 4.333333
mean(xj)
## [1] 4
mean(xk)
## [1] 4.5
##what we all do: sum the 3 means and divide by 3
sum(mean(xi),mean(xj),mean(xk))/3
## [1] 4.277778
#Here the mean of means is an unbiased estimator of the GRAND MEAN because n_j is equal across the j groups: the maths checks out
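
Why balance is enough can be seen from a short derivation. This is only a sketch: n (the common group size) and J (the number of groups) are symbols introduced here for convenience, and x_{ij} denotes value i of group j.

\[\frac{1}{J}\sum_{j=1}^{J}\bar{X}_j=\frac{1}{J}\sum_{j=1}^{J}\frac{1}{n}\sum_{i=1}^{n}x_{ij}=\frac{1}{Jn}\sum_{j=1}^{J}\sum_{i=1}^{n}x_{ij}\]

With J = 3 and n = 6 the right-hand side is the sum of all 18 values divided by 18, which is the 4.277778 obtained in both computations.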

MEAN WITH UNBALANCED GROUPS

xi2=c(2,3,5,6)#4 values
xj2=c(2,3,3,4,7,5)#6 values
xk2=c(4,2)#2 values
#12 values in j=3 groups
##create a mega vector of all x's
xi2j2k2=c(xi2,xj2,xk2)
xi2j2k2
##  [1] 2 3 5 6 2 3 3 4 7 5 4 2
summary(xi2j2k2)#GRAND MEAN is 3.833
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   2.000   2.750   3.500   3.833   5.000   7.000

Taking the mean of the group means as a grand mean: UNBALANCED GROUPS

GRANDMEAN=sum(mean(xi2),mean(xj2),mean(xk2))/3#3 is the number of means in the sum
GRANDMEAN
## [1] 3.666667

The mean of all x's (their sum divided by n) is not equal to the sum of the group means divided by the number of means: 3.67 is not equal to 3.83.
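
The gap can be checked by hand from the three group means (4, 4 and 3), the group sums (16, 24 and 6) and the group sizes (4, 6 and 2). The subscripts 1, 2, 3 below are just labels for the three groups:

\[\frac{\bar{X}_1+\bar{X}_2+\bar{X}_3}{3}=\frac{4+4+3}{3}\approx 3.667\qquad\text{versus}\qquad\frac{16+24+6}{4+6+2}=\frac{46}{12}\approx 3.833\]

The smallest group (2 values, mean 3) counts for one third of the first result although it holds only one sixth of the observations, which pulls the mean of means down.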

This demonstrates that such a procedure should be avoided. It is sometimes overlooked even by professionals, either through lack of knowledge of this property (not obvious, I admit) or for deliberate purposes concerning the economic estimators to be published.

  • An unbiased estimate is given by a weighted average, with each group mean weighted by its group size (w_j = n_j):

\[\tilde{X_G}=\frac{1}{\sum_{j} w_j}\sum_{j} w_j\,\bar{X}_j\]
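
The weighted average can be sketched in R with base R's weighted.mean(), using the group sizes as weights. The names groups, group_means, group_sizes, weighted_gmean, unweighted_gmean and pooled_mean are introduced here for illustration only; the data are the unbalanced vectors defined earlier.

```r
# Unbalanced group samples from the example above
xi2 <- c(2, 3, 5, 6)        # 4 values
xj2 <- c(2, 3, 3, 4, 7, 5)  # 6 values
xk2 <- c(4, 2)              # 2 values
groups <- list(xi2, xj2, xk2)

group_means <- sapply(groups, mean)    # 4, 4, 3
group_sizes <- sapply(groups, length)  # 4, 6, 2

# Weighted average of the group means, with weights w_j = n_j
weighted_gmean <- weighted.mean(group_means, w = group_sizes)

# Plain average of the group means (ignores the group sizes)
unweighted_gmean <- mean(group_means)

# Pooled mean over all 12 observations, for comparison
pooled_mean <- mean(unlist(groups))

weighted_gmean    # 3.833333
unweighted_gmean  # 3.666667
all.equal(weighted_gmean, pooled_mean)  # TRUE
```

Weighting by group size gives back the pooled mean of all 12 values, while the plain average of the three means does not.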

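library(lattice)#assumption: dotplot() is taken from lattice, the source does not load a package
df1=data.frame(estimator=c("weighted","unweighted"),value=c(3.833,3.667))#unbiased first, biased second
dotplot(value~estimator,data=df1,cex=3,col=c(3,2),main="UNBIASED VS BIASED G_MEAN (RED)",xlab="",ylab="value of G-mean estimators")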