A researcher wants to determine the underlying benefits consumers seek when purchasing a shampoo. A sample of 30 respondents was interviewed. The respondents were asked to indicate their degree of agreement with the following statements on a 7-point scale (1: strongly disagree, 7: strongly agree):

  1. It is important to buy a shampoo that prevents hair fall
  2. I like a shampoo that gives shiny hair
  3. A shampoo should strengthen the roots of your hair
  4. I prefer shampoo that decelerates the greying of hair
  5. Prevention of hair splitting is not important as far as shampoo is concerned
  6. A shampoo should make hair attractive

With the data loaded into a data frame x, we first look at the first few rows and the dimensions.

head(x)
##   Respondent x1 x2 x3 x4 x5 x6
## 1          1  7  3  6  4  2  4
## 2          2  1  3  2  4  5  4
## 3          3  6  2  7  4  1  3
## 4          4  4  5  4  6  2  5
## 5          5  1  2  2  3  6  2
## 6          6  6  3  6  4  2  4
dim(x)
## [1] 30  7

We will remove the Respondent column and compute the correlation matrix of the six items.

x <- x[, 2:7]
cor(x)
##              x1          x2          x3           x4           x5
## x1  1.000000000 -0.05321785  0.87309020 -0.086162233 -0.857636627
## x2 -0.053217850  1.00000000 -0.15502002  0.572212066  0.019745647
## x3  0.873090198 -0.15502002  1.00000000 -0.247787899 -0.777848036
## x4 -0.086162233  0.57221207 -0.24778790  1.000000000 -0.006581882
## x5 -0.857636627  0.01974565 -0.77784804 -0.006581882  1.000000000
## x6  0.004168129  0.64046495 -0.01806881  0.640464946 -0.136402944
##              x6
## x1  0.004168129
## x2  0.640464946
## x3 -0.018068814
## x4  0.640464946
## x5 -0.136402944
## x6  1.000000000

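As we can see, x1 and x3 are strongly positively correlated (0.87), and both are strongly negatively correlated with x5 (-0.86 and -0.78). The items x2, x4 and x6 are moderately correlated with each other (0.57 to 0.64). Correlations across these two groups are close to zero.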
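
The two groups of items can also be pulled out of the matrix programmatically. The following is a minimal sketch that re-enters the printed correlations rounded to three decimals; the 0.5 cutoff is an arbitrary choice for illustration and is not part of the original analysis.

```r
# Correlations re-entered from the cor(x) output above, rounded to 3 decimals.
vars <- paste0("x", 1:6)
R <- matrix(c(
   1.000, -0.053,  0.873, -0.086, -0.858,  0.004,
  -0.053,  1.000, -0.155,  0.572,  0.020,  0.640,
   0.873, -0.155,  1.000, -0.248, -0.778, -0.018,
  -0.086,  0.572, -0.248,  1.000, -0.007,  0.640,
  -0.858,  0.020, -0.778, -0.007,  1.000, -0.136,
   0.004,  0.640, -0.018,  0.640, -0.136,  1.000
), nrow = 6, byrow = TRUE, dimnames = list(vars, vars))

# Keep each pair once (upper triangle) and flag |r| above the assumed cutoff.
cutoff <- 0.5
idx <- which(abs(R) > cutoff & upper.tri(R), arr.ind = TRUE)
strong <- data.frame(
  var1 = vars[idx[, "row"]],
  var2 = vars[idx[, "col"]],
  r    = R[idx]
)
print(strong)
```

With the full data frame available, `cor(x)` can be used in place of the hand-entered matrix.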

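We will run a Principal Component Analysis (PCA) to determine the number of factors. We will check the PCA summary, a bar plot of the component variances, a scree plot and a biplot. Note that princomp works on the covariance matrix by default, which is reasonable here because all six items share the same 7-point scale.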

x.pca <- princomp(x)
summary(x.pca)
## Importance of components:
##                           Comp.1    Comp.2     Comp.3     Comp.4
## Standard deviation     3.1971521 2.0467225 0.95990875 0.84064381
## Proportion of Variance 0.6040845 0.2475649 0.05445415 0.04176333
## Cumulative Proportion  0.6040845 0.8516494 0.90610355 0.94786689
##                            Comp.5    Comp.6
## Standard deviation     0.76642000 0.5429094
## Proportion of Variance 0.03471401 0.0174191
## Cumulative Proportion  0.98258090 1.0000000
plot(x.pca)

screeplot(x.pca, type = "line")

biplot(x.pca)
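
The "Proportion of Variance" and "Cumulative Proportion" rows of the summary follow directly from the component standard deviations. As a small sketch, the values below are copied by hand from the summary output above and the proportions are recomputed from them.

```r
# Standard deviations copied from the summary(x.pca) output above.
sdev <- c(3.1971521, 2.0467225, 0.95990875, 0.84064381, 0.76642000, 0.5429094)
names(sdev) <- paste0("Comp.", 1:6)

variance <- sdev^2                    # variance of each component
prop_var <- variance / sum(variance)  # proportion of total variance
cum_var  <- cumsum(prop_var)          # cumulative proportion

print(round(rbind(prop_var, cum_var), 3))
```

With the fitted object at hand, `x.pca$sdev` holds the same standard deviations, so nothing needs to be typed in manually.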

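Based on the summary and plots, two components appear sufficient: together they account for about 85% of the total variance, and each of the remaining components contributes less than 6%. So we will run a factor analysis with 2 factors.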

x.fa <- factanal(x, factors = 2, rotation = "varimax", scores = "regression")
x.fa
## 
## Call:
## factanal(x = x, factors = 2, scores = "regression", rotation = "varimax")
## 
## Uniquenesses:
##    x1    x2    x3    x4    x5    x6 
## 0.063 0.437 0.174 0.378 0.205 0.309 
## 
## Loadings:
##    Factor1 Factor2
## x1  0.968         
## x2          0.749 
## x3  0.898  -0.140 
## x4          0.784 
## x5 -0.887         
## x6          0.830 
## 
##                Factor1 Factor2
## SS loadings      2.542   1.892
## Proportion Var   0.424   0.315
## Cumulative Var   0.424   0.739
## 
## Test of the hypothesis that 2 factors are sufficient.
## The chi square statistic is 5.21 on 4 degrees of freedom.
## The p-value is 0.266
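
Several figures in the output above can be cross-checked by hand. This is a sketch using the printed (rounded) values: the communality of each item is one minus its uniqueness, the average communality should match the cumulative proportion of variance, and the degrees of freedom of the test depend only on the number of items and factors.

```r
# Uniquenesses copied from the factanal output above.
uniq <- c(x1 = 0.063, x2 = 0.437, x3 = 0.174, x4 = 0.378, x5 = 0.205, x6 = 0.309)

# Communality: share of each item's variance explained by the two factors.
communality <- 1 - uniq
total_explained <- sum(communality) / length(uniq)  # compare with Cumulative Var

# Degrees of freedom for p items and k factors.
p <- 6
k <- 2
dof <- ((p - k)^2 - (p + k)) / 2

# p-value recomputed from the rounded chi-square statistic 5.21.
p_value <- pchisq(5.21, df = dof, lower.tail = FALSE)

print(round(communality, 3))
print(c(total_explained = total_explained, dof = dof, p_value = p_value))
```

Because the inputs are rounded, the recomputed values agree with the printed output only to about three decimals.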

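Factor1 loads strongly on x1, x3 and x5 (negatively on x5), while Factor2 loads strongly on x2, x4 and x6. The p-value of 0.266 means the hypothesis that 2 factors are sufficient is not rejected.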

We interpret Factor1 as follows:
x1 = It is important to buy a shampoo that prevents hair fall.
x3 = A shampoo should strengthen the roots of your hair.
x5 = Prevention of hair splitting is not important as far as shampoo is concerned.
The loading of x5 is negative because its statement is worded in the negative: disagreeing with it means that prevention of hair splitting matters. So Factor1 represents hair care related benefits.

For Factor2:
x2 = I like a shampoo that gives shiny hair.
x4 = I prefer shampoo that decelerates the greying of hair.
x6 = A shampoo should make hair attractive.
So Factor2 represents benefits related to the look and style of hair.
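
The negative sign on x5 comes from the negative wording of that statement. One way to see this, shown here only as an illustrative sketch and not as a step in the original analysis, is to reverse-code x5 on the 7-point scale (8 minus the rating). The values below are the first six rows printed by head(x), so the correlations are based on six respondents only.

```r
# First six ratings of x1 and x5, copied from the head(x) output.
x1 <- c(7, 1, 6, 4, 1, 6)
x5 <- c(2, 5, 1, 2, 6, 2)

# Reverse-code x5 on a 1-7 scale: 1 becomes 7, 2 becomes 6, and so on.
x5_rev <- 8 - x5

# The correlation keeps its size but flips its sign.
print(cor(x1, x5))
print(cor(x1, x5_rev))
```

Reverse-coding only changes the sign of the x5 loading; the factor structure itself stays the same.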

We extract the factor scores as well. These are the estimated positions of each respondent on the two factors.

head(x.fa$scores)
##         Factor1    Factor2
## [1,]  1.3045863 -0.2412923
## [2,] -1.2951658 -0.2556460
## [3,]  1.1628756 -0.7569023
## [4,]  0.1747302  1.0107855
## [5,] -1.4279521 -1.3607723
## [6,]  0.9864023 -0.2510835
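
As a rough sanity check on the scores, the Factor1 score should be high for respondents who rate x1 highly. The sketch below uses only the six rows printed above (Factor1 scores from this output, x1 ratings from head(x)), so it illustrates the idea and says nothing about the full sample.

```r
# x1 ratings and Factor1 scores for the first six respondents,
# copied from the head(x) and head(x.fa$scores) outputs.
x1 <- c(7, 1, 6, 4, 1, 6)
f1 <- c(1.3045863, -1.2951658, 1.1628756, 0.1747302, -1.4279521, 0.9864023)

# Side-by-side view, ordered by Factor1 score.
check <- data.frame(respondent = 1:6, x1 = x1, Factor1 = f1)
print(check[order(check$Factor1, decreasing = TRUE), ])

# Correlation between the raw rating and the score on these six rows.
r_x1_f1 <- cor(x1, f1)
print(r_x1_f1)
```

With the full objects available, the same comparison could be made with `cor(x$x1, x.fa$scores[, "Factor1"])`.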