Hedonic scales are widely used in sensory evaluation to measure how much a consumer likes or prefers a given product. These scales are essential in product development, reformulation studies, and acceptance testing.

The most commonly used is the 9-point hedonic scale, ranging from “dislike extremely” to “like extremely.” However, other formats such as 7-point, 5-point, or visual analog scales are also used depending on the study goals and the target population.

When analyzing data obtained from hedonic scales, analysis of variance (ANOVA) is commonly applied to determine whether significant differences exist between products. The appropriate ANOVA model depends on the experimental design, in particular on whether subjects evaluate all samples, whether repeated measurements are involved, and how many factors are under consideration (Meilgaard et al., 2016).

Choosing the Right ANOVA Design

Randomized Complete Block Design (RCBD)

This is the simplest and most common approach. Each subject evaluates all samples, and subjects are treated as blocks to account for inter-subject variability.

Use this design when:

  • Each subject evaluates all samples once.

  • No additional factors (e.g., time, group conditions) need to be modeled.
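In standard textbook notation, this design corresponds to an additive model with one treatment effect and one block effect. The symbols below are introduced here for illustration only and are not part of the original analysis.

```latex
Y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij},
\qquad i = 1, \dots, t, \quad j = 1, \dots, b
```

Here Y_{ij} is the score given by subject j to product i, \mu is the overall mean, \tau_i is the effect of product i, \beta_j is the effect of subject (block) j, and \varepsilon_{ij} is the residual. With t products and b subjects, the residual has (t - 1)(b - 1) degrees of freedom.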

Example

30 consumers evaluate 4 different ramen formulations in a single session. An RCBD is used where “Product” is a fixed factor and “Subject” is a blocking factor.

head(ramen)
## # A tibble: 6 × 3
##   Subject Product Likeness
##     <dbl> <chr>      <dbl>
## 1       1 A              8
## 2       2 A              8
## 3       3 A              5
## 4       4 A              6
## 5       5 A              7
## 6       6 A              7
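
# Convert the blocking and treatment variables to factors
ramen$Subject <- as.factor(ramen$Subject)
ramen$Product <- as.factor(ramen$Product)

# RCBD: Product is the fixed treatment, Subject is the block
model_rcbd <- aov(Likeness ~ Product + Subject, data = ramen)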

summary(model_rcbd)
##             Df Sum Sq Mean Sq F value Pr(>F)  
## Product      3  38.42  12.808   3.735 0.0141 *
## Subject     29  60.17   2.075   0.605 0.9365  
## Residuals   87 298.32   3.429                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Note: Convert all sources of variation (e.g., panelists and samples) to factors before fitting the ANOVA model. If a panelist ID is left as a numeric variable, R treats it as a continuous covariate and fits a single slope with 1 degree of freedom instead of a block effect, which gives an incorrect model structure and misleading results. Character variables are coerced to factors by aov(), but converting them explicitly avoids surprises. This is particularly critical in sensory data, where blocking (e.g., by panelist) must be modeled explicitly.
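The effect of leaving a blocking variable as numeric can be seen in the degrees of freedom. The following is a minimal sketch on simulated data: the data frame sim and its scores are hypothetical and only mimic the layout of the ramen example (30 subjects, 4 products).

```r
# Hypothetical data with the same layout as the ramen example
# (30 subjects x 4 products); the scores are simulated, not real.
set.seed(1)
sim <- expand.grid(Subject = 1:30, Product = c("A", "B", "C", "D"),
                   stringsAsFactors = FALSE)
sim$Likeness <- sample(1:9, nrow(sim), replace = TRUE)

# Subject left as numeric: fitted as a single slope (1 df)
fit_numeric <- aov(Likeness ~ Product + Subject, data = sim)

# Subject converted to a factor: fitted as a block (29 df)
sim$Subject_f <- as.factor(sim$Subject)
fit_factor <- aov(Likeness ~ Product + Subject_f, data = sim)

# Degrees of freedom per row: Product, Subject, Residuals
df_numeric <- summary(fit_numeric)[[1]]$Df
df_factor  <- summary(fit_factor)[[1]]$Df
print(df_numeric)
print(df_factor)
```

Comparing the two Df vectors is a quick sanity check: with 30 panelists, the Subject row should have 29 degrees of freedom, not 1.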

The model evaluates whether significant differences exist between the sample means while controlling for the variability caused by individual panelists. If the p-value for the Product factor is less than 0.05, we conclude that at least one sample differs significantly in terms of consumer acceptability.
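The F value for Product is the ratio of the Product mean square to the residual mean square. As a small check, assuming only the rounded values printed in the ANOVA table above, the ratio and its p-value can be recomputed with base R. The variable names are illustrative.

```r
# Rounded mean squares and degrees of freedom copied from the ANOVA table
ms_product  <- 12.808
ms_residual <- 3.429
df_product  <- 3
df_residual <- 87

# F statistic: treatment mean square over residual mean square
f_value <- ms_product / ms_residual

# Upper-tail probability of the F distribution gives the p-value
p_value <- pf(f_value, df1 = df_product, df2 = df_residual, lower.tail = FALSE)

print(round(f_value, 3))
print(round(p_value, 4))
```

Because the inputs are rounded, the results should agree with the table only up to rounding error.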

Post-hoc Comparison with agricolae

To determine which specific samples differ, we apply a Tukey HSD test using the HSD.test() function from the agricolae package. This function has the advantage of grouping means using letters, which is more intuitive for interpretation and reporting in sensory studies.

library(agricolae)

HSD_ramen <- HSD.test(model_rcbd, "Product")

print(HSD_ramen$groups)
##   Likeness groups
## A 6.700000      a
## D 6.566667      a
## B 6.366667     ab
## C 5.266667      b

The post-hoc comparison using Tukey’s HSD test groups the samples based on their mean acceptability scores. The grouping letters provide a quick and intuitive way to identify significant differences:

  • Samples that share at least one letter (e.g., A with “a” and B with “ab”) are not significantly different from each other.

  • Samples with no letters in common (e.g., A with “a” and C with “b”) are significantly different.
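The letter rule can also be applied programmatically. The sketch below is an illustrative helper, not part of agricolae: it takes the grouping letters printed above, typed in by hand, and lists the pairs of samples that share no letter.

```r
# Grouping letters copied from the HSD.test() output above
groups <- c(A = "a", D = "a", B = "ab", C = "b")

# TRUE if two letter strings have at least one letter in common
shares_letter <- function(g1, g2) {
  length(intersect(strsplit(g1, "")[[1]], strsplit(g2, "")[[1]])) > 0
}

# Enumerate all pairs of samples and keep those with no shared letter
pairs <- combn(names(groups), 2, simplify = FALSE)
differs <- Filter(function(p) !shares_letter(groups[[p[1]]], groups[[p[2]]]),
                  pairs)
different_pairs <- vapply(differs, paste, character(1), collapse = " vs ")

print(different_pairs)
```

For these letters, only the pairs involving C and a sample labeled “a” alone are flagged as different; B, labeled “ab”, is not separated from any other sample.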

Repeated Measures ANOVA

Use this model when the same subjects evaluate the same set of samples under different conditions (e.g., time, session, temperature), introducing within-subject correlation.

Use this design when:

  • The same panelists evaluate samples multiple times (e.g., over time).

  • You want to model time or condition as an additional factor.

Example

20 consumers rate the same 3 coffee brands in two different sessions: once before breakfast and once after breakfast.

head(datos_rep)
##   Panelist Sample Session Score
## 1        1      A  Before     3
## 2        2      A  Before     3
## 3        3      A  Before     2
## 4        4      A  Before     6
## 5        5      A  Before     5
## 6        6      A  Before     4
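
# Convert all sources of variation to factors before fitting
datos_rep$Panelist <- as.factor(datos_rep$Panelist)
datos_rep$Sample   <- as.factor(datos_rep$Sample)
datos_rep$Session  <- as.factor(datos_rep$Session)

# Panelist as block; Sample/Session expands to Sample + Sample:Session
modelo_rm <- aov(Score ~ Panelist + Sample/Session, data = datos_rep)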

summary(modelo_rm)
##                Df Sum Sq Mean Sq F value Pr(>F)
## Panelist       19   90.0   4.735   0.751  0.757
## Sample          2   12.6   6.308   1.000  0.372
## Sample:Session  3    1.9   0.642   0.102  0.959
## Residuals      95  599.3   6.308

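The ANOVA showed no significant differences in acceptability scores between samples (p = 0.372). Because the formula term Sample/Session expands to Sample + Sample:Session, the Sample:Session row tests session differences within each sample rather than an overall session main effect; this nested session effect was also not significant (p = 0.959). No significant panelist effect was detected (p = 0.757).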
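To see which terms the formula used above actually fits, it can be expanded with terms(). The sketch below also shows a common alternative specification, offered here as an assumption rather than the method used in this example: Sample and Session are crossed, and the panelist enters through an Error() term so that each within-subject effect is tested in its own stratum. The data frame sim_rep and its scores are simulated and hypothetical.

```r
# Expand the formula used above to list the terms R fits
rm_terms <- attr(terms(Score ~ Panelist + Sample/Session), "term.labels")
print(rm_terms)

# Hypothetical data: 20 panelists x 3 samples x 2 sessions, simulated scores
set.seed(42)
sim_rep <- expand.grid(Panelist = factor(1:20),
                       Sample   = factor(c("A", "B", "C")),
                       Session  = factor(c("Before", "After")))
sim_rep$Score <- sample(1:9, nrow(sim_rep), replace = TRUE)

# Alternative specification: crossed Sample x Session with
# panelist-level error strata for the within-subject factors
model_strata <- aov(Score ~ Sample * Session +
                      Error(Panelist / (Sample * Session)),
                    data = sim_rep)
summary(model_strata)
```

In this form the output reports a separate Session main effect and a Sample:Session interaction, each tested against its own panelist-by-factor error term.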

Since the factor Sample did not show significant variation, no post-hoc comparison is required. Post-hoc tests such as Tukey’s HSD are only recommended when the main effect or interaction is statistically significant (typically p < 0.05).

References

Meilgaard, M. C., Civille, G. V., & Carr, B. T. (2016). Sensory evaluation techniques (5th ed.). CRC Press.