Hedonic scales are widely used in sensory evaluation to measure how much a consumer likes or prefers a given product. These scales are essential in product development, reformulation studies, and acceptance testing.
The most commonly used is the 9-point hedonic scale, ranging from “dislike extremely” to “like extremely.” However, other formats such as 7-point, 5-point, or visual analog scales are also used depending on the study goals and the target population.
When analyzing data obtained from hedonic scales, analysis of variance (ANOVA) is commonly applied to determine whether significant differences exist between products. The appropriate ANOVA model depends on the experimental design, particularly on whether subjects evaluate all samples, whether repeated measurements are involved, and how many factors are under consideration (Meilgaard et al., 2016).
The randomized complete block design (RCBD) is the simplest and most common approach. Each subject evaluates all samples, and subjects are treated as blocks to account for inter-subject variability.
Use this design when:
Each subject evaluates all samples once.
No additional factors (e.g., time, group conditions) need to be modeled.
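Whether a data set meets these conditions can be checked before fitting the model. The sketch below uses a simulated data frame (`sim`, a hypothetical stand-in for real panel data) and cross-tabulates subjects against products; in a complete block design every cell should contain exactly one observation.

```r
# Simulated layout (hypothetical, for illustration): 30 subjects x 4 products
sim <- expand.grid(Subject = factor(1:30),
                   Product = factor(c("A", "B", "C", "D")))

# Count observations for every Subject x Product combination
counts <- table(sim$Subject, sim$Product)

# TRUE only if each subject evaluated each product exactly once
is_complete <- all(counts == 1)
print(is_complete)
```

If any cell is 0 or greater than 1, the design is incomplete or replicated and the plain RCBD model may not be the right choice.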
Example: 30 consumers evaluate 4 different ramen formulations in a single session. An RCBD is used where “Product” is a fixed factor and “Subject” is a blocking factor.
head(ramen)
## # A tibble: 6 × 3
## Subject Product Likeness
## <dbl> <chr> <dbl>
## 1 1 A 8
## 2 2 A 8
## 3 3 A 5
## 4 4 A 6
## 5 5 A 7
## 6 6 A 7
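# Convert the blocking and treatment variables to factors
ramen$Subject <- as.factor(ramen$Subject)
ramen$Product <- as.factor(ramen$Product)
# RCBD: Product is the treatment, Subject is the block
model_rcbd <- aov(Likeness ~ Product + Subject, data = ramen)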
summary(model_rcbd)
## Df Sum Sq Mean Sq F value Pr(>F)
## Product 3 38.42 12.808 3.735 0.0141 *
## Subject 29 60.17 2.075 0.605 0.9365
## Residuals 87 298.32 3.429
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Note: Convert every source of variation (e.g., panelists and samples) to a factor before fitting the ANOVA model. R converts character variables to factors automatically, but a numeric identifier such as a panelist number is treated as a continuous covariate with a single degree of freedom, which gives an incorrect model structure and misleading results. This is particularly critical in sensory data, where blocking (e.g., by panelist) must be explicitly modeled.
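The effect of skipping the conversion is easiest to see in the degrees of freedom. The following sketch uses simulated scores (a hypothetical `sim` data frame, not the ramen data) and fits the same model twice, once with a numeric subject identifier and once with a factor.

```r
# Simulated data (hypothetical): 30 subjects x 4 products, random scores 1-9
set.seed(1)
sim <- expand.grid(Subject = 1:30,
                   Product = c("A", "B", "C", "D"),
                   stringsAsFactors = FALSE)
sim$Likeness <- sample(1:9, nrow(sim), replace = TRUE)

# Subject left numeric: fitted as a single slope, not as a block
fit_numeric <- aov(Likeness ~ Product + Subject, data = sim)

# Subject converted to a factor: one level per panelist
sim$SubjectF <- factor(sim$Subject)
fit_factor <- aov(Likeness ~ Product + SubjectF, data = sim)

# Degrees of freedom for Product, Subject, Residuals in each fit
df_numeric <- summary(fit_numeric)[[1]][["Df"]]
df_factor  <- summary(fit_factor)[[1]][["Df"]]
print(df_numeric)
print(df_factor)
```

With the numeric identifier, Subject takes 1 degree of freedom and the remaining 28 are pushed into the residuals; with the factor, Subject takes 29, as in the ramen table. Product is a character column here and is still handled as a factor with 3 degrees of freedom.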
The model evaluates whether significant differences exist between the sample means while controlling for the variability caused by individual panelists. If the p-value for the Product factor is less than 0.05, we conclude that at least one sample differs significantly in terms of consumer acceptability. Here the Product effect is significant (F = 3.735, p = 0.0141), whereas the Subject (block) effect is not (p = 0.9365).
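The F value reported for Product is the ratio of the Product mean square to the residual mean square. As a quick arithmetic check, the sketch below recomputes it from the rounded values printed in the table; the variable names are illustrative only.

```r
# Values copied from the ANOVA table above (rounded as printed)
ms_product  <- 12.808   # Mean Sq, Product (3 df)
ms_residual <- 3.429    # Mean Sq, Residuals (87 df)

# F statistic: treatment mean square over residual mean square
f_product <- ms_product / ms_residual

# Upper-tail probability of the F distribution with (3, 87) df
p_product <- pf(f_product, df1 = 3, df2 = 87, lower.tail = FALSE)

print(round(f_product, 3))
print(p_product)
```

Because the inputs are rounded, the recomputed p-value may differ slightly from the printed one in the last digits.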
To determine which specific samples differ, we apply a Tukey HSD test using the HSD.test() function from the agricolae package. This function has the advantage of grouping means using letters, which is more intuitive for interpretation and reporting in sensory studies.
library(agricolae)
HSD_ramen <- HSD.test(model_rcbd, "Product")
print(HSD_ramen$groups)
## Likeness groups
## A 6.700000 a
## D 6.566667 a
## B 6.366667 ab
## C 5.266667 b
The post-hoc comparison using Tukey’s HSD test groups the samples based on their mean acceptability scores. The grouping letters provide a quick and intuitive way to identify significant differences:
Samples that share at least one letter are not significantly different from each other (e.g., products A and B, with groups “a” and “ab”).
Samples with no letters in common are significantly different (e.g., products A and C, with groups “a” and “b”).
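The letter groups summarize pairwise comparisons. When the individual differences, confidence intervals, and adjusted p-values are needed, base R's `TukeyHSD()` is one alternative to `HSD.test()`. The sketch below runs it on simulated scores (a hypothetical `sim` data frame, not the ramen data), so the numbers it prints carry no meaning beyond showing the layout of the result.

```r
# Simulated data (hypothetical): 30 subjects x 4 products, random scores 1-9
set.seed(42)
sim <- expand.grid(Subject = factor(1:30),
                   Product = factor(c("A", "B", "C", "D")))
sim$Likeness <- sample(1:9, nrow(sim), replace = TRUE)

# Same RCBD structure as above: treatment plus block
fit <- aov(Likeness ~ Product + Subject, data = sim)

# Pairwise Tukey comparisons for the Product factor only
tukey <- TukeyHSD(fit, which = "Product")

# One row per pair: difference, lower/upper CI bound, adjusted p-value
print(tukey$Product)
```

Two products would receive different letters in the grouped output when the adjusted p-value for their pair falls below the chosen significance level.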
Use a repeated measures ANOVA when the same subjects evaluate the same set of samples under different conditions (e.g., time, session, temperature), which introduces within-subject correlation.
Use this design when:
The same panelists evaluate samples multiple times (e.g., over time).
You want to model time or condition as an additional factor.
Example: 20 consumers rate the same 3 coffee brands in two different sessions, once before breakfast and once after breakfast.
head(datos_rep)
## Panelist Sample Session Score
## 1 1 A Before 3
## 2 2 A Before 3
## 3 3 A Before 2
## 4 4 A Before 6
## 5 5 A Before 5
## 6 6 A Before 4
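datos_rep$Panelist <- as.factor(datos_rep$Panelist)  # treat panelists as blocks
# Sample/Session expands to Sample + Sample:Session (Session nested within Sample)
modelo_rm <- aov(Score ~ Panelist + Sample/Session, data = datos_rep)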
summary(modelo_rm)
## Df Sum Sq Mean Sq F value Pr(>F)
## Panelist 19 90.0 4.735 0.751 0.757
## Sample 2 12.6 6.308 1.000 0.372
## Sample:Session 3 1.9 0.642 0.102 0.959
## Residuals 95 599.3 6.308
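The `/` operator in the model formula denotes nesting, which is why the table has a Sample:Session row but no separate Session row. A short check with base R's `terms()` shows how the nested formula expands, and how it differs from the crossed form `Sample*Session` (shown only for comparison; it is not the model fitted here).

```r
# Terms produced by the nested formula used in the model above
nested <- attr(terms(Score ~ Panelist + Sample/Session), "term.labels")

# Terms produced by a crossed formula, for comparison
crossed <- attr(terms(Score ~ Panelist + Sample*Session), "term.labels")

print(nested)   # Panelist, Sample, Sample:Session
print(crossed)  # Panelist, Sample, Session, Sample:Session
```

In the nested form, the 3 degrees of freedom of Sample:Session combine the Session main effect (1 df) with the Sample by Session interaction (2 df).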
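The ANOVA showed no significant differences in acceptability scores between samples (p = 0.372). The Sample:Session term, which tests the session effect nested within each sample rather than a separate Session main effect, was also not significant (p = 0.959). No significant panelist effect was detected (p = 0.757). Note that this model treats Panelist as a block and tests every term against a single pooled residual.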
Since the Sample factor had no significant effect, no post-hoc comparison is required. Post-hoc tests such as Tukey’s HSD are only recommended when the main effect or interaction is statistically significant (typically p < 0.05).
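The model fitted above, `Score ~ Panelist + Sample/Session`, tests all terms against one pooled residual. A common alternative for repeated measures is to declare the within-subject structure with an `Error()` term in `aov()`, so that each effect is tested in its own error stratum. The sketch below illustrates this on simulated scores (a hypothetical `sim` data frame, not the coffee data); it is one possible specification, not the analysis reported above.

```r
# Simulated data (hypothetical): 20 panelists x 3 samples x 2 sessions
set.seed(7)
sim <- expand.grid(Panelist = factor(1:20),
                   Sample   = factor(c("A", "B", "C")),
                   Session  = factor(c("Before", "After")))
sim$Score <- sample(1:9, nrow(sim), replace = TRUE)

# Sample and Session are crossed within-subject factors;
# Error() splits the residual into panelist-level strata
fit_strata <- aov(Score ~ Sample * Session +
                    Error(Panelist / (Sample * Session)),
                  data = sim)

# One ANOVA table per error stratum
print(summary(fit_strata))
```

With this specification, Sample is tested against the Panelist:Sample stratum and Session against Panelist:Session, which respects the correlation between ratings from the same panelist.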