Sensory discrimination tests are designed to determine whether a perceptible difference exists between two products. Unlike descriptive or affective tests, the goal is not to describe or rate the intensity of attributes, but simply to assess whether a difference can be detected by trained or untrained panelists. Common discrimination tests include the triangle test, the duo-trio test, 2-alternative forced choice (2-AFC), 3-AFC, the tetrad test, and others (International Organization for Standardization 2004).
Statistical analysis of the results is based on probability theory, using tools like the binomial distribution, chi-squared tests, or z-tests for proportions. For standard test formats, specialized functions in the sensR package streamline the analysis process.
The triangle test is a sensory discrimination method in which each panelist receives three samples: two are identical and one is different. The panelist must identify the odd sample.
We simulate responses from 30 panelists for illustration (1 = correct, 0 = incorrect).
set.seed(123)
# Simulate 30 binary responses; the true probability of a correct answer is set
# to 0.5, i.e. above the 1/3 chance level, so the panel behaves as if a
# difference were perceptible
Triangle <- data.frame(Correct_answer = rbinom(30, 1, prob = 0.5))
The exact binomial test compares the observed number of correct answers to the chance level (1/3 for the triangle test).
data <- data.frame(answer = Triangle$Correct_answer)
binom.test(sum(data$answer), nrow(data), p = 1/3)
##
## Exact binomial test
##
## data: sum(data$answer) and nrow(data)
## number of successes = 19, number of trials = 30, p-value = 0.0008206
## alternative hypothesis: true probability of success is not equal to 0.3333333
## 95 percent confidence interval:
## 0.4385598 0.8007014
## sample estimates:
## probability of success
## 0.6333333
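As a complement to the p-value, the same binomial distribution gives the minimum number of correct answers a panel of a given size needs to reach significance. A minimal base-R sketch, using a one-sided alternative since discrimination tests ask only whether performance exceeds chance:
# Smallest number of correct answers (out of 30) significant at alpha = 0.05
n_panelists <- 30
alpha <- 0.05
critical <- qbinom(1 - alpha, size = n_panelists, prob = 1/3) + 1
critical # 15 for a panel of 30, in line with published triangle-test tables
# Tail probability at the critical value under pure guessing (at most alpha)
pbinom(critical - 1, size = n_panelists, prob = 1/3, lower.tail = FALSE)
The 19 correct answers observed above are well beyond this threshold, consistent with the significant binomial test.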
The chi-squared goodness-of-fit test compares the observed frequencies of correct and incorrect answers to the distribution expected under guessing.
observed <- table(factor(data$answer, levels = c(0,1)))
names(observed) <- c("Incorrect", "Correct")
expected <- c("Incorrect" = length(data$answer) * 2 / 3,
"Correct" = length(data$answer) * 1 / 3)
chisq.test(x = observed, p = expected / sum(expected))
##
## Chi-squared test for given probabilities
##
## data: observed
## X-squared = 12.15, df = 1, p-value = 0.0004909
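For transparency, the reported statistic can be reproduced directly from the chi-squared formula by summing (observed - expected)^2 / expected over the two cells:
# Reproduce the chi-squared statistic from its definition
sum((observed - expected)^2 / expected)
# (11 - 20)^2 / 20 + (19 - 10)^2 / 10 = 4.05 + 8.1 = 12.15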
The z-test for proportions tests whether the observed proportion of correct answers differs from the expected chance level.
observed_prob <- mean(data$answer)
expected_prob <- 1/3
n <- length(data$answer)
z <- (observed_prob - expected_prob) / sqrt(expected_prob * (1 - expected_prob) / n)
p_value <- 2 * pnorm(-abs(z)) # Two-tailed test
z
## [1] 3.485685
p_value
## [1] 0.0004908786
While the binomial test, the z-test for proportions, and the chi-squared test all aim to determine whether the observed number of correct responses significantly differs from what is expected by chance, their sensitivity can vary. In most cases, the binomial test is preferred for sensory discrimination tests because it is exact, especially with small to moderate sample sizes. The z-test provides a good approximation when the sample size is large enough, while the chi-squared test may be less reliable with only two response categories or low expected frequencies. (With two categories the chi-squared statistic is simply the square of the z statistic, 3.4857^2 ≈ 12.15, which is why the two tests return the same p-value here.) Additionally, the Yates continuity correction can be applied to the chi-squared test to reduce bias in small samples, although it may lead to more conservative results (Hernández Montes 2016).
The Yates continuity correction is commonly applied to chi-squared tests with two categories (2x1 or 2x2 tables) to adjust for the overestimation of significance that comes from approximating a discrete distribution with a continuous one (Agresti 2018). Note, however, that R's chisq.test() applies the correction only to 2x2 contingency tables; for a goodness-of-fit test on a single vector of counts, as here, the correct = TRUE argument is silently ignored, which is why the output below is identical to the uncorrected test. A manual version of the correction is sketched after the output.
chisq.test(x = observed, p = expected / sum(expected), correct = TRUE)
##
## Chi-squared test for given probabilities
##
## data: observed
## X-squared = 12.15, df = 1, p-value = 0.0004909
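Since chisq.test() does not perform the correction for this input, a minimal manual sketch shows what the Yates-corrected statistic looks like when applied to the counts above (base R's prop.test() applies the same correction by default).
# Manual Yates correction for the two-category goodness-of-fit case:
# subtract 0.5 from each |observed - expected| difference before squaring
X2_yates <- sum((abs(observed - expected) - 0.5)^2 / expected)
p_yates <- pchisq(X2_yates, df = 1, lower.tail = FALSE)
X2_yates # about 10.84, versus 12.15 without the correction
p_yates # about 0.001, slightly larger (more conservative) than before
# prop.test(19, 30, p = 1/3) reproduces this corrected statistic by default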
✅ Use when you have small sample sizes and only two response categories
⚠️ Can make the test more conservative, possibly increasing the p-value
❌ Not needed when using the exact binomial test
The same statistical logic used for the triangle test can be applied to other discrimination tests such as:
Duo-Trio (chance = 1/2)
Tetrad (chance = 1/3)
Hexad (chance = 1/10)
Two-out-of-Five (2-of-5) (chance = 1/10)
To analyze these tests, simply change the value of the chance probability (p) in the binomial test, z-test, and chi-squared test accordingly.
Example: Two-out-of-Five Test. Suppose we tested 25 panelists, and 9 of them correctly identified the two matching samples (chance level = 1/10).
# Observed responses
observed_2of5 <- c(Incorrect = 25 - 9, Correct = 9)
# Expected frequencies under the null hypothesis (chance = 1/10)
expected_2of5 <- c(Incorrect = 25 * (1 - 1/10),
                   Correct = 25 * 1/10)
# Chi-squared goodness-of-fit test
chisq.test(x = observed_2of5, p = expected_2of5 / sum(expected_2of5))
# With these counts the statistic is about 18.8 on 1 df, so p < 0.001:
# 9 correct answers out of 25 is far above the 2.5 expected by guessing.
# Because the expected count in the "Correct" cell is below 5, R warns that
# the approximation may be inaccurate; the exact binomial test,
# binom.test(9, 25, p = 1/10), is the safer choice here.
This approach allows you to evaluate various test types by modifying only the expected probabilities, making it easy to adapt the analysis for any forced-choice sensory discrimination method.
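As a sketch of that idea, here is a small hypothetical wrapper (not part of any package) that runs the exact binomial difference test for any forced-choice protocol once its chance probability is supplied; the one-sided alternative reflects that discrimination tests ask only whether performance is better than chance.
# Hypothetical convenience wrapper: exact binomial test at a given chance level
discrim_binom <- function(correct, total, chance) {
  binom.test(correct, total, p = chance, alternative = "greater")
}
# Duo-trio example (chance = 1/2): 20 correct answers out of 30
discrim_binom(20, 30, chance = 1/2)
# Two-out-of-five example (chance = 1/10): 9 correct answers out of 25
discrim_binom(9, 25, chance = 1/10)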
The discrim() function from the sensR package provides a unified and flexible interface for analyzing a wide variety of discrimination protocols. It allows you to test hypotheses about sensory differences using different methods, test types, and statistical frameworks—all with one function.
library(sensR)
discrim_result <- discrim(19, 30, method = "triangle")
print(discrim_result)
##
## Estimates for the triangle discrimination protocol with 19 correct
## answers in 30 trials. One-sided p-value and 95 % two-sided confidence
## intervals are based on the 'exact' binomial test.
##
## Estimate Std. Error Lower Upper
## pc 0.6333 0.08798 0.4386 0.8007
## pd 0.4500 0.13197 0.1578 0.7011
## d-prime 2.1462 0.45527 1.1264 3.1336
##
## Result of difference test:
## 'exact' binomial test: p-value = 0.0007371
## Alternative hypothesis: d-prime is greater than 0
This function internally accounts for the structure of each test (such as chance level, forced-choice nature, and d-prime calculations), making it ideal for standardized analysis and reporting across different test types.
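For example, the two-out-of-five data used earlier can be analyzed through the same interface simply by changing the method argument ("twofive" is assumed here to be the sensR name for the two-out-of-five protocol; see ?discrim for the full list of supported methods):
# Same interface, different protocol: 9 correct answers in 25 two-out-of-five trials
discrim_2of5 <- discrim(9, 25, method = "twofive")
print(discrim_2of5)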
Most sensory discrimination tests are designed to detect differences between products. In such tests, the focus is on minimizing the Type I error (α-risk)—the probability of concluding that a perceptible difference exists when in fact there is none. This approach assumes that Type II error (β-risk) and the proportion of distinguishers (pd) are either negligible or unimportant. Consequently, sample sizes can be kept relatively small.
However, in many industrial and quality control applications, the goal is not to prove that products are different, but rather that they are similar enough to be used interchangeably—for example, when switching suppliers or reformulating for cost savings.
In similarity testing, the focus shifts. The analyst must define what constitutes a meaningful difference by specifying a value for pd, and then chooses a small value for β-risk (Type II error) to ensure the test has high power to detect differences if they exist. In this case, a larger α-risk is tolerated to avoid requiring an excessively large number of assessors (Meilgaard, Civille, and Carr 2016).
library(sensR)
# Let's assume we want to demonstrate similarity (not difference)
# pd = 0.30 is the threshold for a meaningful difference (i.e., no more than 30% distinguishers)
# test = "similarity" enables the appropriate hypothesis test
discrim_sim <- discrim(correct = 17,
total = 30,
pd0 = 0.30,
method = "triangle",
test = "similarity",
statistic = "exact")
print(discrim_sim)
##
## Estimates for the triangle discrimination protocol with 17 correct
## answers in 30 trials. One-sided p-value and 95 % two-sided confidence
## intervals are based on the 'exact' binomial test.
##
## Estimate Std. Error Lower Upper
## pc 0.5667 0.09047 0.37427 0.7454
## pd 0.3500 0.13571 0.06141 0.6181
## d-prime 1.8071 0.45652 0.68033 2.7691
##
## Result of similarity test:
## 'exact' binomial test: p-value = 0.7071
## Alternative hypothesis: pd is less than 0.3
This test evaluates whether the observed proportion of correct responses is low enough to conclude that no meaningful perceptible difference exists between the two samples, given the defined threshold pd0.
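For intuition, the exact similarity test can be reproduced with base R: under the null hypothesis pd = pd0, the probability of a correct answer is pc0 = 1/3 + pd0 * (1 - 1/3), and the test asks whether the observed number of correct answers is improbably low for that value. A sketch, assuming this is how the exact statistic is computed:
# Similarity test "by hand": is 17/30 correct low enough to conclude pd < 0.30?
pg <- 1/3 # guessing probability of the triangle test
pd0 <- 0.30 # similarity limit on the proportion of distinguishers
pc0 <- pg + pd0 * (1 - pg) # implied proportion of correct answers (= 8/15)
binom.test(17, 30, p = pc0, alternative = "less")
# The one-sided p-value should match (up to rounding) the 0.7071 reported above.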