General overview

Binomial GLMM on participants’ image-classification responses (N = 10,368 trials from 54 participants). Model: response_binary ~ image_type * sound_condition * PAS + display_time + (1 | id) with logit link.

The three-way interaction was not significant (Table 1), so the analysis below focuses on the two-way image_type × sound_condition interaction at each level of subjective awareness (PAS).

Table 1: Omnibus (Type III Wald) tests

Table 1. Type III Wald χ² tests for the full model
Binomial GLMM, N = 10,368 trials, 54 participants
Effect	χ²	df	p	Sig.
image_type	620.21	1	< .001	***
sound_condition	9.36	2	0.009	**
PAS	185.75	3	< .001	***
display_time	13.92	6	0.031	*
image_type:sound_condition	9.56	2	0.008	**
image_type:PAS	501.43	3	< .001	***
sound_condition:PAS	7.16	6	0.307	n.s.
image_type:sound_condition:PAS	9.75	6	0.136	n.s.
Model: `response_binary ~ image_type × sound_condition × PAS + display_time + (1 | id)`. * p < .05, ** p < .01, *** p < .001, n.s. = not significant.

Conclusions. Strong main effects of image type and PAS, a smaller main effect of sound condition (p = .009), and two-way interactions image_type:sound_condition (p = .008) and image_type:PAS (p < .001). The three-way interaction is not significant (p = .136), so there is no evidence that the sound × image effect differs across PAS levels; the per-PAS breakdown that follows should be read with that in mind.
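As a quick arithmetic check on Table 1, the p-values can be recomputed from the reported χ² and df. The sketch below uses only the Python standard library; the helper name `chi2_sf` is ours and is not part of the original analysis, which was run with car::Anova. Because the χ² values are rounded to two decimals, the recomputed p-values can differ from the table in the third decimal.

```python
import math

def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: start from the df = 2 case, exp(-x/2).
        k, q = 2, math.exp(-half)
    else:
        # Odd df: start from the df = 1 case, erfc(sqrt(x/2)).
        k, q = 1, math.erfc(math.sqrt(half))
    # Recurrence: Q(x; k+2) = Q(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
    while k < df:
        q += math.exp((k / 2.0) * math.log(half) - half - math.lgamma(k / 2.0 + 1.0))
        k += 2
    return q

# (effect, chi-square, df) as reported in Table 1
table1 = [
    ("image_type", 620.21, 1),
    ("sound_condition", 9.36, 2),
    ("PAS", 185.75, 3),
    ("display_time", 13.92, 6),
    ("image_type:sound_condition", 9.56, 2),
    ("image_type:PAS", 501.43, 3),
    ("sound_condition:PAS", 7.16, 6),
    ("image_type:sound_condition:PAS", 9.75, 6),
]

for effect, chi2, df in table1:
    print(f"{effect:32s} chi2({df}) = {chi2:7.2f}  p = {chi2_sf(chi2, df):.3g}")
```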

Figure 1: Predicted probability of salty response by type, sound, PAS

Table 2: Sound modulation of image discrimination at each PAS level

Difference-of-differences on the probability scale. A positive estimate means that the musical condition increased the gap between responses to salty vs. sweet images (i.e., improved discrimination) relative to the comparison condition.

Table 2. Interaction contrasts: sound modulation of image discrimination
Difference-of-differences (probability scale), by PAS level
PAS	Contrast	Δ	SE	CI lower	CI upper	z	p	Sig.
1	sweetmusic - silent	−0.019	0.048	−0.112	0.074	−0.400	0.689	n.s.
	saltymusic - silent	−0.024	0.048	−0.119	0.070	−0.502	0.615	n.s.
	saltymusic - sweetmusic	−0.005	0.048	−0.099	0.088	−0.108	0.914	n.s.
2	sweetmusic - silent	0.120	0.060	0.002	0.238	1.987	0.047	*
	saltymusic - silent	0.026	0.063	−0.097	0.149	0.413	0.680	n.s.
	saltymusic - sweetmusic	−0.094	0.062	−0.216	0.029	−1.502	0.133	n.s.
3	sweetmusic - silent	0.020	0.032	−0.043	0.083	0.629	0.529	n.s.
	saltymusic - silent	−0.068	0.036	−0.139	0.003	−1.870	0.061	n.s.
	saltymusic - sweetmusic	−0.088	0.035	−0.156	−0.020	−2.532	0.011	*
4	sweetmusic - silent	0.003	0.007	−0.011	0.017	0.454	0.650	n.s.
	saltymusic - silent	−0.007	0.008	−0.022	0.009	−0.879	0.379	n.s.
	saltymusic - sweetmusic	−0.010	0.008	−0.025	0.005	−1.333	0.183	n.s.
Positive Δ = music enhanced image discrimination (salty–sweet gap) relative to comparison. p-values uncorrected for multiple comparisons. With 12 contrasts, Bonferroni α ≈ .004; the nominally significant PAS 2 and PAS 3 effects do not meet this threshold and should be treated as suggestive.

Conclusions. Sound had no detectable effect on discrimination when participants reported no awareness (PAS 1) or complete awareness (PAS 4). The only nominally significant signals are in the partial-awareness range: at PAS 2 sweet music increased discrimination relative to silence (Δ = +0.12, p = .047), and at PAS 3 discrimination was higher under sweet music than under salty music (saltymusic − sweetmusic: Δ = −0.09, p = .011). Both p-values are uncorrected.
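The z, p and CI columns of Table 2 follow the usual Wald construction, which can be sketched from the rounded Δ and SE alone. This is a minimal illustration using the standard library; `wald_summary` is a hypothetical helper, not a function from emmeans. Because the inputs are rounded to three decimals, z and p will differ slightly from the tabulated values.

```python
from statistics import NormalDist

def wald_summary(estimate: float, se: float, level: float = 0.95):
    """Return (z, two-sided p, (ci_lower, ci_upper)) for a Wald test."""
    std_normal = NormalDist()
    z = estimate / se
    p = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    crit = std_normal.inv_cdf(0.5 + level / 2.0)  # about 1.96 for a 95% interval
    return z, p, (estimate - crit * se, estimate + crit * se)

# PAS 3, saltymusic - sweetmusic (rounded values from Table 2)
z3, p3, (lo3, hi3) = wald_summary(-0.088, 0.035)
print(f"PAS 3: z = {z3:.2f}, p = {p3:.3f}, 95% CI [{lo3:.3f}, {hi3:.3f}]")

# PAS 2, sweetmusic - silent (rounded values from Table 2)
z2, p2, (lo2, hi2) = wald_summary(0.120, 0.060)
print(f"PAS 2: z = {z2:.2f}, p = {p2:.3f}, 95% CI [{lo2:.3f}, {hi2:.3f}]")
```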

Figure 2: Sound modulation of image discrimination by PAS level

Table 3: Simple effects: image discrimination within each sound × PAS cell

Table 3. Image discrimination (salty vs. sweet) within each sound condition
Odds ratios and Δ-probability, by PAS level
PAS	Sound	OR	CI lower	CI upper	z	p	ΔP
1	silent	1.06	0.81	1.40	0.456	0.648	0.016
	sweetmusic	0.99	0.75	1.29	−0.106	0.915	−0.004
	saltymusic	0.97	0.74	1.26	−0.255	0.799	−0.009
2	silent	3.00	2.08	4.31	5.913	< .001	0.268
	sweetmusic	5.16	3.46	7.69	8.067	< .001	0.388
	saltymusic	3.37	2.26	5.02	5.953	< .001	0.294
3	silent	154.79	77.56	308.90	14.302	< .001	0.850
	sweetmusic	223.86	105.31	475.85	14.064	< .001	0.870
	saltymusic	68.53	38.67	121.45	14.480	< .001	0.783
4	silent	5,991.67	2,638.36	13,606.99	20.785	< .001	0.973
	sweetmusic	7,591.89	3,253.00	17,718.06	20.663	< .001	0.976
	saltymusic	3,331.80	1,669.54	6,649.07	23.008	< .001	0.966
OR > 1 → higher probability of ‘salty’ response for salty images (correct discrimination). ΔP = difference in predicted probability (salty − sweet image). At PAS 1 discrimination is near chance (ΔP ≈ 0); at PAS 4 discrimination is at ceiling in every sound condition.

Conclusions. Discrimination climbs with PAS in every sound condition. The huge ORs at PAS 4 reflect near-complete separation (ceiling): ΔP is the more interpretable quantity at that level.
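To illustrate why ΔP is easier to read than the OR near ceiling, the sketch below recomputes both quantities from the rounded predicted probabilities in Table 4. The helper names are ours, and the ORs only approximate the tabulated ones because the inputs are rounded. The last lines show that moving one PAS 4 probability by 0.001 shifts the OR by several hundred while ΔP moves by 0.001.

```python
def odds(p: float) -> float:
    """Convert a probability to odds."""
    return p / (1.0 - p)

def odds_ratio(p_salty_image: float, p_sweet_image: float) -> float:
    """Odds of a 'salty' response for salty images relative to sweet images."""
    return odds(p_salty_image) / odds(p_sweet_image)

def delta_p(p_salty_image: float, p_sweet_image: float) -> float:
    """Difference in predicted probability (salty image minus sweet image)."""
    return p_salty_image - p_sweet_image

# Silent condition, rounded P(salty) from Table 4: (salty image, sweet image)
pas2_silent = (0.630, 0.362)
pas4_silent = (0.982, 0.009)

or_pas2 = odds_ratio(*pas2_silent)
or_pas4 = odds_ratio(*pas4_silent)
print(f"PAS 2 silent: OR = {or_pas2:.2f}, dP = {delta_p(*pas2_silent):.3f}")
print(f"PAS 4 silent: OR = {or_pas4:.0f}, dP = {delta_p(*pas4_silent):.3f}")

# Sensitivity at ceiling: lower the sweet-image probability by 0.001
or_pas4_shifted = odds_ratio(0.982, 0.008)
print(f"PAS 4 silent, sweet-image P = 0.008: OR = {or_pas4_shifted:.0f}")
```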

Figure 3: Image discrimination by sound and PAS

Table 4: Model-predicted cell means

Predicted probability of a ‘salty’ response, back-transformed from the logit scale and averaged over display-time levels.

Table 4. Predicted probability of a ‘salty’ response
Averaged over display_time, by PAS × image × sound
PAS	Image	Sound	P(salty)	SE	CI lower	CI upper
1	sweet	silent	0.424	0.040	0.348	0.505
	salty	silent	0.440	0.041	0.362	0.520
	sweet	sweetmusic	0.426	0.040	0.349	0.506
	salty	sweetmusic	0.422	0.040	0.347	0.501
	sweet	saltymusic	0.464	0.041	0.385	0.544
	salty	saltymusic	0.455	0.041	0.377	0.535
2	sweet	silent	0.362	0.042	0.285	0.448
	salty	silent	0.630	0.039	0.550	0.703
	sweet	sweetmusic	0.280	0.040	0.210	0.364
	salty	sweetmusic	0.668	0.038	0.589	0.739
	sweet	saltymusic	0.330	0.043	0.251	0.418
	salty	saltymusic	0.623	0.041	0.540	0.699
3	sweet	silent	0.066	0.018	0.038	0.111
	salty	silent	0.916	0.019	0.870	0.947
	sweet	sweetmusic	0.048	0.015	0.025	0.088
	salty	sweetmusic	0.918	0.018	0.875	0.947
	sweet	saltymusic	0.093	0.022	0.059	0.146
	salty	saltymusic	0.876	0.023	0.824	0.914
4	sweet	silent	0.009	0.003	0.004	0.018
	salty	silent	0.982	0.005	0.970	0.989
	sweet	sweetmusic	0.008	0.003	0.004	0.017
	salty	sweetmusic	0.984	0.004	0.973	0.991
	sweet	saltymusic	0.017	0.004	0.010	0.029
	salty	saltymusic	0.983	0.005	0.971	0.990
Cell shading: pink → ‘sweet’ response bias, blue → ‘salty’ response bias. At PAS 1 all cells sit between 0.42 and 0.46, slightly below the 0.5 chance level and with no separation by image type; at PAS 4 they saturate near 0 or 1 as expected.
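The intervals in Table 4 are asymmetric around the point estimate, which is what one would expect if they were built on the logit scale and then back-transformed. The sketch below reproduces that logic under an explicit assumption that the reported SE is a delta-method SE on the probability scale; the helper `logit_scale_ci` is hypothetical and the exact emmeans settings used are not documented here.

```python
import math

def logit(p: float) -> float:
    return math.log(p / (1.0 - p))

def expit(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def logit_scale_ci(p: float, se_p: float, crit: float = 1.959964):
    """Approximate CI for a probability, built on the logit scale.

    Assumes se_p is a delta-method SE, so SE(logit) = se_p / (p * (1 - p)).
    """
    se_logit = se_p / (p * (1.0 - p))
    centre = logit(p)
    return expit(centre - crit * se_logit), expit(centre + crit * se_logit)

# PAS 3, sweet image, silent: P(salty) = 0.066, SE = 0.018 (Table 4)
lo_a, hi_a = logit_scale_ci(0.066, 0.018)
print(f"PAS 3 sweet/silent: [{lo_a:.3f}, {hi_a:.3f}]")

# PAS 2, sweet image, silent: P(salty) = 0.362, SE = 0.042 (Table 4)
lo_b, hi_b = logit_scale_ci(0.362, 0.042)
print(f"PAS 2 sweet/silent: [{lo_b:.3f}, {hi_b:.3f}]")
```

Both intervals land within rounding distance of the tabulated limits, which supports, but does not prove, the assumption about how the intervals were constructed.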

Table 5: Response bias at 0 ms

Table 5. Response bias at 0 ms display time
Raw proportions vs. model-estimated P(salty)
Sound	N	P(salty), observed	SE	P(salty), model
silent	864	0.471	0.017	0.439
sweetmusic	864	0.459	0.017	0.424
saltymusic	864	0.494	0.017	0.469
At 0 ms no image is shown; image type was randomly assigned. Observed proportions hover around chance (0.46–0.49); no significant sound-driven response bias.

Conclusions. The sound conditions do not produce a detectable response bias when there is nothing to see. The effects in Table 2 are therefore more plausibly perceptual enhancements than response-level priming, although a null result on 0-ms trials cannot rule priming out.
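The SE column of Table 5 is consistent with a simple binomial standard error on the raw proportions, as the sketch below shows. Note the assumption: this formula treats all 864 trials per condition as independent and ignores clustering within participants, so it is only a check on the descriptive column, not a substitute for the GLMM.

```python
import math

def binomial_se(p: float, n: int) -> float:
    """Standard error of a proportion under independent Bernoulli trials."""
    return math.sqrt(p * (1.0 - p) / n)

# Observed P(salty) at 0 ms, N = 864 trials per sound condition (Table 5)
observed = {"silent": 0.471, "sweetmusic": 0.459, "saltymusic": 0.494}
n_trials = 864

standard_errors = {sound: binomial_se(p, n_trials) for sound, p in observed.items()}
for sound, se in standard_errors.items():
    print(f"{sound:11s} P(salty) = {observed[sound]:.3f}  SE = {se:.3f}")
```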

Analysis approach. Classification responses were coded 1 = “salty”, 0 = “sweet” and analysed with a binomial generalized linear mixed-effects model (lme4::glmer, logit link). Fixed effects were image_type, sound_condition (silent / sweet music / salty music), PAS (ordered factor, levels 1–4) and all of their two- and three-way interactions, plus an additive 7-level covariate for display_time (0–50 ms); participants contributed a random intercept. Type III Wald χ² tests (car::Anova) tested omnibus effects, and follow-up contrasts on the probability scale were computed with emmeans. All p-values are two-tailed and uncorrected unless noted.
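Written out, the model described above corresponds to the following random-intercept logistic specification. The notation is ours: i indexes participants, j indexes trials, and the design vector collects the intercept, the contrast-coded image_type × sound_condition × PAS factorial and the display_time terms. Summing the df in Table 1 plus the intercept gives 30 fixed-effect coefficients.

```latex
\begin{aligned}
y_{ij} \mid u_i &\sim \mathrm{Bernoulli}(\pi_{ij}), \\
\operatorname{logit}(\pi_{ij}) &= \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_i, \\
u_i &\sim \mathcal{N}\!\left(0, \sigma_u^{2}\right),
\qquad i = 1, \dots, 54 .
\end{aligned}
```

Here y_ij = 1 denotes a “salty” response and u_i is the by-participant random intercept, (1 | id).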

Omnibus effects. Image type (χ²(1) = 620.21, p < .001) and PAS (χ²(3) = 185.75, p < .001) had the largest effects, with a large image_type × PAS interaction (χ²(3) = 501.43, p < .001) reflecting the steep rise in discrimination with subjective awareness (Figure 1, Table 3). Sound condition produced a small overall effect (χ²(2) = 9.36, p = .009) that interacted with image type (χ²(2) = 9.56, p = .008). The three-way interaction was not significant (χ²(6) = 9.75, p = .136); given a priori expectations of null modulation at the PAS extremes, we decomposed the image × sound interaction at each PAS level.

Sound modulation by awareness. Figure 2 and Table 2 present the interaction contrasts at each PAS level. Contrasts were negligible at PAS 1 (|Δ| ≤ 0.024, p ≥ .61) and PAS 4 (|Δ| ≤ 0.011, p ≥ .18), consistent with floor and ceiling discrimination (Figure 3, Table 3). All nominally significant modulation was confined to partial awareness. At PAS 2, sweet music widened the salty–sweet discrimination gap by 12.0 percentage points relative to silence (Δ = +0.120, 95% CI [+0.002, +0.238], z = 1.99, p = .047). At PAS 3, discrimination was higher under sweet music than under salty music (Δ = −0.088 for saltymusic − sweetmusic, 95% CI [−0.156, −0.020], z = −2.53, p = .011), a difference driven mainly by reduced discrimination under salty music (ΔP = 0.783) relative to silent (0.850) and sweet music (0.870). Across twelve contrasts neither effect survives Holm or Benjamini–Hochberg correction (smallest adjusted value: Holm p = BH q = .132; Table 2b); we treat them as suggestive and in need of replication.
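The difference-of-differences contrasts quoted above can be reproduced, to within rounding, from the predicted cell means in Table 4. The sketch below does this for PAS 2 and PAS 3; the dictionary layout and function names are illustrative choices of ours, not the emmeans calls used in the analysis.

```python
# Predicted P(salty) from Table 4, keyed by (PAS, sound) then image type
cell_means = {
    (2, "silent"):     {"sweet": 0.362, "salty": 0.630},
    (2, "sweetmusic"): {"sweet": 0.280, "salty": 0.668},
    (2, "saltymusic"): {"sweet": 0.330, "salty": 0.623},
    (3, "silent"):     {"sweet": 0.066, "salty": 0.916},
    (3, "sweetmusic"): {"sweet": 0.048, "salty": 0.918},
    (3, "saltymusic"): {"sweet": 0.093, "salty": 0.876},
}

def discrimination(pas: int, sound: str) -> float:
    """Delta-P: P(salty | salty image) minus P(salty | sweet image)."""
    cell = cell_means[(pas, sound)]
    return cell["salty"] - cell["sweet"]

def diff_of_diffs(pas: int, sound_a: str, sound_b: str) -> float:
    """Interaction contrast: discrimination under sound_a minus sound_b."""
    return discrimination(pas, sound_a) - discrimination(pas, sound_b)

pas2_sweet_vs_silent = diff_of_diffs(2, "sweetmusic", "silent")
pas3_salty_vs_sweet = diff_of_diffs(3, "saltymusic", "sweetmusic")
print(f"PAS 2 sweetmusic - silent:     {pas2_sweet_vs_silent:+.3f}")
print(f"PAS 3 saltymusic - sweetmusic: {pas3_salty_vs_sweet:+.3f}")
```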

Response-bias control. A separate GLMM on 0-ms trials, where no image was shown and image type was randomly assigned, found no reliable effect of sound (observed P(salty) = 0.471, 0.459, 0.494 for silent, sweet and salty music; all pairwise p ≥ .15; Table 5). The PAS-2/PAS-3 effects are therefore unlikely to reflect a simple response-priming account. The additive display-time specification held empirically: within-panel curves in Figure 4 run approximately parallel across 0–50 ms, so the findings do not depend on any particular display-time subset.

Supplement: Display Time (Figure 4)

Table 2b: Multiple-comparison correction for the interaction contrasts

Table 2b. Multiple-comparison correction across the 12 interaction contrasts
Holm, Benjamini–Hochberg (q) and Bonferroni adjustments applied to the family in Table 2
PAS	Contrast	raw p	Holm p	BH q	Bonferroni p
PAS 1	sweetmusic − silent	0.689	1.000	0.752	1.000
	saltymusic − silent	0.615	1.000	0.752	1.000
	saltymusic − sweetmusic	0.914	1.000	0.914	1.000
PAS 2	sweetmusic − silent	0.047	0.517	0.244	0.564
	saltymusic − silent	0.680	1.000	0.752	1.000
	saltymusic − sweetmusic	0.133	1.000	0.399	1.000
PAS 3	sweetmusic − silent	0.529	1.000	0.752	1.000
	saltymusic − silent	0.061	0.610	0.244	0.732
	saltymusic − sweetmusic	0.011	0.132	0.132	0.132
PAS 4	sweetmusic − silent	0.650	1.000	0.752	1.000
	saltymusic − silent	0.379	1.000	0.752	1.000
	saltymusic − sweetmusic	0.183	1.000	0.439	1.000
Rows highlighted in lavender had raw p < .05. Family of 12 contrasts (4 PAS levels × 3 pairwise sound comparisons). Neither nominally-significant effect survives any standard family-wise or false-discovery correction: the best adjusted value is BH q = 0.132 (PAS 3, saltymusic − sweetmusic).

Conclusions. The two contrasts that were nominally significant in Table 2 (PAS 2 sweetmusic − silent; PAS 3 saltymusic − sweetmusic) do not survive correction under any of the three standard procedures. The strongest effect, PAS 3 saltymusic − sweetmusic, reaches Holm-adjusted p = .132 / BH q = .132, i.e. well above the conventional α = .05 threshold. A narrower, a priori confirmatory family (e.g. the four “music vs. silent” contrasts at PAS 2 and PAS 3 only, α_Bonferroni = .0125) would not rescue either effect: the smallest raw p in that family is .047 (PAS 2 sweetmusic − silent), and the PAS 3 saltymusic − sweetmusic contrast (p = .011) falls outside the restricted family. The PAS-2/PAS-3 pattern is therefore suggestive and requires replication.
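The adjusted values in Table 2b can be recomputed from the twelve raw p-values with a few lines of standard-library Python. This is a sketch of the textbook Bonferroni, Holm step-down and Benjamini–Hochberg step-up adjustments, equivalent in intent to R's p.adjust; the function names are ours. The final lines apply the same logic to the narrower four-contrast family mentioned above.

```python
def bonferroni(pvals):
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]

def holm(pvals):
    """Holm step-down: multiply the k-th smallest p by (m - k + 1), enforce monotonicity."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):  # rank 0 is the smallest p
        running_max = max(running_max, (m - rank) * pvals[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted

def benjamini_hochberg(pvals):
    """BH step-up: p * m / rank, with a running minimum taken from the largest p down."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for k, i in enumerate(order):
        rank = m - k  # 1-based rank in ascending order
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Raw p-values in Table 2b row order (PAS 1 to PAS 4, three contrasts each)
raw_p = [0.689, 0.615, 0.914, 0.047, 0.680, 0.133,
         0.529, 0.061, 0.011, 0.650, 0.379, 0.183]

holm_p = holm(raw_p)
bh_q = benjamini_hochberg(raw_p)
bonf_p = bonferroni(raw_p)
for p, h, q, b in zip(raw_p, holm_p, bh_q, bonf_p):
    print(f"raw {p:.3f}  Holm {h:.3f}  BH {q:.3f}  Bonferroni {b:.3f}")

# Restricted family: the four "music vs. silent" contrasts at PAS 2 and PAS 3
restricted = [0.047, 0.680, 0.529, 0.061]
restricted_bonf = bonferroni(restricted)
print("restricted family, smallest Bonferroni p:", round(min(restricted_bonf), 3))
```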

Limitations and methodological considerations

Random-effects structure. The reported GLMM includes only a by-participant random intercept, (1 | id). For a fully within-subjects design the principled specification is a random-slopes model (at minimum (1 + image_type | id) and ideally (1 + image_type + sound_condition | id)) so that by-participant variability in the very effects being tested is modelled rather than ignored; omitting such slopes can make the fixed-effect tests anti-conservative. With 192 trials per participant there is in principle sufficient data to support random slopes.

Multiple-comparison correction. As shown in Table 2b, none of the twelve interaction contrasts survives correction under Holm, Benjamini–Hochberg or Bonferroni procedures applied to the full family. Two paths forward are 1) pre-registering a targeted replication with the PAS-2/PAS-3 “music vs. silent” contrasts as the primary confirmatory tests, or 2) narrowing the confirmatory family a priori (e.g. the four “music vs. silent” contrasts at PAS 2–3, α_Bonferroni = .0125). The present effects are therefore described as suggestive rather than confirmatory.

Theoretical framing vs. empirical pattern. The observed pattern does not match a classical semantic congruence account. Under congruence, sweet music should selectively facilitate “sweet” responses to sweet images (and salty music should do the same for salty images). Instead, sweet music was associated with enhanced discrimination generally, and salty music was neutral at PAS 2 and mildly suppressive at PAS 3 (Table 3). This is more consistent with a general-enhancement or arousal-based account (e.g. one in which the sweet-music clip raises attentional engagement across image types). An acoustic/affective comparison of the two music clips (tempo, loudness, valence, arousal etc.) would help distinguish these accounts.

0-ms trials and the PAS = 1 cell. Approximately 25% of trials were presented at 0 ms (no image shown; 3 × 864 = 2,592 of 10,368 trials, Table 5), and these trials almost exclusively produce PAS = 1. Because image_type was randomly assigned on 0-ms trials, the PAS-1 row of the analysis is dominated by trials without a genuine stimulus. The present specification absorbs this through the additive display_time covariate, and the PAS-1 contrasts are reported as null (consistent with expectation). As a robustness check, one could refit the model excluding 0-ms trials to confirm the PAS-2/PAS-3 contrasts are unchanged.

Convergence diagnostics. The reported model converged under the bobyqa optimizer without warnings. Extended models (e.g., random slopes) will require additional tuning.

Other. The present analysis does not 1) include a variance-explained measure beyond ΔP (e.g. marginal/conditional pseudo-R² via MuMIn::r.squaredGLMM); 2) analyse reaction-time data; or 3) document participant- or trial-level exclusions.

Sweet vs. Salty: Results Summary

Daniel Tchemerinsky Konieczny