Study 4: Words-in-a-Word Game Data Analysis

Author

Jamie C. Lee

Published

April 3, 2026

Summary

This document presents descriptive analyses for Study 4, in which participants completed a words-in-a-word game under one of three progress visualization conditions (Horizontal Bar, Vertical Bar, Ring). The primary outcomes of interest are perceived distance to goal, motivation to complete the goal, and how these vary across visualization formats.

Sample. XXX participants were recruited at a large Midwestern university for a 3-condition between-subjects design.

Exclusions. Participants with 3 or more perceived distance reversals (i.e., reporting decreasing perceived distance as actual remaining distance increases) were excluded from the primary analyses, leaving N = XXX participants.

Perceived distance. Participants showed a systematic pattern of underestimating remaining distance early in the task and overestimating it later, with some variation across conditions in when this crossover occurred.

Motivation to complete goal. To be populated.

1 Descriptive Checks

1.1 Distance Reversals by Condition

Participants were asked to report perceived distance to the goal at multiple points during the task. A “distance reversal” occurs when a participant's perceived distance decreases from one probe to another even though the actual remaining distance is greater. The figure below shows the distribution of reversal counts by condition.

A notable number of participants have high reversal counts, suggesting that some may have misunderstood the perceived distance question or responded inattentively. The distribution of reversals looks similar across conditions, so this pattern does not appear to be specific to any one visualization format. Table 1 reports, by condition, the number of participants in the main sample (fewer than 3 reversals) and the number classified as distance reversers (more than 6 reversals).

Figure 1: Distribution of Distance Reversals by Condition

Table 1

Condition	Main Sample (neg_count < 3)	Distance Reversers (neg_count > 6)
Horizontal Bar	88	50
Vertical Bar	102	33
Ring	106	31
Total	296	114
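
The reversal count used for the exclusion rule can be sketched as follows. This is a minimal illustration, not the analysis code: the function name, the example ratings, and the choice to compare only adjacent probes (after ordering by actual remaining distance) are assumptions.

```python
from typing import Sequence


def count_reversals(actual_remaining: Sequence[float],
                    perceived: Sequence[float]) -> int:
    """Count adjacent probe pairs where perceived distance falls
    although actual remaining distance is greater."""
    # Order probes from least to most actual remaining distance.
    probes = sorted(zip(actual_remaining, perceived))
    reversals = 0
    for (act_lo, per_lo), (act_hi, per_hi) in zip(probes, probes[1:]):
        # Assumption: only strict decreases between adjacent probes count.
        if act_hi > act_lo and per_hi < per_lo:
            reversals += 1
    return reversals


# Hypothetical participant: remaining % at each probe and the rating given.
actual = [90, 80, 70, 60, 50]
ratings = [85, 88, 60, 65, 40]
n_rev = count_reversals(actual, ratings)
print(n_rev, "reversals; excluded from main sample:", n_rev >= 3)
```

Under this definition the hypothetical participant has two reversals and would stay in the main sample, since the exclusion threshold is 3 or more.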

1.2 Goal Completion by Condition

The table below shows the proportion of participants who reached the goal in each condition.

Table 2

Condition	Reached Goal	N	Proportion
Horizontal Bar	no	7	0.05
Horizontal Bar	yes	144	0.95
Vertical Bar	no	12	0.08
Vertical Bar	yes	142	0.92
Ring	no	9	0.06
Ring	yes	144	0.94
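
As a quick arithmetic check, the proportions in Table 2 can be recomputed from the counts, which also makes explicit that each proportion is taken within condition. The dictionary layout below is only an illustrative choice.

```python
# Counts copied from Table 2: (did not reach goal, reached goal).
counts = {
    "Horizontal Bar": (7, 144),
    "Vertical Bar": (12, 142),
    "Ring": (9, 144),
}

proportions = {}
for cond, (n_no, n_yes) in counts.items():
    total = n_no + n_yes
    # Proportions are computed within each condition and rounded to 2 places.
    proportions[cond] = (round(n_no / total, 2), round(n_yes / total, 2))
    print(f"{cond}: N = {total}, "
          f"no = {proportions[cond][0]:.2f}, yes = {proportions[cond][1]:.2f}")
```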

1.3 Time Spent on Task by Condition

The figures below show the distribution of time spent on the game page by condition, first on the raw scale and then on a log scale to better visualize the spread given right skew.

Figure 2: Distribution of Time on Game Page by Condition (raw scale)

Figure 3: Distribution of Time on Game Page by Condition (log scale)

1.4 Condition Differences on Motivation and Task Appraisal Measures

1.4.1 Factor Structure

Before computing composites, we examined the factor structure of the value and progress items using parallel analysis. Results suggested two factors for the value items and one factor for the progress items. Despite the two-factor suggestion, all four value items loaded onto a single dominant factor (loadings: value_1 = .85, value_2 = .87, value_3_rs = .76, value_4_rs = .61), so they were combined into a single composite after reverse-scoring value_3 and value_4. The three progress items loaded onto a single factor and were combined into a composite after reverse-scoring progress_3.
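
The reverse-scoring and compositing step might look like the sketch below. The 1 to 7 response scale and the use of an item mean (rather than a sum) are assumptions, and the participant responses are made up for illustration; only the item names come from the text above.

```python
from statistics import mean

SCALE_MIN, SCALE_MAX = 1, 7  # assumption: 7-point response scale


def reverse_score(x: float) -> float:
    """Flip a rating so the high and low ends of the scale swap."""
    return SCALE_MIN + SCALE_MAX - x


def value_composite(resp: dict) -> float:
    """Average the four value items after reverse-scoring items 3 and 4."""
    items = [
        resp["value_1"],
        resp["value_2"],
        reverse_score(resp["value_3"]),
        reverse_score(resp["value_4"]),
    ]
    return mean(items)


# Hypothetical responses from one participant.
participant = {"value_1": 6, "value_2": 5, "value_3": 2, "value_4": 3}
print(value_composite(participant))
```

The progress composite would follow the same pattern with progress_3 as the only reverse-scored item.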

1.4.2 Means and Standard Deviations by Condition

The tables below present means and standard deviations by condition. Table 3 covers the composite measures and Table 4 covers the individual task appraisal items.

Table 3

Condition	Progress M	Progress SD	Value M	Value SD
Horizontal Bar	5.32	1.16	5.19	1.38
Vertical Bar	5.36	1.16	5.14	1.06
Ring	5.65	1.08	5.25	1.23

Table 4

Condition	Challenging M	Challenging SD	Attainable M	Attainable SD	Enjoyable M	Enjoyable SD	Effort M	Effort SD
Horizontal Bar	4.25	1.61	6.10	0.93	5.20	1.53	6.28	1.01
Vertical Bar	4.26	1.69	5.98	1.05	5.37	1.38	6.35	0.86
Ring	3.81	1.57	6.01	1.19	5.04	1.54	6.16	1.14

1.4.3 Pairwise Comparisons Across Conditions

Table 5


    Pairwise comparisons using t tests with pooled SD 

data:  df_analyses$composite_progress and df_analyses$cond 

             Horizontal Bar Vertical Bar
Vertical Bar 0.80           -           
Ring         0.11           0.12        

P value adjustment method: holm

Table 6


    Pairwise comparisons using t tests with pooled SD 

data:  df_analyses$composite_value and df_analyses$cond 

             Horizontal Bar Vertical Bar
Vertical Bar 1              -           
Ring         1              1           

P value adjustment method: holm
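
The Holm adjustment named in the outputs above can be illustrated with a short sketch. The function name and the unadjusted p-values are hypothetical; the step-down logic (multiply the k-th smallest p-value by the number of remaining comparisons, cap at 1, keep the sequence non-decreasing) is the standard Holm procedure.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Multiply by the number of comparisons not yet processed, cap at 1,
        # and enforce monotonicity with a running maximum.
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical unadjusted p-values for three pairwise comparisons.
raw = [0.01, 0.04, 0.03]
print([round(p, 3) for p in holm_adjust(raw)])
```

With three comparisons, every adjusted value is capped at 1 whenever the smallest unadjusted p-value is at least 1/3, which is one way a table consisting entirely of 1s can arise.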

2 Perceived Distance vs. Actual Distance to Goal

The analyses in this section are restricted to participants with fewer than 3 distance reversals (N = XXX). Perceived distance was assessed via probe ratings at up to 10 points during the task. Actual remaining distance was computed as the percentage of the task remaining at each probe point.

2.1 Raw Perceived vs. Actual Remaining Distance

The figure below plots probe ratings (perceived remaining distance) against actual remaining distance (%) for each condition. The dashed identity line indicates perfect accuracy: points above the line indicate overestimation and points below indicate underestimation.

Figure 4: Perceived vs. Actual Remaining Distance by Condition

2.1.1 Distance Reversers

Figure 5: Perceived vs. Actual Remaining Distance by Condition (Reversed Sample)

2.2 Signed Deviations from Actual Remaining Distance

To more clearly examine systematic biases, the figure below plots signed deviations (probe rating minus actual remaining %) against actual remaining distance. Positive values indicate overestimation; negative values indicate underestimation. The dashed line at zero represents perfect accuracy.

Figure 6: Signed Deviations from Actual Remaining Distance by Condition
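
The signed deviation plotted in Figure 6 is a simple difference, sketched here. The example probe values are hypothetical, and the sketch assumes probe ratings are on the same 0 to 100 scale as actual remaining distance.

```python
def signed_deviation(probe_rating: float, actual_remaining_pct: float) -> float:
    """Probe rating minus actual remaining distance (%)."""
    return probe_rating - actual_remaining_pct


def label(dev: float) -> str:
    """Classify a signed deviation by its direction."""
    if dev > 0:
        return "overestimate"
    if dev < 0:
        return "underestimate"
    return "accurate"


# Hypothetical (actual remaining %, probe rating) pairs.
probes = [(90, 82), (50, 50), (10, 16)]
for actual, rating in probes:
    dev = signed_deviation(rating, actual)
    print(f"actual={actual} rating={rating} deviation={dev} ({label(dev)})")
```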

2.2.1 Distance Reversers

Figure 7: Signed Deviations from Actual Remaining Distance (Reversed Sample)

2.3 Mean Deviations Across Conditions

Figure 8: Mean Signed Deviation by Actual Remaining Distance and Condition

2.3.1 Distance Reversers

Figure 9: Mean Signed Deviation by Actual Remaining Distance and Condition (Reversed Sample)

2.4 Binned Mean Deviations

To summarize the above, the figure below bins actual remaining distance into quartile ranges and plots mean deviations by condition. A consistent pattern emerges: participants underestimate remaining distance when the task is far from complete and overestimate it as they get closer, though the point at which this crossover occurs appears to vary by condition.

Figure 10: Mean Signed Deviation by Binned Remaining Distance and Condition
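
The binning behind Figure 10 can be approximated as below. The bin edges (four equal 25-point ranges), the left-closed convention, and the example rows are assumptions made for illustration, not the report's actual binning code.

```python
from collections import defaultdict
from statistics import mean

BIN_LABELS = ["0-25", "25-50", "50-75", "75-100"]


def bin_label(remaining_pct: float) -> str:
    """Assign a remaining-distance percentage to one of four 25-point bins."""
    # Assumption: bins are left-closed, with 100 placed in the top bin.
    index = min(int(remaining_pct // 25), 3)
    return BIN_LABELS[index]


def binned_means(rows):
    """rows: iterable of (condition, actual_remaining_pct, signed_deviation)."""
    groups = defaultdict(list)
    for cond, remaining, dev in rows:
        groups[(cond, bin_label(remaining))].append(dev)
    return {key: mean(devs) for key, devs in groups.items()}


# Hypothetical long-format rows, one per probe.
rows = [
    ("Ring", 90, -6), ("Ring", 80, -2), ("Ring", 20, 4),
    ("Vertical Bar", 95, -3), ("Vertical Bar", 10, 7), ("Vertical Bar", 5, 5),
]
for key, value in sorted(binned_means(rows).items()):
    print(key, value)
```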

2.4.1 Distance Reversers

Figure 11: Mean Signed Deviation by Binned Remaining Distance (Reversed Sample)

2.5 Zoomed View Around Key Progress Thresholds

The figure below zooms into ±5% windows around the 25%, 50%, and 75% remaining distance marks to examine whether the crossover from underestimation to overestimation differs across conditions. Preliminary inspection suggests the crossover occurs earliest for the Vertical Bar (around 75% remaining), later for the Horizontal Bar (around 60% remaining), and latest for the Ring (around 50% remaining).

Figure 12: Mean Deviations Around Key Progress Thresholds by Condition
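
The window selection for the zoomed view can be sketched as a simple filter. The inclusive boundaries and the example values are assumptions; the thresholds and the 5-point half-width come from the description above.

```python
THRESHOLDS = (25, 50, 75)
HALF_WIDTH = 5  # half-width of each window, in percentage points


def nearest_window(remaining_pct: float):
    """Return the threshold whose window contains the value, else None."""
    for threshold in THRESHOLDS:
        # Assumption: window boundaries are inclusive.
        if abs(remaining_pct - threshold) <= HALF_WIDTH:
            return threshold
    return None


# Hypothetical actual-remaining values from several probes.
values = [78.0, 60.0, 52.5, 20.0, 31.0]
kept = {v: nearest_window(v) for v in values if nearest_window(v) is not None}
print(kept)
```

Under this filter, a probe at 60% remaining falls outside all three windows, so a crossover near 60% would have to be read from the full-range figures rather than from the zoomed view.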

2.5.1 Distance Reversers

Figure 13: Mean Deviations Around Key Progress Thresholds (Reversed Sample)

2.6 Mean Absolute Deviation (MAD) by Condition

Mean absolute deviation (MAD) captures overall inaccuracy in perceived distance regardless of direction. The table and pairwise tests below compare MAD across conditions for participants with fewer than 3 distance reversals.

Table 7

Condition	Mean MAD	SD
Horizontal Bar	4.85	4.61
Vertical Bar	5.53	3.83
Ring	3.55	2.81

Table 8


    Pairwise comparisons using t tests with pooled SD 

data:  df_mad$mad and df_mad$cond 

             Horizontal Bar Vertical Bar
Vertical Bar 0.20103        -           
Ring         0.03042        0.00036     

P value adjustment method: holm
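
One plausible way to arrive at condition-level MAD values like those in Table 7 is to compute one MAD per participant and then summarize by condition. The per-participant aggregation is an assumption inferred from the data frame name in the test output, and all values below are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def participant_mad(deviations):
    """Mean absolute deviation across one participant's probes."""
    return mean(abs(d) for d in deviations)


# Hypothetical signed deviations: participant id -> (condition, deviations).
data = {
    "p1": ("Ring", [-4, 2, 3]),
    "p2": ("Ring", [-1, 0, 1, 2]),
    "p3": ("Horizontal Bar", [-8, 6, 4]),
    "p4": ("Horizontal Bar", [-2, -2, 5]),
}

# Collect one MAD per participant, grouped by condition.
by_condition = defaultdict(list)
for cond, devs in data.values():
    by_condition[cond].append(participant_mad(devs))

for cond, mads in by_condition.items():
    print(cond, "mean MAD =", mean(mads), "SD =", round(stdev(mads), 2))
```

Because absolute values discard sign, a participant who overestimates and underestimates by similar amounts has a signed mean near zero but a clearly nonzero MAD.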

2.6.1 Distance Reversers

Table 9

Condition	Mean MAD	SD
Horizontal Bar	6.09	4.32
Vertical Bar	6.33	4.39
Ring	4.51	4.01

Table 10


    Pairwise comparisons using t tests with pooled SD 

data:  df_mad_rev$mad and df_mad_rev$cond 

             Horizontal Bar Vertical Bar
Vertical Bar 0.81           -           
Ring         0.27           0.27        

P value adjustment method: holm

2.7 Mixed Model: Condition x Remaining Distance Interaction

To formally test whether the trajectory of signed deviations across the task differs by condition, we fit a linear mixed model with fixed effects for a natural cubic spline of remaining distance (3 degrees of freedom), condition, and their interaction. Participant ID was included as a random intercept to account for the repeated-measures structure. Models are restricted to participants who reached the goal.
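
Written out, the model formula reported in the output (probe_dev ~ ns(prog_rem_perc, df = 3) * cond + (1 | id)) corresponds to the specification below. The symbols are notation introduced here for illustration: i indexes participants, j indexes probes, x is remaining distance (%), B_k are the three natural spline basis functions, and Horizontal Bar is the reference condition, as implied by the coefficient names.

```latex
\begin{aligned}
\text{probe\_dev}_{ij} &= \beta_0 + \sum_{k=1}^{3} \beta_k B_k(x_{ij})
  + \sum_{c \in \{\mathrm{VB},\,\mathrm{Ring}\}} \gamma_c \,\mathbb{1}[\text{cond}_i = c] \\
&\quad + \sum_{c \in \{\mathrm{VB},\,\mathrm{Ring}\}} \sum_{k=1}^{3}
  \delta_{ck}\, B_k(x_{ij})\,\mathbb{1}[\text{cond}_i = c]
  + u_i + \varepsilon_{ij}, \\
u_i &\sim \mathcal{N}(0, \sigma_u^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\end{aligned}
```

The interaction coefficients are the terms that test whether the shape of the deviation curve differs from the Horizontal Bar curve. Because spline basis coefficients are not individually interpretable as slopes, the fitted curves are usually easier to read than the coefficient table.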

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: probe_dev ~ ns(prog_rem_perc, df = 3) * cond + (1 | id)
   Data: df_long_dev_goal

REML criterion at convergence: 20282.9

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-10.5439  -0.3514  -0.0144   0.3864   6.0915 

Random effects:
 Groups   Name        Variance Std.Dev.
 id       (Intercept) 13.34    3.653   
 Residual             50.33    7.094   
Number of obs: 2950, groups:  id, 295

Fixed effects:
                                             Estimate Std. Error        df
(Intercept)                                    5.3092     0.9846 2550.1547
ns(prog_rem_perc, df = 3)1                    -1.4524     0.9651 2649.5067
ns(prog_rem_perc, df = 3)2                   -10.4814     2.2158 2651.1846
ns(prog_rem_perc, df = 3)3                    -5.5346     0.7647 2648.2713
condVertical Bar                              -1.4010     1.3410 2538.5146
condRing                                      -2.1374     1.3301 2545.9300
ns(prog_rem_perc, df = 3)1:condVertical Bar    4.1751     1.3185 2649.8355
ns(prog_rem_perc, df = 3)2:condVertical Bar    2.1175     3.0177 2652.2416
ns(prog_rem_perc, df = 3)3:condVertical Bar   -1.1170     1.0472 2648.3329
ns(prog_rem_perc, df = 3)1:condRing           -0.6084     1.3042 2649.9169
ns(prog_rem_perc, df = 3)2:condRing            4.4842     2.9920 2652.0760
ns(prog_rem_perc, df = 3)3:condRing            0.5973     1.0406 2648.2424
                                            t value Pr(>|t|)    
(Intercept)                                   5.392 7.60e-08 ***
ns(prog_rem_perc, df = 3)1                   -1.505  0.13247    
ns(prog_rem_perc, df = 3)2                   -4.730 2.36e-06 ***
ns(prog_rem_perc, df = 3)3                   -7.238 5.95e-13 ***
condVertical Bar                             -1.045  0.29625    
condRing                                     -1.607  0.10819    
ns(prog_rem_perc, df = 3)1:condVertical Bar   3.167  0.00156 ** 
ns(prog_rem_perc, df = 3)2:condVertical Bar   0.702  0.48292    
ns(prog_rem_perc, df = 3)3:condVertical Bar  -1.067  0.28622    
ns(prog_rem_perc, df = 3)1:condRing          -0.466  0.64090    
ns(prog_rem_perc, df = 3)2:condRing           1.499  0.13406    
ns(prog_rem_perc, df = 3)3:condRing           0.574  0.56606    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) ns(__,d=3)1 ns(__,d=3)2 ns(__,d=3)3 cndVrB cndRng
ns(__,d=3)1 -0.218                                                  
ns(__,d=3)2 -0.870  0.062                                           
ns(__,d=3)3 -0.207  0.023       0.276                               
condVrtclBr -0.734  0.160       0.638       0.152                   
condRing    -0.740  0.162       0.644       0.154       0.543       
n(__,d=3)1B  0.160 -0.732      -0.045      -0.017      -0.218 -0.118
n(__,d=3)2B  0.638 -0.045      -0.734      -0.203      -0.868 -0.473
n(__,d=3)3B  0.151 -0.016      -0.202      -0.730      -0.206 -0.112
n(__,d=3)1:  0.162 -0.740      -0.046      -0.017      -0.119 -0.222
n(__,d=3)2:  0.644 -0.046      -0.741      -0.204      -0.473 -0.869
n(__,d=3)3:  0.152 -0.017      -0.203      -0.735      -0.112 -0.209
            n(__,d=3)1B n(__,d=3)2B n(__,d=3)3B n(__,d=3)1: n(__,d=3)2:
ns(__,d=3)1                                                            
ns(__,d=3)2                                                            
ns(__,d=3)3                                                            
condVrtclBr                                                            
condRing                                                               
n(__,d=3)1B                                                            
n(__,d=3)2B  0.061                                                     
n(__,d=3)3B  0.022       0.275                                         
n(__,d=3)1:  0.542       0.034       0.012                             
n(__,d=3)2:  0.033       0.544       0.149       0.064                 
n(__,d=3)3:  0.012       0.149       0.537       0.017       0.280

The figure below compares the raw loess curves (dashed) against the model fitted values (solid) to assess how well the spline captures the observed patterns.

Figure 14: Model Fit vs. Raw Loess Curves by Condition

2.7.1 Distance Reversers

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: probe_dev ~ ns(prog_rem_perc, df = 3) * cond + (1 | id)
   Data: df_long_dev_rev

REML criterion at convergence: 8891.4

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-4.3868 -0.3919 -0.0677  0.1955  7.6198 

Random effects:
 Groups   Name        Variance Std.Dev.
 id       (Intercept)   5.729   2.393  
 Residual             144.569  12.024  
Number of obs: 1140, groups:  id, 114

Fixed effects:
                                             Estimate Std. Error        df
(Intercept)                                   13.2222     1.9065 1125.3148
ns(prog_rem_perc, df = 3)1                    -2.4578     2.1624 1021.7439
ns(prog_rem_perc, df = 3)2                   -23.6792     4.6493 1021.9012
ns(prog_rem_perc, df = 3)3                    -5.3997     1.7322 1019.6413
condVertical Bar                               4.3867     3.0719 1126.0116
condRing                                       3.6978     3.0934 1125.4777
ns(prog_rem_perc, df = 3)1:condVertical Bar    4.2325     3.4323 1021.9315
ns(prog_rem_perc, df = 3)2:condVertical Bar   -9.2434     7.4732 1025.5782
ns(prog_rem_perc, df = 3)3:condVertical Bar    3.1396     2.7506 1019.0970
ns(prog_rem_perc, df = 3)1:condRing           -5.3844     3.5130 1021.3940
ns(prog_rem_perc, df = 3)2:condRing           -9.6029     7.5251 1023.1885
ns(prog_rem_perc, df = 3)3:condRing           -0.1657     2.7900 1020.3052
                                            t value Pr(>|t|)    
(Intercept)                                   6.935 6.82e-12 ***
ns(prog_rem_perc, df = 3)1                   -1.137  0.25598    
ns(prog_rem_perc, df = 3)2                   -5.093 4.19e-07 ***
ns(prog_rem_perc, df = 3)3                   -3.117  0.00188 ** 
condVertical Bar                              1.428  0.15356    
condRing                                      1.195  0.23219    
ns(prog_rem_perc, df = 3)1:condVertical Bar   1.233  0.21781    
ns(prog_rem_perc, df = 3)2:condVertical Bar  -1.237  0.21642    
ns(prog_rem_perc, df = 3)3:condVertical Bar   1.141  0.25396    
ns(prog_rem_perc, df = 3)1:condRing          -1.533  0.12565    
ns(prog_rem_perc, df = 3)2:condRing          -1.276  0.20221    
ns(prog_rem_perc, df = 3)3:condRing          -0.059  0.95266    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) ns(__,d=3)1 ns(__,d=3)2 ns(__,d=3)3 cndVrB cndRng
ns(__,d=3)1 -0.217                                                  
ns(__,d=3)2 -0.922  0.027                                           
ns(__,d=3)3 -0.201  0.012       0.259                               
condVrtclBr -0.621  0.134       0.572       0.125                   
condRing    -0.616  0.134       0.568       0.124       0.382       
n(__,d=3)1B  0.137 -0.630      -0.017      -0.007      -0.222 -0.084
n(__,d=3)2B  0.574 -0.017      -0.622      -0.161      -0.925 -0.354
n(__,d=3)3B  0.127 -0.007      -0.163      -0.630      -0.207 -0.078
n(__,d=3)1:  0.133 -0.616      -0.017      -0.007      -0.083 -0.220
n(__,d=3)2:  0.570 -0.017      -0.618      -0.160      -0.354 -0.923
n(__,d=3)3:  0.125 -0.007      -0.161      -0.621      -0.078 -0.207
            n(__,d=3)1B n(__,d=3)2B n(__,d=3)3B n(__,d=3)1: n(__,d=3)2:
ns(__,d=3)1                                                            
ns(__,d=3)2                                                            
ns(__,d=3)3                                                            
condVrtclBr                                                            
condRing                                                               
n(__,d=3)1B                                                            
n(__,d=3)2B  0.035                                                     
n(__,d=3)3B  0.011       0.265                                         
n(__,d=3)1:  0.388       0.010       0.005                             
n(__,d=3)2:  0.011       0.384       0.101       0.033                 
n(__,d=3)3:  0.005       0.100       0.391       0.006       0.264

Figure 15: Model Fit vs. Raw Loess Curves by Condition (Reversed Sample)
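
As a small worked calculation, the variance components printed in the two model outputs can be turned into an intraclass correlation (the share of total variance attributable to the participant intercept). This quantity is not reported in the original analyses; it is computed here only as an illustrative reading of the random-effects tables.

```python
def icc(var_between: float, var_residual: float) -> float:
    """Share of total variance attributable to the participant intercept."""
    return var_between / (var_between + var_residual)


# Variance components copied from the two model outputs.
main_sample = icc(13.34, 50.33)
reversers = icc(5.729, 144.569)
print(f"Main sample ICC: {main_sample:.3f}")
print(f"Distance reversers ICC: {reversers:.3f}")
```

The much larger residual variance in the reverser sample means that relatively little of its variation is stable between participants, which is in line with the noisier responding described in Section 1.1.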

3 Keystroke Analyses

Average keystrokes per interval are used as a behavioral measure of effort or engagement during the task. Analyses are restricted to participants with fewer than 3 distance reversals who reached the goal. Keystrokes are examined separately for each of the 10 intervals to capture how engagement evolves over the course of the task.

Table 11: Mean Keystrokes by Interval and Condition

Condition	Interval	Mean Keystrokes	SD
Horizontal Bar	1	18.80	26.90
Horizontal Bar	2	25.40	31.50
Horizontal Bar	3	14.80	12.04
Horizontal Bar	4	18.62	16.95
Horizontal Bar	5	19.09	14.65
Horizontal Bar	6	20.43	22.36
Horizontal Bar	7	23.65	24.85
Horizontal Bar	8	22.31	19.02
Horizontal Bar	9	21.34	22.21
Horizontal Bar	10	21.72	16.62
Vertical Bar	1	16.51	14.92
Vertical Bar	2	24.08	38.20
Vertical Bar	3	18.35	18.31
Vertical Bar	4	21.17	31.58
Vertical Bar	5	19.99	17.64
Vertical Bar	6	22.66	24.23
Vertical Bar	7	24.39	28.67
Vertical Bar	8	22.19	29.58
Vertical Bar	9	24.96	31.72
Vertical Bar	10	25.22	26.79
Ring	1	15.04	7.34
Ring	2	19.75	16.68
Ring	3	15.36	16.22
Ring	4	18.76	15.14
Ring	5	18.28	17.72
Ring	6	19.60	19.01
Ring	7	24.56	26.80
Ring	8	21.00	18.23
Ring	9	25.22	25.12
Ring	10	23.25	24.68

Figure 16: Mean Keystrokes per Interval by Condition