Databases

Age Freq
[Decline to Answer] 7
18 54
19 55
20 35
21 27
22 18
23 6
24 5
25 1
27 1
29 4
33 3
39 1
46 1
55 1
N/A 1
## [1] 220
## [1] 220
## [1] 220
## Warning in mean(as.numeric(as.character(sequential_participant$age)), na.rm =
## T): NAs introduced by coercion
## [1] 20.49528
## Warning in is.data.frame(x): NAs introduced by coercion
## [1] 4.141453
## 
## 1 Female   2 Male  3 Other 
##      161       56        3
## 
##                             [Decline to Answer] 
##                                               7 
##                     1 Black or African American 
##                                              38 
##                             10 Another identity 
##                                               3 
##              2 American Indian or Alaska Native 
##                                               1 
##           3 Native Hawaiian or Pacific Islander 
##                                               1 
##                       4 Asian or Asian American 
##                                              32 
##                    5 White or European American 
##                                              21 
## 6 Latino or Hispanic or Chicano or Puerto Rican 
##                                              66 
##               7 Middle Eastern or North African 
##                                              26 
##                   8 South Asian or Asian Indian 
##                                              19 
##                                   9 Multiracial 
##                                               6
## 
##                 [Decline to Answer]                 1 Arts and Sciences 
##                                   9                                  66 
##          11 Undeclared or Uncertain                          2 Business 
##                                  20                                  15 
##                        3 Psychology                         4 Education 
##                                  55                                   1 
##                         5 Sociology                           6 Nursing 
##                                   3                                  25 
##                               7 Law                  8 Criminal Justice 
##                                   1                                  24 
## 9 Public Affairs and Administration 
##                                   1
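
For reporting, the raw counts above can be converted to percentages of the 220 participants. A minimal sketch follows; the column name race is illustrative only, since the demographic variable names (apart from age) are not shown in this section.

# Percentages for a demographic frequency table; "race" is a placeholder name
round(100 * prop.table(table(sequential_participant$race)), 1)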

Group Level Analyses
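
The code in this and later sections calls functions from several add-on packages. The original setup chunk is not shown here, so the following loading block is a sketch inferred from the function names used (summarySE, cohensD, cohen.d, ezANOVA, n_clusters, lm.beta, spread, left_join), not necessarily the authors' exact setup.

# Sketch of the packages implied by the functions called below; the actual
# setup chunk is not shown in this section.
library(Rmisc)      # summarySE (loads plyr; load before dplyr to limit masking)
library(dplyr)      # %>%, left_join
library(tidyr)      # spread, separate
library(lsr)        # cohensD
library(effsize)    # cohen.d (assumed source of cohen.d here)
library(ez)         # ezANOVA
library(parameters) # n_clusters
library(lm.beta)    # lm.beta
library(ggplot2)    # figures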

Contrasting Performance on Within- and Cross-Notation Comparisons

agg_all_participants8_accuracy_gather_both = aggregate(accuracy ~ participant * cross, agg_all_participants8_accuracy_gatherv2_withdem, mean)
summarySE(agg_all_participants8_accuracy_gather_both, "accuracy","cross")
##    cross   N  accuracy        sd         se         ci
## 1  cross 220 0.8448232 0.1669978 0.01125899 0.02218984
## 2 within 220 0.8722222 0.1697333 0.01144342 0.02255332
t.test(accuracy ~ cross, agg_all_participants8_accuracy_gather_both, paired =T)
## 
##  Paired t-test
## 
## data:  accuracy by cross
## t = -3.9641, df = 219, p-value = 9.974e-05
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -0.04102104 -0.01377694
## sample estimates:
## mean difference 
##     -0.02739899
cohensD(accuracy ~  cross, data = agg_all_participants8_accuracy_gather_both, method = "paired")
## Warning in cohensD(accuracy ~ cross, data =
## agg_all_participants8_accuracy_gather_both, : calculating paired samples
## Cohen's d using formula input. Results will be incorrect if cases do not appear
## in the same order for both levels of the grouping factor
## [1] 0.2672609
agg_all_participants8_accuracy_gather_both_spread = spread(agg_all_participants8_accuracy_gather_both, cross, accuracy)

cor.test(agg_all_participants8_accuracy_gather_both_spread$cross, agg_all_participants8_accuracy_gather_both_spread$within)
## 
##  Pearson's product-moment correlation
## 
## data:  agg_all_participants8_accuracy_gather_both_spread$cross and agg_all_participants8_accuracy_gather_both_spread$within
## t = 20.747, df = 218, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.7648997 0.8548836
## sample estimates:
##       cor 
## 0.8147403
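
The cohensD() warning above arises because a paired d was requested with formula input, which relies on row order. One way to sidestep it, sketched here under the assumption that the spread data frame has one row per participant, is to pass the two condition columns directly:

# Same paired Cohen's d, computed from the wide-format columns so that
# pairing by participant is explicit (avoids the formula-input warning)
cohensD(agg_all_participants8_accuracy_gather_both_spread$within,
        agg_all_participants8_accuracy_gather_both_spread$cross,
        method = "paired")
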
# split participants at .9167 (11/12 correct) cross-notation accuracy
agg_all_participants8_accuracy_gather_both_spread$crossmedian = ifelse(agg_all_participants8_accuracy_gather_both_spread$cross>.9167, "high", "low")

Within-Notation Magnitude Comparison

agg_all_participants8_accuracy_gather_within = subset(agg_all_participants8_accuracy_gatherv2_withdem, cross == "within")
agg_all_participants8_accuracy_gather_within = aggregate(accuracy ~ participant * type, agg_all_participants8_accuracy_gather_within,  mean)
ezANOVA(agg_all_participants8_accuracy_gather_within, dv = .(accuracy), wid = .(participant), within  = .c(type))
## Warning: Converting "participant" to factor for ANOVA.
## Warning: Converting "type" to factor for ANOVA.
## $ANOVA
##   Effect DFn DFd        F            p p<.05        ges
## 2   type   2 438 50.11716 2.512447e-20     * 0.08389823
## 
## $`Mauchly's Test for Sphericity`
##   Effect         W         p p<.05
## 2   type 0.9838079 0.1687427      
## 
## $`Sphericity Corrections`
##   Effect       GGe        p[GG] p[GG]<.05       HFe        p[HF] p[HF]<.05
## 2   type 0.9840659 4.815195e-20         * 0.9929164 3.354957e-20         *
summarySE(agg_all_participants8_accuracy_gather_within, "accuracy", "type")
##   type   N  accuracy        sd          se         ci
## 1  dvd 220 0.8401515 0.2594319 0.017490894 0.03447202
## 2  fvf 220 0.8121212 0.2489033 0.016781058 0.03307304
## 3  pvp 220 0.9643939 0.1217990 0.008211686 0.01618405
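
The three pairwise contrasts that follow are run one pair at a time. An equivalent single call is sketched here; it assumes rows are ordered identically by participant within each type, as aggregate() returns them.

# All three within-notation contrasts at once; pairing is by row order
pairwise.t.test(agg_all_participants8_accuracy_gather_within$accuracy,
                agg_all_participants8_accuracy_gather_within$type,
                paired = TRUE, p.adjust.method = "none")
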
agg_all_participants8_accuracy_gather_within_nodvd = subset(agg_all_participants8_accuracy_gather_within, type != "dvd")
t.test(accuracy ~ type, agg_all_participants8_accuracy_gather_within_nodvd, paired = T)
## 
##  Paired t-test
## 
## data:  accuracy by type
## t = -9.1216, df = 219, p-value < 2.2e-16
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -0.1851734 -0.1193720
## sample estimates:
## mean difference 
##      -0.1522727
cohensD(accuracy ~  type, data = agg_all_participants8_accuracy_gather_within_nodvd, method = "paired")
## Warning in cohensD(accuracy ~ type, data =
## agg_all_participants8_accuracy_gather_within_nodvd, : calculating paired
## samples Cohen's d using formula input. Results will be incorrect if cases do
## not appear in the same order for both levels of the grouping factor
## [1] 0.6149793
agg_all_participants8_accuracy_gather_within_nofvf = subset(agg_all_participants8_accuracy_gather_within, type != "fvf")
t.test(accuracy ~ type, agg_all_participants8_accuracy_gather_within_nofvf, paired = T)
## 
##  Paired t-test
## 
## data:  accuracy by type
## t = -7.4395, df = 219, p-value = 2.271e-12
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -0.15715626 -0.09132858
## sample estimates:
## mean difference 
##      -0.1242424
cohensD(accuracy ~  type, data = agg_all_participants8_accuracy_gather_within_nofvf, method = "paired")
## Warning in cohensD(accuracy ~ type, data =
## agg_all_participants8_accuracy_gather_within_nofvf, : calculating paired
## samples Cohen's d using formula input. Results will be incorrect if cases do
## not appear in the same order for both levels of the grouping factor
## [1] 0.5015737
agg_all_participants8_accuracy_gather_within_nopvp = subset(agg_all_participants8_accuracy_gather_within, type != "pvp")
t.test(accuracy ~ type, agg_all_participants8_accuracy_gather_within_nopvp, paired = T)
## 
##  Paired t-test
## 
## data:  accuracy by type
## t = 1.8533, df = 219, p-value = 0.06519
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -0.001778429  0.057839035
## sample estimates:
## mean difference 
##       0.0280303
t.test(accuracy ~ type, agg_all_participants8_accuracy_gather_within_nopvp, paired = T)$statistic
##        t 
## 1.853271
cohensD(accuracy ~  type, data = agg_all_participants8_accuracy_gather_within_nopvp, method = "paired")
## Warning in cohensD(accuracy ~ type, data =
## agg_all_participants8_accuracy_gather_within_nopvp, : calculating paired
## samples Cohen's d using formula input. Results will be incorrect if cases do
## not appear in the same order for both levels of the grouping factor
## [1] 0.1249475

Cross-Notation Magnitude Comparison

agg_all_participants8_accuracy_gather_cross = subset(agg_all_participants8_accuracy_gatherv2_withdem, cross == "cross")

# type codes name the notation of the larger value followed by the other notation
# (e.g., "pgtf" = percent greater than fraction); "components" records which two
# notations are compared (df, dp, or fp)
agg_all_participants8_accuracy_gather_cross$components = ifelse(agg_all_participants8_accuracy_gather_cross$type == "dgtf" | 
                                                                  agg_all_participants8_accuracy_gather_cross$type == "fgtd", "df",
                                                                ifelse(agg_all_participants8_accuracy_gather_cross$type == "dgtp" |
                                                                         agg_all_participants8_accuracy_gather_cross$type == "pgtd", "dp","fp"))

# split the type code at the "t" so that "greater" holds the larger notation (dg, fg, or pg)
agg_all_participants8_accuracy_gather_cross = separate(agg_all_participants8_accuracy_gather_cross, type, c("greater","other"), sep =  "t", remove = F)
agg_all_participants8_accuracy_gather_cross = aggregate(accuracy ~ participant * greater * components, agg_all_participants8_accuracy_gather_cross, mean)

summarySE(agg_all_participants8_accuracy_gather_cross, "accuracy", c("components","greater"))
##   components greater   N  accuracy        sd         se         ci
## 1         df      dg 220 0.8416667 0.2358233 0.01589920 0.03133503
## 2         df      fg 220 0.8234848 0.2478297 0.01670868 0.03293038
## 3         dp      dg 220 0.8143939 0.2573523 0.01735069 0.03419570
## 4         dp      pg 220 0.9363636 0.1543909 0.01040903 0.02051469
## 5         fp      fg 220 0.7439394 0.2985482 0.02012812 0.03966961
## 6         fp      pg 220 0.9090909 0.1761191 0.01187395 0.02340183

Pairwise Comparisons

Fractions vs. Percentages

rutgers_dataset_between_gather_fp = subset(agg_all_participants8_accuracy_gather_cross, components =="fp")
rutgers_dataset_between_gather_fp$greater = as.factor(as.character(rutgers_dataset_between_gather_fp$greater))
rutgers_dataset_between_gather_fp$greater <- factor(rutgers_dataset_between_gather_fp$greater, levels=c("pg","fg"))
rutgers_dataset_between_gather_fp$comparison = "Percent vs. Fraction \nComparisons"
summarySE(rutgers_dataset_between_gather_fp, "accuracy", "greater")
##   greater   N  accuracy        sd         se         ci
## 1      pg 220 0.9090909 0.1761191 0.01187395 0.02340183
## 2      fg 220 0.7439394 0.2985482 0.02012812 0.03966961
t.test(accuracy  ~ greater, rutgers_dataset_between_gather_fp, paired = T)
## 
##  Paired t-test
## 
## data:  accuracy by greater
## t = 8.403, df = 219, p-value = 5.517e-15
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  0.1264167 0.2038863
## sample estimates:
## mean difference 
##       0.1651515
t.test(accuracy  ~ greater, rutgers_dataset_between_gather_fp, paired = T)$statistic
##        t 
## 8.403026
cohensD(accuracy ~  greater, data = rutgers_dataset_between_gather_fp, method = "paired")
## Warning in cohensD(accuracy ~ greater, data =
## rutgers_dataset_between_gather_fp, : calculating paired samples Cohen's d using
## formula input. Results will be incorrect if cases do not appear in the same
## order for both levels of the grouping factor
## [1] 0.5665319

Decimals vs. Percentages

rutgers_dataset_between_gather_dp = subset(agg_all_participants8_accuracy_gather_cross, components =="dp")
rutgers_dataset_between_gather_dp$greater = as.factor(as.character(rutgers_dataset_between_gather_dp$greater))
rutgers_dataset_between_gather_dp$greater <- factor(rutgers_dataset_between_gather_dp$greater, levels=c("pg","dg"))
rutgers_dataset_between_gather_dp$comparison = "Percent vs. Decimal \nComparisons"
summarySE(rutgers_dataset_between_gather_dp, "accuracy", "greater")
##   greater   N  accuracy        sd         se         ci
## 1      pg 220 0.9363636 0.1543909 0.01040903 0.02051469
## 2      dg 220 0.8143939 0.2573523 0.01735069 0.03419570
t.test(accuracy  ~ greater, rutgers_dataset_between_gather_dp, paired = T)
## 
##  Paired t-test
## 
## data:  accuracy by greater
## t = 6.6862, df = 219, p-value = 1.877e-10
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  0.08601756 0.15792183
## sample estimates:
## mean difference 
##       0.1219697
t.test(accuracy  ~ greater, rutgers_dataset_between_gather_dp, paired = T)$statistic
##       t 
## 6.68624
cohensD(accuracy ~  greater, data = rutgers_dataset_between_gather_dp, method = "paired")
## Warning in cohensD(accuracy ~ greater, data =
## rutgers_dataset_between_gather_dp, : calculating paired samples Cohen's d using
## formula input. Results will be incorrect if cases do not appear in the same
## order for both levels of the grouping factor
## [1] 0.4507862

Decimals vs. Fractions

rutgers_dataset_between_gather_df = subset(agg_all_participants8_accuracy_gather_cross, components =="df")
rutgers_dataset_between_gather_df$greater = as.factor(as.character(rutgers_dataset_between_gather_df$greater))
rutgers_dataset_between_gather_df$greater <- factor(rutgers_dataset_between_gather_df$greater, levels=c("dg","fg"))
rutgers_dataset_between_gather_df$comparison = "Decimal vs. Fraction \nComparisons"
summarySE(rutgers_dataset_between_gather_df, "accuracy", "greater")
##   greater   N  accuracy        sd         se         ci
## 1      dg 220 0.8416667 0.2358233 0.01589920 0.03133503
## 2      fg 220 0.8234848 0.2478297 0.01670868 0.03293038
t.test(accuracy  ~ greater, rutgers_dataset_between_gather_df, paired = T)
## 
##  Paired t-test
## 
## data:  accuracy by greater
## t = 0.983, df = 219, p-value = 0.3267
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -0.01827151  0.05463515
## sample estimates:
## mean difference 
##      0.01818182
t.test(accuracy  ~ greater, rutgers_dataset_between_gather_df, paired = T)$statistic
##        t 
## 0.983003
t.test(accuracy  ~ greater, rutgers_dataset_between_gather_df, paired = T)$p.value
## [1] 0.3266907
cohensD(accuracy ~  greater, data = rutgers_dataset_between_gather_df, method = "paired")
## Warning in cohensD(accuracy ~ greater, data =
## rutgers_dataset_between_gather_df, : calculating paired samples Cohen's d using
## formula input. Results will be incorrect if cases do not appear in the same
## order for both levels of the grouping factor
## [1] 0.06627405

Figure 2


Figure 2. Percent correct for cross-notation magnitude comparison: (A) percent vs. fraction comparisons (e.g., 2/5 vs. 25%), (B) percent vs. decimal comparisons (e.g., 40% vs. .25), and (C) decimal vs. fraction comparisons (e.g., .40 vs. 1/4). Participants exhibited a bias toward selecting percentages as larger than fractions and decimals; however, there was no such bias for the decimal vs. fraction comparisons. Gray lines represent individual participants’ average scores in each condition. Thicker gray lines indicate more participants with the same scores. Error bars represent ± 1 Standard Error. Note. ***p < .001.

Cluster Analyses

Determining the number of clusters

rutgers_dataset_between_gather_fp_spread = rutgers_dataset_between_gather_fp[c("participant","greater","accuracy")]

rutgers_dataset_between_gather_fp_spread = spread(data = rutgers_dataset_between_gather_fp_spread, value = accuracy, key = greater)
set.seed(240)
rutgers_n_clust <- n_clusters(rutgers_dataset_between_gather_fp_spread[-1],
                      package = "all",
                      standardize = FALSE, n_max = 10)
rutgers_n_clust
## # Method Agreement Procedure:
## 
## The choice of 4 clusters is supported by 7 (25.00%) methods out of 28 (Gap_Maechler2012, Gap_Dudoit2002, trcovw, Ratkowsky, PtBiserial, Mcclain, SDindex).
kmax = 10 # maximum number of clusters to examine
totwss = rep(0,kmax) # total within-group sum of squares for each k
kmfit = list() # create an empty list to store each k-means fit
for (i in 1:kmax){
  kclus = kmeans(rutgers_dataset_between_gather_fp_spread[-1],centers=i,iter.max=20)
  totwss[i] = kclus$tot.withinss
  kmfit[[i]] = kclus
}

# AIC-style criterion for a k-means fit: total within-cluster sum of squares
# penalized by twice the number of estimated centre coordinates (m * k)
kmeansAIC = function(fit){
  m = ncol(fit$centers)   # number of variables
  n = length(fit$cluster) # number of observations (not used below)
  k = nrow(fit$centers)   # number of clusters
  D = fit$tot.withinss    # total within-cluster sum of squares
  return(D + 2*m*k)
}

aic=sapply(kmfit,kmeansAIC)
plot(seq(1,kmax),aic,xlab="Number of clusters",ylab="AIC",pch=20,cex=2)

n = nrow(rutgers_dataset_between_gather_fp_spread[-1])
rsq = 1-(totwss*(n-1))/(totwss[1]*(n-seq(1,kmax))) # adjusted R-squared analogue for each number of clusters
cbind(aic,rsq)
##            aic       rsq
##  [1,] 30.31263 0.0000000
##  [2,] 18.56122 0.5967841
##  [3,] 19.30173 0.7199433
##  [4,] 20.94566 0.8094317
##  [5,] 23.58011 0.8614082
##  [6,] 27.02278 0.8824365
##  [7,] 31.03269 0.8814972
##  [8,] 34.43551 0.9043834
##  [9,] 38.41118 0.9048899
## [10,] 42.44341 0.9031594
set.seed(240)
rutgers_kmeans.re <- kmeans(rutgers_dataset_between_gather_fp_spread[-c(1)], centers = 4, nstart = 30, iter.max=500)
rutgers_kmeans.re
## K-means clustering with 4 clusters of sizes 126, 35, 37, 22
## 
## Cluster means:
##          pg        fg
## 1 0.9788360 0.9589947
## 2 0.9666667 0.6047619
## 3 0.8603604 0.2117117
## 4 0.5000000 0.6287879
## 
## Clustering vector:
##   [1] 1 4 1 1 2 2 1 1 2 1 1 1 3 1 1 4 2 3 2 3 1 1 3 2 3 1 1 3 1 1 2 4 2 1 1 1 1
##  [38] 1 1 1 4 1 1 3 1 2 4 1 1 1 2 1 3 1 3 1 1 3 2 3 2 2 1 1 1 1 1 1 1 1 1 3 3 1
##  [75] 3 3 1 1 1 1 1 1 2 2 1 1 4 4 1 2 4 3 4 1 3 3 1 2 1 1 1 2 3 2 3 1 3 1 1 2 3
## [112] 3 4 1 1 3 3 1 1 2 4 3 1 1 2 1 1 2 1 1 1 1 1 4 1 1 3 1 3 3 1 4 1 2 2 1 1 1
## [149] 1 1 1 1 1 1 1 3 2 1 1 1 1 2 1 3 1 3 1 1 2 2 1 4 1 1 1 1 1 4 1 3 3 2 1 1 1
## [186] 1 4 1 1 2 4 1 2 1 3 4 4 1 1 2 4 4 3 1 3 1 1 4 1 1 1 1 1 1 2 1 1 1 2 1
## 
## Within cluster sum of squares by cluster:
## [1] 1.2594797 0.3825397 1.8423423 1.0517677
##  (between_SS / total_SS =  82.8 %)
## 
## Available components:
## 
## [1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss"
## [6] "betweenss"    "size"         "iter"         "ifault"
rutgers_clusterclass = as.data.frame(rutgers_kmeans.re$cluster)
names(rutgers_clusterclass) ="cluster"
rutgers_clusterclass = cbind(rutgers_dataset_between_gather_fp_spread[1],rutgers_clusterclass)


rutgers_clusterclass$cluster = as.factor(as.character(rutgers_clusterclass$cluster))
# label each kmeans cluster using the cluster means printed above
# (1: high on both pg and fg; 2: moderate drop on fg; 3: large drop on fg; 4: at chance on pg, higher on fg)
levels(rutgers_clusterclass$cluster)[levels(rutgers_clusterclass$cluster ) == "1"]  <- "High Performing"
levels(rutgers_clusterclass$cluster)[levels(rutgers_clusterclass$cluster ) == "2"]  <- "Moderate Percentage Bias"
levels(rutgers_clusterclass$cluster)[levels(rutgers_clusterclass$cluster ) == "3"]  <- "Strong Percentage Bias"
levels(rutgers_clusterclass$cluster)[levels(rutgers_clusterclass$cluster ) == "4"]  <- "Fraction Bias"

rutgers_clusterclass$cluster <- factor(rutgers_clusterclass$cluster, levels=c("High Performing","Strong Percentage Bias", "Moderate Percentage Bias", "Fraction Bias"))


agg_all_participants8_accuracy_gather_cross_cluster = agg_all_participants8_accuracy_gather_cross %>%
  left_join(rutgers_clusterclass, by = "participant")
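
As a quick check that the relabeling above mapped each kmeans cluster to the intended profile, the relabeled cluster sizes can be tabulated and compared with the sizes reported by kmeans (126, 37, 35, 22). A minimal sketch:

# Cluster sizes after relabeling; these should match the kmeans sizes above
table(rutgers_clusterclass$cluster)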

Cross-Notation Magnitude Comparison

Fractions vs. Percentages

rutgers_dataset_between_gather_fp = subset(agg_all_participants8_accuracy_gather_cross_cluster, components =="fp")
rutgers_dataset_between_gather_fp$greater = as.factor(as.character(rutgers_dataset_between_gather_fp$greater))
rutgers_dataset_between_gather_fp$greater <- factor(rutgers_dataset_between_gather_fp$greater, levels=c("pg","fg"))
rutgers_dataset_between_gather_fp$comparison = "Percent vs. Fraction \nComparisons"
summarySE(rutgers_dataset_between_gather_fp, "accuracy", c("greater","cluster"))
##   greater                  cluster   N  accuracy         sd          se
## 1      pg          High Performing 126 0.9788360 0.06987012 0.006224525
## 2      pg   Strong Percentage Bias  37 0.8603604 0.17792318 0.029250390
## 3      pg Moderate Percentage Bias  35 0.9666667 0.06763995 0.011433239
## 4      pg            Fraction Bias  22 0.5000000 0.13608276 0.029012943
## 5      fg          High Performing 126 0.9589947 0.07206944 0.006420456
## 6      fg   Strong Percentage Bias  37 0.2117117 0.13971227 0.022968556
## 7      fg Moderate Percentage Bias  35 0.6047619 0.08170682 0.013810973
## 8      fg            Fraction Bias  22 0.6287879 0.17766726 0.037878788
##           ci
## 1 0.01231911
## 2 0.05932254
## 3 0.02323514
## 4 0.06033572
## 5 0.01270688
## 6 0.04658239
## 7 0.02806727
## 8 0.07877325
ezANOVA(rutgers_dataset_between_gather_fp, dv = .(accuracy), wid = .(participant), within  = .c(greater), between = .c(cluster))
## Warning: Converting "participant" to factor for ANOVA.
## Warning: Data is unbalanced (unequal N per group). Make sure you specified a
## well-considered value for the type argument to ezANOVA().
## $ANOVA
##            Effect DFn DFd        F            p p<.05       ges
## 2         cluster   3 216 415.1468 2.295809e-89     * 0.7616300
## 3         greater   1 216 320.4293 1.508078e-44     * 0.3981025
## 4 cluster:greater   3 216 259.2712 2.692032e-71     * 0.6161994
ezANOVA(subset(rutgers_dataset_between_gather_fp, greater=="pg"), dv = .(accuracy), wid = .(participant),  between = .c(cluster))
## Warning: Converting "participant" to factor for ANOVA.

## Warning: Data is unbalanced (unequal N per group). Make sure you specified a
## well-considered value for the type argument to ezANOVA().
## Coefficient covariances computed by hccm()
## $ANOVA
##    Effect DFn DFd        F            p p<.05       ges
## 1 cluster   3 216 141.1753 1.176313e-50     * 0.6622498
## 
## $`Levene's Test for Homogeneity of Variance`
##   DFn DFd       SSn      SSd        F            p p<.05
## 1   3 216 0.4804341 2.046839 16.89985 6.745513e-10     *
ezANOVA(subset(rutgers_dataset_between_gather_fp, greater=="fg"), dv = .(accuracy), wid = .(participant),  between = .c(cluster))
## Warning: Converting "participant" to factor for ANOVA.

## Warning: Data is unbalanced (unequal N per group). Make sure you specified a
## well-considered value for the type argument to ezANOVA().
## Coefficient covariances computed by hccm()
## $ANOVA
##    Effect DFn DFd        F             p p<.05       ges
## 1 cluster   3 216 554.9106 3.452573e-101     * 0.8851511
## 
## $`Levene's Test for Homogeneity of Variance`
##   DFn DFd       SSn      SSd        F            p p<.05
## 1   3 216 0.2804062 1.908483 10.57869 1.612863e-06     *
pairwise.t.test(subset(rutgers_dataset_between_gather_fp, greater=="pg")$accuracy, subset(rutgers_dataset_between_gather_fp, greater=="pg")$cluster, p.adj = "none", paired = F)
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  subset(rutgers_dataset_between_gather_fp, greater == "pg")$accuracy and subset(rutgers_dataset_between_gather_fp, greater == "pg")$cluster 
## 
##                          High Performing Strong Percentage Bias
## Strong Percentage Bias   3.7e-09         -                     
## Moderate Percentage Bias 0.54            1.9e-05               
## Fraction Bias            < 2e-16         < 2e-16               
##                          Moderate Percentage Bias
## Strong Percentage Bias   -                       
## Moderate Percentage Bias -                       
## Fraction Bias            < 2e-16                 
## 
## P value adjustment method: none
pairwise.t.test(subset(rutgers_dataset_between_gather_fp, greater=="fg")$accuracy, subset(rutgers_dataset_between_gather_fp, greater=="fg")$cluster, p.adj = "none")
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  subset(rutgers_dataset_between_gather_fp, greater == "fg")$accuracy and subset(rutgers_dataset_between_gather_fp, greater == "fg")$cluster 
## 
##                          High Performing Strong Percentage Bias
## Strong Percentage Bias   <2e-16          -                     
## Moderate Percentage Bias <2e-16          <2e-16                
## Fraction Bias            <2e-16          <2e-16                
##                          Moderate Percentage Bias
## Strong Percentage Bias   -                       
## Moderate Percentage Bias -                       
## Fraction Bias            0.39                    
## 
## P value adjustment method: none
rutgers_dataset_between_gather_fp_pbs = subset(rutgers_dataset_between_gather_fp, cluster == "Strong Percentage Bias")
t.test(accuracy  ~ greater, rutgers_dataset_between_gather_fp_pbs, paired = T)
## 
##  Paired t-test
## 
## data:  accuracy by greater
## t = 21.519, df = 36, p-value < 2.2e-16
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  0.5875170 0.7097803
## sample estimates:
## mean difference 
##       0.6486486
cohensD(accuracy ~  greater, data = rutgers_dataset_between_gather_fp_pbs, method = "paired")
## Warning in cohensD(accuracy ~ greater, data =
## rutgers_dataset_between_gather_fp_pbs, : calculating paired samples Cohen's d
## using formula input. Results will be incorrect if cases do not appear in the
## same order for both levels of the grouping factor
## [1] 3.537776
rutgers_dataset_between_gather_fp_pbm = subset(rutgers_dataset_between_gather_fp, cluster == "Moderate Percentage Bias")
t.test(accuracy  ~ greater, rutgers_dataset_between_gather_fp_pbm, paired = T)
## 
##  Paired t-test
## 
## data:  accuracy by greater
## t = 19.359, df = 34, p-value < 2.2e-16
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  0.3239136 0.3998959
## sample estimates:
## mean difference 
##       0.3619048
cohen.d(accuracy ~ greater, data = rutgers_dataset_between_gather_fp_pbm, paired = T)
## Warning in cohen.d.formula(accuracy ~ greater, data =
## rutgers_dataset_between_gather_fp_pbm, : Trying to compute paired samples
## Cohen's d using formula input. Results may be incorrect if cases do not appear
## in the same order for both levels of the grouping factor. Use the format 'value
## ~ treatment | Subject(id)' to specify a subject id variable.
## 
## Cohen's d
## 
## d estimate: 4.828607 (large)
## 95 percent confidence interval:
##    lower    upper 
## 3.057857 6.599357
rutgers_dataset_between_gather_fp_nb = subset(rutgers_dataset_between_gather_fp, cluster == "High Performing")
t.test(accuracy  ~ greater, rutgers_dataset_between_gather_fp_nb, paired = T)
## 
##  Paired t-test
## 
## data:  accuracy by greater
## t = 2.1297, df = 125, p-value = 0.03516
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  0.001402706 0.038279834
## sample estimates:
## mean difference 
##      0.01984127
cohen.d(accuracy ~ greater, data = rutgers_dataset_between_gather_fp_nb, paired = T)
## Warning in cohen.d.formula(accuracy ~ greater, data =
## rutgers_dataset_between_gather_fp_nb, : Trying to compute paired samples
## Cohen's d using formula input. Results may be incorrect if cases do not appear
## in the same order for both levels of the grouping factor. Use the format 'value
## ~ treatment | Subject(id)' to specify a subject id variable.
## 
## Cohen's d
## 
## d estimate: 0.2795452 (small)
## 95 percent confidence interval:
##      lower      upper 
## 0.01602419 0.54306627
rutgers_dataset_between_gather_fp_fb = subset(rutgers_dataset_between_gather_fp, cluster == "Fraction Bias")
t.test(accuracy  ~ greater, rutgers_dataset_between_gather_fp_fb, paired = T)
## 
##  Paired t-test
## 
## data:  accuracy by greater
## t = -2.6992, df = 21, p-value = 0.01343
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -0.22801300 -0.02956276
## sample estimates:
## mean difference 
##      -0.1287879
cohen.d(accuracy ~ greater, data = rutgers_dataset_between_gather_fp_fb, paired = T)
## Warning in cohen.d.formula(accuracy ~ greater, data =
## rutgers_dataset_between_gather_fp_fb, : Trying to compute paired samples
## Cohen's d using formula input. Results may be incorrect if cases do not appear
## in the same order for both levels of the grouping factor. Use the format 'value
## ~ treatment | Subject(id)' to specify a subject id variable.
## 
## Cohen's d
## 
## d estimate: -0.8138413 (large)
## 95 percent confidence interval:
##     lower     upper 
## -1.515877 -0.111806

Decimals vs. Percentages

rutgers_dataset_between_gather_dp = subset(agg_all_participants8_accuracy_gather_cross_cluster, components =="dp")
rutgers_dataset_between_gather_dp$greater = as.factor(as.character(rutgers_dataset_between_gather_dp$greater))
rutgers_dataset_between_gather_dp$greater <- factor(rutgers_dataset_between_gather_dp$greater, levels=c("pg","dg"))
rutgers_dataset_between_gather_dp$comparison = "Percent vs. Decimal \nComparisons"
summarySE(rutgers_dataset_between_gather_dp, "accuracy", c("greater","cluster"))
##   greater                  cluster   N  accuracy         sd          se
## 1      pg          High Performing 126 0.9814815 0.05665577 0.005047297
## 2      pg   Strong Percentage Bias  37 0.8783784 0.19895037 0.032707239
## 3      pg Moderate Percentage Bias  35 0.9476190 0.13266640 0.022424714
## 4      pg            Fraction Bias  22 0.7575758 0.28511240 0.060786168
## 5      dg          High Performing 126 0.9219577 0.15302361 0.013632427
## 6      dg   Strong Percentage Bias  37 0.5540541 0.29672610 0.048781471
## 7      dg Moderate Percentage Bias  35 0.8238095 0.23550411 0.039807460
## 8      dg            Fraction Bias  22 0.6212121 0.28721348 0.061234119
##            ci
## 1 0.009989228
## 2 0.066333354
## 3 0.045572503
## 4 0.126411757
## 5 0.026980265
## 6 0.098933408
## 7 0.080898492
## 8 0.127343323
ezANOVA(rutgers_dataset_between_gather_dp, dv = .(accuracy), wid = .(participant), within  = .c(greater), between = .c(cluster))
## Warning: Converting "participant" to factor for ANOVA.
## Warning: Data is unbalanced (unequal N per group). Make sure you specified a
## well-considered value for the type argument to ezANOVA().
## $ANOVA
##            Effect DFn DFd        F            p p<.05        ges
## 2         cluster   3 216 51.32500 4.379918e-25     * 0.26030702
## 3         greater   1 216 50.41754 1.769163e-11     * 0.10569317
## 4 cluster:greater   3 216 10.32666 2.221744e-06     * 0.06770394
pairwise.t.test(subset(rutgers_dataset_between_gather_dp, greater=="pg")$accuracy, subset(rutgers_dataset_between_gather_dp, greater=="pg")$cluster, p.adj = "none")
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  subset(rutgers_dataset_between_gather_dp, greater == "pg")$accuracy and subset(rutgers_dataset_between_gather_dp, greater == "pg")$cluster 
## 
##                          High Performing Strong Percentage Bias
## Strong Percentage Bias   9.2e-05         -                     
## Moderate Percentage Bias 0.2014          0.0349                
## Fraction Bias            3.1e-11         0.0014                
##                          Moderate Percentage Bias
## Strong Percentage Bias   -                       
## Moderate Percentage Bias -                       
## Fraction Bias            9.4e-07                 
## 
## P value adjustment method: none
pairwise.t.test(subset(rutgers_dataset_between_gather_dp, greater=="dg")$accuracy, subset(rutgers_dataset_between_gather_dp, greater=="dg")$cluster, p.adj = "none")
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  subset(rutgers_dataset_between_gather_dp, greater == "dg")$accuracy and subset(rutgers_dataset_between_gather_dp, greater == "dg")$cluster 
## 
##                          High Performing Strong Percentage Bias
## Strong Percentage Bias   < 2e-16         -                     
## Moderate Percentage Bias 0.01625         1.8e-07               
## Fraction Bias            4.0e-09         0.24079               
##                          Moderate Percentage Bias
## Strong Percentage Bias   -                       
## Moderate Percentage Bias -                       
## Fraction Bias            0.00054                 
## 
## P value adjustment method: none
rutgers_dataset_between_gather_dp_pbs = subset(rutgers_dataset_between_gather_dp, cluster == "Strong Percentage Bias")
t.test(accuracy  ~ greater, rutgers_dataset_between_gather_dp_pbs, paired = T)
## 
##  Paired t-test
## 
## data:  accuracy by greater
## t = 5.355, df = 36, p-value = 5.059e-06
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  0.2014934 0.4471552
## sample estimates:
## mean difference 
##       0.3243243
rutgers_dataset_between_gather_dp_pbm = subset(rutgers_dataset_between_gather_dp, cluster == "Strong Percentage Bias")
t.test(accuracy  ~ greater, rutgers_dataset_between_gather_dp_pbm, paired = T)
## 
##  Paired t-test
## 
## data:  accuracy by greater
## t = 5.355, df = 36, p-value = 5.059e-06
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  0.2014934 0.4471552
## sample estimates:
## mean difference 
##       0.3243243
rutgers_dataset_between_gather_dp_nb = subset(rutgers_dataset_between_gather_dp, cluster == "High Performing")
t.test(accuracy  ~ greater, rutgers_dataset_between_gather_dp_nb, paired = T)
## 
##  Paired t-test
## 
## data:  accuracy by greater
## t = 4.3756, df = 125, p-value = 2.528e-05
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  0.03260050 0.08644712
## sample estimates:
## mean difference 
##      0.05952381
rutgers_dataset_between_gather_dp_fb = subset(rutgers_dataset_between_gather_dp, cluster == "Fraction Bias")
t.test(accuracy  ~ greater, rutgers_dataset_between_gather_dp_fb, paired = T)
## 
##  Paired t-test
## 
## data:  accuracy by greater
## t = 1.4692, df = 21, p-value = 0.1566
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -0.05665188  0.32937915
## sample estimates:
## mean difference 
##       0.1363636

Decimals vs. Fractions

rutgers_dataset_between_gather_df = subset(agg_all_participants8_accuracy_gather_cross_cluster, components =="df")
rutgers_dataset_between_gather_df$greater = as.factor(as.character(rutgers_dataset_between_gather_df$greater))
rutgers_dataset_between_gather_df$greater <- factor(rutgers_dataset_between_gather_df$greater, levels=c("dg","fg"))
rutgers_dataset_between_gather_df$comparison = "Decimal vs. Fraction \nComparisons"
summarySE(rutgers_dataset_between_gather_df, "accuracy", c("greater","cluster"))
##   greater                  cluster   N  accuracy        sd          se
## 1      dg          High Performing 126 0.9312169 0.1567303 0.013962641
## 2      dg   Strong Percentage Bias  37 0.6846847 0.2799465 0.046022919
## 3      dg Moderate Percentage Bias  35 0.8857143 0.1704062 0.028803909
## 4      dg            Fraction Bias  22 0.5227273 0.2259340 0.048169292
## 5      fg          High Performing 126 0.9616402 0.0969827 0.008639905
## 6      fg   Strong Percentage Bias  37 0.5810811 0.2822463 0.046401012
## 7      fg Moderate Percentage Bias  35 0.6952381 0.2540062 0.042934881
## 8      fg            Fraction Bias  22 0.6439394 0.2535226 0.054051192
##           ci
## 1 0.02763380
## 2 0.09333881
## 3 0.05853659
## 4 0.10017353
## 5 0.01709945
## 6 0.09410561
## 7 0.08725418
## 8 0.11240561
ezANOVA(rutgers_dataset_between_gather_df, dv = .(accuracy), wid = .(participant), within  = .c(greater), between = .c(cluster))
## Warning: Converting "participant" to factor for ANOVA.
## Warning: Data is unbalanced (unequal N per group). Make sure you specified a
## well-considered value for the type argument to ezANOVA().
## $ANOVA
##            Effect DFn DFd         F            p p<.05         ges
## 2         cluster   3 216 78.591621 2.092638e-34     * 0.368702893
## 3         greater   1 216  1.087231 2.982511e-01       0.002334823
## 4 cluster:greater   3 216 10.136291 2.831273e-06     * 0.061434469
pairwise.t.test(subset(rutgers_dataset_between_gather_df, greater=="fg")$accuracy, subset(rutgers_dataset_between_gather_df, greater=="fg")$cluster, p.adj = "none")
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  subset(rutgers_dataset_between_gather_df, greater == "fg")$accuracy and subset(rutgers_dataset_between_gather_df, greater == "fg")$cluster 
## 
##                          High Performing Strong Percentage Bias
## Strong Percentage Bias   < 2e-16         -                     
## Moderate Percentage Bias 2.4e-12         0.01                  
## Fraction Bias            4.4e-12         0.21                  
##                          Moderate Percentage Bias
## Strong Percentage Bias   -                       
## Moderate Percentage Bias -                       
## Fraction Bias            0.32                    
## 
## P value adjustment method: none
pairwise.t.test(subset(rutgers_dataset_between_gather_df, greater=="dg")$accuracy, subset(rutgers_dataset_between_gather_df, greater=="dg")$cluster, p.adj = "none")
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  subset(rutgers_dataset_between_gather_df, greater == "dg")$accuracy and subset(rutgers_dataset_between_gather_df, greater == "dg")$cluster 
## 
##                          High Performing Strong Percentage Bias
## Strong Percentage Bias   6.7e-11         -                     
## Moderate Percentage Bias 0.216           1.4e-05               
## Fraction Bias            < 2e-16         0.002                 
##                          Moderate Percentage Bias
## Strong Percentage Bias   -                       
## Moderate Percentage Bias -                       
## Fraction Bias            4.2e-11                 
## 
## P value adjustment method: none
rutgers_dataset_between_gather_df_pbs = subset(rutgers_dataset_between_gather_df, cluster == "Strong Percentage Bias")
t.test(accuracy  ~ greater, rutgers_dataset_between_gather_df_pbs, paired = T)
## 
##  Paired t-test
## 
## data:  accuracy by greater
## t = 1.6014, df = 36, p-value = 0.118
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -0.0276047  0.2348119
## sample estimates:
## mean difference 
##       0.1036036
rutgers_dataset_between_gather_df_pbm = subset(rutgers_dataset_between_gather_df, cluster == "Moderate Percentage Bias")
t.test(accuracy  ~ greater, rutgers_dataset_between_gather_df_pbm, paired = T)
## 
##  Paired t-test
## 
## data:  accuracy by greater
## t = 4.1959, df = 34, p-value = 0.0001841
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  0.0982198 0.2827326
## sample estimates:
## mean difference 
##       0.1904762
rutgers_dataset_between_gather_df_nb = subset(rutgers_dataset_between_gather_df, cluster == "High Performing")
t.test(accuracy  ~ greater, rutgers_dataset_between_gather_df_nb, paired = T)$p.value
## [1] 0.04733538
rutgers_dataset_between_gather_df_fb = subset(rutgers_dataset_between_gather_df, cluster == "Fraction Bias")
t.test(accuracy  ~ greater, rutgers_dataset_between_gather_df_fb, paired = T)$p.value
## [1] 0.1336147

Figure 3

Figure 3. Cross-notation comparison accuracy for (A) percent vs. fraction comparisons, (B) percent vs. decimal comparisons, and (C) decimal vs. fraction comparisons, based on the four-cluster model: high performing profile (n = 126), strong percentage bias profile (n = 37), moderate percentage bias profile (n = 35), and fraction bias profile (n = 22). Gray lines represent individual participants’ average scores in each condition. Thicker gray lines indicate more participants with the same scores. Error bars represent ± 1 Standard Error. Note. *p < .05, **p < .01, ***p < .001.

Within vs. Cross Cluster Comparisons (all comparison types)

# all comparison types are retained here; the "_nofp" object name is reused from the following subsection
agg_all_participants8_accuracy_gather_both_nofp = agg_all_participants8_accuracy_gatherv2_withdem

agg_all_participants8_accuracy_gather_both_nofp = aggregate(accuracy ~ participant * cross, agg_all_participants8_accuracy_gather_both_nofp, mean)

agg_all_participants8_accuracy_gather_both_nofp = agg_all_participants8_accuracy_gather_both_nofp %>%
  left_join(rutgers_clusterclass, by = "participant")

agg_all_participants8_accuracy_gather_both_nofp$cluster_bin = ifelse(agg_all_participants8_accuracy_gather_both_nofp$cluster == "High Performing", "High", "Biased")

ezANOVA(agg_all_participants8_accuracy_gather_both_nofp, dv = .(accuracy), wid = .(participant), within  = .c(cross), between = .(cluster_bin))
## Warning: Converting "participant" to factor for ANOVA.
## Warning: Converting "cross" to factor for ANOVA.
## Warning: Converting "cluster_bin" to factor for ANOVA.
## Warning: Data is unbalanced (unequal N per group). Make sure you specified a
## well-considered value for the type argument to ezANOVA().
## $ANOVA
##              Effect DFn DFd         F            p p<.05        ges
## 2       cluster_bin   1 218 212.09111 5.216670e-34     * 0.45151557
## 3             cross   1 218  17.33693 4.509089e-05     * 0.01208811
## 4 cluster_bin:cross   1 218  23.61434 2.248726e-06     * 0.01639326
summarySE(agg_all_participants8_accuracy_gather_both_nofp, "accuracy", c("cluster_bin","cross"))
##   cluster_bin  cross   N  accuracy         sd          se         ci
## 1      Biased  cross  94 0.6962175 0.14691891 0.015153529 0.03009191
## 2      Biased within  94 0.7606383 0.18774101 0.019364007 0.03845308
## 3        High  cross 126 0.9556878 0.06168204 0.005495073 0.01087543
## 4        High within 126 0.9554674 0.08889440 0.007919343 0.01567336
agg_all_participants8_accuracy_gather_both_cluster_high = subset(agg_all_participants8_accuracy_gather_both_nofp, cluster_bin == "High")
t.test(accuracy ~ cross, agg_all_participants8_accuracy_gather_both_cluster_high, paired = T)
## 
##  Paired t-test
## 
## data:  accuracy by cross
## t = 0.036394, df = 125, p-value = 0.971
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -0.01176815  0.01220907
## sample estimates:
## mean difference 
##    0.0002204586
cohensD(accuracy ~  cross, data = agg_all_participants8_accuracy_gather_both_cluster_high, method = "paired")
## Warning in cohensD(accuracy ~ cross, data =
## agg_all_participants8_accuracy_gather_both_cluster_high, : calculating paired
## samples Cohen's d using formula input. Results will be incorrect if cases do
## not appear in the same order for both levels of the grouping factor
## [1] 0.003242245
agg_all_participants8_accuracy_gather_both_cluster_biased = subset(agg_all_participants8_accuracy_gather_both_nofp, cluster_bin == "Biased")
t.test(accuracy ~ cross, agg_all_participants8_accuracy_gather_both_cluster_biased, paired = T)
## 
##  Paired t-test
## 
## data:  accuracy by cross
## t = -4.92, df = 93, p-value = 3.726e-06
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -0.09042241 -0.03841920
## sample estimates:
## mean difference 
##      -0.0644208
cohensD(accuracy ~  cross, data = agg_all_participants8_accuracy_gather_both_cluster_biased, method = "paired")
## Warning in cohensD(accuracy ~ cross, data =
## agg_all_participants8_accuracy_gather_both_cluster_biased, : calculating paired
## samples Cohen's d using formula input. Results will be incorrect if cases do
## not appear in the same order for both levels of the grouping factor
## [1] 0.5074555

Within vs. Cross Cluster Comparisons (excluding percent vs. fraction comparisons)

agg_all_participants8_accuracy_gather_both_nofp = agg_all_participants8_accuracy_gatherv2_withdem
agg_all_participants8_accuracy_gather_both_nofp = subset(agg_all_participants8_accuracy_gather_both_nofp, type != "fgtp") # drop fraction-greater-than-percent trials
agg_all_participants8_accuracy_gather_both_nofp = subset(agg_all_participants8_accuracy_gather_both_nofp, type != "pgtf") # drop percent-greater-than-fraction trials

agg_all_participants8_accuracy_gather_both_nofp = aggregate(accuracy ~ participant * cross, agg_all_participants8_accuracy_gather_both_nofp, mean)

agg_all_participants8_accuracy_gather_both_nofp = agg_all_participants8_accuracy_gather_both_nofp %>%
  left_join(rutgers_clusterclass, by = "participant")

agg_all_participants8_accuracy_gather_both_nofp$cluster_bin = ifelse(agg_all_participants8_accuracy_gather_both_nofp$cluster == "High Performing", "High", "Biased")

ezANOVA(agg_all_participants8_accuracy_gather_both_nofp, dv = .(accuracy), wid = .(participant), within  = .c(cross), between = .(cluster_bin))
## Warning: Converting "participant" to factor for ANOVA.
## Warning: Converting "cross" to factor for ANOVA.
## Warning: Converting "cluster_bin" to factor for ANOVA.
## Warning: Data is unbalanced (unequal N per group). Make sure you specified a
## well-considered value for the type argument to ezANOVA().
## $ANOVA
##              Effect DFn DFd          F            p p<.05         ges
## 2       cluster_bin   1 218 161.383001 4.869303e-28     * 0.385386475
## 3             cross   1 218   6.976675 8.855837e-03     * 0.004872018
## 4 cluster_bin:cross   1 218   3.946059 4.823371e-02     * 0.002761494
summarySE(agg_all_participants8_accuracy_gather_both_nofp, "accuracy", c("cluster_bin","cross"))
##   cluster_bin  cross   N  accuracy         sd          se         ci
## 1      Biased  cross  94 0.7265071 0.16080166 0.016585424 0.03293537
## 2      Biased within  94 0.7606383 0.18774101 0.019364007 0.03845308
## 3        High  cross 126 0.9490741 0.08043094 0.007165357 0.01418113
## 4        High within 126 0.9554674 0.08889440 0.007919343 0.01567336
agg_all_participants8_accuracy_gather_both_cluster_high = subset(agg_all_participants8_accuracy_gather_both_nofp, cluster_bin == "High")
t.test(accuracy ~ cross, agg_all_participants8_accuracy_gather_both_cluster_high, paired = T)
## 
##  Paired t-test
## 
## data:  accuracy by cross
## t = -0.95187, df = 125, p-value = 0.343
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -0.019686172  0.006899576
## sample estimates:
## mean difference 
##    -0.006393298
cohensD(accuracy ~  cross, data = agg_all_participants8_accuracy_gather_both_cluster_high, method = "paired")
## Warning in cohensD(accuracy ~ cross, data =
## agg_all_participants8_accuracy_gather_both_cluster_high, : calculating paired
## samples Cohen's d using formula input. Results will be incorrect if cases do
## not appear in the same order for both levels of the grouping factor
## [1] 0.0847996
agg_all_participants8_accuracy_gather_both_cluster_biased = subset(agg_all_participants8_accuracy_gather_both_nofp, cluster_bin == "Biased")
t.test(accuracy ~ cross, agg_all_participants8_accuracy_gather_both_cluster_biased, paired = T)
## 
##  Paired t-test
## 
## data:  accuracy by cross
## t = -2.5406, df = 93, p-value = 0.01272
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -0.060809427 -0.007452985
## sample estimates:
## mean difference 
##     -0.03413121
cohensD(accuracy ~  cross, data = agg_all_participants8_accuracy_gather_both_cluster_biased, method = "paired")
## Warning in cohensD(accuracy ~ cross, data =
## agg_all_participants8_accuracy_gather_both_cluster_biased, : calculating paired
## samples Cohen's d using formula input. Results will be incorrect if cases do
## not appear in the same order for both levels of the grouping factor
## [1] 0.2620395

SAT scores

Analyses

all_participants2_SAT = all_participants2
all_participants3_SAT = all_participants2_SAT %>% 
  dplyr::select(c("participant","Q840.1", "Q841.1","Q842", "Q843", "Q844")) 
all_participants4_SAT = subset(all_participants3_SAT, Q840.1!="") # removes blanks
all_participants4_SAT = subset(all_participants4_SAT, Q842!="") # removes blanks from those who did not enter a math score
# keep only numeric SAT math scores: remove the ACT scores
all_participants4_SAT = subset(all_participants4_SAT, Q840.1!="ACT")

all_participants5_SAT = all_participants4_SAT[which(all_participants4_SAT$participant %in% sequential_participant$participant),]


all_participants5_SAT$numericSAT = as.numeric(as.character(all_participants5_SAT$Q842))
## Warning: NAs introduced by coercion
all_participants6_SAT = subset(all_participants5_SAT, numericSAT >200 ) # drop values outside the 200-800 SAT math scale
all_participants6_SAT = subset(all_participants6_SAT, numericSAT <800)

all_participants6_SAT_clusters = all_participants6_SAT %>%
  left_join(rutgers_clusterclass, by = "participant")

all_participants6_SAT_clusters = all_participants6_SAT_clusters %>%
  left_join(rutgers_dataset_between_gather_fp_spread, by = "participant")
all_participants6_SAT_clusters$fp = (all_participants6_SAT_clusters$pg + all_participants6_SAT_clusters$fg)/2
cor.test(all_participants6_SAT_clusters$numericSAT, all_participants6_SAT_clusters$fp)
## 
##  Pearson's product-moment correlation
## 
## data:  all_participants6_SAT_clusters$numericSAT and all_participants6_SAT_clusters$fp
## t = 3.826, df = 123, p-value = 0.000206
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.1596560 0.4745506
## sample estimates:
##       cor 
## 0.3261202
summarySE(all_participants6_SAT_clusters, "numericSAT", "cluster")
##                    cluster  N numericSAT        sd       se        ci
## 1          High Performing 84   575.0595  85.39015  9.31683  18.53080
## 2   Strong Percentage Bias 17   525.9353 114.25190 27.71016  58.74291
## 3 Moderate Percentage Bias 20   534.8500 109.38550 24.45934  51.19399
## 4            Fraction Bias  4   389.0000  64.94100 32.47050 103.33562
ezANOVA(all_participants6_SAT_clusters, dv = .(numericSAT), wid = .(participant), between  = .c(cluster))
## Warning: Converting "participant" to factor for ANOVA.
## Warning: Data is unbalanced (unequal N per group). Make sure you specified a
## well-considered value for the type argument to ezANOVA().
## Coefficient covariances computed by hccm()
## $ANOVA
##    Effect DFn DFd        F            p p<.05       ges
## 1 cluster   3 121 6.366721 0.0004821138     * 0.1363322
## 
## $`Levene's Test for Homogeneity of Variance`
##   DFn DFd     SSn      SSd        F         p p<.05
## 1   3 121 14255.8 429117.3 1.339922 0.2646477
pairwise.t.test(all_participants6_SAT_clusters$numericSAT, all_participants6_SAT_clusters$cluster, p.adj = "none", paired = F)
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  all_participants6_SAT_clusters$numericSAT and all_participants6_SAT_clusters$cluster 
## 
##                          High Performing Strong Percentage Bias
## Strong Percentage Bias   0.05008         -                     
## Moderate Percentage Bias 0.08591         0.77266               
## Fraction Bias            0.00016         0.00938               
##                          Moderate Percentage Bias
## Strong Percentage Bias   -                       
## Moderate Percentage Bias -                       
## Fraction Bias            0.00509                 
## 
## P value adjustment method: none
all_participants6_SAT_clusters$cluster_bin = ifelse(all_participants6_SAT_clusters$cluster == "High Performing","High Performing", "Biased")
all_participants6_SAT_clusters$cluster_bin  = as.factor(as.character(all_participants6_SAT_clusters$cluster_bin ))
all_participants6_SAT_clusters$cluster_bin <- factor(all_participants6_SAT_clusters$cluster_bin, levels=c("High Performing", "Biased"))

t.test(numericSAT  ~ cluster_bin, all_participants6_SAT_clusters, paired = F, var.equal = T )
## 
##  Two Sample t-test
## 
## data:  numericSAT by cluster_bin
## t = 3.1875, df = 123, p-value = 0.00182
## alternative hypothesis: true difference in means between group High Performing and group Biased is not equal to 0
## 95 percent confidence interval:
##  22.03294 94.23733
## sample estimates:
## mean in group High Performing          mean in group Biased 
##                      575.0595                      516.9244
cohensD(numericSAT ~  cluster_bin, data = all_participants6_SAT_clusters, method = "pooled")
## [1] 0.6072543
summarySE(all_participants6_SAT_clusters, "numericSAT", "cluster_bin")
##       cluster_bin  N numericSAT        sd       se       ci
## 1 High Performing 84   575.0595  85.39015  9.31683 18.53080
## 2          Biased 41   516.9244 114.24885 17.84267 36.06139

Figure 5A


Figure 5A. Descriptive statistics for (A) self-reported SAT scores, collected in Study 1 only.

Linear Regressions

agg_all_participants8_accuracy_within = subset(agg_all_participants8_accuracy_gatherv2_withdem, cross == "within")
agg_all_participants8_accuracy_within = aggregate(accuracy ~ participant, agg_all_participants8_accuracy_within,  mean)
names(agg_all_participants8_accuracy_within)[2] = "acc_within"

agg_all_participants8_accuracy_cross = subset(agg_all_participants8_accuracy_gatherv2_withdem, cross == "cross")
agg_all_participants8_accuracy_cross = aggregate(accuracy ~ participant, agg_all_participants8_accuracy_cross, mean)
names(agg_all_participants8_accuracy_cross)[2] = "acc_cross"


all_participants6_SAT_cross = all_participants6_SAT %>%
  left_join(agg_all_participants8_accuracy_within, by = "participant")

all_participants6_SAT_cross = all_participants6_SAT_cross %>%
  left_join(agg_all_participants8_accuracy_cross, by = "participant")

colMeans(all_participants6_SAT_cross[c("numericSAT","acc_within","acc_cross")])
##  numericSAT  acc_within   acc_cross 
## 555.9912000   0.9022222   0.8844444
sd(all_participants6_SAT_cross$numericSAT)
## [1] 99.20742
cor.test(all_participants6_SAT_cross$numericSAT, all_participants6_SAT_cross$acc_within)
## 
##  Pearson's product-moment correlation
## 
## data:  all_participants6_SAT_cross$numericSAT and all_participants6_SAT_cross$acc_within
## t = 2.8363, df = 123, p-value = 0.005339
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.07544116 0.40572101
## sample estimates:
##       cor 
## 0.2477663
cor.test(all_participants6_SAT_cross$numericSAT, all_participants6_SAT_cross$acc_cross)
## 
##  Pearson's product-moment correlation
## 
## data:  all_participants6_SAT_cross$numericSAT and all_participants6_SAT_cross$acc_cross
## t = 3.8077, df = 123, p-value = 0.0002201
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.1581372 0.4733424
## sample estimates:
##      cor 
## 0.324727
model1 = lm(numericSAT ~ acc_within, all_participants6_SAT_cross)
summary(model1)
## 
## Call:
## lm(formula = numericSAT ~ acc_within, data = all_participants6_SAT_cross)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -253.819  -62.869   -1.919   58.081  243.330 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   409.02      52.53   7.786 2.44e-12 ***
## acc_within    162.89      57.43   2.836  0.00534 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 96.5 on 123 degrees of freedom
## Multiple R-squared:  0.06139,    Adjusted R-squared:  0.05376 
## F-statistic: 8.045 on 1 and 123 DF,  p-value: 0.005339
model1_std = lm.beta(model1)
summary(model1_std)
## 
## Call:
## lm(formula = numericSAT ~ acc_within, data = all_participants6_SAT_cross)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -253.819  -62.869   -1.919   58.081  243.330 
## 
## Coefficients:
##             Estimate Standardized Std. Error t value Pr(>|t|)    
## (Intercept) 409.0241           NA    52.5305   7.786 2.44e-12 ***
## acc_within  162.8946       0.2478    57.4321   2.836  0.00534 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 96.5 on 123 degrees of freedom
## Multiple R-squared:  0.06139,    Adjusted R-squared:  0.05376 
## F-statistic: 8.045 on 1 and 123 DF,  p-value: 0.005339
model2 = lm(numericSAT ~ acc_within + acc_cross , all_participants6_SAT_cross) 
summary(model2)
## 
## Call:
## lm(formula = numericSAT ~ acc_within + acc_cross, data = all_participants6_SAT_cross)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -237.99  -62.44   -0.44   60.82  215.02 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   372.63      53.55   6.958 1.86e-10 ***
## acc_within    -24.47      94.56  -0.259   0.7963    
## acc_cross     232.28      94.21   2.466   0.0151 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 94.57 on 122 degrees of freedom
## Multiple R-squared:  0.1059, Adjusted R-squared:  0.09128 
## F-statistic: 7.228 on 2 and 122 DF,  p-value: 0.00108
#ols_vif_tol(model2)

model2_std = lm.beta(model2)
summary(model2_std)
## 
## Call:
## lm(formula = numericSAT ~ acc_within + acc_cross, data = all_participants6_SAT_cross)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -237.99  -62.44   -0.44   60.82  215.02 
## 
## Coefficients:
##              Estimate Standardized Std. Error t value Pr(>|t|)    
## (Intercept) 372.62777           NA   53.55312   6.958 1.86e-10 ***
## acc_within  -24.47009     -0.03722   94.56423  -0.259   0.7963    
## acc_cross   232.28240      0.35464   94.20960   2.466   0.0151 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 94.57 on 122 degrees of freedom
## Multiple R-squared:  0.1059, Adjusted R-squared:  0.09128 
## F-statistic: 7.228 on 2 and 122 DF,  p-value: 0.00108
anova(model1, model2, test="Chisq") 
## Analysis of Variance Table
## 
## Model 1: numericSAT ~ acc_within
## Model 2: numericSAT ~ acc_within + acc_cross
##   Res.Df     RSS Df Sum of Sq Pr(>Chi)  
## 1    123 1145503                        
## 2    122 1091133  1     54370  0.01368 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(model1, model2) ## This is the F test 
## Analysis of Variance Table
## 
## Model 1: numericSAT ~ acc_within
## Model 2: numericSAT ~ acc_within + acc_cross
##   Res.Df     RSS Df Sum of Sq      F  Pr(>F)  
## 1    123 1145503                              
## 2    122 1091133  1     54370 6.0791 0.01507 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
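The F test above is the added-variable test for acc_cross, so F = 6.08 is the squared t value for acc_cross in model2 (2.466^2 ≈ 6.08). The corresponding increment in variance explained can be pulled out directly (a sketch):

# R-squared change from adding cross-notation accuracy: about .106 - .061 = .045.
summary(model2)$r.squared - summary(model1)$r.squared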

Study 1 Supplementary Analyses

Within-Notation Magnitude Comparison

agg_all_participants8_accuracy_gather_within_cluster = agg_all_participants8_accuracy_gather_within %>%
  left_join(rutgers_clusterclass, by = "participant")
agg_all_participants8_accuracy_gather_within_cluster$type = as.factor(as.character(agg_all_participants8_accuracy_gather_within_cluster$type))

agg_all_participants8_accuracy_gather_within_cluster$type <- factor(agg_all_participants8_accuracy_gather_within_cluster$type, levels=c("pvp", "fvf","dvd"))
levels(agg_all_participants8_accuracy_gather_within_cluster$type)[levels(agg_all_participants8_accuracy_gather_within_cluster$type ) == "pvp"]  <- "Percent to \nPercent"
levels(agg_all_participants8_accuracy_gather_within_cluster$type)[levels(agg_all_participants8_accuracy_gather_within_cluster$type ) == "fvf"]  <- "Fraction to \nFraction"
levels(agg_all_participants8_accuracy_gather_within_cluster$type)[levels(agg_all_participants8_accuracy_gather_within_cluster$type ) == "dvd"]  <- "Decimal to \nDecimal"
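The three levels() replacements above can be collapsed into a single recode. A sketch assuming the forcats package is available (it ships with the tidyverse); it is an alternative to, not a follow-up of, the levels() assignments:

# One-step relabeling of the within-notation trial types (same result as the levels() calls above).
agg_all_participants8_accuracy_gather_within_cluster$type <- forcats::fct_recode(
  agg_all_participants8_accuracy_gather_within_cluster$type,
  "Percent to \nPercent"   = "pvp",
  "Fraction to \nFraction" = "fvf",
  "Decimal to \nDecimal"   = "dvd")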

summarySE(agg_all_participants8_accuracy_gather_within_cluster, "accuracy", c("type","cluster"))
##                      type                  cluster   N  accuracy         sd
## 1    Percent to \nPercent          High Performing 126 0.9854497 0.07309002
## 2    Percent to \nPercent   Strong Percentage Bias  37 0.9594595 0.09942276
## 3    Percent to \nPercent Moderate Percentage Bias  35 0.9714286 0.09467621
## 4    Percent to \nPercent            Fraction Bias  22 0.8409091 0.26961305
## 5  Fraction to \nFraction          High Performing 126 0.9417989 0.12349982
## 6  Fraction to \nFraction   Strong Percentage Bias  37 0.5855856 0.26241289
## 7  Fraction to \nFraction Moderate Percentage Bias  35 0.8000000 0.21693046
## 8  Fraction to \nFraction            Fraction Bias  22 0.4696970 0.21600242
## 9    Decimal to \nDecimal          High Performing 126 0.9391534 0.16000441
## 10   Decimal to \nDecimal   Strong Percentage Bias  37 0.6441441 0.32432780
## 11   Decimal to \nDecimal Moderate Percentage Bias  35 0.8380952 0.24416630
## 12   Decimal to \nDecimal            Fraction Bias  22 0.6060606 0.29790030
##             se         ci
## 1  0.006511377 0.01288682
## 2  0.016345001 0.03314920
## 3  0.016003201 0.03252242
## 4  0.057481696 0.11953973
## 5  0.011002238 0.02177479
## 6  0.043140414 0.08749282
## 7  0.036667940 0.07451822
## 8  0.046051871 0.09577011
## 9  0.014254326 0.02821108
## 10 0.053319162 0.10813627
## 11 0.041271637 0.08387406
## 12 0.063512557 0.13208159
ezANOVA(subset(agg_all_participants8_accuracy_gather_within_cluster, cluster=="Fraction Bias"), dv = .(accuracy), wid = .(participant), within  = .c(type))
## Warning: Converting "participant" to factor for ANOVA.
## $ANOVA
##   Effect DFn DFd        F            p p<.05       ges
## 2   type   2  42 15.26746 1.039313e-05     * 0.2619945
## 
## $`Mauchly's Test for Sphericity`
##   Effect        W         p p<.05
## 2   type 0.922879 0.4481749      
## 
## $`Sphericity Corrections`
##   Effect       GGe        p[GG] p[GG]<.05      HFe        p[HF] p[HF]<.05
## 2   type 0.9284008 1.929931e-05         * 1.014711 1.039313e-05         *
pairwise.t.test(subset(agg_all_participants8_accuracy_gather_within_cluster, cluster=="Fraction Bias")$accuracy, subset(agg_all_participants8_accuracy_gather_within_cluster, cluster=="Fraction Bias")$type, p.adj = "none", paired = T)
## 
##  Pairwise comparisons using paired t tests 
## 
## data:  subset(agg_all_participants8_accuracy_gather_within_cluster, cluster == "Fraction Bias")$accuracy and subset(agg_all_participants8_accuracy_gather_within_cluster, cluster == "Fraction Bias")$type 
## 
##                        Percent to \nPercent Fraction to \nFraction
## Fraction to \nFraction 8.8e-05              -                     
## Decimal to \nDecimal   0.0015               0.0383                
## 
## P value adjustment method: none
ezANOVA(subset(agg_all_participants8_accuracy_gather_within_cluster, cluster=="High Performing"), dv = .(accuracy), wid = .(participant), within  = .c(type))
## Warning: Converting "participant" to factor for ANOVA.
## $ANOVA
##   Effect DFn DFd        F            p p<.05        ges
## 2   type   2 250 7.574371 0.0006401837     * 0.02865372
## 
## $`Mauchly's Test for Sphericity`
##   Effect         W            p p<.05
## 2   type 0.8512183 4.597687e-05     *
## 
## $`Sphericity Corrections`
##   Effect       GGe       p[GG] p[GG]<.05       HFe       p[HF] p[HF]<.05
## 2   type 0.8704874 0.001176647         * 0.8817319 0.001115961         *
pairwise.t.test(subset(agg_all_participants8_accuracy_gather_within_cluster, cluster=="High Performing")$accuracy, subset(agg_all_participants8_accuracy_gather_within_cluster, cluster=="High Performing")$type, p.adj = "none", paired = T)
## 
##  Pairwise comparisons using paired t tests 
## 
## data:  subset(agg_all_participants8_accuracy_gather_within_cluster, cluster == "High Performing")$accuracy and subset(agg_all_participants8_accuracy_gather_within_cluster, cluster == "High Performing")$type 
## 
##                        Percent to \nPercent Fraction to \nFraction
## Fraction to \nFraction 5.8e-05              -                     
## Decimal to \nDecimal   0.0016               0.8583                
## 
## P value adjustment method: none
ezANOVA(subset(agg_all_participants8_accuracy_gather_within_cluster, cluster=="Strong Percentage Bias"), dv = .(accuracy), wid = .(participant), within  = .c(type))
## Warning: Converting "participant" to factor for ANOVA.
## $ANOVA
##   Effect DFn DFd        F            p p<.05       ges
## 2   type   2  72 31.54683 1.449145e-10     * 0.3112603
## 
## $`Mauchly's Test for Sphericity`
##   Effect         W         p p<.05
## 2   type 0.9731703 0.6213049      
## 
## $`Sphericity Corrections`
##   Effect       GGe       p[GG] p[GG]<.05      HFe        p[HF] p[HF]<.05
## 2   type 0.9738713 2.39421e-10         * 1.028808 1.449145e-10         *
pairwise.t.test(subset(agg_all_participants8_accuracy_gather_within_cluster, cluster=="Strong Percentage Bias")$accuracy, subset(agg_all_participants8_accuracy_gather_within_cluster, cluster=="Strong Percentage Bias")$type, p.adj = "none", paired = T)
## 
##  Pairwise comparisons using paired t tests 
## 
## data:  subset(agg_all_participants8_accuracy_gather_within_cluster, cluster == "Strong Percentage Bias")$accuracy and subset(agg_all_participants8_accuracy_gather_within_cluster, cluster == "Strong Percentage Bias")$type 
## 
##                        Percent to \nPercent Fraction to \nFraction
## Fraction to \nFraction 1.4e-09              -                     
## Decimal to \nDecimal   7.5e-07              0.27                  
## 
## P value adjustment method: none
ezANOVA(subset(agg_all_participants8_accuracy_gather_within_cluster, cluster=="Moderate Percentage Bias"), dv = .(accuracy), wid = .(participant), within  = .c(type))
## Warning: Converting "participant" to factor for ANOVA.
## $ANOVA
##   Effect DFn DFd        F            p p<.05       ges
## 2   type   2  68 10.27283 0.0001263617     * 0.1260732
## 
## $`Mauchly's Test for Sphericity`
##   Effect         W        p p<.05
## 2   type 0.9779745 0.692475      
## 
## $`Sphericity Corrections`
##   Effect       GGe        p[GG] p[GG]<.05      HFe        p[HF] p[HF]<.05
## 2   type 0.9784491 0.0001445318         * 1.037531 0.0001263617         *
pairwise.t.test(subset(agg_all_participants8_accuracy_gather_within_cluster, cluster=="Moderate Percentage Bias")$accuracy, subset(agg_all_participants8_accuracy_gather_within_cluster, cluster=="Moderate Percentage Bias")$type, p.adj = "none", paired = T)
## 
##  Pairwise comparisons using paired t tests 
## 
## data:  subset(agg_all_participants8_accuracy_gather_within_cluster, cluster == "Moderate Percentage Bias")$accuracy and subset(agg_all_participants8_accuracy_gather_within_cluster, cluster == "Moderate Percentage Bias")$type 
## 
##                        Percent to \nPercent Fraction to \nFraction
## Fraction to \nFraction 8.2e-05              -                     
## Decimal to \nDecimal   0.0036               0.3244                
## 
## P value adjustment method: none
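The four cluster-wise ANOVAs above differ only in the subset used. An equivalent loop keeps the calls in one place (a sketch; by_cluster is a name introduced here, and ezANOVA is already loaded):

# Split by cluster profile and run the same within-subjects ANOVA on each piece.
by_cluster <- split(agg_all_participants8_accuracy_gather_within_cluster,
                    agg_all_participants8_accuracy_gather_within_cluster$cluster)
lapply(by_cluster, function(d)
  ezANOVA(d, dv = .(accuracy), wid = .(participant), within = .(type)))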
#levels(agg_all_participants8_accuracy_gather_within_cluster$type)

graph_rutgers_within = ggplot(agg_all_participants8_accuracy_gather_within_cluster, aes(x = interaction(type), y = accuracy)) +
  geom_bar(stat = "identity", data = summarySE(agg_all_participants8_accuracy_gather_within_cluster, "accuracy", c("type","cluster")),
           fill = NA, aes(color = as.factor(type)), size = 1, width = 0.55) +
  stat_summary(fun.data = data_summary, geom = "errorbar",
               position = position_dodge(width = 0.10), width = .05, colour = "black", size =0.5)+
  scale_y_continuous(breaks=seq(0, 1, .25), limits=c(0,1.3),trans = shift_trans(0), expand = c(0,0))+
  scale_color_manual(values = c("#1b7837","#e08214","#40004b"))+
  #scale_color_manual(values = c("#1b7837","#40004b"))+
  geom_line(aes(group = interaction (participant)),
            alpha = 0.15,
            size = .25, colour = "#737373") +
  geom_hline(yintercept = .5, linetype = 2, size = .5)+
  ylab("Accuracy")+
  facet_grid(.~cluster)+
  #scale_x_discrete(labels=c("dg" = "Decimal \n> \nFraction", "fg" = "Fraction \n> \nDecimal"))+
  stat_summary(fun.data = data_summary, geom = "errorbar",
               position = position_dodge(width = 0.10), width = 0.001, colour = "black", size =.5)+
  theme_bw()+
  theme(legend.position="none",
        axis.title.x=element_blank(),
        axis.text.x =  element_text(size=9),
        #axis.title.x =  element_text(size = size_text),
        panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
        panel.background = element_rect(fill = "white", colour = "grey50"),
        strip.background =element_rect(fill="#f0f0f0"),
        strip.text = element_text(size = size_textb),
        axis.text.y =  element_text(size=size_text),
        axis.title.y =  element_text(size=size_text),
        legend.text=element_text(size=size_text))
graph_rutgers_within

Figure 1. Within-notation comparison accuracy based on the four-cluster model: (A.) high performing profile (n = 126), (B.) strong percentage bias profile (n = 37), (C.) moderate percentage bias profile (n = 35), and (D.) fraction bias profile (n = 22). Gray lines represent individual participants’ average scores in each condition; thicker gray lines indicate more participants with the same scores. Error bars represent ± 1 standard error. Note. *p < .05, **p < .01, ***p < .001

Cluster Analyses - All Trial Types

Determining the number of clusters (all trials)

agg_all_participants8_accuracy_gather_cross_spread = agg_all_participants8_accuracy_gather_cross
agg_all_participants8_accuracy_gather_cross_spread$type = paste(agg_all_participants8_accuracy_gather_cross_spread$components, 
                                                                agg_all_participants8_accuracy_gather_cross_spread$greater, sep = "_")
agg_all_participants8_accuracy_gather_cross_spread = spread(agg_all_participants8_accuracy_gather_cross_spread[c("participant","type","accuracy")], type, accuracy)
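spread() (now superseded by tidyr::pivot_wider()) turns the long accuracy data into one row per participant with six condition columns named components_greater. A quick structure check (a sketch; 220 rows and 7 columns are expected given the sample above):

# Expect 220 rows and 7 columns: participant plus df_dg, df_fg, dp_dg, dp_pg, fp_fg, fp_pg.
dim(agg_all_participants8_accuracy_gather_cross_spread)
names(agg_all_participants8_accuracy_gather_cross_spread)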


set.seed(240)
rutgers_n_clust2 <- n_clusters(agg_all_participants8_accuracy_gather_cross_spread[-1],
                      package = "all",
                      standardize = FALSE, n_max = 10)
rutgers_n_clust2
## # Method Agreement Procedure:
## 
## The choice of 2 clusters is supported by 12 (41.38%) methods out of 29 (Elbow, Silhouette, kl, Ch, CCC, DB, Duda, Pseudot2, Beale, Ratkowsky, Frey, Mcclain).
set.seed(240)
rutgers_kmeans.re2 <- kmeans(agg_all_participants8_accuracy_gather_cross_spread[-c(1)], centers = 2, nstart = 30, iter.max=500)
rutgers_kmeans.re2
## K-means clustering with 2 clusters of sizes 152, 68
## 
## Cluster means:
##       df_dg     df_fg     dp_dg     dp_pg     fp_fg     fp_pg
## 1 0.9429825 0.9331140 0.9254386 0.9747807 0.9013158 0.9692982
## 2 0.6151961 0.5784314 0.5661765 0.8504902 0.3921569 0.7745098
## 
## Clustering vector:
##   [1] 1 2 1 1 1 1 1 1 2 1 1 1 2 1 1 1 2 2 1 2 1 1 2 2 1 1 2 2 1 1 2 2 1 1 1 1 1
##  [38] 1 1 1 2 2 1 2 1 1 2 1 1 1 2 1 2 1 2 1 1 2 1 2 1 1 1 1 1 1 1 1 1 1 1 2 2 1
##  [75] 2 2 1 1 1 1 1 1 1 2 1 1 1 1 1 1 2 2 2 1 2 2 1 2 1 1 1 1 2 2 2 1 2 1 1 1 2
## [112] 2 2 1 1 2 2 1 1 1 2 2 1 1 2 1 1 1 1 1 1 1 1 1 1 1 2 1 2 2 1 2 1 1 1 1 1 1
## [149] 1 1 1 1 1 1 1 2 2 1 1 1 1 1 1 2 1 2 1 1 1 2 1 2 1 1 1 1 1 2 1 2 2 1 1 1 1
## [186] 1 2 1 1 1 2 1 2 1 2 2 2 1 1 1 2 2 2 1 2 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1
## 
## Within cluster sum of squares by cluster:
## [1] 13.81615 26.14093
##  (between_SS / total_SS =  44.2 %)
## 
## Available components:
## 
## [1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss"
## [6] "betweenss"    "size"         "iter"         "ifault"
rutgers_clusterclass2 = as.data.frame(rutgers_kmeans.re2$cluster)
names(rutgers_clusterclass2) ="cluster"
rutgers_clusterclass2 = cbind(agg_all_participants8_accuracy_gather_cross_spread[1],rutgers_clusterclass2)


rutgers_clusterclass2$cluster = as.factor(as.character(rutgers_clusterclass2$cluster))
levels(rutgers_clusterclass2$cluster)[levels(rutgers_clusterclass2$cluster ) == "1"]  <- "High Performing"
levels(rutgers_clusterclass2$cluster)[levels(rutgers_clusterclass2$cluster ) == "2"]  <- "Percentage Bias"
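After relabeling, the cluster sizes should match the k-means output (152 high performing, 68 percentage bias). A one-line check (a sketch):

table(rutgers_clusterclass2$cluster)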

agg_all_participants8_accuracy_gather_cross_cluster2 = agg_all_participants8_accuracy_gather_cross %>%
  left_join(rutgers_clusterclass2, by = "participant")


rutgers_dataset_between_gather_fp = subset(agg_all_participants8_accuracy_gather_cross_cluster2, components =="fp")
rutgers_dataset_between_gather_fp$greater = as.factor(as.character(rutgers_dataset_between_gather_fp$greater))
rutgers_dataset_between_gather_fp$greater <- factor(rutgers_dataset_between_gather_fp$greater, levels=c("pg","fg"))
rutgers_dataset_between_gather_fp$comparison = "Percent vs. Fraction \nComparisons"
summarySE(rutgers_dataset_between_gather_fp, "accuracy", c("greater","cluster"))
##   greater         cluster   N  accuracy        sd          se         ci
## 1      pg High Performing 152 0.9692982 0.0947889 0.007688395 0.01519072
## 2      pg Percentage Bias  68 0.7745098 0.2335193 0.028318375 0.05652371
## 3      fg High Performing 152 0.9013158 0.1477763 0.011986244 0.02368241
## 4      fg Percentage Bias  68 0.3921569 0.2456330 0.029787378 0.05945586
rutgers_dataset_between_gather_dp = subset(agg_all_participants8_accuracy_gather_cross_cluster2, components =="dp")
rutgers_dataset_between_gather_dp$greater = as.factor(as.character(rutgers_dataset_between_gather_dp$greater))
rutgers_dataset_between_gather_dp$greater <- factor(rutgers_dataset_between_gather_dp$greater, levels=c("pg","dg"))
rutgers_dataset_between_gather_dp$comparison = "Percent vs. Decimal \nComparisons"
summarySE(rutgers_dataset_between_gather_dp, "accuracy", c("greater","cluster"))
##   greater         cluster   N  accuracy         sd          se         ci
## 1      pg High Performing 152 0.9747807 0.09349692 0.007583601 0.01498367
## 2      pg Percentage Bias  68 0.8504902 0.21766838 0.026396168 0.05268697
## 3      dg High Performing 152 0.9254386 0.14067727 0.011410433 0.02254472
## 4      dg Percentage Bias  68 0.5661765 0.28526515 0.034593481 0.06904888
rutgers_dataset_between_gather_df = subset(agg_all_participants8_accuracy_gather_cross_cluster2, components =="df")
rutgers_dataset_between_gather_df$greater = as.factor(as.character(rutgers_dataset_between_gather_df$greater))
rutgers_dataset_between_gather_df$greater <- factor(rutgers_dataset_between_gather_df$greater, levels=c("dg","fg"))
rutgers_dataset_between_gather_df$comparison = "Decimal vs. Fraction \nComparisons"
summarySE(rutgers_dataset_between_gather_df, "accuracy", c("greater","cluster"))
##   greater         cluster   N  accuracy        sd          se         ci
## 1      dg High Performing 152 0.9429825 0.1184044 0.009603863 0.01897530
## 2      dg Percentage Bias  68 0.6151961 0.2735695 0.033175172 0.06621792
## 3      fg High Performing 152 0.9331140 0.1346238 0.010919433 0.02157460
## 4      fg Percentage Bias  68 0.5784314 0.2677692 0.032471780 0.06481394
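For the figure that follows, the three pairwise-comparison data frames share the same columns and could be stacked into a single plotting data set; the combined object name below is introduced only for this sketch:

# Stack the fraction-percent, decimal-percent, and decimal-fraction comparisons for faceting.
rutgers_dataset_between_gather_all <- rbind(rutgers_dataset_between_gather_fp,
                                            rutgers_dataset_between_gather_dp,
                                            rutgers_dataset_between_gather_df)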

Figure