This is an analysis of Harmony’s improved PDF extraction model. You can try Harmony at https://harmonydata.ac.uk/
By Thomas Wood, https://fastdatascience.com/
First we load the data. We have the data frame in both wide and long format, prepared beforehand in Python.
data.wide <- read.csv("test_pdf_models_wide.csv")
data.wide
## questionnaire total orig new
## 1 academic motivation scale 28 7 24
## 2 adverse childhood experience questionnaire for adults 10 10 10
## 3 questions about behavioral function (qabf) 25 25 24
## 4 athletic coping skills inventory 28 0 27
## 5 attitudes toward seeking professional psychological help 10 1 10
## 6 benevolent childhood experiences (bce) scale 10 0 10
## 7 the big five personality test 50 2 50
## 8 brief big five personality inventory 10 0 10
## 9 carol dweck’s growth vs. fixed mindset assessment 20 17 20
## 10 aggression questionnaire 29 0 29
## 11 cohen percieved stress 10 10 10
## 12 gad-7 7 7 7
## 13 the trait hope scale 12 0 12
## 14 davis' interpersonal reactivity index (iri) 28 22 28
## 15 patient health questionnaire 9 9 9
## 16 post traumatic growth inventory 21 20 21
## 17 rosenberg self-esteem scale (rse) 10 3 10
## 18 the satisfaction with life scale 5 0 5
## 19 neff’s self-compassion scale (short-form) 12 5 12
## 20 general self-efficacy scale 10 0 10
## 21 social dominance orientation scale 16 0 16
## 22 ten-item personality inventory-(tipi) 10 0 10
## 23 the sport motivation scale 28 9 26
## 24 the warwick–edinburgh mental well-being scale 14 3 14
## 25 brief resilience scale 6 6 0
data.long <- read.csv("test_pdf_models_long.csv")
data.long
## questionnaire model total
## 1 academic motivation scale orig 28
## 2 academic motivation scale new 28
## 3 adverse childhood experience questionnaire for adults orig 10
## 4 adverse childhood experience questionnaire for adults new 10
## 5 questions about behavioral function (qabf) orig 25
## 6 questions about behavioral function (qabf) new 25
## 7 athletic coping skills inventory orig 28
## 8 athletic coping skills inventory new 28
## 9 attitudes toward seeking professional psychological help orig 10
## 10 attitudes toward seeking professional psychological help new 10
## 11 benevolent childhood experiences (bce) scale orig 10
## 12 benevolent childhood experiences (bce) scale new 10
## 13 the big five personality test orig 50
## 14 the big five personality test new 50
## 15 brief big five personality inventory orig 10
## 16 brief big five personality inventory new 10
## 17 carol dweck’s growth vs. fixed mindset assessment orig 20
## 18 carol dweck’s growth vs. fixed mindset assessment new 20
## 19 aggression questionnaire orig 29
## 20 aggression questionnaire new 29
## 21 cohen percieved stress orig 10
## 22 cohen percieved stress new 10
## 23 gad-7 orig 7
## 24 gad-7 new 7
## 25 the trait hope scale orig 12
## 26 the trait hope scale new 12
## 27 davis' interpersonal reactivity index (iri) orig 28
## 28 davis' interpersonal reactivity index (iri) new 28
## 29 patient health questionnaire orig 9
## 30 patient health questionnaire new 9
## 31 post traumatic growth inventory orig 21
## 32 post traumatic growth inventory new 21
## 33 rosenberg self-esteem scale (rse) orig 10
## 34 rosenberg self-esteem scale (rse) new 10
## 35 the satisfaction with life scale orig 5
## 36 the satisfaction with life scale new 5
## 37 neff’s self-compassion scale (short-form) orig 12
## 38 neff’s self-compassion scale (short-form) new 12
## 39 general self-efficacy scale orig 10
## 40 general self-efficacy scale new 10
## 41 social dominance orientation scale orig 16
## 42 social dominance orientation scale new 16
## 43 ten-item personality inventory-(tipi) orig 10
## 44 ten-item personality inventory-(tipi) new 10
## 45 the sport motivation scale orig 28
## 46 the sport motivation scale new 28
## 47 the warwick–edinburgh mental well-being scale orig 14
## 48 the warwick–edinburgh mental well-being scale new 14
## 49 brief resilience scale orig 6
## 50 brief resilience scale new 6
## num_correct
## 1 7
## 2 24
## 3 10
## 4 10
## 5 25
## 6 24
## 7 0
## 8 27
## 9 1
## 10 10
## 11 0
## 12 10
## 13 2
## 14 50
## 15 0
## 16 10
## 17 17
## 18 20
## 19 0
## 20 29
## 21 10
## 22 10
## 23 7
## 24 7
## 25 0
## 26 12
## 27 22
## 28 28
## 29 9
## 30 9
## 31 20
## 32 21
## 33 3
## 34 10
## 35 0
## 36 5
## 37 5
## 38 12
## 39 0
## 40 10
## 41 0
## 42 16
## 43 0
## 44 10
## 45 9
## 46 26
## 47 3
## 48 14
## 49 6
## 50 0
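The two files hold the same scores in different shapes. As a minimal sketch of how the long format relates to the wide one, the following base R snippet rebuilds long-format rows from three rows copied from the wide table above. The original reshaping was done in Python, so this is only an illustration; the names `wide` and `long` are hypothetical.

```r
# Three rows copied from the wide table above (illustrative subset).
wide <- data.frame(
  questionnaire = c("academic motivation scale", "gad-7", "brief resilience scale"),
  total = c(28, 7, 6),
  orig  = c(7, 7, 6),
  new   = c(24, 7, 0)
)

# One row per questionnaire and model, interleaved orig/new like data.long.
long <- data.frame(
  questionnaire = rep(wide$questionnaire, each = 2),
  model         = rep(c("orig", "new"), times = nrow(wide)),
  total         = rep(wide$total, each = 2),
  # rbind() stacks orig over new; as.vector() reads column by column,
  # giving orig1, new1, orig2, new2, ...
  num_correct   = as.vector(rbind(wide$orig, wide$new))
)
long
```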
Plot a mirrored density plot of the results data, with the new model above the axis and the original model below it.
library(ggplot2)
p <- ggplot(data.wide) +
  # Top: density of scores for the new model
  geom_density(aes(x = new, y = after_stat(density)), fill = "#69b3a2") +
  annotate("label", x = 4.5, y = 0.25, label = "new", color = "#69b3a2") +
  # Bottom: density of scores for the original model, mirrored below the axis
  geom_density(aes(x = orig, y = -after_stat(density)), fill = "#404080") +
  annotate("label", x = 4.5, y = -0.25, label = "orig", color = "#404080") +
  xlab("Number of questions correct in questionnaire")
p
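We will test whether the scores on the new model differ significantly from the scores on the old model. We fit a linear model with the model as the only predictor, which is equivalent to a two-sample t-test with equal variances. We will use a two-tailed test at a significance level of 0.05.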
model <- lm(formula = num_correct ~ model, data=data.long)
summary(model)
##
## Call:
## lm(formula = num_correct ~ model, data = data.long)
##
## Residuals:
## Min 1Q Median 3Q Max
## -16.16 -6.22 -3.70 3.82 33.84
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 16.160 1.839 8.788 1.46e-11 ***
## modelorig -9.920 2.601 -3.814 0.00039 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.195 on 48 degrees of freedom
## Multiple R-squared: 0.2326, Adjusted R-squared: 0.2166
## F-statistic: 14.55 on 1 and 48 DF, p-value: 0.0003899
The new model is significantly better: on average it gets 9.92 more questions correct per questionnaire, with p = 0.00039.
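The linear model above treats the 50 scores as two independent groups. As a hedged sketch, the snippet below reproduces that comparison with `t.test()` and also runs a paired t-test, which may be a better fit here because both models were scored on the same 25 questionnaires. The vectors are copied from the wide table; the paired test is an illustration and was not part of the original analysis.

```r
# Scores copied from the wide table above, in row order.
orig <- c(7, 10, 25, 0, 1, 0, 2, 0, 17, 0, 10, 7, 0, 22, 9, 20, 3, 0, 5, 0,
          0, 0, 9, 3, 6)
new  <- c(24, 10, 24, 27, 10, 10, 50, 10, 20, 29, 10, 7, 12, 28, 9, 21, 10, 5,
          12, 10, 16, 10, 26, 14, 0)

# Equal-variance two-sample t-test: the same test as lm(num_correct ~ model).
unpaired <- t.test(new, orig, var.equal = TRUE)

# Paired t-test: compares the two models questionnaire by questionnaire.
paired <- t.test(new, orig, paired = TRUE)

unpaired
paired
```

The unpaired call should report the same t statistic (up to sign) and degrees of freedom as the `modelorig` row of the regression summary, while the paired test works on the 25 per-questionnaire differences.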
Let’s do the analysis on the proportion correct in each questionnaire.
data.long$proportion_correct = data.long$num_correct / data.long$total
model <- lm(formula = num_correct ~ model, data=data.long)
summary(model)
##
## Call:
## lm(formula = num_correct ~ model, data = data.long)
##
## Residuals:
## Min 1Q Median 3Q Max
## -16.16 -6.22 -3.70 3.82 33.84
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 16.160 1.839 8.788 1.46e-11 ***
## modelorig -9.920 2.601 -3.814 0.00039 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.195 on 48 degrees of freedom
## Multiple R-squared: 0.2326, Adjusted R-squared: 0.2166
## F-statistic: 14.55 on 1 and 48 DF, p-value: 0.0003899
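We got the same p-value, 0.00039, and in fact identical output, because the call above still uses num_correct as the response. To analyse proportions, the formula needs to be proportion_correct ~ model.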
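A sketch of the proportion-based analysis follows, under the assumption that the intended response was `proportion_correct`. The data are rebuilt from the wide table so that the block runs on its own; `scores` and `fit_prop` are illustrative names, and no output is shown because this model was not part of the original run.

```r
# Values copied from the wide table above, in row order.
total <- c(28, 10, 25, 28, 10, 10, 50, 10, 20, 29, 10, 7, 12, 28, 9, 21, 10, 5,
           12, 10, 16, 10, 28, 14, 6)
orig  <- c(7, 10, 25, 0, 1, 0, 2, 0, 17, 0, 10, 7, 0, 22, 9, 20, 3, 0, 5, 0,
           0, 0, 9, 3, 6)
new   <- c(24, 10, 24, 27, 10, 10, 50, 10, 20, 29, 10, 7, 12, 28, 9, 21, 10, 5,
           12, 10, 16, 10, 26, 14, 0)

# Long-format data frame: 25 rows for orig followed by 25 rows for new.
scores <- data.frame(
  model       = rep(c("orig", "new"), each = length(total)),
  total       = rep(total, times = 2),
  num_correct = c(orig, new)
)
scores$proportion_correct <- scores$num_correct / scores$total

# The response is now the proportion correct, not the raw count.
# "new" is the reference level, so the intercept is the mean proportion for new.
fit_prop <- lm(proportion_correct ~ model, data = scores)
summary(fit_prop)
```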
Let’s find the average score
Average proportion correct for original model:
mean(data.wide$orig/data.wide$total)
## [1] 0.409219
Average proportion correct for new model:
mean(data.wide$new/data.wide$total)
## [1] 0.9484
Therefore, the old model had a mean per-questionnaire accuracy of 41% and the new model 95%. The p-value of 0.00039 comes from the comparison of the number of questions correct.
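The averages above weight every questionnaire equally, so a 5-item scale counts as much as a 50-item one. As a complementary sketch that was not part of the original analysis, accuracy can also be pooled over all questions by dividing the total number correct by the total number of questions. The values are copied from the wide table; `macro` and `micro` are illustrative names.

```r
# Values copied from the wide table above, in row order.
total <- c(28, 10, 25, 28, 10, 10, 50, 10, 20, 29, 10, 7, 12, 28, 9, 21, 10, 5,
           12, 10, 16, 10, 28, 14, 6)
orig  <- c(7, 10, 25, 0, 1, 0, 2, 0, 17, 0, 10, 7, 0, 22, 9, 20, 3, 0, 5, 0,
           0, 0, 9, 3, 6)
new   <- c(24, 10, 24, 27, 10, 10, 50, 10, 20, 29, 10, 7, 12, 28, 9, 21, 10, 5,
           12, 10, 16, 10, 26, 14, 0)

# Per-questionnaire average: each questionnaire has equal weight.
macro <- c(orig = mean(orig / total), new = mean(new / total))

# Pooled average: each question has equal weight.
micro <- c(orig = sum(orig) / sum(total), new = sum(new) / sum(total))

round(rbind(macro, micro), 4)
```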