Analysis of answer accuracy by the difficulty level of questions

Are workers more accurate in their answers when they consider questions less difficult?

How does the accuracy of each answer option (YES, NO) relate to the difficulty level of the questions?

## [1] "All Answers - difficulty"
##   Difficulty  Accuracy Precision    Recall Answers
## 1          1 0.8446115 0.4487179 0.6481481     399
## 2          2 0.7840671 0.4117647 0.4941176     477
## 3          3 0.6784203 0.3073394 0.4652778     709
## 4          4 0.6439232 0.3072289 0.4951456     469
## 5          5 0.5954545 0.3222222 0.5087719     220
## Total answers:2274
## [1] "Only YES Answers - difficulty"
##   Difficulty  Accuracy Precision Recall Answers
## 1          1 0.4487179 0.4487179      1      78
## 2          2 0.4117647 0.4117647      1     102
## 3          3 0.3073394 0.3073394      1     218
## 4          4 0.3072289 0.3072289      1     166
## 5          5 0.3222222 0.3222222      1      90
## Total answers:654
## [1] "Only NO Answers - difficulty"
##   Difficulty  Accuracy Precision Recall Answers
## 1          1 0.9408100         0      0     321
## 2          2 0.8853333         0      0     375
## 3          3 0.8431772         0      0     491
## 4          4 0.8283828         0      0     303
## 5          5 0.7846154         0      0     130
## Total answers:1620
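
The three tables are linked through the same confusion-matrix counts, which is why the YES-only rows always have a recall of 1 and an accuracy equal to precision. The sketch below reproduces the difficulty 1 rows. The counts (`tp`, `fp`, `fn`, `tn`) are not taken from the raw data: they are inferred from the reported ratios, and the code assumes that a YES answer is the positive class. The helper name `metrics` is hypothetical.

```r
# Counts for difficulty 1, inferred from the reported ratios (an assumption,
# not values read from the raw data). YES is treated as the positive class.
tp <- 35   # YES answers that were correct
fp <- 43   # YES answers that were wrong
fn <- 19   # NO answers that were wrong
tn <- 302  # NO answers that were correct

# Hypothetical helper: standard accuracy, precision and recall.
metrics <- function(tp, fp, fn, tn) {
  c(Accuracy  = (tp + tn) / (tp + fp + fn + tn),
    Precision = tp / (tp + fp),
    Recall    = tp / (tp + fn))
}

all_answers <- metrics(tp, fp, fn, tn)

# Restricting to YES answers removes every NO answer, so fn and tn become 0:
# accuracy collapses to precision and recall becomes 1.
yes_only <- metrics(tp, fp, fn = 0, tn = 0)

# Restricting to NO answers leaves only tn and fn.
no_accuracy <- tn / (tn + fn)

print(round(all_answers, 7))
print(round(yes_only, 7))
print(round(no_accuracy, 7))
```

Under these inferred counts, the YES-only and NO-only tables are two views of the same data rather than independent measurements.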

In all three groupings, accuracy decreases as difficulty increases; the only exception is a slight rebound at difficulty 5 for YES answers. This still needs to be confirmed by computing a correlation.
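
One way to run the correlation check mentioned above is a Spearman rank correlation between difficulty and the accuracy column of each table. This is a minimal sketch that uses only the five aggregated values per table as printed above; a check on the per-answer data would be more informative, and with five points the result is only indicative. The variable names are illustrative.

```r
# Accuracy per difficulty level, copied from the three tables above.
difficulty <- 1:5
acc_all <- c(0.8446115, 0.7840671, 0.6784203, 0.6439232, 0.5954545)
acc_yes <- c(0.4487179, 0.4117647, 0.3073394, 0.3072289, 0.3222222)
acc_no  <- c(0.9408100, 0.8853333, 0.8431772, 0.8283828, 0.7846154)

# Spearman's rho measures whether accuracy falls monotonically as
# difficulty rises, without assuming a linear relationship.
rho <- sapply(
  list(All = acc_all, Yes = acc_yes, No = acc_no),
  function(acc) cor(difficulty, acc, method = "spearman")
)

print(round(rho, 2))
```

A value of -1 means accuracy drops at every step; the YES-only table is weaker than the other two (-0.7) because of the rebound at difficulty 5.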

How does the accuracy of answers by worker profession relate to the difficulty level of the questions?

## [1] "Only PROFESSIONAL_DEVELOPERS - difficulty"
##   Difficulty  Accuracy Precision    Recall Answers
## 1          1 0.8333333 0.4473684 0.7391304     162
## 2          2 0.7783784 0.4000000 0.4848485     185
## 3          3 0.6889632 0.3068182 0.4576271     299
## 4          4 0.6406250 0.2647059 0.4864865     192
## 5          5 0.6551724 0.5000000 0.5666667      87
## Total answers:925
## [1] "Only HOBBYIST - difficulty"
##   Difficulty  Accuracy Precision    Recall Answers
## 1          1 0.8552632 0.2727273 0.5000000      76
## 2          2 0.8108108 0.5200000 0.5909091     111
## 3          3 0.6827586 0.3958333 0.5277778     145
## 4          4 0.6571429 0.3421053 0.5416667     105
## 5          5 0.6440678 0.3200000 0.6666667      59
## Total answers:496
## [1] "Only GRADUATE_STUDENT - difficulty"
##   Difficulty  Accuracy Precision    Recall Answers
## 1          1 0.9000000 0.6000000 0.7500000      30
## 2          2 0.8529412 0.5714286 0.6666667      34
## 3          3 0.5609756 0.1764706 0.4285714      82
## 4          4 0.7142857 0.3684211 0.5384615      63
## 5          5 0.5666667 0.2307692 0.5000000      30
## Total answers:239
## [1] "Only OTHER - difficulty"
##   Difficulty  Accuracy Precision    Recall Answers
## 1          1 0.8571429 0.6000000 0.7500000      42
## 2          2 0.7631579 0.5000000 0.5555556      38
## 3          3 0.7500000 0.2000000 0.4000000      44
## 4          4 0.7407407 0.5833333 0.7777778      27
## 5          5 0.5000000 0.0000000       NaN      10
## Total answers:161
## Warning: Removed 1 rows containing missing values (geom_path).

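As the charts above show, accuracy and difficulty are inversely related across professions as well, although the decline is not strictly monotonic (for example, graduate students are more accurate at difficulty 4 than at difficulty 3).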
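
To compare the strength of the trend between professions while accounting for the very different numbers of answers per row, one option is a weighted linear fit per profession. The sketch below is an assumption about how this could be done, not the analysis used for the charts: it uses the aggregated rows of the four profession tables above, weights each row by its `Answers` count, and reports the slope of accuracy on difficulty.

```r
# Rows copied from the four profession tables above.
prof <- data.frame(
  Profession = rep(c("PROFESSIONAL_DEVELOPERS", "HOBBYIST",
                     "GRADUATE_STUDENT", "OTHER"), each = 5),
  Difficulty = rep(1:5, times = 4),
  Accuracy = c(0.8333333, 0.7783784, 0.6889632, 0.6406250, 0.6551724,
               0.8552632, 0.8108108, 0.6827586, 0.6571429, 0.6440678,
               0.9000000, 0.8529412, 0.5609756, 0.7142857, 0.5666667,
               0.8571429, 0.7631579, 0.7500000, 0.7407407, 0.5000000),
  Answers = c(162, 185, 299, 192, 87,
              76, 111, 145, 105, 59,
              30, 34, 82, 63, 30,
              42, 38, 44, 27, 10)
)

# Fit accuracy ~ difficulty separately for each profession, weighting each
# row by the number of answers behind it, and keep the slope.
slopes <- sapply(split(prof, prof$Profession), function(g) {
  fit <- lm(Accuracy ~ Difficulty, data = g, weights = Answers)
  coef(fit)[["Difficulty"]]
})

print(round(slopes, 3))
```

A negative slope for every profession would support the inverse relationship; the magnitude indicates how much accuracy is lost per difficulty level.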

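Is accuracy of answers also inversely related to question difficulty across different Java methods?

Yes, broadly, as the charts below show. The decline is not strictly monotonic for every method; for example, accuracy for HIT05_35 peaks at difficulty 2.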

## [1] "Only HIT01_8 - difficulty"
##   Difficulty  Accuracy Precision    Recall Answers
## 1          1 0.9677419 1.0000000 0.8666667      62
## 2          2 0.8627451 0.5000000 0.5714286      51
## 3          3 0.7073171 0.3888889 0.8750000      41
## 4          4 0.6086957 0.2857143 0.3333333      23
## 5          5 0.6875000 0.2500000 0.3333333      16
## Total answers:193
## [1] "Only HIT02_24 - difficulty"
##   Difficulty  Accuracy Precision    Recall Answers
## 1          1 0.6842105 0.2000000 0.3333333      19
## 2          2 0.6296296 0.5555556 0.4545455      27
## 3          3 0.5675676 0.5294118 0.5294118      37
## 4          4 0.5238095 0.1666667 1.0000000      21
## 5          5 0.5714286 0.3333333 0.5000000       7
## Total answers:111
## [1] "Only HIT03_6 - difficulty"
##   Difficulty  Accuracy Precision    Recall Answers
## 1          1 0.8387097 0.3333333 0.2500000      31
## 2          2 0.6862745 0.2142857 0.3750000      51
## 3          3 0.6436782 0.2400000 0.3333333      87
## 4          4 0.6500000 0.4666667 0.5384615      80
## 5          5 0.6111111 0.4117647 0.6363636      36
## Total answers:285
## [1] "Only HIT04_7 - difficulty"
##   Difficulty  Accuracy Precision    Recall Answers
## 1          1 0.8085106 0.2000000 0.6666667      47
## 2          2 0.8547009 0.2941176 0.5000000     117
## 3          3 0.7028302 0.2459016 0.4687500     212
## 4          4 0.7225806 0.2857143 0.6363636     155
## 5          5 0.5625000 0.2105263 0.6153846      80
## Total answers:611
## [1] "Only HIT05_35 - difficulty"
##   Difficulty  Accuracy Precision    Recall Answers
## 1          1 0.5882353 0.3333333 0.7500000      17
## 2          2 0.7142857 0.6363636 0.7777778      21
## 3          3 0.6129032 0.4545455 0.4545455      62
## 4          4 0.5116279 0.3846154 0.6666667      43
## 5          5 0.5625000 0.4285714 0.5000000      16
## Total answers:159
## [1] "Only HIT06_51 - difficulty"
##   Difficulty  Accuracy Precision    Recall Answers
## 1          1 0.8571429 0.5000000 0.4000000      35
## 2          2 0.7916667 0.3333333 0.2500000      48
## 3          3 0.7043478 0.1851852 0.2941176     115
## 4          4 0.6582278 0.1818182 0.3076923      79
## 5          5 0.6562500 0.3636364 0.5000000      32
## Total answers:309
## [1] "Only HIT07_33 - difficulty"
##   Difficulty  Accuracy Precision    Recall Answers
## 1          1 0.9600000 1.0000000 0.8571429      25
## 2          2 0.8055556 0.7777778 0.5833333      36
## 3          3 0.6382979 0.6153846 0.4000000      47
## 4          4 0.4242424 0.2500000 0.2307692      33
## 5          5 0.3333333 0.5000000 0.2500000       6
## Total answers:147
## [1] "Only HIT08_54 - difficulty"
##   Difficulty  Accuracy Precision    Recall Answers
## 1          1 0.8343558 0.2500000 0.5384615     163
## 2          2 0.7619048 0.3214286 0.4500000     126
## 3          3 0.7129630 0.2000000 0.7000000     108
## 4          4 0.7142857 0.2500000 0.3333333      35
## 5          5 0.6296296 0.5000000 0.4000000      27
## Total answers:459

Considering only professional programmers, is accuracy of answers also inversely related to question difficulty across different Java methods?

Yes, broadly, as the charts below show. Several methods rebound at difficulty 5, where each method has at most 22 answers.

## [1] "Only HIT01_8 - difficulty"
##   Difficulty  Accuracy Precision    Recall Answers
## 1          1 0.9565217 1.0000000 0.8333333      23
## 2          2 0.8421053 0.3333333 0.5000000      19
## 3          3 0.8000000 0.4000000 1.0000000      15
## 4          4 0.3333333 0.2000000 0.3333333       9
## 5          5 0.8571429 1.0000000 0.5000000       7
## Total answers:73
## [1] "Only HIT02_24 - difficulty"
##   Difficulty  Accuracy Precision    Recall Answers
## 1          1 0.8000000 0.0000000 0.0000000       5
## 2          2 0.7500000 0.7500000 0.6000000      12
## 3          3 0.6666667 0.5454545 0.8571429      18
## 4          4 0.6666667 0.3333333 1.0000000      12
## 5          5 0.8000000 1.0000000 0.5000000       5
## Total answers:52
## [1] "Only HIT03_6 - difficulty"
##   Difficulty  Accuracy Precision    Recall Answers
## 1          1 0.8461538 0.5000000 0.5000000      13
## 2          2 0.6190476 0.2857143 0.4000000      21
## 3          3 0.6428571 0.2142857 0.4285714      42
## 4          4 0.5833333 0.2727273 0.3000000      36
## 5          5 0.5909091 0.4000000 0.5714286      22
## Total answers:134
## [1] "Only HIT04_7 - difficulty"
##   Difficulty  Accuracy Precision    Recall Answers
## 1          1 0.7333333 0.2000000 1.0000000      15
## 2          2 0.9000000 0.3333333 0.6666667      50
## 3          3 0.7674419 0.3684211 0.4666667      86
## 4          4 0.7352941 0.2272727 0.8333333      68
## 5          5 0.6315789 0.5000000 0.7142857      19
## Total answers:238
## [1] "Only HIT05_35 - difficulty"
##   Difficulty  Accuracy Precision    Recall Answers
## 1          1 0.6250000 0.4000000 1.0000000       8
## 2          2 0.5714286 0.5000000 1.0000000       7
## 3          3 0.5555556 0.2857143 0.2222222      27
## 4          4 0.4761905 0.3571429 0.7142857      21
## 5          5 0.6363636 0.5000000 0.5000000      11
## Total answers:74
## [1] "Only HIT06_51 - difficulty"
##   Difficulty  Accuracy Precision    Recall Answers
## 1          1 0.8235294 0.3333333 0.5000000      17
## 2          2 0.6666667 0.2500000 0.3333333      15
## 3          3 0.6538462 0.1428571 0.2500000      52
## 4          4 0.7037037 0.2000000 0.2000000      27
## 5          5 0.6470588 0.3333333 0.5000000      17
## Total answers:128
## [1] "Only HIT07_33 - difficulty"
##   Difficulty  Accuracy Precision    Recall Answers
## 1          1 1.0000000 1.0000000 1.0000000      11
## 2          2 0.8333333 0.6666667 0.5000000      18
## 3          3 0.6000000 0.5000000 0.3750000      20
## 4          4 0.6666667 0.5000000 0.3333333       9
## 5          5       NaN 0.0000000       NaN       0
## Total answers:58
## [1] "Only HIT08_54 - difficulty"
##   Difficulty  Accuracy Precision    Recall Answers
## 1          1 0.8142857 0.2666667 0.6666667      70
## 2          2 0.7441860 0.2857143 0.2500000      43
## 3          3 0.7179487 0.1666667 0.6666667      39
## 4          4 0.6000000 0.0000000 0.0000000      10
## 5          5 0.6666667 1.0000000 0.5000000       6
## Total answers:168
## Warning: Removed 1 rows containing missing values (geom_path).
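
The warning above is consistent with the HIT07_33 table, where difficulty 5 has no answers from professional programmers. Accuracy is then 0 correct answers out of 0, which R evaluates to `NaN`, and ggplot2 drops that point from the line. A minimal sketch of the situation, using the HIT07_33 rows above and an illustrative data frame name:

```r
# HIT07_33 rows for professional programmers, copied from the table above.
hit07 <- data.frame(
  Difficulty = 1:5,
  Accuracy   = c(1.0000000, 0.8333333, 0.6000000, 0.6666667, NaN),
  Answers    = c(11, 18, 20, 9, 0)
)

# With zero answers, accuracy is 0 / 0, which is NaN in R.
empty_accuracy <- 0 / 0

# Dropping undefined rows explicitly before plotting makes the omission
# visible in the code instead of relying on the plotting warning.
plottable <- hit07[!is.nan(hit07$Accuracy), ]

print(plottable)
```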

Considering only undergraduate students, is accuracy of answers also inversely related to question difficulty across different Java methods?

Yes, for 7 out of 8 Java methods, as the charts below show.

## [1] "Only HIT01_8 - difficulty"
##   Difficulty  Accuracy Precision    Recall Answers
## 1          1 0.9411765       1.0 0.8000000      17
## 2          2 0.9285714       1.0 0.5000000      14
## 3          3 0.7500000       0.4 0.6666667      16
## 4          4 0.5000000       0.0 0.0000000       2
## 5          5 0.5000000       0.0       NaN       4
## Total answers:53
## [1] "Only HIT02_24 - difficulty"
##   Difficulty  Accuracy Precision Recall Answers
## 1          1 0.6000000       0.0    NaN       5
## 2          2 0.4000000       0.0    0.0       5
## 3          3 0.4285714       0.5    0.5       7
## 4          4 0.5000000       0.0    NaN       2
## 5          5 0.0000000       0.0    NaN       1
## Total answers:20
## [1] "Only HIT03_6 - difficulty"
##   Difficulty  Accuracy Precision    Recall Answers
## 1          1 1.0000000 0.0000000       NaN       5
## 2          2 0.7272727 0.0000000       NaN      11
## 3          3 0.7058824 0.6666667 0.3333333      17
## 4          4 0.4615385 0.4000000 0.3333333      13
## 5          5 0.0000000 0.0000000 0.0000000       2
## Total answers:48
## [1] "Only HIT04_7 - difficulty"
##   Difficulty  Accuracy Precision    Recall Answers
## 1          1 0.8666667 0.0000000 0.0000000      15
## 2          2 0.6956522 0.1666667 0.3333333      23
## 3          3 0.6410256 0.1875000 0.7500000      39
## 4          4 0.6250000 0.2000000 0.3333333      32
## 5          5 0.4444444 0.1250000 0.2500000      18
## Total answers:127
## [1] "Only HIT05_35 - difficulty"
##   Difficulty  Accuracy Precision    Recall Answers
## 1          1 0.8000000 0.0000000 0.0000000       5
## 2          2 0.7500000 0.0000000 0.0000000       4
## 3          3 0.5454545 0.4000000 0.5000000      11
## 4          4 0.3333333 0.3333333 0.3333333       6
## 5          5 0.3333333 0.0000000 0.0000000       3
## Total answers:29
## [1] "Only HIT06_51 - difficulty"
##   Difficulty  Accuracy Precision Recall Answers
## 1          1 0.8888889       1.0    0.5       9
## 2          2 0.8571429       0.5    0.5      14
## 3          3 0.7333333       0.0    0.0      15
## 4          4 0.5238095       0.0    0.0      21
## 5          5 0.5000000       0.0    0.0       2
## Total answers:61
## [1] "Only HIT07_33 - difficulty"
##   Difficulty  Accuracy Precision    Recall Answers
## 1          1 0.7500000       1.0 0.5000000       4
## 2          2 0.8750000       0.0 0.0000000       8
## 3          3 0.6666667       1.0 0.3333333       6
## 4          4 0.5000000       0.5 0.5000000       4
## 5          5 0.0000000       0.0 0.0000000       1
## Total answers:23
## [1] "Only HIT08_54 - difficulty"
##   Difficulty  Accuracy Precision    Recall Answers
## 1          1 0.7586207 0.0000000 0.0000000      29
## 2          2 0.7000000 0.1428571 0.2500000      30
## 3          3 0.8571429 0.3333333 0.3333333      28
## 4          4 1.0000000 0.0000000       NaN       2
## 5          5 0.6666667 0.0000000 0.0000000       3
## Total answers:92
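
Many rows in the undergraduate tables rest on very few answers, so individual accuracy values are noisy. As a rough illustration (not part of the original analysis), the sketch below computes exact binomial confidence intervals with `binom.test` for two HIT08_54 rows. The numbers of correct answers are inferred from the reported accuracy and answer counts, and the helper name `accuracy_ci` is hypothetical.

```r
# Hypothetical helper: exact (Clopper-Pearson) 95% interval for an accuracy
# estimated from `correct` correct answers out of `n` answers.
accuracy_ci <- function(correct, n) {
  ci <- binom.test(correct, n)$conf.int
  c(lower = ci[1], upper = ci[2])
}

# HIT08_54, difficulty 4: accuracy 1.0 from 2 answers, so 2 correct (inferred).
small <- accuracy_ci(2, 2)

# HIT08_54, difficulty 1: accuracy 0.7586207 from 29 answers, so 22 correct
# (inferred, since 22 / 29 = 0.7586207).
larger <- accuracy_ci(22, 29)

print(round(small, 3))
print(round(larger, 3))
```

The interval for the two-answer row spans most of the 0 to 1 range, so a perfect score there says little about how difficulty affects accuracy.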