Week 10: CTT Assignment

Load packages

library(tidyverse)

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.2.0     ✔ readr     2.1.6
## ✔ forcats   1.0.1     ✔ stringr   1.6.0
## ✔ ggplot2   4.0.1     ✔ tibble    3.3.1
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.2
## ✔ purrr     1.2.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(rio)
library(psych)

## 
## Attaching package: 'psych'
## 
## The following objects are masked from 'package:ggplot2':
## 
##     %+%, alpha

library(CTT)

## 
## Attaching package: 'CTT'
## 
## The following objects are masked from 'package:psych':
## 
##     polyserial, reliability

Load Dataset

Context of the Dataset

This dataset contains responses from 100 students who completed a 20-item multiple-choice test designed to measure their understanding of basic statistics concepts.

Each item is scored dichotomously, meaning:

1 = Correct answer / 0 = Incorrect answer

The purpose of this dataset is to help students practice Classical Test Theory (CTT) analysis.

data <- import("CTt_week10_assignment_dataset.xlsx")

Check descriptive statistics

Use the head() function to inspect the first rows and check the data quality

head(data)

##   StudentID Item1 Item2 Item3 Item4 Item5 Item6 Item7 Item8 Item9 Item10 Item11
## 1      S001     0     0     0     0     0     0     1     0     1      0      0
## 2      S002     1     1     0     1     1     1     1     1     1      0      0
## 3      S003     1     1     1     1     1     1     1     1     0      0      0
## 4      S004     0     0     1     0     1     1     0     0     0      0      0
## 5      S005     1     0     0     1     1     1     0     1     0      0      0
## 6      S006     0     1     1     1     1     1     1     1     1      1      1
##   Item12 Item13 Item14 Item15 Item16 Item17 Item18 Item19 Item20
## 1      1      1      0      0      0      0      0      0      0
## 2      0      0      0      1      1      0      0      1      1
## 3      1      0      0      1      1      0      0      0      1
## 4      0      0      0      0      0      0      0      0      0
## 5      1      0      0      1      0      0      0      0      1
## 6      1      0      1      1      1      1      1      0      0

Use the describe() function to calculate descriptive statistics

describe(data)

##            vars   n  mean    sd median trimmed   mad min max range  skew
## StudentID*    1 100 50.50 29.01   50.5   50.50 37.06   1 100    99  0.00
## Item1         2 100  0.67  0.47    1.0    0.71  0.00   0   1     1 -0.71
## Item2         3 100  0.65  0.48    1.0    0.69  0.00   0   1     1 -0.62
## Item3         4 100  0.61  0.49    1.0    0.64  0.00   0   1     1 -0.44
## Item4         5 100  0.68  0.47    1.0    0.72  0.00   0   1     1 -0.76
## Item5         6 100  0.63  0.49    1.0    0.66  0.00   0   1     1 -0.53
## Item6         7 100  0.65  0.48    1.0    0.69  0.00   0   1     1 -0.62
## Item7         8 100  0.54  0.50    1.0    0.55  0.00   0   1     1 -0.16
## Item8         9 100  0.47  0.50    0.0    0.46  0.00   0   1     1  0.12
## Item9        10 100  0.56  0.50    1.0    0.58  0.00   0   1     1 -0.24
## Item10       11 100  0.51  0.50    1.0    0.51  0.00   0   1     1 -0.04
## Item11       12 100  0.56  0.50    1.0    0.58  0.00   0   1     1 -0.24
## Item12       13 100  0.39  0.49    0.0    0.36  0.00   0   1     1  0.44
## Item13       14 100  0.42  0.50    0.0    0.40  0.00   0   1     1  0.32
## Item14       15 100  0.43  0.50    0.0    0.41  0.00   0   1     1  0.28
## Item15       16 100  0.40  0.49    0.0    0.38  0.00   0   1     1  0.40
## Item16       17 100  0.40  0.49    0.0    0.38  0.00   0   1     1  0.40
## Item17       18 100  0.36  0.48    0.0    0.32  0.00   0   1     1  0.57
## Item18       19 100  0.34  0.48    0.0    0.30  0.00   0   1     1  0.67
## Item19       20 100  0.30  0.46    0.0    0.25  0.00   0   1     1  0.86
## Item20       21 100  0.34  0.48    0.0    0.30  0.00   0   1     1  0.67
##            kurtosis   se
## StudentID*    -1.24 2.90
## Item1         -1.51 0.05
## Item2         -1.63 0.05
## Item3         -1.82 0.05
## Item4         -1.44 0.05
## Item5         -1.74 0.05
## Item6         -1.63 0.05
## Item7         -1.99 0.05
## Item8         -2.01 0.05
## Item9         -1.96 0.05
## Item10        -2.02 0.05
## Item11        -1.96 0.05
## Item12        -1.82 0.05
## Item13        -1.92 0.05
## Item14        -1.94 0.05
## Item15        -1.86 0.05
## Item16        -1.86 0.05
## Item17        -1.69 0.05
## Item18        -1.57 0.05
## Item19        -1.27 0.05
## Item20        -1.57 0.05

Interpretation:

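The dataset contains responses from 100 students across 20 items. The head() output confirms the dataset loaded correctly, with one StudentID column followed by 20 item columns. In the describe() output, every item has n = 100, a minimum of 0, and a maximum of 1, so there are no missing or out-of-range responses. Item means range from 0.30 (Item19) to 0.68 (Item4), indicating variation in item difficulty. The statistics reported for StudentID are not meaningful, because describe() converts that character column to numeric codes (marked with an asterisk). The dataset appears clean and appropriate for CTT analysis.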
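The data quality check described above can also be done directly, rather than read off the describe() table. The following is a minimal base R sketch; the `toy` data frame is hypothetical and only stands in for the Item1 to Item20 columns of `data`.

```r
# Hypothetical toy responses (NOT the assignment data); in the assignment,
# the item columns of `data` would take the place of `toy`.
toy <- data.frame(
  Item1 = c(1, 0, 1, 1, 0, 1),
  Item2 = c(1, 1, 0, 1, 0, 0),
  Item3 = c(0, 0, 1, 1, 0, 1)
)

# Number of missing responses per item; all zeros means nothing is missing
missing_per_item <- colSums(is.na(toy))

# TRUE only if every response is coded 0 or 1 (an NA would make this FALSE)
all_binary <- all(unlist(toy) %in% c(0, 1))

print(missing_per_item)
print(all_binary)
```

If either check fails, the affected rows should be inspected before running the item analysis.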

Item analysis with CTT model

Select only response data

response_data <- data %>%
  select(Item1:Item20)

Run the item analysis and save it as itemanalysis_ctt

itemanalysis_ctt <- itemAnalysis(response_data)

Print the item characteristics report

itemanalysis_ctt

## 
##  Number of Items 
##  20 
## 
##  Number of Examinees 
##  100 
## 
##  Coefficient Alpha 
##  0.812
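As a rough cross-check on the coefficient alpha reported above, alpha can be computed by hand from the standard formula, alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores). The sketch below uses a hypothetical `toy` data frame, not the assignment data, and assumes complete 0/1 responses; itemAnalysis() may handle missing data differently.

```r
# Hypothetical toy responses (NOT the assignment data)
toy <- data.frame(
  Item1 = c(1, 1, 1, 1, 0, 0),
  Item2 = c(1, 1, 1, 0, 0, 0),
  Item3 = c(1, 1, 0, 1, 0, 0),
  Item4 = c(1, 0, 1, 0, 1, 0)
)

k <- ncol(toy)                   # number of items
item_var <- sapply(toy, var)     # sample variance of each item
total_var <- var(rowSums(toy))   # sample variance of the total scores

# Coefficient alpha from the standard formula
alpha_by_hand <- (k / (k - 1)) * (1 - sum(item_var) / total_var)
print(round(alpha_by_hand, 3))
```

The printed summary shows only test-level results. According to the CTT package documentation, the per-item statistics used in the tasks below should be available in the `itemReport` element of the returned object (`itemanalysis_ctt$itemReport`); this is worth confirming against the installed package version.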

1. Item Mean (Item Difficulty)

(Task 1-1)

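Three items with the highest difficulty index (highest proportion correct, so the easiest items):

Item4 (0.68), Item1 (0.67), Item2 (0.65, tied with Item6 at 0.65)

Three items with the lowest difficulty index (lowest proportion correct, so the hardest items):

Item19 (0.30), Item18 (0.34), Item20 (0.34)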

(Task 1-2)

Item difficulty looks good overall. All 20 items fall within the acceptable range of 0.30 to 0.80 (observed values run from 0.30 to 0.68), with a healthy mix of easier and harder questions to measure a range of student abilities.
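The difficulty ranking and range check can be sketched in base R as follows. For 0/1 items, the difficulty index is simply the column mean (proportion correct). The `toy` data frame is hypothetical, and the 0.30 and 0.80 cutoffs are the ones used in this assignment.

```r
# Hypothetical toy responses (NOT the assignment data)
toy <- data.frame(
  Item1 = c(1, 1, 1, 1, 1, 0),
  Item2 = c(1, 1, 1, 0, 0, 0),
  Item3 = c(1, 1, 0, 1, 0, 0),
  Item4 = c(1, 0, 0, 0, 0, 0)
)

# Difficulty index p: proportion of examinees answering each item correctly
p_values <- colMeans(toy)

# Rank items from easiest (highest p) to hardest (lowest p)
p_sorted <- sort(p_values, decreasing = TRUE)

# Items whose p falls outside the 0.30 to 0.80 range used above
outside_range <- names(p_values)[p_values < 0.30 | p_values > 0.80]

print(round(p_sorted, 2))
print(outside_range)
```

In this toy example Item1 is too easy and Item4 is too hard, so both are flagged; in the assignment data no item would be flagged.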

2. Point-biserial Correlation (Item Discrimination)

(Task 2-1)

Three items with strongest item discrimination:

Item9 (0.60), Item8 (0.51), Item15 (0.49)

Three items with weakest item discrimination:

Item13 (0.21), Item7 (0.25), Item6 (0.25)

(Task 2-2)

Discrimination looks strong overall: most items have point-biserial correlations above 0.30, and none fall below the 0.20 threshold (the lowest is Item13 at 0.21). The items at the lower end (Item13, Item7, and Item6) could use a closer look, but the test holds up well across the board.
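To make the discrimination index concrete, here is a minimal base R sketch of a corrected item-total (point-biserial) correlation: each item is correlated with the total score on the remaining items. The `toy` data frame is hypothetical, and the assumption that this matches the package's pBis definition should be checked against the CTT documentation.

```r
# Hypothetical toy responses (NOT the assignment data)
toy <- data.frame(
  Item1 = c(1, 1, 1, 1, 0, 0),
  Item2 = c(1, 1, 1, 0, 0, 0),
  Item3 = c(1, 1, 0, 1, 0, 0),
  Item4 = c(1, 0, 1, 0, 1, 0)
)

# Correlate each item with the sum of all OTHER items (the "rest score"),
# so the item does not inflate its own correlation with the total
corrected_pbis <- sapply(names(toy), function(item) {
  rest_score <- rowSums(toy[, names(toy) != item, drop = FALSE])
  cor(toy[[item]], rest_score)
})

# Items below the 0.20 threshold used above
flagged <- names(corrected_pbis)[corrected_pbis < 0.20]

print(round(corrected_pbis, 2))
print(flagged)
```

In this toy example only Item4 is flagged, because its responses are unrelated to performance on the other items.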

3. Overall Interpretation

Overall, the test holds up well. Coefficient alpha is 0.812, difficulty values are spread between 0.30 and 0.68, and most items do a good job separating higher and lower performers. The items with the weakest discrimination (Item13, Item7, and Item6) could be revised, but the test is solid overall.