QUESTION 1

Simulate a datafile containing two groups with different means and standard deviations.
Conduct the IRT analysis but don’t model groups in the IRT analysis.
Compare the results to the simulated values.

To answer the first question, the following simulation was set up:
set.seed(123)
N = 3000
gender 1: M=0, SD=1.5 (n=1500)
gender 2: M=0.5, SD=1 (n=1500)
I=20
delta=seq(-2,2,len=I)

Using the seed above, a response pattern was simulated with the logistic (Rasch) function and random draws from the uniform distribution (runif). The generated thetas were plotted on the x axis, and the first estimated plausible values (from tam.mml without the “group=gender” option, followed by tam.pv) were plotted on the y axis (Figure 1). For gender 1, the generating mean and SD of 0 and 1.5 were recovered as -0.20 and 1.36, respectively. For gender 2, the generating mean and SD of 0.5 and 1 were recovered as 0.20 and 1.07, respectively.
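The data-generation step described above can be sketched in base R as follows. This is a minimal sketch under the stated settings, not the original script: the object names (theta, p, resp) are assumptions, and the exact values drawn depend on the order of the random calls, so the group statistics will not necessarily match those reported.

```r
set.seed(123)                               # call the function; do not assign to it

N      <- 3000                              # total respondents
I      <- 20                                # number of items
delta  <- seq(-2, 2, length.out = I)        # item difficulties
gender <- rep(1:2, each = N / 2)            # 1500 respondents per group

# Generating abilities: gender 1 ~ N(0, 1.5), gender 2 ~ N(0.5, 1)
theta <- c(rnorm(N / 2, mean = 0,   sd = 1.5),
           rnorm(N / 2, mean = 0.5, sd = 1))

# Rasch probability of a correct response for every person-item pair
p <- plogis(outer(theta, delta, "-"))       # N x I matrix of P(X = 1)

# A response is scored 1 when a uniform draw falls below that probability
resp <- matrix(as.integer(runif(N * I) < p), nrow = N, ncol = I)
colnames(resp) <- paste0("I", 1:I)

# Generating means and SDs by group, for later comparison
tapply(theta, gender, mean)
tapply(theta, gender, sd)
```

Comparing a uniform draw with the model probability is one common way to turn Rasch probabilities into 0/1 responses; rbinom would be an equivalent alternative.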


Figure 1.

To complete the second part of the first question, tam.mml was rerun with the “group=gender” option. Again, the generated thetas were plotted on the x axis and the estimated plausible values on the y axis (Figure 2). For gender 1, the generating mean and SD of 0 and 1.5 were recovered as 0.01 and 1.46, respectively. For gender 2, the generating mean and SD of 0.5 and 1 were recovered as 0.54 and 1.02, respectively. When gender is modelled as a conditioning variable, the known parameters are recovered more closely.
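The two analyses can be compared side by side with a sketch like the one below. It assumes the TAM package is installed and regenerates the data so that it runs on its own. The helper pv_by_group is hypothetical (it is not a TAM function), and the numbers it returns will vary with the random draws rather than reproduce the reported values exactly.

```r
library(TAM)   # assumes the TAM package is installed

# Regenerate the Question 1 data (same settings as described above)
set.seed(123)
N <- 3000; I <- 20
delta  <- seq(-2, 2, length.out = I)
gender <- rep(1:2, each = N / 2)
theta  <- c(rnorm(N / 2, 0, 1.5), rnorm(N / 2, 0.5, 1))
resp   <- matrix(as.integer(runif(N * I) < plogis(outer(theta, delta, "-"))),
                 nrow = N, ncol = I)
colnames(resp) <- paste0("I", 1:I)

# Hypothetical helper: mean and SD of the first plausible value per group
pv_by_group <- function(mod, group) {
  pv1 <- tam.pv(mod)$pv$PV1.Dim1
  rbind(mean = tapply(pv1, group, mean),
        sd   = tapply(pv1, group, sd))
}

mod_single <- tam.mml(resp, verbose = FALSE)                  # one population
mod_group  <- tam.mml(resp, group = gender, verbose = FALSE)  # one per gender

res_single <- pv_by_group(mod_single, gender)
res_group  <- pv_by_group(mod_group, gender)

round(res_single, 2)   # groups not modelled
round(res_group, 2)    # groups modelled
```

Only the first plausible value is summarised here, matching the description in the text; averaging the statistics over all plausible values returned by tam.pv would give more stable estimates.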


Figure 2.

To summarise, when respondents’ ability distributions differ by a key demographic variable, the conditioning option (group=gender) recovers the group parameters more accurately.
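One way to see why the ungrouped analysis pulls the groups together is that a single-population model has to describe both groups with one mean and one SD. The arithmetic below is an illustrative sketch, not part of the original analysis; it computes the mean and SD of the 50/50 mixture of the two generating distributions from Question 1.

```r
# 50/50 mixture of the two generating distributions from Question 1
m <- c(0, 0.5)     # group means
s <- c(1.5, 1)     # group SDs
w <- c(0.5, 0.5)   # group weights (1500 of 3000 each)

pooled_mean <- sum(w * m)                             # 0.25
pooled_var  <- sum(w * (s^2 + m^2)) - pooled_mean^2   # 1.6875
pooled_sd   <- sqrt(pooled_var)                       # about 1.30

c(mean = pooled_mean, sd = pooled_sd)
```

Plausible values drawn under one common distribution are shrunk toward it, which is consistent with the ungrouped results above: the gap between the group means fell from the generating 0.50 to 0.40 (0.20 minus -0.20), and both group SDs (1.36 and 1.07) moved from their generating values toward the pooled SD.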

QUESTION 2

Simulate a data file containing two groups with the same means and standard deviations.
Conduct IRT analysis where group is modelled in the IRT analysis.
Compare the results to the simulated values.

To answer the second question, the following simulation was set up:
set.seed(123)
N = 3000
gender 1: M=0, SD=1.5 (n=1500) # the same distribution
gender 2: M=0, SD=1.5 (n=1500) # the same distribution
I=20
delta=seq(-2,2,len=I)

Using the seed above, a response pattern was simulated with the logistic (Rasch) function and random draws from the uniform distribution (runif). The generated thetas were plotted on the x axis, and the first estimated plausible values (from tam.mml without the “group=gender” option, followed by tam.pv) were plotted on the y axis (Figure 3). For gender 1, the generating mean and SD of 0.00 and 1.50 were recovered as 0.01 and 1.51, respectively. For gender 2, the same generating mean and SD of 0.00 and 1.50 were recovered as -0.11 and 1.44, respectively.


Figure 3.

To complete the second part of the second question, the tam.mml function was used with the “group=gender” option. Again, the generated thetas were plotted on the x axis and the estimated plausible values were plotted on the y axis (Figure 4). For gender 1, the mean and SD of 0.00 and 1.50 became 0.00 and 1.44, respectively. For gender 2, the same mean and SD of 0.00 and 1.5 became -0.03 and 1.44, respectively.

Figure 4.

Figures 3 and 4 illustrate two approaches to analysing response data when the population parameters are of interest. Figure 3 shows the distribution of the plausible values when the two gender groups (which happen to have identical generating means and SDs) are not modelled as a conditioning variable. Figure 4 shows the distribution when the gender groups are modelled. Comparing the two sets of results suggests that, when the groups share the same distribution, including the conditioning variable makes no substantive difference to the estimated plausible values.