Latent Trait Model

概略

今回は項目反応理論Item Response Theoryです。

すでに説明したように，この理論は

0/1の二値変数に対して
単一の潜在変数（能力）を想定し
潜在変数の推定に特化した

因子分析モデルであるということができます。

実際には，1PL-,2PL-,3PL-ロジスティックモデルがあるのでした。それぞれのモデル式は，

1PL Logistic model…\[p(\theta)=\frac{1}{1+exp(-1.7*(x-b))}\]
2PL Logistic model…\[p(\theta)=\frac{1}{1+exp(-1.7*a*(x-b))}\]
3PL Logistic model…\[p(\theta)=c+\frac{1-c}{1+exp(-1.7*a*(x-b))}\]

となっています。

実際にやってみましょう。分析にはltmパッケージを使います。

library(ltm)

## Loading required package: MASS
## Loading required package: msm
## Loading required package: polycor
## Loading required package: mvtnorm
## Loading required package: sfsmisc

また，データはltmパッケージのMobilityを使います。女性の社会移動に関する調査データです。データが0/1の形式になっていることに注目してください。

data(Mobility)
help(Mobility)
head(Mobility)

##   Item 1 Item 2 Item 3 Item 4 Item 5 Item 6 Item 7 Item 8
## 1      1      1      1      1      0      0      0      0
## 2      0      0      0      0      0      0      0      0
## 3      0      0      0      0      0      0      0      0
## 4      0      0      0      0      0      0      0      0
## 5      0      0      0      0      0      0      0      0
## 6      0      0      0      0      0      0      0      0

各モデルを当てはめる

まずは1PLモデルから。1PLモデルは別名ラッシュモデルともいい，この名前で関数が用意されています。

result.1pl <- rasch(Mobility)
result.1pl

## 
## Call:
## rasch(data = Mobility)
## 
## Coefficients:
## Dffclt.Item 1  Dffclt.Item 2  Dffclt.Item 3  Dffclt.Item 4  Dffclt.Item 5  
##        -1.010          0.556         -0.826          0.388          1.852  
## Dffclt.Item 6  Dffclt.Item 7  Dffclt.Item 8         Dscrmn  
##         1.493          2.043          1.686          2.499  
## 
## Log.Lik: -23416.48

plot(result.1pl)

項目特性曲線(ICC)をプロットすると傾きが同じで，困難度母数（曲線の左右の位置）だけが異なっていることがわかると思います。

では同じデータを2PLモデルでやってみましょう。

result.2pl <- ltm(Mobility~z1)
result.2pl

## 
## Call:
## ltm(formula = Mobility ~ z1)
## 
## Coefficients:
##         Dffclt  Dscrmn
## Item 1  -1.084   2.109
## Item 2   0.631   2.058
## Item 3  -1.025   1.509
## Item 4   0.400   3.010
## Item 5   1.630   3.976
## Item 6   1.402   3.138
## Item 7   1.699   5.816
## Item 8   1.585   3.022
## 
## Log.Lik: -23141.71

plot(result.2pl)

今度は傾き（識別力）も異なっています。識別力が大きい項目ほどグラフの傾きが急になっていることに注目してください。特定の項目を抜き出して描画するときは，itemsオプションを使います。

plot(result.2pl,items=3)
plot(result.2pl,items=c(3,5))

最後に3PLモデルです。

result.3pl <- tpm(Mobility)
result.3pl

## 
## Call:
## tpm(data = Mobility)
## 
## Coefficients:
##         Gussng  Dffclt  Dscrmn
## Item 1   0.001  -1.064   2.224
## Item 2   0.000   0.642   2.038
## Item 3   0.000  -1.013   1.541
## Item 4   0.013   0.427   3.234
## Item 5   0.001   1.623   4.287
## Item 6   0.011   1.404   4.065
## Item 7   0.000   1.698   5.865
## Item 8   0.010   1.542   4.457
## 
## Log.Lik: -23079.72

plot(result.3pl)

信頼性の情報

項目反応理論では，信頼性はその項目から得られる情報の関数として扱うのでした。項目情報曲線(IIC)は次のようにして描きます。

plot(result.2pl,type="IIC")

plot(result.2pl,type="IIC",items=6)

項目全体からなるテストの総情報関数は，itemsオプションに0を渡すことで示されます。

plot(result.1pl,type="IIC",items=0)

plot(result.2pl,type="IIC",items=0)

plot(result.3pl,type="IIC",items=0)

モデル間比較

パラメタを増やしてモデルを変えていきましたが，最も良いモデルはどれでしょうか。モデルの改良を比較する尤度比検定はanova関数で行います。

anova(result.1pl,result.2pl)

## 
##  Likelihood Ratio Table
##                 AIC      BIC   log.Lik    LRT df p.value
## result.1pl 46850.96 46914.33 -23416.48                  
## result.2pl 46315.43 46428.09 -23141.71 549.53  7  <0.001

anova(result.2pl,result.3pl)

## 
##  Likelihood Ratio Table
##                 AIC      BIC   log.Lik    LRT df p.value
## result.2pl 46315.43 46428.09 -23141.71                  
## result.3pl 46207.43 46376.43 -23079.72 123.99  8  <0.001

スコアの推定

モデルの当てはめで項目の特性がわかれば，その情報をつかて被検者のスコアを推定することができます。普通，項目の数も被検者の数も大変多いデータを対象にしますので，推定には時間がかかります。

scores <- factor.scores(result.2pl,resp.patterns=Mobility)
head(scores$score.dat)

##   Item 1 Item 2 Item 3 Item 4 Item 5 Item 6 Item 7 Item 8 Obs      Exp
## 1      1      1      1      1      0      0      0      0 911 743.8345
## 2      0      0      0      0      0      0      0      0 829 825.5800
## 3      0      0      0      0      0      0      0      0 829 825.5800
## 4      0      0      0      0      0      0      0      0 829 825.5800
## 5      0      0      0      0      0      0      0      0 829 825.5800
## 6      0      0      0      0      0      0      0      0 829 825.5800
##           z1     se.z1
## 1  0.8052771 0.3944680
## 2 -1.3647147 0.6123158
## 3 -1.3647147 0.6123158
## 4 -1.3647147 0.6123158
## 5 -1.3647147 0.6123158
## 6 -1.3647147 0.6123158

hist(scores$score.dat$z1)

ここで求められたスコアは，素点の総合計とかなり高い相関を示します。

cor(apply(Mobility,1,sum),scores$score.dat$z1)

## [1] 0.9790227

plot(apply(Mobility,1,sum),scores$score.dat$z1)

段階反応モデル

反応が数段階ある場合は，段階反応モデルを使うのでした。段階反応モデルは順序性を持ったデータに対するカテゴリカル因子分析モデルと等価であることを思い出しておいてください。

データは同じくltmパッケージにあるScienceを使います。

data(Science)
help(Science)
head(Science)

##          Comfort       Environment           Work         Future
## 1 strongly agree    strongly agree strongly agree          agree
## 2          agree    strongly agree          agree          agree
## 3          agree          disagree       disagree       disagree
## 4          agree             agree       disagree       disagree
## 5          agree strongly disagree strongly agree strongly agree
## 6 strongly agree             agree strongly agree          agree
##       Technology       Industry           Benefit
## 1 strongly agree          agree          disagree
## 2          agree          agree             agree
## 3 strongly agree strongly agree             agree
## 4 strongly agree strongly agree             agree
## 5       disagree          agree strongly disagree
## 6          agree strongly agree             agree

result.grm <- grm(Science)
result.grm

## 
## Call:
## grm(data = Science)
## 
## Coefficients:
##              Extrmt1  Extrmt2  Extrmt3  Dscrmn
## Comfort      -10.768   -5.645    3.097   0.411
## Environment   -2.154   -0.790    0.627   1.570
## Work          32.102    9.261  -24.402  -0.074
## Future       -30.602  -11.806   10.455   0.108
## Technology    -2.462   -0.885    0.642   1.650
## Industry      -2.870   -1.529    0.286   1.642
## Benefit      -21.232   -5.982   10.297   0.136
## 
## Log.Lik: -2998.129

plot(result.grm,items=2)

plot(result.grm,items=2,type="IIC")

plot(result.grm,items=0,type="IIC")

これらの技法を使って，より良いテスト作りの参考としてください。

課題

とある試験のテストデータが次の通り得られました。

test <- matrix(c(
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,1,1,1,1,1,1,1,1,1,1,0,1,0,1,1,1,1,1,1,0,0,1,0,0,1,0,1,0,0,1,1,1,
1,1,1,1,0,1,1,1,1,0,0,1,1,1,1,0,1,1,1,0,0,1,1,0,1,1,1,1,0,1,0,1,0,1,0,1,1,0,1,0,1,0,1,1,1,1,1,1,0,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,0,1,
1,1,1,0,0,1,1,1,1,0,1,1,1,0,1,1,1,1,0,0,1,1,1,1,1,0,1,1,1,1,1,1,0,1,1,0,0,0,0,0,0,1,1,0,1,1,1,1,1,1,
0,1,0,1,1,0,1,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,0,1,0,1,1,1,1,1,1,1,1,1,1,0,1,1,1,0,1,0,1,1,1,1,0,0,1,1,
1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,0,1,0,1,1,1,1,1,1,0,1,
0,1,1,0,0,0,1,1,1,1,1,1,1,0,0,1,0,0,1,1,1,1,1,0,1,1,0,1,0,1,0,1,0,1,0,0,1,0,0,0,1,0,1,1,1,1,1,1,0,1,
1,0,0,0,0,0,1,1,0,0,0,0,1,0,0,0,1,0,1,1,1,1,0,1,1,1,1,0,1,0,1,1,0,1,1,0,1,0,0,0,0,0,0,0,1,1,1,1,0,1,
1,0,1,0,1,1,1,0,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,1,1,0,1,1,0,0,1,1,1,1,1,1,
0,1,0,0,1,1,1,1,0,0,1,1,1,1,1,1,1,1,1,0,1,1,1,0,1,0,0,1,0,1,0,0,0,1,0,1,0,0,0,0,0,0,1,1,1,1,1,1,0,1,
1,1,0,0,0,0,1,1,0,0,1,1,1,1,1,1,0,1,1,1,1,1,1,0,1,1,1,1,1,0,0,1,1,1,0,1,0,1,0,0,0,0,1,1,0,1,0,1,1,1,
1,1,1,0,1,1,1,1,0,1,0,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,0,1,1,1,1,1,0,0,0,0,0,0,0,1,1,1,0,0,1,1,
0,1,1,1,0,0,1,1,1,0,1,0,0,1,1,1,1,1,1,1,1,1,1,1,0,1,1,0,1,1,0,1,1,1,1,1,1,1,0,0,0,1,0,1,1,1,1,0,0,1,
1,1,1,0,0,0,1,0,0,1,1,1,1,0,0,0,1,1,0,1,1,1,0,1,1,1,1,1,1,0,1,1,0,1,1,1,0,0,1,0,0,0,0,1,1,1,1,1,0,1,
1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,1,0,0,1,1,1,1,0,0,1,1,1,1,1,1,0,1,0,1,0,1,0,0,1,0,1,0,0,1,1,1,1,1,0,1,
1,1,1,0,0,0,1,0,1,0,1,1,1,1,0,0,1,0,1,1,1,1,0,0,1,0,0,1,0,1,1,1,0,1,0,1,0,0,0,1,0,0,0,0,1,1,0,1,0,1),byrow=T,nrow=16)

このテストデータ(16名,50問)に対して，1PL,2PL,3PLモデルのいずれかを当てはめなさい。
モデルによる能力値を推定し，最も能力が高い回答者は何行目のひとか，答えなさい。