NDDT Data

Load libraries and read NDDT data from S3

## Reading NDDT Item meta-data
## read  1024 records on  60 items

Load the consensus correlation network

Load the minimum spanning tree (MST) of the correlation network
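As a rough illustration of what this step produces, the sketch below builds a minimum spanning tree from a correlation matrix with Prim's algorithm and counts the degree of each node. It is not the code used for this report: the distance `1 - |r|`, the use of degree as the centrality measure, the function names, and the toy correlations are all assumptions made for the example.

```python
from collections import Counter

def minimum_spanning_tree(names, corr):
    """Prim's algorithm on distances d = 1 - |r|; returns a list of (u, v) edges."""
    n = len(names)
    # Assumed distance: strongly correlated items are "close".
    dist = [[1.0 - abs(corr[i][j]) for j in range(n)] for i in range(n)]
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Cheapest edge joining a node in the tree to a node outside it.
        i, j = min(
            ((a, b) for a in in_tree for b in range(n) if b not in in_tree),
            key=lambda e: dist[e[0]][e[1]],
        )
        edges.append((names[i], names[j]))
        in_tree.add(j)
    return edges

def degree_centrality(edges):
    """Count how many MST edges touch each item."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree

# Toy correlations, made up for illustration (not NDDT data).
items = ["a", "b", "c", "d"]
toy_corr = [
    [1.0, 0.8, 0.7, 0.6],
    [0.8, 1.0, 0.2, 0.1],
    [0.7, 0.2, 1.0, 0.3],
    [0.6, 0.1, 0.3, 1.0],
]
mst_edges = minimum_spanning_tree(items, toy_corr)
hub, hub_degree = degree_centrality(mst_edges).most_common(1)[0]
print(mst_edges, hub, hub_degree)
```

In the toy matrix, item `a` correlates most strongly with every other item, so all three MST edges attach to it and it comes out as the hub.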

In the MST, L3Q9 (level 3) and L5Q9 (level 5) have higher centrality than the other items, so we select these two as target items for CHAID segmentation.
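For readers unfamiliar with CHAID, the core of each split is a chi-square test of independence between a candidate predictor and the target. The sketch below shows only that selection step for 0/1 items. It is a simplified illustration, not the routine used here: full CHAID also merges categories and works with adjusted p-values, and the function names and toy responses are hypothetical. Because every 0/1 predictor gives a test with one degree of freedom, ranking by the statistic is the same as ranking by p-value.

```python
def chi_square(x, y):
    """Pearson chi-square statistic for two equal-length 0/1 sequences."""
    n = len(x)
    counts = {(a, b): 0 for a in (0, 1) for b in (0, 1)}
    for a, b in zip(x, y):
        counts[(a, b)] += 1
    stat = 0.0
    for a in (0, 1):
        for b in (0, 1):
            row = counts[(a, 0)] + counts[(a, 1)]   # total with x == a
            col = counts[(0, b)] + counts[(1, b)]   # total with y == b
            expected = row * col / n
            if expected > 0:
                stat += (counts[(a, b)] - expected) ** 2 / expected
    return stat

def best_split(predictors, target):
    """Name of the predictor most strongly associated with the target."""
    return max(predictors, key=lambda name: chi_square(predictors[name], target))

# Toy responses, made up for illustration (1 = correct, 0 = incorrect).
target = [0, 0, 0, 0, 1, 1, 1, 1]
predictors = {
    "related_item":   [0, 0, 0, 1, 1, 1, 1, 1],
    "unrelated_item": [0, 1, 0, 1, 0, 1, 0, 1],
}
print(best_split(predictors, target))
```

Applying the same choice recursively inside each resulting subgroup yields a tree like the ones printed below.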

## *********************** 
## Segmenting  l3q9 
## ***********************

## 
## Model formula:
## y ~ l1q1 + l1q2 + l1q3 + l1q4 + l1q5 + l1q6 + l1q7 + l1q8 + l1q9 + 
##     l1q10 + l1q11 + l1q12 + l2q1 + l2q2 + l2q3 + l2q4 + l2q5 + 
##     l2q6 + l2q7 + l2q8 + l2q9 + l2q10 + l2q11 + l2q12 + l3q1 + 
##     l3q2 + l3q3 + l3q4 + l3q5 + l3q6 + l3q7 + l3q8 + l3q10 + 
##     l3q11 + l3q12 + l4q1 + l4q2 + l4q3 + l4q4 + l4q5 + l4q6 + 
##     l4q7 + l4q8 + l4q9 + l4q10 + l4q11 + l4q12 + l5q1 + l5q2 + 
##     l5q3 + l5q4 + l5q5 + l5q6 + l5q7 + l5q8 + l5q9 + l5q10 + 
##     l5q11 + l5q12
## 
## Fitted party:
## [1] root
## |   [2] l3q8 in 0
## |   |   [3] l4q4 in 0
## |   |   |   [4] l3q5 in 0: 0 (n = 194, err = 7.7%)
## |   |   |   [5] l3q5 in 1: 0 (n = 47, err = 34.0%)
## |   |   [6] l4q4 in 1
## |   |   |   [7] l2q8 in 0: 0 (n = 105, err = 16.2%)
## |   |   |   [8] l2q8 in 1
## |   |   |   |   [9] l4q5 in 0
## |   |   |   |   |   [10] l2q2 in 0: 0 (n = 70, err = 41.4%)
## |   |   |   |   |   [11] l2q2 in 1: 1 (n = 42, err = 33.3%)
## |   |   |   |   [12] l4q5 in 1
## |   |   |   |   |   [13] l3q11 in 0: 0 (n = 45, err = 13.3%)
## |   |   |   |   |   [14] l3q11 in 1: 1 (n = 7, err = 28.6%)
## |   [15] l3q8 in 1
## |   |   [16] l3q7 in 0: 0 (n = 43, err = 39.5%)
## |   |   [17] l3q7 in 1
## |   |   |   [18] l2q12 in 0: 1 (n = 20, err = 40.0%)
## |   |   |   [19] l2q12 in 1: 1 (n = 82, err = 4.9%)
## 
## Number of inner nodes:     9
## Number of terminal nodes: 10
## Total records:  655 
## % of correct responses:  0.340458 
## True Positives:  123 
## True Negatives:  404 
## False Positives:  28 
## False Negatives:  100 
## Precision: tp/(tp+fp):  0.8145695 
## Recall: tp/(tp+fn):  0.5515695 
## Accuracy: (tp+tn)/n:  0.8045802 
## *********************** 
## Segmenting  l5q9 
## ***********************

## 
## Model formula:
## y ~ l1q1 + l1q2 + l1q3 + l1q4 + l1q5 + l1q6 + l1q7 + l1q8 + l1q9 + 
##     l1q10 + l1q11 + l1q12 + l2q1 + l2q2 + l2q3 + l2q4 + l2q5 + 
##     l2q6 + l2q7 + l2q8 + l2q9 + l2q10 + l2q11 + l2q12 + l3q1 + 
##     l3q2 + l3q3 + l3q4 + l3q5 + l3q6 + l3q7 + l3q8 + l3q9 + l3q10 + 
##     l3q11 + l3q12 + l4q1 + l4q2 + l4q3 + l4q4 + l4q5 + l4q6 + 
##     l4q7 + l4q8 + l4q9 + l4q10 + l4q11 + l4q12 + l5q1 + l5q2 + 
##     l5q3 + l5q4 + l5q5 + l5q6 + l5q7 + l5q8 + l5q10 + l5q11 + 
##     l5q12
## 
## Fitted party:
## [1] root
## |   [2] l5q4 in 0: 0 (n = 149, err = 4.0%)
## |   [3] l5q4 in 1
## |   |   [4] l4q6 in 0: 0 (n = 53, err = 9.4%)
## |   |   [5] l4q6 in 1
## |   |   |   [6] l5q10 in 0: 0 (n = 13, err = 7.7%)
## |   |   |   [7] l5q10 in 1: 1 (n = 20, err = 10.0%)
## 
## Number of inner nodes:    3
## Number of terminal nodes: 4
## Total records:  235 
## % of correct responses:  0.1276596 
## True Positives:  18 
## True Negatives:  203 
## False Positives:  2 
## False Negatives:  12 
## Precision: tp/(tp+fp):  0.9 
## Recall: tp/(tp+fn):  0.6 
## Accuracy: (tp+tn)/n:  0.9404255
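The summary figures in both outputs can be recomputed from the four confusion-matrix counts. The short check below uses only the counts printed above; the function and key names are illustrative. Note that recall divides by `tp + fn`, and that the value reported as "% of correct responses" matches `(tp + fn) / n`, that is, the proportion of records with a correct answer on the target item, expressed as a fraction rather than a percentage.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard summary metrics from confusion-matrix counts."""
    n = tp + tn + fp + fn
    return {
        "n": n,
        "base_rate": (tp + fn) / n,     # share of records whose true class is 1
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "accuracy": (tp + tn) / n,
    }

# Counts copied from the two CHAID outputs above.
l3q9 = classification_metrics(tp=123, tn=404, fp=28, fn=100)
l5q9 = classification_metrics(tp=18, tn=203, fp=2, fn=12)

for name, m in (("l3q9", l3q9), ("l5q9", l5q9)):
    print(name, {k: round(v, 7) for k, v in m.items()})
```

Both trees are more precise than they are sensitive: few predicted positives are wrong, but a sizeable share of the actual correct responses (about 45% for L3Q9 and 40% for L5Q9) is missed.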

The top predictor (the first split in the tree) for L3Q9 (multiplying two-digit numbers) was L3Q8 (concept of addition and multiplication). The top predictor for L5Q9 (converting fractions to decimals) was L5Q4 (place values in multiplication).

- All five selected predictors were in the same cluster as L3Q8.