Test statistic: 10.9577 with 2 degrees of freedom Since the variables in this example are nominal rather than ordinal, the general association test is the only one appropriate for evaluating associations. At a confidence level of \(\alpha = 0.05\) after controlling for school, we reject the null (p-value = 0.0042) for the CMH test for general association in type of program and learning styles after adjusting for a school effect. That is, the data suggest there is an association between program and learning style preferemce.
The Baseline-Category Logit (BCL) model showed each variable (aside from the school2 dummy variable) to be statistically significant (or very nearly statistically significant in terms of the school1 dummy in logit 1, which has a p-value of 0.053
The traditional classroom setting is the baseline group for Y in the BCL model; this can be clearly identified in the SAS variable information output, where learning_style_pref value “c_class” is referred to as the “reference event”. Additionally, c_class is used as the logit denominator in both fitted models, suggesting that it is the baseline level being compared to new learning styles. The way this model is fit makes intuitive sense for evaluating less-traditional approaches.
The program coefficient estimates were 0.742632 for logit1 & 0.74744 for logit2. The odds ratio estimates from the BCL model are \(e^{0.742632} = 2.1015\) for logit1 & \(e^{0.74744} = 2.1116\) for logit2.
Conditioned on the event that the learning preference was either working with teams or traditional classrooms, the estimated odds of learning preference being teamwork versus traditional class for a student in a regular program is 2.1015 times the estimated odds of preference being teamwork for a student in a supplementary afterschool program.
Conditioned on the event that the learning preference was either working alone or traditional classrooms, the estimated odds of learning preference being working alone versus traditional class for a student in a regular program is 2.1116 times the estimated odds of preference being teamwork for a student in a supplementary afterschool program.
school1 <- 0
school2 <- 1
program <- 1
# Probabilities
pi_s1 <- exp(-0.656042 - (0.231888*1) + (0.747440*1))/(1 + exp(-0.653231 + (-0.475489*1) + (0.742632*1)) + exp(-0.656042 - (0.231888*1) + (0.747440*1)))
pi_t1 <- exp(-0.653231 - (0.475489*1) + (0.742632*1))/(1 + exp(-0.653231 + (-0.475489*1) + (0.742632*1)) + exp(-0.656042 - (0.231888*1) + (0.747440*1)))
pi_c1 <- 1/(1 + exp(-0.653231 + (-0.475489*1) + (0.742632*1)) + exp(-0.656042 - (0.231888*1) + (0.747440*1)))
data.frame(self = pi_s1,
team = pi_t1,
class = pi_c1)## self team class
## 1 0.3409392 0.2666951 0.3923657
## [1] 1
pi_s2 <- exp(-0.656042 + (0.747440*1))/(1 + exp(-0.653231 + (0.742632*1)) + exp(-0.656042 + (0.747440*1)))
pi_t2 <- exp(-0.656042 + (0.742632*1))/(1 + exp(-0.653231 + (0.742632*1)) + exp(-0.656042 + (0.747440*1)))
pi_c2 <- 1/(1 + exp(-0.656042 + (0.747440*1)) + exp(-0.656042 + (0.742632*1)))
data.frame(self = pi_s2,
team = pi_t2,
class = pi_c2)## self team class
## 1 0.3435648 0.3419169 0.313858
## [1] 0.9993396
Predicted Probabilities of preference for a student from school 2 without a supplementary program
\(\hat{\pi}_{self} = \frac{e^{-0.656042 - (0.231888*school2) + (0.747440*regular)}}{1 + e^{-0.653231 + (-0.475489*school2) + (0.742632*regular)) + exp(-0.656042 - (0.231888*school2) + (0.747440*regular)}} = 0.3409392\)
\(\hat{\pi}_{team} = \frac{e^{-0.653231 - (0.475489*school2) + (0.742632*regular)}}{1 + e^{-0.653231 + (-0.475489*school2) + (0.742632*regular)) + exp(-0.656042 - (0.231888*school2) + (0.747440*regular)}} = 0.2666951\)
\(\hat{\pi}_{class} = \frac{1}{1 + e^{-0.653231 + (-0.475489*school2) + (0.742632*regular)} + e^{-0.656042 - (0.231888*school2) + (0.747440*regular)}} = 0.3923657\)
Predicted Probability of preference for a student from school 3 in a supplementary program
\(\hat{\pi}_{self} = \frac{e^{-0.656042 + (0.747440*regular)}}{1 + e^{-0.653231 + (0.742632*regular)} + e^{-0.656042 + (0.747440*regular)}} = 0.3435648\)
\(\hat{\pi}_{self} = \frac{e^{-0.656042 + (0.742632*regular)}}{1 + e^{-0.653231 + (0.742632*regular)} + e^{-0.656042 + (0.747440*regular)}} = 0.3419169\)
\(\hat{\pi}_{self} = \frac{1}{1 + e^{-0.656042 + (0.747440*regular)} + e^{-0.656042 + (0.742632*regular)}} = 0.313858\)
LR Test for Age effect
\(H_{0}: \beta_{age} = 0\) There is no age effect on probability of frequency of self-exam
\(H_{A}: \beta_{age} \neq 0\) Age has some effect on probability of frequency of self-exam
Test Statistic: \(\chi^{2}_{df = 2} = 24.4799\)
With a p-value less than 0.0001, we reject the null hypothesis indicating that age does have some effect on frequency of self-exam.
Derivation of ageb: [40(age = ‘< 45’)] + [52.5(age=‘45-59’)] + [65(age = ‘60+’)];
Recodes young to age 40, mid to age 52.5, and older to 65
The estimated odds ratio from the last model fit (using the derived ageb variable) was 1.029, which can be interpreted to mean that the odds of higher frequency of self-examination for women ageb = 40 are 1.029 times that of women who are ageb = 52.5, and the odds of higher frequency self-exams for women ageb = 52.5 are 1.029 times that of women who are ageb = 65.
Based on significance of MLE’s, it seems as if the model using the dummy variables fits these data best. The model using the derived ageb variable had an MLE estimate for intercept 2 that deviated from the other estimates in direction and was substantially less significant relative to the other models’ estimates.
df <- data.frame(alt.trt = c(0, 0, 1, 1),
fem = c(0, 1, 0, 1),
y1 = c(28, 4, 41, 12),
y2 = c(45, 12, 44, 7),
y3 = c(29, 5, 20, 3),
y4 = c(26, 2, 20, 1))
kable(df)| alt.trt | fem | y1 | y2 | y3 | y4 |
|---|---|---|---|---|---|
| 0 | 0 | 28 | 45 | 29 | 26 |
| 0 | 1 | 4 | 12 | 5 | 2 |
| 1 | 0 | 41 | 44 | 20 | 20 |
| 1 | 1 | 12 | 7 | 3 | 1 |
null.model <- vglm(as.matrix(df[,3:6]) ~ 1,
family = cumulative(parallel = TRUE),
data = df)
summary(null.model) ##
## Call:
## vglm(formula = as.matrix(df[, 3:6]) ~ 1, family = cumulative(parallel = TRUE),
## data = df)
##
## Pearson residuals:
## logitlink(P[Y<=1]) logitlink(P[Y<=2]) logitlink(P[Y<=3])
## 1 -1.2988 -1.3437 -0.7374
## 2 -1.4692 0.5292 1.0511
## 3 0.9841 0.7042 -0.1800
## 4 2.2391 0.9990 1.1082
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept):1 -0.9233 0.1282 -7.202 5.95e-13 ***
## (Intercept):2 0.5993 0.1209 4.957 7.16e-07 ***
## (Intercept):3 1.6296 0.1562 10.431 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Names of linear predictors: logitlink(P[Y<=1]), logitlink(P[Y<=2]),
## logitlink(P[Y<=3])
##
## Residual deviance: 16.479 on 9 degrees of freedom
##
## Log-likelihood: -30.9973 on 9 degrees of freedom
##
## Number of Fisher scoring iterations: 4
##
## No Hauck-Donner effect found in any of the estimates
fit <- vglm(as.matrix(df[,3:6]) ~ alt.trt + fem,
family = cumulative(parallel = TRUE),
data = df)
f.sum <- summary(fit); f.sum##
## Call:
## vglm(formula = as.matrix(df[, 3:6]) ~ alt.trt + fem, family = cumulative(parallel = TRUE),
## data = df)
##
## Pearson residuals:
## logitlink(P[Y<=1]) logitlink(P[Y<=2]) logitlink(P[Y<=3])
## 1 0.1720 0.06056 0.2809
## 2 -1.6543 0.16312 0.8425
## 3 0.2655 -0.13909 -0.9386
## 4 0.6174 -0.01519 0.6444
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept):1 -1.3180 0.1801 -7.319 2.5e-13 ***
## (Intercept):2 0.2492 0.1621 1.538 0.12412
## (Intercept):3 1.3001 0.1852 7.021 2.2e-12 ***
## alt.trt 0.5807 0.2119 2.741 0.00613 **
## fem 0.5414 0.2953 1.834 0.06671 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Names of linear predictors: logitlink(P[Y<=1]), logitlink(P[Y<=2]),
## logitlink(P[Y<=3])
##
## Residual deviance: 5.5677 on 7 degrees of freedom
##
## Log-likelihood: -25.5417 on 7 degrees of freedom
##
## Number of Fisher scoring iterations: 5
##
## No Hauck-Donner effect found in any of the estimates
##
##
## Exponentiated coefficients:
## alt.trt fem
## 1.787262 1.718403
The estimated odds ratio for the alternate therapy effect indcates that the odds of a poor response from a patient in an alternating therapy group are \(e^{0.6493} = 1.914009\) times the odds of poor response from a patient in a sequential group.
The model estimated odds ratio for the gender indicates that the odds of a poor response in a female are \(e^{0.2368} = 1.2671877\) times the odds of poor response in a male.
The alternate therapy effect was found to be significant with a p-value of 0.0066 while the gender effect was not found to be significant (p-value = 0.45).
int.fit <- vglm(as.matrix(df[,3:6]) ~ alt.trt + fem + alt.trt*fem,
family = cumulative(parallel = T),
data = df)
summary(int.fit)##
## Call:
## vglm(formula = as.matrix(df[, 3:6]) ~ alt.trt + fem + alt.trt *
## fem, family = cumulative(parallel = T), data = df)
##
## Pearson residuals:
## logitlink(P[Y<=1]) logitlink(P[Y<=2]) logitlink(P[Y<=3])
## 1 0.02471 -0.13434 0.1387
## 2 -1.30855 0.56305 1.0374
## 3 0.48236 0.01789 -0.7993
## 4 0.05105 -0.39051 0.4948
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept):1 -1.2757 0.1843 -6.920 4.52e-12 ***
## (Intercept):2 0.2957 0.1681 1.760 0.0785 .
## (Intercept):3 1.3452 0.1909 7.045 1.85e-12 ***
## alt.trt 0.4881 0.2288 2.133 0.0329 *
## fem 0.2742 0.4094 0.670 0.5030
## alt.trt:fem 0.5904 0.5935 0.995 0.3199
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Names of linear predictors: logitlink(P[Y<=1]), logitlink(P[Y<=2]),
## logitlink(P[Y<=3])
##
## Residual deviance: 4.5209 on 6 degrees of freedom
##
## Log-likelihood: -25.0183 on 6 degrees of freedom
##
## Number of Fisher scoring iterations: 5
##
## No Hauck-Donner effect found in any of the estimates
##
##
## Exponentiated coefficients:
## alt.trt fem alt.trt:fem
## 1.629170 1.315475 1.804727
## Likelihood ratio test
##
## Model 1: as.matrix(df[, 3:6]) ~ alt.trt + fem
## Model 2: as.matrix(df[, 3:6]) ~ alt.trt + fem + alt.trt * fem
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 7 -25.542
## 2 6 -25.018 -1 1.0468 0.3062
Test Statistic: \(\chi^{2} = 1.1677\)
At a confidence level of \(\alpha = 0.05\), the likelihood ratio test indicates with a p.value = 0.2799 that the interaction effect between alternating therapy and gender is not statistically significant and thus negligble.
gender.fit <- vglm(as.matrix(df[,3:6]) ~ fem,
family = cumulative(parallel = TRUE),
data = df)
lrtest(fit, gender.fit)## Likelihood ratio test
##
## Model 1: as.matrix(df[, 3:6]) ~ alt.trt + fem
## Model 2: as.matrix(df[, 3:6]) ~ fem
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 7 -25.542
## 2 8 -29.327 1 7.5702 0.005934 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
trt.fit <- vglm(as.matrix(df[,3:6]) ~ alt.trt,
family = cumulative(parallel = TRUE),
data = df)
lrtest(fit, trt.fit)## Likelihood ratio test
##
## Model 1: as.matrix(df[, 3:6]) ~ alt.trt + fem
## Model 2: as.matrix(df[, 3:6]) ~ alt.trt
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 7 -25.542
## 2 8 -27.340 1 3.5965 0.0579 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Test Statistic: \(\chi^{2} = 7.5702\)
At a confidence level of \(\alpha = 0.05\), the likelihood ratio test indicates with a p.value = 0.0059 that the effect of alternating therapy is statistically significant in the model and should be kept.
Test Statistic: \(\chi^{2} = 3.5965\)
At a confidence level of \(\alpha = 0.05\), the likelihood ratio test indicates with a p.value = 0.058 that the gender effect is not statistically significant.