library("sandwich")
library("zoo")
library("lmtest")
library("MASS")
library("pscl")
library("LogisticDx")
library("ucminf")
library("ordinal")
library("reshape")
library("generalhoslem")

# install.packages("devtools")
library("devtools")

library(devtools)

# install_version("oglmx", version = "3.0.0.0",
#                 repos = "http://cran.us.r-project.org")

library("oglmx")
library("aod")
library("brant")
library(dplyr)

1 Exercise 1

We illustrate the ordered probit and logit techniques with a model of corporate bond ratings. The dataset ratings.csv contains information on 98 U.S. corporations’ bond ratings and financial characteristics where the bond ratings are AAA (excellent) to C (poor). The integer codes underlying the ratings increase in the quality of the firm’s rating, such that an increase in the response variable indicates that the firm’s bonds are a more attractive investment opportunity. The bond rating variable (rating83c) is coded as integers 2–5, with 5 corresponding to the highest quality (AAA) bonds to the lowest. We model the 1983 bond rating as a function of the firm’s income-to-asset ratio in 1983 (ia83: roughly, return on assets) and the change in that ratio from 1982 to 1983 (dia).

Estimate ordered logit for rating83c.
Are ia83 and dia jointly significant?
Interpret parameters’ coefficients.
Perform goodness-of-fit tests.
Perform the Brant’s test.
Calculate and interpret pseudo-$R^2$ statistics.
Calculate marginal effects and interpret them.

1.1 a) Estimate ordered logit for `rating83c`.

rat <- read.csv(file = "Ratings.csv", header = TRUE, sep = ",")
rat %>%
    dplyr::select(rating83c, ia83, dia) %>%
    as_tibble()

## # A tibble: 98 × 3
##    rating83c  ia83     dia
##    <chr>     <dbl>   <dbl>
##  1 BA_B_C    14.4    5.23 
##  2 BAA        1.70  -4.80 
##  3 AA_A      14.6    0.434
##  4 AAA        2.16 -10.8  
##  5 AAA       16.5   -2.52 
##  6 AAA        8.57   2.41 
##  7 BAA       12.3    0.319
##  8 AAA       17.5    0.112
##  9 BAA       10.2    4.21 
## 10 AA_A      21.2   -1.96 
## # ℹ 88 more rows

table(rat$rating83c)

## 
##   AA_A    AAA BA_B_C    BAA 
##     15     29     26     28

rr <- rep(0, 98)
rr[which(rat$rating83c == "BA_B_C")] <- 2
rr[rat$rating83c == "BAA"] <- 3
rr[rat$rating83c == "AA_A"] <- 4
rr[rat$rating83c == "AAA"] <- 5

数据预处理，这是回归分析前的关键步骤。

数据选择：rat %>% dplyr::select(...) 用于从原始数据集 rat 中提取建模所需的三个变量：目标变量 rating83c（1983年评级）和自变量 ia83（收益资产比）、dia（收益资产比的变化）。
数值化处理：由于 rating83c 原始数据可能是字符型（如 “AAA”, “BAA”），代码通过创建一个向量 rr 并使用索引赋值，将其转化为 2, 3, 4, 5 的整数序列。
排序逻辑：在有序 Logit 中，数值的大小必须反映等级的顺序。这里的 5 代表最高等级（AAA），符合“数值越大，质量越高”的逻辑。

# Estimate ordered logit for \texttt{rating83c}.
# polr from MASS package
# polr 是 "proportional odds logistic regression"（比例优势逻辑回归）的缩写
ologit <- polr(as.factor(rr) ~ ia83 + dia, data = rat)
summary(ologit)

## Call:
## polr(formula = as.factor(rr) ~ ia83 + dia, data = rat)
## 
## Coefficients:
##         Value Std. Error t value
## ia83  0.09392    0.02962   3.171
## dia  -0.08669    0.04498  -1.927
## 
## Intercepts:
##     Value   Std. Error t value
## 2|3 -0.1853  0.3571    -0.5188
## 3|4  1.1857  0.3882     3.0544
## 4|5  1.9084  0.4165     4.5822
## 
## Residual Deviance: 254.5429 
## AIC: 264.5429

1.1.1 代码和参数

polr 函数：全称是 Proportional Odds Logistic Regression。它专门用于估计因变量为有序分类（Ordinal）的模型。
as.factor(rr)：这是关键步骤。因变量必须转化为因子（Factor）类型，R 才能识别出这是一个分类任务，而非普通的线性回归。这里的 rr 对应 2-5 的信用评级。
ia83 + dia：自变量。ia83 是 1983 年的收入资产比，dia 是该比例的年度变化。
默认设置：该函数默认使用逻辑分布（Logistic distribution），对应的就是 Ordered Logit。

1.1.2 Coefficients（回归系数）

这部分展示了自变量对潜在倾向值（Latent Score）的影响。

ia83 (Value = 0.09392)：系数为正且 t 值 (3.171) 较大。这表明企业的收入资产比越高，其债券评级上升（向更高数值 5 移动）的可能性显著增加。
dia (Value = -0.08669)：系数为负，t 值 (-1.927) 接近显著性临界值（通常为 1.96）。这意味着收入资产比的增长（变动额）对评级可能有负向影响，但在 5% 的置信水平下略显勉强。

1.1.3 Intercepts（切点/阈值）

在有序模型中，这些被称为 Cutpoints（通常记为 $\zeta$）。它们将连续的潜在变量划分为不同的观察等级：

2|3 (-0.1853)：区分评级 2 和评级 3 的临界点。
3|4 (1.1857)：区分评级 3 和评级 4 的临界点。
4|5 (1.9084)：区分评级 4 和评级 5 (AAA) 的临界点。

这些值随着评级升高而递增，符合逻辑。

1.1.4 模型拟合统计量

Residual Deviance (254.5429)：残差离差，用于评估模型相对于饱和模型的拟合程度，通常用于似然比检验。
AIC (264.5429)：赤池信息准则。用于模型比较，数值越小代表模型在拟合度和简洁度之间的平衡越好。

1.2 重要注释与注意事项

潜在变量模型：有序 Logit 假设存在一个不可观测的连续变量 $y^*$，其公式为： \[y^* = \beta_1 \text{ia83} + \beta_2 \text{dia} + \epsilon\] 当 $y^*$ 超过某个切点（如 4|5）时，我们才能观察到评级从 4 变为 5。
系数解释：这里的 0.09392 并不直接代表概率的变化，而是代表 Log-Odds（对数几率）的变化。若要解释为几率比（Odds Ratio），需要计算 $\exp(0.09392) \approx 1.098$，即 ia83 每增加一个单位，评级提升一个等级的几率增加约 9.8%。
t 值与显著性：polr 的 ==输出默认不提供p值== 。通常需要根据 t value 进行判断：如果 $|t| > 1.96$，则在 5% 水平下显著。

1.2.1 变量含义（白话版）

ia83 (Income-to-Asset Ratio)：赚钱效率。即“每一块钱资产能换回多少利润”。数值越高，说明公司底气越足，还债能力越强。
dia (Change in ia83)：成长波动。即“今年比去年赚钱效率变动了多少”。数值正向增加，说明公司在扩张或转型。

1.2.2 系数现实逻辑

ia83 系数 (0.09392 > 0)：稳健经营获利。 现实意义：公司越能赚钱，债权人越放心。ia83 每提高一个单位，公司评级上调的几率增加 9.8%。这是评级的“加分项”。
dia 系数 (-0.08669 < 0)：变动即风险。 现实意义：债券市场最怕“不确定性”。即便收入在增加，但如果变动剧烈（dia 绝对值大），评级机构可能认为公司经营不稳或风险偏好上升。这是评级的“潜在扣分项”。

1.2.3 核心结论

对债券评级而言，“现在的赚钱能力”比“增长速度”重要得多。评级机构更看重公司当下的厚实程度（ia83），而对业绩的剧烈波动（dia）持谨慎甚至负面态度。

1.3 `coeftest` 系数显著性检验

coeftest(ologit)

## 
## t test of coefficients:
## 
##       Estimate Std. Error t value Pr(>|t|)   
## ia83  0.093918   0.029620  3.1708 0.002059 **
## dia  -0.086694   0.044979 -1.9274 0.056977 . 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

这段输出是使用 lmtest 包中的 coeftest 函数对有序 Logit 模型进行的系数显著性检验。它为每个变量提供了精确的 $p$ 值，让我们能更科学地判断变量是否“入流”。

1.3.1 代码与参数解释

coeftest(ologit)：该函数的主要作用是计算模型系数的** $z$ 检验**（输出中标记为 $t$ test，但在大样本逻辑回归中通常遵循正态分布）。
对比 summary：之前的 summary(ologit) ==默认不提供 $p$ 值==，只给 $t$ 值。coeftest 补齐了这一环，方便直接判断变量是否在统计学上显著。

1.3.2 输出解读

表格中的四列数据分别对应：

Estimate (估计值)：系数的大小和方向。
Std. Error (标准误)：系数估计的不确定性。
t value ($z$ 统计量)：系数除以标准误。绝对值越大，越不可能为零。
Pr(>|t|) ($p$ 值)：这是核心。它代表“如果该变量实际没影响，我们观测到当前结果的概率”。

1.3.3 变量表现

ia83 (0.002059 **)： $p$ 值远小于 0.01。这意味着在 1% 的显著性水平下，公司的“赚钱效率”对债券评级有极显著的正向影响。
dia (0.056977 .)： $p$ 值略大于 0.05，但小于 0.1。这意味着它在 5% 的水平下不显著，但在 10% 的水平下是显著的。在严谨的学术报告中，这通常被称为“边际显著（Marginally Significant）”。

1.3.4 显著性代码（Signif. codes）注释

R 语言用小符号直观标注了显著程度：

**：代表极显著（$p < 0.01$）。
.：代表勉强显著或趋势性显著（$p < 0.1$）。
如果没有符号：代表在统计上与零没有区别（不显著）。

1.3.5 现实意义小结

通过这组检验，我们可以确信：

“赚钱效率 (ia83)” 是评级机构打分的铁证，绝对不容忽视。而 “赚钱能力的变动 (dia)” 虽然有一定的负向趋势，但在数据证据上还没有达到“板上钉钉”的程度（未过 0.05 门槛）。

1.4 b) Are `ia83` and `dia` jointly significant?

# Are \texttt{ia83} and \texttt{dia} jointly significant?
# Likelihood ratio test
ologit.restricted <- polr(as.factor(rr) ~ 1, data = rat)
lrtest(ologit, ologit.restricted)

## Likelihood ratio test
## 
## Model 1: as.factor(rr) ~ ia83 + dia
## Model 2: as.factor(rr) ~ 1
##   #Df  LogLik Df  Chisq Pr(>Chisq)   
## 1   5 -127.27                        
## 2   3 -133.04 -2 11.542   0.003117 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

这段输出展示了对 ia83 和 dia 进行的联合显著性检验（Joint Significance Test），通常使用似然比检验（Likelihood Ratio Test, LRT）。

1.4.1 代码与参数解释

ologit.restricted：这是一个受限模型（Restricted Model）。它只包含截距项（~ 1），不包含任何自变量。其假设（原假设 $H_0$）是：ia83 和 dia 的系数均为 0。
lrtest(ologit, ologit.restricted)：该函数比较“全模型”（包含自变量）与“受限模型”的对数似然值（Log-Likelihood）。它判断增加这两个自变量是否显著提升了模型的解释能力。

1.4.2 输出解读

LogLik (对数似然值)：
- 模型 1（全模型）：-127.27
- 模型 2（仅截距）：-133.04
- 逻辑：全模型的对数似然值更大（更接近 0），说明拟合效果更好。
Chisq (卡方统计量)：数值为 11.542。这是通过计算两个模型对数似然值差异的两倍得到的：$LR = 2 \times (-127.27 - (-133.04))$。
Df (自由度)：差异为 2。因为全模型比受限模型多了两个参数（ia83 和 dia）。
Pr(>Chisq) ($p$ 值)：数值为 0.003117。

1.4.3 结论

由于 $p$ 值（0.0031）远小于常用的显著性水平 0.01，我们拒绝原假设。

1.4.4 现实意义

这意味着 ia83（赚钱效率）和 dia（效率变化）联合起来对债券评级具有显著的影响。即使在之前的单项检验中 dia 表现一般，但作为一个财务特征组合，它们共同显著地改善了模型对企业信用状况的预测能力。

1.5 c) Interpret parameters’ coefficients.

1.6 d) Perform goodness-of-fit tests.

lipsitz.test(ologit)

## 
##  Lipsitz goodness of fit test for ordinal response models
## 
## data:  formula:  as.factor(rr) ~ ia83 + dia
## LR statistic = 12.514, df = 9, p-value = 0.1859

在有序 Logit 模型中，传统的 $R^2$ 并不适用，因此我们使用专门的拟合优度检验（Goodness-of-Fit Tests）。这段输出展示了 Lipsitz 检验的结果。

1.6.1 代码与参数解释

lipsitz.test(ologit)：这是专门为有序响应模型设计的拟合优度检验。
核心逻辑： Lipsitz 检验通过将数据根据预测概率分成若干组（通常是 10 组），并引入这些组的指示变量（Dummy Variables）重新估计模型。它测试的是：模型是否系统性地偏离了观测到的数据？
原假设 ($H_0$)：模型拟合良好（即观测到的频数与模型预测的频数之间没有显著差异）。

1.6.2 输出解读

LR statistic (12.514)：似然比统计量。它衡量了当前模型与一个更复杂的（能完美拟合数据的）模型之间的差异。
df (9)：自由度，通常与分组数量相关。
p-value (0.1859)：这是最重要的输出值。
- 判断标准：通常看是否小于 0.05。
- 当前结果：这里的 $p$ 值为 0.1859，大于 0.05。

1.6.3 结论

由于 $p$ 值（0.1859）较大，我们无法拒绝原假设（即接受原假设）。

1.6.4 现实意义

这意味着：该模型对债券评级的拟合是恰当的。 换句话说，模型中设定的函数形式（有序 Logit）以及选取的变量（ia83 和 dia）能够较好地描述数据的内在结构，没有明显的证据表明模型存在严重的设定偏误（Misspecification）。

1.6.5 补充说明

拟合优度检验在高级计量中通常作为“体检报告”。如果 $p$ 值很小（比如 < 0.05），说明你的模型漏掉了重要的非线性项，或者有序 Logit 的分布假设不成立，需要调整模型结构。而在你给出的结果中，模型已经顺利通过了这项“体检”。

1.6.6 Goodness-of-Fit Test Result

The Lipsitz test was performed to evaluate the overall fit of the ordered logit model. This test assesses whether the model’s predicted probabilities align significantly with the observed frequencies in the data.

Test Statistic (LR): 12.514
Degrees of Freedom (df): 9
p-value: 0.1859

1.6.6.1 Interpretation

The null hypothesis ($H_0$) for the Lipsitz test is that the model fits the data well. Since the p-value (0.1859) is greater than the standard significance level of 0.05, we fail to reject the null hypothesis.

This indicates that there is no evidence of significant lack-of-fit. The ordered logit functional form, using ia83 and dia as predictors, is appropriate for modeling the corporate bond ratings in this dataset.

logitgof(rat$rating83c, fitted(ologit), g = 5, ord = TRUE)

## 
##  Hosmer and Lemeshow test (ordinal model)
## 
## data:  rat$rating83c, fitted(ologit)
## X-squared = 88.61, df = 11, p-value = 0.0000000000000312

这段输出展示了另一种拟合优度检验：Hosmer-Lemeshow (H-L) 检验（针对有序模型）。有趣的是，这个结果与之前的 Lipsitz 检验截然相反。

1.6.7 代码与参数解释

logitgof(...)：该函数来自 generalhoslem 包，用于执行 Hosmer-Lemeshow 拟合优度检验。
rat$rating83c：观测到的实际债券评级。
fitted(ologit)：模型预测的每个评级类别的概率。
g = 5：将样本按预测概率分成 5 组（Groups）。
ord = TRUE：明确告诉函数这是一个有序（Ordinal）响应模型，而非普通的多项 Logit。

1.7 输出解读

X-squared (88.61)：卡方统计量。它衡量了“实际观测频数”与“模型预测频数”之间的差异。数值越大，说明差异越大。
p-value (3.12e-14)：这是关键结果。$3.12 \times 10^{-14}$ 是一个极小的数值，远小于 0.05。

1.7.1 结论

由于 $p$ 值接近于零，我们强烈拒绝“模型拟合良好”的原假设。

1.7.2 现实意义与矛盾分析

这里出现了一个非常典型的计量经济学现象：不同的检验给出了矛盾的结论。

Lipsitz 检验 (p = 0.1859)：认为模型通过了检验。
H-L 检验 (p < 0.01)：认为模型没通过检验。

为什么会这样？

样本量与分组 (g)：H-L 检验对分组数 g 的选择非常敏感。如果样本量较小（如本案中的 98 个观测值），分组可能导致某些单元格频数过低，从而使统计量失真。
有序性处理：H-L 检验最初是为二元 Logit 设计的，尽管此处使用了有序版本，但在小样本下其稳健性通常不如 Lipsitz 检验。

1.7.3 英文简答 (Concise Summary)

The Hosmer-Lemeshow test was conducted to check the goodness-of-fit for the ordered logit model.

Result: The test yielded a p-value of 3.12e-14, which is significantly less than 0.05.
Interpretation: We reject the null hypothesis ($H_0$), suggesting a significant lack-of-fit.
Note: This result contradicts the Lipsitz test. Such discrepancies often occur in small samples ($N=98$), where the H-L test can be overly sensitive to grouping or model specifications.

1.8 e) Perform the Brant’s test.

# Brant's test
# the function works with polr model results
brant(ologit)

## -------------------------------------------- 
## Test for X2  df  probability 
## -------------------------------------------- 
## Omnibus      68.74   4   0
## ia83     14.24   2   0
## dia      1.1 2   0.58
## -------------------------------------------- 
## 
## H0: Parallel Regression Assumption holds

在有序 Logit 模型中，Brant 检验是判断模型设定是否正确的“终极大考”。它测试的是有序模型最核心的假设：平行线假设（Parallel Regression Assumption）。

1.8.1 代码与参数解释

brant(ologit)：该函数来自 brant 包，专门用于检验 polr 模型是否违反了比例几率（Proportional Odds）假设。
核心逻辑：有序 Logit 假设自变量（如 ia83）对不同等级之间转移的影响是一致的。例如，“从评级 2 到 3”的影响应等于“从评级 4 到 5”的影响。如果不一致（线条不平行），模型就会失效。
原假设 ($H_0$)： 平行线假设成立。即所有等级共享同一套系数 $\beta$。

1.8.2 输出解读

表格提供了整体检验和单个变量的拆解：

Omnibus (总体检验)：
- 卡方值 (68.74) 非常大。
- Probability (0)：$p$ 值接近于 0。
- 结论：强烈拒绝原假设。这说明模型整体上违反了平行线假设。
ia83 (变量检验)：
- Probability (0)：$p$ 值接近于 0。
- 结论：ia83（赚钱效率）是导致模型违规的主要原因。这意味着它对“低评级向中评级跨越”的影响，与对“高评级向极高评级跨越”的影响显著不同。
dia (变量检验)：
- Probability (0.58)：$p$ 值远大于 0.05。
- 结论：dia 符合平行线假设。

1.8.3 现实意义

模型失败了。尽管之前的拟合优度检验（Lipsitz）勉强通过，但 Brant 检验揭示了深层问题：对于债券评级，收益资产比（ia83）在不同档位的作用力是不均匀的。

1.8.4 下一步建议（高级计量视角）

当 Brant 检验失败时，通常不能再直接使用简单的 Ordered Logit。你可能需要： 1. 使用 广义有序 Logit（Generalized Ordered Logit）。 2. 使用 多项 Logit（Multinomial Logit），它允许每个类别有独立的系数。

1.8.5 英文简答 (Concise Summary)

The Brant’s test was performed to verify the Parallel Regression Assumption.

Omnibus Result: The p-value is 0, which leads us to strongly reject the null hypothesis.
Variable Results: ia83 violates the assumption ($p=0$), while dia satisfies it ($p=0.58$).
Conclusion: The standard ordered logit model is inappropriate for this data because the effect of ia83 varies across different rating thresholds. A more flexible model, such as a Generalized Ordered Logit or Multinomial Logit, should be considered.

1.8.6 如果“收益资产比（ia83）在不同档位的作用力是不均匀的”，那么怎样处理数据

这是一个非常深刻的计量经济学问题。当 Brant 检验 失败（$p < 0.05$），意味着你的“梯子”每级高度不相等——即 ia83 对从“等级 2 升到 3”的拉动力，和从“等级 4 升到 5”的拉动力在统计上显著不同。

针对这种情况，通常有以下三种处理方案，按推荐程度排序：

1.8.7 1. 广义有序 Logit 模型 (Generalized Ordered Logit)

这是最直接的对症下药。普通的 Ordered Logit 强制所有等级共享同一个 $\beta$，而 Generalized Ordered Logit (gologit2) 允许系数随等级变化。

做法：对违反假设的变量（ia83）释放约束，让它在每个阈值处都有独立的系数 $\beta_j$。
优点：保留了因变量的“有序性”信息，同时修正了参数的不一致。
R 工具：可以使用 vgam 包中的 vglm 函数，设置 family = cumulative(parallel = FALSE)。

1.8.8 2. 局部限制性有序 Logit (Partial Proportional Odds Model)

如果你发现只有 ia83 违反假设，而 dia 是守规矩的（正如你的 Brant 检验结果：dia 的 $p=0.58$），你可以使用这种折中方案。

做法：只让 ia83 的系数随等级变化，而强迫 dia 的系数在所有等级间保持一致。
优点：比完全广义的模型更简洁（更省自由度），模型解释力也更强。

1.8.9 3. 多项 Logit 模型 (Multinomial Logit)

如果排序信息不再重要，或者你认为不同评级之间完全是“质变”而非“量变”。

做法：使用 multinom 函数。它完全不考虑 2, 3, 4, 5 的先后顺序，把它们看作四个独立的类别。
缺点：会损失“评级有高低之分”这一重要财务信息，通常作为最后的保底手段。

1.8.10 4. 变量转换 (Data Transformation)

有时候违反平行线假设是因为非线性关系。

做法：尝试对 ia83 取对数 $\ln(ia83)$、加入平方项 $ia83^2$，或者进行标准化处理。
原理：有时并不是作用力不均匀，而是你没捕捉到它随数值增长而产生的边际递减效应。

1.8.11 总结建议

在学术论文或高级计量作业中，最标准的做法是：

“由于 Brant 检验显著拒绝了平行线假设，说明标准有序 Logit 存在设定偏误。我们应转而采用广义有序 Logit (Generalized Ordered Logit) 模型，以允许关键变量 ia83 在不同评级门槛间具有异质性影响。”

1.9 f) Calculate and interpret pseudo-$R^2$ statistics.

# Pseudo-R2 statistics
pR2(ologit)

## fitting null model for pseudo-r2

##          llh      llhNull           G2     McFadden         r2ML         r2CU 
## -127.2714574 -133.0422446   11.5415743    0.0433756    0.1111006    0.1189762

在非线性模型（如 Ordered Logit）中，由于不存在最小二乘法中的剩余方差概念，传统的 $R^2$ 无法计算。因此，我们使用 Pseudo-$R^2$（伪 $R^2$） 来衡量模型的拟合优度。

1.9.1 代码与参数解释

pR2(ologit)：该函数来自 pscl 包。它通过比较全模型（Full Model）和仅含截距项的空模型（Null Model）的对数似然值，计算出多种伪 $R^2$ 指标。
llh (-127.27)：全模型的对数似然值。
llhNull (-133.04)：空模型的对数似然值。
G2 (11.54)：似然比检验统计量，即 $2 \times (llh - llhNull)$。

1.9.2 输出解读

输出提供了四种主流的伪 $R^2$ 计算结果：

McFadden (0.0434)：这是最常用的指标。计算公式为 $1 - \frac{llh}{llhNull}$。
- 评价标准：在离散选择模型中，McFadden $R^2$ 在 0.2 到 0.4 之间就被认为拟合得非常好。这里的 0.043 数值偏低。
r2ML (0.1111)：基于 Cox & Snell 的定义。它利用似然比来衡量。缺点是其最大值通常小于 1。
r2CU (0.1190)：即 Cragg-Uhler / Nagelkerke $R^2$。它对 r2ML 进行了修正，确保最大值可以达到 1。
- 现实含义：该模型解释了债券评级约 11.9% 的变异。

1.9.3 结论与注释

1.9.3.1 现实意义

虽然模型在统计上是显著的（由之前的 LR 检验可知），但从伪 $R^2$ 来看，其解释力较为有限。

原因分析：债券评级是一个极其复杂的过程，除了 ia83（赚钱能力）和 dia（变化趋势）外，还受到债务规模、行业环境、宏观经济、公司治理等诸多因素影响。仅用两个变量解释 12% 的变异在金融研究中其实是正常现象，但也暗示了模型可能遗漏了其他重要的解释变量。

1.9.4 英文简答 (Concise Summary)

f) Calculate and interpret pseudo-$R^2$ statistics.

Several pseudo-$R^2$ metrics were calculated to assess the model’s explanatory power.

McFadden’s $R^2$: 0.0434
Nagelkerke (Cragg-Uhler) $R^2$: 0.1190

Interpretation: While the model is statistically significant, the explanatory power is relatively modest. The Nagelkerke $R^2$ suggests that the financial variables ia83 and dia account for approximately 11.9% of the variation in corporate bond ratings. The low McFadden value (4.3%) indicates that many other unobserved factors likely influence a firm’s creditworthiness beyond simple income-to-asset ratios.

1.10 g) Calculate marginal effects and interpret them.

2 Exercise 2

Let’s analyse variable coding health status of respondents (hstatus). The variable takes three values:

1 for a person with bad health,
2 for moderate health, and
3 for good health status.

Covariates for the dependent variable are:

income,
female – a binary variable (1 for women),
black,
and num – family size.

Questions:

Why shouldn’t we use a logit/probit model for this problem?
Estimate an ordered logit for health status with covariates mentioned above.
Determine and interpret pseudo-$R^2$ statistics.
Are variables in the model jointly significant?
The parameter next to income is equal to $-3.39 \times 10^{-5}$.
Are we allowed to conclude on the basis of the value that income is insignificant?
Having parameters estimated determine signs of marginal effects for female and famsize for the alternative good health.
Perform goodness-of-fit tests.
Determine and interpret marginal effects for the alternative moderate health.
Determine and interpret marginal effects for the alternative good health.
Verify hypothesis that income affects respondents’ health linearly against the alternative that income has nonlinear, cubic impact. Use likelihood ratio test.

2.1 a) Why shouldn’t we use a logit/probit model for this problem?

标准的 Logit 或 Probit 模型仅适用于二元因变量 (Binary dependent variables)，即只有两个类别（如 0 和 1）。

本题情况：这里的因变量 hstatus 是一个有 3 个等级的排序变量 (3-value ordered variable)（差、中、优）。

排序性 (Ordinal nature)：虽然我们可以用 1, 2, 3 来编码，但这些数字仅代表排序 (Ordering)，而不具备数量意义 (Quantitative interpretation)。例如，状态“3”比“1”好，但并不代表它正好是“1”的三倍好，也不意味着 1+2=3 。

因此，使用排序选择模型 (Ordered Choice Models)（如 Ordered Logit/Probit）更为合适。

2.2 b) Estimate an ordered logit for health status with covariates mentioned above.

rd = read.csv(file="Randdata.csv", header=TRUE, sep=",")
rd$health = 0
rd$health[rd$hlthp==1]=1
rd$health[rd$hlthf==1]=2
rd$health[rd$hlthg==1]=3
indeksy = which(rd$health==0)
rd = rd[-indeksy,]
rd$health = as.factor(rd$health)

代码逻辑：

数据清洗与编码： 根据来源，hstatus（此处为 health）是一个典型的排序变量，将状态编码为 1（差）、2（中）、3（优）。
处理潜变量： 这种编码方式反映了观测变量值是由底层的潜变量 (Latent variable) $y^*$ 确定的。
模型准备： 最后一行 as.factor() 至关重要，因为排序 Logit/Probit 模型（如 MASS 包中的 polr 函数）要求因变量必须是因子类型，以便程序能够识别其类别顺序并估计相应的阈值 (Thresholds),,。

# Estimate ordered logit for hstatus.
ologit = ologit.reg(health~income+female+num, data=rd)
summary(ologit)

## Ordered Logit Regression 
## Log-Likelihood: -5294.072 
## No. Iterations: 5 
## McFadden's R2: 0.02912101 
## AIC: 10598.14 
##             Estimate    Std. error t value              Pr(>|t|)    
## income  0.0001157821  0.0000070506 16.4217 < 0.00000000000000022 ***
## female -0.1412207117  0.0535073206 -2.6393              0.008308 ** 
## num     0.0293452983  0.0126663105  2.3168              0.020515 *  
## ----- Threshold Parameters -----
##                   Estimate Std. error  t value              Pr(>|t|)    
## Threshold (1->2) -2.580795   0.092750 -27.8254 < 0.00000000000000022 ***
## Threshold (2->3) -0.528933   0.077969  -6.7839       0.0000000000117 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

这份 R 语言输出展示了一个排序 Logit 回归 (Ordered Logit Regression) 的估计结果。该模型用于分析具有自然顺序但没有定量解释的离散因变量。

2.3 代码功能解释

这段代码执行了排序 Logit 模型的估计与结果展示：

ologit = ologit.reg(health ~ income + female + num, data = rd)：使用 rd 数据集估计模型。因变量是 health（1=差，2=中，3=优），自变量包括 income（收入）、female（性别虚拟变量）和 num（家庭人数）。
summary(ologit)：提取并显示模型的详细统计摘要。

2.4 模型整体评价指标

在输出的顶部，提供了衡量模型拟合程度的关键统计量：

Log-Likelihood (-5294.072)：对数似然值，用于衡量模型对观测数据的解释能力。
McFadden’s R2 (0.02912101)：伪 $R^2$，用于评估离散选择模型的拟合优度。在微观数据分析中，这个数值虽然看起来较小，但通常是可接受的。
AIC (10598.14)：赤池信息准则。这是一个平衡模型简洁性和拟合度的指标，在比较不同模型时，AIC 值越小越好。

2.5 回归系数解释

这部分展示了每个解释变量对潜变量 $y^*$（隐藏的健康倾向）的影响：

income (1.1578e-04)：
- 系数为正且极显著（p 值 < 2.2e-16）。
- 这意味着收入增加会提高进入最高健康等级（良好健康）的概率，并降低进入最低健康等级（差）的概率。
female (-1.4122e-01)：
- 系数为负且在 0.01 水平下显著。
- 这表明在其他条件相同的情况下，女性报告处于较低健康等级的可能性比男性更高。
num (2.9345e-02)：
- 系数为正且在 0.05 水平下显著。
- 这意味着家庭人数较多与较高的健康评价倾向相关。

In an Ordered Logit Model, the regression coefficients ($\beta$) represent the relationship between the explanatory variables and the latent variable ($y^*$), which is the underlying, unobservable continuous propensity for a specific outcome. 在有序Logit模型中，回归系数（β）表示解释变量与潜在变量（y∗）之间的关系，潜在变量是特定结果的潜在、不可观测的连续倾向。

Here is a breakdown of how to interpret these coefficients in an English-language academic context:

2.5.1 Directional Effect on the Latent Variable

The sign of the coefficient indicates the direction of the change in the latent propensity.

Positive Coefficient ($\beta > 0$): As the independent variable increases, the latent propensity ($y^*$) also increases. This moves the individual toward higher ordered categories (e.g., from “Moderate Health” toward “Good Health”).
Negative Coefficient ($\beta < 0$): As the independent variable increases, the latent propensity decreases, making lower ordered categories more likely.

2.5.2 Impact on Probability Categories

While the coefficient directly affects the latent variable, its impact on the observed probabilities of specific categories follows a specific logic:

The Highest Category: A positive coefficient always increases the probability of being in the highest category (e.g., Outcome 3: Good Health).
The Lowest Category: A positive coefficient always decreases the probability of being in the lowest category (e.g., Outcome 1: Bad Health).
Middle Categories: The effect on middle categories (e.g., Outcome 2: Moderate Health) is mathematically ambiguous. Even if a coefficient is positive, the probability of a middle category could increase or decrease depending on the specific thresholds and marginal effects.

2.5.3 Statistical Significance

The “Value” or “Estimate” of the coefficient should not be interpreted in isolation; it must be checked for statistical significance using the t-value or p-value.

A variable is considered statistically significant if its p-value is less than the chosen alpha level (typically $0.05$). For example, in the provided health model, income has a p-value of < 2.2e-16, indicating it is highly significant.
The absolute magnitude of the coefficient (e.g., $1.15 \times 10^{-4}$ for income) may appear small, but this does not mean the variable is “insignificant”; it simply reflects the unit of measurement for that specific variable.

2.5.4 Proportional Odds Assumption

It is important to remember that in a standard Ordered Logit model, these coefficients are assumed to be constant across all thresholds. This is known as the Parallel Regression Assumption, meaning the effect of a variable like income is assumed to be the same regardless of whether you are moving from Category 1 to 2 or Category 2 to 3.

2.6 c) Determine and interpret pseudo-$R^2$ statistics.

2.7 d) Are variables in the model jointly significant?

2.8 e) The parameter next to income is equal to $-3.39 \times 10^{-5}$. Are we allowed to conclude on the basis of the value that income is insignificant?

2.9 f) Having parameters estimated determine signs of marginal effects for female and famsize for the alternative good health.

2.10 g) Perform goodness-of-fit tests.

2.11 h) Determine and interpret marginal effects for the alternative moderate health.

2.12 i) Determine and interpret marginal effects for the alternative good health.

2.13 j) Verify hypothesis that income affects respondents’ health linearly against the alternative that income has nonlinear, cubic impact. Use likelihood ratio test.

3 🍉 Cheat Sheet for Ordered Logit/Probit

3.0.1 正态分布（Z 分布）的常用双尾检验临界值

Percent		10%	5%	1%
Point	$\ll$ 1.645	1.645	1.96	2.58	$\gg 2.58$
Conclusion	✅ fail to reject	❌	❌	❌	Strongly Reject❌❌

3.0.2 `polr` for Ordinal logit/probit

ologit <- polr(as.factor(rr) ~ ia83 + dia, data = rat)
## Coefficients:
##         Value Std. Error t value
## ia83  0.09392    0.02962   3.171
## dia  -0.08669    0.04498  -1.927

3.0.2.1 参数

as.factor：缓缓为因子(factor)，才能识别出一个分类任务而非普通的线性回归。

3.0.2.2 解读

系数解释：这里的 0.09392 并不直接代表概率的变化，而是代表 Log-Odds（对数几率）的变化。若要解释为几率比（==Odds Ratio==），需要计算 $\exp(0.09392) \approx 1.098$，即 ia83 每增加一个单位，评级提升一个等级的几率增加约 9.8%。
ia83 (Value = 0.09392)：系数为正且 t 值 (3.171) 较大。这表明企业的收入资产比越高，其债券评级上升（向更高数值 5 移动）的可能性显著增加。
dia (Value: -0.08669)：系数为负，t 值 (-1.927) ==先取绝对值== 接近显著性临界值（通常为 1.96）。这意味着收入资产比的增长（变动额）对评级可能有负向影响，但在 5% 的置信水平下略显勉强。

3.0.3 `coeftest` 系数显著性检验

coeftest(ologit)
## 
## t test of coefficients:
## 
##       Estimate Std. Error t value Pr(>|t|)   
## ia83  0.093918   0.029620  3.1708 0.002059 **
## dia  -0.086694   0.044979 -1.9274 0.056977 .

之前的 summary(ologit) 默认不提供 𝑝 值，只给 𝑡 值。coeftest 补齐了这一环，方便直接判断变量是否在统计学上显著。

3.0.4 Likelihood ratio test for Joint Significance

ologit.restricted <- polr(as.factor(rr) ~ 1, data = rat)
lrtest(ologit, ologit.restricted)
## Likelihood ratio test
## 
## Model 1: as.factor(rr) ~ ia83 + dia
## Model 2: as.factor(rr) ~ 1
##   #Df  LogLik Df  Chisq Pr(>Chisq)   
## 1   5 -127.27                        
## 2   3 -133.04 -2 11.542   0.003117 **

~ 1: 只包含截距项, 不包含任何自变量
$H_0$: ia83 和 dia 的系数均为 0。
lrtest(ologit, ologit.restricted)：该函数比较“全模型”与“受限模型”的对数似然值（Log-Likelihood）。判断增加这两个自变量是否显著提升了模型的解释能力。
LogLik (对数似然值)：越大（接近 0），说明拟合效果更好。
$P\ge 0.05$: fail to reject H0, Jointly Significant;
$P \ll 0.05$: Reject H0, strongly reject h0, Jointly In-Significant.

3.0.5 `lipsitz.test` goodness-of-fit tests

lipsitz.test(ologit)
## 
##  Lipsitz goodness of fit test for ordinal response models
## 
## data:  formula:  as.factor(rr) ~ ia83 + dia
## LR statistic = 12.514, df = 9, p-value = 0.1859

在有序 Logit 模型中，传统的 𝑅2 并不适用，因此我们使用专门的拟合优度检验（Goodness-of-Fit Tests）。
原假设 (𝐻0)：模型拟合良好（即观测到的频数与模型预测的频数之间没有显著差异）。
The null hypothesis (𝐻0) for the Lipsitz test is that the model fits the data well. Since the p-value (0.1859) is greater than the standard significance level of 0.05, we fail to reject the null hypothesis.

3.0.6 `logitgof` Hosmer-Lemeshow (H-L)拟合优度检验

logitgof(rat$rating83c, fitted(ologit), g = 5, ord = TRUE)

==小样本，不稳健==。其稳健性通常不如 Lipsitz 检验。

3.0.7 `brant` 平行线检测

brant(ologit)
## -------------------------------------------- 
## Test for X2  df  probability 
## -------------------------------------------- 
## Omnibus      68.74   4   0
## ia83             14.24   2   0
## dia              1.1         2   0.58
## -------------------------------------------- 
## 
## H0: Parallel Regression Assumption holds

用于检验 polr 模型是否违反了比例几率（Proportional Odds）假设。
有序 Logit 假设自变量（如 ia83）对不同等级之间转移的影响是一致的。例如，“从评级 2 到 3”的影响应等于“从评级 4 到 5”的影响。如果不一致（线条不平行），模型就会失效。
原假设 (𝐻0)： 平行线假设成立。即所有等级共享同一套系数 𝛽。
只有当 Omnibus 以及每一个自变量的 $p$ 值都大于 0.05 时，我们才“无法拒绝原假设”。既然模型不合适：
- 改用广义有序Logistic回归 (Generalized Ordered Logit Model)
- 改用多项 Logistic 回归 (Multinomial Logistic Regression)
- 将因变量合并为二分类变量 (Binary Logit)

3.0.8 Pseudo-𝑅2（伪 𝑅2）来衡量模型的拟合优度。

# Pseudo-R2 statistics
pR2(ologit)
## fitting null model for pseudo-r2
##          llh      llhNull           G2     McFadden         r2ML         r2CU 
## -127.2714574 -133.0422446   11.5415743    0.0433756    0.1111006    0.1189762

在 Logistic 回归（包括有序 Logistic 回归）中，因为我们使用的是最大似然估计法（MLE）而不是最小二乘法（OLS），所以不能计算出像线性回归那样的绝对 $R^2$。统计学家们发明了“伪 $R^2$”来评估模型的拟合优度（Goodness of Fit）。
最重要的是McFadden和r2ML两个指标
- McFadden 在 0.2 到 0.4 之间就被认为模型拟合得非常优秀了。
- r2ML：
  - 致命缺点是最大值永远达不到 1
  - 只能用于同一批数据下不同模型的横向对比

3.0.9 `ologit.reg` Ordered Logit Regression

ologit = ologit.reg(health~income+female+num, data=rd)
summary(ologit)
## Ordered Logit Regression 
## Log-Likelihood: -5294.072 
## No. Iterations: 5 
## McFadden's R2: 0.02912101 
## AIC: 10598.14 
##             Estimate    Std. error t value              Pr(>|t|)    
## income  0.0001157821  0.0000070506 16.4217 < 0.00000000000000022 ***
## female -0.1412207117  0.0535073206 -2.6393              0.008308 ** 
## num     0.0293452983  0.0126663105  2.3168              0.020515 *

有序 Logit 模型中，系数（Estimate）代表：在其他变量保持不变时，自变量每增加一个单位，因变量进入更高健康等级（更好健康状况）的==对数几率（Log Odds）==的变化。