library(haven)
library(ggplot2)
This paper tests for discrimination on the basis of race in the mortgage loan market.
The binary variable to be explained is approve, which is equal to 1 if a mortgage loan to an individual was approved.
The key explanatory variable is nonwhite, a dummy variable equal to 1 if the applicant was black or hispanic and 0 if the applicant was white.
The variable obrat measures other debt obligations as a percentage of income. A higher value means a larger share of income is committed to other debts.
Other explanatory variables will also be included in the linear and nonlinear regression models:
Binary explanatory variables: male, (pubrec) public records, (cosign) if the mortgage is cosigned by another individual, (chist) credit history, (sch) more than 12 years of schooling
Continuous explanatory variables: (hrat) housing expense ratio, (loanprc) loan to price ratio, (unem) unemployment rate in the applicant's industry
Discrete explanatory variables: (dep) number of dependants
To test for discrimination on the basis of race in the mortgage loan market, a linear probability model can be used:
\(\text{approve} = \beta_0 + \beta_1(\text{nonwhite}) + u\), where u represents other factors that influence loan approvals.
What is the interpretation of β1 in the model above? If there is discrimination against minorities, and the appropriate factors have been controlled for, what is the expected sign of β1?
In the model above, β1 is the difference in the probability of loan approval between non-white and white applicants. With no other regressors it is simply the difference in average approval rates between the two groups; it has an "all else being equal" interpretation only once the appropriate factors have been controlled for. If there is discrimination against minorities, we expect β1 < 0, which would imply that non-white applicants have a lower probability of being approved for a loan than otherwise comparable white applicants.
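As a short derivation of this interpretation, assume (for illustration only) that \(E(u \mid \text{nonwhite}) = 0\). Taking conditional expectations of the model gives:
\(\begin{align*} E(\text{approve} \mid \text{nonwhite} = 0) &= \beta_0 \\ E(\text{approve} \mid \text{nonwhite} = 1) &= \beta_0 + \beta_1 \end{align*}\)
Because approve is binary, each conditional expectation is a probability of approval, so \(\beta_1 = P(\text{approve} = 1 \mid \text{nonwhite} = 1) - P(\text{approve} = 1 \mid \text{nonwhite} = 0)\).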
Regress approve on nonwhite and report the results
First load the data:
file_path <- "/Users/nathanfoale/Desktop/data/loanapp.dta"
loan_data <- read_dta(file_path)
Then generate the non-white variable:
loan_data$nonwhite <- ifelse(loan_data$black == 1 | loan_data$hispan == 1, 1, 0)
The regression equation is: \(\text{approve} = \beta_0 + \beta_1(\text{nonwhite}) + u\)
Run the regression of approval on non-white and display the regression output:
model <- lm(approve ~ nonwhite, data = loan_data)
summary(model)
##
## Call:
## lm(formula = approve ~ nonwhite, data = loan_data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.90839 0.09161 0.09161 0.09161 0.29221
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.908388 0.007807 116.35 <2e-16 ***
## nonwhite -0.200596 0.019840 -10.11 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3201 on 1987 degrees of freedom
## Multiple R-squared: 0.04893, Adjusted R-squared: 0.04845
## F-statistic: 102.2 on 1 and 1987 DF, p-value: < 2.2e-16
Interpret the coefficient on nonwhite.
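The coefficient on nonwhite is the estimated difference in the probability of loan approval between non-white and white applicants. Here it is -0.20, so non-white applicants are about 20.0 percentage points less likely to be approved than white applicants. The intercept, 0.908, is the estimated approval rate for white applicants, which implies an approval rate of about 0.708 for non-white applicants. Because this model has no other regressors, the estimate is a raw difference in approval rates rather than an "all else being equal" comparison.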
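The link between a regression on a single dummy and group averages can be checked directly. The sketch below uses simulated data only; the variable names mirror the ones above, but the sample size and approval rates are illustrative assumptions, not estimates from the loan data.

```r
# Simulated data: approval rates of 0.9 and 0.7 are assumed for illustration
set.seed(1)
n <- 500
nonwhite <- rbinom(n, 1, 0.2)
approve <- rbinom(n, 1, ifelse(nonwhite == 1, 0.7, 0.9))
sim <- data.frame(approve, nonwhite)

fit <- lm(approve ~ nonwhite, data = sim)
group_means <- tapply(sim$approve, sim$nonwhite, mean)

# Intercept = approval rate when nonwhite == 0
# Slope     = difference in approval rates between the two groups
coef(fit)
group_means["1"] - group_means["0"]
```

The slope and the difference in group means agree up to floating-point error, which is why the simple regression cannot by itself separate discrimination from other differences between the groups.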
Is it statistically significantly different from zero?
Null hypothesis \(H_0 : \beta_1 = 0\) Alternative hypothesis \(H_1: \beta_1 \neq 0\)
#pull coefficients and standard errors from regression output
coef_nonwhite <- coef(model)["nonwhite"]
se_nonwhite <- sqrt(vcov(model)["nonwhite", "nonwhite"])
#Calc T-statistic
t_stat <- coef_nonwhite / se_nonwhite
df <- length(model$residuals) - length(coef(model))
# significance level of alpha
alpha <- 0.05
# calc T-critical value
critical_value <- qt(1 - alpha/2, df)
# output the T-statistic and T-critical value
print(paste("t-statistic:", round(t_stat, 3)))
## [1] "t-statistic: -10.111"
print(paste("Critical value:", round(critical_value, 3)))
## [1] "Critical value: 1.961"
Rejection criteria:
If \(|\text{T-stat}| > |\text{T-crit}|\), then we reject \(H_0\).
if (abs(t_stat) > critical_value) {
print("Reject the null hypothesis (β1 = 0)")
} else {
print("Fail to reject the null hypothesis (β1 = 0)")
}
## [1] "Reject the null hypothesis (β1 = 0)"
Thus, at the 5% significance level, we reject the null hypothesis that being non-white has no effect on the probability of being approved for a loan.
Estimate equation (1) by adding variables hrat, obrat, loanprc, unem, male, dep, sch, cosign, chist, pubrec, mortlat1, mortlat2, and vr as explanatory variables and report the results.
full_model <- lm(approve ~ nonwhite + hrat + obrat + loanprc + unem + male + dep + sch + cosign + chist + pubrec + mortlat1 + mortlat2 + vr, data = loan_data)
summary(full_model)
##
## Call:
## lm(formula = approve ~ nonwhite + hrat + obrat + loanprc + unem +
## male + dep + sch + cosign + chist + pubrec + mortlat1 + mortlat2 +
## vr, data = loan_data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.04990 0.00993 0.06630 0.13559 0.68644
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.0858769 0.0488025 22.250 < 2e-16 ***
## nonwhite -0.1290515 0.0197662 -6.529 8.42e-11 ***
## hrat 0.0017817 0.0012653 1.408 0.1592
## obrat -0.0054859 0.0011035 -4.971 7.23e-07 ***
## loanprc -0.1474165 0.0375820 -3.923 9.06e-05 ***
## unem -0.0074776 0.0032030 -2.335 0.0197 *
## male 0.0121995 0.0179769 0.679 0.4975
## dep -0.0007108 0.0063491 -0.112 0.9109
## sch 0.0007582 0.0166753 0.045 0.9637
## cosign 0.0110298 0.0412094 0.268 0.7890
## chist 0.1308505 0.0192810 6.786 1.52e-11 ***
## pubrec -0.2440697 0.0282668 -8.634 < 2e-16 ***
## mortlat1 -0.0608350 0.0500838 -1.215 0.2246
## mortlat2 -0.1181896 0.0670829 -1.762 0.0783 .
## vr -0.0350075 0.0139984 -2.501 0.0125 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3026 on 1956 degrees of freedom
## (18 observations deleted due to missingness)
## Multiple R-squared: 0.1622, Adjusted R-squared: 0.1562
## F-statistic: 27.05 on 14 and 1956 DF, p-value: < 2.2e-16
What happens to the coefficient on nonwhite?
When more variables are added to the model, the coefficient on nonwhite shrinks in magnitude from -0.20 to -0.13. Holding the other regressors constant, being non-white is now associated with a 13 percentage point lower probability of loan approval compared with being white.
Is there still evidence of discrimination against non-whites?
Yes. After adding the control variables, the coefficient on nonwhite remains statistically significant, with a p-value of 8.42e-11. Even after controlling for additional factors, being non-white continues to have a significant negative association with the probability of loan approval. Hence, the findings are consistent with persistent discrimination against non-white applicants in the mortgage loan approval process.
Now, if we allow the effect of race to interact with the variable measuring other obligations as a percentage of income (obrat), then how will the previous regression equation change? Estimate this new regression equation.
The new model will look like:
\(\begin{align*} \text{approve} &= \beta_0 + \beta_1(\text{nonwhite}) + \beta_2(\text{hrat}) + \beta_3(\text{obrat}) + \beta_4(\text{nonwhite} \times \text{obrat}) + \beta_5(\text{loanprc}) + \beta_6(\text{unem}) + \beta_7(\text{male}) \\ &\quad + \beta_8(\text{dep}) + \beta_9(\text{sch}) + \beta_{10}(\text{cosign}) + \beta_{11}(\text{chist}) + \beta_{12}(\text{pubrec}) + \beta_{13}(\text{mortlat1}) + \beta_{14}(\text{mortlat2}) + \beta_{15}(\text{vr}) + u \end{align*}\)
Regression allowing the effect of race to interact with obrat variable:
model_interaction <- lm(approve ~ nonwhite + hrat + obrat + nonwhite:obrat + loanprc + unem + male + dep + sch + cosign + chist + pubrec + mortlat1 + mortlat2 + vr, data = loan_data)
summary(model_interaction)
##
## Call:
## lm(formula = approve ~ nonwhite + hrat + obrat + nonwhite:obrat +
## loanprc + unem + male + dep + sch + cosign + chist + pubrec +
## mortlat1 + mortlat2 + vr, data = loan_data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.04036 0.01546 0.06504 0.12285 0.80887
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.0547482 0.0494478 21.331 < 2e-16 ***
## nonwhite 0.1468387 0.0804027 1.826 0.067959 .
## hrat 0.0017385 0.0012616 1.378 0.168366
## obrat -0.0041869 0.0011599 -3.610 0.000314 ***
## loanprc -0.1526721 0.0375011 -4.071 4.86e-05 ***
## unem -0.0077066 0.0031942 -2.413 0.015929 *
## male 0.0102175 0.0179330 0.570 0.568905
## dep -0.0015553 0.0063350 -0.246 0.806089
## sch 0.0007887 0.0166264 0.047 0.962172
## cosign 0.0189903 0.0411500 0.461 0.644498
## chist 0.1276796 0.0192453 6.634 4.21e-11 ***
## pubrec -0.2424480 0.0281876 -8.601 < 2e-16 ***
## mortlat1 -0.0663651 0.0499613 -1.328 0.184224
## mortlat2 -0.1313348 0.0669891 -1.961 0.050075 .
## vr -0.0340802 0.0139598 -2.441 0.014722 *
## nonwhite:obrat -0.0081201 0.0022943 -3.539 0.000411 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3017 on 1955 degrees of freedom
## (18 observations deleted due to missingness)
## Multiple R-squared: 0.1675, Adjusted R-squared: 0.1612
## F-statistic: 26.23 on 15 and 1955 DF, p-value: < 2.2e-16
Is the interaction term statistically significantly different from zero?
Null hypothesis \(H_0 : \beta_4 = 0\)
Alternative hypothesis \(H_1: \beta_4 \neq 0\)
coef_interaction <- coef(model_interaction)["nonwhite:obrat"]
se_interaction <- sqrt(vcov(model_interaction)["nonwhite:obrat", "nonwhite:obrat"])
df_interaction <- length(model_interaction$residuals) - length(coef(model_interaction))
# Calc t-statistic
t_stat_interaction <- coef_interaction / se_interaction
# Significance level of alpha
alpha <- 0.05
# calc critical value
t_crit <- qt(1 - alpha / 2, df_interaction)
# t-statistic and critical value
print(paste("t-statistic:", round(t_stat_interaction, 3)))
## [1] "t-statistic: -3.539"
print(paste("Critical value (two-tailed):", round(t_crit, 3)))
## [1] "Critical value (two-tailed): 1.961"
Rejection criteria:
If \(|\text{T-stat}| > |\text{T-crit}|\), then we reject \(H_0\).
if (abs(t_stat_interaction) > t_crit) {
print("Reject the null hypothesis (β4 = 0)")
} else {
print("Fail to reject the null hypothesis (β4 = 0)")
}
## [1] "Reject the null hypothesis (β4 = 0)"
Thus we reject the null hypothesis that the coefficient of the interaction term β4 is equal to zero, suggesting that the interaction between the nonwhite and obrat variables has a statistically significant effect on the approval of mortgage loans.
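With the interaction included, the coefficient on nonwhite alone is the estimated gap at obrat = 0, so it no longer summarizes the effect of race by itself. Using the point estimates reported above (rounded, and ignoring sampling uncertainty), the estimated gap between non-white and white applicants as a function of obrat is:
\(\widehat{\Delta}(\text{obrat}) = \hat{\beta}_1 + \hat{\beta}_4 \times \text{obrat} \approx 0.1468 - 0.0081 \times \text{obrat}\)
This expression is negative whenever obrat exceeds roughly \(0.1468 / 0.0081 \approx 18\), and the gap widens by about 0.8 percentage points for each additional point of obrat.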
Using the previous model, derive the marginal effect of being nonwhite on the probability of approval for someone with a median value of obrat. Report the estimated effect and provide interpretation. Also obtain a 95% confidence interval for this effect.
To derive the marginal effect of being non-white on the probability of approval for someone with a median value of obrat, the coefficients from the previous model will be used.
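\(\text{Marginal effect} = \hat{p}(\text{nonwhite} = 1) - \hat{p}(\text{nonwhite} = 0) = \hat{\beta}_1 + \hat{\beta}_4 \times \text{Median obrat}\)
where \(\hat{p}\) is the estimated probability of loan approval, \(\hat{\beta}_1\) is the estimated coefficient on nonwhite, and \(\hat{\beta}_4\) is the estimated coefficient on the interaction term nonwhite:obrat. Because nonwhite is a dummy variable, the effect is the difference in predicted probability between the two groups rather than a derivative; the coefficient on obrat, \(\hat{\beta}_3\), cancels out of this difference.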
# calc the median value of obrat
median_obrat <- median(loan_data$obrat, na.rm = TRUE)
# coefficients from model
coef_nonwhite <- coef(model_interaction)["nonwhite"]
coef_obrat <- coef(model_interaction)["obrat"]
coef_interaction <- coef(model_interaction)["nonwhite:obrat"]
# marginal effect calc
marginal_effect <- coef_nonwhite + coef_interaction * median_obrat
print(paste("Marginal effect of being nonwhite on the probability of approval:", round(marginal_effect, 4)))
## [1] "Marginal effect of being nonwhite on the probability of approval: -0.1211"
At the median value of obrat, being non-white is associated with a 12.1 percentage point lower probability of loan approval, holding all other variables constant.
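As a check on the arithmetic, the reported effect is consistent with a median obrat of about 33. That value is inferred here from the printed coefficients and the printed effect; it is not shown in the output above.
\(0.1468387 + (-0.0081201) \times 33 \approx 0.1468 - 0.2680 = -0.1211\)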
Obtain a 95% confidence interval for this effect:
# calc the standard error of the marginal effect
se_marginal_effect <- sqrt((vcov(model_interaction)["nonwhite", "nonwhite"]) +
(median_obrat^2 * vcov(model_interaction)["nonwhite:obrat", "nonwhite:obrat"]) +
(2 * median_obrat * vcov(model_interaction)["nonwhite", "nonwhite:obrat"]))
# Calc t-statistic
t_stat_marginal_effect <- marginal_effect / se_marginal_effect
# Degrees of freedom
df_marginal_effect <- length(model_interaction$residuals) - length(coef(model_interaction))
# critical value (two-tailed)
t_crit_marginal_effect <- qt(1 - (0.05 / 2), df_marginal_effect)
# 95% confidence interval
lower_ci <- marginal_effect - qt(0.975, df_marginal_effect) * se_marginal_effect
upper_ci <- marginal_effect + qt(0.975, df_marginal_effect) * se_marginal_effect
# results
print(paste("95% Confidence Interval:", round(lower_ci, 4), "-", round(upper_ci, 4)))
## [1] "95% Confidence Interval: -0.16 - -0.0822"
This confidence interval suggests that we are 95% confident the marginal effect of being nonwhite on the probability of loan approval, at the median value of obrat, lies between -0.1600 and -0.0822, that is, between 16.0 and 8.2 percentage points lower.
# Check if the t-statistic is greater than the critical value
if (abs(t_stat_marginal_effect) > t_crit_marginal_effect) {
print("The marginal effect is statistically significant.")
} else {
print("The marginal effect is not statistically significant.")
}
## [1] "The marginal effect is statistically significant."
As \(|\text{T-stat}| > |\text{T-crit}|\), the marginal effect is statistically significant at the 5% significance level. Equivalently, the 95% confidence interval excludes zero. We therefore have sufficient evidence to conclude that being nonwhite has a significant impact on the probability of loan approval for individuals with a median value of obrat.
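The standard error used for this effect follows from the variance of a linear combination of two coefficients. A minimal sketch on simulated data shows that the term-by-term formula matches the matrix form \(\sqrt{w' V w}\); the names d, x, y, m and w are hypothetical and the data are generated purely for illustration.

```r
# Simulated data: d is a group dummy, x a continuous regressor (both hypothetical)
set.seed(42)
n <- 400
d <- rbinom(n, 1, 0.3)
x <- rnorm(n, mean = 30, sd = 8)
y <- 0.9 - 0.004 * x - 0.005 * d * x + rnorm(n, sd = 0.3)
fit <- lm(y ~ d + x + d:x)

m <- median(x)
V <- vcov(fit)

# Term-by-term: Var(b_d) + m^2 * Var(b_dx) + 2 * m * Cov(b_d, b_dx)
se_manual <- sqrt(V["d", "d"] + m^2 * V["d:x", "d:x"] + 2 * m * V["d", "d:x"])

# Matrix form with weights on (Intercept, d, x, d:x)
w <- c(0, 1, 0, m)
se_matrix <- sqrt(drop(t(w) %*% V %*% w))

# Effect of d at the median of x, with a 95% confidence interval
effect <- sum(w * coef(fit))
ci <- effect + c(-1, 1) * qt(0.975, df.residual(fit)) * se_matrix
```

The matrix form generalizes more easily if the effect is evaluated at several values of the interacting variable, since only the weight vector changes.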
Estimate a probit and logit model of approve on nonwhite and report the results. Write the expression for the marginal effect of being nonwhite on the probability of loan approval using both probit and logit. Estimate this marginal effect using logit and probit. How do these compare with the linear probability estimates?
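Probit Marginal Effect = \(\frac{\partial P_i}{\partial X_k} = \frac{\partial \Phi(X_iB)}{\partial X_k} = \phi(X_iB) \cdot B_k\), where \(\Phi\) is the standard normal CDF and \(\phi\) is its density.
Logit Marginal Effect = \(\frac{\partial P_i}{\partial X_k} = \frac{\partial \Lambda(X_iB)}{\partial X_k} = \Lambda(X_iB)\,[1 - \Lambda(X_iB)] \cdot B_k\), where \(\Lambda\) is the logistic CDF.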
mfxboot <- function(modform, dist, data, boot = 1000, digits = 3) {
  # boot and digits are kept so the calls below still work, but they are not used
  x <- glm(modform, family = binomial(link = dist), data = data)
  xb <- predict(x, type = "link")
  # density of the link distribution at every observation's linear predictor;
  # if/else is used because ifelse() with a scalar test returns only one value
  pdf <- if (dist == "probit") dnorm(xb) else dlogis(xb)
  marginal_effect <- coef(x)["nonwhite"] * pdf
  # average marginal effect over the estimation sample
  return(mean(marginal_effect))
}
linear_model <- lm(approve ~ nonwhite, data = loan_data)
# probit model
probit_model <- glm(approve ~ nonwhite, data = loan_data, family = binomial(link = "probit"))
# logit model
logit_model <- glm(approve ~ nonwhite, data = loan_data, family = binomial(link = "logit"))
# coefficients
beta_hat_probit <- coef(probit_model)["nonwhite"]
beta_hat_logit <- coef(logit_model)["nonwhite"]
# marginal effects
marginal_effects_probit <- mfxboot(modform = approve ~ nonwhite, dist = "probit", data = loan_data)
marginal_effects_logit <- mfxboot(modform = approve ~ nonwhite, dist = "logit", data = loan_data)
# RESULTS
cat("Marginal effect of being nonwhite (Probit):", round(marginal_effects_probit, digits = 3), "\n")
## Marginal effect of being nonwhite (Probit): -0.151
cat("Marginal effect of being nonwhite (Logit):", round(marginal_effects_logit, digits = 3), "\n")
## Marginal effect of being nonwhite (Logit): -0.144
For the probit model, the average marginal effect of being nonwhite is -0.151, indicating that being nonwhite is associated with a decrease of approximately 15.1 percentage points in the probability of loan approval, relative to white applicants.
For the logit model, the average marginal effect of being nonwhite is -0.144, indicating that being nonwhite is associated with a decrease of approximately 14.4 percentage points in the probability of loan approval, relative to white applicants.
Comparing these results with the linear probability estimate of -0.2006, the marginal effects from both the probit and logit models are smaller in magnitude. Part of this gap arises because these are derivative-based effects, which only approximate the effect of switching a binary regressor from 0 to 1 in a nonlinear model, whereas the linear probability model imposes a constant effect. The direction of the effect is consistent across all models, indicating a lower likelihood of loan approval for non-white applicants compared to white applicants.
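Because nonwhite is binary, an alternative to the derivative-based effect is the discrete change in predicted probability, \(P(\text{approve} = 1 \mid \text{nonwhite} = 1) - P(\text{approve} = 1 \mid \text{nonwhite} = 0)\). The sketch below illustrates this on simulated data; the names d and y and the assumed approval rates are hypothetical. When the only regressor is a single dummy, the probit and logit discrete changes both reproduce the difference in group approval rates, which is also what the linear probability model estimates.

```r
# Simulated data: approval rates of 0.9 and 0.7 are assumed for illustration
set.seed(7)
n <- 1000
d <- rbinom(n, 1, 0.25)
y <- rbinom(n, 1, ifelse(d == 1, 0.7, 0.9))

probit <- glm(y ~ d, family = binomial(link = "probit"))
logit  <- glm(y ~ d, family = binomial(link = "logit"))

b_p <- unname(coef(probit))
b_l <- unname(coef(logit))

# Discrete change in predicted probability when d switches from 0 to 1
diff_probit <- pnorm(b_p[1] + b_p[2]) - pnorm(b_p[1])
diff_logit  <- plogis(b_l[1] + b_l[2]) - plogis(b_l[1])

# Raw difference in group means, for comparison
diff_means <- mean(y[d == 1]) - mean(y[d == 0])

c(probit = diff_probit, logit = diff_logit, means = diff_means)
```

With additional regressors the three quantities generally differ, and the discrete change then has to be averaged over the sample or evaluated at chosen covariate values.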
Add the variables hrat, obrat, loanprc, unem, male, dep, sch, cosign, chist, pubrec, mortlat1, mortlat2, and vr to the probit and logit models and report the results.
# probit model with added variables
probit_model_additional <- glm(approve ~ nonwhite + hrat + obrat + loanprc + unem + male + dep + sch + cosign + chist + pubrec + mortlat1 + mortlat2 + vr,
data = loan_data, family = binomial(link = "probit"))
# logit model with added variables
logit_model_additional <- glm(approve ~ nonwhite + hrat + obrat + loanprc + unem + male + dep + sch + cosign + chist + pubrec + mortlat1 + mortlat2 + vr,
data = loan_data, family = binomial(link = "logit"))
summary(probit_model_additional)
##
## Call:
## glm(formula = approve ~ nonwhite + hrat + obrat + loanprc + unem +
## male + dep + sch + cosign + chist + pubrec + mortlat1 + mortlat2 +
## vr, family = binomial(link = "probit"), data = loan_data)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 2.692125 0.302512 8.899 < 2e-16 ***
## nonwhite -0.515876 0.096685 -5.336 9.52e-08 ***
## hrat 0.007847 0.007011 1.119 0.26300
## obrat -0.028303 0.006141 -4.609 4.05e-06 ***
## loanprc -1.002545 0.240065 -4.176 2.97e-05 ***
## unem -0.038333 0.017619 -2.176 0.02958 *
## male 0.056142 0.104421 0.538 0.59082
## dep -0.012503 0.036822 -0.340 0.73419
## sch 0.008762 0.095109 0.092 0.92659
## cosign 0.103767 0.240739 0.431 0.66644
## chist 0.566890 0.095105 5.961 2.51e-09 ***
## pubrec -0.788850 0.126655 -6.228 4.71e-10 ***
## mortlat1 -0.217128 0.254793 -0.852 0.39412
## mortlat2 -0.529491 0.322551 -1.642 0.10068
## vr -0.218399 0.080934 -2.698 0.00697 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1476.0 on 1970 degrees of freedom
## Residual deviance: 1208.5 on 1956 degrees of freedom
## (18 observations deleted due to missingness)
## AIC: 1238.5
##
## Number of Fisher Scoring iterations: 5
summary(logit_model_additional)
##
## Call:
## glm(formula = approve ~ nonwhite + hrat + obrat + loanprc + unem +
## male + dep + sch + cosign + chist + pubrec + mortlat1 + mortlat2 +
## vr, family = binomial(link = "logit"), data = loan_data)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 4.95187 0.57659 8.588 < 2e-16 ***
## nonwhite -0.93422 0.17246 -5.417 6.06e-08 ***
## hrat 0.01328 0.01285 1.033 0.3015
## obrat -0.05432 0.01130 -4.809 1.52e-06 ***
## loanprc -1.88625 0.46242 -4.079 4.52e-05 ***
## unem -0.06919 0.03269 -2.117 0.0343 *
## male 0.10995 0.19652 0.559 0.5758
## dep -0.02136 0.06917 -0.309 0.7575
## sch 0.02599 0.17790 0.146 0.8838
## cosign 0.17274 0.44618 0.387 0.6986
## chist 1.03003 0.17011 6.055 1.40e-09 ***
## pubrec -1.35251 0.21664 -6.243 4.29e-10 ***
## mortlat1 -0.36603 0.45862 -0.798 0.4248
## mortlat2 -0.96742 0.56119 -1.724 0.0847 .
## vr -0.38701 0.15283 -2.532 0.0113 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1476 on 1970 degrees of freedom
## Residual deviance: 1209 on 1956 degrees of freedom
## (18 observations deleted due to missingness)
## AIC: 1239
##
## Number of Fisher Scoring iterations: 5
Is the coefficient on nonwhite statistically significantly different from zero? Is there statistical evidence of discrimination against nonwhites using these nonlinear models?
# Extract coefficient estimate / standard error for nonwhite
beta_hat_nonwhite <- coef(probit_model_additional)["nonwhite"]
se_nonwhite <- sqrt(vcov(probit_model_additional)["nonwhite", "nonwhite"])
# Calc z-statistic (Wald test; glm reports z values, which are asymptotically standard normal)
z_stat_nonwhite <- beta_hat_nonwhite / se_nonwhite
# Critical value for two-tailed test at the 5% level
z_crit <- qnorm(1 - (0.05 / 2))
# Test
if (abs(z_stat_nonwhite) > z_crit) {
print("Reject the null hypothesis (coefficient on nonwhite is statistically significant)")
} else {
print("Fail to reject the null hypothesis (coefficient on nonwhite is not statistically significant)")
}
## [1] "Reject the null hypothesis (coefficient on nonwhite is statistically significant)"
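As the test statistic exceeds the critical value in absolute value, the probit coefficient on nonwhite is statistically significantly different from zero at the 5% level. The logit coefficient leads to the same conclusion, with a z value of -5.417 in the summary above. Given this data set, both nonlinear models therefore provide statistical evidence consistent with discrimination against non-whites.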
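The same conclusion can be reached from two-sided p-values. The sketch below recomputes the Wald z statistics from the estimates and standard errors printed in the two summaries; the numbers are copied by hand from that output, and the object names est, se, z and p are hypothetical.

```r
# Estimates and standard errors for nonwhite, copied from the summaries above
est <- c(probit = -0.515876, logit = -0.93422)
se  <- c(probit = 0.096685,  logit = 0.17246)

# Wald z statistics and two-sided p-values under the standard normal
z <- est / se
p <- 2 * pnorm(-abs(z))

round(z, 3)
signif(p, 3)

# Compare against the 5% two-sided critical value
abs(z) > qnorm(0.975)
```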
Conclusion
This paper examines whether discrimination exists in the mortgage loan market based on an applicant's race.
Firstly, a linear probability model was employed to test for discrimination, with the coefficient β1 representing the average difference in the probability of loan approval between white and non-white applicants. The sign of β1 was negative, indicating a lower probability of approval for non-white applicants.
The regression of approval on only the non-white regressor yielded a significant negative coefficient of -0.2006, suggesting discrimination against non-white applicants in loan approval. Subsequently, additional regressors were added to the model to control for other factors influencing loan approval.
Even after controlling for additional variables, the coefficient on non-white remains statistically significant (-0.13), indicating persistent discrimination. Furthermore, an interaction term between non-white and obrat was introduced to examine how the effect of race varies with other obligations.
The regression including the interaction term showed that it is statistically significant, suggesting that the impact of race on loan approval varies with an applicant's other obligations. Moreover, the marginal effect of being non-white on loan approval, evaluated at the median of obrat, is estimated at -0.1211 and is statistically significant, with a 95% confidence interval of -0.1600 to -0.0822.
Probit and logit models were employed to assess discrimination using nonlinear approaches. The marginal effects of being non-white from these models are compared with the linear probability estimates, showing slightly smaller magnitudes but consistent directionality.
Finally, the paper extends the analysis by including additional regressors in the probit and logit models. The coefficient for non-white remains statistically significant in both models, providing further evidence of discrimination against non-white applicants in the mortgage loan market.
In conclusion, this paper finds consistent statistical evidence of discrimination against non-white applicants in mortgage loan approval, with significant coefficients across different models and methodologies.
This evidence is specific to the given dataset. Unobserved variables or factors not captured in the analysis may influence both race and loan approval, so the findings should be interpreted with caution.