\[\Large \textbf{Section 1}\]

The boxplots show FOXP3 gene expression by treatment group (Abatacept vs. placebo) within each of the two cell types (Tfh and Treg). They allow a visual comparison of the medians and spreads of expression across groups. Such a visualization is a useful preliminary step for judging whether Abatacept may affect FOXP3 expression in either the Tfh or the Treg cells. If the boxplots display notable differences in medians and in the distribution of the data between treatment groups within a cell type, this suggests that Abatacept may influence FOXP3 expression and warrants formal statistical analysis.
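A grouped boxplot of this kind can be sketched in base R as follows. The data frame `sim` and all of its values are simulated purely for illustration and are not the study data; only the column names `FOXP3`, `treatment`, and `cell_type` follow the model formula used later in this report, and the log transform is a display choice assumed here.

```r
# Hypothetical data: simulated counts, NOT the study data
set.seed(1)
sim <- expand.grid(
  rep       = 1:20,
  treatment = c("Abatacept", "Placebo"),
  cell_type = c("Tfh", "Treg")
)
# Assumed means: much higher FOXP3 in Treg, slightly higher under Abatacept
mu <- ifelse(sim$cell_type == "Treg", 150, 3) *
      ifelse(sim$treatment == "Abatacept", 1.1, 1.0)
sim$FOXP3 <- rnbinom(nrow(sim), size = 1.7, mu = mu)

# One box per treatment x cell type combination; log1p keeps both
# cell types readable on one axis despite very different scales
bp <- boxplot(log1p(FOXP3) ~ treatment + cell_type, data = sim,
              xlab = "Treatment and cell type",
              ylab = "log(1 + FOXP3 count)",
              main = "FOXP3 expression by treatment and cell type")
```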

\[\Large \textbf{Section 2}\]

Call: glm(formula = FOXP3 ~ treatment + cell_type + age + sex + race + total_RNA_count, family = poisson(), data = data)

Coefficients:
                   Estimate Std. Error z value Pr(>|z|)
(Intercept)      -4.924e-01  9.690e-02  -5.082 3.74e-07 ***
treatmentPlacebo -6.449e-02  2.010e-02  -3.208 0.001335 **
cell_typeTreg     4.171e+00  7.752e-02  53.810  < 2e-16 ***
age               3.113e-03  8.220e-04   3.787 0.000153 ***
sexMale          -1.034e-01  3.039e-02  -3.404 0.000665 ***
raceWhite        -1.554e-01  2.229e-02  -6.972 3.13e-12 ***
total_RNA_count   9.316e-07  2.692e-08  34.607  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

Null deviance: 19111.8  on 147  degrees of freedom

Residual deviance: 2516.3 on 141 degrees of freedom

AIC: 3149

Number of Fisher Scoring iterations: 5

The Poisson regression quantifies the association between each predictor (Abatacept treatment, cell type, age, sex, race, and total RNA count) and FOXP3 expression. The significant negative coefficient for the placebo group (-0.064, p = 0.0013) means that expected FOXP3 counts are lower under placebo, or equivalently that Abatacept is associated with higher FOXP3 expression than placebo when the other covariates are held fixed. Such a finding points to a potential role of Abatacept in enhancing expression in the examined cell types. The large positive coefficient for Treg (4.17, p < 2e-16) shows that FOXP3 expression is far higher in Treg than in Tfh cells, which underscores the importance of cell type in gene expression studies. Because the model contains no treatment-by-cell-type interaction, this coefficient does not by itself show that Abatacept acts differently in Treg cells. Age was also significant (p = 0.00015), with a small positive coefficient, so expected expression rises slightly with age. The significant negative coefficient for male sex indicates lower expected FOXP3 counts in males than in the reference group, not a sex-specific effect of Abatacept, which would again require an interaction term. Race was likewise a significant predictor, and total RNA count was a strong positive predictor. Overall, this Poisson model suggests an effect of Abatacept on FOXP3 expression alongside several biological and demographic associations, which demonstrates the importance of considering such factors in gene expression studies.
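Because the Poisson model uses a log link, each coefficient is easier to read after exponentiation, when it becomes a multiplicative rate ratio. The sketch below does this for the estimates and standard errors copied from the summary above; the Wald 95% intervals are an addition for illustration and were not part of the reported output.

```r
# Estimates and standard errors copied from the Poisson summary above
est <- c(treatmentPlacebo = -6.449e-02, cell_typeTreg = 4.171e+00,
         age = 3.113e-03, sexMale = -1.034e-01, raceWhite = -1.554e-01)
se  <- c(2.010e-02, 7.752e-02, 8.220e-04, 3.039e-02, 2.229e-02)

# Rate ratio = exp(coefficient); Wald 95% interval built on the log scale
z <- qnorm(0.975)
rate_ratios <- data.frame(
  rate_ratio = exp(est),
  lower      = exp(est - z * se),
  upper      = exp(est + z * se)
)
print(round(rate_ratios, 3))
```

Read this way, the placebo coefficient corresponds to roughly 6% lower expected counts than under Abatacept, and the Treg coefficient to a roughly 65-fold higher expected count than in Tfh cells. The intervals inherit the Poisson standard errors, so they are likely too narrow if the counts are overdispersed.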

\[\Large \textbf{Section 3}\]

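# Pearson chi-square statistic: sum of squared Pearson residuals
pearson_chisq <- sum(resid(model, type = "pearson")^2)
# Residual degrees of freedom (named resid_df to avoid masking stats::df)
resid_df <- df.residual(model)

# A ratio near 1 is expected under the Poisson assumption;
# values well above 1 indicate overdispersion
overdispersion_ratio <- pearson_chisq / resid_df
cat("Overdispersion ratio: ", overdispersion_ratio, "\n")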
## Overdispersion ratio:  18.87553
# Arrange the diagnostic plots in a 2 x 2 grid; the fifth plot (Q-Q) starts a new page
par(mfrow = c(2, 2))

plot(resid(model, type = "pearson") ~ fitted(model),
     main = "Residuals vs Fitted",
     xlab = "Fitted Values",
     ylab = "Pearson Residuals")
abline(h = 0, col = "red")

plot(sqrt(abs(resid(model, type = "pearson"))) ~ fitted(model),
     main = "Scale-Location Plot",
     xlab = "Fitted Values",
     ylab = "Square Root of |Residuals|")

plot(hatvalues(model),
     main = "Leverage Plot",
     xlab = "Observations",
     ylab = "Leverage")
abline(h = 2 * mean(hatvalues(model)), col = "red")

cooksd <- cooks.distance(model)
plot(cooksd, type = "h", 
     main = "Cook's Distance",
     xlab = "Observations", 
     ylab = "Cook's Distance")
abline(h = 4 / length(cooksd), col = "red")

# Normal Q-Q plot of the Pearson residuals
qqnorm(resid(model, type = "pearson"))
qqline(resid(model, type = "pearson"))

This section examines the reliability of the Poisson regression model of FOXP3 gene expression. First, the residual deviance of 2516.3 on 141 degrees of freedom and the Pearson overdispersion ratio of 18.88 indicate substantial overdispersion, since both ratios should be close to 1 if the Poisson assumption held. This excess of the residual deviance over its degrees of freedom indicates that the Poisson model does not fully capture the variability in the FOXP3 expression data. Overdispersion leads to underestimated standard errors and overstated significance for the predictors, which can make the conclusions drawn from the model unreliable. In addition, the Q-Q plot and the Residuals vs Fitted plot indicate departures from normality and homoscedasticity in the Pearson residuals. The leverage plot shows individual observations that exert considerable influence on the parameter estimates, namely those above the reference line at twice the mean leverage. The Cook’s Distance plot similarly identifies observations whose removal would noticeably change the fitted model (those above the 4/n reference line), which emphasizes the importance of assessing influence to ensure model robustness.
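One rough way to see how much the overdispersion matters is a quasi-Poisson style correction, which keeps the coefficient estimates and multiplies every standard error by the square root of the dispersion estimate. The sketch below applies that correction to the z values reported in Section 2, using the Pearson ratio of 18.87553 printed above. It uses a normal reference distribution for simplicity, which is an assumption of this illustration and not part of the original analysis.

```r
# z values copied from the Poisson summary in Section 2
z_poisson <- c(treatmentPlacebo = -3.208, cell_typeTreg = 53.810,
               age = 3.787, sexMale = -3.404, raceWhite = -6.972,
               total_RNA_count = 34.607)
dispersion <- 18.87553            # Pearson chi-square / residual df

# Inflating each SE by sqrt(dispersion) shrinks each z by the same factor
z_adjusted <- z_poisson / sqrt(dispersion)
p_adjusted <- 2 * pnorm(-abs(z_adjusted))

print(round(cbind(z_poisson, z_adjusted, p_adjusted), 3))
```

Under this correction only cell type and total RNA count keep |z| above 1.96; the treatment, age, sex, and race effects no longer reach the 0.05 level.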

Further diagnostic checks are therefore needed to evaluate whether the Poisson distribution is appropriate for these count data. The Poisson assumption that the variance equals the mean rarely holds for biological data; the variance usually exceeds the mean as a result of biological variability and experimental conditions. The significant coefficients for treatment, cell type, age, sex, race, and total RNA count suggest associations with FOXP3 expression, but their standard errors, and hence their significance, cannot be trusted until overdispersion is addressed. A more flexible model such as Negative Binomial regression may be preferable, because it adds a dispersion parameter that absorbs the extra variability the Poisson model leaves unexplained.
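The difference between the two models can be stated through their mean-variance relationships. Writing \(\mu\) for the expected count and \(\theta\) for the dispersion parameter in the parameterization used by R's `glm.nb`:

\[
\text{Poisson: } \mathrm{Var}(Y) = \mu, \qquad
\text{Negative Binomial: } \mathrm{Var}(Y) = \mu + \frac{\mu^{2}}{\theta}.
\]

As \(\theta \to \infty\) the Negative Binomial variance approaches the Poisson variance, so smaller values of \(\theta\) correspond to stronger overdispersion.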

Although the Poisson regression offers a first look at the factors influencing FOXP3 gene expression, the concerns about overdispersion and model fit mean that its results must be interpreted cautiously. As noted above, the large residual deviance relative to its degrees of freedom suggests that the model does not adequately represent the underlying distribution of the data. Addressing these issues through alternative modeling strategies, such as Negative Binomial regression or zero-inflated models, is essential for drawing stronger conclusions about the impact of Abatacept and other covariates on FOXP3 gene expression.
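A minimal sketch of this comparison is shown below on simulated data, since the study data are not reproduced here. The data frame `sim`, the assumed means, and the assumed theta of 1.7 are all hypothetical; `glm.nb` comes from the MASS package.

```r
library(MASS)  # provides glm.nb()

# Hypothetical data: simulated overdispersed counts, NOT the study data
set.seed(42)
n <- 148
sim <- data.frame(
  cell_type = factor(rep(c("Tfh", "Treg"), each = n / 2)),
  treatment = factor(rep(c("Abatacept", "Placebo"), times = n / 2))
)
mu <- exp(1 + 4 * (sim$cell_type == "Treg"))   # assumed true means
sim$FOXP3 <- rnbinom(n, size = 1.7, mu = mu)   # assumed theta = 1.7

pois_fit <- glm(FOXP3 ~ treatment + cell_type, family = poisson(), data = sim)
nb_fit   <- glm.nb(FOXP3 ~ treatment + cell_type, data = sim)

# Pearson dispersion ratio: values far above 1 signal overdispersion
ratio <- function(fit) sum(resid(fit, type = "pearson")^2) / df.residual(fit)
cat("Poisson ratio:", ratio(pois_fit), "\n")
cat("NB ratio:     ", ratio(nb_fit), " theta:", nb_fit$theta, "\n")
cat("AIC Poisson / NB:", AIC(pois_fit), "/", AIC(nb_fit), "\n")
```

On data generated this way, the Negative Binomial fit should show a dispersion ratio much closer to 1 and a lower AIC than the Poisson fit, which is the pattern to look for when applying the same comparison to real counts.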

\[\Large \textbf{Section 4}\]

Start:  AIC=1198.61
FOXP3 ~ treatment + cell_type + age + sex + race

        Df    AIC

Step:  AIC=1196.67
FOXP3 ~ cell_type + age + sex + race

        Df    AIC

Step:  AIC=1195.1
FOXP3 ~ cell_type + sex + race

            Df    AIC
<none>         1195.1
- sex        1 1195.2
- race       1 1195.5
+ age        1 1196.7
+ treatment  1 1197.0
- cell_type  1 1414.0

Call: glm.nb(formula = FOXP3 ~ cell_type + sex + race, data = data, init.theta = 1.657530103, link = log)

Coefficients:
              Estimate Std. Error z value Pr(>|z|)
(Intercept)     1.0738     0.1816   5.912 3.38e-09 ***
cell_typeTreg   4.2741     0.1499  28.504  < 2e-16 ***
sexMale        -0.3041     0.2053  -1.481    0.138
raceWhite      -0.2775     0.1814  -1.530    0.126
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for Negative Binomial(1.6575) family taken to be 1)

Null deviance: 837.00  on 147  degrees of freedom

Residual deviance: 188.88 on 144 degrees of freedom

AIC: 1197.1

Number of Fisher Scoring iterations: 1

          Theta:  1.658 
      Std. Err.:  0.280 

2 x log-likelihood:  -1187.096

Likelihood ratio tests of Negative Binomial Models

Response: FOXP3
                                     Model   theta Resid. df 2 x log-lik.   Test df  LR stat.  Pr(Chi)
1                   cell_type + sex + race 1.65753       144    -1187.096
2 treatment + cell_type + age + sex + race 1.66082       142    -1186.607 1 vs 2  2 0.4883438 0.783353

Stepwise AIC selection, applied to a Negative Binomial regression of FOXP3 gene expression, produced a refined model that retains cell type, sex, and race as predictors. This selection procedure indicated that treatment and age were less important in explaining the variation in FOXP3 expression than the Poisson model had suggested. The exclusion of the treatment variable from the final model was surprising, as it suggests that the direct impact of Abatacept treatment on FOXP3 expression levels may be small relative to factors such as cell type. Meanwhile, the retention of cell type (Treg) as a highly significant predictor agrees with the biological understanding that FOXP3 expression is greater in regulatory T cells.
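The two AIC values reported for the selected model (1195.1 in the stepwise table and 1197.1 in the summary) can be reconciled by simple arithmetic on the reported 2 x log-likelihood values. The sketch below assumes, as the numbers suggest, that the stepwise table counts only the regression coefficients while the summary also counts the estimated theta; this reading is inferred from the output and is not stated in it.

```r
two_loglik_final <- -1187.096   # reported 2 x log-likelihood, selected model
two_loglik_full  <- -1186.607   # reported 2 x log-likelihood, full model

# AIC = -(2 x log-likelihood) + 2 * (number of parameters counted)
aic <- function(two_loglik, k) -two_loglik + 2 * k

aic_step_final    <- aic(two_loglik_final, k = 4)  # 4 coefficients
aic_summary_final <- aic(two_loglik_final, k = 5)  # 4 coefficients + theta
aic_step_full     <- aic(two_loglik_full,  k = 6)  # 6 coefficients

print(c(step_final = aic_step_final, summary_final = aic_summary_final,
        step_full = aic_step_full))
```

The three results round to 1195.1, 1197.1, and 1198.61, matching the final step, the summary, and the starting model respectively.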

Furthermore, the likelihood ratio test comparing the two Negative Binomial models supports these findings. For the model containing cell type, sex, and race, the 2 x log-likelihood was -1187.096. Adding treatment and age changed it only negligibly, to -1186.607, giving a likelihood ratio test statistic of 0.4883438 on 2 degrees of freedom and a p-value of 0.783353. These results confirm that the additional variables do not significantly improve the fit of the model.
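The reported test can be checked by hand from the figures in the output. The sketch below recomputes the likelihood ratio statistic from the rounded 2 x log-likelihood values and the p-value from the chi-square distribution with 2 degrees of freedom; the variable names are illustrative only.

```r
two_loglik_reduced <- -1187.096  # cell_type + sex + race
two_loglik_full    <- -1186.607  # + treatment + age
lr_reported        <- 0.4883438  # LR statistic from the output
df_diff            <- 144 - 142  # difference in residual df

# LR statistic is the gain in 2 x log-likelihood (matches up to rounding)
lr_from_loglik <- two_loglik_full - two_loglik_reduced

# Upper-tail chi-square probability gives the p-value
p_value <- pchisq(lr_reported, df = df_diff, lower.tail = FALSE)
cat("LR from rounded log-likelihoods:", lr_from_loglik, "\n")
cat("p-value:", p_value, "\n")
```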

The absence of the treatment variable (Abatacept vs. Placebo) from the AIC-selected model raises questions about our initial conclusion regarding the relationship between Abatacept and FOXP3 gene expression. While this may seem counterintuitive, especially given Abatacept's intended immunomodulatory role, it illustrates the complexity of gene expression regulation and the possible role of demographic factors. Although sex and race were not statistically significant at the 0.05 level in the final model, their retention by the AIC criterion may reflect underlying physiological patterns that merit further analysis in gene expression studies.

According to the refined Negative Binomial analysis, the effect of Abatacept on FOXP3 gene expression may not be as straightforward as the Poisson model suggested once overdispersion is accounted for and a better-fitting model is used. The strong effect of cell type (Treg) on FOXP3 expression highlights the importance of cellular context in gene expression studies. These results suggest that future research should continue to examine the interconnected roles of biological and demographic factors when assessing the effects of treatments such as Abatacept.

In conclusion, although a direct effect of Abatacept on FOXP3 expression was not detected in the refined analysis, this result does not rule out potential benefits of Abatacept in modulating immune responses through mechanisms that are not captured by FOXP3 expression levels alone.