knitr::opts_chunk$set(echo = FALSE) # False when reporting
library(readr)
library(ggplot2)
library(car)
## Loading required package: carData
library(leaps)
stroke_risk_dataset <- read_csv("~/STAT 840 Projects/FINAL PROJECT/stroke_risk_dataset_v2.csv")
## Rows: 35000 Columns: 19
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): gender
## dbl (18): age, chest_pain, high_blood_pressure, irregular_heartbeat, shortne...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

I. Introduction

A. Study Design This study investigates whether stroke-related symptoms are associated with age among patients in the stroke risk data set. Understanding how stroke-related symptoms are associated with age may provide valuable insight into stroke risk and prevention. In the notation used in this report, Yᵢ is the response variable, age, for individual i, and Xᵢⱼ is the value of the j-th stroke-related symptom for individual i. The symptoms are the primary binary predictor variables: chest pain, high blood pressure, shortness of breath, irregular heartbeat, fatigue/weakness, dizziness, swelling, neck/jaw pain, excessive sweating, persistent cough, nausea and vomiting, chest discomfort, cold hands/feet, snoring or sleep apnea, and anxiety. The data set contains n = 35,000 observations. Each symptom variable is coded as 0 for absence of the symptom and 1 for its presence.

Model Assumptions The multiple linear regression model assumes a linear relationship between age and the stroke-related symptoms, independent observations, normally distributed residuals, constant residual variance (homoscedasticity) across fitted values, and no severe multicollinearity among the predictors.

Regression Equation Yᵢ = age for individual i; β₀ = intercept; βⱼ = coefficient for predictor j; Xᵢⱼ = value of predictor j for individual i; εᵢ = random error

Yᵢ = β₀ + β₁(Chest Pain)ᵢ + β₂(High Blood Pressure)ᵢ + β₃(Shortness of Breath)ᵢ + β₄(Irregular Heartbeat)ᵢ + β₅(Fatigue/Weakness)ᵢ + β₆(Dizziness)ᵢ + β₇(Swelling/Edema)ᵢ + β₈(Neck/Jaw Pain)ᵢ + β₉(Excessive Sweating)ᵢ + β₁₀(Persistent Cough)ᵢ + β₁₁(Nausea/Vomiting)ᵢ + β₁₂(Chest Discomfort)ᵢ + β₁₃(Cold Hands/Feet)ᵢ + β₁₄(Sleep Apnea)ᵢ + β₁₅(Anxiety Doom)ᵢ + εᵢ

B. Aims The purpose of this study is to determine whether stroke-related symptoms are associated with age. The analysis evaluates whether individuals reporting a stroke-related symptom tend to have a different mean age than those who do not report it, holding the other symptoms constant.

II. Methods

A. Preliminary Model A preliminary multiple linear regression model was fit using all stroke-related symptoms considered clinically relevant to age prediction. Diagnostic procedures and predictor screening methods were used to evaluate the model assumptions, the significance of each variable, and multicollinearity.

## 
## Call:
## lm(formula = age ~ chest_pain + high_blood_pressure + shortness_of_breath + 
##     irregular_heartbeat + fatigue_weakness + dizziness + swelling_edema + 
##     neck_jaw_pain + excessive_sweating + persistent_cough + nausea_vomiting + 
##     chest_discomfort + cold_hands_feet + snoring_sleep_apnea + 
##     anxiety_doom, data = stroke_risk_dataset)
## 
## Coefficients:
##         (Intercept)           chest_pain  high_blood_pressure  
##              30.591                3.871                4.533  
## shortness_of_breath  irregular_heartbeat     fatigue_weakness  
##               2.870                4.861                2.729  
##           dizziness       swelling_edema        neck_jaw_pain  
##               3.163                3.578                5.028  
##  excessive_sweating     persistent_cough      nausea_vomiting  
##               2.936                3.775                2.923  
##    chest_discomfort      cold_hands_feet  snoring_sleep_apnea  
##               3.833                3.056                4.139  
##        anxiety_doom  
##               2.934
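The report hides its code chunks (echo = FALSE), so the following is only a minimal sketch of how a model of this form can be fit, not the author's code. It uses simulated data as a stand-in for the real file; the object names `sim`, `fit_sim`, and `coef_sim`, the reduced set of three predictors, and the "true" effect sizes are all assumptions made for illustration.

```r
# Simulated stand-in for the real data: three binary symptoms and an age outcome.
set.seed(840)
n <- 2000
sim <- data.frame(
  chest_pain          = rbinom(n, 1, 0.15),
  high_blood_pressure = rbinom(n, 1, 0.25),
  dizziness           = rbinom(n, 1, 0.19)
)

# Assumed effects (in years), chosen only for illustration.
sim$age <- 30 + 4 * sim$chest_pain + 4.5 * sim$high_blood_pressure +
  3 * sim$dizziness + rnorm(n, sd = 10)

# Same model form as the report, with fewer predictors.
fit_sim  <- lm(age ~ chest_pain + high_blood_pressure + dizziness, data = sim)
coef_sim <- coef(fit_sim)
print(round(coef_sim, 2))
```

Because every predictor is coded 0/1, each coefficient can be read as the estimated difference in mean age between people with and without that symptom, holding the other symptoms fixed.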

B. Final Model The final model retained all stroke-related predictors because each was clinically relevant to the study aim and contributed to evaluating the overall relationship between age and stroke symptoms. Predictor screening showed no evidence of severe multicollinearity, and every predictor was statistically significant. The final fitted model was therefore the same as the preliminary model.

III. Results

The overall model was statistically significant, F(15, 34,984) = 857.1873, p < 0.001, indicating that at least one stroke-related symptom was significantly associated with age. The model explained approximately 26.88% of the variability in age, with an adjusted R^2 of 26.84%. This suggests that the symptom predictors provide meaningful explanatory information about variation in age, although a substantial amount of age variability remains unexplained by symptoms alone.

The hypotheses for the overall model were:

H0: Stroke-related symptoms are not associated with age (β₁ = β₂ = ... = β₁₅ = 0).

Ha: At least one stroke-related symptom is associated with age (at least one βⱼ ≠ 0).

Because the overall F-test was significant, the null hypothesis was rejected. This provides evidence that age is associated with at least one stroke-related symptom predictor.

Summary/Conclusion: The multiple linear regression analysis identified a statistically significant association between stroke-related symptoms and age. All symptom variables had positive coefficients, with high blood pressure showing one of the strongest associations in the final model. The adjusted R^2 indicated that the model explained only part of the variability in age. The diagnostics showed no concerning multicollinearity or unduly influential observations, but the residual plots suggested some heteroscedasticity and heavy tails. Because the data set was synthetic and age is influenced by many additional factors not included in the analysis, the results should be interpreted as evidence of association rather than causation. Future research using real-world clinical data and additional predictors may improve predictive performance and model generalization.

## 
## Call:
## lm(formula = age ~ chest_pain + high_blood_pressure + shortness_of_breath + 
##     irregular_heartbeat + fatigue_weakness + dizziness + swelling_edema + 
##     neck_jaw_pain + excessive_sweating + persistent_cough + nausea_vomiting + 
##     chest_discomfort + cold_hands_feet + snoring_sleep_apnea + 
##     anxiety_doom, data = stroke_risk_dataset)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -31.879  -7.423  -1.535   6.576  48.483 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         30.59130    0.09034  338.62   <2e-16 ***
## chest_pain           3.87145    0.15059   25.71   <2e-16 ***
## high_blood_pressure  4.53283    0.12342   36.73   <2e-16 ***
## shortness_of_breath  2.86985    0.13555   21.17   <2e-16 ***
## irregular_heartbeat  4.86091    0.17869   27.20   <2e-16 ***
## fatigue_weakness     2.72947    0.12354   22.09   <2e-16 ***
## dizziness            3.16339    0.13515   23.41   <2e-16 ***
## swelling_edema       3.57820    0.15069   23.75   <2e-16 ***
## neck_jaw_pain        5.02797    0.17762   28.31   <2e-16 ***
## excessive_sweating   2.93633    0.17857   16.44   <2e-16 ***
## persistent_cough     3.77514    0.17257   21.88   <2e-16 ***
## nausea_vomiting      2.92280    0.17855   16.37   <2e-16 ***
## chest_discomfort     3.83257    0.15148   25.30   <2e-16 ***
## cold_hands_feet      3.05599    0.13425   22.76   <2e-16 ***
## snoring_sleep_apnea  4.13918    0.15039   27.52   <2e-16 ***
## anxiety_doom         2.93380    0.17773   16.51   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9.891 on 34984 degrees of freedom
## Multiple R-squared:  0.2688, Adjusted R-squared:  0.2684 
## F-statistic: 857.2 on 15 and 34984 DF,  p-value: < 2.2e-16
## Analysis of Variance Table
## 
## Response: age
##                        Df  Sum Sq Mean Sq F value    Pr(>F)    
## chest_pain              1  132450  132450 1353.88 < 2.2e-16 ***
## high_blood_pressure     1  259708  259708 2654.68 < 2.2e-16 ***
## shortness_of_breath     1   81173   81173  829.74 < 2.2e-16 ***
## irregular_heartbeat     1  114592  114592 1171.33 < 2.2e-16 ***
## fatigue_weakness        1   71927   71927  735.22 < 2.2e-16 ***
## dizziness               1   74553   74553  762.07 < 2.2e-16 ***
## swelling_edema          1   78582   78582  803.25 < 2.2e-16 ***
## neck_jaw_pain           1   95252   95252  973.64 < 2.2e-16 ***
## excessive_sweating      1   32588   32588  333.11 < 2.2e-16 ***
## persistent_cough        1   58611   58611  599.11 < 2.2e-16 ***
## nausea_vomiting         1   30156   30156  308.25 < 2.2e-16 ***
## chest_discomfort        1   71916   71916  735.11 < 2.2e-16 ***
## cold_hands_feet         1   54403   54403  556.10 < 2.2e-16 ***
## snoring_sleep_apnea     1   75316   75316  769.87 < 2.2e-16 ***
## anxiety_doom            1   26656   26656  272.48 < 2.2e-16 ***
## Residuals           34984 3422498      98                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##                         2.5 %    97.5 %
## (Intercept)         30.414224 30.768368
## chest_pain           3.576289  4.166605
## high_blood_pressure  4.290922  4.774736
## shortness_of_breath  2.604181  3.135528
## irregular_heartbeat  4.510660  5.211154
## fatigue_weakness     2.487337  2.971607
## dizziness            2.898480  3.428293
## swelling_edema       3.282854  3.873551
## neck_jaw_pain        4.679825  5.376121
## excessive_sweating   2.586325  3.286330
## persistent_cough     3.436894  4.113393
## nausea_vomiting      2.572828  3.272768
## chest_discomfort     3.535664  4.129478
## cold_hands_feet      2.792851  3.319130
## snoring_sleep_apnea  3.844418  4.433948
## anxiety_doom         2.585441  3.282164
IV. Discussion

This analysis used a synthetic stroke-related dataset created by Mahatir Ahmed Tusher using medical literature from the American Stroke Association, WHO Global Stroke Reports, Harrison’s Principles of Internal Medicine (20th Edition), and Stroke Prevention, Treatment, and Rehabilitation (Oxford, 2021). Because the dataset was synthetic rather than collected from real patients, the results may not fully represent real-world clinical variability and could introduce simulation-related bias or measurement error[3].

This analysis found evidence of an association between stroke-related symptoms and age. Positive coefficient estimates suggested that individuals reporting a symptom tended to have a higher predicted age than individuals not reporting it, holding the other symptoms constant. High blood pressure had the largest t-statistic (36.73), while neck/jaw pain (5.03 years) and irregular heartbeat (4.86 years) had the largest coefficient estimates, compared with 4.53 years for high blood pressure.

The overall model was statistically significant. The adjusted R^2 indicated that symptoms explained only part of the variation in age. This was expected because age is influenced by many additional factors not included in the model, and binary symptom predictors limit the precision of predicting a continuous outcome such as age.

## # A tibble: 6 × 19
##     age gender chest_pain high_blood_pressure irregular_heartbeat
##   <dbl> <chr>       <dbl>               <dbl>               <dbl>
## 1    22 Male            1                   0                   0
## 2    52 Male            0                   1                   1
## 3    63 Female          0                   1                   0
## 4    41 Male            0                   0                   1
## 5    53 Male            0                   0                   0
## 6    28 Female          0                   0                   0
## # ℹ 14 more variables: shortness_of_breath <dbl>, fatigue_weakness <dbl>,
## #   dizziness <dbl>, swelling_edema <dbl>, neck_jaw_pain <dbl>,
## #   excessive_sweating <dbl>, persistent_cough <dbl>, nausea_vomiting <dbl>,
## #   chest_discomfort <dbl>, cold_hands_feet <dbl>, snoring_sleep_apnea <dbl>,
## #   anxiety_doom <dbl>, stroke_risk_percentage <dbl>, at_risk <dbl>
## Total Missing Values: 0

V. Appendix

A. Diagnostics for predictors

The histogram of age shows moderate variability and a slight right skew. The majority of patients are between 25 and 45 years old. For each symptom variable, the box plots suggest that individuals reporting the symptom have a higher median age than those who do not. Several upper-age outliers were observed in both groups, with a greater concentration among individuals not reporting symptoms.

The correlation matrix showed weak correlations among the predictor variables, suggesting low multicollinearity. All pairwise correlations among the symptom variables were 0.07 or lower, and the correlations between each symptom and age ranged from 0.11 to 0.24, indicating that the symptom variables provided relatively independent information in the regression model.
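As a rough illustration of this screening step, the sketch below builds a rounded correlation matrix and extracts the largest off-diagonal value. It runs on simulated, independently drawn binary variables rather than the report's data, and the names `symptoms`, `cor_mat`, and `max_off_diag` are hypothetical.

```r
# Simulated binary symptom indicators (illustrative prevalences, not the real data).
set.seed(1)
n <- 5000
symptoms <- data.frame(
  chest_pain          = rbinom(n, 1, 0.15),
  high_blood_pressure = rbinom(n, 1, 0.25),
  dizziness           = rbinom(n, 1, 0.19)
)

# Pearson correlation matrix, rounded to two decimals as in the report's output.
cor_mat <- round(cor(symptoms), 2)

# Largest absolute correlation between two different predictors.
max_off_diag <- max(abs(cor_mat[upper.tri(cor_mat)]))

print(cor_mat)
print(max_off_diag)
```

Looking only at the upper triangle avoids the diagonal of ones and counts each pair once.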

##       age           gender            chest_pain     high_blood_pressure
##  Min.   :18.00   Length:35000       Min.   :0.0000   Min.   :0.0000     
##  1st Qu.:30.00   Class :character   1st Qu.:0.0000   1st Qu.:0.0000     
##  Median :37.00   Mode  :character   Median :0.0000   Median :0.0000     
##  Mean   :38.63                      Mean   :0.1459   Mean   :0.2519     
##  3rd Qu.:46.00                      3rd Qu.:0.0000   3rd Qu.:1.0000     
##  Max.   :86.00                      Max.   :1.0000   Max.   :1.0000     
##  irregular_heartbeat shortness_of_breath fatigue_weakness   dizziness     
##  Min.   :0.00000     Min.   :0.0000      Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.00000     1st Qu.:0.0000      1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.00000     Median :0.0000      Median :0.0000   Median :0.0000  
##  Mean   :0.09846     Mean   :0.1901      Mean   :0.2445   Mean   :0.1907  
##  3rd Qu.:0.00000     3rd Qu.:0.0000      3rd Qu.:0.0000   3rd Qu.:0.0000  
##  Max.   :1.00000     Max.   :1.0000      Max.   :1.0000   Max.   :1.0000  
##  swelling_edema   neck_jaw_pain     excessive_sweating persistent_cough
##  Min.   :0.0000   Min.   :0.00000   Min.   :0.00000    Min.   :0.000   
##  1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.:0.00000    1st Qu.:0.000   
##  Median :0.0000   Median :0.00000   Median :0.00000    Median :0.000   
##  Mean   :0.1459   Mean   :0.09951   Mean   :0.09751    Mean   :0.106   
##  3rd Qu.:0.0000   3rd Qu.:0.00000   3rd Qu.:0.00000    3rd Qu.:0.000   
##  Max.   :1.0000   Max.   :1.00000   Max.   :1.00000    Max.   :1.000   
##  nausea_vomiting   chest_discomfort cold_hands_feet  snoring_sleep_apnea
##  Min.   :0.00000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000     
##  1st Qu.:0.00000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000     
##  Median :0.00000   Median :0.0000   Median :0.0000   Median :0.0000     
##  Mean   :0.09754   Mean   :0.1438   Mean   :0.1946   Mean   :0.1471     
##  3rd Qu.:0.00000   3rd Qu.:0.0000   3rd Qu.:0.0000   3rd Qu.:0.0000     
##  Max.   :1.00000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000     
##   anxiety_doom     stroke_risk_percentage    at_risk      
##  Min.   :0.00000   Min.   :  1.50         Min.   :0.0000  
##  1st Qu.:0.00000   1st Qu.: 19.90         1st Qu.:0.0000  
##  Median :0.00000   Median : 38.70         Median :0.0000  
##  Mean   :0.09854   Mean   : 44.48         Mean   :0.3682  
##  3rd Qu.:0.00000   3rd Qu.: 64.50         3rd Qu.:1.0000  
##  Max.   :1.00000   Max.   :100.00         Max.   :1.0000

##                      age chest_pain high_blood_pressure shortness_of_breath
## age                 1.00       0.17                0.24                0.15
## chest_pain          0.17       1.00                0.05                0.03
## high_blood_pressure 0.24       0.05                1.00                0.06
## shortness_of_breath 0.15       0.03                0.06                1.00
## irregular_heartbeat 0.18       0.04                0.07                0.03
## fatigue_weakness    0.15       0.02                0.04                0.03
## dizziness           0.15       0.04                0.05                0.03
## swelling_edema      0.16       0.03                0.06                0.03
## neck_jaw_pain       0.18       0.04                0.06                0.04
## excessive_sweating  0.11       0.02                0.03                0.03
## persistent_cough    0.15       0.03                0.05                0.03
## nausea_vomiting     0.11       0.01                0.02                0.02
## chest_discomfort    0.17       0.03                0.05                0.03
## cold_hands_feet     0.15       0.03                0.05                0.04
## snoring_sleep_apnea 0.18       0.04                0.06                0.03
## anxiety_doom        0.11       0.03                0.03                0.02
##                     irregular_heartbeat fatigue_weakness dizziness
## age                                0.18             0.15      0.15
## chest_pain                         0.04             0.02      0.04
## high_blood_pressure                0.07             0.04      0.05
## shortness_of_breath                0.03             0.03      0.03
## irregular_heartbeat                1.00             0.04      0.03
## fatigue_weakness                   0.04             1.00      0.02
## dizziness                          0.03             0.02      1.00
## swelling_edema                     0.04             0.03      0.03
## neck_jaw_pain                      0.03             0.03      0.03
## excessive_sweating                 0.02             0.02      0.01
## persistent_cough                   0.03             0.04      0.03
## nausea_vomiting                    0.02             0.03      0.01
## chest_discomfort                   0.03             0.03      0.02
## cold_hands_feet                    0.03             0.03      0.02
## snoring_sleep_apnea                0.05             0.02      0.04
## anxiety_doom                       0.03             0.02      0.01
##                     swelling_edema neck_jaw_pain excessive_sweating
## age                           0.16          0.18               0.11
## chest_pain                    0.03          0.04               0.02
## high_blood_pressure           0.06          0.06               0.03
## shortness_of_breath           0.03          0.04               0.03
## irregular_heartbeat           0.04          0.03               0.02
## fatigue_weakness              0.03          0.03               0.02
## dizziness                     0.03          0.03               0.01
## swelling_edema                1.00          0.04               0.02
## neck_jaw_pain                 0.04          1.00               0.02
## excessive_sweating            0.02          0.02               1.00
## persistent_cough              0.02          0.02               0.02
## nausea_vomiting               0.02          0.03               0.01
## chest_discomfort              0.03          0.03               0.02
## cold_hands_feet               0.03          0.03               0.03
## snoring_sleep_apnea           0.06          0.04               0.02
## anxiety_doom                  0.02          0.01               0.01
##                     persistent_cough nausea_vomiting chest_discomfort
## age                             0.15            0.11             0.17
## chest_pain                      0.03            0.01             0.03
## high_blood_pressure             0.05            0.02             0.05
## shortness_of_breath             0.03            0.02             0.03
## irregular_heartbeat             0.03            0.02             0.03
## fatigue_weakness                0.04            0.03             0.03
## dizziness                       0.03            0.01             0.02
## swelling_edema                  0.02            0.02             0.03
## neck_jaw_pain                   0.02            0.03             0.03
## excessive_sweating              0.02            0.01             0.02
## persistent_cough                1.00            0.02             0.03
## nausea_vomiting                 0.02            1.00             0.02
## chest_discomfort                0.03            0.02             1.00
## cold_hands_feet                 0.04            0.02             0.04
## snoring_sleep_apnea             0.03            0.02             0.04
## anxiety_doom                    0.02            0.01             0.02
##                     cold_hands_feet snoring_sleep_apnea anxiety_doom
## age                            0.15                0.18         0.11
## chest_pain                     0.03                0.04         0.03
## high_blood_pressure            0.05                0.06         0.03
## shortness_of_breath            0.04                0.03         0.02
## irregular_heartbeat            0.03                0.05         0.03
## fatigue_weakness               0.03                0.02         0.02
## dizziness                      0.02                0.04         0.01
## swelling_edema                 0.03                0.06         0.02
## neck_jaw_pain                  0.03                0.04         0.01
## excessive_sweating             0.03                0.02         0.01
## persistent_cough               0.04                0.03         0.02
## nausea_vomiting                0.02                0.02         0.01
## chest_discomfort               0.04                0.04         0.02
## cold_hands_feet                1.00                0.03         0.02
## snoring_sleep_apnea            0.03                1.00         0.02
## anxiety_doom                   0.02                0.02         1.00

B. Screening for Predictors All stroke-related symptoms were retained in the regression model due to their relevance to the primary research question.

Variance Inflation Factor (VIF) VIF was used to assess multicollinearity. This is important because highly correlated predictors can cause instability in the coefficients, inflate the standard errors, and make it harder to isolate the effect of each predictor [2]. An acceptable VIF typically falls within the range of 1 to 5; values around 5 raise concern, and values of 6 to 10 or above indicate a serious multicollinearity problem. All predictors had VIF values between 1.00 and 1.03, indicating no concern for multicollinearity. This was expected given the low correlations among the predictors.
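For a model with only single-column numeric predictors, the VIF of predictor j is 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the remaining predictors. The sketch below computes this in base R on simulated data and cross-checks it against the diagonal of the inverse predictor correlation matrix, which is an equivalent formula. It is an illustration only: the data and the names `X`, `vif_by_regression`, and `vif_by_inverse` are assumptions, and the report's own values presumably come from the car package loaded at the top.

```r
# Simulated binary predictors (independent draws, so VIFs should be near 1).
set.seed(2)
n <- 5000
X <- data.frame(
  chest_pain          = rbinom(n, 1, 0.15),
  high_blood_pressure = rbinom(n, 1, 0.25),
  dizziness           = rbinom(n, 1, 0.19)
)

# VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing predictor j on the others.
vif_by_regression <- sapply(names(X), function(v) {
  others <- setdiff(names(X), v)
  r2 <- summary(lm(reformulate(others, response = v), data = X))$r.squared
  1 / (1 - r2)
})

# Equivalent shortcut: diagonal of the inverse correlation matrix of the predictors.
vif_by_inverse <- diag(solve(cor(X)))

print(round(vif_by_regression, 4))
```

A VIF of exactly 1 means the predictor is uncorrelated with all the others, which is why values such as 1.01 signal essentially no variance inflation.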

Adjusted R Squared When all predictors were included in the model, the adjusted R-squared was 26.84%, meaning that approximately 26.84% of the variability in age was explained by the stroke-related symptom variables after adjusting for the number of predictors. The unadjusted R-squared was 26.88%. The closeness of the two values suggests that the predictors contributed meaningful explanatory information to the model and explained variation in age rather than serving as unnecessary variables. This helps address the main question of how age tends to differ across stroke-related symptoms[2].
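The link between the two values can be checked by hand with the standard adjustment formula, adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - p - 1). The short sketch below plugs in the multiple R-squared printed in the report's output; the variable names are chosen here for illustration.

```r
r2 <- 0.2687568  # multiple R-squared from the report's output
n  <- 35000      # number of observations
p  <- 15         # number of predictors

# Penalize R^2 for the number of predictors relative to the sample size.
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)
print(round(adj_r2, 4))
```

With n = 35,000 and only 15 predictors, the penalty factor (n - 1)/(n - p - 1) is very close to 1, so the two values are expected to be nearly identical.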

Coefficient Significance All predictor variables had large positive t-values (16.37 to 36.73), providing evidence against H0: βⱼ = 0 for each predictor. The large t-values and small standard errors support a statistically reliable relationship between age and each stroke-related symptom. High blood pressure had the largest t-value (36.73), noticeably above the next largest (28.31 for neck/jaw pain). This suggests the relationships are unlikely to be due to chance.
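Each t-value is the coefficient estimate divided by its standard error. As a worked check, the sketch below reproduces the high blood pressure t-value from the estimate and standard error printed in the coefficient table, and computes the two-sided p-value with the residual degrees of freedom. Variable names are illustrative.

```r
estimate  <- 4.53283  # high_blood_pressure estimate from the coefficient table
std_error <- 0.12342  # its standard error
df_resid  <- 34984    # residual degrees of freedom

# t statistic for H0: coefficient = 0.
t_value <- estimate / std_error

# Two-sided p-value from the t distribution.
p_value <- 2 * pt(abs(t_value), df = df_resid, lower.tail = FALSE)

print(round(t_value, 2))
```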

F-Value The F-statistic tests the null hypothesis that all regression coefficients associated with the predictor variables are simultaneously equal to zero against the alternative that at least one predictor contributes explanatory information about the response variable [2]. With F(15, 34,984) = 857.1873, p < 0.001, the regression model explains significantly more variation in age than a model containing only the intercept. Therefore, at least one stroke-related symptom predictor is significantly associated with age.
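The overall F-statistic can also be recovered from R^2 alone, since F = (R^2 / p) / ((1 - R^2) / (n - p - 1)). The sketch below applies this identity to the values printed in the report's output; it is a consistency check, not a separate analysis, and the variable names are illustrative.

```r
r2 <- 0.2687568  # multiple R-squared from the report's output
n  <- 35000      # number of observations
p  <- 15         # number of predictors

# Ratio of explained variance per predictor to unexplained variance per residual df.
f_stat <- (r2 / p) / ((1 - r2) / (n - p - 1))

# Upper-tail probability under the F distribution with (p, n - p - 1) df.
p_value <- pf(f_stat, df1 = p, df2 = n - p - 1, lower.tail = FALSE)

print(round(f_stat, 2))
```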

##          chest_pain high_blood_pressure shortness_of_breath irregular_heartbeat 
##            1.010731            1.026998            1.011818            1.014029 
##    fatigue_weakness           dizziness      swelling_edema       neck_jaw_pain 
##            1.008667            1.008530            1.012529            1.011481 
##  excessive_sweating    persistent_cough     nausea_vomiting    chest_discomfort 
##            1.003967            1.009443            1.004041            1.010581 
##     cold_hands_feet snoring_sleep_apnea        anxiety_doom 
##            1.010524            1.015069            1.003918
## [1] 0.2684432
## [1] 0.2687568
##         (Intercept)          chest_pain high_blood_pressure shortness_of_breath 
##           338.61879            25.70880            36.72685            21.17264 
## irregular_heartbeat    fatigue_weakness           dizziness      swelling_edema 
##            27.20233            22.09451            23.40575            23.74615 
##       neck_jaw_pain  excessive_sweating    persistent_cough     nausea_vomiting 
##            28.30689            16.44357            21.87559            16.36934 
##    chest_discomfort     cold_hands_feet snoring_sleep_apnea        anxiety_doom 
##            25.30072            22.76297            27.52338            16.50684
##      value      numdf      dendf 
##   857.1873    15.0000 34984.0000

C. Model Validation Model validation was performed using a 70/30 train-test split. The model was trained on 70% of the observations and evaluated on the remaining 30%. Mean squared prediction error (MSPR) was used to assess predictive performance in the testing sample[2]. The square root of MSPR gives the prediction error in years.

The model had an MSPR of 98.18 and a root mean squared prediction error of approximately 9.9 years, close to the residual standard error of 9.891 from the full-data fit, so there is little sign of overfitting. Given that age ranges from 18 to 86 years and the predictors are binary symptoms, this suggests moderate predictive performance for unseen observations.

## [1] 98.17695
## [1] 9.908428
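A 70/30 validation of this kind can be sketched as follows. This is not the author's code: it uses simulated data with two predictors, an arbitrary seed, and hypothetical names (`dat`, `train_idx`, `fit_train`, `mspr`, `rmspe`), so its numbers will differ from the report's. The last line only confirms the arithmetic linking the two values printed above.

```r
# Simulated stand-in data (assumed effects and noise level, for illustration only).
set.seed(3)
n <- 3000
dat <- data.frame(
  chest_pain          = rbinom(n, 1, 0.15),
  high_blood_pressure = rbinom(n, 1, 0.25)
)
dat$age <- 30 + 4 * dat$chest_pain + 4.5 * dat$high_blood_pressure +
  rnorm(n, sd = 10)

# Randomly assign 70% of rows to training; the rest form the test set.
train_idx <- sample(seq_len(n), size = round(0.7 * n))
train <- dat[train_idx, ]
test  <- dat[-train_idx, ]

# Fit on the training rows only, then predict the held-out rows.
fit_train <- lm(age ~ chest_pain + high_blood_pressure, data = train)
pred <- predict(fit_train, newdata = test)

# Mean squared prediction error and its square root (in years).
mspr  <- mean((test$age - pred)^2)
rmspe <- sqrt(mspr)

# Arithmetic check on the report's values: sqrt(98.17695) is about 9.908428.
report_rmspe <- sqrt(98.17695)
```

Comparing the test-set root MSPR with the residual standard error of the training fit is a quick way to see whether the model predicts new observations about as well as it fits the data it was trained on.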

D. Residual Diagnostics Residuals vs. Fitted The residuals-versus-fitted plot was examined to assess model form and constant variance. The residuals were generally centered around zero, which suggests that the model was not systematically overpredicting or underpredicting age overall. However, the spread of the residuals changed across fitted values, suggesting heteroscedasticity. This indicates that the constant-variance assumption may not hold and that the precision of the model's predictions may differ across fitted age values.

QQ Plot The Q-Q plot was used to assess the normality assumption[2]. The points followed the reference line reasonably well in the center of the distribution, but deviations were present in the tails. This suggests approximate normality for most observations, with some evidence of extreme residuals or heavy tails.

Cook’s Distance / Residual vs Leverage The residuals-versus-leverage plot and Cook’s distance were examined to identify observations with excessive influence on the fitted model[2]. Most observations had low leverage and clustered near zero residuals. Although a few observations had relatively higher influence values, the Cook’s distance values remained small, suggesting that no single observation excessively influenced the estimated relationship between age and symptom predictors.
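The influence measures described here are available in base R through `cooks.distance()` and `hatvalues()`. The sketch below applies them to a small simulated model; the data, the object names, and the 4/n and 0.5 cutoffs are assumptions for illustration (common rules of thumb, not thresholds stated in the report).

```r
# Simulated one-predictor model (illustrative only).
set.seed(4)
n <- 1000
dat <- data.frame(high_blood_pressure = rbinom(n, 1, 0.25))
dat$age <- 30 + 4.5 * dat$high_blood_pressure + rnorm(n, sd = 10)
fit <- lm(age ~ high_blood_pressure, data = dat)

# Cook's distance: how much all fitted values change when one observation is dropped.
cooks <- cooks.distance(fit)

# Leverage: diagonal of the hat matrix; it sums to the number of coefficients.
lev <- hatvalues(fit)

# Rule-of-thumb screens (assumed cutoffs, not from the report).
flag_4n  <- which(cooks > 4 / n)
max_cook <- max(cooks)

print(max_cook)
print(length(flag_4n))
```

With binary predictors, leverage depends only on how many observations share the same symptom pattern, so very large samples tend to produce uniformly low leverage and small Cook's distances.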

References:

1. WHO | Disease burden and mortality estimates. Accessed May 21, 2020 at: http://www.who.int/healthinfo/global_burden_disease/estimates/en

2. Kutner, M. H., Nachtsheim, C. J., Neter, J., & Li, W. (2005). Applied linear statistical models (5th ed.). McGraw-Hill Irwin.

3. Mahatir Ahmed Tusher, and Saket Choudary Kongara. (2025). Stroke Risk Prediction Dataset based on Literature [Version2]. Kaggle. https://doi.org/10.34740/KAGGLE/DSV/10892812