Team members: 賴加耕, 黃丞祥, 姚冠豪, 趙致忠
Data source: https://www.kaggle.com/datasets/uom190346a/sleep-health-and-lifestyle-dataset

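The variables in the dataset are described below:

| Variable name | Description |
| --- | --- |
| Person ID | Person identifier |
| Gender | Gender |
| Age | Age |
| Occupation | Occupation |
| Sleep Duration | Sleep duration (hours) |
| Quality of Sleep | Sleep quality |
| Physical Activity Level | Physical activity level |
| Stress Level | Stress level |
| BMI Category | Body mass index (BMI) category |
| Blood Pressure | Blood pressure |
| Heart Rate | Heart rate |
| Daily Steps | Daily step count |
| Sleep Disorder | Sleep disorder |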
# Read the data
library(readr)
data <- read_csv("C:/Users/Howard/Sleep_health_and_lifestyle_dataset.csv")
## Rows: 374 Columns: 13
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (5): Gender, Occupation, BMI Category, Blood Pressure, Sleep Disorder
## dbl (8): Person ID, Age, Sleep Duration, Quality of Sleep, Physical Activity...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(data)
## # A tibble: 6 × 13
##   `Person ID` Gender   Age Occupation        `Sleep Duration` `Quality of Sleep`
##         <dbl> <chr>  <dbl> <chr>                        <dbl>              <dbl>
## 1           1 Male      27 Software Engineer              6.1                  6
## 2           2 Male      28 Doctor                         6.2                  6
## 3           3 Male      28 Doctor                         6.2                  6
## 4           4 Male      28 Sales Representa…              5.9                  4
## 5           5 Male      28 Sales Representa…              5.9                  4
## 6           6 Male      28 Software Engineer              5.9                  4
## # ℹ 7 more variables: `Physical Activity Level` <dbl>, `Stress Level` <dbl>,
## #   `BMI Category` <chr>, `Blood Pressure` <chr>, `Heart Rate` <dbl>,
## #   `Daily Steps` <dbl>, `Sleep Disorder` <chr>
str(data)
## spc_tbl_ [374 × 13] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ Person ID              : num [1:374] 1 2 3 4 5 6 7 8 9 10 ...
##  $ Gender                 : chr [1:374] "Male" "Male" "Male" "Male" ...
##  $ Age                    : num [1:374] 27 28 28 28 28 28 29 29 29 29 ...
##  $ Occupation             : chr [1:374] "Software Engineer" "Doctor" "Doctor" "Sales Representative" ...
##  $ Sleep Duration         : num [1:374] 6.1 6.2 6.2 5.9 5.9 5.9 6.3 7.8 7.8 7.8 ...
##  $ Quality of Sleep       : num [1:374] 6 6 6 4 4 4 6 7 7 7 ...
##  $ Physical Activity Level: num [1:374] 42 60 60 30 30 30 40 75 75 75 ...
##  $ Stress Level           : num [1:374] 6 8 8 8 8 8 7 6 6 6 ...
##  $ BMI Category           : chr [1:374] "Overweight" "Normal" "Normal" "Obese" ...
##  $ Blood Pressure         : chr [1:374] "126/83" "125/80" "125/80" "140/90" ...
##  $ Heart Rate             : num [1:374] 77 75 75 85 85 85 82 70 70 70 ...
##  $ Daily Steps            : num [1:374] 4200 10000 10000 3000 3000 3000 3500 8000 8000 8000 ...
##  $ Sleep Disorder         : chr [1:374] "None" "None" "None" "Sleep Apnea" ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   `Person ID` = col_double(),
##   ..   Gender = col_character(),
##   ..   Age = col_double(),
##   ..   Occupation = col_character(),
##   ..   `Sleep Duration` = col_double(),
##   ..   `Quality of Sleep` = col_double(),
##   ..   `Physical Activity Level` = col_double(),
##   ..   `Stress Level` = col_double(),
##   ..   `BMI Category` = col_character(),
##   ..   `Blood Pressure` = col_character(),
##   ..   `Heart Rate` = col_double(),
##   ..   `Daily Steps` = col_double(),
##   ..   `Sleep Disorder` = col_character()
##   .. )
##  - attr(*, "problems")=<externalptr>
# Load the required packages
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(tidyr)
library(stats)
data <- data %>%
  rename(Sleep_Disorder = `Sleep Disorder`)
data <- data %>%
  mutate(Sleep_Disorder = ifelse(Sleep_Disorder == "None", 0, 1),  # recode sleep disorder as 0 (none) or 1 (any disorder)
         Gender = factor(Gender),  # convert gender to a factor
         `BMI Category` = factor(`BMI Category`))  # convert BMI category to a factor
model_data <- data %>%
  select(Sleep_Disorder, Age, Gender, `Stress Level`, `Heart Rate`, `Daily Steps`, `Sleep Duration`, `Quality of Sleep`, `BMI Category`)
m1 <- glm(Sleep_Disorder ~ Age + Gender + `Stress Level` + `Heart Rate` + `Daily Steps` + `Sleep Duration` + `Quality of Sleep` + `BMI Category`,
          data = model_data,
          family = binomial)
summary(m1)
## 
## Call:
## glm(formula = Sleep_Disorder ~ Age + Gender + `Stress Level` + 
##     `Heart Rate` + `Daily Steps` + `Sleep Duration` + `Quality of Sleep` + 
##     `BMI Category`, family = binomial, data = model_data)
## 
## Coefficients:
##                               Estimate Std. Error z value Pr(>|z|)   
## (Intercept)                  2.622e+00  6.165e+00   0.425  0.67065   
## Age                          1.766e-01  5.601e-02   3.154  0.00161 **
## GenderMale                  -1.699e-01  6.493e-01  -0.262  0.79351   
## `Stress Level`              -4.657e-01  4.430e-01  -1.051  0.29320   
## `Heart Rate`                 3.613e-02  9.812e-02   0.368  0.71271   
## `Daily Steps`                1.721e-04  1.565e-04   1.100  0.27137   
## `Sleep Duration`             6.030e-02  7.542e-01   0.080  0.93627   
## `Quality of Sleep`          -1.811e+00  6.719e-01  -2.695  0.00703 **
## `BMI Category`Normal Weight  1.287e-01  9.332e-01   0.138  0.89026   
## `BMI Category`Obese          1.906e+01  1.137e+03   0.017  0.98662   
## `BMI Category`Overweight     2.206e+00  8.115e-01   2.719  0.00655 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 507.47  on 373  degrees of freedom
## Residual deviance: 202.38  on 363  degrees of freedom
## AIC: 224.38
## 
## Number of Fisher Scoring iterations: 16

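The AIC of m1 is 224.38.
Age, Quality of Sleep, and the Overweight BMI category are the statistically significant predictors of sleep disorder in m1 (all with p < 0.01). The Obese coefficient (19.06, standard error 1137) is not reliably estimated, which usually indicates that almost all observations in that category share the same outcome.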
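
To make the effect sizes easier to read, the log-odds estimates can be converted to odds ratios by exponentiation. The sketch below hard-codes three estimates copied from the `summary(m1)` output rather than refitting the model; the vector name `m1_estimates` and its element labels are introduced here for illustration only.

```r
# Estimates copied from the summary(m1) output (log-odds scale)
m1_estimates <- c(
  "Age"              = 0.1766,
  "Quality of Sleep" = -1.811,
  "BMI Overweight"   = 2.206
)

# Exponentiate to move from log-odds to odds ratios
odds_ratios <- exp(m1_estimates)
print(round(odds_ratios, 3))
```

An odds ratio above 1 (Age, Overweight) corresponds to higher odds of a sleep disorder, and one below 1 (Quality of Sleep) to lower odds, holding the other predictors fixed. With the fitted object in memory, `exp(coef(m1))` should give the same quantities for every term.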

m2 <- glm(Sleep_Disorder ~ (Age + Gender + `Stress Level` + `Heart Rate` + `Daily Steps` + `Sleep Duration` + `Quality of Sleep` + `BMI Category`)^2,
          data = model_data,
          family = binomial)
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
summary(m2)
## 
## Call:
## glm(formula = Sleep_Disorder ~ (Age + Gender + `Stress Level` + 
##     `Heart Rate` + `Daily Steps` + `Sleep Duration` + `Quality of Sleep` + 
##     `BMI Category`)^2, family = binomial, data = model_data)
## 
## Coefficients: (2 not defined because of singularities)
##                                                  Estimate Std. Error   z value
## (Intercept)                                     6.107e+17  1.052e+10  58067770
## Age                                            -4.470e+15  1.359e+08 -32895498
## GenderMale                                     -3.104e+16  1.193e+09 -26018208
## `Stress Level`                                  1.622e+16  5.343e+08  30366615
## `Heart Rate`                                   -1.024e+16  1.161e+08 -88192939
## `Daily Steps`                                  -2.331e+13  5.336e+05 -43675824
## `Sleep Duration`                               -1.041e+17  1.305e+09 -79763414
## `Quality of Sleep`                              7.012e+16  1.095e+09  64060574
## `BMI Category`Normal Weight                    -8.992e+16  3.062e+09 -29363604
## `BMI Category`Obese                            -1.000e+15  7.851e+09   -127388
## `BMI Category`Overweight                       -2.493e+16  2.814e+09  -8859376
## Age:GenderMale                                  6.578e+13  6.127e+06  10735123
## Age:`Stress Level`                             -8.043e+13  4.212e+06 -19096756
## Age:`Heart Rate`                                6.797e+13  1.548e+06  43908654
## Age:`Daily Steps`                              -6.105e+09  2.495e+03  -2447261
## Age:`Sleep Duration`                           -2.554e+14  6.995e+06 -36513742
## Age:`Quality of Sleep`                          2.613e+14  6.049e+06  43202540
## Age:`BMI Category`Normal Weight                -8.330e+13  8.215e+06 -10138999
## Age:`BMI Category`Obese                        -7.944e+14  2.760e+07 -28780614
## Age:`BMI Category`Overweight                   -1.547e+14  1.103e+07 -14019958
## GenderMale:`Stress Level`                      -5.843e+13  5.599e+07  -1043589
## GenderMale:`Heart Rate`                         5.457e+14  1.369e+07  39864744
## GenderMale:`Daily Steps`                       -2.090e+11  5.389e+04  -3878167
## GenderMale:`Sleep Duration`                     3.398e+15  1.002e+08  33898487
## GenderMale:`Quality of Sleep`                  -4.232e+15  8.720e+07 -48528256
## GenderMale:`BMI Category`Normal Weight         -3.964e+14  8.254e+07  -4802403
## GenderMale:`BMI Category`Obese                 -1.175e+16  1.881e+08 -62492391
## GenderMale:`BMI Category`Overweight            -6.878e+15  1.019e+08 -67515029
## `Stress Level`:`Heart Rate`                     7.687e+13  4.181e+06  18387900
## `Stress Level`:`Daily Steps`                   -6.100e+11  2.279e+04 -26766619
## `Stress Level`:`Sleep Duration`                -4.850e+15  5.369e+07 -90343297
## `Stress Level`:`Quality of Sleep`               2.756e+15  3.299e+07  83559672
## `Stress Level`:`BMI Category`Normal Weight      3.109e+13  7.551e+07    411796
## `Stress Level`:`BMI Category`Obese             -1.836e+15  1.930e+08  -9510505
## `Stress Level`:`BMI Category`Overweight         1.537e+15  8.597e+07  17872773
## `Heart Rate`:`Daily Steps`                      2.846e+11  6.578e+03  43263793
## `Heart Rate`:`Sleep Duration`                   1.690e+15  1.870e+07  90396015
## `Heart Rate`:`Quality of Sleep`                -1.048e+15  1.110e+07 -94452286
## `Heart Rate`:`BMI Category`Normal Weight        1.647e+15  2.713e+07  60712245
## `Heart Rate`:`BMI Category`Obese                9.370e+14  7.030e+07  13327559
## `Heart Rate`:`BMI Category`Overweight           6.565e+14  2.946e+07  22287388
## `Daily Steps`:`Sleep Duration`                  2.926e+12  2.672e+04 109510445
## `Daily Steps`:`Quality of Sleep`               -1.632e+12  3.649e+04 -44726322
## `Daily Steps`:`BMI Category`Normal Weight      -5.421e+11  4.347e+04 -12471461
## `Daily Steps`:`BMI Category`Obese               1.008e+12  6.897e+05   1461519
## `Daily Steps`:`BMI Category`Overweight         -2.006e+12  5.659e+04 -35445375
## `Sleep Duration`:`Quality of Sleep`            -5.328e+14  6.829e+07  -7800958
## `Sleep Duration`:`BMI Category`Normal Weight    6.518e+15  2.526e+08  25802640
## `Sleep Duration`:`BMI Category`Obese                   NA         NA        NA
## `Sleep Duration`:`BMI Category`Overweight       6.110e+15  1.088e+08  56144728
## `Quality of Sleep`:`BMI Category`Normal Weight -8.357e+15  1.408e+08 -59336916
## `Quality of Sleep`:`BMI Category`Obese                 NA         NA        NA
## `Quality of Sleep`:`BMI Category`Overweight    -6.107e+15  1.713e+08 -35642973
##                                                Pr(>|z|)    
## (Intercept)                                      <2e-16 ***
## Age                                              <2e-16 ***
## GenderMale                                       <2e-16 ***
## `Stress Level`                                   <2e-16 ***
## `Heart Rate`                                     <2e-16 ***
## `Daily Steps`                                    <2e-16 ***
## `Sleep Duration`                                 <2e-16 ***
## `Quality of Sleep`                               <2e-16 ***
## `BMI Category`Normal Weight                      <2e-16 ***
## `BMI Category`Obese                              <2e-16 ***
## `BMI Category`Overweight                         <2e-16 ***
## Age:GenderMale                                   <2e-16 ***
## Age:`Stress Level`                               <2e-16 ***
## Age:`Heart Rate`                                 <2e-16 ***
## Age:`Daily Steps`                                <2e-16 ***
## Age:`Sleep Duration`                             <2e-16 ***
## Age:`Quality of Sleep`                           <2e-16 ***
## Age:`BMI Category`Normal Weight                  <2e-16 ***
## Age:`BMI Category`Obese                          <2e-16 ***
## Age:`BMI Category`Overweight                     <2e-16 ***
## GenderMale:`Stress Level`                        <2e-16 ***
## GenderMale:`Heart Rate`                          <2e-16 ***
## GenderMale:`Daily Steps`                         <2e-16 ***
## GenderMale:`Sleep Duration`                      <2e-16 ***
## GenderMale:`Quality of Sleep`                    <2e-16 ***
## GenderMale:`BMI Category`Normal Weight           <2e-16 ***
## GenderMale:`BMI Category`Obese                   <2e-16 ***
## GenderMale:`BMI Category`Overweight              <2e-16 ***
## `Stress Level`:`Heart Rate`                      <2e-16 ***
## `Stress Level`:`Daily Steps`                     <2e-16 ***
## `Stress Level`:`Sleep Duration`                  <2e-16 ***
## `Stress Level`:`Quality of Sleep`                <2e-16 ***
## `Stress Level`:`BMI Category`Normal Weight       <2e-16 ***
## `Stress Level`:`BMI Category`Obese               <2e-16 ***
## `Stress Level`:`BMI Category`Overweight          <2e-16 ***
## `Heart Rate`:`Daily Steps`                       <2e-16 ***
## `Heart Rate`:`Sleep Duration`                    <2e-16 ***
## `Heart Rate`:`Quality of Sleep`                  <2e-16 ***
## `Heart Rate`:`BMI Category`Normal Weight         <2e-16 ***
## `Heart Rate`:`BMI Category`Obese                 <2e-16 ***
## `Heart Rate`:`BMI Category`Overweight            <2e-16 ***
## `Daily Steps`:`Sleep Duration`                   <2e-16 ***
## `Daily Steps`:`Quality of Sleep`                 <2e-16 ***
## `Daily Steps`:`BMI Category`Normal Weight        <2e-16 ***
## `Daily Steps`:`BMI Category`Obese                <2e-16 ***
## `Daily Steps`:`BMI Category`Overweight           <2e-16 ***
## `Sleep Duration`:`Quality of Sleep`              <2e-16 ***
## `Sleep Duration`:`BMI Category`Normal Weight     <2e-16 ***
## `Sleep Duration`:`BMI Category`Obese                 NA    
## `Sleep Duration`:`BMI Category`Overweight        <2e-16 ***
## `Quality of Sleep`:`BMI Category`Normal Weight   <2e-16 ***
## `Quality of Sleep`:`BMI Category`Obese               NA    
## `Quality of Sleep`:`BMI Category`Overweight      <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance:  507.47  on 373  degrees of freedom
## Residual deviance: 1730.10  on 323  degrees of freedom
## AIC: 1832.1
## 
## Number of Fisher Scoring iterations: 12
m3<-step(m1)
## Start:  AIC=224.38
## Sleep_Disorder ~ Age + Gender + `Stress Level` + `Heart Rate` + 
##     `Daily Steps` + `Sleep Duration` + `Quality of Sleep` + `BMI Category`
## 
##                      Df Deviance    AIC
## - `Sleep Duration`    1   202.38 222.38
## - Gender              1   202.44 222.44
## - `Heart Rate`        1   202.51 222.51
## - `Stress Level`      1   203.45 223.45
## - `Daily Steps`       1   203.64 223.64
## <none>                    202.38 224.38
## - `Quality of Sleep`  1   210.77 230.77
## - `BMI Category`      3   215.48 231.48
## - Age                 1   213.68 233.68
## 
## Step:  AIC=222.38
## Sleep_Disorder ~ Age + Gender + `Stress Level` + `Heart Rate` + 
##     `Daily Steps` + `Quality of Sleep` + `BMI Category`
## 
##                      Df Deviance    AIC
## - Gender              1   202.44 220.44
## - `Heart Rate`        1   202.58 220.58
## - `Daily Steps`       1   203.64 221.64
## - `Stress Level`      1   203.80 221.80
## <none>                    202.38 222.38
## - `Quality of Sleep`  1   211.47 229.47
## - Age                 1   215.97 233.97
## - `BMI Category`      3   221.98 235.98
## 
## Step:  AIC=220.44
## Sleep_Disorder ~ Age + `Stress Level` + `Heart Rate` + `Daily Steps` + 
##     `Quality of Sleep` + `BMI Category`
## 
##                      Df Deviance    AIC
## - `Heart Rate`        1   202.69 218.69
## - `Daily Steps`       1   203.68 219.68
## <none>                    202.44 220.44
## - `Stress Level`      1   204.74 220.74
## - `Quality of Sleep`  1   214.02 230.02
## - `BMI Category`      3   222.97 234.97
## - Age                 1   220.63 236.63
## 
## Step:  AIC=218.69
## Sleep_Disorder ~ Age + `Stress Level` + `Daily Steps` + `Quality of Sleep` + 
##     `BMI Category`
## 
##                      Df Deviance    AIC
## - `Daily Steps`       1   203.76 217.76
## <none>                    202.69 218.69
## - `Stress Level`      1   204.98 218.98
## - `Quality of Sleep`  1   214.17 228.17
## - Age                 1   220.65 234.65
## - `BMI Category`      3   241.66 251.66
## 
## Step:  AIC=217.76
## Sleep_Disorder ~ Age + `Stress Level` + `Quality of Sleep` + 
##     `BMI Category`
## 
##                      Df Deviance    AIC
## - `Stress Level`      1   205.24 217.24
## <none>                    203.76 217.76
## - `Quality of Sleep`  1   214.23 226.23
## - Age                 1   221.43 233.43
## - `BMI Category`      3   241.93 249.93
## 
## Step:  AIC=217.24
## Sleep_Disorder ~ Age + `Quality of Sleep` + `BMI Category`
## 
##                      Df Deviance    AIC
## <none>                    205.24 217.24
## - Age                 1   221.50 231.50
## - `Quality of Sleep`  1   222.20 232.20
## - `BMI Category`      3   257.00 263.00
summary(m3)
## 
## Call:
## glm(formula = Sleep_Disorder ~ Age + `Quality of Sleep` + `BMI Category`, 
##     family = binomial, data = model_data)
## 
## Coefficients:
##                               Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                   -0.88731    1.19552  -0.742 0.457965    
## Age                            0.15800    0.04118   3.837 0.000124 ***
## `Quality of Sleep`            -1.05602    0.28015  -3.769 0.000164 ***
## `BMI Category`Normal Weight    0.75233    0.72263   1.041 0.297829    
## `BMI Category`Obese           19.49590 1178.11022   0.017 0.986797    
## `BMI Category`Overweight       2.77389    0.54659   5.075 3.88e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 507.47  on 373  degrees of freedom
## Residual deviance: 205.24  on 368  degrees of freedom
## AIC: 217.24
## 
## Number of Fisher Scoring iterations: 16

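The final model is a logistic regression, so the linear predictor describes the log-odds of a sleep disorder rather than the outcome itself:

\[ \log\frac{p}{1-p} = \beta_0 + \beta_1 \cdot \text{Age} + \beta_2 \cdot \text{Quality of Sleep} + \beta_3 \cdot \text{BMI Category} \] where \(p = P(\text{Sleep Disorder} = 1)\) and:
- \(\beta_0\) is the intercept;
- \(\beta_1\) is the coefficient of Age;
- \(\beta_2\) is the coefficient of Quality of Sleep;
- \(\beta_3\) is the coefficient of BMI Category (in the fitted model, one coefficient for each non-reference level: Normal Weight, Obese, and Overweight).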
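
As a worked example of how the m3 coefficients translate into a prediction, the sketch below applies the inverse logit to the linear predictor. The coefficients are copied from the `summary(m3)` output; the profile (age 40, sleep quality 6, Overweight) and all variable names are hypothetical and chosen only for illustration.

```r
# Coefficients copied from the summary(m3) output (log-odds scale)
b_intercept  <- -0.88731
b_age        <-  0.15800
b_quality    <- -1.05602
b_overweight <-  2.77389

# Hypothetical profile (an assumption, not a row from the dataset)
age     <- 40
quality <- 6

# Linear predictor for an Overweight person, then the inverse logit
eta   <- b_intercept + b_age * age + b_quality * quality + b_overweight
p_hat <- plogis(eta)
print(round(p_hat, 3))
```

For this profile the predicted probability of a sleep disorder comes out at roughly 0.87. With the fitted object available, `predict(m3, newdata = ..., type = "response")` should perform the same calculation.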

# Load the required packages
library(ggplot2)
library(dplyr)
# Plot residuals against fitted values
plot(fitted(m3), residuals(m3), 
     xlab = "Fitted Values", 
     ylab = "Residuals", 
     main = "Residuals vs Fitted Values")
abline(h = 0, col = "red", lty = 2)

# Extract the standardized residuals
std_residuals <- rstandard(m3)

# Draw the Q-Q plot
qqnorm(std_residuals, 
       main = "Normal Q-Q Plot of Residuals")
qqline(std_residuals, col = "blue", lty = 2)

library(lmtest)
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
bptest(m3)
## 
##  studentized Breusch-Pagan test
## 
## data:  m3
## BP = 8.4072, df = 5, p-value = 0.1352

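BP statistic: 8.4072; degrees of freedom (df): 5; p-value: 0.1352
Because the p-value is greater than 0.05, we fail to reject the null hypothesis of constant residual variance (homoscedasticity), so the test gives no evidence that the residual variance is unstable. Note that the Breusch-Pagan test is designed for linear regression; for a binomial GLM, where the variance depends on the mean by construction, it should be read as a rough check only.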
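
The reported p-value can be checked by hand, since the studentized Breusch-Pagan statistic is compared against a chi-squared distribution with the stated degrees of freedom. The sketch below uses only the two numbers printed by `bptest(m3)`; the variable names are illustrative.

```r
# Values copied from the bptest(m3) output
bp_stat <- 8.4072
bp_df   <- 5

# Upper-tail probability of a chi-squared distribution with bp_df degrees of freedom
p_value <- pchisq(bp_stat, df = bp_df, lower.tail = FALSE)
print(round(p_value, 4))
```

This should agree with the p-value of 0.1352 shown in the test output.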