Topic: Proportions of US Residents Who Died of Lyme Disease in 2015 (Week 1 to Week 52).

Introduction

Background Information

Lyme disease, also called Lyme borreliosis, is a bacterial infection transmitted by the bite of Ixodes ticks, and only by ticks that have previously fed on an infected animal. It can be transmitted to pets and humans. The most common symptoms are headache, chills, fever, and myalgia. A visible sign is a red or purple circular mark on the skin where the tick has bitten. A doctor needs to be consulted only if the bitten person becomes sick. The disease is treated with antibiotics (doxycycline, amoxicillin, or cefuroxime). A tick can be removed from the body at home with fine-tipped tweezers or a tick puller. To avoid ticks and their bites, tick repellent can be sprayed on clothing.

Data

Collection

The data were collected by observation. Three variables are numerical (week, Lyme disease death count, and year, here 2015) and one is categorical (reporting area). The data describe the proportions of deaths among US residents who contracted Lyme disease. There are two sources of bias. First, the data are incomplete: values are missing for C.N.M.I., American Samoa, Guam, Puerto Rico, and the Virgin Islands. Consequently, we exclude those territories when answering the questions. Second, we assume that the reported counts are human deaths from Lyme disease.
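
The exclusion of the territories can be sketched in base R as follows. The toy data frame, the `count` column, and the exact territory spellings are assumptions for illustration only; the spellings actually used in lymeweek.csv should be checked before filtering.

```r
# Toy stand-in for the weekly table; values are illustrative, not real data.
toy <- data.frame(
  Reporting_Area = c("ALABAMA", "GUAM", "PUERTO RICO", "VERMONT"),
  MMWR_Week      = c(1, 1, 1, 1),
  count          = c(NA, NA, NA, 8),
  stringsAsFactors = FALSE
)

# Assumed spellings of the excluded territories (verify against the file).
territories <- c("C.N.M.I.", "AMER. SAMOA", "GUAM", "PUERTO RICO", "VIRGIN ISL.")

# Keep only rows whose reporting area is not an excluded territory.
states_only <- toy[!(toy$Reporting_Area %in% territories), ]
states_only$Reporting_Area
```

Filtering by name keeps the exclusion explicit and repeatable, instead of relying on rows being dropped by hand.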

Information

The Centers for Disease Control and Prevention (CDC) publishes the Morbidity and Mortality Weekly Report (MMWR) series every week. The MMWR series is the CDC’s primary medium for scientific publication of timely, credible, definitive, correct, impartial, and useful public health facts and recommendations. Physicians, nurses, public health professionals, epidemiologists and other scientists, academics, students, and laboratorians are among the most frequent readers of the MMWR. Here, the MMWR data begin with the 1st week of 2015 and end with the 52nd week of 2015. The proportions of Lyme disease deaths are reported by region and state in the USA.

Datasets

library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.0.4
## -- Attaching packages --------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.3     v purrr   0.3.4
## v tibble  3.0.6     v dplyr   1.0.3
## v tidyr   1.1.2     v stringr 1.4.0
## v readr   1.4.0     v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
setwd("C:/Users/auria/OneDrive/Bureau/MATH 217/Final Project")
lyme <- read_csv("lymeweek.csv")
## 
## -- Column specification --------------------------------------------------------
## cols(
##   Reporting_Area = col_character(),
##   `MMWR Year` = col_double(),
##   MMWR_Week = col_double(),
##   `Lyme disease, Cum 2014` = col_double()
## )
glimpse(lyme)
## Rows: 3,380
## Columns: 4
## $ Reporting_Area           <chr> "ALABAMA", "ALASKA", "ARIZONA", "ARKANSAS"...
## $ `MMWR Year`              <dbl> 2015, 2015, 2015, 2015, 2015, 2015, 2015, ...
## $ MMWR_Week                <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
## $ `Lyme disease, Cum 2014` <dbl> NA, NA, NA, NA, 1, NA, 11, 3, NA, 8, 1, 1,...
view(lyme)

Questions

  1. Does the proportion of Lyme disease deaths vary between regions in the USA?
  2. Does the proportion of Lyme disease deaths vary within regions in the USA?

Answers to the Research Questions (Week 1 Study)

Between Regions

Does the proportion of Lyme disease deaths vary between regions in the USA?

Bar Chart of the Number of Lyme Disease Deaths in Different Regions

allregions1 <- c("New England", "Mid Atlantic", "E.N. Central", "W.N. Central", "S. Atlantic", "E.S. Central", "W.S. Central", "Mountain", "Pacific")
allcases1 <- c(124, 186, 13, 2, 44, 1, 0, 1, 2)
df1 <- data.frame(allregions1, allcases1)
df1
##    allregions1 allcases1
## 1  New England       124
## 2 Mid Atlantic       186
## 3 E.N. Central        13
## 4 W.N. Central         2
## 5  S. Atlantic        44
## 6 E.S. Central         1
## 7 W.S. Central         0
## 8     Mountain         1
## 9      Pacific         2
plot1 <- df1%>%
  ggplot(aes(x=allregions1, y = allcases1, fill = allregions1)) +
  geom_bar(stat = "identity", position = "dodge") +
  theme(axis.text.x = element_text(angle = 45)) + 
  ggtitle("Number of Death by Lyme in Different Regions")+
  ylab("Reported Cases") +
  xlab("Regions") 
plot1

Answer

Ho: There is no difference in the proportion of Lyme disease deaths between US regions. Ha: There is a difference in the proportion of Lyme disease deaths between US regions.

392
## [1] 392
392/9
## [1] 43.55556
null.probs = c(44/392, 47/392, 43/392, 43/392, 43/392, 43/392, 43/392, 43/392, 43/392)
allcases1 = c(124, 186, 13, 2, 44, 1, 0, 1, 2)
chisq.test(allcases1, p=null.probs)
## 
##  Chi-squared test for given probabilities
## 
## data:  allcases1
## X-squared = 819.53, df = 8, p-value < 2.2e-16

Answer: p-value < α. Reject the null. There is very strong evidence that the proportion of Lyme disease deaths differs between US regions.
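
A null hypothesis of equal proportions across the nine regions corresponds to a probability of exactly 1/9 per region, which avoids hand-rounded fractions. The sketch below is one way to express that. It is not the code that produced the output above, so its statistic will not match that output exactly. It also makes visible that the nine regional counts sum to 373, not to the national figure of 392. The names `total_regions`, `uniform_probs`, and `res_uniform` are introduced here for illustration.

```r
allcases1 <- c(124, 186, 13, 2, 44, 1, 0, 1, 2)

# Total of the nine regional counts (differs from the national row of 392).
total_regions <- sum(allcases1)

# Uniform null: each region has probability 1/9.
uniform_probs <- rep(1 / length(allcases1), length(allcases1))

# chisq.test() uses a uniform null by default; p = uniform_probs is
# written out only to make the assumption explicit.
res_uniform <- chisq.test(allcases1, p = uniform_probs)
res_uniform$expected   # expected count per region under the null
res_uniform$p.value
```

Using `rep()` guarantees that the probabilities sum to 1 and that every region receives the same expected count.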

Within Regions

Does the proportion of Lyme disease deaths vary within regions in the USA?

New England

Bar Chart of the Number of Lyme Disease Deaths in New England

newenglands1 <- c("Connecticut", "Maine", "Massachusetts", "New Hampshire", "Rhode Island", "Vermont")
casenewenglands1 <- c(30, 15, 42, 5, 9, 8)
df2 <- data.frame(newenglands1, casenewenglands1)
df2
##    newenglands1 casenewenglands1
## 1   Connecticut               30
## 2         Maine               15
## 3 Massachusetts               42
## 4 New Hampshire                5
## 5  Rhode Island                9
## 6       Vermont                8
plot2 <- df2%>%
  ggplot(aes(x=newenglands1, y = casenewenglands1, fill = newenglands1)) +
  geom_bar(stat = "identity", position = "dodge") +
  theme(axis.text.x = element_text(angle = 30)) + 
  ggtitle("Number of Death by Lyme in New England")+
  ylab("Reported Cases") +
  xlab("States")
plot2

Answer

Ho: There is no difference in the proportion of Lyme disease deaths among the New England states. Ha: There is a difference in the proportion of Lyme disease deaths among the New England states.

30+15+42+5+9+8
## [1] 109
109/6
## [1] 18.16667
null.probs = c(18/109, 18/109, 19/109, 18/109, 18/109, 18/109)
casenewenglands1 = c(30, 15, 42, 5, 9, 8)
chisq.test(casenewenglands1, p=null.probs)
## 
##  Chi-squared test for given probabilities
## 
## data:  casenewenglands1
## X-squared = 55.787, df = 5, p-value = 8.992e-11

Answer: p-value < α. Reject the null. There is very strong evidence that the proportion of Lyme disease deaths differs among the New England states.
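
Because the same test is repeated for every region, it can be wrapped in a small function. The helper `gof_uniform()` below is hypothetical (it is not part of the original analysis) and tests against exactly equal proportions, so its statistic can differ slightly from the output above, which used rounded probabilities.

```r
# Hypothetical helper: goodness-of-fit test against equal proportions.
gof_uniform <- function(counts) {
  k <- length(counts)
  chisq.test(counts, p = rep(1 / k, k))
}

casenewenglands1 <- c(30, 15, 42, 5, 9, 8)
res_ne <- gof_uniform(casenewenglands1)
res_ne$parameter  # degrees of freedom (number of states minus 1)
res_ne$p.value
```

The same call, for example `gof_uniform(c(39, 44, 79))` for the Mid Atlantic, can be reused for any region with at least one nonzero count.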

Mid Atlantic

Bar Chart of the Number of Lyme Disease Deaths in Mid Atlantic

ma1 <- c("New Jersey", "New York", "Pennsylvania")
casema1 <- c(39, 44, 79)
df3 <- data.frame(ma1, casema1)
df3
##            ma1 casema1
## 1   New Jersey      39
## 2     New York      44
## 3 Pennsylvania      79
plot3 <- df3%>%
  ggplot(aes(x=ma1, y = casema1, fill = ma1)) +
  geom_bar(stat = "identity", position = "dodge") +
  theme(axis.text.x = element_text(angle = 360)) + 
  ggtitle("Number of Death by Lyme in Mid. Atlantic")+
  ylab("Reported Cases") +
  xlab("States")
plot3

Answer

Ho: There is no difference in the proportion of Lyme disease deaths among the Mid Atlantic states. Ha: There is a difference in the proportion of Lyme disease deaths among the Mid Atlantic states.

39+44+79
## [1] 162
162/3
## [1] 54
null.probs = c(54/162, 54/162, 54/162)
casema1 = c(39, 44, 79)
chisq.test(casema1, p=null.probs)
## 
##  Chi-squared test for given probabilities
## 
## data:  casema1
## X-squared = 17.593, df = 2, p-value = 0.0001513

Answer: p-value < α. Reject the null. There is very strong evidence that the proportion of Lyme disease deaths differs among the Mid Atlantic states.

E. N. Central

Bar Chart of the Number of Lyme Disease Deaths in E. N. Central

enc1 <- c("Illinois", "Indiana", "Michigan", "Ohio", "Wisconsin")
caseenc1 <- c(1, 1, 1, 1, 10)
df4 <- data.frame(enc1, caseenc1)
df4
##        enc1 caseenc1
## 1  Illinois        1
## 2   Indiana        1
## 3  Michigan        1
## 4      Ohio        1
## 5 Wisconsin       10
plot4 <- df4%>%
  ggplot(aes(x=enc1, y = caseenc1, fill = enc1)) +
  geom_bar(stat = "identity", position = "dodge") +
  theme(axis.text.x = element_text(angle = 30)) + 
  ggtitle("Number of Death by Lyme in E. N. Central")+
  ylab("Reported Cases") +
  xlab("States")
plot4

Answer

Ho: There is no difference in the proportion of Lyme disease deaths among the E. N. Central states. Ha: There is a difference in the proportion of Lyme disease deaths among the E. N. Central states.

1+1+1+1+10
## [1] 14
14/5
## [1] 2.8
null.probs = c(2.8/14, 2.8/14, 2.8/14, 2.8/14, 2.8/14)
caseenc1 = c(1,1,1,1,10)
chisq.test(caseenc1, p=null.probs)
## Warning in chisq.test(caseenc1, p = null.probs): Chi-squared approximation may
## be incorrect
## 
##  Chi-squared test for given probabilities
## 
## data:  caseenc1
## X-squared = 23.143, df = 4, p-value = 0.0001186

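Answer: p-value < α. Reject the null. There is very strong evidence that the proportion of Lyme disease deaths differs among the E. N. Central states. Because every expected count is 2.8, the chi-squared approximation is unreliable here (see the warning above), so this conclusion should be read with caution.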
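
The warning "Chi-squared approximation may be incorrect" appears when expected counts are small (a common rule of thumb is below 5). One option, sketched here as an alternative rather than as the method used in this report, is to let `chisq.test()` estimate the p-value by Monte Carlo simulation. The seed and the number of replicates `B` are arbitrary choices.

```r
caseenc1 <- c(1, 1, 1, 1, 10)
k <- length(caseenc1)

# Expected counts under equal proportions; all are below 5 here.
expected_enc <- sum(caseenc1) * rep(1 / k, k)

set.seed(217)  # arbitrary seed so the simulation is reproducible
res_sim <- chisq.test(caseenc1, p = rep(1 / k, k),
                      simulate.p.value = TRUE, B = 10000)
res_sim$p.value
```

The test statistic is unchanged by this option. Only the way the p-value is obtained differs, so the result does not rely on the large-sample approximation.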

W. N. Central

Bar Chart of the Number of Lyme Disease Deaths in W. N. Central

wnc1 <- c("Iowa", "Kansas", "Minnesota", "Missouri", "Nebraska", "North Dakota", "South Dakota")
casewnc1 <- c(1, 0, 0, 0, 0, 0, 0)
df10 <- data.frame(wnc1, casewnc1)
df10
##           wnc1 casewnc1
## 1         Iowa        1
## 2       Kansas        0
## 3    Minnesota        0
## 4     Missouri        0
## 5     Nebraska        0
## 6 North Dakota        0
## 7 South Dakota        0
plot10 <- df10 %>%
  ggplot(aes(x=wnc1, y = casewnc1, fill = wnc1)) +
  geom_bar(stat = "identity", position = "dodge") +
  theme(axis.text.x = element_text(angle = 30)) + 
  ggtitle("Number of Death by Lyme in W. N. Central")+
  ylab("Reported Cases") +
  xlab("States")
plot10

Answer

Ho: There is no difference in the proportion of Lyme disease deaths among the W. N. Central states. Ha: There is a difference in the proportion of Lyme disease deaths among the W. N. Central states.

1
## [1] 1
1/7
## [1] 0.1428571
null.probs = c(0.4, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1)
casewnc1 = c(1, 0, 0, 0, 0, 0, 0)
chisq.test(casewnc1, p=null.probs)
## Warning in chisq.test(casewnc1, p = null.probs): Chi-squared approximation may
## be incorrect
## 
##  Chi-squared test for given probabilities
## 
## data:  casewnc1
## X-squared = 1.5, df = 6, p-value = 0.9595

Answer: p-value > α. Fail to reject the null. There is no compelling evidence that the proportion of Lyme disease deaths differs among the W. N. Central states.

S. Atlantic

Bar Chart of the Number of Lyme Disease Deaths in S. Atlantic

sa1 <- c("Delaware", "DC", "Florida", "Georgia", "Maryland", "North Carolina", "South Carolina", "Virginia", "West Virginia")
casesa1 <- c(6, 0, 2, 0, 15, 0, 0, 16, 1)
df5 <- data.frame(sa1, casesa1)
df5
##              sa1 casesa1
## 1       Delaware       6
## 2             DC       0
## 3        Florida       2
## 4        Georgia       0
## 5       Maryland      15
## 6 North Carolina       0
## 7 South Carolina       0
## 8       Virginia      16
## 9  West Virginia       1
plot5 <- df5%>%
  ggplot(aes(x=sa1, y = casesa1, fill = sa1)) +
  geom_bar(stat = "identity", position = "dodge") +
  theme(axis.text.x = element_text(angle = 30)) + 
  ggtitle("Number of Death by Lyme in S. Atlantic")+
  ylab("Reported Cases") +
  xlab("States")
plot5

Answer

Ho: There is no difference in the proportion of Lyme disease deaths among the S. Atlantic states. Ha: There is a difference in the proportion of Lyme disease deaths among the S. Atlantic states.

6+0+2+0+15+0+0+16+1
## [1] 40
40/9
## [1] 4.444444
null.probs = c(4/40, 4/40, 4/40, 4/40, 4/40, 4/40, 4/40, 8/40, 4/40)
casesa1 = c(6, 0, 2, 0, 15, 0, 0, 16, 1)
chisq.test(casesa1, p=null.probs)
## Warning in chisq.test(casesa1, p = null.probs): Chi-squared approximation may be
## incorrect
## 
##  Chi-squared test for given probabilities
## 
## data:  casesa1
## X-squared = 58.5, df = 8, p-value = 9.17e-10

Answer: p-value < α. Reject the null. There is very strong evidence that the proportion of Lyme disease deaths differs among the S. Atlantic states.

E. S. Central

Bar Chart of the Number of Lyme Disease Deaths in E. S. Central

esc1 <- c("Alabama", "Kentucky", "Mississippi", "Tennessee")
caseesc1 <- c(0, 1, 0, 0)
df6 <- data.frame(esc1, caseesc1)
df6
##          esc1 caseesc1
## 1     Alabama        0
## 2    Kentucky        1
## 3 Mississippi        0
## 4   Tennessee        0
plot6 <- df6%>%
  ggplot(aes(x=esc1, y = caseesc1, fill = esc1)) +
  geom_bar(stat = "identity", position = "dodge") +
  theme(axis.text.x = element_text(angle = 360)) + 
  ggtitle("Number of Death by Lyme in E. S. Central")+
  ylab("Reported Cases") +
  xlab("States")
plot6

Answer

Ho: There is no difference in the proportion of Lyme disease deaths among the E. S. Central states. Ha: There is a difference in the proportion of Lyme disease deaths among the E. S. Central states.

1
## [1] 1
1/4
## [1] 0.25
null.probs = c(0.25, 0.25, 0.25, 0.25)
caseesc1 = c(0, 1, 0, 0)
chisq.test(caseesc1, p=null.probs)
## Warning in chisq.test(caseesc1, p = null.probs): Chi-squared approximation may
## be incorrect
## 
##  Chi-squared test for given probabilities
## 
## data:  caseesc1
## X-squared = 3, df = 3, p-value = 0.3916

Answer: p-value > α. Fail to reject the null. There is no compelling evidence that the proportion of Lyme disease deaths differs among the E. S. Central states.

W. S. Central

Bar Chart of the Number of Lyme Disease Deaths in W. S. Central

wsc1 <- c("Arkansas", "Louisiana", "Oklahoma", "Texas")
casewsc1 <- c(0, 0, 0, 0)
df7 <- data.frame(wsc1, casewsc1)
df7
##        wsc1 casewsc1
## 1  Arkansas        0
## 2 Louisiana        0
## 3  Oklahoma        0
## 4     Texas        0
plot7 <- df7%>%
  ggplot(aes(x=wsc1, y = casewsc1, fill = wsc1)) +
  geom_bar(stat = "identity", position = "dodge") +
  theme(axis.text.x = element_text(angle = 360)) + 
  ggtitle("Number of Death by Lyme in W. S. Central")+
  ylab("Reported Cases") +
  xlab("States")
plot7

Answer

Ho: There is no difference in the proportion of Lyme disease deaths among the W. S. Central states. Ha: There is a difference in the proportion of Lyme disease deaths among the W. S. Central states.

Answer: All four counts are zero, so there is nothing to distribute among the states and the chi-squared statistic cannot be computed. By inspection, the states do not differ. Fail to reject the null. There is no compelling evidence that the proportion of Lyme disease deaths differs among the W. S. Central states.
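
When every count in a region is zero, `chisq.test()` does not return a p-value; it stops with an error. The sketch below wraps the call in `tryCatch()` to capture the error message instead of halting the script. The name `outcome` is introduced here for illustration.

```r
casewsc1 <- c(0, 0, 0, 0)

# chisq.test() stops with an error when every count is zero, so the
# call is wrapped in tryCatch() to capture the message instead of halting.
outcome <- tryCatch(
  chisq.test(casewsc1, p = rep(0.25, 4)),
  error = function(e) conditionMessage(e)
)
outcome
```

This is why the W. S. Central region is assessed by inspection rather than with a computed test.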

Mountain

Bar Chart of the Number of Lyme Disease Deaths in Mountain

m1 <- c("Arizona", "Colorado", "Idaho", "Montana", "Nevada", "New Mexico", "Utah", "Wyoming") 
casem1 <- c(0, 0, 0, 0, 1, 0, 0, 0)
df8 <- data.frame(m1, casem1)
df8
##           m1 casem1
## 1    Arizona      0
## 2   Colorado      0
## 3      Idaho      0
## 4    Montana      0
## 5     Nevada      1
## 6 New Mexico      0
## 7       Utah      0
## 8    Wyoming      0
plot8 <- df8%>%
  ggplot(aes(x=m1, y = casem1, fill = m1)) +
  geom_bar(stat = "identity", position = "dodge") +
  theme(axis.text.x = element_text(angle = 30)) + 
  ggtitle("Number of Death by Lyme in Mountain")+
  ylab("Reported Cases") +
  xlab("States")
plot8

Answer

Ho: There is no difference in the proportion of Lyme disease deaths among the Mountain states. Ha: There is a difference in the proportion of Lyme disease deaths among the Mountain states.

1
## [1] 1
1/8
## [1] 0.125
null.probs = c(0.1, 0.1, 0.1, 0.1, 0.3, 0.1, 0.1, 0.1)
casem1 = c(0, 0, 0, 0, 1, 0, 0, 0)
chisq.test(casem1, p=null.probs)
## Warning in chisq.test(casem1, p = null.probs): Chi-squared approximation may be
## incorrect
## 
##  Chi-squared test for given probabilities
## 
## data:  casem1
## X-squared = 2.3333, df = 7, p-value = 0.9391

Answer: p-value > α. Fail to reject the null. There is no compelling evidence that the proportion of Lyme disease deaths differs among the Mountain states.

Pacific

Bar Chart of the Number of Lyme Disease Deaths in Pacific

p1 <- c("Alaska", "California", "Hawaii", "Oregon", "Washington")
casep1 <- c(0, 1, 0, 1, 0)
df9 <- data.frame(p1, casep1)
df9
##           p1 casep1
## 1     Alaska      0
## 2 California      1
## 3     Hawaii      0
## 4     Oregon      1
## 5 Washington      0
plot9 <- df9%>%
  ggplot(aes(x=p1, y = casep1, fill = p1)) +
  geom_bar(stat = "identity", position = "dodge") +
  theme(axis.text.x = element_text(angle = 360)) + 
  ggtitle("Number of Death by Lyme in Pacific")+
  ylab("Reported Cases") +
  xlab("States")
plot9

Answer

Ho: There is no difference in the proportion of Lyme disease deaths among the Pacific states. Ha: There is a difference in the proportion of Lyme disease deaths among the Pacific states.

1+1
## [1] 2
2/5
## [1] 0.4
null.probs = c(0.1, 0.35, 0.1, 0.35, 0.1)
casep1 = c(0, 1, 0, 1, 0)
chisq.test(casep1, p=null.probs)
## Warning in chisq.test(casep1, p = null.probs): Chi-squared approximation may be
## incorrect
## 
##  Chi-squared test for given probabilities
## 
## data:  casep1
## X-squared = 0.85714, df = 4, p-value = 0.9306

Answer: p-value > α. Fail to reject the null. There is no compelling evidence that the proportion of Lyme disease deaths differs among the Pacific states.

Odds of Dying of Lyme Disease

Mid Atlantic (the highest count)

MidAtlantic <- c(186, 0.4744898)
Others1 <- c(206, 0.5255102)
US <- c(392, 1)
testma <- data.frame(MidAtlantic, Others1, US)
testma
##   MidAtlantic     Others1  US
## 1 186.0000000 206.0000000 392
## 2   0.4744898   0.5255102   1

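Answer: The odds that a Lyme disease death was reported in the Mid Atlantic rather than elsewhere in the US are 186/206 = 0.9029126. This is an odds value, not a relative risk, because the population of each region is not taken into account.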

Pacific (among the lowest counts)

Pacific <- c(2, 0.005102041)
Others2 <- c(390, 0.994898)
US <- c(392, 1)
testp <- data.frame(Pacific, Others2, US)
testp
##       Pacific    Others2  US
## 1 2.000000000 390.000000 392
## 2 0.005102041   0.994898   1

Answer: The odds that a Lyme disease death was reported in the Pacific region rather than elsewhere in the US are 2/390 = 0.005128205. As above, this is an odds value, not a relative risk.
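
The two odds values can be computed directly from the counts instead of being typed in by hand. In this minimal sketch the function `odds()` and the variable names are introduced for illustration; the counts are the Week 1 figures shown in the tables above.

```r
us_total     <- 392
mid_atlantic <- 186
pacific      <- 2

# Odds = count in the region / count in the rest of the US.
odds <- function(region, total) region / (total - region)

odds_ma <- odds(mid_atlantic, us_total)  # 186 / 206
odds_p  <- odds(pacific, us_total)       # 2 / 390

# Ratio of the two odds: how much larger the Mid Atlantic odds are.
odds_ma / odds_p
```

These odds compare shares of reported counts only. Turning them into a per-resident risk would require regional population figures, which are not part of this dataset.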

Linear Regressions

States

library(tidyverse)
setwd("C:/Users/auria/OneDrive/Bureau/MATH 217/Final Project")
lymefits1 <- read_csv("lymefits1.csv")
## 
## -- Column specification --------------------------------------------------------
## cols(
##   Reporting_Area = col_character(),
##   MMWR_Year = col_double(),
##   MMWR_Week = col_double(),
##   `Lyme disease, Previous 52 weeks Med` = col_double(),
##   `Lyme disease, Cum 2014` = col_double()
## )
lymefits1a <- lymefits1 %>%
  rename(
    areas = Reporting_Area,
    counts = `Lyme disease, Previous 52 weeks Med`,
    week = MMWR_Week
    )
lymefits1a
## # A tibble: 2,652 x 5
##    areas         MMWR_Year  week counts `Lyme disease, Cum 2014`
##    <chr>             <dbl> <dbl>  <dbl>                    <dbl>
##  1 ALABAMA            2015     1      0                       NA
##  2 ALASKA             2015     1      0                       NA
##  3 ARIZONA            2015     1      0                       NA
##  4 ARKANSAS           2015     1      0                       NA
##  5 CALIFORNIA         2015     1      0                        1
##  6 COLORADO           2015     1      0                       NA
##  7 CONNECTICUT        2015     1     30                       11
##  8 DELAWARE           2015     1      6                        3
##  9 DIST. OF COL.      2015     1      0                       NA
## 10 FLORIDA            2015     1      2                        1
## # ... with 2,642 more rows
glimpse(lymefits1)
## Rows: 2,652
## Columns: 5
## $ Reporting_Area                        <chr> "ALABAMA", "ALASKA", "ARIZONA...
## $ MMWR_Year                             <dbl> 2015, 2015, 2015, 2015, 2015,...
## $ MMWR_Week                             <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,...
## $ `Lyme disease, Previous 52 weeks Med` <dbl> 0, 0, 0, 0, 0, 0, 30, 6, 0, 2...
## $ `Lyme disease, Cum 2014`              <dbl> NA, NA, NA, NA, 1, NA, 11, 3,...
View(lymefits1)
fit1 <- lm(counts ~ week + areas, data=lymefits1a)
summary(fit1)
## 
## Call:
## lm(formula = counts ~ week + areas, data = lymefits1a)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -17.409  -0.101  -0.002   0.081  32.947 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)          0.623015   0.332482   1.874 0.061066 .  
## week                -0.003191   0.003011  -1.060 0.289451    
## areasALASKA         -0.538462   0.456456  -1.180 0.238244    
## areasARIZONA        -0.538462   0.456456  -1.180 0.238244    
## areasARKANSAS       -0.538462   0.456456  -1.180 0.238244    
## areasCALIFORNIA      0.019231   0.456456   0.042 0.966398    
## areasCOLORADO       -0.538462   0.456456  -1.180 0.238244    
## areasCONNECTICUT    24.884615   0.456456  54.517  < 2e-16 ***
## areasDELAWARE        4.903846   0.456456  10.743  < 2e-16 ***
## areasDIST. OF COL.   0.423077   0.456456   0.927 0.354078    
## areasFLORIDA         1.730769   0.456456   3.792 0.000153 ***
## areasGEORGIA        -0.538462   0.456456  -1.180 0.238244    
## areasHAWAII         -0.538462   0.456456  -1.180 0.238244    
## areasIDAHO          -0.538462   0.456456  -1.180 0.238244    
## areasILLINOIS        0.730769   0.456456   1.601 0.109506    
## areasINDIANA        -0.019231   0.456456  -0.042 0.966398    
## areasIOWA            0.596154   0.456456   1.306 0.191651    
## areasKANSAS         -0.538462   0.456456  -1.180 0.238244    
## areasKENTUCKY       -0.538462   0.456456  -1.180 0.238244    
## areasLOUISIANA      -0.538462   0.456456  -1.180 0.238244    
## areasMAINE          13.576923   0.456456  29.744  < 2e-16 ***
## areasMARYLAND       19.538462   0.456456  42.805  < 2e-16 ***
## areasMASSACHUSETTS  37.942308   0.456456  83.124  < 2e-16 ***
## areasMICHIGAN        0.519231   0.456456   1.138 0.255423    
## areasMINNESOTA       0.576923   0.456456   1.264 0.206372    
## areasMISSISSIPPI    -0.538462   0.456456  -1.180 0.238244    
## areasMISSOURI       -0.538462   0.456456  -1.180 0.238244    
## areasMONTANA        -0.538462   0.456456  -1.180 0.238244    
## areasNEBRASKA       -0.538462   0.456456  -1.180 0.238244    
## areasNEVADA         -0.538462   0.456456  -1.180 0.238244    
## areasNEW HAMPSHIRE   3.961538   0.456456   8.679  < 2e-16 ***
## areasNEW JERSEY     39.500000   0.456456  86.536  < 2e-16 ***
## areasNEW MEXICO     -0.538462   0.456456  -1.180 0.238244    
## areasNEW YORK       40.846154   0.456456  89.486  < 2e-16 ***
## areasNORTH CAROLINA -0.384615   0.456456  -0.843 0.399522    
## areasNORTH DAKOTA   -0.538462   0.456456  -1.180 0.238244    
## areasOHIO            1.211538   0.456456   2.654 0.007997 ** 
## areasOKLAHOMA       -0.538462   0.456456  -1.180 0.238244    
## areasOREGON          0.307692   0.456456   0.674 0.500314    
## areasPENNSYLVANIA   88.596154   0.456456 194.096  < 2e-16 ***
## areasRHODE ISLAND   10.076923   0.456456  22.076  < 2e-16 ***
## areasSOUTH CAROLINA  0.423077   0.456456   0.927 0.354078    
## areasSOUTH DAKOTA   -0.538462   0.456456  -1.180 0.238244    
## areasTENNESSEE      -0.538462   0.456456  -1.180 0.238244    
## areasTEXAS          -0.403846   0.456456  -0.885 0.376377    
## areasUTAH           -0.538462   0.456456  -1.180 0.238244    
## areasVERMONT         6.115385   0.456456  13.398  < 2e-16 ***
## areasVIRGINIA       18.750000   0.456456  41.077  < 2e-16 ***
## areasWASHINGTON     -0.538462   0.456456  -1.180 0.238244    
## areasWEST VIRGINIA   1.307692   0.456456   2.865 0.004205 ** 
## areasWISCONSIN       8.807692   0.456456  19.296  < 2e-16 ***
## areasWYOMING        -0.538462   0.456456  -1.180 0.238244    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.327 on 2600 degrees of freedom
## Multiple R-squared:  0.9786, Adjusted R-squared:  0.9782 
## F-statistic:  2337 on 51 and 2600 DF,  p-value: < 2.2e-16
par(mfrow = c(2,2))
plot(fit1)

Answer: The adjusted R-squared is 0.9782, so about 97.82% of the variation in the counts is explained by this model. The coefficient for week is not significant (p = 0.289), so nearly all of this explanatory power comes from the reporting area rather than from the week. The Normal Q-Q plot also indicates that the residuals are not normally distributed, which is expected for count data dominated by zeros in many states.
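
Whether the week term adds anything beyond the area term can be checked by comparing two nested models. The sketch below uses a small synthetic data frame (not the report's dataset) in which the counts depend on area only. The names `toy`, `fit_area`, and `fit_full` are assumptions for illustration.

```r
set.seed(1)  # synthetic data only; not the report's dataset
toy <- data.frame(
  week  = rep(1:10, times = 3),
  areas = rep(c("A", "B", "C"), each = 10),
  stringsAsFactors = FALSE
)
# Counts depend on area only, plus noise.
area_means <- c(A = 0, B = 10, C = 40)
toy$counts <- unname(area_means[toy$areas]) + rnorm(nrow(toy))

fit_area <- lm(counts ~ areas, data = toy)         # area only
fit_full <- lm(counts ~ week + areas, data = toy)  # area plus week

adj_r2 <- summary(fit_full)$adj.r.squared  # adjusted share of variance explained
cmp    <- anova(fit_area, fit_full)        # F-test for the added week term
adj_r2
cmp
```

Applied to the real data, the same comparison would be `anova(lm(counts ~ areas, data = lymefits1a), fit1)`, assuming `lymefits1a` and `fit1` are defined as above.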

US and Regions

library(tidyverse)
setwd("C:/Users/auria/OneDrive/Bureau/MATH 217/Final Project")
lymefits2 <- read.csv("lymefits2.csv")
glimpse (lymefits2)
## Rows: 520
## Columns: 5
## $ Reporting_Area                     <chr> "E.N. CENTRAL", "E.S. CENTRAL", ...
## $ MMWR.Year                          <int> 2015, 2015, 2015, 2015, 2015, 20...
## $ MMWR_Week                          <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2,...
## $ Lyme.disease..Previous.52.week.Med <int> 13, 1, 186, 1, 124, 2, 44, 392, ...
## $ Lyme.disease..Cum.2014             <int> 8, 1, 77, 1, 53, 2, 11, 156, 3, ...
View(lymefits2)
lymefits2a <- lymefits2 %>%
  rename(
    areas = Reporting_Area,
    counts = `Lyme.disease..Previous.52.week.Med`,
    week = MMWR_Week
    )
lymefits2a
##             areas MMWR.Year week counts Lyme.disease..Cum.2014
## 1    E.N. CENTRAL      2015    1     13                      8
## 2    E.S. CENTRAL      2015    1      1                      1
## 3   MID. ATLANTIC      2015    1    186                     77
## 4        MOUNTAIN      2015    1      1                      1
## 5     NEW ENGLAND      2015    1    124                     53
## 6         PACIFIC      2015    1      2                      2
## 7     S. ATLANTIC      2015    1     44                     11
## 8   UNITED STATES      2015    1    392                    156
## 9    W.N. CENTRAL      2015    1      2                      3
## 10   W.S. CENTRAL      2015    1      0                     NA
## 11   E.N. CENTRAL      2015    2     13                     17
## 12   E.S. CENTRAL      2015    2      1                      1
## 13  MID. ATLANTIC      2015    2    187                    143
## 14       MOUNTAIN      2015    2      1                      2
## 15    NEW ENGLAND      2015    2    124                    126
## 16        PACIFIC      2015    2      2                      3
## 17    S. ATLANTIC      2015    2     45                     28
## 18  UNITED STATES      2015    2    393                    323
## 19   W.N. CENTRAL      2015    2      2                      3
## 20   W.S. CENTRAL      2015    2      0                     NA
## 21   E.N. CENTRAL      2015    3     13                     22
## 22   E.S. CENTRAL      2015    3      1                      1
## 23  MID. ATLANTIC      2015    3    187                    204
## 24       MOUNTAIN      2015    3      1                      2
## 25    NEW ENGLAND      2015    3    121                    252
## 26        PACIFIC      2015    3      2                      4
## 27    S. ATLANTIC      2015    3     48                     41
## 28  UNITED STATES      2015    3    400                    530
## 29   W.N. CENTRAL      2015    3      2                      4
## 30   W.S. CENTRAL      2015    3      0                     NA
## 31   E.N. CENTRAL      2015    4     13                     27
## 32   E.S. CENTRAL      2015    4      2                      2
## 33  MID. ATLANTIC      2015    4    187                    273
## 34       MOUNTAIN      2015    4      1                      3
## 35    NEW ENGLAND      2015    4    121                    297
## 36        PACIFIC      2015    4      2                      2
## 37    S. ATLANTIC      2015    4     48                     51
## 38  UNITED STATES      2015    4    401                    659
## 39   W.N. CENTRAL      2015    4      2                      4
## 40   W.S. CENTRAL      2015    4      0                     NA
## 41   E.N. CENTRAL      2015    5     13                     34
## 42   E.S. CENTRAL      2015    5      2                      2
## 43  MID. ATLANTIC      2015    5    187                    360
## 44       MOUNTAIN      2015    5      1                      3
## 45    NEW ENGLAND      2015    5    121                    360
## 46        PACIFIC      2015    5      2                      2
## 47    S. ATLANTIC      2015    5     48                     65
## 48  UNITED STATES      2015    5    402                    831
## 49   W.N. CENTRAL      2015    5      2                      5
## 50   W.S. CENTRAL      2015    5      0                     NA
## 51   E.N. CENTRAL      2015    6     15                     38
## 52   E.S. CENTRAL      2015    6      2                      2
## 53  MID. ATLANTIC      2015    6    187                    425
## 54       MOUNTAIN      2015    6      1                      4
## 55    NEW ENGLAND      2015    6    122                    418
## 56        PACIFIC      2015    6      2                      3
## 57    S. ATLANTIC      2015    6     52                     88
## 58  UNITED STATES      2015    6    405                    984
## 59   W.N. CENTRAL      2015    6      2                      6
## 60   W.S. CENTRAL      2015    6      0                     NA
## 61   E.N. CENTRAL      2015    7     15                     40
## 62   E.S. CENTRAL      2015    7      2                      2
## 63  MID. ATLANTIC      2015    7    187                    488
## 64       MOUNTAIN      2015    7      1                      4
## 65    NEW ENGLAND      2015    7    122                    465
## 66        PACIFIC      2015    7      2                      3
## 67    S. ATLANTIC      2015    7     52                    104
## 68  UNITED STATES      2015    7    406                   1112
## 69   W.N. CENTRAL      2015    7      2                      6
## 70   W.S. CENTRAL      2015    7      0                     NA
## 71   E.N. CENTRAL      2015    8     15                     42
## 72   E.S. CENTRAL      2015    8      2                      3
## 73  MID. ATLANTIC      2015    8    189                    567
## 74       MOUNTAIN      2015    8      1                      4
## 75    NEW ENGLAND      2015    8    122                    512
## 76        PACIFIC      2015    8      2                      4
## 77    S. ATLANTIC      2015    8     53                    115
## 78  UNITED STATES      2015    8    408                   1253
## 79   W.N. CENTRAL      2015    8      2                      6
## 80   W.S. CENTRAL      2015    8      0                     NA
## 81   E.N. CENTRAL      2015    9     15                     53
## 82   E.S. CENTRAL      2015    9      2                      3
## 83  MID. ATLANTIC      2015    9    191                    681
## 84       MOUNTAIN      2015    9      1                      4
## 85    NEW ENGLAND      2015    9    125                    560
## 86        PACIFIC      2015    9      2                      3
## 87    S. ATLANTIC      2015    9     53                    132
## 88  UNITED STATES      2015    9    413                   1443
## 89   W.N. CENTRAL      2015    9      2                      7
## 90   W.S. CENTRAL      2015    9      0                     NA
## 91   E.N. CENTRAL      2015   10     15                     62
## 92   E.S. CENTRAL      2015   10      2                      4
## 93  MID. ATLANTIC      2015   10    192                    776
## 94       MOUNTAIN      2015   10      1                      6
## 95    NEW ENGLAND      2015   10    130                    617
## 96        PACIFIC      2015   10      2                      8
## 97    S. ATLANTIC      2015   10     55                    145
## 98  UNITED STATES      2015   10    419                   1627
## 99   W.N. CENTRAL      2015   10      2                      9
## 100  W.S. CENTRAL      2015   10      0                     NA
## 101  E.N. CENTRAL      2015   11     15                     71
## 102  E.S. CENTRAL      2015   11      2                      4
## 103 MID. ATLANTIC      2015   11    192                    844
## 104      MOUNTAIN      2015   11      1                      6
## 105   NEW ENGLAND      2015   11    130                    689
## 106       PACIFIC      2015   11      2                      9
## 107   S. ATLANTIC      2015   11     56                    160
## 108 UNITED STATES      2015   11    420                   1792
## 109  W.N. CENTRAL      2015   11      2                      9
## 110  W.S. CENTRAL      2015   11      0                     NA
## 111  E.N. CENTRAL      2015   12     15                     73
## 112  E.S. CENTRAL      2015   12      2                      4
## 113 MID. ATLANTIC      2015   12    192                    931
## 114      MOUNTAIN      2015   12      1                      7
## 115   NEW ENGLAND      2015   12    130                    753
## 116       PACIFIC      2015   12      2                     12
## 117   S. ATLANTIC      2015   12     56                    185
## 118 UNITED STATES      2015   12    421                   1974
## 119  W.N. CENTRAL      2015   12      2                      9
## 120  W.S. CENTRAL      2015   12      0                     NA
## 121  E.N. CENTRAL      2015   13     15                     77
## 122  E.S. CENTRAL      2015   13      2                      5
## 123 MID. ATLANTIC      2015   13    192                   1001
## 124      MOUNTAIN      2015   13      1                      8
## 125   NEW ENGLAND      2015   13    131                    806
## 126       PACIFIC      2015   13      2                     13
## 127   S. ATLANTIC      2015   13     57                    210
## 128 UNITED STATES      2015   13    421                   2130
## 129  W.N. CENTRAL      2015   13      2                     10
## 130  W.S. CENTRAL      2015   13      0                     NA
## 131  E.N. CENTRAL      2015   14     15                     90
## 132  E.S. CENTRAL      2015   14      2                      7
## 133 MID. ATLANTIC      2015   14    192                   1132
## 134      MOUNTAIN      2015   14      1                      8
## 135   NEW ENGLAND      2015   14    134                    896
## 136       PACIFIC      2015   14      2                     13
## 137   S. ATLANTIC      2015   14     57                    236
## 138 UNITED STATES      2015   14    423                   2389
## 139  W.N. CENTRAL      2015   14      2                      7
## 140  W.S. CENTRAL      2015   14      0                     NA
## 141  E.N. CENTRAL      2015   15     15                    102
## 142  E.S. CENTRAL      2015   15      2                      8
## 143 MID. ATLANTIC      2015   15    192                   1213
## 144      MOUNTAIN      2015   15      1                      8
## 145   NEW ENGLAND      2015   15    134                    953
## 146       PACIFIC      2015   15      2                     15
## 147   S. ATLANTIC      2015   15     57                    258
## 148 UNITED STATES      2015   15    424                   2564
## 149  W.N. CENTRAL      2015   15      2                      7
## 150  W.S. CENTRAL      2015   15      0                     NA
## 151  E.N. CENTRAL      2015   16     14                    116
## 152  E.S. CENTRAL      2015   16      2                     10
## 153 MID. ATLANTIC      2015   16    192                   1315
## 154      MOUNTAIN      2015   16      1                      8
## 155   NEW ENGLAND      2015   16    137                   1064
## 156       PACIFIC      2015   16      2                     16
## 157   S. ATLANTIC      2015   16     57                    293
## 158 UNITED STATES      2015   16    428                   2830
## 159  W.N. CENTRAL      2015   16      3                      8
## 160  W.S. CENTRAL      2015   16      0                     NA
## 161  E.N. CENTRAL      2015   17     16                    138
## 162  E.S. CENTRAL      2015   17      2                     14
## 163 MID. ATLANTIC      2015   17    187                   1863
## 164      MOUNTAIN      2015   17      1                     10
## 165   NEW ENGLAND      2015   17    137                   1154
## 166       PACIFIC      2015   17      2                     20
## 167   S. ATLANTIC      2015   17     57                    326
## 168 UNITED STATES      2015   17    413                   3534
## 169  W.N. CENTRAL      2015   17      2                      9
## 170  W.S. CENTRAL      2015   17      1                     NA
## 171  E.N. CENTRAL      2015   18     16                    153
## 172  E.S. CENTRAL      2015   18      2                     16
## 173 MID. ATLANTIC      2015   18    201                   2045
## 174      MOUNTAIN      2015   18      1                     10
## 175   NEW ENGLAND      2015   18    129                   1295
## 176       PACIFIC      2015   18      2                     22
## 177   S. ATLANTIC      2015   18     58                    372
## 178 UNITED STATES      2015   18    437                   4029
## 179  W.N. CENTRAL      2015   18     10                    115
## 180  W.S. CENTRAL      2015   18      0                      1
## 181  E.N. CENTRAL      2015   19     13                    172
## 182  E.S. CENTRAL      2015   19      2                     21
## 183 MID. ATLANTIC      2015   19    201                   2284
## 184      MOUNTAIN      2015   19      1                     11
## 185   NEW ENGLAND      2015   19    122                   1430
## 186       PACIFIC      2015   19      2                     24
## 187   S. ATLANTIC      2015   19     58                    403
## 188 UNITED STATES      2015   19    448                   4517
## 189  W.N. CENTRAL      2015   19     10                    169
## 190  W.S. CENTRAL      2015   19      0                      3
## 191  E.N. CENTRAL      2015   20     12                    201
## 192  E.S. CENTRAL      2015   20      2                     23
## 193 MID. ATLANTIC      2015   20    206                   2474
## 194      MOUNTAIN      2015   20      1                     13
## 195   NEW ENGLAND      2015   20    123                   1616
## 196       PACIFIC      2015   20      2                     25
## 197   S. ATLANTIC      2015   20     58                    442
## 198 UNITED STATES      2015   20    456                   4993
## 199  W.N. CENTRAL      2015   20     11                    196
## 200  W.S. CENTRAL      2015   20      0                      3
## 201  E.N. CENTRAL      2015   21     12                    248
## 202  E.S. CENTRAL      2015   21      1                     25
## 203 MID. ATLANTIC      2015   21    196                   1914
## 204      MOUNTAIN      2015   21      1                     16
## 205   NEW ENGLAND      2015   21    114                   1808
## 206       PACIFIC      2015   21      2                     32
## 207   S. ATLANTIC      2015   21     58                    478
## 208 UNITED STATES      2015   21    428                   4746
## 209  W.N. CENTRAL      2015   21     10                    221
## 210  W.S. CENTRAL      2015   21      0                      4
## 211  E.N. CENTRAL      2015   22     14                    296
## 212  E.S. CENTRAL      2015   22      1                     30
## 213 MID. ATLANTIC      2015   22    196                   2095
## 214      MOUNTAIN      2015   22      1                     16
## 215   NEW ENGLAND      2015   22    100                   2037
## 216       PACIFIC      2015   22      2                     36
## 217   S. ATLANTIC      2015   22     59                    522
## 218 UNITED STATES      2015   22    421                   5297
## 219  W.N. CENTRAL      2015   22      9                    260
## 220  W.S. CENTRAL      2015   22      0                      5
## 221  E.N. CENTRAL      2015   23     14                    380
## 222  E.S. CENTRAL      2015   23      1                     35
## 223 MID. ATLANTIC      2015   23    185                   2446
## 224      MOUNTAIN      2015   23      1                     18
## 225   NEW ENGLAND      2015   23     82                   2416
## 226       PACIFIC      2015   23      2                     38
## 227   S. ATLANTIC      2015   23     61                    585
## 228 UNITED STATES      2015   23    406                   6254
## 229  W.N. CENTRAL      2015   23      7                    330
## 230  W.S. CENTRAL      2015   23      0                      6
## 231  E.N. CENTRAL      2015   24     13                    476
## 232  E.S. CENTRAL      2015   24      1                     42
## 233 MID. ATLANTIC      2015   24    190                   2854
## 234      MOUNTAIN      2015   24      1                     18
## 235   NEW ENGLAND      2015   24     79                   2848
## 236       PACIFIC      2015   24      2                     42
## 237   S. ATLANTIC      2015   24     61                    682
## 238 UNITED STATES      2015   24    383                   7367
## 239  W.N. CENTRAL      2015   24      6                    399
## 240  W.S. CENTRAL      2015   24      0                      6
## 241  E.N. CENTRAL      2015   25     16                    586
## 242  E.S. CENTRAL      2015   25      1                     50
## 243 MID. ATLANTIC      2015   25    196                   3375
## 244      MOUNTAIN      2015   25      1                     20
## 245   NEW ENGLAND      2015   25     85                   3460
## 246       PACIFIC      2015   25      2                     47
## 247   S. ATLANTIC      2015   25     61                    807
## 248 UNITED STATES      2015   25    411                   8843
## 249  W.N. CENTRAL      2015   25      6                    491
## 250  W.S. CENTRAL      2015   25      0                      7
## 251  E.N. CENTRAL      2015   26     16                    709
## 252  E.S. CENTRAL      2015   26      1                     59
## 253 MID. ATLANTIC      2015   26    196                   4113
## 254      MOUNTAIN      2015   26      1                     22
## 255   NEW ENGLAND      2015   26     93                   4031
## 256       PACIFIC      2015   26      2                     54
## 257   S. ATLANTIC      2015   26     61                    940
## 258 UNITED STATES      2015   26    421                  10507
## 259  W.N. CENTRAL      2015   26      6                    572
## 260  W.S. CENTRAL      2015   26      0                      7
## 261  E.N. CENTRAL      2015   27     16                    859
## 262  E.S. CENTRAL      2015   27      1                     61
## 263 MID. ATLANTIC      2015   27    196                   4861
## 264      MOUNTAIN      2015   27      1                     22
## 265   NEW ENGLAND      2015   27    103                   4765
## 266       PACIFIC      2015   27      2                     61
## 267   S. ATLANTIC      2015   27     61                   1066
## 268 UNITED STATES      2015   27    421                  12399
## 269  W.N. CENTRAL      2015   27      6                    695
## 270  W.S. CENTRAL      2015   27      0                      9
## 271  E.N. CENTRAL      2015   28     16                   1005
## 272  E.S. CENTRAL      2015   28      1                     65
## 273 MID. ATLANTIC      2015   28    196                   5724
## 274      MOUNTAIN      2015   28      1                     24
## 275   NEW ENGLAND      2015   28    104                   5414
## 276       PACIFIC      2015   28      2                     68
## 277   S. ATLANTIC      2015   28     61                   1215
## 278 UNITED STATES      2015   28    421                  14332
## 279  W.N. CENTRAL      2015   28      6                    808
## 280  W.S. CENTRAL      2015   28      0                      9
## 281  E.N. CENTRAL      2015   29     16                   1121
## 282  E.S. CENTRAL      2015   29      1                     70
## 283 MID. ATLANTIC      2015   29    196                   6603
## 284      MOUNTAIN      2015   29      1                     27
## 285   NEW ENGLAND      2015   29    112                   6180
## 286       PACIFIC      2015   29      2                     76
## 287   S. ATLANTIC      2015   29     61                   1328
## 288 UNITED STATES      2015   29    428                  16351
## 289  W.N. CENTRAL      2015   29      9                    937
## 290  W.S. CENTRAL      2015   29      0                      9
## 291  E.N. CENTRAL      2015   30     14                   1252
## 292  E.S. CENTRAL      2015   30      1                     74
## 293 MID. ATLANTIC      2015   30    196                   7417
## 294      MOUNTAIN      2015   30      1                     29
## 295   NEW ENGLAND      2015   30    115                   6819
## 296       PACIFIC      2015   30      2                     83
## 297   S. ATLANTIC      2015   30     61                   1492
## 298 UNITED STATES      2015   30    428                  18207
## 299  W.N. CENTRAL      2015   30      9                   1026
## 300  W.S. CENTRAL      2015   30      0                     15
## 301  E.N. CENTRAL      2015   31     13                   1360
## 302  E.S. CENTRAL      2015   31      1                     80
## 303 MID. ATLANTIC      2015   31    196                   8286
## 304      MOUNTAIN      2015   31      1                     31
## 305   NEW ENGLAND      2015   31    118                   7355
## 306       PACIFIC      2015   31      2                     86
## 307   S. ATLANTIC      2015   31     61                   1652
## 308 UNITED STATES      2015   31    428                  19985
## 309  W.N. CENTRAL      2015   31      9                   1119
## 310  W.S. CENTRAL      2015   31      0                     16
## 311  E.N. CENTRAL      2015   32     13                   1438
## 312  E.S. CENTRAL      2015   32      1                     83
## 313 MID. ATLANTIC      2015   32    196                   8931
## 314      MOUNTAIN      2015   32      1                     35
## 315   NEW ENGLAND      2015   32    118                   7797
## 316       PACIFIC      2015   32      2                     90
## 317   S. ATLANTIC      2015   32     61                   1780
## 318 UNITED STATES      2015   32    428                  21352
## 319  W.N. CENTRAL      2015   32     10                   1179
## 320  W.S. CENTRAL      2015   32      0                     19
## 321  E.N. CENTRAL      2015   33     12                   1517
## 322  E.S. CENTRAL      2015   33      1                     92
## 323 MID. ATLANTIC      2015   33    196                   9534
## 324      MOUNTAIN      2015   33      1                     35
## 325   NEW ENGLAND      2015   33    115                   8211
## 326       PACIFIC      2015   33      2                     95
## 327   S. ATLANTIC      2015   33     61                   1975
## 328 UNITED STATES      2015   33    428                  22714
## 329  W.N. CENTRAL      2015   33      9                   1235
## 330  W.S. CENTRAL      2015   33      0                     20
## 331  E.N. CENTRAL      2015   34     12                   1553
## 332  E.S. CENTRAL      2015   34      1                    102
## 333 MID. ATLANTIC      2015   34    196                  10024
## 334      MOUNTAIN      2015   34      1                     43
## 335   NEW ENGLAND      2015   34    111                   8568
## 336       PACIFIC      2015   34      2                    100
## 337   S. ATLANTIC      2015   34     61                   2074
## 338 UNITED STATES      2015   34    428                  23763
## 339  W.N. CENTRAL      2015   34      7                   1278
## 340  W.S. CENTRAL      2015   34      0                     21
## 341  E.N. CENTRAL      2015   35     11                   1597
## 342  E.S. CENTRAL      2015   35      1                    107
## 343 MID. ATLANTIC      2015   35    196                  10464
## 344      MOUNTAIN      2015   35      1                     45
## 345   NEW ENGLAND      2015   35    110                   8832
## 346       PACIFIC      2015   35      2                    108
## 347   S. ATLANTIC      2015   35     61                   2229
## 348 UNITED STATES      2015   35    428                  24709
## 349  W.N. CENTRAL      2015   35      7                   1305
## 350  W.S. CENTRAL      2015   35      0                     22
## 351  E.N. CENTRAL      2015   36     11                   1638
## 352  E.S. CENTRAL      2015   36      1                    108
## 353 MID. ATLANTIC      2015   36    196                  10794
## 354      MOUNTAIN      2015   36      1                     47
## 355   NEW ENGLAND      2015   36    111                   9074
## 356       PACIFIC      2015   36      2                    110
## 357   S. ATLANTIC      2015   36     61                   2353
## 358 UNITED STATES      2015   36    428                  25491
## 359  W.N. CENTRAL      2015   36      6                   1345
## 360  W.S. CENTRAL      2015   36      0                     22
## 361  E.N. CENTRAL      2015   37     12                   1675
## 362  E.S. CENTRAL      2015   37      1                    110
## 363 MID. ATLANTIC      2015   37    197                  11152
## 364      MOUNTAIN      2015   37      1                     48
## 365   NEW ENGLAND      2015   37    106                   9311
## 366       PACIFIC      2015   37      2                    110
## 367   S. ATLANTIC      2015   37     61                   2451
## 368 UNITED STATES      2015   37    421                  26266
## 369  W.N. CENTRAL      2015   37      6                   1385
## 370  W.S. CENTRAL      2015   37      0                     24
## 371  E.N. CENTRAL      2015   38     12                   1708
## 372  E.S. CENTRAL      2015   38      1                    111
## 373 MID. ATLANTIC      2015   38    198                  11472
## 374      MOUNTAIN      2015   38      1                     50
## 375   NEW ENGLAND      2015   38    102                   9534
## 376       PACIFIC      2015   38      2                    113
## 377   S. ATLANTIC      2015   38     61                   2534
## 378 UNITED STATES      2015   38    407                  26974
## 379  W.N. CENTRAL      2015   38      5                   1427
## 380  W.S. CENTRAL      2015   38      0                     25
## 381  E.N. CENTRAL      2015   39     12                   1745
## 382  E.S. CENTRAL      2015   39      1                    113
## 383 MID. ATLANTIC      2015   39    196                  11745
## 384      MOUNTAIN      2015   39      1                     50
## 385   NEW ENGLAND      2015   39    100                   9689
## 386       PACIFIC      2015   39      2                    116
## 387   S. ATLANTIC      2015   39     61                   2633
## 388 UNITED STATES      2015   39    407                  27577
## 389  W.N. CENTRAL      2015   39      4                   1460
## 390  W.S. CENTRAL      2015   39      0                     26
## 391  E.N. CENTRAL      2015   40     12                   1787
## 392  E.S. CENTRAL      2015   40      1                    114
## 393 MID. ATLANTIC      2015   40    198                  12020
## 394      MOUNTAIN      2015   40      1                     51
## 395   NEW ENGLAND      2015   40     99                   9855
## 396       PACIFIC      2015   40      2                    120
## 397   S. ATLANTIC      2015   40     61                   2729
## 398 UNITED STATES      2015   40    407                  28198
## 399  W.N. CENTRAL      2015   40      5                   1494
## 400  W.S. CENTRAL      2015   40      0                     28
## 401  E.N. CENTRAL      2015   41     14                   1822
## 402  E.S. CENTRAL      2015   41      1                    114
## 403 MID. ATLANTIC      2015   41    198                  12262
## 404      MOUNTAIN      2015   41      1                     51
## 405   NEW ENGLAND      2015   41     95                  10003
## 406       PACIFIC      2015   41      2                    121
## 407   S. ATLANTIC      2015   41     58                   2791
## 408 UNITED STATES      2015   41    413                  28722
## 409  W.N. CENTRAL      2015   41      5                   1530
## 410  W.S. CENTRAL      2015   41      1                     28
## 411  E.N. CENTRAL      2015   42     14                   1843
## 412  E.S. CENTRAL      2015   42      1                    117
## 413 MID. ATLANTIC      2015   42    192                  12465
## 414      MOUNTAIN      2015   42      1                     52
## 415   NEW ENGLAND      2015   42     94                  10209
## 416       PACIFIC      2015   42      2                    123
## 417   S. ATLANTIC      2015   42     58                   2842
## 418 UNITED STATES      2015   42    407                  29233
## 419  W.N. CENTRAL      2015   42      5                   1553
## 420  W.S. CENTRAL      2015   42      0                     29
## 421  E.N. CENTRAL      2015   43     13                   1857
## 422  E.S. CENTRAL      2015   43      1                    119
## 423 MID. ATLANTIC      2015   43    200                  12643
## 424      MOUNTAIN      2015   43      1                     53
## 425   NEW ENGLAND      2015   43     85                  10350
## 426       PACIFIC      2015   43      2                    127
## 427   S. ATLANTIC      2015   43     58                   2890
## 428 UNITED STATES      2015   43    400                  29648
## 429  W.N. CENTRAL      2015   43      5                   1580
## 430  W.S. CENTRAL      2015   43      0                     29
## 431  E.N. CENTRAL      2015   44     12                   1876
## 432  E.S. CENTRAL      2015   44      2                    120
## 433 MID. ATLANTIC      2015   44    200                  12952
## 434      MOUNTAIN      2015   44      1                     53
## 435   NEW ENGLAND      2015   44     91                  10507
## 436       PACIFIC      2015   44      3                    128
## 437   S. ATLANTIC      2015   44     62                   2939
## 438 UNITED STATES      2015   44    403                  30195
## 439  W.N. CENTRAL      2015   44      4                   1590
## 440  W.S. CENTRAL      2015   44      0                     30
## 441  E.N. CENTRAL      2015   45     13                   1893
## 442  E.S. CENTRAL      2015   45      2                    120
## 443 MID. ATLANTIC      2015   45    209                  13156
## 444      MOUNTAIN      2015   45      1                     55
## 445   NEW ENGLAND      2015   45     84                  10632
## 446       PACIFIC      2015   45      3                    128
## 447   S. ATLANTIC      2015   45     64                   2998
## 448 UNITED STATES      2015   45    385                  30622
## 449  W.N. CENTRAL      2015   45      4                   1609
## 450  W.S. CENTRAL      2015   45      0                     31
## 451  E.N. CENTRAL      2015   46     14                   1905
## 452  E.S. CENTRAL      2015   46      2                    122
## 453 MID. ATLANTIC      2015   46    218                  13375
## 454      MOUNTAIN      2015   46      1                     55
## 455   NEW ENGLAND      2015   46     77                  10753
## 456       PACIFIC      2015   46      3                    129
## 457   S. ATLANTIC      2015   46     66                   3061
## 458 UNITED STATES      2015   46    377                  31051
## 459  W.N. CENTRAL      2015   46      4                   1620
## 460  W.S. CENTRAL      2015   46      0                     31
## 461  E.N. CENTRAL      2015   47     14                   1914
## 462  E.S. CENTRAL      2015   47      2                    122
## 463 MID. ATLANTIC      2015   47    219                  13564
## 464      MOUNTAIN      2015   47      1                     56
## 465   NEW ENGLAND      2015   47     79                  10877
## 466       PACIFIC      2015   47      3                    131
## 467   S. ATLANTIC      2015   47     62                   3187
## 468 UNITED STATES      2015   47    384                  31512
## 469  W.N. CENTRAL      2015   47      4                   1630
## 470  W.S. CENTRAL      2015   47      0                     31
## 471  E.N. CENTRAL      2015   48     18                   1916
## 472  E.S. CENTRAL      2015   48      2                    123
## 473 MID. ATLANTIC      2015   48    227                  13683
## 474      MOUNTAIN      2015   48      1                     57
## 475   NEW ENGLAND      2015   48     81                  10947
## 476       PACIFIC      2015   48      3                    132
## 477   S. ATLANTIC      2015   48     64                   3244
## 478 UNITED STATES      2015   48    410                  31772
## 479  W.N. CENTRAL      2015   48      4                   1639
## 480  W.S. CENTRAL      2015   48      1                     31
## 481  E.N. CENTRAL      2015   49     18                   1927
## 482  E.S. CENTRAL      2015   49      2                    123
## 483 MID. ATLANTIC      2015   49    237                  13853
## 484      MOUNTAIN      2015   49      1                     58
## 485   NEW ENGLAND      2015   49     81                  11041
## 486       PACIFIC      2015   49      3                    133
## 487   S. ATLANTIC      2015   49     65                   3292
## 488 UNITED STATES      2015   49    424                  32105
## 489  W.N. CENTRAL      2015   49      4                   1647
## 490  W.S. CENTRAL      2015   49      1                     31
## 491  E.N. CENTRAL      2015   50     18                   1932
## 492  E.S. CENTRAL      2015   50      2                    124
## 493 MID. ATLANTIC      2015   50    241                  14034
## 494      MOUNTAIN      2015   50      1                     58
## 495   NEW ENGLAND      2015   50     82                  11095
## 496       PACIFIC      2015   50      3                    135
## 497   S. ATLANTIC      2015   50     62                   3410
## 498 UNITED STATES      2015   50    427                  32474
## 499  W.N. CENTRAL      2015   50      4                   1653
## 500  W.S. CENTRAL      2015   50      1                     33
## 501  E.N. CENTRAL      2015   51     20                   1939
## 502  E.S. CENTRAL      2015   51      2                    125
## 503 MID. ATLANTIC      2015   51    242                  14168
## 504      MOUNTAIN      2015   51      1                     58
## 505   NEW ENGLAND      2015   51     83                  11158
## 506       PACIFIC      2015   51      2                    137
## 507   S. ATLANTIC      2015   51     64                   3501
## 508 UNITED STATES      2015   51    429                  32778
## 509  W.N. CENTRAL      2015   51      3                   1658
## 510  W.S. CENTRAL      2015   51      1                     34
## 511  E.N. CENTRAL      2015   52     20                   1944
## 512  E.S. CENTRAL      2015   52      2                    127
## 513 MID. ATLANTIC      2015   52    242                  14251
## 514      MOUNTAIN      2015   52      1                     58
## 515   NEW ENGLAND      2015   52     83                  11184
## 516       PACIFIC      2015   52      2                    137
## 517   S. ATLANTIC      2015   52     65                   3665
## 518 UNITED STATES      2015   52    431                  33063
## 519  W.N. CENTRAL      2015   52      4                   1662
## 520  W.S. CENTRAL      2015   52      1                     35
# Linear model: weekly counts explained by week number and area
fit2 <- lm(counts ~ week + areas, data = lymefits2a)
summary(fit2)
## 
## Call:
## lm(formula = counts ~ week + areas, data = lymefits2a)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -38.622  -1.323  -0.100   0.987  42.423 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         13.87534    1.46461   9.474  < 2e-16 ***
## week                 0.01414    0.02694   0.525      0.6    
## areasE.S. CENTRAL  -12.75000    1.80846  -7.050 5.85e-12 ***
## areasMID. ATLANTIC 184.98077    1.80846 102.286  < 2e-16 ***
## areasMOUNTAIN      -13.25000    1.80846  -7.327 9.31e-13 ***
## areasNEW ENGLAND    94.03846    1.80846  51.999  < 2e-16 ***
## areasPACIFIC       -12.11538    1.80846  -6.699 5.56e-11 ***
## areasS. ATLANTIC    44.05769    1.80846  24.362  < 2e-16 ***
## areasUNITED STATES 401.09615    1.80846 221.789  < 2e-16 ***
## areasW.N. CENTRAL   -9.28846    1.80846  -5.136 4.00e-07 ***
## areasW.S. CENTRAL  -14.11538    1.80846  -7.805 3.40e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9.221 on 509 degrees of freedom
## Multiple R-squared:  0.9949, Adjusted R-squared:  0.9948 
## F-statistic:  9952 on 10 and 509 DF,  p-value: < 2.2e-16
# Show the four standard lm diagnostic plots in a 2 x 2 grid
par(mfrow = c(2, 2))
plot(fit2)

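Answer: The model explains 99.48% of the variation in the observations (adjusted R-squared = 0.9948). Almost all of this comes from the area: every area coefficient is highly significant, whereas the week coefficient (estimate = 0.01414, p-value = 0.6) is not. Consequently, the Lyme disease death counts are strongly associated with the area but, in this model, not with the week. The Normal Q-Q plot also suggests that the residuals of the model are not normally distributed, so the p-values should be read with caution.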
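The quantities discussed in this answer can be read directly from a fitted model. The sketch below is only an illustration: `lyme_excerpt` is a hypothetical data frame holding weeks 50 to 52 for four areas, copied from the table above, not the full `lymefits2a` dataset, so its numbers will differ from the output of `fit2`. The Shapiro-Wilk test is added here as one possible numerical complement to the Normal Q-Q plot; it is not part of the original analysis.

```r
# Hypothetical excerpt: weeks 50-52 for four areas, copied from the table above.
lyme_excerpt <- data.frame(
  areas  = rep(c("E.N. CENTRAL", "MID. ATLANTIC", "NEW ENGLAND", "PACIFIC"),
               times = 3),
  week   = rep(c(50, 51, 52), each = 4),
  counts = c(18, 241, 82, 3,   # week 50
             20, 242, 83, 2,   # week 51
             20, 242, 83, 2)   # week 52
)

# Same model formula as fit2, fitted on the small excerpt only.
fit_excerpt <- lm(counts ~ week + areas, data = lyme_excerpt)
fit_summary <- summary(fit_excerpt)

# Adjusted R-squared: share of variation explained, penalised for model size.
adj_r2 <- fit_summary$adj.r.squared

# Coefficient table: one row per term, with estimate, std. error, t and p-value.
coef_table <- coef(fit_summary)
week_p <- coef_table["week", "Pr(>|t|)"]

# Shapiro-Wilk test on the residuals (assumption: used here instead of
# reading the Normal Q-Q plot by eye).
normality <- shapiro.test(residuals(fit_excerpt))

print(adj_r2)
print(week_p)
print(normality$p.value)
```

Running the same three extractions on `fit2` would give the values reported in the summary above.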

Conclusion

Overall Conclusion

There is a difference in the proportion of Lyme disease death cases between US regions and within US regions. The regions with no significant difference in the proportion of Lyme disease death cases are W.N. Central (p-value = 0.9595), E.S. Central (p-value = 0.3916), W.N. Central (p-value = 1), Mountain (p-value = 0.9391), and Pacific (p-value = 0.9306). The regions with a significant difference in the proportion of Lyme disease death cases are New England (p-value = 8.992e-11), Mid. Atlantic (p-value = 0.0001513), E.N. Central (p-value = 0.0001186), and S. Atlantic (p-value = 9.17e-10).

The odds of dying of Lyme disease are much greater in the Mid. Atlantic than in the other US regions (odds = 47:53). The odds of dying of Lyme disease are much smaller in the Pacific than in the other US regions (odds = 1:99).
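As a rough illustration of how a ratio such as 47:53 can arise, the sketch below converts one region's share of the national count into odds. It assumes, purely for illustration, the week 26 rows of the table above (Mid. Atlantic = 196, United States = 421); the week or aggregate actually used for the reported odds is not stated here, so the variable names and the choice of week are hypothetical.

```r
# Assumed inputs: week 26 counts taken from the table above.
mid_atlantic  <- 196
united_states <- 421

# Share of the national count that falls in the Mid. Atlantic.
p <- mid_atlantic / united_states

# Odds = p / (1 - p), i.e. Mid. Atlantic counts against all other counts.
odds <- p / (1 - p)

# Express the split as "in region : elsewhere" out of 100.
split_in  <- round(100 * p)
split_out <- 100 - split_in

cat(sprintf("share = %.3f, odds = %.3f, split = %d:%d\n",
            p, odds, split_in, split_out))
```

Note that a 47:53 split corresponds to odds of about 0.87, not 0.47; the first number of the split is the percentage share.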

For the US, its regions, and its states, the linear regression model explains more than 97% of the variation in the observations (US states: adjusted R-squared = 0.9782; US and US regions: adjusted R-squared = 0.9948). In the US and regions model, this association is driven by the area; the week coefficient is not significant (p-value = 0.6).

Opinions

  1. I think we need more data (American Samoa, C.N.M.I., Guam, Puerto Rico, Virgin Islands) to get a better idea of the number of deaths due to Lyme disease in the U.S.

  2. The model fits well: all the US regions and states are significant, from the one with the fewest reported Lyme disease deaths to the one with the most. The adjusted R-squared values are 99.48% (US and US regions) and 97.82% (US states), respectively. However, the diagnostic plots indicate that a linear model may not be the most appropriate choice, because the relationship may not be linear. Setting the diagnostic plots aside, the models fit the data very well.

Lessons Learned

  1. Analyzing a researcher’s dataset is challenging (coding, missing data…).
  2. I appreciate the researchers' work more (the challenges, the difficulty of coding, the many hours of work).

Bibliography

https://www.kaggle.com/cdc/nndss-lyme-disease-to-meningococcal?select=nndss-table-ii.-lyme-disease-to-meningococcal.csv

https://www.cdc.gov/mmwr/about.html#:~:text=The%20Morbidity%20and%20Mortality%20Weekly%20Report%20(MMWR%20)%20series%20is%20prepared,Control%20and%20Prevention%20(CDC)