Topic: Proportions of US residents who died of Lyme disease in 2015 (Week 1 to Week 52).
Lyme disease, also called Lyme borreliosis, is a bacterial infection transmitted by the bite of an Ixodes tick, and only if the tick has previously fed on an infected animal. It can be transmitted to pets and humans. The most common symptoms are headache, chills, fever, and myalgia. A visible sign is a red or purple circle or mark on the skin where the tick has bitten. A doctor needs to be consulted only if the bitten person becomes sick. Antibiotics (doxycycline, amoxicillin, or cefuroxime) treat the disease. A tick can be removed from the body at home with fine-tipped tweezers or a tick puller. To avoid ticks and their bites, clothes can be treated with a tick spray.
The data were collected by observation. Three variables are numerical (week, Lyme disease count, and year, which is always 2015) and one is categorical (reporting area). The analysis concerns the proportions of US residents who died of Lyme disease. There are two sources of bias. First, the data are incomplete: values are missing for C.N.M.I., American Samoa, Guam, Puerto Rico, and the Virgin Islands. Consequently, we exclude those territories when answering the questions. Second, we assume that the reported counts are human deaths from Lyme disease.
The Centers for Disease Control and Prevention (CDC) publishes the Morbidity and Mortality Weekly Report (MMWR) series every week. The MMWR series is CDC’s primary medium for scientific publication of timely, credible, definitive, correct, impartial, and useful public health facts and recommendations. Physicians, nurses, public health professionals, epidemiologists and other scientists, academics, students, and laboratorians are among the most frequent readers of the MMWR. Here, the MMWR data begin with the 1st week of 2015 and end with the 52nd week of 2015. The proportions of Lyme disease deaths are reported by region and state in the USA.
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.0.4
## -- Attaching packages --------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.3 v purrr 0.3.4
## v tibble 3.0.6 v dplyr 1.0.3
## v tidyr 1.1.2 v stringr 1.4.0
## v readr 1.4.0 v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
setwd("C:/Users/auria/OneDrive/Bureau/MATH 217/Final Project")
lyme <- read_csv("lymeweek.csv")
##
## -- Column specification --------------------------------------------------------
## cols(
## Reporting_Area = col_character(),
## `MMWR Year` = col_double(),
## MMWR_Week = col_double(),
## `Lyme disease, Cum 2014` = col_double()
## )
glimpse(lyme)
## Rows: 3,380
## Columns: 4
## $ Reporting_Area <chr> "ALABAMA", "ALASKA", "ARIZONA", "ARKANSAS"...
## $ `MMWR Year` <dbl> 2015, 2015, 2015, 2015, 2015, 2015, 2015, ...
## $ MMWR_Week <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
## $ `Lyme disease, Cum 2014` <dbl> NA, NA, NA, NA, 1, NA, 11, 3, NA, 8, 1, 1,...
view(lyme)
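The exclusion of the territories described in the introduction could be done in code along the following lines. This is only a sketch on a small made-up data frame: the territory labels in `territories` and the column name `cases` are assumptions, not values taken from `lymeweek.csv`, and base R is used so the example runs without extra packages.

```r
# Toy stand-in for the real file; rows and values are made up for illustration
toy.lyme <- data.frame(
  Reporting_Area = c("ALABAMA", "GUAM", "PUERTO RICO", "CONNECTICUT"),
  MMWR_Week = c(1, 1, 1, 1),
  cases = c(NA, NA, NA, 11)
)

# Assumed spellings of the territory labels (check them against the real data)
territories <- c("C.N.M.I.", "AMER. SAMOA", "GUAM", "PUERTO RICO", "VIRGIN ISL.")

# Keep only the rows whose reporting area is not a territory
states.only <- toy.lyme[!toy.lyme$Reporting_Area %in% territories, ]
states.only
```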
Does the proportion of Lyme death cases vary between regions in the USA?
allregions1 <- c("New England", "Mid Atlantic", "E.N. Central", "W.N. Central", "S. Atlantic", "E.S. Central", "W.S. Central", "Mountain", "Pacific")
allcases1 <- c(124, 186, 13, 2, 44, 1, 0, 1, 2)
df1 <- data.frame(allregions1, allcases1)
df1
## allregions1 allcases1
## 1 New England 124
## 2 Mid Atlantic 186
## 3 E.N. Central 13
## 4 W.N. Central 2
## 5 S. Atlantic 44
## 6 E.S. Central 1
## 7 W.S. Central 0
## 8 Mountain 1
## 9 Pacific 2
plot1 <- df1%>%
ggplot(aes(x=allregions1, y = allcases1, fill = allregions1)) +
geom_bar(stat = "identity", position = "dodge") +
theme(axis.text.x = element_text(angle = 45)) +
ggtitle("Number of Deaths from Lyme Disease in Different Regions")+
ylab("Reported Cases") +
xlab("Regions")
plot1
Ho: The proportion of Lyme death cases is the same in every US region. Ha: The proportion of Lyme death cases differs between US regions.
392
## [1] 392
392/9
## [1] 43.55556
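# Null probabilities entered as whole-number shares of 392 (44 + 47 + 7 * 43 = 392); they are close to, but not exactly, 1/9 each
null.probs = c(44/392, 47/392, 43/392, 43/392, 43/392, 43/392, 43/392, 43/392, 43/392)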
allcases1 = c(124, 186, 13, 2, 44, 1, 0, 1, 2)
chisq.test(allcases1, p=null.probs)
##
## Chi-squared test for given probabilities
##
## data: allcases1
## X-squared = 819.53, df = 8, p-value < 2.2e-16
Answer: p-value < α. Reject the null. There is very strong evidence that the proportion of Lyme death cases differs between US regions.
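A null hypothesis of equal proportions can also be written with exactly equal probabilities instead of hand-rounded fractions. The sketch below is one possible way to do this, not the computation reported above: `equal.probs`, `expected`, and `test.equal` are names introduced here for illustration. `rep(1 / k, k)` is also the default value of `p` in `chisq.test()`, and the expected counts are derived from the total of the observed counts rather than typed in.

```r
allcases1 <- c(124, 186, 13, 2, 44, 1, 0, 1, 2)

k <- length(allcases1)
equal.probs <- rep(1 / k, k)               # exactly 1/9 for each region
expected <- sum(allcases1) * equal.probs   # expected count per region under Ho

# Goodness-of-fit test against exactly equal shares
test.equal <- chisq.test(allcases1, p = equal.probs)
test.equal$statistic
test.equal$p.value
```

Because the statistic depends on the null probabilities, it will not match a run that uses slightly unequal shares.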
Does the proportion of Lyme death cases vary within regions in the USA?
newenglands1 <- c("Connecticut", "Maine", "Massachusetts", "New Hampshire", "Rhode Island", "Vermont")
casenewenglands1 <- c(30, 15, 42, 5, 9, 8)
df2 <- data.frame(newenglands1, casenewenglands1)
df2
## newenglands1 casenewenglands1
## 1 Connecticut 30
## 2 Maine 15
## 3 Massachusetts 42
## 4 New Hampshire 5
## 5 Rhode Island 9
## 6 Vermont 8
plot2 <- df2%>%
ggplot(aes(x=newenglands1, y = casenewenglands1, fill = newenglands1)) +
geom_bar(stat = "identity", position = "dodge") +
theme(axis.text.x = element_text(angle = 30)) +
ggtitle("Number of Deaths from Lyme Disease in New England")+
ylab("Reported Cases") +
xlab("States")
plot2
Ho: The proportion of Lyme death cases is the same in every New England state. Ha: The proportion of Lyme death cases differs between New England states.
30+15+42+5+9+8
## [1] 109
109/6
## [1] 18.16667
null.probs = c(18/109, 18/109, 19/109, 18/109, 18/109, 18/109)
casenewenglands1 = c(30, 15, 42, 5, 9, 8)
chisq.test(casenewenglands1, p=null.probs)
##
## Chi-squared test for given probabilities
##
## data: casenewenglands1
## X-squared = 55.787, df = 5, p-value = 8.992e-11
Answer: p-value < α. Reject the null. There is very strong evidence that the proportion of Lyme death cases differs between New England states.
ma1 <- c("New Jersey", "New York", "Pennsylvania")
casema1 <- c(39, 44, 79)
df3 <- data.frame(ma1, casema1)
df3
## ma1 casema1
## 1 New Jersey 39
## 2 New York 44
## 3 Pennsylvania 79
plot3 <- df3%>%
ggplot(aes(x=ma1, y = casema1, fill = ma1)) +
geom_bar(stat = "identity", position = "dodge") +
theme(axis.text.x = element_text(angle = 360)) +
ggtitle("Number of Deaths from Lyme Disease in the Mid Atlantic")+
ylab("Reported Cases") +
xlab("States")
plot3
Ho: The proportion of Lyme death cases is the same in every Mid Atlantic state. Ha: The proportion of Lyme death cases differs between Mid Atlantic states.
39+44+79
## [1] 162
162/3
## [1] 54
null.probs = c(54/162, 54/162, 54/162)
casema1 = c(39, 44, 79)
chisq.test(casema1, p=null.probs)
##
## Chi-squared test for given probabilities
##
## data: casema1
## X-squared = 17.593, df = 2, p-value = 0.0001513
Answer: p-value < α. Reject the null. There is very strong evidence that the proportion of Lyme death cases differs between Mid Atlantic states.
enc1 <- c("Illinois", "Indiana", "Michigan", "Ohio", "Wisconsin")
caseenc1 <- c(1, 1, 1, 1, 10)
df4 <- data.frame(enc1, caseenc1)
df4
## enc1 caseenc1
## 1 Illinois 1
## 2 Indiana 1
## 3 Michigan 1
## 4 Ohio 1
## 5 Wisconsin 10
plot4 <- df4%>%
ggplot(aes(x=enc1, y = caseenc1, fill = enc1)) +
geom_bar(stat = "identity", position = "dodge") +
theme(axis.text.x = element_text(angle = 30)) +
ggtitle("Number of Deaths from Lyme Disease in E. N. Central")+
ylab("Reported Cases") +
xlab("States")
plot4
Ho: The proportion of Lyme death cases is the same in every E. N. Central state. Ha: The proportion of Lyme death cases differs between E. N. Central states.
1+1+1+1+10
## [1] 14
14/5
## [1] 2.8
null.probs = c(2.8/14, 2.8/14, 2.8/14, 2.8/14, 2.8/14)
caseenc1 = c(1,1,1,1,10)
chisq.test(caseenc1, p=null.probs)
## Warning in chisq.test(caseenc1, p = null.probs): Chi-squared approximation may
## be incorrect
##
## Chi-squared test for given probabilities
##
## data: caseenc1
## X-squared = 23.143, df = 4, p-value = 0.0001186
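Answer: p-value < α. Reject the null. There is very strong evidence that the proportion of Lyme death cases differs between E. N. Central states. Note that the expected count per state (2.8) is below 5, which is why R warns that the chi-squared approximation may be incorrect.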
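When expected counts are this small, one common alternative is to let `chisq.test()` estimate the p-value by Monte Carlo simulation rather than from the chi-squared distribution. The sketch below illustrates that option on the E. N. Central counts; the seed, the number of replicates `B`, and the name `sim.test` are arbitrary choices made here, not part of the original analysis.

```r
caseenc1 <- c(1, 1, 1, 1, 10)

set.seed(217)  # arbitrary seed so the simulated p-value is reproducible

# Same equal-share null (0.2 per state), but the p-value comes from
# simulated multinomial samples instead of the chi-squared approximation
sim.test <- chisq.test(caseenc1, p = rep(1 / 5, 5),
                       simulate.p.value = TRUE, B = 10000)
sim.test$statistic
sim.test$p.value
```

The test statistic is unchanged by this option; only the way the p-value is obtained differs.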
wnc1 <- c("Iowa", "Kansas", "Minnesota", "Missouri", "Nebraska", "North Dakota", "South Dakota")
casewnc1 <- c(1, 0, 0, 0, 0, 0, 0)
df10 <- data.frame(wnc1, casewnc1)
df10
## wnc1 casewnc1
## 1 Iowa 1
## 2 Kansas 0
## 3 Minnesota 0
## 4 Missouri 0
## 5 Nebraska 0
## 6 North Dakota 0
## 7 South Dakota 0
plot10 <- df10 %>%
ggplot(aes(x=wnc1, y = casewnc1, fill = wnc1)) +
geom_bar(stat = "identity", position = "dodge") +
theme(axis.text.x = element_text(angle = 30)) +
ggtitle("Number of Deaths from Lyme Disease in W. N. Central")+
ylab("Reported Cases") +
xlab("States")
plot10
Ho: The proportion of Lyme death cases is the same in every W. N. Central state. Ha: The proportion of Lyme death cases differs between W. N. Central states.
1
## [1] 1
1/7
## [1] 0.1428571
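# Note: these null probabilities are not equal across states (0.4 for Iowa, 0.1 for the others); exactly equal shares would be rep(1/7, 7)
null.probs = c(0.4, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1)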
casewnc1 = c(1, 0, 0, 0, 0, 0, 0)
chisq.test(casewnc1, p=null.probs)
## Warning in chisq.test(casewnc1, p = null.probs): Chi-squared approximation may
## be incorrect
##
## Chi-squared test for given probabilities
##
## data: casewnc1
## X-squared = 1.5, df = 6, p-value = 0.9595
Answer: p-value > α. Fail to reject the null. There is no compelling evidence that the proportion of Lyme death cases differs between W. N. Central states.
sa1 <- c("Delaware", "DC", "Florida", "Georgia", "Maryland", "North Carolina", "South Carolina", "Virginia", "West Virginia")
casesa1 <- c(6, 0, 2, 0, 15, 0, 0, 16, 1)
df5 <- data.frame(sa1, casesa1)
df5
## sa1 casesa1
## 1 Delaware 6
## 2 DC 0
## 3 Florida 2
## 4 Georgia 0
## 5 Maryland 15
## 6 North Carolina 0
## 7 South Carolina 0
## 8 Virginia 16
## 9 West Virginia 1
plot5 <- df5%>%
ggplot(aes(x=sa1, y = casesa1, fill = sa1)) +
geom_bar(stat = "identity", position = "dodge") +
theme(axis.text.x = element_text(angle = 30)) +
ggtitle("Number of Deaths from Lyme Disease in S. Atlantic")+
ylab("Reported Cases") +
xlab("States")
plot5
Ho: The proportion of Lyme death cases is the same in every S. Atlantic area. Ha: The proportion of Lyme death cases differs between S. Atlantic areas.
6+0+2+0+15+0+0+16+1
## [1] 40
40/9
## [1] 4.444444
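# Null probabilities entered as whole-number shares of 40 (8 * 4 + 8 = 40); Virginia gets 8/40, every other area 4/40
null.probs = c(4/40, 4/40, 4/40, 4/40, 4/40, 4/40, 4/40, 8/40, 4/40)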
casesa1 = c(6, 0, 2, 0, 15, 0, 0, 16, 1)
chisq.test(casesa1, p=null.probs)
## Warning in chisq.test(casesa1, p = null.probs): Chi-squared approximation may be
## incorrect
##
## Chi-squared test for given probabilities
##
## data: casesa1
## X-squared = 58.5, df = 8, p-value = 9.17e-10
Answer: p-value < α. Reject the null. There is very strong evidence that the proportion of Lyme death cases differs between S. Atlantic areas.
esc1 <- c("Alabama", "Kentucky", "Mississippi", "Tennessee")
caseesc1 <- c(0, 1, 0, 0)
df6 <- data.frame(esc1, caseesc1)
df6
## esc1 caseesc1
## 1 Alabama 0
## 2 Kentucky 1
## 3 Mississippi 0
## 4 Tennessee 0
plot6 <- df6%>%
ggplot(aes(x=esc1, y = caseesc1, fill = esc1)) +
geom_bar(stat = "identity", position = "dodge") +
theme(axis.text.x = element_text(angle = 360)) +
ggtitle("Number of Deaths from Lyme Disease in E. S. Central")+
ylab("Reported Cases") +
xlab("States")
plot6
Ho: The proportion of Lyme death cases is the same in every E. S. Central state. Ha: The proportion of Lyme death cases differs between E. S. Central states.
1
## [1] 1
1/4
## [1] 0.25
null.probs = c(0.25, 0.25, 0.25, 0.25)
caseesc1 = c(0, 1, 0, 0)
chisq.test(caseesc1, p=null.probs)
## Warning in chisq.test(caseesc1, p = null.probs): Chi-squared approximation may
## be incorrect
##
## Chi-squared test for given probabilities
##
## data: caseesc1
## X-squared = 3, df = 3, p-value = 0.3916
Answer: p-value > α. Fail to reject the null. There is no compelling evidence that the proportion of Lyme death cases differs between E. S. Central states.
wsc1 <- c("Arkansas", "Louisiana", "Oklahoma", "Texas")
casewsc1 <- c(0, 0, 0, 0)
df7 <- data.frame(wsc1, casewsc1)
df7
## wsc1 casewsc1
## 1 Arkansas 0
## 2 Louisiana 0
## 3 Oklahoma 0
## 4 Texas 0
plot7 <- df7%>%
ggplot(aes(x=wsc1, y = casewsc1, fill = wsc1)) +
geom_bar(stat = "identity", position = "dodge") +
theme(axis.text.x = element_text(angle = 360)) +
ggtitle("Number of Deaths from Lyme Disease in W. S. Central")+
ylab("Reported Cases") +
xlab("States")
plot7
Ho: The proportion of Lyme death cases is the same in every W. S. Central state. Ha: The proportion of Lyme death cases differs between W. S. Central states.
Answer: All four states report zero cases, so a chi-squared test cannot be computed here: there are no observations to compare with the expected counts. By inspection, the counts are identical across states, so we fail to reject the null. There is no compelling evidence that the proportion of Lyme death cases differs between W. S. Central states.
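A small wrapper can make the zero-count case explicit instead of leaving it to inspection. The function `equal_share_test()` below is a hypothetical helper written for this illustration, not part of the original analysis: it returns `NULL` when a region has no cases at all and otherwise runs the equal-share goodness-of-fit test.

```r
# Hypothetical helper: equal-share chi-squared test that skips regions
# in which no cases were reported
equal_share_test <- function(counts) {
  if (sum(counts) == 0) {
    return(NULL)  # nothing to test: every area reports zero cases
  }
  k <- length(counts)
  chisq.test(counts, p = rep(1 / k, k))  # may warn when expected counts are small
}

casewsc1 <- c(0, 0, 0, 0)  # W. S. Central
caseesc1 <- c(0, 1, 0, 0)  # E. S. Central

is.null(equal_share_test(casewsc1))
equal_share_test(caseesc1)$p.value
```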
m1 <- c("Arizona", "Colorado", "Idaho", "Montana", "Nevada", "New Mexico", "Utah", "Wyoming")
casem1 <- c(0, 0, 0, 0, 1, 0, 0, 0)
df8 <- data.frame(m1, casem1)
df8
## m1 casem1
## 1 Arizona 0
## 2 Colorado 0
## 3 Idaho 0
## 4 Montana 0
## 5 Nevada 1
## 6 New Mexico 0
## 7 Utah 0
## 8 Wyoming 0
plot8 <- df8%>%
ggplot(aes(x=m1, y = casem1, fill = m1)) +
geom_bar(stat = "identity", position = "dodge") +
theme(axis.text.x = element_text(angle = 30)) +
ggtitle("Number of Deaths from Lyme Disease in Mountain")+
ylab("Reported Cases") +
xlab("States")
plot8
Ho: The proportion of Lyme death cases is the same in every Mountain state. Ha: The proportion of Lyme death cases differs between Mountain states.
1
## [1] 1
1/8
## [1] 0.125
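# Note: these null probabilities are not equal across states (0.3 for Nevada, 0.1 for the others); exactly equal shares would be rep(1/8, 8)
null.probs = c(0.1, 0.1, 0.1, 0.1, 0.3, 0.1, 0.1, 0.1)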
casem1 = c(0, 0, 0, 0, 1, 0, 0, 0)
chisq.test(casem1, p=null.probs)
## Warning in chisq.test(casem1, p = null.probs): Chi-squared approximation may be
## incorrect
##
## Chi-squared test for given probabilities
##
## data: casem1
## X-squared = 2.3333, df = 7, p-value = 0.9391
Answer: p-value > α. Fail to reject the null. There is no compelling evidence that the proportion of Lyme death cases differs between Mountain states.
p1 <- c("Alaska", "California", "Hawaii", "Oregon", "Washington")
casep1 <- c(0, 1, 0, 1, 0)
df9 <- data.frame(p1, casep1)
df9
## p1 casep1
## 1 Alaska 0
## 2 California 1
## 3 Hawaii 0
## 4 Oregon 1
## 5 Washington 0
plot9 <- df9%>%
ggplot(aes(x=p1, y = casep1, fill = p1)) +
geom_bar(stat = "identity", position = "dodge") +
theme(axis.text.x = element_text(angle = 360)) +
ggtitle("Number of Deaths from Lyme Disease in Pacific")+
ylab("Reported Cases") +
xlab("States")
plot9
Ho: The proportion of Lyme death cases is the same in every Pacific state. Ha: The proportion of Lyme death cases differs between Pacific states.
1+1
## [1] 2
2/5
## [1] 0.4
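# Note: these null probabilities are not equal across states (0.35 for California and Oregon, 0.1 for the others); exactly equal shares would be rep(1/5, 5)
null.probs = c(0.1, 0.35, 0.1, 0.35, 0.1)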
casep1 = c(0, 1, 0, 1, 0)
chisq.test(casep1, p=null.probs)
## Warning in chisq.test(casep1, p = null.probs): Chi-squared approximation may be
## incorrect
##
## Chi-squared test for given probabilities
##
## data: casep1
## X-squared = 0.85714, df = 4, p-value = 0.9306
Answer: p-value > α. Fail to reject the null. There is no compelling evidence that the proportion of Lyme death cases differs between Pacific states.
MidAtlantic <- c(186, 0.4744898)
Others1 <- c(206, 0.5255102)
US <- c(392, 1)
testma <- data.frame(MidAtlantic, Others1, US)
testma
## MidAtlantic Others1 US
## 1 186.0000000 206.0000000 392
## 2 0.4744898 0.5255102 1
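Answer: The Mid Atlantic accounts for 186 of the 392 US cases (47.4%). The ratio of Mid Atlantic cases to cases in the rest of the US is 186/206 = 0.9029126. Because the table holds case counts rather than rates per resident, this is a ratio of counts and not a relative risk for an individual resident.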
Pacific <- c(2, 0.005102041)
Others2 <- c(390, 0.994898)
US <- c(392, 1)
testp <- data.frame(Pacific, Others2, US)
testp
## Pacific Others2 US
## 1 2.000000000 390.000000 392
## 2 0.005102041 0.994898 1
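Answer: The Pacific accounts for 2 of the 392 US cases (0.5%). The ratio of Pacific cases to cases in the rest of the US is 2/390 = 0.005128205. As above, this is a ratio of counts and not a relative risk for an individual resident.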
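The shares and ratios in the two tables above come from simple arithmetic on the regional counts, which can be reproduced as follows. The names `us.total`, `region.cases`, `share`, and `count.ratio` are introduced here for illustration only.

```r
us.total <- 392
region.cases <- c(MidAtlantic = 186, Pacific = 2)

# Share of all US cases reported in each region
share <- region.cases / us.total

# Cases in the region divided by cases in the rest of the US
count.ratio <- region.cases / (us.total - region.cases)

share
count.ratio
```

Turning these into risks would require the resident population of each region as the denominator, which the data set does not contain.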
library(tidyverse)
setwd("C:/Users/auria/OneDrive/Bureau/MATH 217/Final Project")
lymefits1 <- read_csv("lymefits1.csv")
##
## -- Column specification --------------------------------------------------------
## cols(
## Reporting_Area = col_character(),
## MMWR_Year = col_double(),
## MMWR_Week = col_double(),
## `Lyme disease, Previous 52 weeks Med` = col_double(),
## `Lyme disease, Cum 2014` = col_double()
## )
lymefits1a <- lymefits1 %>%
rename(
areas = Reporting_Area,
counts = `Lyme disease, Previous 52 weeks Med`,
week = MMWR_Week
)
lymefits1a
## # A tibble: 2,652 x 5
## areas MMWR_Year week counts `Lyme disease, Cum 2014`
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 ALABAMA 2015 1 0 NA
## 2 ALASKA 2015 1 0 NA
## 3 ARIZONA 2015 1 0 NA
## 4 ARKANSAS 2015 1 0 NA
## 5 CALIFORNIA 2015 1 0 1
## 6 COLORADO 2015 1 0 NA
## 7 CONNECTICUT 2015 1 30 11
## 8 DELAWARE 2015 1 6 3
## 9 DIST. OF COL. 2015 1 0 NA
## 10 FLORIDA 2015 1 2 1
## # ... with 2,642 more rows
glimpse(lymefits1)
## Rows: 2,652
## Columns: 5
## $ Reporting_Area <chr> "ALABAMA", "ALASKA", "ARIZONA...
## $ MMWR_Year <dbl> 2015, 2015, 2015, 2015, 2015,...
## $ MMWR_Week <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,...
## $ `Lyme disease, Previous 52 weeks Med` <dbl> 0, 0, 0, 0, 0, 0, 30, 6, 0, 2...
## $ `Lyme disease, Cum 2014` <dbl> NA, NA, NA, NA, 1, NA, 11, 3,...
View(lymefits1)
fit1 <- lm(counts ~ week + areas, data=lymefits1a)
summary(fit1)
##
## Call:
## lm(formula = counts ~ week + areas, data = lymefits1a)
##
## Residuals:
## Min 1Q Median 3Q Max
## -17.409 -0.101 -0.002 0.081 32.947
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.623015 0.332482 1.874 0.061066 .
## week -0.003191 0.003011 -1.060 0.289451
## areasALASKA -0.538462 0.456456 -1.180 0.238244
## areasARIZONA -0.538462 0.456456 -1.180 0.238244
## areasARKANSAS -0.538462 0.456456 -1.180 0.238244
## areasCALIFORNIA 0.019231 0.456456 0.042 0.966398
## areasCOLORADO -0.538462 0.456456 -1.180 0.238244
## areasCONNECTICUT 24.884615 0.456456 54.517 < 2e-16 ***
## areasDELAWARE 4.903846 0.456456 10.743 < 2e-16 ***
## areasDIST. OF COL. 0.423077 0.456456 0.927 0.354078
## areasFLORIDA 1.730769 0.456456 3.792 0.000153 ***
## areasGEORGIA -0.538462 0.456456 -1.180 0.238244
## areasHAWAII -0.538462 0.456456 -1.180 0.238244
## areasIDAHO -0.538462 0.456456 -1.180 0.238244
## areasILLINOIS 0.730769 0.456456 1.601 0.109506
## areasINDIANA -0.019231 0.456456 -0.042 0.966398
## areasIOWA 0.596154 0.456456 1.306 0.191651
## areasKANSAS -0.538462 0.456456 -1.180 0.238244
## areasKENTUCKY -0.538462 0.456456 -1.180 0.238244
## areasLOUISIANA -0.538462 0.456456 -1.180 0.238244
## areasMAINE 13.576923 0.456456 29.744 < 2e-16 ***
## areasMARYLAND 19.538462 0.456456 42.805 < 2e-16 ***
## areasMASSACHUSETTS 37.942308 0.456456 83.124 < 2e-16 ***
## areasMICHIGAN 0.519231 0.456456 1.138 0.255423
## areasMINNESOTA 0.576923 0.456456 1.264 0.206372
## areasMISSISSIPPI -0.538462 0.456456 -1.180 0.238244
## areasMISSOURI -0.538462 0.456456 -1.180 0.238244
## areasMONTANA -0.538462 0.456456 -1.180 0.238244
## areasNEBRASKA -0.538462 0.456456 -1.180 0.238244
## areasNEVADA -0.538462 0.456456 -1.180 0.238244
## areasNEW HAMPSHIRE 3.961538 0.456456 8.679 < 2e-16 ***
## areasNEW JERSEY 39.500000 0.456456 86.536 < 2e-16 ***
## areasNEW MEXICO -0.538462 0.456456 -1.180 0.238244
## areasNEW YORK 40.846154 0.456456 89.486 < 2e-16 ***
## areasNORTH CAROLINA -0.384615 0.456456 -0.843 0.399522
## areasNORTH DAKOTA -0.538462 0.456456 -1.180 0.238244
## areasOHIO 1.211538 0.456456 2.654 0.007997 **
## areasOKLAHOMA -0.538462 0.456456 -1.180 0.238244
## areasOREGON 0.307692 0.456456 0.674 0.500314
## areasPENNSYLVANIA 88.596154 0.456456 194.096 < 2e-16 ***
## areasRHODE ISLAND 10.076923 0.456456 22.076 < 2e-16 ***
## areasSOUTH CAROLINA 0.423077 0.456456 0.927 0.354078
## areasSOUTH DAKOTA -0.538462 0.456456 -1.180 0.238244
## areasTENNESSEE -0.538462 0.456456 -1.180 0.238244
## areasTEXAS -0.403846 0.456456 -0.885 0.376377
## areasUTAH -0.538462 0.456456 -1.180 0.238244
## areasVERMONT 6.115385 0.456456 13.398 < 2e-16 ***
## areasVIRGINIA 18.750000 0.456456 41.077 < 2e-16 ***
## areasWASHINGTON -0.538462 0.456456 -1.180 0.238244
## areasWEST VIRGINIA 1.307692 0.456456 2.865 0.004205 **
## areasWISCONSIN 8.807692 0.456456 19.296 < 2e-16 ***
## areasWYOMING -0.538462 0.456456 -1.180 0.238244
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.327 on 2600 degrees of freedom
## Multiple R-squared: 0.9786, Adjusted R-squared: 0.9782
## F-statistic: 2337 on 51 and 2600 DF, p-value: < 2.2e-16
par(mfrow = c(2,2))
plot(fit1)
Answer: The adjusted R-squared is 0.9782, so this model explains about 97.82% of the variation in the observations. Consequently, the Lyme disease death counts are strongly associated with the area. The week coefficient, however, is not significant (p = 0.289), so there is no evidence of a linear trend over the weeks once area is accounted for. The Normal Q-Q plot also shows that the residuals of the model are not normally distributed.
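One way to check which predictor drives the fit is to compare nested models with an F-test. The sketch below shows the idea on a small simulated data set, since the real file is not reproduced here: the data frame `toy`, its three areas, and the baseline values are all made up for illustration and say nothing about the actual Lyme counts.

```r
set.seed(1)

# Made-up toy data: three hypothetical areas observed over 52 weeks,
# each with its own baseline level plus random noise (no week effect)
toy <- expand.grid(week = 1:52, areas = c("A", "B", "C"))
baseline <- c(A = 0, B = 10, C = 40)
toy$counts <- unname(baseline[as.character(toy$areas)]) + rnorm(nrow(toy), sd = 2)

fit.area <- lm(counts ~ areas, data = toy)         # area only
fit.both <- lm(counts ~ week + areas, data = toy)  # area plus week

# F-test: does adding week improve on the area-only model?
anova(fit.area, fit.both)
summary(fit.both)$adj.r.squared
```

A high adjusted R-squared together with a non-significant F-test for `week` would indicate that area alone explains most of the variation.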
library(tidyverse)
setwd("C:/Users/auria/OneDrive/Bureau/MATH 217/Final Project")
lymefits2 <- read.csv("lymefits2.csv")
glimpse(lymefits2)
## Rows: 520
## Columns: 5
## $ Reporting_Area <chr> "E.N. CENTRAL", "E.S. CENTRAL", ...
## $ MMWR.Year <int> 2015, 2015, 2015, 2015, 2015, 20...
## $ MMWR_Week <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2,...
## $ Lyme.disease..Previous.52.week.Med <int> 13, 1, 186, 1, 124, 2, 44, 392, ...
## $ Lyme.disease..Cum.2014 <int> 8, 1, 77, 1, 53, 2, 11, 156, 3, ...
View(lymefits2)
lymefits2a <- lymefits2 %>%
rename(
areas = Reporting_Area,
counts = `Lyme.disease..Previous.52.week.Med`,
week = MMWR_Week
)
lymefits2a
## areas MMWR.Year week counts Lyme.disease..Cum.2014
## 1 E.N. CENTRAL 2015 1 13 8
## 2 E.S. CENTRAL 2015 1 1 1
## 3 MID. ATLANTIC 2015 1 186 77
## 4 MOUNTAIN 2015 1 1 1
## 5 NEW ENGLAND 2015 1 124 53
## 6 PACIFIC 2015 1 2 2
## 7 S. ATLANTIC 2015 1 44 11
## 8 UNITED STATES 2015 1 392 156
## 9 W.N. CENTRAL 2015 1 2 3
## 10 W.S. CENTRAL 2015 1 0 NA
## 11 E.N. CENTRAL 2015 2 13 17
## 12 E.S. CENTRAL 2015 2 1 1
## 13 MID. ATLANTIC 2015 2 187 143
## 14 MOUNTAIN 2015 2 1 2
## 15 NEW ENGLAND 2015 2 124 126
## 16 PACIFIC 2015 2 2 3
## 17 S. ATLANTIC 2015 2 45 28
## 18 UNITED STATES 2015 2 393 323
## 19 W.N. CENTRAL 2015 2 2 3
## 20 W.S. CENTRAL 2015 2 0 NA
## 21 E.N. CENTRAL 2015 3 13 22
## 22 E.S. CENTRAL 2015 3 1 1
## 23 MID. ATLANTIC 2015 3 187 204
## 24 MOUNTAIN 2015 3 1 2
## 25 NEW ENGLAND 2015 3 121 252
## 26 PACIFIC 2015 3 2 4
## 27 S. ATLANTIC 2015 3 48 41
## 28 UNITED STATES 2015 3 400 530
## 29 W.N. CENTRAL 2015 3 2 4
## 30 W.S. CENTRAL 2015 3 0 NA
## 31 E.N. CENTRAL 2015 4 13 27
## 32 E.S. CENTRAL 2015 4 2 2
## 33 MID. ATLANTIC 2015 4 187 273
## 34 MOUNTAIN 2015 4 1 3
## 35 NEW ENGLAND 2015 4 121 297
## 36 PACIFIC 2015 4 2 2
## 37 S. ATLANTIC 2015 4 48 51
## 38 UNITED STATES 2015 4 401 659
## 39 W.N. CENTRAL 2015 4 2 4
## 40 W.S. CENTRAL 2015 4 0 NA
## 41 E.N. CENTRAL 2015 5 13 34
## 42 E.S. CENTRAL 2015 5 2 2
## 43 MID. ATLANTIC 2015 5 187 360
## 44 MOUNTAIN 2015 5 1 3
## 45 NEW ENGLAND 2015 5 121 360
## 46 PACIFIC 2015 5 2 2
## 47 S. ATLANTIC 2015 5 48 65
## 48 UNITED STATES 2015 5 402 831
## 49 W.N. CENTRAL 2015 5 2 5
## 50 W.S. CENTRAL 2015 5 0 NA
## 51 E.N. CENTRAL 2015 6 15 38
## 52 E.S. CENTRAL 2015 6 2 2
## 53 MID. ATLANTIC 2015 6 187 425
## 54 MOUNTAIN 2015 6 1 4
## 55 NEW ENGLAND 2015 6 122 418
## 56 PACIFIC 2015 6 2 3
## 57 S. ATLANTIC 2015 6 52 88
## 58 UNITED STATES 2015 6 405 984
## 59 W.N. CENTRAL 2015 6 2 6
## 60 W.S. CENTRAL 2015 6 0 NA
## 61 E.N. CENTRAL 2015 7 15 40
## 62 E.S. CENTRAL 2015 7 2 2
## 63 MID. ATLANTIC 2015 7 187 488
## 64 MOUNTAIN 2015 7 1 4
## 65 NEW ENGLAND 2015 7 122 465
## 66 PACIFIC 2015 7 2 3
## 67 S. ATLANTIC 2015 7 52 104
## 68 UNITED STATES 2015 7 406 1112
## 69 W.N. CENTRAL 2015 7 2 6
## 70 W.S. CENTRAL 2015 7 0 NA
## 71 E.N. CENTRAL 2015 8 15 42
## 72 E.S. CENTRAL 2015 8 2 3
## 73 MID. ATLANTIC 2015 8 189 567
## 74 MOUNTAIN 2015 8 1 4
## 75 NEW ENGLAND 2015 8 122 512
## 76 PACIFIC 2015 8 2 4
## 77 S. ATLANTIC 2015 8 53 115
## 78 UNITED STATES 2015 8 408 1253
## 79 W.N. CENTRAL 2015 8 2 6
## 80 W.S. CENTRAL 2015 8 0 NA
## 81 E.N. CENTRAL 2015 9 15 53
## 82 E.S. CENTRAL 2015 9 2 3
## 83 MID. ATLANTIC 2015 9 191 681
## 84 MOUNTAIN 2015 9 1 4
## 85 NEW ENGLAND 2015 9 125 560
## 86 PACIFIC 2015 9 2 3
## 87 S. ATLANTIC 2015 9 53 132
## 88 UNITED STATES 2015 9 413 1443
## 89 W.N. CENTRAL 2015 9 2 7
## 90 W.S. CENTRAL 2015 9 0 NA
## 91 E.N. CENTRAL 2015 10 15 62
## 92 E.S. CENTRAL 2015 10 2 4
## 93 MID. ATLANTIC 2015 10 192 776
## 94 MOUNTAIN 2015 10 1 6
## 95 NEW ENGLAND 2015 10 130 617
## 96 PACIFIC 2015 10 2 8
## 97 S. ATLANTIC 2015 10 55 145
## 98 UNITED STATES 2015 10 419 1627
## 99 W.N. CENTRAL 2015 10 2 9
## 100 W.S. CENTRAL 2015 10 0 NA
## 101 E.N. CENTRAL 2015 11 15 71
## 102 E.S. CENTRAL 2015 11 2 4
## 103 MID. ATLANTIC 2015 11 192 844
## 104 MOUNTAIN 2015 11 1 6
## 105 NEW ENGLAND 2015 11 130 689
## 106 PACIFIC 2015 11 2 9
## 107 S. ATLANTIC 2015 11 56 160
## 108 UNITED STATES 2015 11 420 1792
## 109 W.N. CENTRAL 2015 11 2 9
## 110 W.S. CENTRAL 2015 11 0 NA
## 111 E.N. CENTRAL 2015 12 15 73
## 112 E.S. CENTRAL 2015 12 2 4
## 113 MID. ATLANTIC 2015 12 192 931
## 114 MOUNTAIN 2015 12 1 7
## 115 NEW ENGLAND 2015 12 130 753
## 116 PACIFIC 2015 12 2 12
## 117 S. ATLANTIC 2015 12 56 185
## 118 UNITED STATES 2015 12 421 1974
## 119 W.N. CENTRAL 2015 12 2 9
## 120 W.S. CENTRAL 2015 12 0 NA
## 121 E.N. CENTRAL 2015 13 15 77
## 122 E.S. CENTRAL 2015 13 2 5
## 123 MID. ATLANTIC 2015 13 192 1001
## 124 MOUNTAIN 2015 13 1 8
## 125 NEW ENGLAND 2015 13 131 806
## 126 PACIFIC 2015 13 2 13
## 127 S. ATLANTIC 2015 13 57 210
## 128 UNITED STATES 2015 13 421 2130
## 129 W.N. CENTRAL 2015 13 2 10
## 130 W.S. CENTRAL 2015 13 0 NA
## 131 E.N. CENTRAL 2015 14 15 90
## 132 E.S. CENTRAL 2015 14 2 7
## 133 MID. ATLANTIC 2015 14 192 1132
## 134 MOUNTAIN 2015 14 1 8
## 135 NEW ENGLAND 2015 14 134 896
## 136 PACIFIC 2015 14 2 13
## 137 S. ATLANTIC 2015 14 57 236
## 138 UNITED STATES 2015 14 423 2389
## 139 W.N. CENTRAL 2015 14 2 7
## 140 W.S. CENTRAL 2015 14 0 NA
## 141 E.N. CENTRAL 2015 15 15 102
## 142 E.S. CENTRAL 2015 15 2 8
## 143 MID. ATLANTIC 2015 15 192 1213
## 144 MOUNTAIN 2015 15 1 8
## 145 NEW ENGLAND 2015 15 134 953
## 146 PACIFIC 2015 15 2 15
## 147 S. ATLANTIC 2015 15 57 258
## 148 UNITED STATES 2015 15 424 2564
## 149 W.N. CENTRAL 2015 15 2 7
## 150 W.S. CENTRAL 2015 15 0 NA
## 151 E.N. CENTRAL 2015 16 14 116
## 152 E.S. CENTRAL 2015 16 2 10
## 153 MID. ATLANTIC 2015 16 192 1315
## 154 MOUNTAIN 2015 16 1 8
## 155 NEW ENGLAND 2015 16 137 1064
## 156 PACIFIC 2015 16 2 16
## 157 S. ATLANTIC 2015 16 57 293
## 158 UNITED STATES 2015 16 428 2830
## 159 W.N. CENTRAL 2015 16 3 8
## 160 W.S. CENTRAL 2015 16 0 NA
## 161 E.N. CENTRAL 2015 17 16 138
## 162 E.S. CENTRAL 2015 17 2 14
## 163 MID. ATLANTIC 2015 17 187 1863
## 164 MOUNTAIN 2015 17 1 10
## 165 NEW ENGLAND 2015 17 137 1154
## 166 PACIFIC 2015 17 2 20
## 167 S. ATLANTIC 2015 17 57 326
## 168 UNITED STATES 2015 17 413 3534
## 169 W.N. CENTRAL 2015 17 2 9
## 170 W.S. CENTRAL 2015 17 1 NA
## 171 E.N. CENTRAL 2015 18 16 153
## 172 E.S. CENTRAL 2015 18 2 16
## 173 MID. ATLANTIC 2015 18 201 2045
## 174 MOUNTAIN 2015 18 1 10
## 175 NEW ENGLAND 2015 18 129 1295
## 176 PACIFIC 2015 18 2 22
## 177 S. ATLANTIC 2015 18 58 372
## 178 UNITED STATES 2015 18 437 4029
## 179 W.N. CENTRAL 2015 18 10 115
## 180 W.S. CENTRAL 2015 18 0 1
## 181 E.N. CENTRAL 2015 19 13 172
## 182 E.S. CENTRAL 2015 19 2 21
## 183 MID. ATLANTIC 2015 19 201 2284
## 184 MOUNTAIN 2015 19 1 11
## 185 NEW ENGLAND 2015 19 122 1430
## 186 PACIFIC 2015 19 2 24
## 187 S. ATLANTIC 2015 19 58 403
## 188 UNITED STATES 2015 19 448 4517
## 189 W.N. CENTRAL 2015 19 10 169
## 190 W.S. CENTRAL 2015 19 0 3
## 191 E.N. CENTRAL 2015 20 12 201
## 192 E.S. CENTRAL 2015 20 2 23
## 193 MID. ATLANTIC 2015 20 206 2474
## 194 MOUNTAIN 2015 20 1 13
## 195 NEW ENGLAND 2015 20 123 1616
## 196 PACIFIC 2015 20 2 25
## 197 S. ATLANTIC 2015 20 58 442
## 198 UNITED STATES 2015 20 456 4993
## 199 W.N. CENTRAL 2015 20 11 196
## 200 W.S. CENTRAL 2015 20 0 3
## 201 E.N. CENTRAL 2015 21 12 248
## 202 E.S. CENTRAL 2015 21 1 25
## 203 MID. ATLANTIC 2015 21 196 1914
## 204 MOUNTAIN 2015 21 1 16
## 205 NEW ENGLAND 2015 21 114 1808
## 206 PACIFIC 2015 21 2 32
## 207 S. ATLANTIC 2015 21 58 478
## 208 UNITED STATES 2015 21 428 4746
## 209 W.N. CENTRAL 2015 21 10 221
## 210 W.S. CENTRAL 2015 21 0 4
## 211 E.N. CENTRAL 2015 22 14 296
## 212 E.S. CENTRAL 2015 22 1 30
## 213 MID. ATLANTIC 2015 22 196 2095
## 214 MOUNTAIN 2015 22 1 16
## 215 NEW ENGLAND 2015 22 100 2037
## 216 PACIFIC 2015 22 2 36
## 217 S. ATLANTIC 2015 22 59 522
## 218 UNITED STATES 2015 22 421 5297
## 219 W.N. CENTRAL 2015 22 9 260
## 220 W.S. CENTRAL 2015 22 0 5
## 221 E.N. CENTRAL 2015 23 14 380
## 222 E.S. CENTRAL 2015 23 1 35
## 223 MID. ATLANTIC 2015 23 185 2446
## 224 MOUNTAIN 2015 23 1 18
## 225 NEW ENGLAND 2015 23 82 2416
## 226 PACIFIC 2015 23 2 38
## 227 S. ATLANTIC 2015 23 61 585
## 228 UNITED STATES 2015 23 406 6254
## 229 W.N. CENTRAL 2015 23 7 330
## 230 W.S. CENTRAL 2015 23 0 6
## 231 E.N. CENTRAL 2015 24 13 476
## 232 E.S. CENTRAL 2015 24 1 42
## 233 MID. ATLANTIC 2015 24 190 2854
## 234 MOUNTAIN 2015 24 1 18
## 235 NEW ENGLAND 2015 24 79 2848
## 236 PACIFIC 2015 24 2 42
## 237 S. ATLANTIC 2015 24 61 682
## 238 UNITED STATES 2015 24 383 7367
## 239 W.N. CENTRAL 2015 24 6 399
## 240 W.S. CENTRAL 2015 24 0 6
## 241 E.N. CENTRAL 2015 25 16 586
## 242 E.S. CENTRAL 2015 25 1 50
## 243 MID. ATLANTIC 2015 25 196 3375
## 244 MOUNTAIN 2015 25 1 20
## 245 NEW ENGLAND 2015 25 85 3460
## 246 PACIFIC 2015 25 2 47
## 247 S. ATLANTIC 2015 25 61 807
## 248 UNITED STATES 2015 25 411 8843
## 249 W.N. CENTRAL 2015 25 6 491
## 250 W.S. CENTRAL 2015 25 0 7
## 251 E.N. CENTRAL 2015 26 16 709
## 252 E.S. CENTRAL 2015 26 1 59
## 253 MID. ATLANTIC 2015 26 196 4113
## 254 MOUNTAIN 2015 26 1 22
## 255 NEW ENGLAND 2015 26 93 4031
## 256 PACIFIC 2015 26 2 54
## 257 S. ATLANTIC 2015 26 61 940
## 258 UNITED STATES 2015 26 421 10507
## 259 W.N. CENTRAL 2015 26 6 572
## 260 W.S. CENTRAL 2015 26 0 7
## 261 E.N. CENTRAL 2015 27 16 859
## 262 E.S. CENTRAL 2015 27 1 61
## 263 MID. ATLANTIC 2015 27 196 4861
## 264 MOUNTAIN 2015 27 1 22
## 265 NEW ENGLAND 2015 27 103 4765
## 266 PACIFIC 2015 27 2 61
## 267 S. ATLANTIC 2015 27 61 1066
## 268 UNITED STATES 2015 27 421 12399
## 269 W.N. CENTRAL 2015 27 6 695
## 270 W.S. CENTRAL 2015 27 0 9
## 271 E.N. CENTRAL 2015 28 16 1005
## 272 E.S. CENTRAL 2015 28 1 65
## 273 MID. ATLANTIC 2015 28 196 5724
## 274 MOUNTAIN 2015 28 1 24
## 275 NEW ENGLAND 2015 28 104 5414
## 276 PACIFIC 2015 28 2 68
## 277 S. ATLANTIC 2015 28 61 1215
## 278 UNITED STATES 2015 28 421 14332
## 279 W.N. CENTRAL 2015 28 6 808
## 280 W.S. CENTRAL 2015 28 0 9
## 281 E.N. CENTRAL 2015 29 16 1121
## 282 E.S. CENTRAL 2015 29 1 70
## 283 MID. ATLANTIC 2015 29 196 6603
## 284 MOUNTAIN 2015 29 1 27
## 285 NEW ENGLAND 2015 29 112 6180
## 286 PACIFIC 2015 29 2 76
## 287 S. ATLANTIC 2015 29 61 1328
## 288 UNITED STATES 2015 29 428 16351
## 289 W.N. CENTRAL 2015 29 9 937
## 290 W.S. CENTRAL 2015 29 0 9
## 291 E.N. CENTRAL 2015 30 14 1252
## 292 E.S. CENTRAL 2015 30 1 74
## 293 MID. ATLANTIC 2015 30 196 7417
## 294 MOUNTAIN 2015 30 1 29
## 295 NEW ENGLAND 2015 30 115 6819
## 296 PACIFIC 2015 30 2 83
## 297 S. ATLANTIC 2015 30 61 1492
## 298 UNITED STATES 2015 30 428 18207
## 299 W.N. CENTRAL 2015 30 9 1026
## 300 W.S. CENTRAL 2015 30 0 15
## 301 E.N. CENTRAL 2015 31 13 1360
## 302 E.S. CENTRAL 2015 31 1 80
## 303 MID. ATLANTIC 2015 31 196 8286
## 304 MOUNTAIN 2015 31 1 31
## 305 NEW ENGLAND 2015 31 118 7355
## 306 PACIFIC 2015 31 2 86
## 307 S. ATLANTIC 2015 31 61 1652
## 308 UNITED STATES 2015 31 428 19985
## 309 W.N. CENTRAL 2015 31 9 1119
## 310 W.S. CENTRAL 2015 31 0 16
## 311 E.N. CENTRAL 2015 32 13 1438
## 312 E.S. CENTRAL 2015 32 1 83
## 313 MID. ATLANTIC 2015 32 196 8931
## 314 MOUNTAIN 2015 32 1 35
## 315 NEW ENGLAND 2015 32 118 7797
## 316 PACIFIC 2015 32 2 90
## 317 S. ATLANTIC 2015 32 61 1780
## 318 UNITED STATES 2015 32 428 21352
## 319 W.N. CENTRAL 2015 32 10 1179
## 320 W.S. CENTRAL 2015 32 0 19
## 321 E.N. CENTRAL 2015 33 12 1517
## 322 E.S. CENTRAL 2015 33 1 92
## 323 MID. ATLANTIC 2015 33 196 9534
## 324 MOUNTAIN 2015 33 1 35
## 325 NEW ENGLAND 2015 33 115 8211
## 326 PACIFIC 2015 33 2 95
## 327 S. ATLANTIC 2015 33 61 1975
## 328 UNITED STATES 2015 33 428 22714
## 329 W.N. CENTRAL 2015 33 9 1235
## 330 W.S. CENTRAL 2015 33 0 20
## 331 E.N. CENTRAL 2015 34 12 1553
## 332 E.S. CENTRAL 2015 34 1 102
## 333 MID. ATLANTIC 2015 34 196 10024
## 334 MOUNTAIN 2015 34 1 43
## 335 NEW ENGLAND 2015 34 111 8568
## 336 PACIFIC 2015 34 2 100
## 337 S. ATLANTIC 2015 34 61 2074
## 338 UNITED STATES 2015 34 428 23763
## 339 W.N. CENTRAL 2015 34 7 1278
## 340 W.S. CENTRAL 2015 34 0 21
## 341 E.N. CENTRAL 2015 35 11 1597
## 342 E.S. CENTRAL 2015 35 1 107
## 343 MID. ATLANTIC 2015 35 196 10464
## 344 MOUNTAIN 2015 35 1 45
## 345 NEW ENGLAND 2015 35 110 8832
## 346 PACIFIC 2015 35 2 108
## 347 S. ATLANTIC 2015 35 61 2229
## 348 UNITED STATES 2015 35 428 24709
## 349 W.N. CENTRAL 2015 35 7 1305
## 350 W.S. CENTRAL 2015 35 0 22
## 351 E.N. CENTRAL 2015 36 11 1638
## 352 E.S. CENTRAL 2015 36 1 108
## 353 MID. ATLANTIC 2015 36 196 10794
## 354 MOUNTAIN 2015 36 1 47
## 355 NEW ENGLAND 2015 36 111 9074
## 356 PACIFIC 2015 36 2 110
## 357 S. ATLANTIC 2015 36 61 2353
## 358 UNITED STATES 2015 36 428 25491
## 359 W.N. CENTRAL 2015 36 6 1345
## 360 W.S. CENTRAL 2015 36 0 22
## 361 E.N. CENTRAL 2015 37 12 1675
## 362 E.S. CENTRAL 2015 37 1 110
## 363 MID. ATLANTIC 2015 37 197 11152
## 364 MOUNTAIN 2015 37 1 48
## 365 NEW ENGLAND 2015 37 106 9311
## 366 PACIFIC 2015 37 2 110
## 367 S. ATLANTIC 2015 37 61 2451
## 368 UNITED STATES 2015 37 421 26266
## 369 W.N. CENTRAL 2015 37 6 1385
## 370 W.S. CENTRAL 2015 37 0 24
## 371 E.N. CENTRAL 2015 38 12 1708
## 372 E.S. CENTRAL 2015 38 1 111
## 373 MID. ATLANTIC 2015 38 198 11472
## 374 MOUNTAIN 2015 38 1 50
## 375 NEW ENGLAND 2015 38 102 9534
## 376 PACIFIC 2015 38 2 113
## 377 S. ATLANTIC 2015 38 61 2534
## 378 UNITED STATES 2015 38 407 26974
## 379 W.N. CENTRAL 2015 38 5 1427
## 380 W.S. CENTRAL 2015 38 0 25
## 381 E.N. CENTRAL 2015 39 12 1745
## 382 E.S. CENTRAL 2015 39 1 113
## 383 MID. ATLANTIC 2015 39 196 11745
## 384 MOUNTAIN 2015 39 1 50
## 385 NEW ENGLAND 2015 39 100 9689
## 386 PACIFIC 2015 39 2 116
## 387 S. ATLANTIC 2015 39 61 2633
## 388 UNITED STATES 2015 39 407 27577
## 389 W.N. CENTRAL 2015 39 4 1460
## 390 W.S. CENTRAL 2015 39 0 26
## 391 E.N. CENTRAL 2015 40 12 1787
## 392 E.S. CENTRAL 2015 40 1 114
## 393 MID. ATLANTIC 2015 40 198 12020
## 394 MOUNTAIN 2015 40 1 51
## 395 NEW ENGLAND 2015 40 99 9855
## 396 PACIFIC 2015 40 2 120
## 397 S. ATLANTIC 2015 40 61 2729
## 398 UNITED STATES 2015 40 407 28198
## 399 W.N. CENTRAL 2015 40 5 1494
## 400 W.S. CENTRAL 2015 40 0 28
## 401 E.N. CENTRAL 2015 41 14 1822
## 402 E.S. CENTRAL 2015 41 1 114
## 403 MID. ATLANTIC 2015 41 198 12262
## 404 MOUNTAIN 2015 41 1 51
## 405 NEW ENGLAND 2015 41 95 10003
## 406 PACIFIC 2015 41 2 121
## 407 S. ATLANTIC 2015 41 58 2791
## 408 UNITED STATES 2015 41 413 28722
## 409 W.N. CENTRAL 2015 41 5 1530
## 410 W.S. CENTRAL 2015 41 1 28
## 411 E.N. CENTRAL 2015 42 14 1843
## 412 E.S. CENTRAL 2015 42 1 117
## 413 MID. ATLANTIC 2015 42 192 12465
## 414 MOUNTAIN 2015 42 1 52
## 415 NEW ENGLAND 2015 42 94 10209
## 416 PACIFIC 2015 42 2 123
## 417 S. ATLANTIC 2015 42 58 2842
## 418 UNITED STATES 2015 42 407 29233
## 419 W.N. CENTRAL 2015 42 5 1553
## 420 W.S. CENTRAL 2015 42 0 29
## 421 E.N. CENTRAL 2015 43 13 1857
## 422 E.S. CENTRAL 2015 43 1 119
## 423 MID. ATLANTIC 2015 43 200 12643
## 424 MOUNTAIN 2015 43 1 53
## 425 NEW ENGLAND 2015 43 85 10350
## 426 PACIFIC 2015 43 2 127
## 427 S. ATLANTIC 2015 43 58 2890
## 428 UNITED STATES 2015 43 400 29648
## 429 W.N. CENTRAL 2015 43 5 1580
## 430 W.S. CENTRAL 2015 43 0 29
## 431 E.N. CENTRAL 2015 44 12 1876
## 432 E.S. CENTRAL 2015 44 2 120
## 433 MID. ATLANTIC 2015 44 200 12952
## 434 MOUNTAIN 2015 44 1 53
## 435 NEW ENGLAND 2015 44 91 10507
## 436 PACIFIC 2015 44 3 128
## 437 S. ATLANTIC 2015 44 62 2939
## 438 UNITED STATES 2015 44 403 30195
## 439 W.N. CENTRAL 2015 44 4 1590
## 440 W.S. CENTRAL 2015 44 0 30
## 441 E.N. CENTRAL 2015 45 13 1893
## 442 E.S. CENTRAL 2015 45 2 120
## 443 MID. ATLANTIC 2015 45 209 13156
## 444 MOUNTAIN 2015 45 1 55
## 445 NEW ENGLAND 2015 45 84 10632
## 446 PACIFIC 2015 45 3 128
## 447 S. ATLANTIC 2015 45 64 2998
## 448 UNITED STATES 2015 45 385 30622
## 449 W.N. CENTRAL 2015 45 4 1609
## 450 W.S. CENTRAL 2015 45 0 31
## 451 E.N. CENTRAL 2015 46 14 1905
## 452 E.S. CENTRAL 2015 46 2 122
## 453 MID. ATLANTIC 2015 46 218 13375
## 454 MOUNTAIN 2015 46 1 55
## 455 NEW ENGLAND 2015 46 77 10753
## 456 PACIFIC 2015 46 3 129
## 457 S. ATLANTIC 2015 46 66 3061
## 458 UNITED STATES 2015 46 377 31051
## 459 W.N. CENTRAL 2015 46 4 1620
## 460 W.S. CENTRAL 2015 46 0 31
## 461 E.N. CENTRAL 2015 47 14 1914
## 462 E.S. CENTRAL 2015 47 2 122
## 463 MID. ATLANTIC 2015 47 219 13564
## 464 MOUNTAIN 2015 47 1 56
## 465 NEW ENGLAND 2015 47 79 10877
## 466 PACIFIC 2015 47 3 131
## 467 S. ATLANTIC 2015 47 62 3187
## 468 UNITED STATES 2015 47 384 31512
## 469 W.N. CENTRAL 2015 47 4 1630
## 470 W.S. CENTRAL 2015 47 0 31
## 471 E.N. CENTRAL 2015 48 18 1916
## 472 E.S. CENTRAL 2015 48 2 123
## 473 MID. ATLANTIC 2015 48 227 13683
## 474 MOUNTAIN 2015 48 1 57
## 475 NEW ENGLAND 2015 48 81 10947
## 476 PACIFIC 2015 48 3 132
## 477 S. ATLANTIC 2015 48 64 3244
## 478 UNITED STATES 2015 48 410 31772
## 479 W.N. CENTRAL 2015 48 4 1639
## 480 W.S. CENTRAL 2015 48 1 31
## 481 E.N. CENTRAL 2015 49 18 1927
## 482 E.S. CENTRAL 2015 49 2 123
## 483 MID. ATLANTIC 2015 49 237 13853
## 484 MOUNTAIN 2015 49 1 58
## 485 NEW ENGLAND 2015 49 81 11041
## 486 PACIFIC 2015 49 3 133
## 487 S. ATLANTIC 2015 49 65 3292
## 488 UNITED STATES 2015 49 424 32105
## 489 W.N. CENTRAL 2015 49 4 1647
## 490 W.S. CENTRAL 2015 49 1 31
## 491 E.N. CENTRAL 2015 50 18 1932
## 492 E.S. CENTRAL 2015 50 2 124
## 493 MID. ATLANTIC 2015 50 241 14034
## 494 MOUNTAIN 2015 50 1 58
## 495 NEW ENGLAND 2015 50 82 11095
## 496 PACIFIC 2015 50 3 135
## 497 S. ATLANTIC 2015 50 62 3410
## 498 UNITED STATES 2015 50 427 32474
## 499 W.N. CENTRAL 2015 50 4 1653
## 500 W.S. CENTRAL 2015 50 1 33
## 501 E.N. CENTRAL 2015 51 20 1939
## 502 E.S. CENTRAL 2015 51 2 125
## 503 MID. ATLANTIC 2015 51 242 14168
## 504 MOUNTAIN 2015 51 1 58
## 505 NEW ENGLAND 2015 51 83 11158
## 506 PACIFIC 2015 51 2 137
## 507 S. ATLANTIC 2015 51 64 3501
## 508 UNITED STATES 2015 51 429 32778
## 509 W.N. CENTRAL 2015 51 3 1658
## 510 W.S. CENTRAL 2015 51 1 34
## 511 E.N. CENTRAL 2015 52 20 1944
## 512 E.S. CENTRAL 2015 52 2 127
## 513 MID. ATLANTIC 2015 52 242 14251
## 514 MOUNTAIN 2015 52 1 58
## 515 NEW ENGLAND 2015 52 83 11184
## 516 PACIFIC 2015 52 2 137
## 517 S. ATLANTIC 2015 52 65 3665
## 518 UNITED STATES 2015 52 431 33063
## 519 W.N. CENTRAL 2015 52 4 1662
## 520 W.S. CENTRAL 2015 52 1 35
fit2 <- lm(counts ~ week + areas, data=lymefits2a)
summary(fit2)
##
## Call:
## lm(formula = counts ~ week + areas, data = lymefits2a)
##
## Residuals:
## Min 1Q Median 3Q Max
## -38.622 -1.323 -0.100 0.987 42.423
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 13.87534 1.46461 9.474 < 2e-16 ***
## week 0.01414 0.02694 0.525 0.6
## areasE.S. CENTRAL -12.75000 1.80846 -7.050 5.85e-12 ***
## areasMID. ATLANTIC 184.98077 1.80846 102.286 < 2e-16 ***
## areasMOUNTAIN -13.25000 1.80846 -7.327 9.31e-13 ***
## areasNEW ENGLAND 94.03846 1.80846 51.999 < 2e-16 ***
## areasPACIFIC -12.11538 1.80846 -6.699 5.56e-11 ***
## areasS. ATLANTIC 44.05769 1.80846 24.362 < 2e-16 ***
## areasUNITED STATES 401.09615 1.80846 221.789 < 2e-16 ***
## areasW.N. CENTRAL -9.28846 1.80846 -5.136 4.00e-07 ***
## areasW.S. CENTRAL -14.11538 1.80846 -7.805 3.40e-14 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.221 on 509 degrees of freedom
## Multiple R-squared: 0.9949, Adjusted R-squared: 0.9948
## F-statistic: 9952 on 10 and 509 DF, p-value: < 2.2e-16
par(mfrow = c(2,2))
plot(fit2)
Answer: The adjusted R-squared is 0.9948, so about 99.48% of the variation in the observations is explained by this model. Lyme disease death counts are therefore strongly associated with the area; the week term on its own is not significant (p-value = 0.6). The Normal Q-Q plot also suggests that the residuals of the model for the US and its regions are not normally distributed.
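The Q-Q plot reading can also be checked numerically. The following is a minimal sketch, not the analysis used in this report: it rebuilds a small data frame from nine of the rows printed above (weeks 50 to 52 for three regions), refits the same formula, and applies a Shapiro-Wilk test to the residuals. The names `lyme_small` and `fit_small` are hypothetical; on the full data one would use `fit2` instead.

```r
# Hypothetical subset: weeks 50-52 for three regions, copied from the table above.
lyme_small <- data.frame(
  areas  = rep(c("E.N. CENTRAL", "MID. ATLANTIC", "NEW ENGLAND"), times = 3),
  week   = rep(50:52, each = 3),
  counts = c(18, 241, 82,   # week 50
             20, 242, 83,   # week 51
             20, 242, 83)   # week 52
)

# Same formula as fit2, fitted on the small subset only.
fit_small <- lm(counts ~ week + areas, data = lyme_small)

# Adjusted R-squared, as reported by summary().
adj_r2 <- summary(fit_small)$adj.r.squared

# Shapiro-Wilk test on the residuals: a small p-value suggests
# that the residuals are not normally distributed.
sw <- shapiro.test(residuals(fit_small))

print(adj_r2)
print(sw$p.value)
```

With only nine observations the test has little power, so this subset only shows the mechanics. The same two calls applied to `fit2` would give the numbers relevant to the report.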
There is a difference in the proportion of Lyme disease death cases between US regions and within US regions. The regions that show no significant difference in the proportion of Lyme disease death cases are W. N. Central (p-value = 0.9595), E. S. Central (p-value = 0.3916), W. N. Central (p-value = 1), Mountain (p-value = 0.9391), and Pacific (p-value = 0.9306). The regions that show a significant difference are New England (p-value = 8.992e-11), Mid. Atlantic (p-value = 0.0001513), E. N. Central (p-value = 0.0001186), and S. Atlantic (p-value = 9.17e-10).
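To illustrate what a difference in proportions between regions looks like, here is a hedged sketch of a chi-squared goodness-of-fit test on the week-52 `counts` values printed above. It is not the test that produced the p-values reported here; it only asks whether the nine regional counts are compatible with equal shares. The names `week52`, `shares`, and `gof` are hypothetical.

```r
# Week-52 counts for the nine regions, copied from the table above
# (the UNITED STATES row is left out because it is an aggregate).
week52 <- c(
  "E.N. CENTRAL" = 20, "E.S. CENTRAL" = 2, "MID. ATLANTIC" = 242,
  "MOUNTAIN" = 1, "NEW ENGLAND" = 83, "PACIFIC" = 2,
  "S. ATLANTIC" = 65, "W.N. CENTRAL" = 4, "W.S. CENTRAL" = 1
)

# Share of each region in the nine-region total.
shares <- prop.table(week52)

# Goodness-of-fit test against equal shares (the default for a vector).
gof <- chisq.test(week52)

print(round(shares, 3))
print(gof$p.value)
```

A very small p-value here would only say that the counts are not spread evenly across regions; it does not identify which regions differ.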
The odds of dying of Lyme disease are much greater in the Mid. Atlantic than in the other US regions (odds = 47:53). The odds of dying of Lyme disease are much smaller in the Pacific than in the other US regions (odds = 1:99).
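The two odds statements can be put on a common scale with a little arithmetic. The sketch below is illustrative only: the helper names `odds_from_ratio` and `prob_from_odds` are hypothetical, and the inputs are the 47:53 and 1:99 ratios quoted in the text.

```r
# Convert an "a:b" odds statement into decimal odds.
odds_from_ratio <- function(a, b) a / b

# Convert decimal odds back into the implied proportion.
prob_from_odds <- function(odds) odds / (1 + odds)

mid_atlantic_odds <- odds_from_ratio(47, 53)
pacific_odds      <- odds_from_ratio(1, 99)

# Proportions implied by the two odds statements.
mid_atlantic_prob <- prob_from_odds(mid_atlantic_odds)
pacific_prob      <- prob_from_odds(pacific_odds)

# Odds ratio: how many times larger the Mid. Atlantic odds are.
odds_ratio <- mid_atlantic_odds / pacific_odds

print(c(mid_atlantic_prob, pacific_prob, odds_ratio))
```

Odds of 47:53 correspond to a proportion of 0.47 and odds of 1:99 to a proportion of 0.01, so the comparison is between roughly half and one hundredth of the total.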
For the US, its regions, and its states, more than 97% of the variation in the observations is explained by the linear regression model (US states: adjusted R-squared = 0.9782; US and US regions: adjusted R-squared = 0.9948). There is a correlation between Lyme disease deaths and the area + the week.
I think we need more data (American Samoa, C.N.M.I., Guam, Puerto Rico, Virgin Islands) to get a better idea of the number of deaths due to Lyme disease in the U.S.
The model fits well: all the US regions and states are significant, from the one with the fewest reported Lyme disease deaths to the one with the most. The adjusted R-squared values are 99.48% (US and US regions) and 97.82% (US states), respectively. However, the diagnostic plots indicate that this may not be the most appropriate model, because the relationship may not be linear. Setting the diagnostic plots aside, both models fit the data well.
https://www.kaggle.com/cdc/nndss-lyme-disease-to-meningococcal?select=nndss-table-ii.-lyme-disease-to-meningococcal.csv
https://www.cdc.gov/mmwr/about.html#:~:text=The%20Morbidity%20and%20Mortality%20Weekly%20Report%20(MMWR%20)%20series%20is%20prepared,Control%20and%20Prevention%20(CDC)