More Work

Homer White

2021-01-21

Import the Data

df <- read_xlsx(
  "data/Blastexposure.xlsx"
)

Summarize the data set to get an idea of the variables and their values:

summary(df)
##     PartNum          Rank             HighRank             Ref           
##  Min.   :  1.0   Length:1008        Length:1008        Length:1008       
##  1st Qu.:215.8   Class :character   Class :character   Class :character  
##  Median :425.0   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :434.8                                                           
##  3rd Qu.:649.2                                                           
##  Max.   :892.0                                                           
##                                                                          
##    NumBlasts        DiagA          
##  Min.   :0.000   Length:1008       
##  1st Qu.:1.000   Class :character  
##  Median :2.000   Mode  :character  
##  Mean   :1.637                     
##  3rd Qu.:2.000                     
##  Max.   :2.000                     
##  NA's   :309

That didn’t tell me much. I should figure out what’s in DiagA:

df %>% 
  pull(DiagA) %>% 
  unique() %>% 
  sort()
## [1] "OTHER" "PTSD"

Seems clear enough!

Some Tables

OK, let’s see if there is a relationship between exposure to blasts and PTSD:

tab2 <- xtabs(~ NumBlasts + DiagA, data = df)
tab2
##          DiagA
## NumBlasts OTHER PTSD
##         0    41   15
##         1    47   69
##         2   108  331

Holy moly! Looks like the more blasts you get, the greater the likelihood of PTSD.

Let’s rename and recode a bit:

df2 <- 
  df %>% 
  rename(diagnosis = DiagA) %>% 
  mutate(blast_exposure = case_when(
    NumBlasts == 0 ~ "0 blasts",
    NumBlasts == 1 ~ "1 blast",
    NumBlasts == 2 ~ "2+ blasts"
    )) %>% 
  mutate(blast_exposure = factor(
    blast_exposure,
    levels = c("0 blasts", "1 blast", "2+ blasts")
  ))

Another table:

df2 %>%
  drop_na(blast_exposure, diagnosis) %>% 
  group_by(blast_exposure) %>% 
  summarize(
    PTSD = sum(diagnosis == "PTSD"),
    total = n(),
    percent = PTSD / total
  )
## # A tibble: 3 x 4
##   blast_exposure  PTSD total percent
## * <fct>          <int> <int>   <dbl>
## 1 0 blasts          15    56   0.268
## 2 1 blast           69   116   0.595
## 3 2+ blasts        331   439   0.754

Some Graphs

Go for a bar graphs:

df2 %>% 
  xtabs(~ blast_exposure + diagnosis, data = .) %>% 
  mosaicplot(
    main = "Mosic Plot of Diagnosi vs. Blast Exposure",
    xlab = "Number of blasts",
    ylab = "Diagnosis"
  )
nice caption

nice caption

Some Inferential Statistics

A \(\chi^2\)-test is appropriate here:

chisqtestGC(
  ~ blast_exposure + diagnosis,
  data = df2
)
## Pearson's Chi-squared test 
## 
## Observed Counts:
##               diagnosis
## blast_exposure OTHER PTSD
##      0 blasts     41   15
##      1 blast      47   69
##      2+ blasts   108  331
## 
## Counts Expected by Null:
##               diagnosis
## blast_exposure  OTHER   PTSD
##      0 blasts   17.96  38.04
##      1 blast    37.21  78.79
##      2+ blasts 140.82 298.18
## 
## Contributions to the chi-square statistic:
##               diagnosis
## blast_exposure OTHER  PTSD
##      0 blasts  29.54 13.95
##      1 blast    2.58  1.22
##      2+ blasts  7.65  3.61
## 
## 
## Chi-Square Statistic = 58.5475 
## Degrees of Freedom of the table = 2 
## P-Value = 0

We have overwhelmingly strong evidence that folks exposed to more blasts are more likely to be diagnosed with PTSD.

LS0tCnRpdGxlOiAiTW9yZSBXb3JrIgpkYXRlOiAiYHIgU3lzLkRhdGUoKWAiCmF1dGhvcjogIEhvbWVyIFdoaXRlCm91dHB1dDoKICBybWRmb3JtYXRzOjpkb3duY3V0ZToKICAgIHRvY19kZXB0aDogMgogICAgc2VsZl9jb250YWluZWQ6IHRydWUKICAgIGhpZ2hsaWdodDogIHRhbmdvCiAgICBsaWdodGJveDogdHJ1ZQogICAgY29kZV9kb3dubG9hZDogdHJ1ZQotLS0KCgpgYGB7ciBzZXR1cCwgaW5jbHVkZT1GQUxTRX0KbGlicmFyeSh0aWR5dmVyc2UpCmxpYnJhcnkocmVhZHhsKQpsaWJyYXJ5KHRpZ2Vyc3RhdHMpCiMjIEdsb2JhbCBvcHRpb25zCm9wdGlvbnMoZHBseXIuaW5mb3JtLnN1bW1hcml6ZSA9IEZBTFNFKQprbml0cjo6b3B0c19jaHVuayRzZXQoCiAgb3V0LndpZHRoID0gIjkwJSIsCiAgZmlnLmFsaWduID0gImNlbnRlciIsCiAgdGlkeSA9IEZBTFNFLAogIHdhcm5pbmcgPSBGQUxTRSwKICBtZXNzYWdlID0gRkFMU0UKKQpgYGAKCgoKIyMgSW1wb3J0IHRoZSBEYXRhCgpgYGB7cn0KZGYgPC0gcmVhZF94bHN4KAogICJkYXRhL0JsYXN0ZXhwb3N1cmUueGxzeCIKKQpgYGAKClN1bW1hcml6ZSB0aGUgZGF0YSBzZXQgdG8gZ2V0IGFuIGlkZWEgb2YgdGhlIHZhcmlhYmxlcyBhbmQgdGhlaXIgdmFsdWVzOgoKYGBge3J9CnN1bW1hcnkoZGYpCmBgYAoKVGhhdCBkaWRuJ3QgdGVsbCBtZSBtdWNoLiAgSSBzaG91bGQgZmlndXJlIG91dCB3aGF0J3MgaW4gYERpYWdBYDoKCmBgYHtyfQpkZiAlPiUgCiAgcHVsbChEaWFnQSkgJT4lIAogIHVuaXF1ZSgpICU+JSAKICBzb3J0KCkKYGBgCgpTZWVtcyBjbGVhciBlbm91Z2ghCgojIyBTb21lIFRhYmxlcwoKT0ssIGxldCdzIHNlZSBpZiB0aGVyZSBpcyBhIHJlbGF0aW9uc2hpcCBiZXR3ZWVuIGV4cG9zdXJlIHRvIGJsYXN0cyBhbmQgUFRTRDoKCmBgYHtyfQp0YWIyIDwtIHh0YWJzKH4gTnVtQmxhc3RzICsgRGlhZ0EsIGRhdGEgPSBkZikKdGFiMgpgYGAKCkhvbHkgbW9seSEgIExvb2tzIGxpa2UgdGhlIG1vcmUgYmxhc3RzIHlvdSBnZXQsIHRoZSBncmVhdGVyIHRoZSBsaWtlbGlob29kIG9mIFBUU0QuCgpMZXQncyByZW5hbWUgYW5kIHJlY29kZSBhIGJpdDoKCmBgYHtyfQpkZjIgPC0gCiAgZGYgJT4lIAogIHJlbmFtZShkaWFnbm9zaXMgPSBEaWFnQSkgJT4lIAogIG11dGF0ZShibGFzdF9leHBvc3VyZSA9IGNhc2Vfd2hlbigKICAgIE51bUJsYXN0cyA9PSAwIH4gIjAgYmxhc3RzIiwKICAgIE51bUJsYXN0cyA9PSAxIH4gIjEgYmxhc3QiLAogICAgTnVtQmxhc3RzID09IDIgfiAiMisgYmxhc3RzIgogICAgKSkgJT4lIAogIG11dGF0ZShibGFzdF9leHBvc3VyZSA9IGZhY3RvcigKICAgIGJsYXN0X2V4cG9zdXJlLAogICAgbGV2ZWxzID0gYygiMCBibGFzdHMiLCAiMSBibGFzdCIsICIyKyBibGFzdHMiKQogICkpCmBgYAoKCkFub3RoZXIgdGFibGU6CgpgYGB7cn0KZGYyICU+JQogIGRyb3BfbmEoYmxhc3RfZXhwb3N1cmUsIGRpYWdub3NpcykgJT4lIAogIGdyb3VwX2J5KGJsYXN0X2V4cG9zdXJlKSAlPiUgCiAgc3VtbWFyaXplKAogICAgUFRTRCA9IHN1bShkaWFnbm9zaXMgPT0gIlBUU0QiKSwKICAgIHRvdGFsID0gbigpLAogICAgcGVyY2VudCA9IFBUU0QgLyB0b3RhbAogICkKYGBgCgojIyBTb21lIEdyYXBocwoKR28gZm9yIGEgYmFyIGdyYXBoczoKCmBgYHtyIGZpZy5jYXAgPSAibmljZSBjYXB0aW9uIn0KZGYyICU+JSAKICB4dGFicyh+IGJsYXN0X2V4cG9zdXJlICsgZGlhZ25vc2lzLCBkYXRhID0gLikgJT4lIAogIG1vc2FpY3Bsb3QoCiAgICBtYWluID0gIk1vc2ljIFBsb3Qgb2YgRGlhZ25vc2kgdnMuIEJsYXN0IEV4cG9zdXJlIiwKICAgIHhsYWIgPSAiTnVtYmVyIG9mIGJsYXN0cyIsCiAgICB5bGFiID0gIkRpYWdub3NpcyIKICApCmBgYAoKIyMgU29tZSBJbmZlcmVudGlhbCBTdGF0aXN0aWNzCgpBICRcY2hpXjIkLXRlc3QgaXMgYXBwcm9wcmlhdGUgaGVyZToKCmBgYHtyfQpjaGlzcXRlc3RHQygKICB+IGJsYXN0X2V4cG9zdXJlICsgZGlhZ25vc2lzLAogIGRhdGEgPSBkZjIKKQpgYGAKCldlIGhhdmUgb3ZlcndoZWxtaW5nbHkgc3Ryb25nIGV2aWRlbmNlIHRoYXQgZm9sa3MgZXhwb3NlZCB0byBtb3JlIGJsYXN0cyBhcmUgbW9yZSBsaWtlbHkgdG8gYmUgZGlhZ25vc2VkIHdpdGggUFRTRC4KCgoK