Import data

# Load readxl for read_excel(), then read the Excel file
library(readxl)
diagnosed <- read_excel("../00_data/my_data_4.xlsx")
diagnosed
## # A tibble: 15 × 5
##    service component severity         diagnosed  year
##    <chr>   <chr>     <chr>                <dbl> <dbl>
##  1 Army    Active    Penetrating            246  2011
##  2 Army    Active    Severe                 155  2011
##  3 Army    Active    Moderate              1046  2011
##  4 Army    Active    Mild                 13074  2011
##  5 Army    Active    Not Classifiable      1238  2011
##  6 Army    Guard     Penetrating             41  2011
##  7 Army    Guard     Severe                  32  2011
##  8 Army    Guard     Moderate               221  2011
##  9 Army    Guard     Mild                  2852  2011
## 10 Army    Guard     Not Classifiable       549  2011
## 11 Army    Reserve   Penetrating             19  2011
## 12 Army    Reserve   Severe                  24  2011
## 13 Army    Reserve   Moderate               102  2011
## 14 Army    Reserve   Mild                  1353  2011
## 15 Army    Reserve   Not Classifiable       201  2011

State one question

How do the numbers of diagnoses compare across Active, Guard, and Reserve components in the Army in 2011?

Plot data

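library(ggplot2)

# geom_col() stacks the five severity rows per component, so each bar shows the component total
ggplot(diagnosed, aes(x = component, y = diagnosed)) +
  geom_col() +
  labs(title = "Number of Diagnoses by Army Component (2011)", x = "Component", y = "Diagnoses")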

Interpret

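The chart shows that in 2011 the Active component had far more diagnoses than the other two: 15,759 across all severity levels, compared with 3,695 for the Guard and 1,699 for the Reserve. Active diagnoses were roughly four times the Guard total and nine times the Reserve total. Because these are raw counts and the data do not include the size of each component, the chart compares volume, not rates.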
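
The totals behind this comparison can be checked by summing `diagnosed` within each component. The following is a minimal sketch, not part of the original analysis. It assumes the 15 rows printed above are the full dataset and re-types them into a data frame (the name `diagnosed_check` is hypothetical) so that it runs without the Excel file. It uses base R's `aggregate()`, so no extra packages are needed.

```r
# Data re-typed from the printed tibble above (assumption: those 15 rows are the whole dataset)
diagnosed_check <- data.frame(
  component = rep(c("Active", "Guard", "Reserve"), each = 5),
  severity  = rep(c("Penetrating", "Severe", "Moderate", "Mild", "Not Classifiable"), times = 3),
  diagnosed = c(246, 155, 1046, 13074, 1238,   # Active
                41, 32, 221, 2852, 549,        # Guard
                19, 24, 102, 1353, 201)        # Reserve
)

# Sum diagnoses across severity levels within each component
totals <- aggregate(diagnosed ~ component, data = diagnosed_check, FUN = sum)
totals
```

With the real data, the same call should work on `diagnosed` in place of `diagnosed_check`, since the column names match.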