Burtin’s Bacteria

After World War II, antibiotics earned the moniker “wonder drugs” for quickly treating previously incurable diseases. Data was gathered to determine which drug worked best against each bacterial infection. Comparing drug performance was an enormous aid for practitioners and scientists alike.

In the fall of 1951, German graphic designer William Burtin gathered data on the minimum inhibitory concentration (MIC) of three antibiotics for 16 different bacteria. The MIC is the lowest concentration of an antibiotic that prevents visible growth of a bacterium in vitro. The smaller the MIC, the more effective the antibiotic is against that bacterium.

Each bacterium was classified by Gram staining as positive or negative. Gram staining is a staining method used to classify bacterial species into two large groups: gram-positive bacteria and gram-negative bacteria.

In this project, you will create visualizations to answer three questions. One goal of the design task is to answer all three questions with at most three visualizations, preferably fewer. All visualizations will be organized on a single static dashboard using the flexdashboard package; see https://rmarkdown.rstudio.com/flexdashboard/index.html for more information. I have provided an example in this project folder (warning: it contains bad visualizations for demonstration purposes).

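Here are the questions to be answered:

Which bacteria was the hardest to kill? Easiest to kill?
Was one antibiotic the most effective overall?
Did antibiotic effectiveness vary by gram staining?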

library(tidyverse)  # provides read_csv(), the pipe, dplyr, tidyr, and ggplot2
antibio <- read_csv("Burtin_Antibiotics.csv")
Rows: 16 Columns: 6
-- Column specification --------------------------------------------------------
Delimiter: ","
chr (3): Bacteria, BacteriaAbbr, Gram_Staining
dbl (3): Penicilin, Streptom, Neomycin

i Use `spec()` to retrieve the full column specification for this data.
i Specify the column types or set `show_col_types = FALSE` to quiet this message.
antibio %>% head()
# A tibble: 6 x 6
  Bacteria               BacteriaAbbr Penicilin Streptom Neomycin Gram_Staining
  <chr>                  <chr>            <dbl>    <dbl>    <dbl> <chr>        
1 Aerobacter aerogenes   Aerobact       870         1       1.6   negative     
2 Brucella abortus       Brucella ab      1         2       0.02  negative     
3 Brucella antracis      Brucella an      0.001     0.01    0.007 positive     
4 Diplococcus pneumoniae Diplococc        0.005    11      10     positive     
5 Escherichia coli       Escherichi     100         0.4     0.1   negative     
6 Klebsiella pneumoniae  Klebsiella     850         1.2     1     negative     

Here is another reformulation of the same data that may or may not be useful, depending on the visualization you design.

antibio2 <- antibio %>% 
  pivot_longer(cols = 3:5, names_to = "Antibiotic", values_to = "MIC")
antibio2 %>%  head()
# A tibble: 6 x 5
  Bacteria             BacteriaAbbr Gram_Staining Antibiotic    MIC
  <chr>                <chr>        <chr>         <chr>       <dbl>
1 Aerobacter aerogenes Aerobact     negative      Penicilin  870   
2 Aerobacter aerogenes Aerobact     negative      Streptom     1   
3 Aerobacter aerogenes Aerobact     negative      Neomycin     1.6 
4 Brucella abortus     Brucella ab  negative      Penicilin    1   
5 Brucella abortus     Brucella ab  negative      Streptom     2   
6 Brucella abortus     Brucella ab  negative      Neomycin     0.02
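One thing the long format makes easy is picking, for each bacterium, the antibiotic with the smallest MIC. The following is a minimal sketch, not part of the assignment: it rebuilds only the six rows printed by head() as a plain data frame (the name antibio_head and the result name best_per_bacterium are illustrative assumptions), so it runs without the CSV file.

```r
library(dplyr)
library(tidyr)

# Only the six rows shown by head(); the full CSV has 16 rows.
antibio_head <- data.frame(
  Bacteria = c("Aerobacter aerogenes", "Brucella abortus", "Brucella antracis",
               "Diplococcus pneumoniae", "Escherichia coli", "Klebsiella pneumoniae"),
  Penicilin = c(870, 1, 0.001, 0.005, 100, 850),
  Streptom = c(1, 2, 0.01, 11, 0.4, 1.2),
  Neomycin = c(1.6, 0.02, 0.007, 10, 0.1, 1),
  Gram_Staining = c("negative", "negative", "positive",
                    "positive", "negative", "negative")
)

# Reshape to one row per bacterium/antibiotic pair, then keep the
# row with the smallest MIC within each bacterium.
best_per_bacterium <- antibio_head %>%
  pivot_longer(cols = c(Penicilin, Streptom, Neomycin),
               names_to = "Antibiotic", values_to = "MIC") %>%
  group_by(Bacteria) %>%
  slice_min(MIC, n = 1) %>%
  ungroup()

print(best_per_bacterium)
```

With the real data, the same pipeline could start from antibio2 directly, skipping the pivot_longer() step.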

Remember, you can also create new variables if that is helpful, for example:

antibio3 <- antibio %>% 
  mutate(MIC_total = Penicilin + Streptom + Neomycin) 
antibio3 %>% head()
# A tibble: 6 x 7
  Bacteria               BacteriaAbbr Penicilin Streptom Neomy~1 Gram_~2 MIC_t~3
  <chr>                  <chr>            <dbl>    <dbl>   <dbl> <chr>     <dbl>
1 Aerobacter aerogenes   Aerobact       870         1      1.6   negati~ 873.   
2 Brucella abortus       Brucella ab      1         2      0.02  negati~   3.02 
3 Brucella antracis      Brucella an      0.001     0.01   0.007 positi~   0.018
4 Diplococcus pneumoniae Diplococc        0.005    11     10     positi~  21.0  
5 Escherichia coli       Escherichi     100         0.4    0.1   negati~ 100.   
6 Klebsiella pneumoniae  Klebsiella     850         1.2    1     negati~ 852.   
# ... with abbreviated variable names 1: Neomycin, 2: Gram_Staining,
#   3: MIC_total
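Because MIC_total is a sum, it is dominated by whichever antibiotic has the largest MIC. As a hedged alternative, the sketch below adds a hypothetical MIC_min column holding the lowest MIC among the three antibiotics for each bacterium. The names antibio_head, antibio4, and MIC_min are illustrative assumptions, and only the six rows printed above are used.

```r
library(dplyr)

# Only the six rows shown by head(); the full CSV has 16 rows.
antibio_head <- data.frame(
  BacteriaAbbr = c("Aerobact", "Brucella ab", "Brucella an",
                   "Diplococc", "Escherichi", "Klebsiella"),
  Penicilin = c(870, 1, 0.001, 0.005, 100, 850),
  Streptom = c(1, 2, 0.01, 11, 0.4, 1.2),
  Neomycin = c(1.6, 0.02, 0.007, 10, 0.1, 1)
)

# pmin() takes the element-wise (row-wise) minimum across the three columns.
antibio4 <- antibio_head %>%
  mutate(
    MIC_total = Penicilin + Streptom + Neomycin,
    MIC_min = pmin(Penicilin, Streptom, Neomycin)
  )

print(antibio4)
```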

Visualizations

After you create your visualizations here, you can transfer the code for each visualization to the dashboard.

Remember, you can have AT MOST 3 visualizations to help you answer the three questions.

antibio <- read_csv("Burtin_Antibiotics.csv")
antibio2 <- antibio %>% 
  pivot_longer(cols = 3:5, names_to = "Antibiotic", values_to = "MIC")

# Split the long-format data (antibio2) into Gram-positive and Gram-negative subsets
posbac <- antibio2 %>% 
  filter(Gram_Staining == "positive")

negbac <- antibio2 %>% 
  filter(Gram_Staining == "negative")

# Total MIC across the three antibiotics for each bacterium
antibio3 <- antibio %>% 
  mutate(MIC_total = Penicilin + Streptom + Neomycin) 

# Natural-log transforms: log(MIC) < 0 means MIC < 1
tantibio2 <- antibio2 %>%
  mutate(MIC = log(MIC))

transformedMIC <- antibio3 %>% 
  mutate(MIC = log(MIC_total))

transformedposbac <- posbac %>% 
  mutate(MIC = log(MIC))

transformednegbac <- negbac %>% 
  mutate(MIC = log(MIC))
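The MIC values span several orders of magnitude, which is why a log scale helps here. The short sketch below uses a few values from the printed rows (the vector name mic is an illustrative assumption). It shows that the natural log used above and log10() agree on sign, so "below 0 means MIC < 1" holds in either base.

```r
# MIC values for Aerobacter aerogenes and Brucella antracis from the printed rows
mic <- c(870, 1, 1.6, 0.001, 0.01, 0.007)

# Raw values range from thousandths to hundreds
print(range(mic))

# Natural log (as used in the transforms above) and base-10 log
log_e <- log(mic)
log_10 <- log10(mic)

# Both are negative exactly when MIC < 1 and zero when MIC == 1
print(data.frame(mic, log_e, log_10))
```

Base 10 may be easier to read on an axis, since each unit corresponds to a tenfold change in MIC. Switching bases would not change the ordering of the bars.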

Column

Bacteria : Highest MIC -> Lowest MIC

ggplot(transformedMIC) +
  aes(
    y = reorder(BacteriaAbbr, MIC),
    fill = Gram_Staining,
    weight = MIC
  ) +
  geom_bar() +
  labs(title = "Bacteria ordered from highest to lowest total MIC", subtitle = "Bars extend right of 0 when total MIC > 1") +
  xlab("log(total MIC); values below 0 mean total MIC < 1") +
  ylab("Abbreviated Bacteria Name") +
  scale_fill_viridis_d(guide = guide_legend(title = "Gram Staining"))

Column

Positive Bacteria Stain

ggplot(transformedposbac) +
  aes(
    y = reorder(BacteriaAbbr, MIC),
    fill = Antibiotic,
    weight = MIC
  ) +
  geom_bar(position = "dodge") +
  labs(title = "Positive Bacteria Stains", subtitle = "Lower log(MIC) = more effective") +
  xlab("log(MIC); values below 0 mean MIC < 1") +
  ylab("Abbreviated Bacteria Name") +
  scale_fill_viridis_d(guide = guide_legend(title = "Antibiotic"))

Column

Negative Bacteria Stain

ggplot(transformednegbac) +
  aes(
    y = reorder(BacteriaAbbr, MIC),
    fill = Antibiotic,
    weight = MIC
  ) +
  geom_bar(position = "dodge") +
  labs(title = "Negative Bacteria Stains", subtitle = "Lower log(MIC) = more effective") +
  xlab("log(MIC); values below 0 mean MIC < 1") +
  ylab("Abbreviated Bacteria Name") +
  scale_fill_viridis_d(guide = guide_legend(title = "Antibiotic"))

Discussion

After you create your visualizations, provide answers to each of the three questions based on the visualizations.

Consider adding annotations to your visualizations and dashboards to help clarify the message of your visualizations.

Which bacteria was the hardest to kill? Easiest to kill?

The first graph shows that Aerobacter aerogenes (Aerobact) is the hardest to kill: it has the highest combined MIC across the three antibiotics. The easiest to kill is the bacterium at the bottom of the graph, Brucella antracis (Brucella an), which has the lowest combined MIC.

Was one antibiotic the most effective overall?

The most effective antibiotic overall was Neomycin: averaged over the bacteria, it has the lowest MIC values for both negative and positive stains. Penicillin works well only against gram-positive bacteria.

Did antibiotic effectiveness vary by gram staining?

Yes, antibiotic effectiveness varies by Gram staining, as two of the graphs show. For gram-negative bacteria, penicillin is the worst of the three antibiotics; otherwise, Neomycin is the best on average.
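One way to check claims about averages is to compute them directly. The sketch below does this for the six rows shown in the data preview only, so its output is illustrative and may differ from what the full 16-row data set gives. The names antibio_head and mean_mic are assumptions introduced here.

```r
library(dplyr)
library(tidyr)

# Only the six previewed rows; replace with the full data for a real check.
antibio_head <- data.frame(
  Bacteria = c("Aerobacter aerogenes", "Brucella abortus", "Brucella antracis",
               "Diplococcus pneumoniae", "Escherichia coli", "Klebsiella pneumoniae"),
  Penicilin = c(870, 1, 0.001, 0.005, 100, 850),
  Streptom = c(1, 2, 0.01, 11, 0.4, 1.2),
  Neomycin = c(1.6, 0.02, 0.007, 10, 0.1, 1),
  Gram_Staining = c("negative", "negative", "positive",
                    "positive", "negative", "negative")
)

# Mean MIC for every Gram staining / antibiotic combination,
# sorted so the lowest (most effective) mean comes first in each group.
mean_mic <- antibio_head %>%
  pivot_longer(cols = c(Penicilin, Streptom, Neomycin),
               names_to = "Antibiotic", values_to = "MIC") %>%
  group_by(Gram_Staining, Antibiotic) %>%
  summarise(mean_MIC = mean(MIC), .groups = "drop") %>%
  arrange(Gram_Staining, mean_MIC)

print(mean_mic)
```

With the real data, the same summary could be run on antibio2. Because a few very large MIC values can dominate an arithmetic mean, a median or a mean of log(MIC) may be worth comparing.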