Analysis Report: Trends in MDR-TB Cases on Treatment

This report summarizes the trends and patterns observed in Multidrug-Resistant Tuberculosis (MDR-TB) cases on treatment, using the visualizations generated from the provided dataset.

1. Overall Trend of MDR-TB Cases on Treatment

This line plot illustrates the aggregated number of MDR-TB cases on treatment across all countries over the years. It provides a general overview of the temporal progression, indicating whether the total caseload is increasing, decreasing, or remaining stable.

R Markdown

library(dplyr)
## Warning: package 'dplyr' was built under R version 4.5.2
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.5.2

Import Data

df <- read.csv("C:/Users/Aarohan/OneDrive/Documents/dataMDR TB treated 3.csv")
head(df)
##   IndicatorCode                         Indicator ValueType ParentLocationCode
## 1   TB_c_mdr_tx Cases started on MDR-TB treatment   numeric                EMR
## 2   TB_c_mdr_tx Cases started on MDR-TB treatment   numeric                EMR
## 3   TB_c_mdr_tx Cases started on MDR-TB treatment   numeric                EMR
## 4   TB_c_mdr_tx Cases started on MDR-TB treatment   numeric                EMR
## 5   TB_c_mdr_tx Cases started on MDR-TB treatment   numeric                EMR
## 6   TB_c_mdr_tx Cases started on MDR-TB treatment   numeric                EMR
##          ParentLocation Location.type SpatialDimValueCode     Country
## 1 Eastern Mediterranean       Country                 AFG Afghanistan
## 2 Eastern Mediterranean       Country                 AFG Afghanistan
## 3 Eastern Mediterranean       Country                 AFG Afghanistan
## 4 Eastern Mediterranean       Country                 AFG Afghanistan
## 5 Eastern Mediterranean       Country                 AFG Afghanistan
## 6 Eastern Mediterranean       Country                 AFG Afghanistan
##   Period.type Year cases Value
## 1        Year 2008     0     0
## 2        Year 2010     0     0
## 3        Year 2011    21    21
## 4        Year 2012    38    38
## 5        Year 2013    49    49
## 6        Year 2014    88    88
colnames(df)
##  [1] "IndicatorCode"       "Indicator"           "ValueType"          
##  [4] "ParentLocationCode"  "ParentLocation"      "Location.type"      
##  [7] "SpatialDimValueCode" "Country"             "Period.type"        
## [10] "Year"                "cases"               "Value"
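
Before plotting, it can help to check how each column was read and whether `cases` and `Value` carry the same numbers. The sketch below uses a small made-up data frame modelled on the first rows of `head(df)`. The column types in it are an assumption, since the real types depend on how `read.csv()` parsed the file.

```r
# Toy data modelled on the first rows of head(df).
# The column types here are assumed, not read from the real file.
toy <- data.frame(
  Year  = c(2008L, 2010L, 2011L),
  cases = c(0, 0, 21),
  Value = c("0", "0", "21"),
  stringsAsFactors = FALSE
)

# Class of every column; a "character" Value column needs conversion before summing.
col_classes <- sapply(toy, class)
col_classes

# Do the two columns agree wherever Value converts cleanly?
same <- all(toy$cases == suppressWarnings(as.numeric(toy$Value)), na.rm = TRUE)
same
```

If the same check on `df` shows that `cases` is already numeric and matches `Value`, using `cases` directly might avoid the conversion step altogether.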

Part 1

Overall Trend of MDR-TB Cases by Year

In my FIRST figure, I am going to create a line graph that shows the overall trend of MDR-TB cases on treatment across different years.

fig_dat1 <- df %>%
  group_by(Year) %>%
  summarise(
    Total_Cases = sum(as.numeric(Value), na.rm = TRUE)
  )
## Warning: There were 15 warnings in `summarise()`.
## The first warning was:
## ℹ In argument: `Total_Cases = sum(as.numeric(Value), na.rm = TRUE)`.
## ℹ In group 3: `Year = 2010`.
## Caused by warning:
## ! NAs introduced by coercion
## ℹ Run `dplyr::last_dplyr_warnings()` to see the 14 remaining warnings.
fig_dat1
## # A tibble: 17 × 2
##     Year Total_Cases
##    <int>       <dbl>
##  1  2008        4650
##  2  2009       30499
##  3  2010       31391
##  4  2011       38244
##  5  2012       46234
##  6  2013       44829
##  7  2014       52905
##  8  2015       59205
##  9  2016       62193
## 10  2017       64763
## 11  2018       81146
## 12  2019       77600
## 13  2020       63724
## 14  2021       65834
## 15  2022       78418
## 16  2023       78329
## 17  2024       75485
ggplot(fig_dat1, aes(x = Year, y = Total_Cases)) +
  geom_line(color = "steelblue", linewidth = 1.5) +
  geom_point(color = "red", size = 3) +
  labs(
    title = "Trend of MDR-TB Cases on Treatment by Year",
    x = "Year",
    y = "Total MDR-TB Cases"
  ) +
  theme_minimal()
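
The `NAs introduced by coercion` warning above means that some `Value` entries are not plain numbers, and `na.rm = TRUE` then leaves them out of the yearly totals. The sketch below shows one way to find such entries and to convert values that only contain thousands separators. The example strings are made up; the formats in the real file are not known and may differ.

```r
# Made-up strings for illustration; the real file may use other formats.
value <- c("21", "38", "1 234", "12,500", "No data", "")

# Non-empty entries that fail a plain numeric conversion.
parsed_raw <- suppressWarnings(as.numeric(value))
failed <- value[is.na(parsed_raw) & nzchar(value)]
failed

# Strip spaces and commas used as thousands separators, then convert again.
cleaned <- gsub("[ ,]", "", value)
parsed_clean <- suppressWarnings(as.numeric(cleaned))
parsed_clean
```

Running the first half on `df$Value` would list the entries behind the warning, which shows whether they are large formatted numbers or real gaps in the data.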

Part 2

MDR-TB Cases by Country

In my SECOND figure, I am going to create a bar plot showing the countries with the highest MDR-TB cases on treatment.

fig_dat2 <- df %>%
  group_by(Country) %>%
  summarise(
    Total_Cases = sum(as.numeric(Value), na.rm = TRUE)
  ) %>%
  arrange(desc(Total_Cases)) %>%
  slice(1:10)

fig_dat2
## # A tibble: 10 × 2
##    Country      Total_Cases
##    <chr>              <dbl>
##  1 Ukraine           100349
##  2 Kazakhstan         88809
##  3 South Africa       75876
##  4 Philippines        73420
##  5 Indonesia          56996
##  6 Pakistan           37602
##  7 China              35775
##  8 Viet Nam           33273
##  9 Uzbekistan         31988
## 10 Peru               30295


ggplot(fig_dat2, aes(x = reorder(Country, Total_Cases),
                     y = Total_Cases)) +
  geom_col(fill = "steelblue") +
  coord_flip() +
  labs(
    title = "Top 10 Countries by MDR-TB Cases",
    x = "Country",
    y = "Total MDR-TB Cases"
  ) +
  theme_minimal()
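
The top-ten selection above is done with `arrange(desc(...))` followed by `slice(1:10)`. As an alternative sketch, `slice_max()` from dplyr (version 1.0.0 or later) does the same in one step and makes the handling of ties explicit. The data frame below is made up for illustration and is not taken from the dataset.

```r
library(dplyr)

# Made-up rows for illustration only; not taken from the report's dataset.
toy <- data.frame(
  Country = c("A", "A", "B", "B", "C", "C"),
  cases   = c(10, 20, 5, 5, 40, 1),
  stringsAsFactors = FALSE
)

# Sum per country, then keep the two largest totals.
# with_ties = FALSE returns exactly n rows even if totals are equal.
top2 <- toy %>%
  group_by(Country) %>%
  summarise(Total_Cases = sum(cases, na.rm = TRUE)) %>%
  slice_max(Total_Cases, n = 2, with_ties = FALSE)

top2
```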

Part 3

Distribution of MDR-TB Cases

In my THIRD figure, I am going to create a histogram showing the distribution of MDR-TB cases on treatment.

ggplot(df, aes(x = as.numeric(Value))) +
  geom_histogram(
    bins = 20,
    fill = "darkgreen",
    color = "black"
  ) +
  labs(
    title = "Distribution of MDR-TB Cases",
    x = "Number of Cases",
    y = "Frequency"
  ) +
  theme_minimal()
## Warning in FUN(X[[i]], ...): NAs introduced by coercion
## Warning: Removed 39 rows containing non-finite outside the scale range
## (`stat_bin()`).
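
A histogram groups the values into bins, so a few summary numbers can make it easier to read. The sketch below uses made-up case counts. It assumes the real counts are right-skewed (many small values and a few very large ones), which is common for country-level counts but has not been checked here.

```r
# Made-up case counts for illustration; not values from the report's dataset.
cases <- c(0, 0, 3, 12, 21, 38, 49, 88, 250, 4100)

# Minimum, quartiles, median, mean and maximum.
summary(cases)

# Share of observations with fewer than 100 cases.
share_below_100 <- mean(cases < 100)
share_below_100

# log1p() computes log(1 + x), so zero counts stay defined on a log scale.
log_cases <- log1p(cases)
range(log_cases)
```

If most of the real values fall in the first one or two bins, plotting `log1p()` of the counts might show the shape of the distribution more clearly.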

Part 4

Comparison of Cases Across Years

In my FOURTH figure, I am going to create a boxplot comparing MDR-TB cases across years.

ggplot(df, aes(x = factor(Year),
               y = as.numeric(Value))) +
  geom_boxplot(fill = "orange") +
  labs(
    title = "Comparison of MDR-TB Cases Across Years",
    x = "Year",
    y = "Cases"
  ) +
  theme_minimal()
## Warning in FUN(X[[i]], ...): NAs introduced by coercion
## Warning: Removed 39 rows containing non-finite outside the scale range
## (`stat_boxplot()`).
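
Each box in the boxplot is drawn from the median and the quartiles of one year. A minimal sketch of how those numbers could be computed per year with base R is shown below. The data frame is made up for illustration.

```r
# Made-up rows for illustration; not taken from the report's dataset.
toy <- data.frame(
  Year  = c(2011, 2011, 2011, 2012, 2012, 2012),
  cases = c(21, 5, 100, 38, 7, 120)
)

# Median per year: the line inside each box.
medians <- tapply(toy$cases, toy$Year, median)
medians

# Interquartile range per year: the height of each box.
iqrs <- tapply(toy$cases, toy$Year, IQR)
iqrs
```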

Part 5

Scatterplot of Cases Over Time

In my FIFTH figure, I am going to create a scatterplot of MDR-TB cases over years.

ggplot(df,
       aes(x = Year,
           y = as.numeric(Value))) +
  geom_point(color = "purple", alpha = 0.7) +
  labs(
    title = "Scatterplot of MDR-TB Cases by Year",
    x = "Year",
    y = "Cases"
  ) +
  theme_minimal()

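Part 6

Share of MDR-TB Cases Among Top Countries

In my SIXTH figure, I am going to create a pie chart showing the share of MDR-TB cases on treatment among the five countries with the highest totals.
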
pie_data <- df %>%
  group_by(Country) %>%
  summarise(
    Total = sum(as.numeric(Value), na.rm = TRUE)
  ) %>%
  arrange(desc(Total)) %>%
  slice(1:5)
## Warning: There were 4 warnings in `summarise()`.
## The first warning was:
## ℹ In argument: `Total = sum(as.numeric(Value), na.rm = TRUE)`.
## ℹ In group 36: `Country = "China"`.
## Caused by warning:
## ! NAs introduced by coercion
## ℹ Run `dplyr::last_dplyr_warnings()` to see the 3 remaining warnings.
ggplot(pie_data,
       aes(x = "", y = Total, fill = Country)) +
  geom_col(width = 1) +
  coord_polar("y") +
  labs(
    title = "Top 5 Countries by MDR-TB Cases"
  ) +
  theme_void()
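
A pie chart shows shares only as angles, so it can help to compute the percentages as well. The sketch below uses the top-five totals printed in Part 2. The `Share` and `Label` columns are names introduced here for illustration, and the shares are relative to the top five only, not to all countries.

```r
# Top-five totals copied from the Part 2 output above.
pie_data <- data.frame(
  Country = c("Ukraine", "Kazakhstan", "South Africa", "Philippines", "Indonesia"),
  Total   = c(100349, 88809, 75876, 73420, 56996),
  stringsAsFactors = FALSE
)

# Percentage of each country within the top five (not of the global total).
pie_data$Share <- round(100 * pie_data$Total / sum(pie_data$Total), 1)

# Text label that could be placed on or next to each slice.
pie_data$Label <- paste0(pie_data$Country, " (", pie_data$Share, "%)")

pie_data
```

One possible use is to add `geom_text(aes(label = Label), position = position_stack(vjust = 0.5))` to the pie chart so that each slice carries its percentage.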

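Part 7

Total MDR-TB Cases by Year (Area Chart)

In my SEVENTH figure, I am going to create an area chart of total MDR-TB cases on treatment for each year. The chart uses the yearly totals in fig_dat1, not a running cumulative sum.

ggplot(fig_dat1,
       aes(x = Year,
           y = Total_Cases)) +
  geom_area(fill = "skyblue") +
  labs(title = "Total MDR-TB Cases on Treatment by Year", x = "Year", y = "Total MDR-TB Cases") +
  theme_minimal()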
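
The area chart above plots the yearly totals as they are. If a cumulative series is wanted, it can be built from the same totals with `cumsum()`. The sketch below rebuilds `fig_dat1` from the numbers printed in Part 1. `Cumulative_Cases` is a column name introduced here for illustration.

```r
# Yearly totals copied from the fig_dat1 output in Part 1.
fig_dat1 <- data.frame(
  Year = 2008:2024,
  Total_Cases = c(4650, 30499, 31391, 38244, 46234, 44829, 52905, 59205,
                  62193, 64763, 81146, 77600, 63724, 65834, 78418, 78329, 75485)
)

# Running total: each year adds its own cases to everything before it.
fig_dat1$Cumulative_Cases <- cumsum(fig_dat1$Total_Cases)

tail(fig_dat1, 3)
```

Mapping `y = Cumulative_Cases` in the `ggplot()` call would then draw the cumulative curve instead of the yearly one.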

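Part 8

Interactive Trend Plot

In my EIGHTH figure, I am going to create an interactive version of the yearly trend line using plotly.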

library(plotly)
## Warning: package 'plotly' was built under R version 4.5.3
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
p <- ggplot(fig_dat1,
            aes(x = Year,
                y = Total_Cases)) +
  geom_line(color = "blue") +
  geom_point()

ggplotly(p)

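Conclusion

The analysis shows substantial variation in MDR-TB cases on treatment across countries and years. Yearly totals rose from 4,650 in 2008 to a peak of 81,146 in 2018, fell to 63,724 in 2020, and stayed between 75,000 and 79,000 from 2022 through 2024. Over the whole period, Ukraine (100,349), Kazakhstan (88,809), and South Africa (75,876) have the largest totals. These figures leave out rows whose Value is missing or could not be converted to a number (the plots report 39 such rows removed), so they should be read as lower bounds. The interactive plot makes it easier to explore the values for individual years.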
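
The changes over time can also be put into numbers with year-over-year differences. The sketch below uses the yearly totals printed in Part 1. `Change` and `Pct_Change` are column names introduced here for illustration.

```r
# Yearly totals copied from the fig_dat1 output in Part 1.
yearly <- data.frame(
  Year = 2008:2024,
  Total_Cases = c(4650, 30499, 31391, 38244, 46234, 44829, 52905, 59205,
                  62193, 64763, 81146, 77600, 63724, 65834, 78418, 78329, 75485)
)

# Absolute change from the previous year (the first year has no predecessor).
yearly$Change <- c(NA, diff(yearly$Total_Cases))

# Percentage change relative to the previous year's total.
previous <- c(NA, head(yearly$Total_Cases, -1))
yearly$Pct_Change <- round(100 * yearly$Change / previous, 1)

# Year with the highest total and year with the largest drop.
peak_year <- yearly$Year[which.max(yearly$Total_Cases)]
largest_drop_year <- yearly$Year[which.min(yearly$Change)]

peak_year
largest_drop_year
```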