Analysis Report: Trends in MDR-TB Cases on Treatment

This report summarizes the trends and patterns observed in multidrug-resistant tuberculosis (MDR-TB) cases on treatment, using the visualizations generated from the provided dataset.

1. Overall Trend of MDR-TB Cases on Treatment

This line plot shows the number of MDR-TB cases on treatment aggregated across all countries for each year. It gives a general overview of the progression over time, indicating whether the total caseload is increasing, decreasing, or remaining stable.
library(dplyr)
## Warning: package 'dplyr' was built under R version 4.5.2
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.5.2
df <- read.csv("C:/Users/Aarohan/OneDrive/Documents/dataMDR TB treated 3.csv")
head(df)
## IndicatorCode Indicator ValueType ParentLocationCode
## 1 TB_c_mdr_tx Cases started on MDR-TB treatment numeric EMR
## 2 TB_c_mdr_tx Cases started on MDR-TB treatment numeric EMR
## 3 TB_c_mdr_tx Cases started on MDR-TB treatment numeric EMR
## 4 TB_c_mdr_tx Cases started on MDR-TB treatment numeric EMR
## 5 TB_c_mdr_tx Cases started on MDR-TB treatment numeric EMR
## 6 TB_c_mdr_tx Cases started on MDR-TB treatment numeric EMR
## ParentLocation Location.type SpatialDimValueCode Country
## 1 Eastern Mediterranean Country AFG Afghanistan
## 2 Eastern Mediterranean Country AFG Afghanistan
## 3 Eastern Mediterranean Country AFG Afghanistan
## 4 Eastern Mediterranean Country AFG Afghanistan
## 5 Eastern Mediterranean Country AFG Afghanistan
## 6 Eastern Mediterranean Country AFG Afghanistan
## Period.type Year cases Value
## 1 Year 2008 0 0
## 2 Year 2010 0 0
## 3 Year 2011 21 21
## 4 Year 2012 38 38
## 5 Year 2013 49 49
## 6 Year 2014 88 88
colnames(df)
## [1] "IndicatorCode" "Indicator" "ValueType"
## [4] "ParentLocationCode" "ParentLocation" "Location.type"
## [7] "SpatialDimValueCode" "Country" "Period.type"
## [10] "Year" "cases" "Value"
In my FIRST figure, I am going to create a line graph that shows the overall trend of MDR-TB cases on treatment across different years.
fig_dat1 <- df %>%
  group_by(Year) %>%
  summarise(
    Total_Cases = sum(as.numeric(Value), na.rm = TRUE)
  )
## Warning: There were 15 warnings in `summarise()`.
## The first warning was:
## ℹ In argument: `Total_Cases = sum(as.numeric(Value), na.rm = TRUE)`.
## ℹ In group 3: `Year = 2010`.
## Caused by warning:
## ! NAs introduced by coercion
## ℹ Run `dplyr::last_dplyr_warnings()` to see the 14 remaining warnings.
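The coercion warning above means that some entries in `Value` are not plain numbers, and `na.rm = TRUE` then drops them silently from every total. A minimal sketch of how the affected entries could be located is shown below. The sample vector `raw_value` is hypothetical: the report does not show which strings actually fail, so the two bad entries here are assumptions for illustration only.

```r
# Hypothetical sample of raw entries; the real offending strings in
# df$Value are not shown in the report, so these are assumptions.
raw_value <- c("21", "38", "1 200", "No data", NA, "49")

# Convert once, silencing the coercion warning we are about to inspect.
parsed <- suppressWarnings(as.numeric(raw_value))

# Entries that were present but failed to parse (true NAs are excluded).
failed <- is.na(parsed) & !is.na(raw_value)

unique(raw_value[failed])  # the strings that would need cleaning
sum(failed)                # how many entries are affected
```

On the real data, `raw_value` would be replaced by `df$Value`. Storing the parsed result once in a new column would also avoid repeating the warning in each later figure. The `cases` column printed by `head(df)` matches `Value` in the first six rows and may already be a numeric version, but the report does not confirm this.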
fig_dat1
## # A tibble: 17 × 2
## Year Total_Cases
## <int> <dbl>
## 1 2008 4650
## 2 2009 30499
## 3 2010 31391
## 4 2011 38244
## 5 2012 46234
## 6 2013 44829
## 7 2014 52905
## 8 2015 59205
## 9 2016 62193
## 10 2017 64763
## 11 2018 81146
## 12 2019 77600
## 13 2020 63724
## 14 2021 65834
## 15 2022 78418
## 16 2023 78329
## 17 2024 75485
ggplot(fig_dat1, aes(x = Year, y = Total_Cases)) +
  geom_line(color = "steelblue", linewidth = 1.5) +
  geom_point(color = "red", size = 3) +
  labs(
    title = "Trend of MDR-TB Cases on Treatment by Year",
    x = "Year",
    y = "Total MDR-TB Cases"
  ) +
  theme_minimal()
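To put numbers on whether the caseload is rising or falling, the year-over-year change can be computed from the yearly totals. The sketch below copies the totals from the `fig_dat1` output above into a small data frame (named `trend` here, a name chosen for this example) and uses base R only.

```r
# Yearly totals copied from the fig_dat1 output above.
trend <- data.frame(
  Year = 2008:2024,
  Total_Cases = c(4650, 30499, 31391, 38244, 46234, 44829, 52905, 59205,
                  62193, 64763, 81146, 77600, 63724, 65834, 78418, 78329,
                  75485)
)

# Absolute and percentage change from the previous year (NA for 2008).
previous <- c(NA, head(trend$Total_Cases, -1))
trend$Change <- trend$Total_Cases - previous
trend$Pct_Change <- round(100 * trend$Change / previous, 1)

# Year with the highest total and year with the steepest decline.
peak_year <- trend$Year[which.max(trend$Total_Cases)]
largest_drop_year <- trend$Year[which.min(trend$Change)]

peak_year          # 2018
largest_drop_year  # 2020
```

These totals exclude any rows that failed numeric conversion, so the changes describe the parsed data rather than the complete dataset.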
In my SECOND figure, I am going to create a bar plot showing the countries with the highest MDR-TB cases on treatment.
fig_dat2 <- df %>%
  group_by(Country) %>%
  summarise(
    Total_Cases = sum(as.numeric(Value), na.rm = TRUE)
  ) %>%
  arrange(desc(Total_Cases)) %>%
  slice(1:10)
fig_dat2
## # A tibble: 10 × 2
## Country Total_Cases
## <chr> <dbl>
## 1 Ukraine 100349
## 2 Kazakhstan 88809
## 3 South Africa 75876
## 4 Philippines 73420
## 5 Indonesia 56996
## 6 Pakistan 37602
## 7 China 35775
## 8 Viet Nam 33273
## 9 Uzbekistan 31988
## 10 Peru 30295
``` r
ggplot(fig_dat2, aes(x = reorder(Country, Total_Cases),
                     y = Total_Cases)) +
  geom_col(fill = "steelblue") +
  coord_flip() +
  labs(
    title = "Top 10 Countries by MDR-TB Cases",
    x = "Country",
    y = "Total MDR-TB Cases"
  ) +
  theme_minimal()
```
In my THIRD figure, I am going to create a histogram showing the distribution of MDR-TB cases on treatment.
ggplot(df, aes(x = as.numeric(Value))) +
  geom_histogram(
    bins = 20,
    fill = "darkgreen",
    color = "black"
  ) +
  labs(
    title = "Distribution of MDR-TB Cases",
    x = "Number of Cases",
    y = "Frequency"
  ) +
  theme_minimal()
## Warning in FUN(X[[i]], ...): NAs introduced by coercion
## Warning: Removed 39 rows containing non-finite outside the scale range
## (`stat_bin()`).
In my FOURTH figure, I am going to create a boxplot comparing MDR-TB cases across years.
ggplot(df, aes(x = factor(Year),
               y = as.numeric(Value))) +
  geom_boxplot(fill = "orange") +
  labs(
    title = "Comparison of MDR-TB Cases Across Years",
    x = "Year",
    y = "Cases"
  ) +
  theme_minimal()
## Warning in FUN(X[[i]], ...): NAs introduced by coercion
## Warning: Removed 39 rows containing non-finite outside the scale range
## (`stat_boxplot()`).
In my FIFTH figure, I am going to create a scatterplot of MDR-TB cases over years.
ggplot(df,
       aes(x = Year,
           y = as.numeric(Value))) +
  geom_point(color = "purple", alpha = 0.7) +
  labs(
    title = "Scatterplot of MDR-TB Cases by Year",
    x = "Year",
    y = "Cases"
  ) +
  theme_minimal()
In my SIXTH figure, I am going to create a pie chart showing the share of MDR-TB cases on treatment among the top five countries.
``` r
pie_data <- df %>%
  group_by(Country) %>%
  summarise(
    Total = sum(as.numeric(Value), na.rm = TRUE)
  ) %>%
  arrange(desc(Total)) %>%
  slice(1:5)
```
## Warning: There were 4 warnings in `summarise()`.
## The first warning was:
## ℹ In argument: `Total = sum(as.numeric(Value), na.rm = TRUE)`.
## ℹ In group 36: `Country = "China"`.
## Caused by warning:
## ! NAs introduced by coercion
## ℹ Run `dplyr::last_dplyr_warnings()` to see the 3 remaining warnings.
ggplot(pie_data,
       aes(x = "", y = Total, fill = Country)) +
  geom_col(width = 1) +
  coord_polar("y") +
  labs(
    title = "Top 5 Countries by MDR-TB Cases"
  ) +
  theme_void()
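Pie slices are hard to compare by eye, so it can help to compute each country's percentage share explicitly. The sketch below assumes that `pie_data` matches the first five rows of the `fig_dat2` output shown earlier, since both use the same aggregation. The data frame `top5` and its `Share_Pct` and `Label` columns are names introduced for this example.

```r
# Totals for the five largest countries, copied from the fig_dat2 output.
top5 <- data.frame(
  Country = c("Ukraine", "Kazakhstan", "South Africa",
              "Philippines", "Indonesia"),
  Total = c(100349, 88809, 75876, 73420, 56996)
)

# Share of each country within the top five combined (not the global total).
top5$Share_Pct <- round(100 * top5$Total / sum(top5$Total), 1)

# Text labels that could be used in a legend or table next to the chart.
top5$Label <- paste0(top5$Country, " (", top5$Share_Pct, "%)")

top5
```

The shares are relative to the five countries combined, so they say nothing about each country's share of all reported cases.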
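In my SEVENTH figure, I am going to create an area chart of total MDR-TB cases on treatment by year. The chart uses the yearly totals in fig_dat1, so it shows each year's total rather than a running cumulative sum.
``` r
ggplot(fig_dat1,
       aes(x = Year,
           y = Total_Cases)) +
  geom_area(fill = "skyblue")
```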
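The `geom_area()` call above plots `Total_Cases`, which is the total for each year and not a running sum. If a cumulative curve is wanted, one possible approach is to add a `cumsum()` column first. The sketch below uses the yearly totals copied from the `fig_dat1` output. The names `trend` and `Cumulative_Cases` are introduced for this example.

```r
# Yearly totals copied from the fig_dat1 output earlier in the report.
trend <- data.frame(
  Year = 2008:2024,
  Total_Cases = c(4650, 30499, 31391, 38244, 46234, 44829, 52905, 59205,
                  62193, 64763, 81146, 77600, 63724, 65834, 78418, 78329,
                  75485)
)

# Running total of cases started on treatment since 2008.
trend$Cumulative_Cases <- cumsum(trend$Total_Cases)

# The last row holds the overall total for 2008 to 2024.
tail(trend, 3)
```

Passing `trend` to the same area chart code with `y = Cumulative_Cases` would then draw the cumulative version.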
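In my EIGHTH figure, I am going to create an interactive version of the overall trend line plot using plotly.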
library(plotly)
## Warning: package 'plotly' was built under R version 4.5.3
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
p <- ggplot(fig_dat1,
            aes(x = Year,
                y = Total_Cases)) +
  geom_line(color = "blue") +
  geom_point()
ggplotly(p)
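The analysis shows substantial variation in MDR-TB treatment numbers across countries and years. Reported totals rose from 4,650 cases in 2008 to a peak of 81,146 in 2018, fell to 63,724 in 2020, and recovered to between roughly 75,000 and 78,500 per year from 2022 to 2024. Ukraine (100,349), Kazakhstan (88,809), and South Africa (75,876) have the largest totals over the period in this dataset. Rows whose Value could not be converted to a number were dropped from every sum, so totals and rankings may be understated for some countries. The interactive plot makes it easier to explore and read off individual yearly values.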