2025 CHUKA UNIVERSITY-AMMNET WORKSHOP ON MALARIA MODELING
1 Introduction
Data visualization in R with no coding refers to the use of graphical user interface (GUI) tools, built on or integrated with R, that let users create charts and graphs and explore data visually without writing R code by hand. The following tools and packages are suitable for this:
1.1 R Commander (Rcmdr)
Description: A GUI for basic statistical analysis and graphics.
Strengths: Good for beginners and teaching environments.
Limitation: Limited in terms of modern visualization customization.
Installation:
#install.packages("Rcmdr")
#library(Rcmdr)
1.2 esquisse
Description: A ggplot2-based GUI for interactive data visualization.
Strengths: Lets you drag and drop variables to create high-quality plots.
Output: Exports the R code used to generate the plot — useful for learning.
Installation:
#install.packages("esquisse")
#library(esquisse)
#esquisse::esquisser()
1.3 BlueSky Statistics
Description: A point-and-click GUI for R, similar to SPSS, with strong visualization and statistical tools.
Platform: Windows (standalone installer).
Strengths: Clean interface, easy to use for social science researchers.
Website: https://www.blueskystatistics.com
1.4 JASP
Description: Not built in R but integrates with R for backend computation.
Strengths: Easy UI for statistics and plots; focused on reproducibility.
Website: https://jasp-stats.org
1.5 jamovi
Description: GUI for statistical analysis built on R.
Strengths: Open-source, user-friendly, modern interface with ggplot2 integration.
Website: https://www.jamovi.org
1.6 GWalkR
Description: R wrapper for the open-source Graphic Walker, a modern, browser-based visual data exploration tool.
Platform: Runs in a web browser, initiated from R.
Strengths: GUI for no-code plotting and EDA; ideal for presentations, dashboard prototyping, and rapid insights.
Website: GitHub (R package): https://github.com/Kanaries/GWalkR
1.7 Summary table of GUI tools
2 Esquisse R package
The Esquisse R package is an interactive graphical user interface (GUI) that allows you to build and customize ggplot2 visualizations without writing any code. It’s especially helpful for:
Beginners with no programming experience
Educators and students in statistics
Quick exploratory data analysis
2.1 What Does Esquisse Do?
2.2 Esquisse helps you:
Drag and drop variables to define axes, colors, groups, or facets.
Choose plot types like scatter plots, bar charts, boxplots, histograms, etc.
Modify themes, labels, and axis settings interactively.
Automatically generate the corresponding ggplot2 code — so you can copy, tweak, and use it in scripts later.
2.3 How to Launch Esquisse in RStudio
Install and load the package: run install.packages("esquisse") once, then library(esquisse).
Launch the interface by calling the function esquisser().
Load a dataset: You can choose from built-in datasets or upload your own.
2.4 Benefits of Esquisse
No coding needed to create powerful plots.
Encourages learning ggplot2 by showing live-generated code.
Fast prototyping for data exploration.
2.5 Package Information
Name: esquisse
Authors: Fanny Meyer and Victor Perrier (dreamRs)
GitHub: https://github.com/dreamRs/esquisse
Official website: https://dreamrs.github.io/esquisse/
2.6 Getting Started
3 Set a working directory
The working directory is the default location where R looks for files and saves outputs.
setwd("~/2025_CDAM_WORKSHOP_1/2025_DataViZ") # It tells R where to look for files and where to save files
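A minimal sketch of a more defensive way to set the working directory, assuming the folder may not yet exist on every participant's machine. The folder `workshop_dir` below is a hypothetical stand-in created under a temporary directory, not the workshop's real path.

```r
# Hypothetical project folder, created under a temporary directory for illustration
workshop_dir <- file.path(tempdir(), "2025_DataViZ")

# Create the folder only if it is missing
if (!dir.exists(workshop_dir)) {
  dir.create(workshop_dir, recursive = TRUE)
}

old_dir <- setwd(workshop_dir)  # setwd() returns the previous working directory
current_dir <- getwd()          # confirm where R now reads and writes files
print(current_dir)

# Relative file names are now resolved inside the working directory
writeLines("Region,Positive", "example.csv")
file_found <- file.exists(file.path(workshop_dir, "example.csv"))
print(file_found)

setwd(old_dir)                  # restore the original working directory
```

Checking with dir.exists() first avoids the error that setwd() raises when the path is wrong, which is a common stumbling block on a new machine.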
4 Install and Load necessary packages
These packages provide functions for data import, interactive exploration, and visualization.
# Load the packages
library(GWalkR) ## For interactive exploratory data analysis
library(esquisse) ## GUI for creating ggplot2 plots interactively
library(plotly) ## Convert to interactive plot
library(rio) ## For easy data import, export(saving) and conversion
library(patchwork) ## Easily combines multiple plots into cohesive layouts
5 Load the dataset
library(rio) ## For easy data importation
SMdata = import("SMdata.csv") # Mock dataset(SMdata)
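If SMdata.csv is not at hand, a simulated stand-in can be used to follow along. The sketch below is an assumption-laden illustration: the column names match those used in this handout, but the region labels, intervention labels, years, and the incidence rate formula (positives per 1,000 of the total) are hypothetical and do not come from the workshop data.

```r
set.seed(2025)  # make the simulated values reproducible

n <- 60
mock <- data.frame(
  Region = sample(c("Region A", "Region B", "Region C"), n, replace = TRUE),  # hypothetical labels
  Year = sample(2020:2024, n, replace = TRUE),                                # hypothetical years
  Age = sample(c("Below <5", "5 to 14", "Above >14"), n, replace = TRUE),
  Intervention_Type = sample(c("Type 1", "Type 2"), n, replace = TRUE),       # hypothetical labels
  Total = sample(200:1000, n, replace = TRUE)
)

# Positive cases are drawn so that they can never exceed the total
mock$Positive <- rbinom(n, size = mock$Total, prob = 0.3)

# Assumed definition: positives per 1,000 of the total
mock$Incidence_Rate <- round(1000 * mock$Positive / mock$Total, 1)

# Write to a temporary CSV and read it back, mimicking the import step
csv_path <- file.path(tempdir(), "SMdata_mock.csv")
write.csv(mock, csv_path, row.names = FALSE)
SMdata_mock <- read.csv(csv_path)
str(SMdata_mock)
```

The same file can also be read with rio::import(csv_path), which is how the handout loads the real dataset.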
6 Exploratory data analysis (EDA)
Before we start visualizing our data, we need to understand the characteristics of our data.
EDA is a critical step before building models, as it helps in:
✅ Understanding the data structure and identifying inconsistencies.
✅ Detecting missing values, outliers, and unusual patterns.
✅ Selecting appropriate features for predictive modeling.
✅ Improving data preprocessing and transformation steps.
✅ Summarizing key characteristics of a dataset.
6.1 Explore the dataset
#head(SMdata)
#dim(SMdata) # for dimensions of dataset
#str(SMdata) # for structure of dataset
#names(SMdata) # for features/ variable names in the dataset
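Two of the EDA goals listed above, detecting missing values and detecting outliers, can be sketched with base R. The data frame `demo` is a small hypothetical example, not SMdata, and the 1.5 × IQR threshold is the conventional box plot rule rather than a workshop requirement.

```r
# Small hypothetical data frame standing in for SMdata
demo <- data.frame(
  Region = c("A", "A", "B", "B", "C", "C", "C"),
  Positive = c(120, 135, NA, 128, 131, 560, 125)
)

# Count missing values in each column
missing_per_column <- colSums(is.na(demo))
print(missing_per_column)

# Flag outliers with the 1.5 * IQR rule
q <- unname(quantile(demo$Positive, probs = c(0.25, 0.75), na.rm = TRUE))
iqr <- q[2] - q[1]
lower <- q[1] - 1.5 * iqr
upper <- q[2] + 1.5 * iqr
is_outlier <- !is.na(demo$Positive) & (demo$Positive < lower | demo$Positive > upper)

# Show the rows flagged as outliers
print(demo[is_outlier, ])
```

On the real dataset, the same calls would be applied to SMdata and to whichever numeric column is of interest.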
7 Data Visualization
Data visualization helps in understanding patterns, trends, and relationships in data.
It is a crucial element in scientific research, enabling researchers to interpret and communicate their results effectively.
8 Best Practices
✅ Choose the Right Chart Type (e.g., bar charts for categories, line charts for trends).
✅ Follow Design Principles (simplicity, consistency, accessibility).
✅ Use Storytelling to highlight key insights and structure visuals logically.
✅ Avoid Common Pitfalls (misleading scales, cluttered visuals, unnecessary 3D charts).
8.1 Box Plot: Used for detecting outliers and understanding the spread of data.
#esquisse::esquisser(SMdata) # Opens the GUI interface in RStudio
library(ggplot2)
ggplot(SMdata) +
aes(x = Region, y = Positive, fill = Region) +
geom_boxplot() +
scale_fill_brewer(palette = "Set1",
direction = 1) +
labs(title = "Boxplot of Positive cases by Region") +
theme_minimal() +
theme(plot.title = element_text(hjust = 0.5)) +
theme(legend.position = "none")
## Histogram: Used for understanding the distribution of a single variable.
library(ggplot2)
ggplot(SMdata) +
aes(x = Positive) +
geom_histogram(bins = 10L, fill = "#4513D9",color= "white") +
labs(x = "Positive Malaria Cases",
title = "Histogram") +
theme_minimal() +
theme(plot.title = element_text(size = 16L, face = "bold",
hjust = 0.5))
#esquisse::esquisser(SMdata) # Opens the GUI interface in RStudio
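The histogram above fixes the number of bins at 10. As a rough guide for choosing that number, base R ships two standard rules, Sturges and Freedman-Diaconis. The sketch below applies them to simulated counts; the vector `positive` is hypothetical and only stands in for SMdata$Positive.

```r
set.seed(1)
positive <- rpois(200, lambda = 150)  # hypothetical case counts

# Sturges' rule: ceiling(log2(n) + 1), depends only on the sample size
bins_sturges <- nclass.Sturges(positive)

# Freedman-Diaconis rule: based on the IQR, so it adapts to the spread of the data
bins_fd <- nclass.FD(positive)

print(c(Sturges = bins_sturges, FD = bins_fd))
```

Either value can then be passed to geom_histogram(bins = ...). These rules are starting points only; it is still worth trying a few values and looking at the result.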
8.2 Scatter Plot: Used for understanding relationships between two numerical variables.
library(ggplot2)
ggplot(SMdata) +
  aes(x = Total, y = Positive, colour = Region) +
  geom_point(size = 2.4, shape = "circle") +
  scale_color_hue(direction = 1) +
  labs(x = "Total Malaria Cases", y = "Positive Cases",
       title = "Scatter Plot of Positive cases vs Total malaria cases",
       caption = "Source: CDAM Experts, 2025") +
  theme_minimal() +
  theme(legend.position = "bottom",
        plot.title = element_text(size = 16L, face = "bold", hjust = 0.5),
        plot.caption = element_text(size = 12L, face = "bold.italic"),
        axis.title.x = element_text(size = 12L, face = "bold"),
        axis.title.y = element_text(size = 12L, face = "bold"),
        axis.text.x = element_text(size = 12L),
        axis.text.y = element_text(size = 12L),
        legend.text = element_text(size = 12L),
        legend.title = element_text(face = "bold", size = 12L))
# esquisse::esquisser(SMdata)
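A scatter plot shows the relationship visually; a correlation coefficient puts a number on it. The sketch below uses two short hypothetical vectors in place of SMdata$Total and SMdata$Positive.

```r
# Hypothetical values standing in for Total and Positive
total <- c(200, 350, 500, 650, 800, 950)
positive <- c(55, 90, 160, 170, 260, 270)

# Pearson measures linear association
r_pearson <- cor(total, positive, method = "pearson")

# Spearman measures monotonic association using ranks
r_spearman <- cor(total, positive, method = "spearman")

print(round(c(pearson = r_pearson, spearman = r_spearman), 3))
```

Here the Spearman coefficient is exactly 1 because the positive counts always rise with the totals, while the Pearson coefficient is slightly lower because the rise is not perfectly linear.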
8.3 Scatter Plot with Facets: Used for comparing the relationship between two numerical variables across groups.
#esquisse::esquisser(SMdata)
library(ggplot2)
ggplot(SMdata) +
aes(x = Total, y = Positive, colour = Region, size = Positive) +
geom_point() +
scale_color_hue(direction = 1) +
theme_classic() +
facet_wrap(vars(Year))
#esquisse::esquisser(SMdata)
library(ggplot2)
ggplot(SMdata) +
  aes(x = Total, y = Positive, colour = Region) +
  geom_point(size = 3.05, shape = "diamond") +
  scale_color_hue(direction = 1) +
  labs(x = "Total Cases", y = "No. of Positive Cases",
       title = "Scatter plot by Age group",
       caption = "Source: Author, 2025") +
  theme_bw() +
  theme(legend.position = "bottom",
        plot.title = element_text(size = 20L, face = "bold", hjust = 0.5),
        plot.caption = element_text(size = 12L, face = "bold.italic"),
        axis.title.x = element_text(size = 15L, face = "bold"),
        axis.title.y = element_text(size = 15L, face = "bold"),
        axis.text.x = element_text(size = 15L),
        axis.text.y = element_text(size = 15L),
        legend.text = element_text(size = 12L)) +
  facet_wrap(vars(Age))
## Line Plot: Used for showing trends over time or continuous data.
#esquisse::esquisser(SMdata)
library(ggplot2)
ggplot(SMdata) +
  aes(x = Total, y = Positive) +
  geom_line(colour = "#E1010D") +
  labs(title = "Line plot of positive cases vs Total malaria cases",
       caption = "Source: CDAM Expert, 2025") +
  theme_minimal() +
  theme(plot.title = element_text(size = 15L, face = "bold", hjust = 0.5),
        plot.caption = element_text(size = 12L),
        axis.title.x = element_text(size = 12L, face = "bold"),
        axis.title.y = element_text(size = 12L, face = "bold"),
        axis.text.x = element_text(size = 12L),
        axis.text.y = element_text(size = 12L))
8.4 Bar Chart: Used for comparing categorical data.
#esquisse::esquisser(SMdata)
library(ggplot2)
ggplot(SMdata) +
aes(x = Intervention_Type, fill = Intervention_Type) +
geom_bar() +
scale_fill_hue(direction = 1) +
labs(x = "Intervention Method",
y = "Count of Malaria Cases",
title = "BAR CHART OF MALARIA CASES BY INTERVENTION METHOD",
caption = "Source: Mock data, 2025") +
theme_classic() +
theme(plot.title = element_text(size = 13L,
face = "bold", hjust = 0.5),
plot.caption = element_text(size = 11L, face = "bold.italic"),
axis.title.y = element_text(size = 12L,
face = "bold"), axis.title.x = element_text(size = 12L, face = "bold")) +
theme(legend.position = "none")
#esquisse::esquisser(SMdata)
library(ggplot2)
ggplot(SMdata) +
  aes(x = Intervention_Type, fill = Intervention_Type) +
  geom_bar() +
  scale_fill_hue(direction = 1) +
  labs(x = "Intervention Method", y = "Count of Malaria Cases",
       title = "BAR CHART OF MALARIA CASES BY INTERVENTION METHOD",
       caption = "Source: CDAM Expert, 2025") +
  coord_flip() +
  theme_classic() +
  theme(plot.title = element_text(size = 13L, face = "bold", hjust = 0.5),
        plot.caption = element_text(size = 11L, face = "bold.italic"),
        axis.title.x = element_text(size = 12L, face = "bold"),
        axis.title.y = element_text(size = 12L, face = "bold")) +
  theme(legend.position = "none")
ggplot(SMdata, aes(Region, Positive, fill = Region)) +
geom_col() +
coord_flip() +
facet_wrap(vars(Year), scales = "free_y") +
theme_bw()
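The bar charts in this section use both geom_bar() and geom_col(), which differ in what the bar height means. The sketch below, on a small hypothetical data frame, is meant only to illustrate that difference: geom_bar() counts rows per category, while geom_col() stacks the supplied y values so each bar reaches the category sum.

```r
library(ggplot2)

# Hypothetical data standing in for SMdata
demo <- data.frame(
  Intervention_Type = c("Type 1", "Type 1", "Type 2", "Type 2", "Type 2"),
  Positive = c(10, 20, 5, 15, 25)
)

# geom_bar(): bar height = number of rows in each category
bar_plot <- ggplot(demo, aes(x = Intervention_Type)) + geom_bar()
bar_heights <- layer_data(bar_plot)$count

# geom_col(): bar height = stacked y values, i.e. the sum per category
col_plot <- ggplot(demo, aes(x = Intervention_Type, y = Positive)) + geom_col()
col_heights <- tapply(demo$Positive, demo$Intervention_Type, sum)

print(bar_heights)
print(col_heights)
```

When the y-axis is meant to show cases rather than the number of records, geom_col() with the case variable mapped to y is usually the better fit.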
8.5 Pie Chart: Used for showing what portion of the total each category represents.
ggplot(SMdata, aes(x = "", y = Positive, fill = Region)) +
geom_bar(stat = "identity", width = 1) +
coord_polar(theta = "y") +
theme_void() +
labs(fill = "Region")
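A pie chart is easier to read when the share of each slice is known. The sketch below computes those shares with base R on a hypothetical data frame; `demo` and its values are illustrative and are not taken from SMdata.

```r
# Hypothetical data standing in for SMdata
demo <- data.frame(
  Region = c("A", "A", "B", "C", "C"),
  Positive = c(30, 20, 25, 15, 10)
)

# Sum positive cases within each region
region_totals <- aggregate(Positive ~ Region, data = demo, FUN = sum)

# Convert the totals to percentages of the overall sum
region_totals$Percent <- round(100 * region_totals$Positive / sum(region_totals$Positive), 1)

print(region_totals)
```

The summarized table has one row per region, so it can be passed to ggplot() in place of the raw data, with Percent available for slice labels.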
8.6 Violin Plot: Used for understanding the distribution of a variable across categories.
#esquisse::esquisser(SMdata)
library(ggplot2)
ggplot(SMdata) +
  aes(x = Intervention_Type, y = Incidence_Rate) +
  # One violin per intervention method, showing the distribution of incidence rates
  geom_violin(adjust = 1L, fill = "#0C57DE") +
  labs(x = "Intervention Method", y = "Incidence Rate",
       title = "Violin plot") +
  theme_classic() +
  theme(plot.title = element_text(size = 16L, face = "bold", hjust = 0.5),
        axis.title.x = element_text(size = 14L, face = "bold"),
        axis.title.y = element_text(size = 14L, face = "bold"),
        axis.text.x = element_text(size = 14L),
        axis.text.y = element_text(size = 14L))
8.7 Bubble Chart: Used for adding a third variable to a scatter plot (comparing three numerical variables).
library(ggplot2)
p = ggplot(SMdata, aes(x = Total, y = Positive, size = Incidence_Rate)) +
geom_point(alpha = 0.2, color = "blue") +
scale_size_continuous(range = c(3, 8)) +
labs(
title = "Bubble Chart on Malaria Incidence",
x = "Total Tests",
y = "Positive Cases",
size = "Incidence Rate") +
theme_classic()
print(p)
8.8 Interactive Visualizations: Use tools like Plotly to enable user interaction.
library(plotly) ## Convert to interactive plot
ggplotly(p)
8.9 Animated Plots: Use gganimate to add dynamic animations to your visualizations.
library(rio) ## For easy data importation
SMdata = import("SMdata.csv") # Mock dataset(SMdata)
library(gganimate) ## Adds dynamic animations to your visualizations
# color point in the plot by Region
p1 = SMdata |> ggplot( aes(x = Total, y = Positive, color = Region)) +
geom_point(size=4) +
labs(title="Malaria Positive Cases vs Total by Region",
subtitle="Time: {frame_time}", # Display time as subtitle
x= "Total Cases",
y= "Number of Positive Cases",
caption = " Source: CDAM Experts, 2025") +
theme_bw() +
theme(plot.title = element_text(hjust = 0.5)) # Align the title to the center
# Add animation with gganimate
animated_plot <- p1 +
  transition_time(Year) + # Animate over the Year variable
  ease_aes('linear') # Smooth linear transitions
animated_plot
#ggplotly(p1)
# create a boxplot
p2 = ggplot(SMdata) +
aes(x = Region, y = Positive, fill = Region) +
geom_boxplot() +
scale_fill_brewer(palette = "Set1",
direction = 1) +
labs(title = "Boxplot of Positive cases by Region") +
theme_minimal() +
theme(plot.title = element_text(hjust = 0.5)) +
theme(legend.position = "none")
# Arrange side by side in a grid layout (requires the patchwork package)
combine_plot <- (p | p2) +
  plot_annotation(title = "Combined Plots Example")
# Display the combined plot
print(combine_plot)
ggsave("Plot_2.png")
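Called with only a file name, ggsave() saves the last plot that was displayed, at the size of the current graphics device. A slightly more explicit sketch is shown below; the plot `demo_plot`, the output path, and the chosen dimensions are hypothetical.

```r
library(ggplot2)

# Hypothetical plot standing in for the combined figure
demo <- data.frame(Total = c(200, 400, 600), Positive = c(50, 120, 150))
demo_plot <- ggplot(demo, aes(x = Total, y = Positive)) +
  geom_point()

# Save to a temporary folder so the example does not clutter the project
out_file <- file.path(tempdir(), "demo_plot.png")

# Name the plot object and fix the size and resolution explicitly
ggsave(out_file, plot = demo_plot, width = 8, height = 5, units = "in", dpi = 300)

saved <- file.exists(out_file)
print(saved)
```

Naming the plot object makes the result independent of which plot happened to be drawn last, and fixed dimensions make the figure reproducible across machines.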
library(ggplot2)
pp = ggplot(SMdata) +
aes(x = Age, y = Positive, fill = Age, alpha = Age) +
geom_col(position = "fill") +
geom_boxplot(aes(fill = Age,
alpha = Age, x = Age, y = Total)) +
scale_fill_hue(direction = 1) +
theme_minimal()
# Display the plot (statistical comparisons are added in the next step)
pp
#esquisse::esquisser(SMdata)
library(ggpubr)
pp +
stat_compare_means(method = "kruskal.test", label.y = 700) + # Overall Kruskal-Wallis test
stat_compare_means(comparisons = list(c("5 to 14", "Above >14"),
c("5 to 14", "Below <5"),
c("Above >14", "Below <5")),
method = "wilcox.test", label.y = 550) # Pairwise Wilcoxon tests
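To see the kind of tests behind these annotations, the same two procedures can be run directly in base R. The sketch below uses a small hypothetical data frame with the three age groups named above; the values and the Holm adjustment are illustrative choices, not part of the workshop analysis.

```r
# Hypothetical data: six observations per age group, no tied values
demo <- data.frame(
  Age = rep(c("Below <5", "5 to 14", "Above >14"), each = 6),
  Positive = c(210, 225, 240, 260, 275, 290,
               150, 160, 172, 181, 195, 205,
               90, 101, 113, 124, 136, 147)
)

# Overall test: do the three groups share the same distribution?
kw <- kruskal.test(Positive ~ Age, data = demo)
print(kw)

# Pairwise Wilcoxon rank-sum tests, with Holm adjustment for multiple comparisons
pw <- pairwise.wilcox.test(demo$Positive, demo$Age, p.adjust.method = "holm")
print(pw)
```

The Kruskal-Wallis test answers whether any group differs, and the pairwise tests then indicate which pairs differ. Both are rank-based, so they do not assume normally distributed counts.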
9 GWalkR package
The GWalkR package in R is a powerful tool for data visualization and exploration. Designed to simplify complex data analysis, GWalkR transforms raw data into interactive visualizations, making it easier to understand and interpret your data sets.
9.1 Key features of GWalkR
1️⃣ Interactive Plots: Generate dynamic and interactive plots with minimal code. This helps in identifying trends and patterns in your data quickly.
2️⃣ Ease of Use: With a user-friendly interface, GWalkR is accessible even for those who are new to R. The package integrates seamlessly with other R libraries, enhancing your data analysis workflow.
3️⃣ Customization: GWalkR offers a variety of customization options, allowing you to tailor visualizations to meet your specific needs. From color schemes to plot types, you have complete control over the appearance of your data.
4️⃣ Efficiency: Save time and effort in data analysis by automating the creation of visualizations. GWalkR processes large data sets efficiently, ensuring smooth performance even with extensive data.
5️⃣ Community Support: Benefit from a growing community of users and contributors who share tips, tricks, and support. This makes it easier to troubleshoot issues and stay updated with the latest features.
9.2 Launch the GWalkR tool
library(GWalkR) ## For interactive exploratory data analysis
GWalkR::gwalkr(SMdata)