Author: Amira Mandour
Biostatistician | Clinical Trials & Statistical Modeling
Welcome to My Biostatistics Portfolio
I am Amira Mandour, a biostatistician with expertise in data analysis, statistical modeling, data visualization, and R programming. I specialize in using R and R Markdown to automate reports and build dynamic, reproducible analysis workflows. This portfolio collects projects that span exploratory data analysis, hypothesis testing, predictive modeling, and survival analysis. Each project applies its own set of statistical tests and methodologies.
Below are examples of the formal statistical analysis reports I created for these projects, each with a brief description and a link to the full analysis.
Tools Used: R, R Markdown, ggplot2, ggpubr, corrplot, dplyr, tidyr, gtsummary, psych, finalfit.
Key Techniques Used: Cronbach’s alpha for reliability, KAP scoring system, descriptive statistics, correlation analysis, logistic regression, regression modeling, data visualization with ggplot2 and corrplot.
Project: View Project
Code: View Code on GitHub
Tools Used: R, R Markdown, ggplot2, plotly, naniar, gt, kableExtra, lmtest, ROCR, qqplot, performance, stats (glm), dplyr, tidyr, caret, broom.
Key Techniques Used: Odds ratios, logistic regression, multivariable analysis, variance inflation factor (VIF), residual diagnostics (residuals vs. fitted values, normal Q-Q plot, scale-location plot), ROC curve analysis, model comparison, and visualizations (forest plot, logistic regression curves).
Project: View Project
Code: View Code on GitHub
Tools Used: gtsummary, dplyr, survival, gt, ggplot2, ggtext.
Key Techniques Used: Odds ratio estimation, conditional logistic regression, matched case-control study design, model fit testing (likelihood ratio test, Wald test, score test), pseudo R², forest plot visualization, and predicted risk modeling.
Project: View Project
Code: View Code on GitHub
Tools Used: geepack, MESS, emmeans, ggplot2, dplyr, gt, gtsummary.
Key Techniques Used: Generalized estimating equations (GEE), repeated-measures analysis, longitudinal data analysis, interaction modeling, outcome comparison between groups, and visualization (time-series plots).
Project: View Project
Code: View Code on GitHub
Tools Used: ggplot2, dplyr, gtsummary, ggpubr, gt, stats (glm).
Key Techniques Used: Count data analysis, Poisson regression, incidence rate ratio (IRR) estimation, model comparison by treatment type, controlling for age, and visualizations with ggplot2.
Project: View Project
Code: View Code on GitHub
Tools Used: datarium, rstatix, tidyverse, ggpubr (ggboxplot, ggqqplot, ggline), survival, psych, emmeans, ggplot2, dplyr, gtsummary.
Key Techniques Used: Repeated-measures ANOVA, Mauchly’s test for sphericity, Bonferroni-corrected post-hoc comparisons, effect size (η²), and visualizations (mean plots, boxplots).
Project: View Project
Code: View Code on GitHub
Tools Used: ordinal, gt, gtsummary, broom.mixed, performance, dplyr.
Key Techniques Used: Handling ordinal outcomes, ordinal logistic regression (proportional odds model), likelihood ratio test, model comparison, AIC and log-likelihood, pseudo R² (Nagelkerke), proportional odds assumption test, odds ratios with 95% CI, interpretation of ordered categories, significance testing (α = 0.05), and visualizations (bar plots, box plots).
Project: View Project
Code: View Code on GitHub
Tools Used: survival (survfit, coxph), dplyr, tibble, knitr, kableExtra, survminer.
Key Techniques Used: Subgroup analysis (gender differences), Kaplan-Meier survival analysis, log-rank test, Cox proportional hazards model, hazard ratios with 95% confidence intervals, testing of proportional hazards assumption (Schoenfeld residuals), stratified Cox model, progression-free survival (PFS) analysis, and visualizations (survival curves, PFS curves).
Project: View Project
Code: View Code on GitHub
Tools Used: survival (survfit, coxph), dplyr, gt, gtsummary, tibble, knitr, kableExtra, survminer.
Key Techniques Used: Hazard ratios with 95% confidence intervals, Cox proportional hazards regression, testing of proportional hazards assumption (Schoenfeld residuals), stratified Cox model, model evaluation (C-index, likelihood ratio test, Wald test, score test), and visualizations (survival curves, Schoenfeld residuals plots).
Project: View Project
Code: View Code on GitHub
R & R Markdown: Expertise in using R for data analysis, statistical modeling, and report automation.
Data Visualization: Creating clear and impactful visualizations using ggplot2, plotly, and other visualization libraries.
Statistical Analysis: Experience with hypothesis testing, regression analysis, and survival analysis.
Sample Size Calculation: Performing sample size calculations and power analysis for clinical trials and research studies to support adequately powered study designs.
Predictive Modeling & Repeated Measures: Building predictive models with linear and logistic regression, and fitting repeated-measures and mixed-effects models for longitudinal and clustered data.
Data Cleaning & Preprocessing: Expertise in data wrangling, cleaning, and preprocessing to produce high-quality, analysis-ready datasets, using packages such as dplyr and tidyr.
Survival Analysis: Analyzing time-to-event data with Kaplan-Meier survival curves, log-rank tests, and Cox regression models for clinical and epidemiological research.
You can reach me through the following channels:
Feel free to reach out for any questions or project inquiries!