COSMIC_STUDY: Microsoft Fabric Notebook, Data Analysis Workflow.

JM Waweru 15th January 2026

This Microsoft Fabric notebook documents a transparent, auditable, and replicable statistical analysis workflow built on the Microsoft ecosystem.

It also enables direct methodological comparison with Python and R analytical pipelines.


Study title:

“Comparison of Medical Evacuation Requirements and Mortality Between Personnel with Chronic Medical Conditions and Healthy Counterparts During Military Deployment.”


Abstract

Background:

In recent years, military personnel with stable chronic medical conditions have increasingly been allowed to deploy to operational areas. The operational environment imposes unique physiological and psychological stressors that challenge the health of deployed personnel, particularly those with chronic medical conditions.

Main Objective:

To compare the need for medical evacuation and mortality between personnel with chronic medical conditions and healthy counterparts during military deployment.

Specific Objectives:

  1. Identify personnel with and without chronic medical illness
  2. Compare medical evacuation (MEDEVAC) needs between groups
  3. Compare mortality outcomes between groups

1. Study design:

2. Data Source (Excel Ingestion Layer)

Raw Data Description

Key Variables


3. Data Ingestion (Fabric Conceptual Layer)

Excel Workbook → Power Query Editor
Source: CST_01 … CST_18 worksheets
Target: Unified analytical dataset

4. Data Integration and Cleaning (Power Query)

Step 4.1: Import Worksheets

Excel → Data → Get Data → From Workbook
Select CST_01 … CST_18
Load to Power Query Editor

Step 4.2: Append Queries

Power Query Editor:
Home → Append Queries → Append as New
Input: All CST worksheets
Output: Master_Medical_Dataset
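
For comparison with a Python pipeline, the append step can be sketched as follows. This is a minimal illustration, not the Power Query implementation. It assumes each worksheet has already been read into a list of row dictionaries, and the `Source_Sheet` column is a hypothetical addition for traceability. The example rows are invented, not study data.

```python
def sheet_names(first=1, last=18):
    """Return the worksheet names CST_01 ... CST_18."""
    return [f"CST_{i:02d}" for i in range(first, last + 1)]


def append_sheets(sheets):
    """Stack the rows of every CST worksheet into one master list.

    `sheets` maps worksheet name -> list of row dicts (assumed input shape).
    Each output row is tagged with a hypothetical Source_Sheet column.
    """
    master = []
    for name in sheet_names():
        for row in sheets.get(name, []):
            master.append({**row, "Source_Sheet": name})
    return master


# Toy input with only two of the eighteen worksheets populated.
example = {
    "CST_01": [{"Personnel_ID": "P001", "Age": 34}],
    "CST_02": [{"Personnel_ID": "P002", "Age": 41}],
}
master_medical_dataset = append_sheets(example)
```

Tagging each row with its source worksheet keeps the appended dataset auditable back to the original sheet.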

Step 4.3: Data Cleaning Operations

- Remove duplicate Personnel_ID entries
- Standardize categorical fields (Yes/No)
- Validate chronic illness classification
- Convert date fields to Date format
- Enforce numeric data types (Age)
- Filter records with missing outcome indicators
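
The cleaning operations above can be sketched in Python as follows. Only `Personnel_ID` and `Age` come from this notebook; the field names `Chronic_Illness`, `MEDEVAC`, `Died`, and `Deployment_Date`, the accepted Yes/No spellings, and the date format are assumptions for illustration. The sketch also assumes that "filter" means excluding records whose outcome indicators are missing.

```python
from datetime import datetime

YES_VALUES = {"yes", "y", "1", "true"}   # assumed spellings
NO_VALUES = {"no", "n", "0", "false"}    # assumed spellings


def standardize_yes_no(value):
    """Map a free-text flag to 'Yes' or 'No'; None if missing or unrecognised."""
    if value is None:
        return None
    text = str(value).strip().lower()
    if text in YES_VALUES:
        return "Yes"
    if text in NO_VALUES:
        return "No"
    return None


def clean_rows(rows, date_format="%Y-%m-%d"):
    """Apply the cleaning operations to a list of row dicts."""
    seen_ids = set()
    cleaned = []
    for row in rows:
        pid = row.get("Personnel_ID")
        if pid in seen_ids:              # remove duplicate Personnel_ID entries
            continue
        seen_ids.add(pid)
        out = dict(row)
        # Standardize categorical fields (hypothetical column names).
        for field in ("Chronic_Illness", "MEDEVAC", "Died"):
            out[field] = standardize_yes_no(row.get(field))
        # Convert the date text to a date object; missing dates stay None.
        raw_date = row.get("Deployment_Date")
        out["Deployment_Date"] = (
            datetime.strptime(raw_date, date_format).date() if raw_date else None
        )
        # Enforce a numeric Age; missing ages stay None.
        age = row.get("Age")
        out["Age"] = int(age) if age not in (None, "") else None
        # Exclude records with a missing outcome indicator (assumed meaning).
        if out["MEDEVAC"] is None or out["Died"] is None:
            continue
        cleaned.append(out)
    return cleaned


# Invented rows: one valid, one duplicate ID, one with a missing outcome.
raw = [
    {"Personnel_ID": "P001", "Chronic_Illness": "yes", "MEDEVAC": "N",
     "Died": "No", "Deployment_Date": "2024-03-01", "Age": "34"},
    {"Personnel_ID": "P001", "Chronic_Illness": "yes", "MEDEVAC": "N",
     "Died": "No", "Deployment_Date": "2024-03-01", "Age": "34"},
    {"Personnel_ID": "P002", "Chronic_Illness": "No", "MEDEVAC": None,
     "Died": "No", "Deployment_Date": "2024-03-05", "Age": 41},
]
cleaned = clean_rows(raw)
```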

5. Missing Data Handling

Pairwise deletion (available-case analysis):
- Missing values excluded only for the specific test
- No listwise deletion applied
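
The difference between the pairwise approach and listwise deletion can be shown with a small Python sketch. The helper name `available_values` and the three rows are hypothetical, and missing values are assumed to be stored as `None`.

```python
def available_values(rows, variable):
    """Return the non-missing values of one variable (pairwise approach)."""
    return [row[variable] for row in rows if row.get(variable) is not None]


# Invented rows; None marks a missing value.
rows = [
    {"Age": 34, "BMI": 24.1},
    {"Age": None, "BMI": 27.3},
    {"Age": 41, "BMI": None},
]

ages = available_values(rows, "Age")   # used only by tests on Age
bmis = available_values(rows, "BMI")   # used only by tests on BMI

# Listwise deletion, shown for contrast only: keeps fully complete rows.
complete_rows = [r for r in rows if all(v is not None for v in r.values())]
```

Here each variable keeps two of three observations, whereas listwise deletion would leave a single row for every test.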

6. Analytical Group Definition

Group A: Personnel with chronic medical illness
Group B: Personnel without chronic medical illness

Derived variables:


7. Descriptive Statistics (Excel Analytics Layer)

=COUNT()
=COUNTIF()
=AVERAGE()
=MEDIAN()
=STDEV.S()
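
For cross-checking against a Python pipeline, the same summaries can be sketched with the standard library. The function names `describe` and `count_if` are hypothetical, and the ages are invented values, not study data.

```python
import statistics


def describe(values):
    """Summarise one numeric variable, ignoring missing (None) entries."""
    present = [v for v in values if v is not None]
    return {
        "count": len(present),                  # =COUNT()
        "mean": statistics.mean(present),       # =AVERAGE()
        "median": statistics.median(present),   # =MEDIAN()
        "sd": statistics.stdev(present),        # =STDEV.S(), n - 1 denominator
    }


def count_if(values, target):
    """Count the entries equal to target, like =COUNTIF()."""
    return sum(1 for v in values if v == target)


ages = [34, 41, None, 29, 36]
summary = describe(ages)
n_chronic = count_if(["Yes", "No", "Yes"], "Yes")
```

`statistics.stdev` uses the sample (n - 1) denominator, which is the same convention as `=STDEV.S()`.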

Outputs:


8. Visualization (Power BI Experience)

Power BI Desktop:
Get Data → Excel Workbook
Select Master_Medical_Dataset
Load to Data Model

Visuals:


9. Inferential Statistics (Real Statistics add-in for Excel)

Significance Level

α = 0.05

Student’s t-test for a difference in mean weight and mean BMI between the two groups

Real Statistics → T Tests → Two-Sample t-Test
Input: Chronic vs Non-chronic groups
Output: t-statistic, df, p-value
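
As a cross-check on the add-in output, the pooled-variance (Student's) t statistic and its degrees of freedom can be computed with a short Python sketch. It assumes equal variances in the two groups. The p-value is omitted because the Python standard library has no t distribution; a library such as SciPy (`scipy.stats.ttest_ind`) would supply it. The BMI values are invented for illustration.

```python
import math
import statistics


def pooled_t_test(group_a, group_b):
    """Return (t statistic, degrees of freedom) for Student's two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    # Pooled variance: the two sample variances weighted by their own df.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df


# Invented BMI values, not study data.
bmi_chronic = [27.0, 29.0, 31.0]
bmi_non_chronic = [23.0, 25.0, 27.0]
t_stat, df = pooled_t_test(bmi_chronic, bmi_non_chronic)
```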

Mann–Whitney U test for a difference in median age between the two groups

Real Statistics → Nonparametric Tests → Mann–Whitney
Input: Chronic vs Non-chronic groups
Output: U statistic, z-score, p-value

10. Validation and Quality Control

- Cross-check outputs against raw data
- Verify Power Query steps
- Manual validation of statistical results