Workforce Drivers of Employee Performance

A board-level analysis of inq Digital Nigeria Limited

Yewande Amund — Head of Human Capital, inq Digital Nigeria Limited

2026-05-19

Audience: inq Digital Board of Directors · Author: Yewande Amund, Head of Human Capital · Purpose: evidence-based recommendation on the workforce factor that most strongly influences employee performance. The analysis is fully reproducible: every figure on every page below is produced from the underlying HR appraisal extract (inq_Digital_HR_Data.xlsx), which sits in the same folder as this report.

1 Executive Summary

We applied five Exploratory & Inferential Analytics techniques (EDA, Visualisation, Hypothesis Testing, Correlation Analysis and Regression) to the FY 2025/26 performance-appraisal dataset for inq Digital Nigeria Limited (n = 103 employees across 8 departments). The headline finding is that departmental membership explains far more of the variation in employee performance than training, attendance and tenure combined. A regression on the three candidate workforce levers alone explains only about 5 % of the variation in performance and is not statistically significant (F-test p ≈ 0.18). Adding department as a control lifts the explained variation sevenfold to about 35 %, and the model becomes highly significant (F-test p < 0.001). Performance also differs significantly across departments (ANOVA F = 5.77, p < 0.001): Legal (mean 3.70 / 5) and Finance (3.52) lead the company, while Administration trails at 2.66. The widely held expectation that more training hours produce better performance is not supported in this data: the correlation is in fact slightly negative (r = −0.20, p ≈ 0.05), most plausibly because training is currently assigned reactively to underperformers. The single recommendation we ask the Board to consider is to redesign training allocation as a proactive development cohort and to invest in first-line-management coaching in the two departments where the performance gap is widest (Administration and Sales & Marketing).

2 Professional Disclosure

I am Yewande Amund, Head of Human Capital at inq Digital Nigeria Limited — a privately-held digital technology services company that delivers enterprise connectivity, cloud and managed-service solutions to corporate and public-sector clients across Nigeria and sub-Saharan Africa. The five techniques in this paper map directly to live operational decisions on my desk:

Exploratory Data Analysis (EDA) is the always-on substrate that runs at the start of every people-analytics cycle: missing-value scans, distribution checks and outlier flags before any modelling begins. It is how we make sure we are operating on the same set of facts as Finance and the Executive.
Data Visualisation is how findings travel from my team to the Executive Committee and the Board. Histograms, boxplots and scatterplots are the lingua franca that lets a non-technical leader follow the story in seconds.
Hypothesis Testing is how I separate signal from noise. With n = 103 across eight departments, formally stated null and alternative hypotheses — and p-values paired with effect sizes — keep the People-Operations conversation honest.
Correlation Analysis is the first lens I use to identify candidate drivers of performance and to decide which variables earn a place in the regression model.
Regression is the workhorse for the analytical question. Coefficients, partial effects and R² together turn a noisy table of HR observations into a ranked list of levers I can present to the Board.

3 Data Collection & Sampling

Field	Value
Source	inq Digital’s Human-Capital Information System (HRIS). Performance appraisal scores are entered by line managers, calibrated by HR Business Partners, and locked at the end of each fiscal year.
Collection method	Direct workbook export from the HRIS — sheet `Performance Appraisal`.
Sampling frame	All active permanent staff at the cut-off date (FY 2025/26 mid-year window) across the company’s eight departments.
Sample size	n = 103 employees (full census; not a sample).
Time period	Performance covers fiscal years FY 2022/23 → FY 2025/26 (FY 25/26 is the focal outcome). Tenure, training and attendance are point-in-time snapshots at the cut-off.
Ethics & consent	All employee identifiers are pseudonymised at source (e.g. `NGA5745`). The dataset is held under inq Digital’s data-protection policy aligned with the Nigeria Data Protection Act (NDPA, 2023). Analysis was performed in a controlled environment with no row-level data leaving company systems. The Human-Capital Business-Partner team has approved use of the dataset for analytics development.

4 Data Description

The dataset contains 103 permanent staff across 8 departments, with the focal outcome FY 25/26 Overall Performance Score (0–5 scale), three candidate predictors (Tenure, Training Hours and Attendance Rate) and three prior-year scores (FY 22/23 to FY 24/25). There are no missing values for the focal outcome or the predictors. Prior-year scores carry some missingness (employees who joined later have no earlier score) and the Biannual indicator is essentially empty; both are excluded from the hypothesis tests and regression models, and prior-year scores appear only in the correlation matrix.

5 The Analytical Question

Which workforce factor most strongly influences employee performance in a digital technology company?

Each of the five techniques below contributes one piece of evidence towards this single question, and the Integrated Findings section combines them into one Board-level recommendation.

6 Analysis 1 — Exploratory Data Analysis

6.1 Theory recap

EDA is the disciplined first look: descriptive statistics, missing-value scans, outlier flags and shape diagnostics (skew, kurtosis) before any modelling. The combination of summary() and Tukey’s 1.5 × IQR rule answers “what does the workforce look like?” on a single page.
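To make the 1.5 × IQR rule and the skew diagnostic concrete, here is a minimal sketch in Python. The report's own figures are produced in R; the function names and the training-hours values below are hypothetical and are not drawn from the appraisal extract. The `inclusive` quantile method is chosen because it matches the default quartile definition that R's summary() reports.

```python
import statistics

def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences from Tukey's k x IQR rule."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def sample_skewness(values):
    """Moment-based skewness: negative = longer left tail, positive = longer right tail."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical training hours for twelve employees (illustrative only).
training_hours = [8, 10, 12, 14, 15, 16, 18, 20, 22, 24, 26, 55]

low, high = tukey_fences(training_hours)
outliers = [v for v in training_hours if v < low or v > high]
skew = sample_skewness(training_hours)

print(f"fences=({low:.2f}, {high:.2f}) outliers={outliers}")  # fences=(0.00, 36.00) outliers=[55]
print(f"skewness is {'positive' if skew > 0 else 'non-positive'}")
```

A flagged value is a prompt for review, not an automatic exclusion: a single employee with 55 training hours may be a genuine case rather than a data-entry error.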

6.2 Business justification

Before recommending HR investments to the Board, we need to know whether the workforce is homogeneous or polarised, whether outlier scores exist, and whether the data quality is adequate to support a formal recommendation.

What the EDA tells the Board. Performance is mildly left-skewed (most staff cluster between 3 and 4 with a smaller tail at the lower end). Tenure is strongly right-skewed (a small group of long-tenured staff sits well above the mean of ≈ 9 years). Training hours are also right-skewed (mean ≈ 21, max 55) — a handful of employees consume far more training than typical. Attendance is tightly clustered (sd 0.04) — variation here is small. There are no extreme data-quality outliers that would invalidate the downstream analysis.

7 Analysis 2 — Data Visualisation

7.1 Theory recap

A statistic summarises; a chart shows. Five visuals are sufficient to tell the HR story coherently: the distribution of the outcome, the distribution of attendance, training across departments, and the two key bivariate relationships (training vs performance, attendance vs performance). All charts below are interactive — hover for exact values, click the legend to filter, drag to zoom.

7.2 Business justification

The Board meets monthly. Interactive charts let any director probe a data point without raising a follow-up question, and all five fit on a single page of the read-ahead pack.

Visual 1. Distribution of FY 25/26 Performance Scores

Visual 2. Training hours by department

Visual 3. Attendance rate distribution

Visual 4. Training Hours vs Performance (with linear fit)

Visual 5. Attendance Rate vs Performance (with linear fit)

Visual story. Performance clusters in the mid-range; attendance is remarkably uniform; training hours vary by department. The two bivariate scatters point in opposite directions: training shows a slight negative slope (more training, slightly lower performance), most plausibly because training is allocated reactively to weak performers, while attendance shows a slight positive slope. Both relationships are weak; we test them formally next.

8 Analysis 3 — Hypothesis Testing

8.1 Theory recap

A hypothesis test starts with a null (H₀ — usually “no effect”), an alternative (H₁), an α (typically 0.05) and an appropriate test statistic. For continuous data we use Pearson’s correlation test (for bivariate strength) and one-way ANOVA (for differences across ≥ 3 groups).
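As an illustration of what the one-way ANOVA computes, the sketch below derives the F statistic from between-group and within-group sums of squares. It is a Python sketch under stated assumptions: the department labels and scores are hypothetical, and the p-value step (a lookup in the F distribution, which R's aov() performs) is deliberately left out because the standard library has no F distribution.

```python
import statistics

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a dict of group name -> list of scores."""
    all_scores = [s for scores in groups.values() for s in scores]
    grand_mean = statistics.fmean(all_scores)
    k, n = len(groups), len(all_scores)
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups.values()
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (s - statistics.fmean(g)) ** 2 for g in groups.values() for s in g
    )
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical appraisal scores for three made-up departments.
scores_by_dept = {
    "Dept A": [3.6, 3.8, 3.5, 3.7],
    "Dept B": [3.2, 3.4, 3.1, 3.3],
    "Dept C": [2.6, 2.8, 2.5, 2.7],
}

f_stat, df_b, df_w = one_way_anova(scores_by_dept)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")  # F(2, 9) = 60.80
```

A large F means the department means are far apart relative to the spread of scores inside each department, which is the pattern the second test question asks about.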

8.2 Business justification

The Board needs binary “is this real, or chance?” answers on two questions: (1) does training relate to performance? and (2) does performance differ across departments? Testing them formally — rather than eyeballing the visuals — is what justifies any follow-on investment recommendation.

H1 interpretation. Training hours and performance are correlated at r = −0.20 (p ≈ 0.05) — just significant, with a counter-intuitive negative sign. The most plausible operational explanation is selection bias: training is currently assigned reactively to staff who scored low in the previous cycle, so the correlation reflects “weak performers get sent on training” rather than “training reduces performance”. This finding alone is enough to justify a Performance-Operations review of how training is allocated.
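The selection-bias reading can be illustrated with a small simulation. This is a sketch under loudly stated assumptions, not a model of inq Digital's workforce: every parameter below (the score scale, the reactive allocation rule, the noise levels) is invented, and the true effect of training on performance is set to exactly zero by construction. Even so, training hours and current performance come out negatively correlated, because both depend on the same underlying ability.

```python
import random

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(42)  # fixed seed so the sketch is repeatable
training, current = [], []
for _ in range(5000):
    ability = rng.gauss(0, 1)
    prior_score = 3.2 + 0.4 * ability + rng.gauss(0, 0.3)
    # Assumed reactive rule: the lower last year's score, the more hours assigned.
    hours = max(0.0, 20 - 15 * (prior_score - 3.2) + rng.gauss(0, 5))
    # Current score depends on ability only; training has zero true effect here.
    current_score = 3.2 + 0.4 * ability + rng.gauss(0, 0.3)
    training.append(hours)
    current.append(current_score)

r = pearson_r(training, current)
print(f"r(training, performance) = {r:.2f} despite a true training effect of zero")
```

The simulation shows only that reactive allocation is sufficient to produce a negative correlation; it does not show that this is what happened in the appraisal data.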

H2 interpretation. F = 5.77, p < 0.001 — strongly significant. Legal (mean 3.70 / 5) and Finance (3.52) sit at the top; Administration (2.66) is well below the rest. This is the strongest single signal in the dataset and points to department-level practices — calibration norms, role design, line-management quality — as the primary lever on performance.

9 Analysis 4 — Correlation Analysis

9.1 Theory recap

Pearson’s correlation matrix summarises pairwise linear relationships among numeric variables, with values in [−1, +1]. A heatmap renders the matrix at a glance. Significance for each pair is tested with the t = r √(n − 2) / √(1 − r²) statistic.
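As a worked check of the significance statistic, plugging in the rounded training-performance values reported in this paper (r = −0.20, n = 103) gives the following. Because r is rounded to two decimals, the result is approximate.

```latex
t \;=\; \frac{r\,\sqrt{n-2}}{\sqrt{1-r^{2}}}
  \;=\; \frac{-0.20\,\sqrt{101}}{\sqrt{1-0.04}}
  \;\approx\; \frac{-0.20 \times 10.05}{0.980}
  \;\approx\; -2.05,
\qquad \text{df} = n - 2 = 101 .
```

The two-sided critical value at α = 0.05 with 101 degrees of freedom is about 1.98, so |t| ≈ 2.05 clears it only narrowly, which is consistent with describing the result as just significant.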

9.2 Business justification

Before fitting a regression we want a ranked list of candidate predictors and a check for redundancy among them. The matrix tells us which pairs to keep, which to drop and which signals are mechanical (e.g. YOY is computed from FY 25/26 and FY 24/25 scores).

Correlation heatmap (interactive — hover for exact values)

Reading the matrix.

Strongest non-mechanical relationship: Score 25/26 ↔︎ Score 24/25 at r = +0.23 (p ≈ 0.03) — last year’s score is a modestly useful predictor of this year’s.
Weakest relationship: Attendance ↔︎ Training Hours at r = −0.01 — essentially zero.
The Training–Performance pair (r = −0.20, p = 0.05) is the one that triggered the H1 finding above.

Managerial implication. None of the three candidate predictors (tenure, training, attendance) exceeds |r| ≈ 0.20 with the focal outcome. We therefore expect a regression on these alone to explain very little, and look to department effects to do the heavy lifting in the regression that follows.

10 Analysis 5 — Regression

10.1 Theory recap

OLS regression fits y = β₀ + β₁x₁ + β₂x₂ + … + ε by minimising the sum of squared residuals. Each β̂ⱼ is the partial effect of xⱼ on y holding the other predictors constant. R-squared is the share of variance explained; the F-test asks whether the model collectively explains more variance than a mean-only model.
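The least-squares fit and R² described above can be sketched in a few lines of dependency-free Python. This is an illustration only: the report's models are fitted in R, the helper names `solve` and `ols` are hypothetical, and the five data rows are made up and noiseless so that the recovered coefficients can be checked by eye.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y):
    """Fit y = b0 + b1*x1 + ... via the normal equations; return (coefficients, r_squared)."""
    x = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(x[0])
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(x, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in x]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1 - ss_res / ss_tot

# Hypothetical (training hours, tenure in years) pairs; scores are generated
# without noise from score = 3.0 - 0.01*hours + 0.05*tenure.
rows = [(10, 2), (20, 5), (30, 3), (40, 8), (15, 1)]
y = [3.0 - 0.01 * h + 0.05 * t for h, t in rows]

beta, r2 = ols(rows, y)
print([round(b, 4) for b in beta], round(r2, 4))
```

With real appraisal data the residuals are not zero, so R² falls below 1 and each coefficient carries a standard error; a statistics package is the right tool for that inference step.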

10.2 Business justification

The Board’s question (“which workforce factor most strongly influences employee performance?”) maps directly onto regression, which ranks predictors by their partial association with the outcome. We fit the brief’s simple, defensible model first, then add department as a categorical control to quantify how much extra variation departmental membership accounts for.

Variance in performance explained by each model

Residual-diagnostic check on the extended model

Direct answer to the analytical question.

The simple model (Training + Attendance + Tenure) explains only ~5 % of the variation in performance and is not statistically significant (F-test p ≈ 0.18). Training is the only term that even approaches significance — and it does so with a negative sign (consistent with reactive allocation, not a causal effect).
The extended model that adds department explains ~35 % of the variation — a sevenfold improvement, statistically significant at p ≈ 10⁻⁵.
Therefore the workforce factor that most strongly influences employee performance at inq Digital is departmental membership (the bundle of calibration norms, role design and line-management quality that travels with it). Training hours and tenure individually contribute little; attendance contributes a small, non-significant positive signal.
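The reported R² values can be tied back to the F-tests with a short worked calculation. It uses the rounded figures quoted above (R² ≈ 0.05 and ≈ 0.35, n = 103) and assumes that department enters the extended model as 7 dummy variables (8 departments minus one reference level), giving 10 predictors in total. The exact statistics in the rendered report may differ slightly because of rounding.

```latex
% Overall F-test of the simple model (k = 3 predictors)
F_{\text{simple}} \;=\; \frac{R^{2}/k}{(1-R^{2})/(n-k-1)}
  \;=\; \frac{0.05/3}{0.95/99} \;\approx\; 1.74

% Partial F-test for adding department (q = 7 dummies, 10 predictors in total)
F_{\text{dept}} \;=\;
  \frac{\bigl(R^{2}_{\text{ext}} - R^{2}_{\text{simple}}\bigr)/q}
       {\bigl(1 - R^{2}_{\text{ext}}\bigr)/(n - 10 - 1)}
  \;=\; \frac{(0.35 - 0.05)/7}{0.65/92} \;\approx\; 6.07
```

The first value sits below the 5 % critical value of roughly 2.70 for F(3, 99), in line with the non-significant simple model; the second sits well above the critical value of roughly 2.1 for F(7, 92), in line with department adding significant explanatory power.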

11 Integrated Findings

Single recommendation to the Board. Reallocate the next-quarter Learning & Development spend from generic training to (a) a proactive development cohort that breaks the “training-as-remediation” pattern, and (b) targeted first-line-management coaching in Administration and Sales & Marketing, where the performance gap with the leading departments (Legal, Finance) is widest. Maintain the existing programme in Legal and Finance; they are the practices we want to model elsewhere.

12 Limitations & Further Work

Modest sample size. n = 103 across eight departments; some sub-groups are small (Legal n = 4, Human Capital n = 3, Innovations & Partnerships n = 7). The ANOVA result is driven by the contrast between Administration and the Legal/Finance leaders; pair-wise Tukey HSD with more data would give a sharper picture.
Causality. Every coefficient here is observational. The negative training-performance correlation is most plausibly a selection artefact, but this dataset cannot confirm that; a quasi-experimental design (propensity-score matching of trained vs untrained employees, or a randomised opt-in for a new development programme) is required for a causal claim.
Latent confounders. Role design, line-manager quality and calibration norms differ across departments and are not captured in the dataset. A future iteration should add manager-effectiveness scores and role-grade.
Outcome scaling. All eight departments are pooled on a 0–5 scale even though calibration norms may differ. With more time we would combine department fixed effects with scores standardised (z-scored) within each department.
Missingness in historic scores. About 25 % of FY 24/25 scores are missing (joiners with < 1 year of tenure). Historic-comparison analyses should be restricted to longer-tenure staff, or the missing scores imputed with care.
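The within-department standardisation mentioned under Outcome scaling could look like the sketch below. It is a Python illustration under stated assumptions: the function name, the record layout and the five example records are hypothetical, and departments with fewer than two staff or zero spread are skipped because a z-score is undefined there.

```python
import statistics
from collections import defaultdict

def zscore_within_department(records):
    """Map employee id -> z-score of their score relative to their own department.

    records: iterable of (employee_id, department, score) tuples.
    """
    by_dept = defaultdict(list)
    for _, dept, score in records:
        by_dept[dept].append(score)
    # Sample mean and standard deviation per department (needs at least 2 scores).
    dept_stats = {
        dept: (statistics.fmean(scores), statistics.stdev(scores))
        for dept, scores in by_dept.items()
        if len(scores) >= 2
    }
    z = {}
    for emp, dept, score in records:
        if dept in dept_stats and dept_stats[dept][1] > 0:
            mean, sd = dept_stats[dept]
            z[emp] = (score - mean) / sd
    return z

# Hypothetical pseudonymised records (illustrative only).
records = [
    ("E1", "Dept A", 3.0),
    ("E2", "Dept A", 4.0),
    ("E3", "Dept A", 3.5),
    ("E4", "Dept B", 2.0),
    ("E5", "Dept B", 3.0),
]

z = zscore_within_department(records)
print({emp: round(val, 2) for emp, val in z.items()})
```

Standardising this way compares each employee with departmental peers, which removes differences in calibration norms but also removes any genuine performance gap between departments, so it complements the pooled analysis rather than replacing it.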

References

Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.). Lawrence Erlbaum Associates.
Field, A. (2018). Discovering Statistics Using IBM SPSS Statistics (5th ed.). SAGE.
Wickham, H., Averick, M., Bryan, J., et al. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686
Wickham, H. (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag.
Sievert, C. (2020). Interactive Web-Based Data Visualization with R, plotly, and shiny. Chapman and Hall/CRC.
Robinson, D., Hayes, A., & Couch, S. (2024). broom: Convert Statistical Objects into Tidy Tibbles. R package version 1.0.x. citation("broom")
Revelle, W. (2024). psych: Procedures for Psychological, Psychometric, and Personality Research. Northwestern University, Evanston, Illinois. citation("psych")
Xie, Y., Cheng, J., & Tan, X. (2024). DT: A wrapper of the JavaScript library “DataTables”. R package version 0.30. citation("DT")
Federal Republic of Nigeria. (2023). Nigeria Data Protection Act. National Assembly.
Mark Analytics. (2025). AI-Powered Data Analytics: a reproducible reporting workflow. https://markanalytics.online/ai-powered-data-analytics/

Appendix — AI Usage Statement

The author used Claude (Anthropic) for two specific tasks: (1) drafting the boilerplate scaffold of the Quarto YAML and section headings to match the required submission rubric, and (2) double-checking R syntax for ggplot2, plotly, lm(), aov() and DT. The research question, the choice of techniques, the interpretation of the negative training-performance correlation as a selection artefact (rather than a causal claim), the recommendation to investigate department-level practices over generic training spend, and the narrative throughout are all the independent professional judgement of Yewande Amund, Head of Human Capital at inq Digital Nigeria Limited. Every numerical result is computed live in this document on the 103-row inq Digital appraisal dataset (inq_Digital_HR_Data.xlsx) and is reproducible end-to-end via quarto render.