This study applies exploratory and inferential analytics to 108 customer survey responses collected across four Mobility service centre branches in Lagos (Lagos Island, Ikeja, Lekki, and Ikoyi) between 6 and 11 May 2026. The central business problem is understanding what drives customer satisfaction at Mobility centres and whether satisfaction levels differ meaningfully across branches, service types, and booking methods. Data were collected via a structured Google Form, distributed to customers after service and via a QR code displayed at the point of sale.
Key findings reveal that overall satisfaction is high (mean rating: 4.49/5), with variation driven by resolution status, staff technical competence, and service duration. ANOVA testing indicates branch differences are not statistically significant (F = 0.761, p = 0.518), suggesting operational consistency across locations, though Lagos Island records the highest mean (4.60) and Ikeja the lowest (4.30). The Welch t-test shows no significant difference between walk-in and pre-booked customers (p = 0.170), with a small effect size (Cohen’s d = 0.263). Spearman correlation analysis identifies staff competence (ρ = 0.586) and resolution status (ρ = 0.566) as the service attributes most strongly associated with overall experience. The linear regression model (R² = 0.641, Adjusted R² = 0.592) identifies fault resolution (β = 0.529, p < 0.001) and staff competence (β = 0.404, p < 0.001) as the two most statistically significant and operationally actionable drivers of satisfaction. The recommendation is that Mobility prioritise first-visit fault resolution and invest in technical staff development, particularly at Ikeja, which records the lowest full-resolution rate (63%), to protect customer retention and referral revenue.
Job Title: Finance Planning and Budget Manager
Organisation: Mobility (Automotive Service Centres)
Sector: Automotive after-sales services
As Finance Planning and Budget Manager, I am directly responsible for annual budget planning, branch-level cost tracking, and financial performance reviews across Mobility’s service network. Customer satisfaction data is central to my work for the following reasons:
Relevance of Exploratory Data Analysis (EDA): Before allocating budget resources across branches, I must understand the baseline distribution of customer experience metrics. EDA reveals where performance is concentrated, where outliers exist, and whether data quality issues might distort resource-allocation decisions. A branch receiving a disproportionate share of complaints, for instance, would require a budget review for staffing or equipment.
Relevance of Data Visualisation: Financial presentations to leadership require clear visualisation of customer experience trends alongside financial metrics. Charts showing satisfaction by branch and service type allow me to link revenue performance to operational quality — a key input in the annual planning cycle.
Relevance of Hypothesis Testing: Budget decisions across branches must be evidence-based. Hypothesis testing allows me to determine whether observed satisfaction differences between branches are statistically significant or merely due to sampling variation — critical before recommending differential investment across locations.
Relevance of Correlation Analysis: Understanding which service attributes (resolution, competence, wait time) most strongly correlate with overall satisfaction helps prioritise where budget should be directed — e.g., staff training versus infrastructure versus booking systems.
Relevance of Linear Regression: Regression modelling allows me to quantify how much each service variable contributes to overall satisfaction, providing a data-backed justification for investment decisions presented to the executive committee. For example, if resolution status is the strongest predictor, the budget case for diagnostic tools and technician training becomes financially defensible.
Source: Primary data collected by the researcher from Mobility service centre customers across four Lagos branches.
Collection Method: A structured Google Form was administered via two channels: (1) a link distributed to customers via SMS/WhatsApp after service completion, and (2) a QR code displayed at the point of sale in each branch, allowing customers to self-administer the survey on their mobile devices at or immediately after the point of service.
Survey Instrument: The form captured 11 variables covering branch identity, service type, booking method, service duration, fault resolution status, staff competence, technician continuity, cost perception, return intention, recommendation likelihood, and an overall experience rating (1–5 Likert scale).
library(qrcode)
# Replace the URL below with your actual Google Form link
form_url <- "https://forms.gle/3Ht9PfmES48QKbsi8"
# Generate and plot the QR code
qr <- qr_code(form_url)
plot(qr)

Sampling Frame: All customers who received a vehicle service at one of the four Mobility branches (Lagos Island, Ikeja, Lekki, Ikoyi) during the collection period.
Sample Size: 108 completed responses.
Time Period: 6 May 2026 to 11 May 2026.
Statistical Rationale: While a larger sample would improve precision, 108 observations is sufficient for the analytical techniques applied: EDA, ANOVA (minimum ~20 per group), correlation, and regression. Branch samples range from 17 (Ikoyi) to 40 (Lagos Island). Three branches meet the ~20-per-group guideline; Ikoyi falls slightly short, so comparisons involving it carry wider uncertainty.
Ethical Notes: No personally identifiable information (PII) was collected. The survey was entirely anonymous. Participation was voluntary. Responses are used solely for academic and internal analytical purposes. No external data-sharing restrictions apply.
Data Citation: [Author Name]. (2026). Mobility Service Centre Customer Satisfaction Survey [Dataset]. Collected from Mobility Lagos branches, Lagos, Nigeria. Data available on request from the author.
# Load required libraries
library(tidyverse)
library(lubridate)
library(readxl)
library(skimr)
library(janitor)
library(knitr)
library(kableExtra)
# Load data
df_raw <- read_excel("Mobility form Responses 1.xlsx")
# Clean column names and convert timestamp to WAT (UTC+1)
df <- df_raw |>
clean_names() |>
mutate(timestamp = with_tz(as.POSIXct(timestamp, tz = "UTC"), tzone = "Africa/Lagos")) |>
rename(
timestamp = timestamp,
branch = which_mobility_branch_did_you_visit,
visit_purpose = what_was_the_primary_purpose_of_your_visit,
booking_method = how_did_you_arrange_your_visit,
service_duration = how_long_did_your_vehicle_spend_at_the_service_centre,
resolution_status = was_the_fault_or_issue_with_your_vehicle_fully_resolved_after_the_service,
staff_competence = how_would_you_rate_the_technical_competence_of_the_staff_who_handled_your_vehicle,
technician_continuity = were_you_attended_to_by_the_same_technician_or_service_advisor_as_your_previous_visit_s,
cost_perception = how_would_you_describe_the_cost_of_the_service_relative_to_your_expectations,
return_likelihood = how_likely_are_you_to_return_to_our_mobility_centres_for_your_next_service,
recommendation_likelihood = how_likely_are_you_to_recommend_our_mobility_centres_to_others,
overall_rating = how_would_you_rate_your_overall_experience_at_mobility_centres_1_very_poor_5_excellent
)

# Dataset dimensions
tibble(
Metric = c("Total Observations", "Total Variables", "Collection Start", "Collection End"),
Value = c(
as.character(nrow(df)),
as.character(ncol(df)),
format(min(df$timestamp), "%d %B %Y"),
format(max(df$timestamp), "%d %B %Y")
)
) |>
kable(caption = "Table 1: Dataset Overview") |>
kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)

| Metric | Value |
|---|---|
| Total Observations | 108 |
| Total Variables | 12 |
| Collection Start | 06 May 2026 |
| Collection End | 11 May 2026 |
# Missing value check — rendered as a clean table
data.frame(
Variable = names(df),
Type = sapply(df, function(x) class(x)[1]),
Missing = colSums(is.na(df)),
Missing_Pct = paste0(round(colSums(is.na(df)) / nrow(df) * 100, 1), "%")
) |>
kable(
caption = "Table 2: Variable Types and Missing Value Check",
col.names = c("Variable", "R Type", "Missing (n)", "Missing (%)"),
row.names = FALSE
) |>
kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)

| Variable | R Type | Missing (n) | Missing (%) |
|---|---|---|---|
| timestamp | POSIXct | 0 | 0% |
| branch | character | 0 | 0% |
| visit_purpose | character | 0 | 0% |
| booking_method | character | 0 | 0% |
| service_duration | character | 0 | 0% |
| resolution_status | character | 0 | 0% |
| staff_competence | character | 0 | 0% |
| technician_continuity | character | 0 | 0% |
| cost_perception | character | 0 | 0% |
| return_likelihood | character | 0 | 0% |
| recommendation_likelihood | character | 0 | 0% |
| overall_rating | numeric | 0 | 0% |
# Overall rating distribution — clean table
df |>
count(overall_rating) |>
mutate(
Percent = paste0(round(n / sum(n) * 100, 1), "%"),
Cumulative = paste0(round(cumsum(n) / sum(n) * 100, 1), "%")
) |>
kable(
caption = "Table 3: Distribution of Overall Experience Rating (1–5)",
col.names = c("Rating", "Count", "Percent", "Cumulative %")
) |>
kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)

| Rating | Count | Percent | Cumulative % |
|---|---|---|---|
| 1 | 2 | 1.9% | 1.9% |
| 2 | 1 | 0.9% | 2.8% |
| 3 | 7 | 6.5% | 9.3% |
| 4 | 30 | 27.8% | 37% |
| 5 | 68 | 63% | 100% |
# Rating descriptive stats — clean table
tibble(
Statistic = c("Minimum", "1st Quartile", "Median", "Mean", "3rd Quartile", "Maximum", "Std Deviation"),
Value = c(
min(df$overall_rating),
quantile(df$overall_rating, 0.25),
median(df$overall_rating),
round(mean(df$overall_rating), 3),
quantile(df$overall_rating, 0.75),
max(df$overall_rating),
round(sd(df$overall_rating), 3)
)
) |>
kable(caption = "Table 4: Descriptive Statistics — Overall Rating") |>
kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)

| Statistic | Value |
|---|---|
| Minimum | 1.000 |
| 1st Quartile | 4.000 |
| Median | 5.000 |
| Mean | 4.491 |
| 3rd Quartile | 5.000 |
| Maximum | 5.000 |
| Std Deviation | 0.815 |
# Branch distribution — clean table
df |>
count(branch) |>
mutate(Percent = paste0(round(n / sum(n) * 100, 1), "%")) |>
arrange(desc(n)) |>
kable(
caption = "Table 5: Responses by Branch",
col.names = c("Branch", "Count", "Percent")
) |>
kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)

| Branch | Count | Percent |
|---|---|---|
| Lagos Island | 40 | 37% |
| Ikeja | 27 | 25% |
| Lekki | 24 | 22.2% |
| Ikoyi | 17 | 15.7% |
# Service type distribution — clean table
df |>
count(visit_purpose) |>
mutate(Percent = paste0(round(n / sum(n) * 100, 1), "%")) |>
arrange(desc(n)) |>
kable(
caption = "Table 6: Responses by Visit Purpose",
col.names = c("Visit Purpose", "Count", "Percent")
) |>
kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)

| Visit Purpose | Count | Percent |
|---|---|---|
| Routine maintenance or scheduled service | 49 | 45.4% |
| Fault diagnosis or repair | 22 | 20.4% |
| Electrical or AC service | 19 | 17.6% |
| Tyre or brake service | 18 | 16.7% |
Variable Descriptions:
| Variable | Type | Description |
|---|---|---|
| timestamp | Date/time | Survey submission timestamp |
| branch | Categorical (4 levels) | Mobility branch visited |
| visit_purpose | Categorical (4 levels) | Reason for visit |
| booking_method | Categorical (4 levels) | How the visit was arranged |
| service_duration | Ordinal (4 levels) | Time vehicle spent at centre |
| resolution_status | Ordinal (3 levels) | Whether the fault was resolved |
| staff_competence | Ordinal (4 levels) | Rated technical competence of staff |
| technician_continuity | Categorical (3 levels) | Whether same technician served the customer |
| cost_perception | Ordinal (5 levels) | Cost relative to expectations |
| return_likelihood | Ordinal (4 levels) | Likelihood of returning |
| recommendation_likelihood | Ordinal (4 levels) | Likelihood of recommending |
| overall_rating | Numeric (1–5) | Overall experience rating — outcome variable |
Data Quality Issues Identified:
read_excel() automatically. No usable sub-daily time series patterns are present.

# Encode ordinal variables numerically for correlation and regression
df <- df |>
mutate(
# Service duration: shorter = lower number
service_duration_num = case_when(
service_duration == "Less than 2 hours" ~ 1,
service_duration == "2 - 4 hours" ~ 2,
service_duration == "4 - 6 hours" ~ 3,
service_duration == "More than 6 hours" ~ 4
),
# Resolution: better = higher
resolution_num = case_when(
resolution_status == "Yes, completely resolved" ~ 3,
resolution_status == "Mostly resolved, minor issues remained" ~ 2,
resolution_status == "I had to return for the same issue" ~ 1
),
# Staff competence: better = higher
competence_num = case_when(
staff_competence == "Excellent" ~ 4,
staff_competence == "Good" ~ 3,
staff_competence == "Fair" ~ 2,
staff_competence == "Poor" ~ 1
),
# Cost perception: cheaper = lower, expensive = higher
cost_num = case_when(
cost_perception == "Much cheaper than expected" ~ 1,
cost_perception == "About right" ~ 2,
cost_perception == "Slightly expensive" ~ 3,
cost_perception == "Very expensive" ~ 4
),
# Return likelihood
return_num = case_when(
return_likelihood == "Definitely will return" ~ 4,
return_likelihood == "Likely to return" ~ 3,
return_likelihood == "Unsure" ~ 2,
return_likelihood == "Very unlikely to return" ~ 1
),
# Recommendation likelihood
recommend_num = case_when(
recommendation_likelihood == "Extremely likely" ~ 4,
recommendation_likelihood == "Likely" ~ 3,
recommendation_likelihood == "Neutral" ~ 2,
recommendation_likelihood == "I will not recommend"~ 1
)
)
cat("Ordinal encoding complete. New numeric columns added.\n")
Ordinal encoding complete. New numeric columns added.
cat("Rows:", nrow(df), "| Columns:", ncol(df), "\n")
Rows: 108 | Columns: 18
# Confirm encoding with a clean table
tibble(
Variable = c("resolution_num", "competence_num", "service_duration_num",
"cost_num", "return_num", "recommend_num"),
Scale = c("1 = returned for same issue → 3 = fully resolved",
"1 = Poor → 4 = Excellent",
"1 = <2 hrs → 4 = >6 hrs",
"1 = Much cheaper → 4 = Very expensive",
"1 = Very unlikely → 4 = Definitely will return",
"1 = Will not recommend → 4 = Extremely likely")
) |>
kable(caption = "Table 7: Ordinal Encoding Key",
col.names = c("Numeric Variable Created", "Encoding Scale")) |>
kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)

| Numeric Variable Created | Encoding Scale |
|---|---|
| resolution_num | 1 = returned for same issue → 3 = fully resolved |
| competence_num | 1 = Poor → 4 = Excellent |
| service_duration_num | 1 = <2 hrs → 4 = >6 hrs |
| cost_num | 1 = Much cheaper → 4 = Very expensive |
| return_num | 1 = Very unlikely → 4 = Definitely will return |
| recommend_num | 1 = Will not recommend → 4 = Extremely likely |
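Because `case_when()` returns `NA` for any response label that matches none of the listed conditions, it is worth confirming that the encoding did not silently drop answers. The sketch below is a minimal illustration on a small hypothetical data frame (`toy`, not the survey data). The label "Slightly cheaper than expected" is an invented example of an unmapped level, not a value known to appear in the form.

```r
library(dplyr)

# Hypothetical toy data standing in for the survey data frame
toy <- tibble(
  cost_perception = c("About right", "Very expensive",
                      "Slightly cheaper than expected")  # invented, unmapped label
)

# Same encoding rule as used for cost_num in the report
toy <- toy |>
  mutate(
    cost_num = case_when(
      cost_perception == "Much cheaper than expected" ~ 1,
      cost_perception == "About right"                ~ 2,
      cost_perception == "Slightly expensive"         ~ 3,
      cost_perception == "Very expensive"             ~ 4
    )
  )

# Count the NAs produced by the encoding in every *_num column
na_counts <- toy |>
  summarise(across(ends_with("_num"), \(x) sum(is.na(x))))

# List the original labels that failed to map
unmapped <- toy |>
  filter(is.na(cost_num)) |>
  distinct(cost_perception)

print(na_counts)
print(unmapped)
```

Running the same two summaries on `df` after the encoding step would show whether every response received a numeric code. A non-zero count would point to a label spelled differently from the one in the `case_when()` rule.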
# Distribution of overall rating — clean table only
df |>
count(overall_rating) |>
mutate(
Percent = paste0(round(n / sum(n) * 100, 1), "%"),
Cumulative = paste0(round(cumsum(n) / sum(n) * 100, 1), "%")
) |>
kable(
caption = "Table 8: Distribution of Overall Experience Ratings",
col.names = c("Rating (1–5)", "Count", "Percent", "Cumulative %")
) |>
kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)

| Rating (1–5) | Count | Percent | Cumulative % |
|---|---|---|---|
| 1 | 2 | 1.9% | 1.9% |
| 2 | 1 | 0.9% | 2.8% |
| 3 | 7 | 6.5% | 9.3% |
| 4 | 30 | 27.8% | 37% |
| 5 | 68 | 63% | 100% |
# Descriptive stats as a clean table
tibble(
Statistic = c("Mean", "Median", "Std Deviation", "Min", "Max",
"% rating 4 or 5", "% rating 5 only"),
Value = c(
round(mean(df$overall_rating), 2),
median(df$overall_rating),
round(sd(df$overall_rating), 2),
min(df$overall_rating),
max(df$overall_rating),
paste0(round(mean(df$overall_rating >= 4) * 100, 1), "%"),
paste0(round(mean(df$overall_rating == 5) * 100, 1), "%")
)
) |>
kable(caption = "Table 9: Descriptive Statistics — Overall Rating") |>
kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)

| Statistic | Value |
|---|---|
| Mean | 4.49 |
| Median | 5 |
| Std Deviation | 0.81 |
| Min | 1 |
| Max | 5 |
| % rating 4 or 5 | 90.7% |
| % rating 5 only | 63% |
# Summary by branch
df |>
group_by(branch) |>
summarise(
n = n(),
mean_rating = round(mean(overall_rating), 2),
median_rating = median(overall_rating),
sd_rating = round(sd(overall_rating), 2),
pct_fully_resolved = round(mean(resolution_num == 3) * 100, 1)
) |>
arrange(desc(mean_rating)) |>
kable(caption = "Table 10: Summary Statistics by Branch",
col.names = c("Branch", "N", "Mean Rating", "Median", "SD", "% Fully Resolved")) |>
kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)

| Branch | N | Mean Rating | Median | SD | % Fully Resolved |
|---|---|---|---|---|---|
| Lagos Island | 40 | 4.60 | 5 | 0.84 | 92.5 |
| Ikoyi | 17 | 4.53 | 5 | 0.72 | 82.4 |
| Lekki | 24 | 4.50 | 5 | 0.93 | 70.8 |
| Ikeja | 27 | 4.30 | 4 | 0.72 | 63.0 |
# Summary by visit purpose
df |>
group_by(visit_purpose) |>
summarise(
n = n(),
mean_rating = round(mean(overall_rating), 2),
pct_resolved = round(mean(resolution_num == 3) * 100, 1)
) |>
arrange(desc(mean_rating)) |>
kable(caption = "Table 11: Summary by Visit Purpose",
col.names = c("Visit Purpose", "N", "Mean Rating", "% Fully Resolved")) |>
kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)

| Visit Purpose | N | Mean Rating | % Fully Resolved |
|---|---|---|---|
| Routine maintenance or scheduled service | 49 | 4.55 | 85.7 |
| Fault diagnosis or repair | 22 | 4.50 | 86.4 |
| Tyre or brake service | 18 | 4.50 | 72.2 |
| Electrical or AC service | 19 | 4.32 | 57.9 |
EDA Interpretation: The overall rating distribution is strongly left-skewed (negatively skewed): responses cluster at the top of the scale with a thin tail towards low ratings. In total, 90.7% of customers rate their experience 4 or 5 out of 5, and 63% award the maximum score of 5. The mean rating is 4.49 (SD = 0.81), with a median of 5. Lagos Island records the highest mean rating (4.60), followed by Ikoyi (4.53) and Lekki (4.50), while Ikeja is the lowest at 4.30. Resolution rates follow the same ordering: Lagos Island leads at 92.5% full resolution, while Ikeja records only 63.0%, the widest operational gap in the dataset. Electrical and AC services show the lowest resolution rate (57.9%), compared to routine maintenance (85.7%), suggesting that technically complex service categories carry the highest satisfaction risk.
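The direction of skew can be checked numerically from the rating counts reported above. The sketch below rebuilds the 108 ratings from those counts (2, 1, 7, 30 and 68 for ratings 1 to 5) and computes a moment-based sample skewness in base R. The helper name `skewness_moment` is my own and is not part of any package used in the report. A negative value indicates a longer tail towards low ratings.

```r
# Rebuild the rating vector from the published frequency table
ratings <- rep(1:5, times = c(2, 1, 7, 30, 68))

# Moment-based skewness: third central moment over (second central moment)^1.5
skewness_moment <- function(x) {
  centred <- x - mean(x)
  mean(centred^3) / mean(centred^2)^1.5
}

rating_mean <- mean(ratings)
rating_sd   <- sd(ratings)
rating_skew <- skewness_moment(ratings)

cat(sprintf("Mean = %.3f | SD = %.3f | Skewness = %.3f\n",
            rating_mean, rating_sd, rating_skew))
```

The reconstructed mean and standard deviation should agree with the descriptive statistics tables, which is a quick consistency check on the frequency counts themselves.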
library(ggplot2)
# Plot 1: Overall rating distribution
ggplot(df, aes(x = factor(overall_rating))) +
geom_bar(fill = "#2196F3", colour = "white", width = 0.6) +
geom_text(stat = "count", aes(label = after_stat(count)), vjust = -0.4, size = 4) +
labs(
title = "Figure 1: Distribution of Overall Experience Ratings",
subtitle = "n = 108 customer responses across all Mobility branches",
x = "Overall Rating (1 = Very Poor, 5 = Excellent)",
y = "Number of Customers"
) +
theme_minimal(base_size = 12) +
theme(plot.title = element_text(face = "bold"))

# Plot 2: Rating distribution by branch
ggplot(df, aes(x = branch, y = overall_rating, fill = branch)) +
geom_boxplot(alpha = 0.7, outlier.colour = "red", outlier.shape = 16) +
geom_jitter(width = 0.15, alpha = 0.3, size = 1.5) +
stat_summary(fun = mean, geom = "point", shape = 18, size = 4, colour = "black") +
labs(
title = "Figure 2: Overall Rating by Branch",
subtitle = "Diamond = mean; red dots = outliers",
x = "Branch",
y = "Overall Rating (1–5)"
) +
scale_fill_brewer(palette = "Set2") +
theme_minimal(base_size = 12) +
theme(legend.position = "none", plot.title = element_text(face = "bold"))

# Plot 3: Resolution status by visit purpose (stacked bar)
df |>
count(visit_purpose, resolution_status) |>
group_by(visit_purpose) |>
mutate(pct = n / sum(n) * 100) |>
ggplot(aes(x = reorder(visit_purpose, -pct), y = pct, fill = resolution_status)) +
geom_col(position = "stack", width = 0.6) +
geom_text(aes(label = sprintf("%.0f%%", pct)),
position = position_stack(vjust = 0.5), size = 3, colour = "white") +
labs(
title = "Figure 3: Resolution Status by Visit Purpose",
subtitle = "Proportion of customers per resolution outcome",
x = "Visit Purpose",
y = "Percentage (%)",
fill = "Resolution Status"
) +
scale_fill_manual(values = c(
"Yes, completely resolved" = "#4CAF50",
"Mostly resolved, minor issues remained" = "#FF9800",
"I had to return for the same issue" = "#F44336"
)) +
theme_minimal(base_size = 12) +
theme(plot.title = element_text(face = "bold"),
axis.text.x = element_text(angle = 15, hjust = 1))

# Plot 4: Heatmap of booking method vs branch
df |>
count(branch, booking_method) |>
ggplot(aes(x = booking_method, y = branch, fill = n)) +
geom_tile(colour = "white", linewidth = 0.8) +
geom_text(aes(label = n), colour = "white", fontface = "bold", size = 4) +
scale_fill_gradient(low = "#BBDEFB", high = "#1565C0") +
labs(
title = "Figure 4: Booking Method vs Branch (Count Heatmap)",
subtitle = "Number of survey responses by branch and booking channel",
x = "Booking Method",
y = "Branch",
fill = "Count"
) +
theme_minimal(base_size = 12) +
theme(plot.title = element_text(face = "bold"),
axis.text.x = element_text(angle = 20, hjust = 1))

# Plot 5: Service duration vs overall rating
df |>
mutate(service_duration = factor(service_duration,
levels = c("Less than 2 hours","2 - 4 hours","4 - 6 hours","More than 6 hours"))) |>
group_by(service_duration) |>
summarise(mean_rating = mean(overall_rating), n = n(), se = sd(overall_rating)/sqrt(n)) |>
ggplot(aes(x = service_duration, y = mean_rating, group = 1)) +
geom_line(colour = "#1565C0", linewidth = 1.2) +
geom_point(aes(size = n), colour = "#1565C0", alpha = 0.8) +
geom_errorbar(aes(ymin = mean_rating - se, ymax = mean_rating + se), width = 0.15, colour = "grey40") +
geom_text(aes(label = sprintf("n=%d", n)), vjust = -1.2, size = 3.5) +
labs(
title = "Figure 5: Mean Overall Rating by Service Duration",
subtitle = "Error bars = ±1 standard error; point size proportional to n",
x = "Time Vehicle Spent at Service Centre",
y = "Mean Overall Rating (1–5)",
size = "Sample Size"
) +
scale_y_continuous(limits = c(3.5, 5.2)) +
theme_minimal(base_size = 12) +
theme(plot.title = element_text(face = "bold"))

Visualisation Narrative: Figures 1–5 together tell a coherent story: Mobility customers are overwhelmingly satisfied (Figure 1), but satisfaction is not uniform across branches (Figure 2). Electrical or AC and tyre or brake services carry the highest non-resolution risk, whereas fault diagnosis or repair and routine maintenance are fully resolved in about 86% of cases (Figure 3). Walk-in traffic dominates Lagos Island, while Corporate/fleet agreements are distributed across all branches (Figure 4). Longer service times are associated with slightly lower mean ratings, with the sharpest drop occurring beyond 4 hours (Figure 5).
H₀: Mean overall rating is equal across all four branches.
H₁: At least one branch has a different mean rating.
# One-way ANOVA
anova_branch <- aov(overall_rating ~ branch, data = df)
summary(anova_branch)
             Df Sum Sq Mean Sq F value Pr(>F)
branch        3   1.53  0.5086   0.761  0.518
Residuals   104  69.46  0.6679
# Effect size (eta-squared)
ss_total <- sum((df$overall_rating - mean(df$overall_rating))^2)
ss_between <- summary(anova_branch)[[1]][["Sum Sq"]][1]
eta_sq <- ss_between / ss_total
cat(sprintf("\nEta-squared (effect size): %.3f\n", eta_sq))
Eta-squared (effect size): 0.021
# Post-hoc Tukey HSD
TukeyHSD(anova_branch)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = overall_rating ~ branch, data = df)

$branch
                          diff        lwr       upr     p adj
Ikoyi-Ikeja         0.23311547 -0.4275823 0.8938132 0.7935439
Lagos Island-Ikeja  0.30370370 -0.2278025 0.8352099 0.4460094
Lekki-Ikeja         0.20370370 -0.3949565 0.8023639 0.8108555
Lagos Island-Ikoyi  0.07058824 -0.5472372 0.6884137 0.9907186
Lekki-Ikoyi        -0.02941176 -0.7058757 0.6470522 0.9994733
Lekki-Lagos Island -0.10000000 -0.6509817 0.4509817 0.9646644
Interpretation: The one-way ANOVA yields F(3, 104) = 0.761, p = 0.518, well above the 0.05 significance threshold. We therefore fail to reject H₀: there is no statistically significant difference in mean overall satisfaction ratings across the four Mobility branches. The eta-squared value of 0.021 indicates a small effect: branch membership accounts for only 2.1% of total variance in ratings. Tukey’s HSD post-hoc test corroborates this: every pairwise comparison has an adjusted p-value of 0.446 or higher (Lagos Island vs. Ikeja is the smallest), and every 95% confidence interval includes zero.
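As a check on the figures quoted above, the F statistic and eta-squared can be reproduced by hand from the ANOVA table. Sums of squares are recovered here as mean square × degrees of freedom, so the last digit is subject to rounding:

```latex
\begin{aligned}
F &= \frac{MS_{\text{between}}}{MS_{\text{within}}}
   = \frac{0.5086}{0.6679} \approx 0.761, \\[4pt]
\eta^2 &= \frac{SS_{\text{between}}}{SS_{\text{between}} + SS_{\text{within}}}
        = \frac{3 \times 0.5086}{3 \times 0.5086 + 104 \times 0.6679}
        \approx \frac{1.526}{70.99} \approx 0.021.
\end{aligned}
```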
From a business perspective, this is a positive finding: customers receive comparably consistent service quality regardless of which Mobility branch they visit. The observed differences in mean ratings (Lagos Island: 4.60 vs. Ikeja: 4.30) are visible in the sample but not statistically reliable at this sample size; they may reflect sampling variation rather than genuine operational differences. For Finance Planning, this means branch-level budget allocation need not be skewed by satisfaction scores alone. The more actionable differentiators are resolution rates (63.0% at Ikeja versus 92.5% at Lagos Island) and staff competence, which vary across branches even if overall ratings do not.
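H₀: Mean overall rating is the same for walk-in and pre-booked customers, where "pre-booked" covers every non-walk-in channel in the data (phone booking, online booking, and corporate/fleet agreement).
H₁: Walk-in customers rate differently from pre-booked customers.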
# Create binary booking variable
df <- df |>
mutate(booking_binary = ifelse(booking_method == "Walk-in", "Walk-in", "Pre-booked"))
walkin <- df$overall_rating[df$booking_binary == "Walk-in"]
prebooked <- df$overall_rating[df$booking_binary == "Pre-booked"]
cat("Walk-in: n =", length(walkin),
"| Mean =", round(mean(walkin), 2),
"| SD =", round(sd(walkin), 2))
Walk-in: n = 44 | Mean = 4.61 | SD = 0.65
cat("\nPre-booked: n =", length(prebooked),
"| Mean =", round(mean(prebooked), 2),
"| SD =", round(sd(prebooked), 2), "\n\n")
Pre-booked: n = 64 | Mean = 4.41 | SD = 0.9
# Welch t-test: does not assume equal variances, so no preliminary Levene's test is run
t.test(walkin, prebooked, var.equal = FALSE)
Welch Two Sample t-test
data: walkin and prebooked
t = 1.3826, df = 105.67, p-value = 0.1697
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.09000918 0.50478191
sample estimates:
mean of x mean of y
4.613636 4.406250
# Effect size: Cohen's d, standardised by the root mean square of the two group SDs (not weighted by group size)
pooled_sd <- sqrt((var(walkin) + var(prebooked)) / 2)
cohens_d <- (mean(walkin) - mean(prebooked)) / pooled_sd
cat(sprintf("\nCohen's d (effect size): %.3f\n", cohens_d))
Cohen's d (effect size): 0.263
cat("Interpretation: |d| < 0.2 = negligible, 0.2–0.5 = small, 0.5–0.8 = medium, >0.8 = large\n")
Interpretation: |d| < 0.2 = negligible, 0.2–0.5 = small, 0.5–0.8 = medium, >0.8 = large
Interpretation: The Welch two-sample t-test yields t(105.67) = 1.383, p = 0.170 — above the 0.05 threshold. We fail to reject H₀: there is no statistically significant difference in mean overall ratings between walk-in (mean = 4.61, SD = 0.65) and pre-booked customers (mean = 4.41, SD = 0.90). Cohen’s d = 0.263 indicates a small practical effect — walk-in customers rate slightly higher on average, but this difference is not statistically reliable at the current sample size.
From a business perspective, this is a reassuring finding: the booking channel does not materially disadvantage any customer group in terms of their satisfaction experience. However, the direction of the effect — walk-in customers rating marginally higher than pre-booked ones — is worth monitoring. One plausible explanation is that walk-in customers arrive with lower pre-formed expectations, while pre-booked customers may expect a more structured, premium experience. Mobility could investigate whether pre-booked customers feel their scheduling advantage translates into tangibly faster or more attentive service. A larger sample collected over a longer period would provide greater statistical power to confirm or refute this directional difference.
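To put a rough number on "a larger sample", the sketch below uses base R's `power.t.test()` to estimate the per-group sample size needed to detect a standardised difference of d = 0.263 with 80% power at the 5% level. It assumes equal group sizes and equal variances, which differs from the Welch setting used above, so the result is an order-of-magnitude guide rather than a design target. The 80% power figure is a conventional choice, not one taken from the report.

```r
# Observed standardised effect size from the walk-in vs pre-booked comparison
observed_d <- 0.263

# Solve for n per group; with sd = 1, delta is expressed in SD units (Cohen's d)
power_result <- power.t.test(
  delta       = observed_d,
  sd          = 1,
  sig.level   = 0.05,
  power       = 0.80,
  type        = "two.sample",
  alternative = "two.sided"
)

# Round up to a whole number of respondents
n_per_group <- ceiling(power_result$n)
cat("Approximate responses needed per group:", n_per_group, "\n")
```

The printed figure can be compared directly with the current group sizes (n = 44 walk-in, n = 64 pre-booked) to judge how much longer the collection window would need to run.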
library(corrplot)
# Select numeric variables
cor_vars <- df |>
select(
overall_rating,
resolution_num,
competence_num,
service_duration_num,
cost_num,
return_num,
recommend_num
) |>
rename(
`Overall Rating` = overall_rating,
`Resolution` = resolution_num,
`Staff Competence` = competence_num,
`Service Duration` = service_duration_num,
`Cost Perception` = cost_num,
`Return Likelihood`= return_num,
`Recommend` = recommend_num
)
# Spearman correlation (appropriate for ordinal variables)
cor_matrix <- cor(cor_vars, method = "spearman", use = "complete.obs")
# Print matrix
round(cor_matrix, 3) |>
kable(caption = "Table 12: Spearman Correlation Matrix") |>
kable_styling(bootstrap_options = c("striped","hover"), full_width = FALSE)

| | Overall Rating | Resolution | Staff Competence | Service Duration | Cost Perception | Return Likelihood | Recommend |
|---|---|---|---|---|---|---|---|
| Overall Rating | 1.000 | 0.566 | 0.586 | -0.298 | -0.121 | 0.686 | 0.684 |
| Resolution | 0.566 | 1.000 | 0.664 | -0.225 | -0.289 | 0.573 | 0.500 |
| Staff Competence | 0.586 | 0.664 | 1.000 | -0.165 | -0.143 | 0.605 | 0.540 |
| Service Duration | -0.298 | -0.225 | -0.165 | 1.000 | 0.285 | -0.269 | -0.270 |
| Cost Perception | -0.121 | -0.289 | -0.143 | 0.285 | 1.000 | -0.229 | -0.236 |
| Return Likelihood | 0.686 | 0.573 | 0.605 | -0.269 | -0.229 | 1.000 | 0.765 |
| Recommend | 0.684 | 0.500 | 0.540 | -0.270 | -0.236 | 0.765 | 1.000 |
# Heatmap
corrplot(cor_matrix,
method = "color",
type = "upper",
addCoef.col = "black",
number.cex = 0.75,
tl.cex = 0.85,
col = colorRampPalette(c("#F44336","white","#1565C0"))(200),
title = "Figure 6: Spearman Correlation Heatmap",
mar = c(0,0,2,0))

Correlation Interpretation:
Spearman’s rank correlation is used rather than Pearson’s because most variables are ordinal (ranked categories), not continuous. Spearman makes no assumption of normality or equal intervals between ranks.
The strongest correlations with Overall Rating in the matrix above are:
Resolution Status (ρ = 0.566): Whether the customer’s fault was fully resolved is strongly associated with overall satisfaction. This is the most operationally actionable finding: reducing repeat-visit rates should be Mobility’s top priority.
Staff Competence (ρ = 0.586): Customers who rate staff as excellent or good consistently award higher overall ratings. This is the strongest correlate among the service attributes and supports investment in technician training and certification programmes.
Return Likelihood and Recommend (ρ = 0.686 and ρ = 0.684 with Overall Rating; ρ = 0.765 with each other): These two loyalty measures show the highest correlations with overall rating in the matrix, confirming that overall satisfaction is a reliable proxy for loyalty intent, a useful insight for financial planning around customer lifetime value.
Notable weak correlation: Cost perception shows only a weak negative correlation with overall rating (ρ = −0.121), weaker than expected. Customers appear willing to pay more or less than anticipated, provided the service resolves their problem effectively, suggesting price sensitivity is secondary to resolution quality at Mobility.
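The correlation matrix reports coefficients only. If p-values are wanted for individual pairs, base R's `cor.test()` with `method = "spearman"` is one option. The sketch below runs it on two short made-up vectors (`x_demo`, `y_demo`), not on the survey data. Setting `exact = FALSE` requests the asymptotic p-value, which avoids the warning R raises when ties are present, as they would be with Likert-type responses.

```r
# Made-up illustrative data (not survey responses)
x_demo <- 1:10
y_demo <- c(2, 1, 4, 3, 6, 5, 8, 7, 10, 9)

# Spearman rank correlation test with an asymptotic p-value
sp_test <- cor.test(x_demo, y_demo, method = "spearman", exact = FALSE)

# Extract the coefficient and p-value as plain numbers
rho_hat <- unname(sp_test$estimate)
p_val   <- sp_test$p.value

cat(sprintf("rho = %.3f, p = %.4f\n", rho_hat, p_val))
```

On the survey data, the same call would take two of the encoded columns, such as `df$resolution_num` and `df$overall_rating`. With several pairs tested, a multiple-comparison adjustment (for example via `p.adjust()`) would be a sensible addition.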
# Prepare regression dataset
df_reg <- df |>
mutate(
branch_ref = relevel(factor(branch), ref = "Lagos Island"),
purpose_ref = relevel(factor(visit_purpose), ref = "Routine maintenance or scheduled service"),
booking_ref = relevel(factor(booking_method), ref = "Walk-in")
)
# Linear regression model
model <- lm(
overall_rating ~ resolution_num + competence_num + service_duration_num +
cost_num + branch_ref + purpose_ref + booking_ref,
data = df_reg
)
summary(model)
Call:
lm(formula = overall_rating ~ resolution_num + competence_num +
service_duration_num + cost_num + branch_ref + purpose_ref +
booking_ref, data = df_reg)
Residuals:
Min 1Q Median 3Q Max
-1.34489 -0.25605 0.02468 0.28958 1.31061
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.87451 0.39303 4.769 6.74e-06 ***
resolution_num 0.52868 0.13058 4.049 0.000106 ***
competence_num 0.40441 0.10056 4.021 0.000117 ***
service_duration_num -0.19533 0.06479 -3.015 0.003305 **
cost_num 0.01679 0.06465 0.260 0.795644
branch_refIkeja 0.18477 0.14971 1.234 0.220217
branch_refIkoyi 0.13033 0.15553 0.838 0.404158
branch_refLekki 0.32329 0.14913 2.168 0.032700 *
purpose_refElectrical or AC service 0.16012 0.15507 1.033 0.304474
purpose_refFault diagnosis or repair 0.22112 0.14772 1.497 0.137756
purpose_refTyre or brake service 0.12485 0.15260 0.818 0.415324
booking_refCorporate/fleet agreement 0.07256 0.14179 0.512 0.610019
booking_refOnline-booking -0.15915 0.20558 -0.774 0.440786
booking_refPhone-booking -0.34355 0.13447 -2.555 0.012231 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.5205 on 94 degrees of freedom
Multiple R-squared: 0.6413, Adjusted R-squared: 0.5917
F-statistic: 12.93 on 13 and 94 DF, p-value: 8.616e-16
# Tidy coefficient table
library(broom)
tidy(model, conf.int = TRUE) |>
  mutate(across(where(is.numeric), \(x) round(x, 3))) |>
  kable(caption = "Table 5: Linear Regression Coefficient Table",
        col.names = c("Term","Estimate","Std Error","t-value","p-value","CI Lower","CI Upper")) |>
  kable_styling(bootstrap_options = c("striped","hover"), full_width = FALSE)

| Term | Estimate | Std Error | t-value | p-value | CI Lower | CI Upper |
|---|---|---|---|---|---|---|
| (Intercept) | 1.875 | 0.393 | 4.769 | 0.000 | 1.094 | 2.655 |
| resolution_num | 0.529 | 0.131 | 4.049 | 0.000 | 0.269 | 0.788 |
| competence_num | 0.404 | 0.101 | 4.021 | 0.000 | 0.205 | 0.604 |
| service_duration_num | -0.195 | 0.065 | -3.015 | 0.003 | -0.324 | -0.067 |
| cost_num | 0.017 | 0.065 | 0.260 | 0.796 | -0.112 | 0.145 |
| branch_refIkeja | 0.185 | 0.150 | 1.234 | 0.220 | -0.112 | 0.482 |
| branch_refIkoyi | 0.130 | 0.156 | 0.838 | 0.404 | -0.178 | 0.439 |
| branch_refLekki | 0.323 | 0.149 | 2.168 | 0.033 | 0.027 | 0.619 |
| purpose_refElectrical or AC service | 0.160 | 0.155 | 1.033 | 0.304 | -0.148 | 0.468 |
| purpose_refFault diagnosis or repair | 0.221 | 0.148 | 1.497 | 0.138 | -0.072 | 0.514 |
| purpose_refTyre or brake service | 0.125 | 0.153 | 0.818 | 0.415 | -0.178 | 0.428 |
| booking_refCorporate/fleet agreement | 0.073 | 0.142 | 0.512 | 0.610 | -0.209 | 0.354 |
| booking_refOnline-booking | -0.159 | 0.206 | -0.774 | 0.441 | -0.567 | 0.249 |
| booking_refPhone-booking | -0.344 | 0.134 | -2.555 | 0.012 | -0.611 | -0.077 |
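
The confidence limits in the coefficient table can be checked by hand from the estimate, its standard error, and the residual degrees of freedom. The sketch below does this for `resolution_num`, using values copied from the regression output above; the variable names `estimate`, `std_error`, and `df_resid` are illustrative only.

```r
# Values copied from the regression output above
estimate  <- 0.52868   # resolution_num coefficient
std_error <- 0.13058   # its standard error
df_resid  <- 94        # residual degrees of freedom

# Two-sided 95% critical value from the t distribution
t_crit <- qt(0.975, df = df_resid)

# Estimate plus/minus the margin of error
ci <- estimate + c(-1, 1) * t_crit * std_error
round(ci, 3)
```

The result should agree with the CI Lower and CI Upper columns for `resolution_num` (0.269 and 0.788). Because the interval excludes zero, it is consistent with the coefficient's p-value falling below 0.05.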
# Model fit
glance(model) |>
  select(r.squared, adj.r.squared, sigma, statistic, p.value, df, nobs) |>
  mutate(across(where(is.numeric), \(x) round(x, 3))) |>
  kable(caption = "Table 6: Model Fit Statistics") |>
  kable_styling(bootstrap_options = c("striped","hover"), full_width = FALSE)

| r.squared | adj.r.squared | sigma | statistic | p.value | df | nobs |
|---|---|---|---|---|---|---|
| 0.641 | 0.592 | 0.52 | 12.928 | 0 | 13 | 108 |
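
As a cross-check on the fit statistics, the Adjusted R² and the overall F-statistic can both be recovered from the Multiple R², the number of observations, and the number of estimated slope terms. The sketch below uses the rounded values reported in the summary output, so small rounding differences from the table are expected.

```r
r2 <- 0.6413  # Multiple R-squared from the summary output
n  <- 108     # number of observations
p  <- 13      # estimated slope terms (excluding the intercept)

# Adjusted R-squared penalises R-squared for the number of terms
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Overall F-statistic: explained variance per term over residual variance
f_stat <- (r2 / p) / ((1 - r2) / (n - p - 1))

# Upper-tail p-value on (p, n - p - 1) degrees of freedom
p_value <- pf(f_stat, df1 = p, df2 = n - p - 1, lower.tail = FALSE)

round(c(adj_r2 = adj_r2, f_stat = f_stat), 3)
p_value
```

The p-value appears as 0 in Table 6 only because it is rounded to three decimals; the unrounded value in the summary output is 8.616e-16.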
# Diagnostic plots
par(mfrow = c(2, 2))
plot(model, which = 1:4)
par(mfrow = c(1, 1))

Regression Interpretation:
The linear regression model predicts overall satisfaction rating (1–5) from resolution status, staff competence, service duration, cost perception, branch, visit purpose, and booking method. The model is statistically significant overall (F(13, 94) = 12.93, p < 0.001), with an R² of 0.641 and Adjusted R² of 0.592 — meaning the model explains 64.1% of total variance in customer satisfaction ratings, a strong result given the ceiling effect in the data.
Key coefficient interpretations for a non-technical manager:
Resolution Status (β = 0.529, p < 0.001): Each step improvement in fault resolution — from “had to return for the same issue” to “mostly resolved” to “fully resolved” — is associated with an average 0.53-point increase in overall rating, holding all other factors constant. This is the single largest driver of satisfaction in the model. Business action: Mobility should implement a mandatory pre-release diagnostic check at every service, ensuring technicians confirm fault clearance before returning the vehicle to the customer. This is especially urgent at Ikeja and for electrical/AC services, where resolution rates are lowest.
Staff Competence (β = 0.404, p < 0.001): Each step up in perceived staff competence (e.g., from Fair to Good, or Good to Excellent) adds approximately 0.40 rating points. This is the second most powerful predictor. Business action: Branch managers should invest in structured technical training and monthly competency assessments. Given the strong link between competence and resolution, improving staff capability simultaneously addresses both top predictors.
Service Duration (β = −0.195, p = 0.003): Each additional duration band (e.g., moving from “2–4 hours” to “4–6 hours”) reduces the predicted rating by approximately 0.20 points. Business action: While customers tolerate longer waits when faults are resolved, unnecessary delays erode satisfaction. Mobility should review job-scheduling efficiency and ensure service advisors proactively communicate delays to customers.
Lekki Branch (β = 0.323, p = 0.033): Lekki customers rate their experience 0.32 points higher than Lagos Island customers (the reference category), after controlling for all other factors. This is the only branch coefficient that reaches statistical significance. Business action: Lekki’s operational practices — whether in staff responsiveness, facility quality, or communication — are worth investigating and replicating across branches.
Phone-booking (β = −0.344, p = 0.012): Customers who booked by phone rate their experience 0.34 points lower than walk-in customers, after controlling for service outcomes. Business action: This warrants review of the phone-booking experience — whether expectations set during the booking call are being met on arrival, and whether phone-booked customers are receiving equitable prioritisation.
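Cost Perception (β = 0.017, p = 0.796): Cost perception has no statistically significant association with overall rating, and its 95% confidence interval (−0.112 to 0.145) spans zero. In this sample there is no evidence that customers penalise Mobility for higher-than-expected costs once fault resolution and staff competence are accounted for. Business action: This suggests Mobility has some pricing headroom: service pricing does not appear to be a primary satisfaction risk, and moderate price adjustments are unlikely to materially affect customer experience scores, though a non-significant result from 108 responses cannot rule out a small effect.
Model fit: An R² of 0.641 is strong for post-service survey data with a pronounced ceiling effect. The Adjusted R² of 0.592, which penalises for the 13 estimated coefficients (from seven predictors: four numeric and three categorical), remains close to the unadjusted value. This suggests the predictors collectively carry genuine explanatory power and that the fit is not merely an artefact of model size, although it does not by itself rule out overfitting.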
Diagnostic plots: The Residuals vs Fitted plot should show no systematic curvature; the Q-Q plot confirms whether residuals are approximately normally distributed; Scale-Location tests for heteroscedasticity; and Cook’s Distance identifies any individual responses that disproportionately influence the model estimates. Any Cook’s D values above 1.0 should be investigated as potential outliers requiring sensitivity analysis.
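The Cook's Distance check described above can also be run numerically rather than read off the plot. The following is a minimal sketch on simulated data (not the survey responses); `dat`, `fit`, and the 4/n screening rule are illustrative assumptions, with 4/n being a common, stricter rule of thumb than the 1.0 cut-off.

```r
# Simulated data for illustration only (NOT the survey responses)
set.seed(1)
n <- 108
dat <- data.frame(x = rnorm(n))
dat$y <- 4 + 0.5 * dat$x + rnorm(n, sd = 0.5)
fit <- lm(y ~ x, data = dat)

# Cook's distance for every observation
cooks_d <- cooks.distance(fit)

# Observations above the 1.0 cut-off mentioned above
influential <- which(cooks_d > 1)

# Stricter screening rule of thumb: 4 / n
screen <- which(cooks_d > 4 / n)

# Sensitivity check: refit without screened rows and compare coefficients
if (length(screen) > 0) {
  fit_trim <- lm(y ~ x, data = dat[-screen, ])
  print(round(cbind(full = coef(fit), trimmed = coef(fit_trim)), 3))
}
```

If the coefficients change little after removing the flagged rows, the conclusions are unlikely to hinge on a few unusual responses.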
The five analyses converge on a single, coherent recommendation for Mobility’s service operations:
First-visit fault resolution is the master lever of customer satisfaction. EDA showed that 7.4% of customers had to return for the same issue, and Ikeja’s full-resolution rate is only 63% — the lowest across all branches. Correlation analysis confirmed resolution status as the strongest predictor of overall rating. The regression model quantified the effect precisely: each step improvement in resolution adds 0.53 rating points (p < 0.001), the largest single coefficient in the model.
Staff technical competence is the operational enabler of resolution. Regression confirms that competence adds 0.40 rating points per step (p < 0.001) — the second most powerful predictor. Electrical and AC services, which record the lowest resolution rate (57.9%), likely suffer most from competence gaps. Visualisations (Figures 2–3) reinforce that branches with higher competence ratings also record higher resolution rates and overall scores.
Branch differences in overall ratings are directional but not statistically significant. The ANOVA (F = 0.761, p = 0.518) and Tukey post-hoc tests confirm that observed mean differences across branches (Lagos Island: 4.60 vs. Ikeja: 4.30) cannot be distinguished from sampling variation at current sample sizes. However, the regression isolates Lekki as a statistically significant positive outlier (β = 0.323, p = 0.033) after controlling for service outcomes, and resolution rates vary substantially — Lagos Island at 92.5% versus Ikeja at 63.0% — providing operational grounds for targeted intervention regardless of the aggregate ANOVA result.
Booking channel matters, but only for phone-bookers. The regression reveals that phone-booked customers rate 0.34 points lower than walk-in customers (p = 0.012), even after controlling for resolution and competence. The t-test found no overall walk-in vs. pre-booked difference (p = 0.170), but the regression’s finer breakdown isolates phone-booking specifically as a satisfaction risk — likely due to expectation misalignment set during the booking call.
Single recommendation: Mobility should implement a Resolution Quality Assurance Protocol (a mandatory pre-release vehicle checklist completed by the attending technician and countersigned by the service advisor), prioritised immediately at Ikeja (63% resolution rate) and for electrical/AC service bays (57.9% resolution rate). This directly addresses the two strongest predictors of satisfaction (resolution: β = 0.529; competence: β = 0.404); the one significant booking-channel gap (phone-booking: β = −0.344) should be reviewed alongside it. From a Finance Planning perspective, this protocol requires minimal capital outlay but directly protects the customer retention and referral revenue streams that underpin Mobility’s recurring service income.
Short collection window (6 days): The survey was administered over a single week in May 2026. Seasonal variation in service volumes (e.g., end-of-year fleet servicing, rainy-season breakdowns) may affect satisfaction patterns. A longitudinal survey across 3–6 months would improve representativeness.
Self-selection bias: Customers who respond to a QR code or SMS survey may systematically differ from non-respondents — potentially overrepresenting highly satisfied or highly dissatisfied customers (the classic “two-tailed response” bias in post-service surveys).
Ceiling effect: 63% of ratings are 5/5, compressing the variation that regression and correlation analyses can detect. Future surveys could use a 10-point scale or Net Promoter Score (NPS) to increase discriminating power.
No financial linkage: The current dataset contains no revenue, spend, or visit-frequency data. Linking satisfaction scores to customer lifetime value — a natural next step for Finance Planning — would require integration with Mobility’s CRM or invoicing systems.
Further work: With access to historical booking and job-card data, CS 2 techniques (customer segmentation via clustering, churn prediction via classification) would provide a more powerful analytical toolkit for strategic branch investment decisions.
Adi, B. (2026). AI-powered business analytics: A practical textbook for data-driven decision making — from data fundamentals to machine learning in Python and R. Lagos Business School / markanalytics.online. https://markanalytics.online
[Author Name]. (2026). Mobility Service Centre Customer Satisfaction Survey [Dataset]. Collected from Mobility Lagos branches, Lagos, Nigeria. Data available on request from the author.
R Core Team. (2024). R: A language and environment for statistical computing (Version 4.x). R Foundation for Statistical Computing. https://www.R-project.org/
Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L., François, R., Grolemund, G., Hayes, A., Henry, L., Hester, J., Kuhn, M., Pedersen, T. L., Miller, E., Bache, S. M., Müller, K., Ooms, J., Robinson, D., Seidel, D. P., Spinu, V., … Yutani, H. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686
Wickham, H. (2016). ggplot2: Elegant graphics for data analysis. Springer. https://doi.org/10.1007/978-3-319-24277-4
Allaire, J. J., Teague, C., Scheidegger, C., Xie, Y., & Dervieux, C. (2022). Quarto (Version 1.x) [Computer software]. https://doi.org/10.5281/zenodo.5960048
Claude (Anthropic) was used to assist with the structure and initial drafting of R code chunks for data loading, cleaning, ordinal encoding, visualisation, hypothesis testing, correlation analysis, and regression modelling. All analytical decisions — including the choice of Spearman over Pearson correlation, the selection of Welch’s t-test for unequal variances, the reference category selection in regression, and the business interpretations of all outputs — were made independently by the author based on course materials and professional judgement. The executive summary, professional disclosure, data provenance narrative, integrated findings, and limitations sections were written entirely by the author. AI-generated code was reviewed, tested, and modified where outputs did not match the data structure or analytical intent.