Dalos Cuisine Relaunch Feasibility Study: An Exploratory & Inferential Analysis of Consumer Preferences in Lagos

Author

Nwodo Ezinne

Published

May 26, 2026

1. Executive Summary

This study analyses consumer demand and preference data collected to assess the feasibility of relaunching Dalos Cuisine, a traditional Nigerian restaurant that previously operated in Aaron’s Mall, Lekki Phase 1, Lagos, in 2020. A structured survey administered between March and April 2026 drew 117 responses, of which the 100 from confirmed Lagos residents were retained, producing a dataset of 76 variables covering demographics, dining behaviours, service-quality expectations, spending patterns, and relaunch sentiment.

Five analytical techniques were applied: (1) Exploratory Data Analysis revealed that food taste, hygiene, and consistency are the dominant selection criteria, while inconsistent quality and poor hygiene lead all dissatisfaction drivers; (2) Data Visualisation linked spending power, visit frequency, and channel preference in a five-plot narrative; (3) Hypothesis Testing found that spend per meal differs significantly across education groups (Kruskal–Wallis p = 0.009, moderate effect) but found no significant difference in premium-price willingness by employment status (simulated chi-squared p ≈ 0.55); (4) Correlation Analysis showed that core food-quality attributes are moderately inter-correlated (Spearman ρ ≈ 0.58–0.61), and that delivery importance co-moves with online-ordering importance (ρ ≈ 0.70); and (5) Logistic Regression modelled high relaunch patronage intent, with food taste importance, visit frequency, and spend band carrying the largest positive odds ratios, although no predictor reached significance at α = 0.05 on the 70-observation training set (AUC > 0.70).

Key recommendation: Dalos Cuisine should relaunch with an uncompromising quality-first proposition, a day-one delivery channel, and pricing in the ₦3,000–₦8,000 range, targeting employed professionals in the Victoria Island / Lekki / Ikoyi corridor.

2. Professional Disclosure

Job Title: Marketing Communications Lead
Organisation: Knowledge Exchange Centre
Location: Lagos, Nigeria

Why each technique is operationally relevant:

EDA: With 76 survey variables spanning Likert scales, categorical fields, and free text, rigorous EDA is essential to surface data quality issues before any inference is drawn. Undetected outliers or encoding errors in a feasibility study directly mislead investment decisions.
Data Visualisation: The end audience for this study is a business owner and potential investors — non-technical stakeholders who require charts, not tables. Visualisation translates multi-dimensional preference data into a single, actionable story about who the customer is and what they want.
Hypothesis Testing: Pricing strategy and segment targeting require statistical evidence, not intuition. Formal tests with stated α levels convert descriptive observations (“postgraduates seem to spend more”) into defensible business decisions (“postgraduates spend significantly more; p < 0.05”).
Correlation Analysis: Understanding attribute co-movement helps design a coherent service proposition and prevents redundant investment. It also flags multicollinearity before regression modelling.
Logistic Regression: The business question is ultimately binary — will this person come? Logistic regression quantifies each attribute’s contribution to that probability and produces an odds ratio that management can translate into a concrete action.

3. Data Collection & Sampling

Source: Primary data collected by the author (Nwodo Ezinne) via a structured Google Form survey.
Collection method: Self-administered online questionnaire distributed through WhatsApp, LinkedIn, and direct outreach within the researcher’s professional and social network in Lagos.
Target population: Adults residing in Lagos State who eat traditional Nigerian food outside the home at least occasionally.
Sampling frame: Non-probability convenience/snowball sample targeting respondents across Lagos Mainland, Lekki, Victoria Island, Ajah, and Ikoyi.
Sample size: 117 total responses; 100 confirmed Lagos residents retained after excluding 17 non-Lagos respondents.
Time period: 24 March 2026 – 30 April 2026 (just over five weeks).
Ethical notes: No personally identifiable information was collected. Participation was voluntary; consent was implied by form submission.

4. Data Description

Code

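# Machine-specific working directory; adjust the path (or use a project-relative one) before running
setwd("C:/Users/zinny/OneDrive/Desktop/DA EXAM/DA Exam")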

library(tidyverse); library(readxl);  library(janitor)
library(skimr);     library(corrplot); library(ggcorrplot)
library(scales);    library(knitr);    library(kableExtra)
library(broom);     library(pROC);     library(car)
library(rstatix);   library(ggpubr);   library(viridis)
library(patchwork); library(effectsize)

raw <- read_excel("Dalos Data.xlsx")
colnames(raw) <- paste0("c", seq_len(ncol(raw)))

df_raw <- raw |> filter(str_trim(as.character(c2)) == "Yes")
cat("Lagos-resident respondents retained:", nrow(df_raw), "\n")

Lagos-resident respondents retained: 100

Code

cat("Total variables:", ncol(df_raw), "\n")

Total variables: 76

Code

df <- df_raw |>
  rename(
    timestamp=c1, area_live=c3, area_work=c4, gender=c5,
    education=c6, employment=c7, marital_status=c9, household_size=c10,
    visit_frequency=c11, fav_soups=c12, dining_channel=c15, spend_raw=c16,
    imp_taste=c17, imp_freshness=c18, imp_consistency=c19, imp_portions=c20,
    imp_hygiene=c21, imp_ambience=c22, imp_location=c23, imp_parking=c24,
    imp_speed=c25, imp_staff=c26, imp_delivery=c27, imp_online_ord=c28,
    imp_pricing=c29, imp_variety=c30, imp_takeaway=c31, imp_authentic=c32,
    dissatisfaction=c33, premium_willing=c34, pref_setting=c35,
    aware_dalos=c46, overall_exp=c47, food_quality_exp=c48, relaunch_intent=c50
  )

encode_likert <- function(x) {
  x <- str_to_lower(str_trim(iconv(as.character(x), to="ASCII//TRANSLIT")))
  dplyr::case_when(
    str_detect(x,"not important") ~ 1L, str_detect(x,"slightly")   ~ 2L,
    str_detect(x,"moderately")    ~ 3L, str_detect(x,"^important") ~ 4L,
    str_detect(x,"extremely")     ~ 5L, TRUE ~ NA_integer_
  )
}

likert_cols <- c("imp_taste","imp_freshness","imp_consistency","imp_portions",
                 "imp_hygiene","imp_ambience","imp_location","imp_parking",
                 "imp_speed","imp_staff","imp_delivery","imp_online_ord",
                 "imp_pricing","imp_variety","imp_takeaway","imp_authentic")

df <- df |> mutate(across(all_of(likert_cols), encode_likert)) |>
  mutate(
    spend_num = dplyr::case_when(
      str_detect(as.character(spend_raw),"(?i)below|elow")        ~ 1L,
      str_detect(as.character(spend_raw),"1.?500|1500")           ~ 2L,
      str_detect(as.character(spend_raw),"3.?001|3001")           ~ 3L,
      str_detect(as.character(spend_raw),"5.?001|5001")           ~ 4L,
      str_detect(as.character(spend_raw),"(?i)above|bove|8.?000") ~ 5L,
      TRUE ~ NA_integer_
    ),
    spend_label = factor(dplyr::case_when(
      spend_num==1L~"Below N1,500", spend_num==2L~"N1,500-3,000",
      spend_num==3L~"N3,001-5,000", spend_num==4L~"N5,001-8,000",
      spend_num==5L~"Above N8,000", TRUE~NA_character_),
      levels=c("Below N1,500","N1,500-3,000","N3,001-5,000","N5,001-8,000","Above N8,000")),
    visit_num = dplyr::case_when(
      str_detect(as.character(visit_frequency),"(?i)less")              ~ 1L,
      str_detect(as.character(visit_frequency),"(?i)month")             ~ 2L,
      str_detect(as.character(visit_frequency),"(?i)1.2.*week|1.*2.*week") ~ 3L,
      str_detect(as.character(visit_frequency),"(?i)3.4|3.*4")          ~ 4L,
      str_detect(as.character(visit_frequency),"(?i)daily")             ~ 5L,
      TRUE ~ NA_integer_
    ),
    freq_label = factor(dplyr::case_when(
      visit_num==1L~"< Once/month", visit_num==2L~"1-2x/month",
      visit_num==3L~"1-2x/week",   visit_num==4L~"3-4x/week",
      visit_num==5L~"Daily",        TRUE~NA_character_),
      levels=c("< Once/month","1-2x/month","1-2x/week","3-4x/week","Daily")),
    edu_group = factor(dplyr::case_when(
      str_detect(as.character(education),"(?i)secondary|waec|neco|ond") ~ "Secondary/OND",
      str_detect(as.character(education),"(?i)hnd")                     ~ "HND",
      str_detect(as.character(education),"(?i)bachelor|b\\.sc|b\\.a")   ~ "Bachelor's",
      str_detect(as.character(education),"(?i)postgrad|mba|m\\.sc|ph")  ~ "Postgraduate",
      str_detect(as.character(education),"(?i)professional|cert")       ~ "Professional Cert",
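      TRUE~"Other"),
      # "Other" is listed as a level so unmatched responses are kept rather than silently becoming NA
      levels=c("Secondary/OND","HND","Bachelor's","Postgraduate","Professional Cert","Other")),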
    emp_group = dplyr::case_when(
      str_detect(as.character(employment),"(?i)private")           ~ "Private sector",
      str_detect(as.character(employment),"(?i)self|business")     ~ "Self-employed",
      str_detect(as.character(employment),"(?i)student")           ~ "Student",
      str_detect(as.character(employment),"(?i)unemploy")          ~ "Unemployed",
      str_detect(as.character(employment),"(?i)public|gov|church") ~ "Public/Other",
      TRUE ~ "Other"
    ),
    intent_binary = if_else(
      str_detect(as.character(relaunch_intent),"(?i)very likely|extremely likely"),1L,0L),
    premium_binary = if_else(
      str_detect(as.character(premium_willing),"(?i)definitely yes|probably yes"),1L,0L),
    spend_num = if_else(is.na(spend_num),as.integer(median(spend_num,na.rm=TRUE)),spend_num),
    visit_num = if_else(is.na(visit_num),as.integer(median(visit_num,na.rm=TRUE)),visit_num)
  )

cat("Clean dataset:", nrow(df), "rows\n")

Clean dataset: 100 rows

Code

cat("\nSpend:\n");  print(table(df$spend_label,  useNA="ifany"))


Spend:


Below N1,500 N1,500-3,000 N3,001-5,000 N5,001-8,000 Above N8,000 
           5           30           32           21           12

Code

cat("\nIntent:\n"); print(table(df$intent_binary, useNA="ifany"))


Intent:


 0  1 
52 48
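
As a rough gauge of the uncertainty around the 48% high-intent share, a score interval can be computed from the counts above. This is a minimal sketch only: it treats the 100 respondents as a simple random sample, which the convenience/snowball design does not guarantee, so the interval is best read as an optimistic estimate of the real uncertainty. The variable names are illustrative and not part of the original analysis.

```r
# Counts taken from the intent table above: 48 of 100 respondents coded 1
high_intent <- 48
n_resp      <- 100

# prop.test() returns a score-based interval (continuity-corrected by default)
ci <- prop.test(high_intent, n_resp)$conf.int

cat(sprintf("High-intent share: %.0f%% (95%% CI %.1f%% to %.1f%%)\n",
            100 * high_intent / n_resp, 100 * ci[1], 100 * ci[2]))
```

An interval this wide is worth keeping in mind whenever the 48% figure is used to size the potential customer base.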

Code

tibble(
  `#`=1:11,
  Variable=c("gender","education / edu_group","employment / emp_group",
             "household_size","visit_frequency / visit_num","dining_channel",
             "spend_raw / spend_num","imp_taste … imp_authentic (16 cols)",
             "premium_willing / premium_binary","relaunch_intent / intent_binary","aware_dalos"),
  Type=c("Categorical","Categorical / Grouped","Categorical / Grouped",
         "Ordinal text","Ordinal text / Numeric 1-5","Categorical",
         "Ordinal text / Numeric 1-5","Likert text / Numeric 1-5",
         "Categorical / Binary 0-1","Ordinal text / Binary 0-1","Categorical"),
  Role=c("Demographic","Demographic / Predictor","Demographic / Predictor",
         "Contextual","Predictor","Predictor","Outcome + Predictor",
         "Predictors","Outcome","Primary Outcome","Descriptor")
) |> kable(caption="Variable inventory") |>
  kable_styling(bootstrap_options=c("striped","hover"))

Variable inventory
#	Variable	Type	Role
1	gender	Categorical	Demographic
2	education / edu_group	Categorical / Grouped	Demographic / Predictor
3	employment / emp_group	Categorical / Grouped	Demographic / Predictor
4	household_size	Ordinal text	Contextual
5	visit_frequency / visit_num	Ordinal text / Numeric 1-5	Predictor
6	dining_channel	Categorical	Predictor
7	spend_raw / spend_num	Ordinal text / Numeric 1-5	Outcome + Predictor
8	imp_taste … imp_authentic (16 cols)	Likert text / Numeric 1-5	Predictors
9	premium_willing / premium_binary	Categorical / Binary 0-1	Outcome
10	relaunch_intent / intent_binary	Ordinal text / Binary 0-1	Primary Outcome
11	aware_dalos	Categorical	Descriptor

Code

df |>
  select(spend_num,visit_num,imp_taste,imp_freshness,
         imp_consistency,imp_hygiene,imp_pricing,imp_delivery) |>
  skim() |> as_tibble() |>
  select(skim_variable,n_missing,numeric.mean,numeric.sd,
         numeric.p25,numeric.p50,numeric.p75) |>
  kable(digits=2, caption="Summary statistics — key numeric variables") |>
  kable_styling(bootstrap_options=c("striped","hover"))

Summary statistics — key numeric variables
skim_variable	numeric.mean	numeric.sd	numeric.p25	numeric.p50	numeric.p75
spend_num	3.05	1.10	2	3	4.00
visit_num	2.84	1.00	2	3	3.25
imp_taste	4.59	0.77	4	5	5.00
imp_freshness	4.60	0.71	4	5	5.00
imp_consistency	4.61	0.68	4	5	5.00
imp_hygiene	4.69	0.66	5	5	5.00
imp_pricing	4.43	0.84	4	5	5.00
imp_delivery	3.51	1.11	3	4	4.00

5. Technique 1 — Exploratory Data Analysis

5.1 Theory Recap

Exploratory Data Analysis (Tukey, 1977) interrogates data through numerical summaries and graphics before modelling. It identifies missing values, outliers, distributional skewness, and structural patterns. Anscombe’s Quartet (1973) demonstrated that datasets with identical summary statistics can differ radically in shape — making visual EDA non-negotiable.

5.2 Business Justification

Before advising on the relaunch, we must know who responded, whether the data are clean, and whether any anomalies could distort downstream conclusions. A missed outlier in a feasibility study can misrepresent willingness-to-pay and lead to a mispriced launch.

5.3 Analysis

Code

miss_check <- df |>
  select(spend_num,visit_num,all_of(likert_cols)) |>
  summarise(across(everything(),~sum(is.na(.)))) |>
  pivot_longer(everything(),names_to="Variable",values_to="N_Missing") |>
  filter(N_Missing > 0)

if(nrow(miss_check)>0){
  miss_check |> kable(caption="Missing value count per variable") |>
    kable_styling(bootstrap_options="striped",full_width=FALSE)
} else {
  cat("No missing values remain after median imputation.\n")
}

No missing values remain after median imputation.

Code

df |>
  select(imp_taste,imp_freshness,imp_consistency,
         imp_hygiene,imp_pricing,imp_ambience,imp_delivery) |>
  pivot_longer(everything(),names_to="Attribute",values_to="Score") |>
  mutate(Attribute=str_remove(Attribute,"imp_")|>str_to_title()) |>
  ggplot(aes(x=reorder(Attribute,Score,median),y=Score,fill=Attribute)) +
  geom_boxplot(alpha=0.75,outlier.colour="#C0392B",outlier.shape=16,outlier.size=2.5) +
  scale_fill_viridis_d(option="D") + coord_flip() +
  labs(title="EDA — Service Attribute Importance Score Distributions",
       subtitle="Red dots = statistical outliers (Likert scale 1–5)",
       x=NULL,y="Importance Score") +
  theme_minimal(base_size=13) +
  theme(legend.position="none",plot.title=element_text(face="bold"),
        plot.subtitle=element_text(colour="grey50"))

Code

p_gender <- df |> count(gender) |> filter(!is.na(gender)) |>
  ggplot(aes(x=reorder(as.character(gender),n),y=n,fill=as.character(gender))) +
  geom_col(width=0.6,show.legend=FALSE) +
  geom_text(aes(label=n),hjust=-0.2,size=4) + coord_flip() +
  scale_fill_manual(values=c("Female"="#E07B54","Male"="#4A90D9","Prefer not to say"="#AAAAAA")) +
  labs(title="Gender",x=NULL,y="Count") + theme_minimal(base_size=12) +
  theme(plot.title=element_text(face="bold"))

p_edu <- df |> filter(!is.na(edu_group)) |> count(edu_group) |>
  ggplot(aes(x=reorder(edu_group,n),y=n,fill=edu_group)) +
  geom_col(width=0.6,show.legend=FALSE) +
  geom_text(aes(label=n),hjust=-0.2,size=4) + coord_flip() +
  scale_fill_viridis_d(option="C") +
  labs(title="Education Level",x=NULL,y="Count") + theme_minimal(base_size=12) +
  theme(plot.title=element_text(face="bold"))

(p_gender|p_edu) + plot_annotation(title="Respondent Demographic Profile (n = 100)",
  theme=theme(plot.title=element_text(size=15,face="bold",colour="#1A1A2E")))

Code

p_spend <- df |> filter(!is.na(spend_label)) |> count(spend_label) |>
  ggplot(aes(x=spend_label,y=n,fill=spend_label)) +
  geom_col(show.legend=FALSE) + geom_text(aes(label=n),vjust=-0.4,size=4) +
  scale_fill_brewer(palette="YlOrRd") +
  labs(title="Spend Per Meal",x=NULL,y="Respondents") + theme_minimal(base_size=12) +
  theme(axis.text.x=element_text(angle=30,hjust=1),plot.title=element_text(face="bold"))

p_freq <- df |> filter(!is.na(freq_label)) |> count(freq_label) |>
  ggplot(aes(x=freq_label,y=n,fill=freq_label)) +
  geom_col(show.legend=FALSE) + geom_text(aes(label=n),vjust=-0.4,size=4) +
  scale_fill_brewer(palette="Blues") +
  labs(title="Visit Frequency",x=NULL,y="Respondents") + theme_minimal(base_size=12) +
  theme(axis.text.x=element_text(angle=30,hjust=1),plot.title=element_text(face="bold"))

(p_spend|p_freq) + plot_annotation(title="Spending & Dining Frequency Distributions",
  theme=theme(plot.title=element_text(size=15,face="bold",colour="#1A1A2E")))

5.4 Interpretation for Management

Data quality issue 1 — Encoding: A portion of responses contained mojibake currency symbols (e.g., â‚¦ instead of ₦). These were handled by matching on digit patterns rather than exact string matching, recovering all observations without imputation.

Data quality issue 2 — Low-engagement outlier: One respondent rated every attribute as “Slightly important” (score = 2). This record was retained as it represents a legitimate low-engagement market segment.

The demographic profile is dominated by bachelor’s-educated (59%) private-sector employees (64%). Spending clusters in the ₦3,001–₦5,000 band (32%), with 33% spending above ₦5,000 (21% in the ₦5,001–₦8,000 band and 12% above ₦8,000). Visit frequency peaks at 1–2 times per week (43%), confirming a core of habitual traditional-food diners.
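
The digit-pattern matching described under data quality issue 1 can be illustrated with a small base-R sketch. The response strings below are hypothetical stand-ins for the survey text, and `band_from_digits()` is an illustrative helper, not the function used in the analysis (which relies on `case_when()` and `str_detect()`).

```r
# Hypothetical raw responses. "\u20a6" is the naira sign;
# "\u00e2\u201a\u00a6" is the three-character mojibake it turns into.
spend_raw <- c("Below \u20a61,500",
               "\u00e2\u201a\u00a61,500 - \u00e2\u201a\u00a63,000",
               "\u20a63,001 - \u20a65,000",
               "\u00e2\u201a\u00a65,001 - \u00e2\u201a\u00a68,000",
               "Above \u20a68,000",
               "no answer")

# Match on words and digit patterns only, so the currency symbol never matters.
# Rules are checked in order, mirroring case_when(): the first match wins.
band_from_digits <- function(x) {
  ifelse(grepl("below",  x, ignore.case = TRUE), 1L,
  ifelse(grepl("1.?500", x), 2L,
  ifelse(grepl("3.?001", x), 3L,
  ifelse(grepl("5.?001", x), 4L,
  ifelse(grepl("above",  x, ignore.case = TRUE), 5L, NA_integer_)))))
}

bands <- band_from_digits(spend_raw)
print(data.frame(spend_raw, bands))
```

Clean and garbled versions of the same band map to the same code, and a string with no recognisable pattern falls through to `NA` instead of being forced into a band.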

6. Technique 2 — Data Visualisation

6.1 Theory Recap

The grammar of graphics (Wilkinson, 2005; ggplot2, Wickham, 2016) builds a chart by mapping data variables to the aesthetic attributes (position, colour, size) of geometric objects. Effective storytelling selects chart types matched to variable types and the intended message.

6.2 Business Justification

The restaurant owner and potential investors are non-technical stakeholders. A cohesive five-plot narrative communicates the business case far more effectively than summary tables alone.

6.3 Visualisation Narrative

Code

df |> filter(!is.na(dining_channel)) |>
  mutate(channel=str_wrap(as.character(dining_channel),32)) |>
  count(channel,sort=TRUE) |>
  ggplot(aes(x=reorder(channel,n),y=n,fill=n)) +
  geom_col(show.legend=FALSE) + geom_text(aes(label=n),hjust=-0.2,size=4) +
  coord_flip() + scale_fill_gradient(low="#FDDBC7",high="#B2182B") +
  scale_y_continuous(expand=expansion(mult=c(0,0.15))) +
  labs(title="Plot 1 — Where Respondents Currently Buy Traditional Nigerian Food",
       subtitle="Sit-down restaurants lead; online delivery is a strong second channel",
       x=NULL,y="Number of Respondents") + theme_minimal(base_size=13) +
  theme(plot.title=element_text(face="bold"),plot.subtitle=element_text(colour="grey50"))

Code

df |> select(all_of(likert_cols)) |>
  summarise(across(everything(),~mean(.x,na.rm=TRUE))) |>
  pivot_longer(everything(),names_to="Attribute",values_to="Mean") |>
  mutate(Attribute=str_remove(Attribute,"imp_")|>str_replace_all("_"," ")|>str_to_title()) |>
  ggplot(aes(x=reorder(Attribute,Mean),y=Mean,fill=Mean)) +
  geom_col(show.legend=FALSE) + geom_text(aes(label=round(Mean,2)),hjust=-0.1,size=3.5) +
  coord_flip() + scale_fill_gradient(low="#DEEBF7",high="#08519C") +
  scale_y_continuous(limits=c(0,5.6)) +
  labs(title="Plot 2 — Mean Importance of Restaurant Selection Attributes",
       subtitle="Food taste, hygiene, and consistency are the non-negotiables",
       x=NULL,y="Mean Score (1=Not Important, 5=Extremely Important)") +
  theme_minimal(base_size=13) +
  theme(plot.title=element_text(face="bold"),plot.subtitle=element_text(colour="grey50"))

Code

df |> filter(!is.na(dissatisfaction)) |>
  mutate(reasons=str_split(as.character(dissatisfaction),",")) |>
  unnest(reasons) |> mutate(reasons=str_trim(reasons)) |>
  filter(nchar(reasons)>2) |> count(reasons,sort=TRUE) |>
  slice_head(n=10) |> mutate(reasons=str_wrap(reasons,36)) |>
  ggplot(aes(x=reorder(reasons,n),y=n,fill=n)) +
  geom_col(show.legend=FALSE) + geom_text(aes(label=n),hjust=-0.2,size=4) +
  coord_flip() + scale_fill_gradient(low="#FEE0D2",high="#CB181D") +
  scale_y_continuous(expand=expansion(mult=c(0,0.15))) +
  labs(title="Plot 3 — Top Dissatisfaction Reasons with Current Nigerian Restaurants",
       subtitle="Inconsistent quality and poor hygiene top the list — Dalos's key opportunity",
       x=NULL,y="Number of Mentions") + theme_minimal(base_size=13) +
  theme(plot.title=element_text(face="bold"),plot.subtitle=element_text(colour="grey50"))

Code

intent_lvls <- c("Very unlikely","Unlikely","Somewhat likely","Very likely","Extremely likely")
df |> filter(!is.na(relaunch_intent),!is.na(spend_label)) |>
  mutate(intent=factor(as.character(relaunch_intent),levels=intent_lvls)) |>
  count(spend_label,intent) |>
  ggplot(aes(x=spend_label,y=n,fill=intent)) + geom_col(position="fill") +
  scale_y_continuous(labels=percent_format()) +
  scale_fill_brewer(palette="RdYlGn",na.value="grey80",drop=FALSE) +
  labs(title="Plot 4 — Relaunch Patronage Intent by Spending Band",
       subtitle="Higher-spending respondents show markedly stronger relaunch intent",
       x="Typical Spend Per Meal",y="Proportion",fill="Intent") +
  theme_minimal(base_size=13) +
  theme(axis.text.x=element_text(angle=25,hjust=1),
        plot.title=element_text(face="bold"),plot.subtitle=element_text(colour="grey50"))

Code

df |> filter(!is.na(premium_willing),!is.na(edu_group)) |>
  mutate(prem=factor(dplyr::case_when(
    str_detect(as.character(premium_willing),"(?i)definitely") ~ "Definitely Yes",
    str_detect(as.character(premium_willing),"(?i)probably yes") ~ "Probably Yes",
    str_detect(as.character(premium_willing),"(?i)unsure")     ~ "Unsure",
    TRUE ~ "No / Probably Not"),
    levels=c("Definitely Yes","Probably Yes","Unsure","No / Probably Not"))) |>
  count(edu_group,prem) |>
  ggplot(aes(x=edu_group,y=n,fill=prem)) + geom_col(position="fill") +
  scale_y_continuous(labels=percent_format()) + scale_fill_brewer(palette="Set2") +
  labs(title="Plot 5 — Premium Price Willingness by Education Level",
       subtitle="Postgraduates and professional cert holders show highest premium acceptance",
       x="Education Group",y="Proportion",fill="Willingness") +
  theme_minimal(base_size=13) +
  theme(axis.text.x=element_text(angle=20,hjust=1),
        plot.title=element_text(face="bold"),plot.subtitle=element_text(colour="grey50"))

6.4 Interpretation for Management

The five plots form a single business case: Lagos diners primarily use sit-down restaurants, with online delivery a strong second channel (Plot 1). They treat taste, hygiene, and consistency as near-mandatory (Plot 2), yet these are precisely the attributes where current restaurants are failing (Plot 3), a gap Dalos can own. Higher-spending customers are more likely to patronise the relaunch (Plot 4), and postgraduate/professional respondents are most willing to pay a premium (Plot 5).

7. Technique 3 — Hypothesis Testing

7.1 Theory Recap

Hypothesis testing (Fisher, 1925; Neyman & Pearson, 1933) provides a formal framework for drawing inferences from sample data. H₀ posits no effect; H₁ posits an effect. We use α = 0.05. Effect sizes (η² and Cramér’s V) are reported alongside p-values.

7.2 Business Justification

Two investment decisions require statistical evidence: (1) Pricing strategy — if education level predicts higher spending, tiered menus are justified; (2) Segment targeting — if premium willingness varies by employment status, marketing budgets should concentrate on specific groups.

7.3 Hypothesis 1 — Spend Per Meal vs. Education Level

H₀: Mean spend score is equal across all education groups.
H₁: At least one education group has a different mean spend score.
Test: One-way ANOVA after Levene’s test. Kruskal–Wallis used if Levene p < 0.05.

Code

df_h1 <- df |> filter(!is.na(spend_num),!is.na(edu_group)) |>
  mutate(edu_group=droplevels(edu_group))
cat("Sample sizes:\n"); print(table(df_h1$edu_group))

Sample sizes:


    Secondary/OND               HND        Bachelor's      Postgraduate 
                5                 8                59                26 
Professional Cert 
                2

Code

lev   <- leveneTest(spend_num ~ edu_group, data=df_h1)
lev_p <- lev$`Pr(>F)`[1]
cat("Levene's Test:\n"); print(lev)

Levene's Test:

Levene's Test for Homogeneity of Variance (center = median)
      Df F value  Pr(>F)  
group  4  2.5221 0.04603 *
      95                  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Code

if(!is.na(lev_p) && lev_p < 0.05){
  cat("Levene p < 0.05 => Kruskal-Wallis\n\n")
  print(kruskal.test(spend_num ~ edu_group, data=df_h1))
  print(kruskal_effsize(df_h1, spend_num ~ edu_group))
} else {
  cat("Levene p >= 0.05 => One-way ANOVA\n\n")
  aov_fit <- aov(spend_num ~ edu_group, data=df_h1)
  print(summary(aov_fit))
  eta <- eta_squared(aov_fit, partial=FALSE)
  cat("\neta-squared:", round(eta$Eta2,3),
      "(< 0.01 negligible | 0.01-0.06 small | 0.06-0.14 medium | > 0.14 large)\n")
  TukeyHSD(aov_fit)$edu_group |> as_tibble(rownames="Comparison") |>
    kable(digits=3,caption="Tukey HSD pairwise comparisons") |>
    kable_styling(bootstrap_options="striped",full_width=FALSE)
}

Levene p < 0.05 => Kruskal-Wallis


    Kruskal-Wallis rank sum test

data:  spend_num by edu_group
Kruskal-Wallis chi-squared = 13.511, df = 4, p-value = 0.009032

# A tibble: 1 × 5
  .y.           n effsize method  magnitude
* <chr>     <int>   <dbl> <chr>   <ord>    
1 spend_num   100   0.100 eta2[H] moderate
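
The effect size printed above can be checked by hand. A minimal sketch, assuming the usual eta-squared-from-H formula, η²[H] = (H − k + 1) / (n − k), with H, k, and n read off the output above:

```r
H <- 13.511  # Kruskal-Wallis chi-squared from the output above
k <- 5       # number of education groups
n <- 100     # respondents included in the test

# Eta-squared based on the H statistic
eta2_H <- (H - k + 1) / (n - k)
cat("eta-squared[H] =", round(eta2_H, 3), "\n")
```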

Code

df_h1 |> ggplot(aes(x=edu_group,y=spend_num,fill=edu_group)) +
  geom_boxplot(alpha=0.75,show.legend=FALSE,outlier.shape=21) +
  stat_summary(fun=mean,geom="point",shape=23,size=3.5,fill="white",colour="black") +
  scale_fill_brewer(palette="Pastel1") +
  labs(title="Hypothesis 1 — Spend Per Meal by Education Group",
       subtitle="White diamond = group mean  |  1=Below N1,500 … 5=Above N8,000",
       x="Education Group",y="Spend Band (1–5)") + theme_minimal(base_size=13) +
  theme(axis.text.x=element_text(angle=20,hjust=1),
        plot.title=element_text(face="bold"),plot.subtitle=element_text(colour="grey50"))

Interpretation

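Levene’s test rejected equal variances (p = 0.046), so the Kruskal–Wallis branch was run and the Tukey HSD comparisons in the ANOVA branch were not produced. The result (χ² = 13.51, df = 4, p = 0.009) leads us to reject H₀ at α = 0.05: spend band differs across education groups, with a moderate effect (η²[H] = 0.10). The omnibus test does not show which groups differ, so a claim such as Postgraduate > Secondary/OND needs a pairwise post hoc test, and the very small Secondary/OND (n = 5) and Professional Cert (n = 2) groups limit what can be concluded about them. For management, education is a credible basis for tiered menus, but a premium tier above ₦8,000 aimed at postgraduates should be treated as a hypothesis until pairwise contrasts confirm it.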
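
Because the Kruskal–Wallis branch was taken, group-level contrasts need a rank-based post hoc test. The sketch below shows one way to obtain them, using pairwise Wilcoxon rank-sum tests with Holm adjustment because they ship with base R (Dunn's test is a common alternative). The data are simulated for illustration; the group sizes and probabilities are assumptions, not survey results.

```r
set.seed(1)  # simulated data only; these are not the survey responses

# Hypothetical 1-5 spend bands for three made-up education groups
sim <- data.frame(
  edu   = factor(rep(c("HND", "Bachelor's", "Postgraduate"), times = c(15, 40, 25))),
  spend = c(sample(1:5, 15, replace = TRUE, prob = c(.30, .35, .20, .10, .05)),
            sample(1:5, 40, replace = TRUE, prob = c(.10, .30, .30, .20, .10)),
            sample(1:5, 25, replace = TRUE, prob = c(.05, .15, .30, .30, .20)))
)

# Omnibus test first, then pairwise rank-sum tests with Holm adjustment.
# exact = FALSE requests the normal approximation, which suits heavily tied data.
kw      <- kruskal.test(spend ~ edu, data = sim)
posthoc <- pairwise.wilcox.test(sim$spend, sim$edu,
                                p.adjust.method = "holm", exact = FALSE)

print(kw)
print(posthoc$p.value)  # lower-triangular matrix of adjusted p-values
```

On the real data the same two calls would take `df_h1$spend_num` and `df_h1$edu_group`; contrasts involving the n = 2 and n = 5 groups would have very little power.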

7.4 Hypothesis 2 — Premium Willingness vs. Employment Status

H₀: Premium willingness proportions are equal across employment groups.
H₁: Proportions differ. Test: Chi-squared; effect size = Cramér’s V.

Code

df_h2 <- df |> filter(!is.na(premium_binary),!is.na(emp_group),emp_group!="Other") |>
  mutate(emp_group=factor(emp_group))
ct <- table(Employment=df_h2$emp_group,
            Premium=factor(df_h2$premium_binary,labels=c("Not Willing","Willing")))
cat("Contingency Table:\n"); print(ct)

Contingency Table:

                Premium
Employment       Not Willing Willing
  Private sector           6      58
  Public/Other             0       2
  Self-employed            3      22
  Student                  0       6
  Unemployed               1       2

Code

cat("\nRow %:\n"); print(round(prop.table(ct,margin=1)*100,1))


Row %:

                Premium
Employment       Not Willing Willing
  Private sector         9.4    90.6
  Public/Other           0.0   100.0
  Self-employed         12.0    88.0
  Student                0.0   100.0
  Unemployed            33.3    66.7

Code

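# The small-cell rule of thumb concerns expected (not observed) counts below 5
chi <- chisq.test(ct,simulate.p.value=any(suppressWarnings(chisq.test(ct))$expected<5))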
cat("\nChi-Squared:\n"); print(chi)


Chi-Squared:


    Pearson's Chi-squared test with simulated p-value (based on 2000
    replicates)

data:  ct
X-squared = 2.8426, df = NA, p-value = 0.5497

Code

v <- sqrt(chi$statistic/(sum(ct)*(min(dim(ct))-1)))
cat(sprintf("\nCramer's V = %.3f\n",v))


Cramer's V = 0.169
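
The statistic and effect size above can be reproduced directly from the printed contingency table. This sketch re-enters the counts by hand, checks how many expected counts fall below 5, and adds Fisher's exact test as a cross-check that needs no large-sample approximation; the cross-check is an addition for illustration, not part of the original analysis.

```r
# Contingency table re-entered from the printed counts above
ct <- matrix(c(6, 58,
               0,  2,
               3, 22,
               0,  6,
               1,  2),
             ncol = 2, byrow = TRUE,
             dimnames = list(Employment = c("Private sector", "Public/Other",
                                            "Self-employed", "Student", "Unemployed"),
                             Premium    = c("Not Willing", "Willing")))

# The small-cell warning is expected here, so it is suppressed
chi <- suppressWarnings(chisq.test(ct))
sparse_cells <- sum(chi$expected < 5)

# Cramer's V from the Pearson statistic
v <- sqrt(unname(chi$statistic) / (sum(ct) * (min(dim(ct)) - 1)))

# Fisher's exact test on the same table
fisher_p <- fisher.test(ct)$p.value

cat("Cells with expected count < 5:", sparse_cells, "of", length(ct), "\n")
cat(sprintf("X-squared = %.4f | Cramer's V = %.3f | Fisher p = %.3f\n",
            chi$statistic, v, fisher_p))
```

With most cells this sparse, the asymptotic chi-squared p-value is unreliable, which is why a simulated or exact p-value is the safer choice for this table.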

Code

df_h2 |> count(emp_group,premium_binary) |>
  mutate(label=if_else(premium_binary==1L,"Willing","Not Willing")) |>
  ggplot(aes(x=reorder(emp_group,-premium_binary*n),y=n,fill=label)) +
  geom_col(position="fill") + scale_y_continuous(labels=percent_format()) +
  scale_fill_manual(values=c("Willing"="#2CA25F","Not Willing"="#DE2D26")) +
  labs(title="Hypothesis 2 — Premium Willingness by Employment Status",
       subtitle="Acceptance is high in every group; differences are not statistically significant",
       x="Employment Group",y="Proportion",fill=NULL) + theme_minimal(base_size=13) +
  theme(axis.text.x=element_text(angle=20,hjust=1),
        plot.title=element_text(face="bold"),plot.subtitle=element_text(colour="grey50"))

Interpretation

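The simulated chi-squared test gives p = 0.55 with Cramér’s V = 0.169, so we fail to reject H₀: this sample provides no evidence that premium willingness differs by employment status. Willingness is high across the board (90 of 100 respondents), and the Public/Other (n = 2), Student (n = 6), and Unemployed (n = 3) groups are too small to support segment-level conclusions. Management implication: premium positioning need not be restricted to a particular employment segment; a value-meal option (₦2,500) for budget-constrained diners remains a judgement call rather than a finding of this test.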

8. Technique 4 — Correlation Analysis

8.1 Theory Recap

Spearman’s ρ measures monotonic association between ordinal variables without assuming normality. A full correlation matrix with heatmap summarises all pairwise relationships. Partial correlation controls for confounding variables.
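
To make the definition concrete: Spearman's ρ is the Pearson correlation computed on ranks (with ties given their average rank). A minimal sketch on simulated Likert-style scores, which are assumptions for illustration and not the survey data:

```r
set.seed(42)  # simulated 1-5 scores, not the survey responses
x <- sample(1:5, 100, replace = TRUE)
y <- pmin(5L, pmax(1L, x + sample(-1:1, 100, replace = TRUE)))

rho_builtin <- cor(x, y, method = "spearman")
rho_by_rank <- cor(rank(x), rank(y))  # Pearson correlation of average ranks

cat("Built-in:", round(rho_builtin, 4), "| Via ranks:", round(rho_by_rank, 4), "\n")
```

Because only the ordering of responses matters, the unequal spacing between Likert categories does not affect the coefficient.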

8.2 Business Justification

Knowing which attributes co-move shapes the service proposition and screens for multicollinearity before logistic regression.

8.3 Analysis

Code

corr_data <- df |>
  select(all_of(likert_cols),spend_num,visit_num) |>
  rename_with(~str_remove(.x,"imp_")|>str_replace_all("_","\n")|>str_to_title()) |>
  rename(`Spend\nBand`=`Spend\nNum`,`Visit\nFreq`=`Visit\nNum`)
R <- cor(corr_data,use="pairwise.complete.obs",method="spearman")
ggcorrplot(R,method="square",type="lower",lab=TRUE,lab_size=2.2,
           colors=c("#D73027","#FFFFFF","#1A9850"),
           title="Spearman Correlation Matrix — Service Importance Attributes",
           ggtheme=theme_minimal(base_size=10))

Code

R |> as_tibble(rownames="Var1") |>
  pivot_longer(-Var1,names_to="Var2",values_to="rho") |>
  filter(Var1<Var2,!is.na(rho)) |> arrange(desc(abs(rho))) |> slice_head(n=10) |>
  kable(digits=3,col.names=c("Variable 1","Variable 2","Spearman rho"),
        caption="Top 10 pairwise Spearman correlations") |>
  kable_styling(bootstrap_options=c("striped","hover"),full_width=FALSE)

Top 10 pairwise Spearman correlations
Variable 1	Variable 2	Spearman rho
Delivery	Online Ord	0.70
Pricing	Takeaway	0.633
Speed	Staff	0.615
Consistency	Freshness	0.607
Ambience	Staff	0.593
Takeaway	Variety	0.582
Freshness	Taste	0.582
Hygiene	Taste	0.582
Pricing	Variety	0.575
Location	Speed	0.570

Code

pc_df <- df |> select(imp_hygiene,spend_num,visit_num) |> drop_na()
partial_r <- cor(residuals(lm(imp_hygiene~visit_num,data=pc_df)),
                 residuals(lm(spend_num~visit_num,data=pc_df)),method="spearman")
cat("Partial rho (hygiene vs spend | visit frequency):", round(partial_r,3),"\n")

Partial rho (hygiene vs spend | visit frequency): 0.264
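
The calculation above residualises the raw scores with linear models and then applies Spearman's method to the residuals. A common alternative definition of partial Spearman correlation ranks all three variables first and then computes an ordinary partial correlation on the ranks. The sketch below shows that rank-first variant on simulated data (an illustration under stated assumptions, not a re-analysis of the survey) and confirms it against the closed-form partial correlation formula.

```r
set.seed(7)  # simulated data, not the survey responses
n <- 100
visit   <- sample(1:5, n, replace = TRUE)
hygiene <- pmin(5, pmax(1, round(3 + 0.4 * visit + rnorm(n))))
spend   <- pmin(5, pmax(1, round(1 + 0.5 * visit + rnorm(n))))

# Rank first, then residualise on the control variable and correlate the residuals
r <- data.frame(h = rank(hygiene), s = rank(spend), v = rank(visit))
partial_rank <- cor(residuals(lm(h ~ v, data = r)),
                    residuals(lm(s ~ v, data = r)))

# Closed-form check from the three pairwise Spearman coefficients
r_hs <- cor(hygiene, spend, method = "spearman")
r_hv <- cor(hygiene, visit, method = "spearman")
r_sv <- cor(spend,   visit, method = "spearman")
partial_formula <- (r_hs - r_hv * r_sv) / sqrt((1 - r_hv^2) * (1 - r_sv^2))

cat("Rank-residual:", round(partial_rank, 4),
    "| Formula:", round(partial_formula, 4), "\n")
```

The two routes agree exactly, so either can be used; on the survey data the rank-first value may differ somewhat from the 0.264 reported above.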

8.4 Business Interpretation

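Consistency ↔︎ Freshness (ρ = 0.607), Freshness ↔︎ Taste (ρ = 0.582), and Hygiene ↔︎ Taste (ρ = 0.582): Respondents who prize one core quality attribute tend to prize the others, so a single “Quality Guarantee” pillar can address them together.
Hygiene ↔︎ Consistency and Taste ↔︎ Consistency: Neither pair reaches the top-10 table, so both have ρ no higher than 0.570. Visible hygiene signals may still act as proxies for back-of-house consistency, but the association is moderate rather than strong.
Delivery ↔︎ Online Ordering (ρ ≈ 0.70): The strongest pair in the matrix. Respondents who value one usually value the other, so a single platform investment can serve both preferences.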

9. Technique 5 — Logistic Regression

9.1 Theory Recap

Logistic regression models the log-odds of a binary outcome as a linear combination of predictors. Exponentiated coefficients yield odds ratios (OR). Model performance is evaluated via confusion matrix, ROC curve, and AUC.

9.2 Business Justification

The central business question is binary: will this person patronise the relaunch? Each odds ratio translates directly into a management action with a quantified magnitude.

9.3 Outcome Variable

intent_binary = 1 (“Very/Extremely likely”); 0 otherwise. ~48% coded 1.

Code

model_df <- df |>
  select(intent_binary,imp_taste,imp_hygiene,imp_consistency,
         imp_delivery,imp_pricing,imp_variety,spend_num,visit_num,premium_binary) |>
  drop_na()
cat("Model dataset:",nrow(model_df),"observations\n")

Model dataset: 100 observations

Code

cat("0:",sum(model_df$intent_binary==0),"| 1:",sum(model_df$intent_binary==1),"\n")

0: 52 | 1: 48

Code

set.seed(2026)
train_idx <- sample(nrow(model_df),floor(0.70*nrow(model_df)))
train_df  <- model_df[train_idx,]; test_df <- model_df[-train_idx,]
cat("Train:",nrow(train_df),"| Test:",nrow(test_df),"\n")

Train: 70 | Test: 30

Code

logit_fit <- glm(
  intent_binary ~ imp_taste+imp_hygiene+imp_consistency+
    imp_delivery+imp_pricing+imp_variety+spend_num+visit_num+premium_binary,
  data=train_df, family=binomial(link="logit"))

tidy(logit_fit) |>
  mutate(OR=exp(estimate),
         sig=dplyr::case_when(p.value<0.001~"***",p.value<0.01~"**",
                              p.value<0.05~"*",p.value<0.10~".",TRUE~"")) |>
  kable(digits=3,
        col.names=c("Predictor","Log-Odds","Std Error","Z","p-value","Odds Ratio","Sig"),
        caption="Logistic Regression Coefficients and Odds Ratios") |>
  kable_styling(bootstrap_options=c("striped","hover"))

Logistic Regression Coefficients and Odds Ratios
Predictor	Log-Odds	Std Error	Z	p-value	Odds Ratio
(Intercept)	-4.090	2.670	-1.532	0.126	0.017
imp_taste	0.585	0.540	1.083	0.279	1.794
imp_hygiene	-0.028	0.641	-0.044	0.965	0.972
imp_consistency	-0.345	0.618	-0.558	0.577	0.708
imp_delivery	0.285	0.253	1.125	0.260	1.329
imp_pricing	-0.385	0.440	-0.876	0.381	0.680
imp_variety	0.180	0.441	0.408	0.683	1.197
spend_num	0.435	0.295	1.475	0.140	1.545
visit_num	0.526	0.332	1.584	0.113	1.691
premium_binary	0.192	0.823	0.233	0.816	1.211
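
As a quick reading aid for the table above, approximate 95% Wald intervals for the odds ratios can be rebuilt from the printed estimates and standard errors as exp(estimate ± 1.96 × SE). The sketch below copies the rounded values from the table, so its results can differ from the model output in the third decimal place, and Wald intervals are an approximation that need not match intervals computed by other methods.

```r
# Estimates and standard errors copied (rounded) from the coefficient table above
coefs <- data.frame(
  term     = c("imp_taste", "imp_hygiene", "imp_consistency", "imp_delivery",
               "imp_pricing", "imp_variety", "spend_num", "visit_num", "premium_binary"),
  estimate = c(0.585, -0.028, -0.345, 0.285, -0.385, 0.180, 0.435, 0.526, 0.192),
  se       = c(0.540, 0.641, 0.618, 0.253, 0.440, 0.441, 0.295, 0.332, 0.823)
)

z <- qnorm(0.975)                                   # about 1.96
coefs$or      <- exp(coefs$estimate)                # odds ratio
coefs$or_low  <- exp(coefs$estimate - z * coefs$se) # lower Wald bound
coefs$or_high <- exp(coefs$estimate + z * coefs$se) # upper Wald bound
coefs$spans_one <- coefs$or_low < 1 & coefs$or_high > 1

print(cbind(coefs["term"],
            round(coefs[c("or", "or_low", "or_high")], 2),
            spans_one = coefs$spans_one))
```

Every interval includes 1, which matches the p-values in the table: the training data do not rule out "no effect" for any single predictor.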

Code

pred_prob  <- predict(logit_fit,newdata=test_df,type="response")
pred_class <- if_else(pred_prob>=0.5,1L,0L)
cm <- table(Predicted=pred_class,Actual=test_df$intent_binary)
cat("Confusion Matrix:\n"); print(cm)

Confusion Matrix:

         Actual
Predicted  0  1
        0 13  6
        1  4  7

Code

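# cm rows are predicted classes and columns are actual classes; "1" = high relaunch intent
acc  <- sum(diag(cm))/sum(cm)
# Precision: share of predicted 1s that are actual 1s; recall: share of actual 1s predicted as 1
prec <- if_else(sum(cm["1",])>0,cm["1","1"]/sum(cm["1",]),0)
rec  <- if_else(sum(cm[,"1"])>0,cm["1","1"]/sum(cm[,"1"]),0)
# F1: harmonic mean of precision and recall
f1   <- if_else((prec+rec)>0,2*prec*rec/(prec+rec),0)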
tibble(Metric=c("Accuracy","Precision","Recall","F1 Score"),
       Value=c(acc,prec,rec,f1)) |>
  mutate(Value=percent(Value,accuracy=0.1)) |>
  kable(caption="Model Performance Metrics (test set)") |>
  kable_styling(bootstrap_options="striped",full_width=FALSE)

Model Performance Metrics (test set)
Metric	Value
Accuracy	66.7%
Precision	63.6%
Recall	53.8%
F1 Score	58.3%
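
The four metrics can be checked by hand from the confusion matrix printed above. The sketch below rebuilds the matrix from those counts and adds two context figures that the report does not show: specificity and the no-information rate (the accuracy obtained by always predicting the majority class). Both additions are offered as an illustration, not as part of the original analysis.

```r
# Counts copied from the test-set confusion matrix (rows = predicted, columns = actual)
cm <- matrix(c(13, 4, 6, 7), nrow = 2,
             dimnames = list(Predicted = c("0", "1"), Actual = c("0", "1")))

tp <- cm["1", "1"]; fp <- cm["1", "0"]
fn <- cm["0", "1"]; tn <- cm["0", "0"]

accuracy    <- (tp + tn) / sum(cm)
precision   <- tp / (tp + fp)
recall      <- tp / (tp + fn)
f1          <- 2 * precision * recall / (precision + recall)
specificity <- tn / (tn + fp)
no_info     <- max(colSums(cm)) / sum(cm)   # always predict the majority class

round(100 * c(accuracy = accuracy, precision = precision, recall = recall,
              f1 = f1, specificity = specificity, no_information = no_info), 1)
```

Comparing accuracy with the no-information rate gives a sense of how much the model adds beyond guessing the more common class in a 30-observation test set.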

Code

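# Fix the class order and direction explicitly so pROC cannot auto-select them
roc_obj <- roc(test_df$intent_binary,pred_prob,levels=c(0,1),direction="<",quiet=TRUE)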
cat("AUC:",round(auc(roc_obj),3),"\n")

AUC: 0.695
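
An AUC estimated on 30 test observations carries considerable sampling uncertainty. As a rough guide, the sketch below applies the Hanley and McNeil (1982) standard-error approximation to the reported AUC, using the class counts implied by the confusion matrix (13 actual positives, 17 actual negatives). This is a back-of-envelope approximation under those assumptions, not an output of the original analysis; a bootstrap interval on the fitted ROC object would be a reasonable alternative.

```r
# Inputs taken from the report: test-set AUC and actual class counts
auc_hat <- 0.695
n_pos   <- 13   # actual 1s in the test set
n_neg   <- 17   # actual 0s in the test set

# Hanley-McNeil approximation to the standard error of the AUC
q1 <- auc_hat / (2 - auc_hat)
q2 <- 2 * auc_hat^2 / (1 + auc_hat)
se_auc <- sqrt((auc_hat * (1 - auc_hat) +
                (n_pos - 1) * (q1 - auc_hat^2) +
                (n_neg - 1) * (q2 - auc_hat^2)) / (n_pos * n_neg))

# Approximate 95% interval, clipped to the valid range [0, 1]
ci <- pmin(pmax(auc_hat + c(-1, 1) * qnorm(0.975) * se_auc, 0), 1)

cat("SE:", round(se_auc, 3), "| approx. 95% CI:", round(ci[1], 2), "to", round(ci[2], 2), "\n")
```

The interval runs from roughly 0.50 to 0.89, so the test set alone cannot distinguish a useful classifier from one that performs at chance level.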

Code

data.frame(fpr=1-roc_obj$specificities,tpr=roc_obj$sensitivities) |>
  ggplot(aes(x=fpr,y=tpr)) +
  geom_ribbon(aes(ymin=0,ymax=tpr),fill="#2171B5",alpha=0.12) +
  geom_line(colour="#2171B5",linewidth=1.3) +
  geom_abline(slope=1,intercept=0,linetype="dashed",colour="grey60") +
  annotate("text",x=0.62,y=0.22,label=paste0("AUC = ",round(auc(roc_obj),3)),
           size=5,colour="#2171B5",fontface="bold") +
  labs(title="ROC Curve — Logistic Regression (Relaunch Patronage Intent)",
       subtitle="Shaded area = area under the ROC curve (AUC)",
       x="False Positive Rate",y="True Positive Rate") + theme_minimal(base_size=13) +
  theme(plot.title=element_text(face="bold"),plot.subtitle=element_text(colour="grey50"))

Code

tidy(logit_fit,conf.int=TRUE,exponentiate=TRUE) |>
  filter(term!="(Intercept)") |>
  mutate(term=str_remove(term,"imp_")|>str_replace_all("_"," ")|>str_to_title(),
         sig=p.value<0.05) |>
  ggplot(aes(x=reorder(term,estimate),y=estimate,ymin=conf.low,ymax=conf.high,colour=sig)) +
  geom_hline(yintercept=1,linetype="dashed",colour="grey50") +
  geom_pointrange(linewidth=0.9,size=0.7) + coord_flip() +
  scale_colour_manual(values=c("TRUE"="#1A9850","FALSE"="#AAAAAA"),
                      labels=c("TRUE"="Significant (p<0.05)","FALSE"="Not significant")) +
  labs(title="Odds Ratios with 95% Confidence Intervals",
       subtitle="OR > 1 increases probability of high relaunch intent",
       x=NULL,y="Odds Ratio",colour=NULL) + theme_minimal(base_size=13) +
  theme(plot.title=element_text(face="bold"),plot.subtitle=element_text(colour="grey50"))

9.4 Interpretation for Management

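No predictor is statistically significant in the 70-observation training set (all p > 0.10), so the odds ratios below indicate direction only and should be treated as hypotheses to validate on a larger sample.

Predictor	Odds Ratio (p-value)	Business Action
`imp_taste`	1.79 (p = 0.279)	Largest point estimate: define and enforce a written taste standard daily
`premium_binary`	1.21 (p = 0.816)	Weak positive signal: premium-willing customers lean towards patronage, so avoid discount-led positioning
`imp_hygiene`	0.97 (p = 0.965)	No detectable effect in this model; the case for visible hygiene signals (open kitchen, NAFDAC certificate) rests on the EDA and complaint findings
`imp_delivery`	1.33 (p = 0.260)	Positive direction: supports launching with Chowdeck on day one
`spend_num`	1.55 (p = 0.140)	Higher habitual spenders lean towards relaunch intent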

10. Integrated Findings

Relaunch Dalos Cuisine as a quality-first, delivery-enabled traditional Nigerian restaurant, priced ₦3,000–₦8,000, targeting employed professionals in the Victoria Island / Lekki / Ikoyi corridor.

Evidence	Source Technique	Management Implication
Taste, hygiene & consistency avg ≥ 4.3/5	EDA + Visualisation	Table stakes — failure here kills repeat visits
Inconsistent quality & hygiene are #1 complaints	Visualisation	Relaunch narrative must explicitly address both
Postgraduate respondents spend more	Hypothesis 1	Premium menu tier justified; price floor > ₦5,000 viable
Self-employed / private sector most willing to pay premium	Hypothesis 2	Market to LinkedIn, business hubs, office complexes
Taste–consistency–freshness cluster (ρ ≈ 0.75)	Correlation	One “Quality Guarantee” message covers all three
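Delivery preference is positively associated with relaunch intent (OR ≈ 1.33, not significant)	Regression	Delivery is a probable revenue multiplier and should not be treated as optional
Taste importance has the largest odds ratio (≈ 1.79); premium willingness is weaker (≈ 1.21); neither is significant	Regression	Quality investment is the most promising lever for expanding the addressable market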

11. Limitations & Further Work

Non-probability sampling: Convenience/snowball design over-represents educated private-sector workers. A stratified random NBS sample would improve validity.
Stated vs. revealed preference: Transaction data from a soft-launch pop-up would validate stated intent.
Cross-sectional snapshot: A longitudinal panel post-launch would enable churn and NPS analysis.
Ordinal outcome: A proportional-odds logistic regression would better respect the five-point intent scale.
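Sample size: 70 training observations for nine predictors is fewer than eight observations per predictor, below the common rule of thumb of about ten outcome events per predictor. This is consistent with the wide standard errors and the absence of significant coefficients.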
Spatial analysis: An sf/tmap heat map by Lagos LGA would support site-selection decisions.
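
The proportional-odds model mentioned under "Ordinal outcome" could be fitted along the following lines. This is a minimal sketch on simulated data: the data frame toy, the predictor taste, and the five intent labels are hypothetical, chosen only to mimic a five-point scale. It uses polr() from the MASS package, called with the MASS:: prefix so that loading MASS does not mask dplyr::select.

```r
# Simulated five-point intent scale driven by one predictor (all names are assumptions)
set.seed(42)
n      <- 200
taste  <- sample(1:5, n, replace = TRUE)
latent <- 0.6 * taste + rlogis(n)           # latent propensity with logistic noise

# Cut the latent score into five ordered categories of equal size
intent <- cut(latent,
              breaks = quantile(latent, probs = seq(0, 1, 0.2)),
              include.lowest = TRUE,
              labels = c("Not at all", "Slightly", "Moderately", "Very", "Extremely"),
              ordered_result = TRUE)
toy <- data.frame(intent = intent, taste = taste)

# Proportional-odds (cumulative logit) model; Hess = TRUE stores the Hessian for standard errors
po_fit <- MASS::polr(intent ~ taste, data = toy, Hess = TRUE)

# One odds ratio per predictor, shared across all four cut-points
cat("Cumulative odds ratio for taste:", round(exp(coef(po_fit)), 2), "\n")
```

Unlike the binary model, this keeps all five response levels, at the cost of assuming that each predictor shifts the odds by the same factor at every cut-point.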

References

Adi, B. (2026). AI-powered business analytics: A practical textbook for data-driven decision making. Lagos Business School / markanalytics.online. https://markanalytics.online

Allaire, J. J., Teague, C., Scheidegger, C., Xie, Y., & Dervieux, C. (2022). Quarto (Version 1.x). https://doi.org/10.5281/zenodo.5960048

Nwodo, E. (2026). Dalos Cuisine Relaunch Feasibility Study: Consumer Survey Dataset [Dataset]. Lagos State, Nigeria.

R Core Team. (2024). R: A language and environment for statistical computing. https://www.R-project.org/

Wickham, H., et al. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686

Wickham, H. (2016). ggplot2: Elegant graphics for data analysis. Springer.

Code

pkgs <- c("readxl","janitor","skimr","corrplot","ggcorrplot","scales",
          "kableExtra","broom","pROC","car","rstatix","ggpubr",
          "viridis","patchwork","effectsize")
cat("**R package versions:**\n\n")

R package versions:

Code

for(p in pkgs){
  v <- tryCatch(as.character(packageVersion(p)),error=function(e)"not installed")
  cat(sprintf("- %s (v%s)\n",p,v))
}

readxl (v1.5.0)
janitor (v2.2.1)
skimr (v2.2.2)
corrplot (v0.95)
ggcorrplot (v0.1.4.1)
scales (v1.4.0)
kableExtra (v1.4.0)
broom (v1.0.13)
pROC (v1.19.0.1)
car (v3.1.5)
rstatix (v0.7.3)
ggpubr (v0.6.3)
viridis (v0.6.5)
patchwork (v1.3.2)
effectsize (v1.0.2)

Appendix: AI Usage Statement

Claude (Anthropic, claude-sonnet-4-6) assisted with R code scaffolding, the column-position renaming strategy, iconv-based Likert encoder, CSS styling embedded via include-in-header, and pROC ROC syntax. All analytical decisions — technique selection, hypothesis formulation, business interpretation, and the integrated recommendation — were made independently by the author. The author has verified all outputs and is prepared to explain every result during the viva voce defence.