Exploratory and Inferential Analysis of Customer Meal Preferences and Satisfaction at Peej Kitchen

Author

Fagbemi Mary Olapeju

Published

May 12, 2026

Case Study 1 — Exploratory & Inferential Analytics.
This report analyses primary survey data collected for Peej Kitchen to understand customer meal preference, satisfaction, price sensitivity, and operational improvement opportunities.

1. Executive Summary

Peej Kitchen is a home-made catering business that prepares and delivers Nigerian meals such as Jollof Rice, Fried Rice, Oha Soup, Egusi Soup, and Ogbono Soup. This case study uses primary survey data collected from 100 respondents in May 2026 to understand the factors influencing customer meal demand, meal preference, and overall satisfaction. The dataset includes customer income range, order frequency, most ordered meal, perceived affordability, likelihood of continued purchase after a price increase, taste/quality rating, delivery time rating, portion size rating, most important purchase factor, and overall satisfaction.

The analysis applies five exploratory and inferential analytics techniques: exploratory data analysis, visualisation, hypothesis testing, correlation analysis, and regression analysis. The survey evidence helps Peej Kitchen identify its most attractive meals, understand the main drivers of satisfaction, and prioritise operational improvements. Jollof Rice is the most ordered meal (61% of respondents), taste/quality is the most important purchase factor (75%), and 92% of respondents are satisfied or very satisfied. The business recommendation is to protect taste and quality as the core competitive advantage, promote high-demand meals such as Jollof Rice, monitor price sensitivity carefully, and improve delivery experience and portion consistency, which received lower average ratings (3.57 out of 5 each) than taste/quality (4.52).

Management focus: The report is written for a non-technical business owner. Each analysis section therefore includes the technique used, why it matters for Peej Kitchen, the R output, and a plain-language business interpretation.

2. Professional Disclosure

I am Fagbemi Mary Olapeju, the owner of Peej Kitchen, a home-made catering business that prepares meals for customers and ensures that they are delivered safely and securely. Peej Kitchen serves a variety of Nigerian meals including Jollof Rice, Fried Rice, Oha Soup, Egusi Soup, Ogbono Soup, and other home-made dishes.

The purpose of this analysis is to understand the factors that influence customer demand, meal preference, and customer satisfaction. As the owner of the business, I regularly make decisions about menu planning, pricing, meal quality, delivery experience, and customer retention. Therefore, the use of exploratory and inferential analytics is directly relevant to my day-to-day operations.

Exploratory Data Analysis (EDA). EDA is relevant because it helps me understand the structure of my customer survey data, identify missing values, detect inconsistent responses, and summarise key customer patterns such as most ordered meals, order frequency, and satisfaction levels. This supports better decisions about what customers are buying and where the business may need improvement.

Data Visualisation. Data visualisation is relevant because it allows me to communicate customer behaviour clearly. Charts showing meal preference, satisfaction levels, affordability perception, and key purchase factors can help me quickly identify which areas of the business need attention.

Hypothesis Testing. Hypothesis testing is relevant because it allows me to test whether observed patterns in the data are statistically meaningful. For example, I can test whether taste/quality is significantly associated with overall customer satisfaction and whether satisfaction differs across meal categories.

Correlation Analysis. Correlation analysis is relevant because it helps me understand the strength and direction of relationships between key customer experience variables such as taste, affordability, portion size, delivery time, price sensitivity, and satisfaction.

Regression Analysis. Regression analysis is relevant because it helps me estimate which factors are the strongest predictors of overall customer satisfaction. This supports better business decisions on where Peej Kitchen should focus improvement efforts.

3. Data Collection and Sampling

The primary dataset used for this study was collected through a customer survey designed by the owner of Peej Kitchen. The survey was administered using Google Forms and distributed mainly through WhatsApp and direct messages to customers.

The sampling frame consisted of existing customers of Peej Kitchen, potential customers who had tasted the meals, and family and friends who were familiar with the business. Responses were collected between 5 and 9 May 2026. A total of 100 responses were received, which satisfies the minimum requirement of 100 observations for this case study.

The survey captured customer views on income range, order frequency, most ordered meal, affordability, likelihood of continued purchase if prices increase, taste/quality rating, delivery time, portion size, most important factor influencing meal choice, and overall satisfaction.

Participation was voluntary. Respondents were informed that their responses would be used anonymously for an MBA Data Analytics assignment. No personally identifiable information was used in the analysis, and the results are presented only in aggregated form. The dataset is treated as primary survey data collected directly from customers and people familiar with Peej Kitchen.

4. Data Description

This section loads the data, standardises the column names, cleans inconsistent categories, and creates numeric scores from ordered survey responses. These numeric scores make it possible to run correlation and regression analysis.
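
As a minimal, self-contained illustration of the scoring step (separate from the pipeline used in this report), an ordered response scale can be turned into numeric scores by matching each answer against its ordered levels. The helper name `likert_to_score` and the example responses are hypothetical; the level labels are the satisfaction categories used in the survey. Any response that does not match a level becomes NA, which makes leftover spelling problems easy to spot.

```r
# Hypothetical helper: position of each response within an ordered scale.
likert_to_score <- function(responses, ordered_levels) {
  as.integer(factor(responses, levels = ordered_levels))
}

satisfaction_levels <- c(
  "Very Dissatisfied", "Dissatisfied", "Neutral", "Satisfied", "Very Satisfied"
)

# Made-up responses; the last one is deliberately misspelled.
example_responses <- c("Satisfied", "Very Satisfied", "Neutral", "Satisifed")

example_scores <- likert_to_score(example_responses, satisfaction_levels)
example_scores
# 4 5 3 NA
```

The report's own code uses explicit `case_when()` rules instead, which is more verbose but keeps every mapping visible to a non-technical reader.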

# Uncomment and run once if any package is missing:
# install.packages(c("tidyverse", "readxl", "janitor", "skimr", "naniar", "corrplot", "broom", "effectsize", "rstatix", "knitr", "kableExtra", "patchwork", "scales"))

library(tidyverse)
library(readxl)
library(janitor)
library(skimr)
library(naniar)
library(corrplot)
library(broom)
library(effectsize)
library(rstatix)
library(knitr)
library(kableExtra)
library(patchwork)
library(scales)

theme_set(
  theme_minimal(base_size = 12) +
    theme(
      plot.title = element_text(face = "bold", colour = "#1f4e79"),
      plot.subtitle = element_text(colour = "#5f6b77"),
      axis.title = element_text(face = "bold"),
      panel.grid.minor = element_blank()
    )
)
peej_raw <- read_excel("data/peej_kitchen_responses.xlsx")

glimpse(peej_raw)
Rows: 100
Columns: 11
$ Timestamp                                                               <dttm> …
$ `1. What is your monthly income range?`                                 <chr> …
$ `2. How often do you buy home-made meals from peej Kitchen?`            <chr> …
$ `3. Which of the following meals do you order most often?`              <chr> …
$ `4. How would you rate the affordability of our meals?`                 <chr> …
$ `5. If the price increases, how likely are you to continue buying?`     <chr> …
$ `6. How would you rate the taste/quality of the meals?`                 <chr> …
$ `7. How would you rate delivery time?`                                  <chr> …
$ `8. How would you rate portion size?`                                   <chr> …
$ `9. What is the MOST important factor influencing your choice of meal?` <chr> …
$ `10. How satisfied are you overall with our meals?`                     <chr> …
peej <- peej_raw %>%
  clean_names() %>%
  rename(
    income_range = x1_what_is_your_monthly_income_range,
    order_frequency = x2_how_often_do_you_buy_home_made_meals_from_peej_kitchen,
    most_ordered_meal = x3_which_of_the_following_meals_do_you_order_most_often,
    affordability = x4_how_would_you_rate_the_affordability_of_our_meals,
    price_increase_likelihood = x5_if_the_price_increases_how_likely_are_you_to_continue_buying,
    taste_quality = x6_how_would_you_rate_the_taste_quality_of_the_meals,
    delivery_time = x7_how_would_you_rate_delivery_time,
    portion_size = x8_how_would_you_rate_portion_size,
    most_important_factor = x9_what_is_the_most_important_factor_influencing_your_choice_of_meal,
    overall_satisfaction = x10_how_satisfied_are_you_overall_with_our_meals
  ) %>%
  mutate(
    across(where(is.character), ~ str_squish(.x)),
    order_frequency = case_when(
      order_frequency %in% c("1-2 Monthly", "1-2 times Monthly") ~ "1-2 times Monthly",
      order_frequency == "1–2 times a week" ~ "1-2 times a week",
      order_frequency == "3–5 times a week" ~ "3-5 times a week",
      TRUE ~ order_frequency
    ),
    taste_quality = case_when(
      taste_quality == "Neural" ~ "Neutral",
      TRUE ~ taste_quality
    ),
    affordability = case_when(
      affordability == "affordable" ~ "Affordable",
      TRUE ~ affordability
    ),
    portion_size = case_when(
      portion_size == "very Satisfying" ~ "Very Satisfying",
      TRUE ~ portion_size
    ),
    timestamp = case_when(
      inherits(timestamp, "POSIXct") ~ as.POSIXct(timestamp),
      inherits(timestamp, "Date") ~ as.POSIXct(timestamp),
      is.numeric(timestamp) ~ as.POSIXct((as.numeric(timestamp) - 25569) * 86400, origin = "1970-01-01", tz = "Africa/Lagos"),
      TRUE ~ suppressWarnings(as.POSIXct(timestamp, tz = "Africa/Lagos"))
    ),
    response_date = as.Date(timestamp)
  ) %>%
  mutate(
    satisfaction_score = case_when(
      overall_satisfaction == "Very Dissatisfied" ~ 1,
      overall_satisfaction == "Dissatisfied" ~ 2,
      overall_satisfaction == "Neutral" ~ 3,
      overall_satisfaction == "Satisfied" ~ 4,
      overall_satisfaction == "Very Satisfied" ~ 5,
      TRUE ~ NA_real_
    ),
    quality_score = case_when(
      taste_quality == "Poor" ~ 1,
      taste_quality == "Fair" ~ 2,
      taste_quality == "Neutral" ~ 3,
      taste_quality == "Good" ~ 4,
      taste_quality == "Excellent" ~ 5,
      TRUE ~ NA_real_
    ),
    affordability_score = case_when(
      affordability == "Very Expensive" ~ 1,
      affordability == "Expensive" ~ 2,
      affordability == "Moderate" ~ 3,
      affordability == "Affordable" ~ 4,
      affordability == "Very Affordable" ~ 5,
      TRUE ~ NA_real_
    ),
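    # Note: despite its name, a higher score here means the customer is MORE likely
    # to keep buying after a price increase (i.e. less price-sensitive).
    price_sensitivity_score = case_when(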
      price_increase_likelihood == "Very unlikely" ~ 1,
      price_increase_likelihood == "Unlikely" ~ 2,
      price_increase_likelihood == "Neutral" ~ 3,
      price_increase_likelihood == "Likely" ~ 4,
      price_increase_likelihood == "Very likely" ~ 5,
      TRUE ~ NA_real_
    ),
    delivery_score = case_when(
      delivery_time == "Very Slow" ~ 1,
      delivery_time == "Slow" ~ 2,
      delivery_time == "Neutral" ~ 3,
      delivery_time == "Fast" ~ 4,
      delivery_time == "Very fast" ~ 5,
      TRUE ~ NA_real_
    ),
    portion_score = case_when(
      portion_size == "Too small" ~ 1,
      portion_size == "Small" ~ 2,
      portion_size == "Neutral" ~ 3,
      portion_size == "Satisfying" ~ 4,
      portion_size == "Very Satisfying" ~ 5,
      TRUE ~ NA_real_
    ),
    order_frequency_score = case_when(
      order_frequency == "Occasionally" ~ 1,
      order_frequency == "1-2 times Monthly" ~ 2,
      order_frequency == "1-2 times a week" ~ 3,
      order_frequency == "3-5 times a week" ~ 4,
      TRUE ~ NA_real_
    ),
    income_score = case_when(
      income_range == "Below ₦50,000" ~ 1,
      income_range == "₦50,000 – ₦100,000" ~ 2,
      income_range == "₦100,000 – ₦200,000" ~ 3,
      income_range == "Above ₦200,000" ~ 4,
      TRUE ~ NA_real_
    )
  )
variable_description <- tibble::tribble(
  ~Variable, ~Type, ~BusinessMeaning,
  "response_date", "Date", "Date the survey response was submitted",
  "income_range", "Categorical / ordinal", "Customer monthly income group",
  "order_frequency", "Categorical / ordinal", "How often customers buy home-made meals from Peej Kitchen",
  "most_ordered_meal", "Categorical", "Meal ordered most often by the respondent",
  "affordability", "Categorical / ordinal", "Customer perception of meal affordability",
  "price_increase_likelihood", "Categorical / ordinal", "Likelihood of continued purchase after price increase",
  "taste_quality", "Categorical / ordinal", "Customer rating of taste and meal quality",
  "delivery_time", "Categorical / ordinal", "Customer rating of delivery time",
  "portion_size", "Categorical / ordinal", "Customer rating of meal portion size",
  "most_important_factor", "Categorical", "Main factor influencing meal choice",
  "overall_satisfaction", "Categorical / ordinal", "Overall customer satisfaction rating"
)

variable_description %>%
  kable(caption = "Description of Variables in the Peej Kitchen Survey") %>%
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed"))
Description of Variables in the Peej Kitchen Survey
Variable | Type | BusinessMeaning
response_date | Date | Date the survey response was submitted
income_range | Categorical / ordinal | Customer monthly income group
order_frequency | Categorical / ordinal | How often customers buy home-made meals from Peej Kitchen
most_ordered_meal | Categorical | Meal ordered most often by the respondent
affordability | Categorical / ordinal | Customer perception of meal affordability
price_increase_likelihood | Categorical / ordinal | Likelihood of continued purchase after price increase
taste_quality | Categorical / ordinal | Customer rating of taste and meal quality
delivery_time | Categorical / ordinal | Customer rating of delivery time
portion_size | Categorical / ordinal | Customer rating of meal portion size
most_important_factor | Categorical | Main factor influencing meal choice
overall_satisfaction | Categorical / ordinal | Overall customer satisfaction rating
tibble(
  metric = c(
    "Number of observations",
    "Number of variables",
    "Earliest response date",
    "Latest response date"
  ),
  value = c(
    as.character(nrow(peej)),
    as.character(ncol(peej_raw)),
    format(min(peej$response_date, na.rm = TRUE), "%d %B %Y"),
    format(max(peej$response_date, na.rm = TRUE), "%d %B %Y")
  )
) %>%
  kable(caption = "Dataset Size and Survey Period") %>%
  kable_styling(
    full_width = FALSE,
    bootstrap_options = c("striped", "hover", "condensed")
  )
Dataset Size and Survey Period
metric | value
Number of observations | 100
Number of variables | 11
Earliest response date | 05 May 2026
Latest response date | 09 May 2026

5. Analysis Technique 1: Exploratory Data Analysis

5.1 Brief theory recap

Exploratory Data Analysis is the process of examining the dataset before formal modelling. It helps the analyst understand data structure, missing values, data quality problems, distributions, and early patterns. For Peej Kitchen, EDA helps answer basic operational questions such as: Which meals are most ordered? How satisfied are customers? Are there inconsistent responses that need cleaning before analysis?

5.2 Business justification

Peej Kitchen needs to understand the current state of customer demand and satisfaction before making decisions about pricing, promotion, menu improvement, and delivery operations. EDA is appropriate because this is survey data with a mix of categorical and ordinal variables.

skim(peej)
Data summary
Name peej
Number of rows 100
Number of columns 20
_______________________
Column type frequency:
character 10
Date 1
numeric 8
POSIXct 1
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
income_range 0 1 13 19 0 4 0
order_frequency 0 1 12 17 0 4 0
most_ordered_meal 0 1 8 11 0 5 0
affordability 0 1 8 15 0 5 0
price_increase_likelihood 0 1 6 13 0 5 0
taste_quality 0 1 4 9 0 3 0
delivery_time 0 1 4 9 0 5 0
portion_size 0 1 5 15 0 5 0
most_important_factor 0 1 5 13 0 5 0
overall_satisfaction 0 1 7 14 0 3 0

Variable type: Date

skim_variable n_missing complete_rate min max median n_unique
response_date 0 1 2026-05-05 2026-05-09 2026-05-06 5

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
satisfaction_score 0 1 4.32 0.62 3 4 4 5 5 ▁▁▇▁▆
quality_score 0 1 4.52 0.63 3 4 5 5 5 ▁▁▅▁▇
affordability_score 0 1 3.38 0.86 1 3 3 4 5 ▁▁▇▅▂
price_sensitivity_score 0 1 3.35 1.10 1 3 3 4 5 ▂▂▇▇▃
delivery_score 0 1 3.57 0.74 1 3 4 4 5 ▁▁▇▇▂
portion_score 0 1 3.57 0.91 1 3 4 4 5 ▁▂▃▇▂
order_frequency_score 0 1 2.08 1.18 1 1 2 3 4 ▇▂▁▅▂
income_score 0 1 2.81 1.29 1 1 3 4 4 ▅▂▁▂▇

Variable type: POSIXct

skim_variable n_missing complete_rate min max median n_unique
timestamp 0 1 2026-05-05 19:18:29 2026-05-09 09:45:57 2026-05-06 02:54:39 100
miss_var_summary(peej) %>%
  kable(caption = "Missing Value Summary") %>%
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed"))
Missing Value Summary
variable n_miss pct_miss
timestamp 0 0
income_range 0 0
order_frequency 0 0
most_ordered_meal 0 0
affordability 0 0
price_increase_likelihood 0 0
taste_quality 0 0
delivery_time 0 0
portion_size 0 0
most_important_factor 0 0
overall_satisfaction 0 0
response_date 0 0
satisfaction_score 0 0
quality_score 0 0
affordability_score 0 0
price_sensitivity_score 0 0
delivery_score 0 0
portion_score 0 0
order_frequency_score 0 0
income_score 0 0
meal_counts <- peej %>%
  count(most_ordered_meal, sort = TRUE) %>%
  mutate(percent = n / sum(n))

factor_counts <- peej %>%
  count(most_important_factor, sort = TRUE) %>%
  mutate(percent = n / sum(n))

satisfaction_counts <- peej %>%
  count(overall_satisfaction, sort = TRUE) %>%
  mutate(percent = n / sum(n))

meal_counts %>%
  mutate(percent = percent(percent, accuracy = 0.1)) %>%
  kable(caption = "Most Ordered Meals") %>%
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed"))
Most Ordered Meals
most_ordered_meal | n | percent
Jollof Rice | 61 | 61.0%
Fried Rice | 21 | 21.0%
Egwusi Soup | 9 | 9.0%
Oha Soup | 7 | 7.0%
Ogbono Soup | 2 | 2.0%
factor_counts %>%
  mutate(percent = percent(percent, accuracy = 0.1)) %>%
  kable(caption = "Most Important Factors Influencing Meal Choice") %>%
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed"))
Most Important Factors Influencing Meal Choice
most_important_factor | n | percent
Taste/Quality | 75 | 75.0%
Price | 9 | 9.0%
Variety | 8 | 8.0%
Portion Size | 7 | 7.0%
Delivery Time | 1 | 1.0%
satisfaction_counts %>%
  mutate(percent = percent(percent, accuracy = 0.1)) %>%
  kable(caption = "Overall Satisfaction Distribution") %>%
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed"))
Overall Satisfaction Distribution
overall_satisfaction | n | percent
Satisfied | 52 | 52.0%
Very Satisfied | 40 | 40.0%
Neutral | 8 | 8.0%
peej %>%
  summarise(
    average_satisfaction = mean(satisfaction_score, na.rm = TRUE),
    average_quality = mean(quality_score, na.rm = TRUE),
    average_affordability = mean(affordability_score, na.rm = TRUE),
    average_price_continuity = mean(price_sensitivity_score, na.rm = TRUE),
    average_delivery = mean(delivery_score, na.rm = TRUE),
    average_portion = mean(portion_score, na.rm = TRUE),
    average_order_frequency = mean(order_frequency_score, na.rm = TRUE)
  ) %>%
  pivot_longer(everything(), names_to = "metric", values_to = "average_score") %>%
  mutate(average_score = round(average_score, 2)) %>%
  kable(caption = "Average Scores for Ordinal Survey Variables") %>%
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed"))
Average Scores for Ordinal Survey Variables
metric average_score
average_satisfaction 4.32
average_quality 4.52
average_affordability 3.38
average_price_continuity 3.35
average_delivery 3.57
average_portion 3.57
average_order_frequency 2.08

5.3 Data quality issues and treatment

The EDA revealed the following data quality issues:

  1. Some response categories were written inconsistently. For example, 1-2 Monthly and 1-2 times Monthly both describe the same order frequency category. These were standardised into one category.
  2. Some text responses contained spelling or capitalisation inconsistencies. For example, Neural was corrected to Neutral, affordable was corrected to Affordable, and very Satisfying was corrected to Very Satisfying.
  3. The survey variables were mainly categorical or ordinal. For correlation and regression analysis, ordered categories were converted into numeric scores in a transparent way.
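
One way to surface such inconsistencies before cleaning is to compare the observed labels against the labels the survey form was supposed to offer. The sketch below uses base R only; the example vector is made up for illustration (the misspelling Neural is taken from issue 2), and it is not the exact check that was run on the survey data.

```r
# Labels the taste/quality question is expected to contain.
expected_labels <- c("Poor", "Fair", "Neutral", "Good", "Excellent")

# Made-up raw responses with a stray trailing space and a misspelling.
raw_responses <- c("Good", "Excellent ", "Neural", "Good", "Neutral")

# Trim whitespace first so that only genuine label problems remain.
trimmed <- trimws(raw_responses)
unexpected_labels <- setdiff(unique(trimmed), expected_labels)
unexpected_labels
# "Neural"

# Frequency table of the trimmed labels, useful for spotting rare variants.
table(trimmed)
```

Any label returned by `setdiff()` is a candidate for a recoding rule of the kind used in the cleaning step.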

5.4 Plain-language interpretation

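The EDA gives Peej Kitchen a clear first picture of customer behaviour. Jollof Rice dominates demand: 61 of the 100 respondents order it most often, followed by Fried Rice (21), while the three soups together account for 18. Taste/quality is by far the most important purchase factor (75%), well ahead of price (9%), variety (8%), portion size (7%), and delivery time (1%). Overall satisfaction is high: 52% of respondents are satisfied, 40% are very satisfied, 8% are neutral, and none reported dissatisfaction. On the 1-to-5 scores, taste/quality averages 4.52 and satisfaction 4.32, while delivery time (3.57), portion size (3.57), and affordability (3.38) score lower. These results create the foundation for the visualisation, hypothesis testing, correlation, and regression sections.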

6. Analysis Technique 2: Data Visualisation

6.1 Brief theory recap

Data visualisation uses charts to communicate patterns in a dataset. The goal is not only to produce graphs but to tell a clear business story. For this case, the visualisation story focuses on demand, satisfaction, and the customer experience factors that may influence repeat purchases.

6.2 Business justification

For a catering business, charts are useful because they quickly show which meals are popular, what customers value, and where operational weaknesses may exist. Visualisations are especially helpful for making decisions about menu focus, quality control, pricing, and delivery improvement.

p1 <- ggplot(peej, aes(x = fct_infreq(most_ordered_meal))) +
  geom_bar(fill = "#2C7FB8") +
  coord_flip() +
  labs(
    title = "Most Frequently Ordered Meals",
    x = "Meal",
    y = "Number of Respondents"
  )

p2 <- ggplot(peej, aes(x = fct_infreq(most_important_factor))) +
  geom_bar(fill = "#41AB5D") +
  coord_flip() +
  labs(
    title = "Main Factor Influencing Meal Choice",
    x = "Factor",
    y = "Number of Respondents"
  )

p3 <- ggplot(peej, aes(x = overall_satisfaction)) +
  geom_bar(fill = "#F16913") +
  labs(
    title = "Overall Customer Satisfaction",
    x = "Satisfaction Level",
    y = "Number of Respondents"
  )

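# Order the affordability categories from least to most affordable (not alphabetically).
affordability_levels <- c("Very Expensive", "Expensive", "Moderate", "Affordable", "Very Affordable")

p4 <- ggplot(peej, aes(x = factor(affordability, levels = affordability_levels))) +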
  geom_bar(fill = "#756BB1") +
  coord_flip() +
  labs(
    title = "Perceived Affordability of Meals",
    x = "Affordability Rating",
    y = "Number of Respondents"
  )

p5 <- ggplot(peej, aes(x = quality_score, y = satisfaction_score)) +
  geom_jitter(width = 0.15, height = 0.15, alpha = 0.55, colour = "#08519C") +
  geom_smooth(method = "lm", se = TRUE, colour = "#DE2D26") +
  labs(
    title = "Taste/Quality and Satisfaction",
    x = "Taste/Quality Score",
    y = "Satisfaction Score"
  )

(p1 | p2) / (p3 | p4) / p5 +
  plot_annotation(title = "Customer Demand and Satisfaction Story for Peej Kitchen")

ggplot(peej, aes(x = fct_reorder(most_ordered_meal, satisfaction_score, .fun = median), y = satisfaction_score)) +
  geom_boxplot(fill = "#9ECAE1") +
  coord_flip() +
  labs(
    title = "Satisfaction Score by Most Ordered Meal",
    x = "Meal",
    y = "Satisfaction Score"
  )

6.3 Plain-language interpretation

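The visualisations help Peej Kitchen understand the demand story. The meal chart shows that Jollof Rice is by far the most frequently ordered meal, followed by Fried Rice, while the purchase-factor chart shows that taste/quality is the main reason customers choose a meal. The satisfaction and affordability charts show how customers feel about the business overall: satisfaction responses fall only in the Neutral, Satisfied, and Very Satisfied categories, whereas affordability ratings are spread across the full scale. The relationship plot between taste/quality and satisfaction is especially important because it shows visually whether customers who rate quality highly also tend to report higher satisfaction. The boxplot of satisfaction by most ordered meal gives a first look at whether satisfaction varies across meals, which is tested formally in the next section.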

7. Analysis Technique 3: Hypothesis Testing

7.1 Brief theory recap

Hypothesis testing is used to determine whether an observed relationship or difference in the sample is likely to reflect a real pattern in the wider customer population. The null hypothesis usually states that there is no relationship or no difference, while the alternative hypothesis states that a relationship or difference exists.

7.2 Business justification

Peej Kitchen should not rely only on visual impressions. Hypothesis testing helps the business determine whether customer satisfaction is statistically associated with quality or whether satisfaction differs across meals. This supports more evidence-based decisions about quality improvement and menu strategy.

7.3 Hypothesis Test 1: Taste/Quality and Overall Satisfaction

Null hypothesis (H₀): Taste/quality rating and overall satisfaction are independent.

Alternative hypothesis (H₁): Taste/quality rating and overall satisfaction are associated.

Because both variables are categorical/ordinal, a chi-square test of independence is appropriate. If any expected cell count is below 5, the chi-square approximation becomes unreliable, so a Monte Carlo simulated p-value (10,000 replicates) is used instead.

quality_satisfaction_table <- table(peej$taste_quality, peej$overall_satisfaction)
quality_satisfaction_table
           
            Neutral Satisfied Very Satisfied
  Excellent       0        22             37
  Good            5        27              2
  Neutral         3         3              1
chisq_initial <- chisq.test(quality_satisfaction_table)
use_simulated_p <- any(chisq_initial$expected < 5)

# Fix the random seed so the simulated p-value is reproducible across renders.
set.seed(2026)
chisq_quality <- chisq.test(
  quality_satisfaction_table,
  simulate.p.value = use_simulated_p,
  B = 10000
)

chisq_quality

    Pearson's Chi-squared test with simulated p-value (based on 10000
    replicates)

data:  quality_satisfaction_table
X-squared = 43.404, df = NA, p-value = 9.999e-05
cramers_v_result <- cramers_v(quality_satisfaction_table)
cramers_v_result
tibble(
  test = "Chi-square test: Taste/Quality vs Overall Satisfaction",
  statistic = round(unname(chisq_quality$statistic), 3),
  p_value = round(chisq_quality$p.value, 4),
  simulated_p_value_used = use_simulated_p,
  cramers_v = round(as.numeric(cramers_v_result$Cramers_v), 3)
) %>%
  kable(caption = "Hypothesis Test 1 Result") %>%
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed"))
Hypothesis Test 1 Result
test | statistic | p_value | simulated_p_value_used | cramers_v
Chi-square test: Taste/Quality vs Overall Satisfaction | 43.404 | 1e-04 | TRUE | 0.448

7.4 Interpretation of Test 1

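The chi-square statistic is 43.404 with a simulated p-value of about 0.0001, which is well below 0.05, so the null hypothesis of independence is rejected. A simulated p-value was used because several expected cell counts are below 5. With 10,000 replicates, 9.999e-05 (that is, 1/10,001) is the smallest p-value the simulation can return; it means that none of the simulated tables produced a statistic as large as the observed one. Cramér's V of 0.448 indicates that the association is substantial and not merely statistically detectable. The contingency table shows the pattern directly: 37 of the 59 respondents who rated taste/quality as Excellent are Very Satisfied, compared with only 2 of the 34 who rated it Good. In business terms, customers' perception of taste and quality is not just a casual opinion; it is meaningfully connected to how satisfied they are with Peej Kitchen.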
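
The figures reported for Test 1 can be checked by hand from the contingency table. The sketch below re-enters the nine cell counts from that table and recomputes the expected counts, the chi-square statistic, and Cramér's V in base R. It assumes that the reported V of 0.448 is the small-sample bias-corrected version (which appears to be the default in recent versions of the effectsize package); the uncorrected formula gives a slightly larger value.

```r
# Cell counts copied from the taste/quality by satisfaction table.
tab <- matrix(
  c(0, 22, 37,
    5, 27,  2,
    3,  3,  1),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    taste_quality = c("Excellent", "Good", "Neutral"),
    overall_satisfaction = c("Neutral", "Satisfied", "Very Satisfied")
  )
)

n <- sum(tab)
r <- nrow(tab)
k <- ncol(tab)

# Expected counts under independence and the Pearson chi-square statistic.
expected <- outer(rowSums(tab), colSums(tab)) / n
chi_sq <- sum((tab - expected)^2 / expected)
cells_below_5 <- sum(expected < 5)

# Uncorrected Cramer's V.
v_plain <- sqrt(chi_sq / (n * min(r - 1, k - 1)))

# Bias-corrected Cramer's V (assumed to be what the report shows).
phi2_adj <- max(0, chi_sq / n - (r - 1) * (k - 1) / (n - 1))
r_adj <- r - (r - 1)^2 / (n - 1)
k_adj <- k - (k - 1)^2 / (n - 1)
v_adj <- sqrt(phi2_adj / min(r_adj - 1, k_adj - 1))

round(c(chi_sq = chi_sq, v_plain = v_plain, v_adj = v_adj), 3)
# chi_sq 43.404, v_plain 0.466, v_adj 0.448
cells_below_5
# 5
```

Five of the nine expected counts fall below 5, which is why the simulated p-value is the safer choice for this table.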

7.5 Hypothesis Test 2: Satisfaction Across Meal Categories

Null hypothesis (H₀): Overall satisfaction scores are the same across meal categories.

Alternative hypothesis (H₁): At least one meal category has a different satisfaction score.

The Kruskal-Wallis test is appropriate because satisfaction score is ordinal and the comparison involves more than two meal groups.

kruskal_meal <- kruskal.test(satisfaction_score ~ most_ordered_meal, data = peej)
kruskal_meal

    Kruskal-Wallis rank sum test

data:  satisfaction_score by most_ordered_meal
Kruskal-Wallis chi-squared = 7.1829, df = 4, p-value = 0.1265
kruskal_effect <- peej %>%
  kruskal_effsize(satisfaction_score ~ most_ordered_meal)

kruskal_effect
tibble(
  test = "Kruskal-Wallis test: Satisfaction across meals",
  statistic = round(unname(kruskal_meal$statistic), 3),
  p_value = round(kruskal_meal$p.value, 4),
  # kruskal_effsize() reports eta-squared based on the H statistic, not epsilon-squared
  effect_size_eta_squared = round(kruskal_effect$effsize, 3),
  effect_size_magnitude = kruskal_effect$magnitude
) %>%
  kable(caption = "Hypothesis Test 2 Result") %>%
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed"))
Hypothesis Test 2 Result
test | statistic | p_value | effect_size_eta_squared | effect_size_magnitude
Kruskal-Wallis test: Satisfaction across meals | 7.183 | 0.1265 | 0.034 | small

7.6 Interpretation of Test 2

The Kruskal-Wallis statistic is 7.183 (df = 4) with a p-value of 0.1265, which is above 0.05, so the null hypothesis is not rejected. The data do not provide strong evidence that satisfaction differs by meal type, and the effect size (0.034) is small. Customer satisfaction may therefore be driven more by general service factors such as taste, delivery, affordability, and portion size than by meal category alone. This result should be read with caution because some meal groups are very small (only 2 respondents chose Ogbono Soup, for example), which limits the ability of the test to detect differences.
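
The p-value and effect size for Test 2 can be recomputed from the printed statistic alone. The sketch below assumes that the reported effect size of 0.034 is the eta-squared based on the H statistic, (H - k + 1) / (n - k), which is the quantity documented for `rstatix::kruskal_effsize()`; epsilon-squared is shown only for comparison and gives a different number.

```r
h_statistic <- 7.1829  # Kruskal-Wallis chi-squared from the test output
k_groups <- 5          # meal categories
n_obs <- 100           # respondents

# p-value from the chi-square distribution with k - 1 degrees of freedom.
p_value <- pchisq(h_statistic, df = k_groups - 1, lower.tail = FALSE)

# Eta-squared based on H (assumed to be the reported effect size).
eta_sq_h <- (h_statistic - k_groups + 1) / (n_obs - k_groups)

# Epsilon-squared, a different effect size, for comparison only.
epsilon_sq <- h_statistic / (n_obs - 1)

round(c(p_value = p_value, eta_sq_h = eta_sq_h, epsilon_sq = epsilon_sq), 3)
# p_value 0.127, eta_sq_h 0.034, epsilon_sq 0.073
```

Both effect sizes point to a small difference between meals, so the business conclusion does not depend on which one is reported.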

8. Analysis Technique 4: Correlation Analysis

8.1 Brief theory recap

Correlation analysis measures the direction and strength of association between numeric variables. Because the Peej Kitchen survey variables are ordinal scores, Spearman correlation is used. Spearman correlation is suitable when variables are ranked or ordinal and when the relationship may not be perfectly linear.
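
To make the idea concrete, Spearman correlation is simply the ordinary (Pearson) correlation computed on the ranks of the data rather than on the raw values. The short sketch below demonstrates this on two made-up vectors of 1-to-5 scores; the numbers are illustrative only and are not taken from the survey.

```r
# Made-up ordinal scores for eight hypothetical respondents.
quality <- c(5, 4, 5, 3, 4, 5, 4, 3)
satisfaction <- c(5, 4, 4, 3, 4, 5, 3, 4)

# Built-in Spearman correlation.
rho_builtin <- cor(quality, satisfaction, method = "spearman")

# Same quantity: Pearson correlation of the (average) ranks.
rho_by_ranks <- cor(rank(quality), rank(satisfaction))

all.equal(rho_builtin, rho_by_ranks)
# TRUE
```

Because only the ordering of responses matters, the result does not depend on treating the gap between Good and Excellent as equal to the gap between Neutral and Good.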

8.2 Business justification

Correlation analysis helps Peej Kitchen identify which customer experience variables move together. For example, if taste/quality is strongly correlated with satisfaction, the business should treat meal quality as a priority. If affordability is strongly correlated with willingness to continue buying after a price increase, pricing strategy should be handled carefully.

numeric_data <- peej %>%
  select(
    satisfaction_score,
    quality_score,
    affordability_score,
    price_sensitivity_score,
    delivery_score,
    portion_score,
    order_frequency_score,
    income_score
  )

cor_matrix <- cor(numeric_data, use = "complete.obs", method = "spearman")
round(cor_matrix, 2)
                        satisfaction_score quality_score affordability_score
satisfaction_score                    1.00          0.60                0.32
quality_score                         0.60          1.00                0.30
affordability_score                   0.32          0.30                1.00
price_sensitivity_score               0.29          0.30                0.47
delivery_score                        0.45          0.59                0.48
portion_score                         0.41          0.28                0.33
order_frequency_score                 0.12          0.10                0.16
income_score                          0.36          0.33                0.05
                        price_sensitivity_score delivery_score portion_score
satisfaction_score                         0.29           0.45          0.41
quality_score                              0.30           0.59          0.28
affordability_score                        0.47           0.48          0.33
price_sensitivity_score                    1.00           0.29          0.41
delivery_score                             0.29           1.00          0.22
portion_score                              0.41           0.22          1.00
order_frequency_score                      0.17           0.18          0.17
income_score                               0.35           0.21          0.23
                        order_frequency_score income_score
satisfaction_score                       0.12         0.36
quality_score                            0.10         0.33
affordability_score                      0.16         0.05
price_sensitivity_score                  0.17         0.35
delivery_score                           0.18         0.21
portion_score                            0.17         0.23
order_frequency_score                    1.00         0.20
income_score                             0.20         1.00
corrplot(
  cor_matrix,
  method = "color",
  type = "upper",
  addCoef.col = "black",
  tl.col = "black",
  tl.srt = 45,
  number.cex = 0.7
)

# Flatten the correlation matrix into one row per variable pair
strong_correlations <- as.data.frame(as.table(cor_matrix)) %>%
  rename(variable_1 = Var1, variable_2 = Var2, correlation = Freq) %>%
  # Drop the diagonal (each variable paired with itself)
  filter(variable_1 != variable_2) %>%
  # Build an order-independent key so A-B and B-A count as one pair
  mutate(
    pair = map2_chr(
      as.character(variable_1), as.character(variable_2),
      ~ paste(sort(c(.x, .y)), collapse = "---")
    )
  ) %>%
  distinct(pair, .keep_all = TRUE) %>%
  # Rank pairs by strength regardless of sign
  mutate(abs_correlation = abs(correlation)) %>%
  arrange(desc(abs_correlation)) %>%
  select(variable_1, variable_2, correlation) %>%
  mutate(correlation = round(correlation, 3))

head(strong_correlations, 10) %>%
  kable(caption = "Top Correlations Among Customer Experience Variables") %>%
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed"))
Top Correlations Among Customer Experience Variables
variable_1               variable_2               correlation
quality_score            satisfaction_score             0.596
delivery_score           quality_score                  0.589
delivery_score           affordability_score            0.480
price_sensitivity_score  affordability_score            0.471
delivery_score           satisfaction_score             0.451
portion_score            satisfaction_score             0.408
portion_score            price_sensitivity_score        0.408
income_score             satisfaction_score             0.361
income_score             price_sensitivity_score        0.346
income_score             quality_score                  0.332

8.3 Plain-language interpretation

The correlation matrix shows how the customer experience scores move together. A positive correlation means that as one score increases, the other tends to increase, and every correlation in the matrix is positive. The strongest is between quality score and satisfaction score (0.60): customers who rate meal quality higher also tend to report higher overall satisfaction. Delivery score is almost as closely related to quality score (0.59) and moderately related to satisfaction (0.45), followed by portion score (0.41) and income score (0.36). Order frequency has only weak relationships with the other variables (0.10 to 0.20). However, correlation does not prove causation. To confirm causality, Peej Kitchen would need a stronger design, such as tracking satisfaction before and after a deliberate quality improvement or price change.
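
As a rough guide to which correlations in the matrix are distinguishable from zero, the sketch below applies the usual t-approximation for a correlation coefficient. It assumes that all 100 responses enter every pairwise correlation (an assumption; the matrix may have been computed on fewer complete pairs). The helper names `rho_to_t` and `critical_rho` are illustrative and are not part of the report's own code.

```r
# Sketch: t-approximation for testing whether a correlation differs from zero.
# Assumption: n = 100 complete pairs for every correlation in the matrix.
n  <- 100
df <- n - 2

# Convert a correlation into its approximate t statistic.
rho_to_t <- function(rho, df) rho * sqrt(df / (1 - rho^2))

# Smallest absolute correlation that is significant at the two-sided 5% level.
t_crit       <- qt(0.975, df)
critical_rho <- t_crit / sqrt(t_crit^2 + df)

# Selected correlations with satisfaction_score, copied from the matrix above.
rho <- c(quality = 0.596, delivery = 0.451, portion = 0.408,
         income = 0.361, order_frequency = 0.12)

result <- data.frame(
  rho         = rho,
  t_value     = round(rho_to_t(rho, df), 2),
  p_value     = signif(2 * pt(-abs(rho_to_t(rho, df)), df), 3),
  significant = abs(rho) > critical_rho
)

print(round(critical_rho, 3))
print(result)
```

Under these assumptions the threshold is roughly 0.20, so the correlation between order frequency and satisfaction (0.12) would not be distinguishable from zero, while the quality, delivery, portion, and income correlations would be. Because the survey scores contain many ties, the p-values should be read as approximate.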

9 Analysis Technique 5: Regression Analysis

9.1 Brief theory recap

Regression analysis estimates how a dependent variable changes when predictor variables change. In this study, linear regression is used to estimate how taste/quality, affordability, price sensitivity, delivery time, portion size, income, and order frequency are associated with overall satisfaction.

Although satisfaction is an ordinal survey score, it is treated as an approximate numeric outcome for this business analytics exercise. The model is interpreted cautiously as an explanatory tool rather than a perfect causal model.

9.2 Business justification

Regression is useful because Peej Kitchen needs to know which business factors are most strongly associated with satisfaction after accounting for other factors. This supports prioritisation. For example, if quality has the strongest positive coefficient, quality improvement should receive more management attention than less influential factors.

satisfaction_model <- lm(
  satisfaction_score ~ quality_score +
    affordability_score +
    price_sensitivity_score +
    delivery_score +
    portion_score +
    order_frequency_score +
    income_score,
  data = peej
)

summary(satisfaction_model)

Call:
lm(formula = satisfaction_score ~ quality_score + affordability_score + 
    price_sensitivity_score + delivery_score + portion_score + 
    order_frequency_score + income_score, data = peej)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.98115 -0.34210  0.03272  0.32318  1.82251 

Coefficients:
                         Estimate Std. Error t value Pr(>|t|)    
(Intercept)              1.468325   0.370825   3.960 0.000148 ***
quality_score            0.375905   0.097644   3.850 0.000218 ***
affordability_score      0.051828   0.071582   0.724 0.470878    
price_sensitivity_score -0.042315   0.054938  -0.770 0.443143    
delivery_score           0.121576   0.084974   1.431 0.155893    
portion_score            0.117663   0.062899   1.871 0.064570 .  
order_frequency_score   -0.002117   0.042524  -0.050 0.960395    
income_score             0.095901   0.042192   2.273 0.025355 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.4824 on 92 degrees of freedom
Multiple R-squared:  0.433, Adjusted R-squared:  0.3898 
F-statistic: 10.04 on 7 and 92 DF,  p-value: 2.907e-09
model_results <- tidy(satisfaction_model, conf.int = TRUE) %>%
  mutate(
    estimate = round(estimate, 3),
    std.error = round(std.error, 3),
    statistic = round(statistic, 3),
    p.value = round(p.value, 4),
    conf.low = round(conf.low, 3),
    conf.high = round(conf.high, 3)
  )

model_results %>%
  kable(caption = "Regression Coefficients Predicting Overall Satisfaction") %>%
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed"))
Regression Coefficients Predicting Overall Satisfaction
term                     estimate std.error statistic p.value conf.low conf.high
(Intercept)                 1.468     0.371     3.960  0.0001    0.732     2.205
quality_score               0.376     0.098     3.850  0.0002    0.182     0.570
affordability_score         0.052     0.072     0.724  0.4709   -0.090     0.194
price_sensitivity_score    -0.042     0.055    -0.770  0.4431   -0.151     0.067
delivery_score              0.122     0.085     1.431  0.1559   -0.047     0.290
portion_score               0.118     0.063     1.871  0.0646   -0.007     0.243
order_frequency_score      -0.002     0.043    -0.050  0.9604   -0.087     0.082
income_score                0.096     0.042     2.273  0.0254    0.012     0.180
glance(satisfaction_model) %>%
  select(r.squared, adj.r.squared, sigma, statistic, p.value, df, df.residual) %>%
  # Format the p-value before rounding so a very small value is not shown as 0
  mutate(p.value = format.pval(p.value, digits = 3)) %>%
  mutate(across(where(is.numeric), ~ round(.x, 4))) %>%
  kable(caption = "Regression Model Fit") %>%
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed"))
Regression Model Fit
r.squared adj.r.squared  sigma statistic  p.value df df.residual
    0.433        0.3898 0.4824   10.0353 2.91e-09  7          92
par(mfrow = c(2, 2))
plot(satisfaction_model)

par(mfrow = c(1, 1))
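
The diagnostic plots do not show how strongly the predictors overlap with one another, which matters here because delivery and quality scores correlate at about 0.59. One possible complement is the variance inflation factor (VIF). The sketch below computes it with base R only, on simulated stand-in data rather than the survey itself; the data frame `demo` and the helper `vif_base` are hypothetical names introduced for illustration.

```r
# Sketch: variance inflation factors (VIF) using base R only.
# `demo` is simulated stand-in data, NOT the Peej Kitchen survey.
set.seed(42)
n <- 100
quality  <- sample(1:5, n, replace = TRUE)
# Delivery is built to correlate with quality, mimicking the survey pattern.
delivery <- pmin(5, pmax(1, quality + sample(-1:1, n, replace = TRUE)))
portion  <- sample(1:5, n, replace = TRUE)
demo <- data.frame(quality, delivery, portion)

# VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing
# predictor j on all the other predictors.
vif_base <- function(predictors) {
  sapply(names(predictors), function(v) {
    others <- setdiff(names(predictors), v)
    fit <- lm(reformulate(others, response = v), data = predictors)
    1 / (1 - summary(fit)$r.squared)
  })
}

vif_values <- vif_base(demo)
print(round(vif_values, 2))
```

To apply the same idea to the fitted model, the seven predictor columns of `peej` could be passed to `vif_base()` instead of `demo`. A common rule of thumb treats values above about 5 as a sign that collinearity is inflating the standard errors.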

9.3 Plain-language interpretation

The regression output should be interpreted by focusing on the sign, size, and p-value of each coefficient.

  • A positive coefficient means that an increase in that factor is associated with higher satisfaction, holding the other factors constant.
  • A negative coefficient means that an increase in that factor is associated with lower satisfaction, holding the other factors constant.
  • A p-value below 0.05 suggests that the factor is statistically significant in this model.
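  • The R-squared value shows the proportion of variation in satisfaction explained by the predictors included in the model. Here it is 0.433 (adjusted 0.39), so the model explains about 43% of the variation in satisfaction.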
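
The quantities described in the list above can be cross-checked by hand from the printed regression summary. The sketch below uses only numbers copied from that output (the quality_score estimate and standard error, the residual degrees of freedom, and R-squared); the variable names are illustrative.

```r
# Sketch: re-deriving reported regression quantities from the printed output.
# All inputs are copied from the summary() output above.
estimate  <- 0.375905   # quality_score coefficient
std_error <- 0.097644   # its standard error
df_resid  <- 92         # residual degrees of freedom
n_pred    <- 7          # number of predictors
r_squared <- 0.433      # multiple R-squared (already rounded in the output)

# t statistic and two-sided p-value for the quality_score coefficient.
t_value <- estimate / std_error
p_value <- 2 * pt(-abs(t_value), df_resid)

# 95% confidence interval: estimate +/- critical t * standard error.
t_crit <- qt(0.975, df_resid)
ci <- estimate + c(-1, 1) * t_crit * std_error

# Adjusted R-squared and overall F statistic derived from R-squared.
n_obs <- df_resid + n_pred + 1
adj_r_squared <- 1 - (1 - r_squared) * (n_obs - 1) / df_resid
f_stat <- (r_squared / n_pred) / ((1 - r_squared) / df_resid)

print(round(c(t = t_value, p = p_value, ci_low = ci[1], ci_high = ci[2]), 4))
print(round(c(adj_r_squared = adj_r_squared, f = f_stat), 3))
```

The results should agree with the reported t value (3.850), confidence interval (0.182 to 0.570), adjusted R-squared, and F statistic up to rounding, since R-squared is entered here in its rounded form.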

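For managerial decision-making, the most important predictors are the ones with meaningful positive coefficients and statistically significant p-values. In this model, taste/quality is the clearest driver: each one-point increase in the quality score is associated with about 0.38 points higher satisfaction (p < 0.001), holding the other factors constant, so Peej Kitchen should maintain strict quality control, recipe consistency, and ingredient standards. Income is also significant, with a smaller coefficient (0.10, p = 0.025). Portion size is positive but only marginally significant (0.12, p = 0.065), while delivery time (p = 0.156), affordability, price sensitivity, and order frequency are not statistically significant once the other factors are accounted for. Dispatch timing and portion standardisation are therefore worth monitoring, but this sample does not show them to be independent drivers of satisfaction.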
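
Raw coefficients are only directly comparable when the predictors share a scale. If the scores use different ranges (for example, if income_score has more categories than the rating scales, which is an assumption here), one hedged way to compare predictors is to refit the model on standardised variables. The sketch below illustrates the idea on simulated stand-in data; `demo`, its columns, and the effect sizes are hypothetical.

```r
# Sketch: standardised (beta) coefficients for comparing predictors.
# `demo` is simulated stand-in data, NOT the Peej Kitchen survey.
set.seed(123)
n <- 100
demo <- data.frame(
  quality = sample(1:5, n, replace = TRUE),
  income  = sample(1:6, n, replace = TRUE)
)
demo$satisfaction <- 1.5 + 0.4 * demo$quality + 0.1 * demo$income +
  rnorm(n, sd = 0.5)

# Raw model: coefficients are per one-point change on each original scale.
raw_fit <- lm(satisfaction ~ quality + income, data = demo)

# Standardised model: every variable rescaled to mean 0 and SD 1,
# so coefficients are per one-standard-deviation change.
std_fit <- lm(satisfaction ~ quality + income,
              data = as.data.frame(scale(demo)))

raw_beta <- coef(raw_fit)[-1]
std_beta <- coef(std_fit)[-1]
print(round(cbind(raw = raw_beta, standardised = std_beta), 3))
```

Each standardised coefficient equals the raw coefficient multiplied by the predictor's standard deviation and divided by the outcome's standard deviation, so the ranking of predictors can change when their scales differ.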

10 Integrated Findings

The five analytical techniques work together to give a complete picture of Peej Kitchen’s customer experience.

EDA summarised the customer survey and identified data quality issues that needed correction before analysis. Visualisation showed the main demand and satisfaction patterns in an easy-to-understand way. Hypothesis testing provided formal statistical evidence about whether quality and meal type are associated with satisfaction. Correlation analysis showed which customer experience variables move together. Regression analysis then combined the main predictors into one model to estimate which factors best explain overall satisfaction.

The overall recommendation is that Peej Kitchen should treat taste and meal quality as its main competitive advantage, while also monitoring affordability, price sensitivity, delivery experience, and portion size. The business should continue promoting high-demand meals, especially the meals with strong order counts and high satisfaction, while improving any meal or service area that shows lower ratings.

A practical action plan is:

  1. Maintain quality consistency by standardising recipes, cooking process, and ingredient selection.
  2. Use Jollof Rice and other high-demand meals as flagship offerings in promotions.
  3. Avoid sudden price increases; if prices must increase, communicate the reason clearly and consider bundle options.
  4. Monitor delivery time and portion size: both correlate positively with satisfaction (0.45 and 0.41), although neither was statistically significant at the 5% level in the regression.
  5. Repeat this survey periodically to track changes in customer satisfaction and demand over time.

11 Limitations and Further Work

This study has some limitations. First, the sample size is 100, which is acceptable for the assignment but still limited for making broad conclusions about all potential customers. Second, the survey includes existing customers, potential customers, family, and friends, which may introduce response bias because some respondents may be more favourable toward the business. Third, most variables are ordinal categories, meaning that converting them to numeric scores requires judgement. Fourth, the analysis is based on self-reported survey responses rather than actual transaction records.

With more time and data, Peej Kitchen could improve the analysis by combining survey responses with actual sales records, delivery logs, repeat purchase history, and customer complaints. The business could also collect data over several months to study seasonality and repeat buying behaviour. A future study could use logistic regression to predict whether a customer is likely to continue buying after a price increase or use customer segmentation to identify different customer groups.
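
The logistic regression idea mentioned above could look roughly like the following. This is a minimal sketch on simulated data, not an analysis of the survey: the data frame `demo`, the yes/no outcome `will_continue`, and the effect sizes are all hypothetical.

```r
# Sketch: logistic regression for "would continue buying after a price increase".
# `demo` is simulated stand-in data, NOT the Peej Kitchen survey; the variable
# names and effect sizes are hypothetical.
set.seed(2026)
n <- 200
demo <- data.frame(
  quality       = sample(1:5, n, replace = TRUE),
  affordability = sample(1:5, n, replace = TRUE)
)
# Simulate a yes/no outcome whose log-odds rise with both ratings.
log_odds <- -3 + 0.6 * demo$quality + 0.4 * demo$affordability
demo$will_continue <- rbinom(n, size = 1, prob = plogis(log_odds))

# Fit the logistic model (binomial family, logit link by default).
continue_fit <- glm(will_continue ~ quality + affordability,
                    data = demo, family = binomial)

# Odds ratios: multiplicative change in the odds per one-point rating increase.
odds_ratios <- exp(coef(continue_fit))

# Predicted probability for a customer rating quality 5 and affordability 2.
new_customer <- data.frame(quality = 5, affordability = 2)
p_continue <- predict(continue_fit, newdata = new_customer, type = "response")

print(round(odds_ratios, 2))
print(round(p_continue, 2))
```

With real data, the price sensitivity question would first need to be recoded into a two-level outcome (for example, likely versus unlikely to continue), which is itself a judgement call.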

12 References

Adi, B. (2026). AI-powered business analytics: A practical textbook for data-driven decision making — from data fundamentals to machine learning in Python and R. Lagos Business School / markanalytics.online. https://markanalytics.online

Allaire, J. J., Teague, C., Scheidegger, C., Xie, Y., & Dervieux, C. (2022). Quarto (Version 1.x) [Computer software]. https://doi.org/10.5281/zenodo.5960048

Fagbemi, M. O. (2026). Peej Kitchen customer meal preference and satisfaction survey [Survey instrument and dataset]. Administered to existing customers, potential customers, family, and friends, May 2026. Ethical clearance: Respondents were informed that responses would be used anonymously for an MBA analytics assignment.

R Core Team. (2024). R: A language and environment for statistical computing (Version 4.x). R Foundation for Statistical Computing. https://www.R-project.org/

Wickham, H. (2016). ggplot2: Elegant graphics for data analysis. Springer. https://doi.org/10.1007/978-3-319-24277-4

Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L., François, R., Grolemund, G., Hayes, A., Henry, L., Hester, J., Kuhn, M., Pedersen, T. L., Miller, E., Bache, S. M., Müller, K., Ooms, J., Robinson, D., Seidel, D., Spinu, V., … Yutani, H. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686

13 Appendix: AI Usage Statement

I used ChatGPT as an AI coding and writing assistant to support the structure of the Quarto document, suggest R code for data cleaning and analysis, and improve the clarity of the business interpretation. I made the final analytical decisions, including the selection of the Peej Kitchen business problem, the choice of variables, the interpretation of outputs, and the business recommendations. The dataset was collected independently through a Google Forms survey administered to customers and people familiar with Peej Kitchen.

14 Appendix: Defence Preparation Notes

For the oral defence, I should be able to explain the following:

  1. Why this dataset is real primary data: It was collected through Google Forms from customers, potential customers, family, and friends familiar with Peej Kitchen.
  2. Why EDA was used: It helped me understand the structure, missing values, inconsistencies, and basic patterns in the survey data.
  3. Why visualisation was used: It helped communicate meal preference, satisfaction, affordability, and quality patterns clearly.
  4. Why hypothesis testing was used: It helped test whether relationships in the sample are statistically meaningful.
  5. Why Spearman correlation was used: The survey variables are ordinal scores, so Spearman is more appropriate than Pearson.
  6. Why regression was used: It helped estimate which customer experience factors predict overall satisfaction.
  7. Main business implication: Peej Kitchen should protect taste/quality as its main advantage while improving delivery, portion consistency, and price communication.
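
For point 5 above, it may help to show that Spearman's correlation is simply Pearson's correlation computed on ranks, which is why it suits ordinal scores. The sketch below uses two short made-up rating vectors, not survey data.

```r
# Sketch: Spearman's rho is Pearson's r computed on ranks.
# The two score vectors are made-up illustrative ratings, not survey data.
quality      <- c(5, 4, 4, 3, 5, 2, 4, 3, 5, 1)
satisfaction <- c(5, 4, 3, 3, 4, 2, 4, 2, 5, 2)

spearman_rho    <- cor(quality, satisfaction, method = "spearman")
pearson_on_rank <- cor(rank(quality), rank(satisfaction))
pearson_raw     <- cor(quality, satisfaction)

print(c(spearman = spearman_rho, pearson_on_ranks = pearson_on_rank,
        pearson_raw = pearson_raw))
```

The first two values are identical because tied scores receive average ranks in both calculations; the raw Pearson value can differ because it treats the gaps between rating categories as equal.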