Exploratory and Inferential Analysis of Customer Meal Preferences and Satisfaction at Peej Kitchen

Author

Fagbemi Mary Olapeju

Published

May 12, 2026

Case Study 1 — Exploratory & Inferential Analytics.
This report analyses primary survey data collected for Peej Kitchen to understand customer meal preference, satisfaction, price sensitivity, and operational improvement opportunities.

1. Executive Summary

Peej Kitchen is a home-made catering business that prepares and delivers Nigerian meals such as Jollof Rice, Fried Rice, Oha Soup, Egusi Soup, and Ogbono Soup. This case study uses primary survey data collected from 100 respondents in May 2026 to understand the factors influencing customer meal demand, meal preference, and overall satisfaction. The dataset includes customer income range, order frequency, most ordered meal, perceived affordability, likelihood of continued purchase after a price increase, taste/quality rating, delivery time rating, portion size rating, most important purchase factor, and overall satisfaction.

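The analysis applies five exploratory and inferential analytics techniques: exploratory data analysis, visualisation, hypothesis testing, correlation analysis, and regression analysis. The survey evidence points to three headline findings. First, Jollof Rice is the most ordered meal (61% of respondents), followed by Fried Rice (21%). Second, taste/quality is the most important purchase factor for 75% of respondents and has the strongest correlation with overall satisfaction (Spearman ρ ≈ 0.60). Third, satisfaction is high overall (92% Satisfied or Very Satisfied, none dissatisfied), but average ratings for affordability (3.38), delivery time (3.57), and portion size (3.57) trail the rating for taste/quality (4.52) on the 1-5 scale. The business recommendation is therefore to protect taste and quality as the core competitive advantage, promote high-demand meals such as Jollof Rice, monitor price sensitivity carefully, and improve delivery experience and portion consistency, where ratings are weaker.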

Management focus: The report is written for a non-technical business owner. Each analysis section therefore includes the technique used, why it matters for Peej Kitchen, the R output, and a plain-language business interpretation.

2. Professional Disclosure

I am Fagbemi Mary Olapeju, the owner of Peej Kitchen, a home-made catering business that prepares meals and delivers them safely and securely to customers. Peej Kitchen serves a variety of Nigerian meals including Jollof Rice, Fried Rice, Oha Soup, Egusi Soup, Ogbono Soup, and other home-made dishes.

The purpose of this analysis is to understand the factors that influence customer demand, meal preference, and customer satisfaction. As the owner of the business, I regularly make decisions about menu planning, pricing, meal quality, delivery experience, and customer retention. Therefore, the use of exploratory and inferential analytics is directly relevant to my day-to-day operations.

Exploratory Data Analysis (EDA). EDA is relevant because it helps me understand the structure of my customer survey data, identify missing values, detect inconsistent responses, and summarise key customer patterns such as most ordered meals, order frequency, and satisfaction levels. This supports better decisions about what customers are buying and where the business may need improvement.

Data Visualisation. Data visualisation is relevant because it allows me to communicate customer behaviour clearly. Charts showing meal preference, satisfaction levels, affordability perception, and key purchase factors can help me quickly identify which areas of the business need attention.

Hypothesis Testing. Hypothesis testing is relevant because it allows me to test whether observed patterns in the data are statistically meaningful. For example, I can test whether taste/quality is significantly associated with overall customer satisfaction and whether satisfaction differs across meal categories.

Correlation Analysis. Correlation analysis is relevant because it helps me understand the strength and direction of relationships between key customer experience variables such as taste, affordability, portion size, delivery time, price sensitivity, and satisfaction.

Regression Analysis. Regression analysis is relevant because it helps me estimate which factors are the strongest predictors of overall customer satisfaction. This supports better business decisions on where Peej Kitchen should focus improvement efforts.

3. Data Collection and Sampling

The primary dataset used for this study was collected through a customer survey designed by the owner of Peej Kitchen. The survey was administered using Google Forms and distributed mainly through WhatsApp and direct messages to customers.

The sampling frame consisted of existing customers of Peej Kitchen, potential customers who had tasted the meals, and family and friends who were familiar with the business. Responses were collected between 5 and 9 May 2026. A total of 100 responses were collected, which satisfies the minimum requirement of 100 observations for this case study.

The survey captured customer views on income range, order frequency, most ordered meal, affordability, likelihood of continued purchase if prices increase, taste/quality rating, delivery time, portion size, most important factor influencing meal choice, and overall satisfaction.

Participation was voluntary. Respondents were informed that their responses would be used anonymously for an MBA Data Analytics assignment. No personally identifiable information was used in the analysis, and the results are presented only in aggregated form. The dataset is treated as primary survey data collected directly from customers and people familiar with Peej Kitchen.

4. Data Description

This section loads the data, standardises the column names, cleans inconsistent categories, and creates numeric scores from ordered survey responses. These numeric scores make it possible to run correlation and regression analysis.

# Uncomment and run once if any package is missing:
# install.packages(c("tidyverse", "readxl", "janitor", "skimr", "naniar", "corrplot", "broom", "effectsize", "rstatix", "knitr", "kableExtra", "patchwork", "scales"))

library(tidyverse)
library(readxl)
library(janitor)
library(skimr)
library(naniar)
library(corrplot)
library(broom)
library(effectsize)
library(rstatix)
library(knitr)
library(kableExtra)
library(patchwork)
library(scales)

theme_set(
  theme_minimal(base_size = 12) +
    theme(
      plot.title = element_text(face = "bold", colour = "#1f4e79"),
      plot.subtitle = element_text(colour = "#5f6b77"),
      axis.title = element_text(face = "bold"),
      panel.grid.minor = element_blank()
    )
)

peej_raw <- read_excel("data/peej_kitchen_responses.xlsx")

glimpse(peej_raw)

Rows: 100
Columns: 11
$ Timestamp                                                               <dttm> …
$ `1. What is your monthly income range?`                                 <chr> …
$ `2. How often do you buy home-made meals from peej Kitchen?`            <chr> …
$ `3. Which of the following meals do you order most often?`              <chr> …
$ `4. How would you rate the affordability of our meals?`                 <chr> …
$ `5. If the price increases, how likely are you to continue buying?`     <chr> …
$ `6. How would you rate the taste/quality of the meals?`                 <chr> …
$ `7. How would you rate delivery time?`                                  <chr> …
$ `8. How would you rate portion size?`                                   <chr> …
$ `9. What is the MOST important factor influencing your choice of meal?` <chr> …
$ `10. How satisfied are you overall with our meals?`                     <chr> …

peej <- peej_raw %>%
  clean_names() %>%
  rename(
    income_range = x1_what_is_your_monthly_income_range,
    order_frequency = x2_how_often_do_you_buy_home_made_meals_from_peej_kitchen,
    most_ordered_meal = x3_which_of_the_following_meals_do_you_order_most_often,
    affordability = x4_how_would_you_rate_the_affordability_of_our_meals,
    price_increase_likelihood = x5_if_the_price_increases_how_likely_are_you_to_continue_buying,
    taste_quality = x6_how_would_you_rate_the_taste_quality_of_the_meals,
    delivery_time = x7_how_would_you_rate_delivery_time,
    portion_size = x8_how_would_you_rate_portion_size,
    most_important_factor = x9_what_is_the_most_important_factor_influencing_your_choice_of_meal,
    overall_satisfaction = x10_how_satisfied_are_you_overall_with_our_meals
  ) %>%
  mutate(
    across(where(is.character), ~ str_squish(.x)),
    order_frequency = case_when(
      order_frequency %in% c("1-2 Monthly", "1-2 times Monthly") ~ "1-2 times Monthly",
      order_frequency == "1–2 times a week" ~ "1-2 times a week",
      order_frequency == "3–5 times a week" ~ "3-5 times a week",
      TRUE ~ order_frequency
    ),
    taste_quality = case_when(
      taste_quality == "Neural" ~ "Neutral",
      TRUE ~ taste_quality
    ),
    affordability = case_when(
      affordability == "affordable" ~ "Affordable",
      TRUE ~ affordability
    ),
    portion_size = case_when(
      portion_size == "very Satisfying" ~ "Very Satisfying",
      TRUE ~ portion_size
    ),
    timestamp = case_when(
      inherits(timestamp, "POSIXct") ~ as.POSIXct(timestamp),
      inherits(timestamp, "Date") ~ as.POSIXct(timestamp),
      is.numeric(timestamp) ~ as.POSIXct((as.numeric(timestamp) - 25569) * 86400, origin = "1970-01-01", tz = "Africa/Lagos"),
      TRUE ~ suppressWarnings(as.POSIXct(timestamp, tz = "Africa/Lagos"))
    ),
    response_date = as.Date(timestamp)
  ) %>%
  mutate(
    satisfaction_score = case_when(
      overall_satisfaction == "Very Dissatisfied" ~ 1,
      overall_satisfaction == "Dissatisfied" ~ 2,
      overall_satisfaction == "Neutral" ~ 3,
      overall_satisfaction == "Satisfied" ~ 4,
      overall_satisfaction == "Very Satisfied" ~ 5,
      TRUE ~ NA_real_
    ),
    quality_score = case_when(
      taste_quality == "Poor" ~ 1,
      taste_quality == "Fair" ~ 2,
      taste_quality == "Neutral" ~ 3,
      taste_quality == "Good" ~ 4,
      taste_quality == "Excellent" ~ 5,
      TRUE ~ NA_real_
    ),
    affordability_score = case_when(
      affordability == "Very Expensive" ~ 1,
      affordability == "Expensive" ~ 2,
      affordability == "Moderate" ~ 3,
      affordability == "Affordable" ~ 4,
      affordability == "Very Affordable" ~ 5,
      TRUE ~ NA_real_
    ),
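    # Note on direction: a HIGHER score means the customer is MORE likely to keep
    # buying after a price increase, i.e. lower price sensitivity.
    price_sensitivity_score = case_when(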
      price_increase_likelihood == "Very unlikely" ~ 1,
      price_increase_likelihood == "Unlikely" ~ 2,
      price_increase_likelihood == "Neutral" ~ 3,
      price_increase_likelihood == "Likely" ~ 4,
      price_increase_likelihood == "Very likely" ~ 5,
      TRUE ~ NA_real_
    ),
    delivery_score = case_when(
      delivery_time == "Very Slow" ~ 1,
      delivery_time == "Slow" ~ 2,
      delivery_time == "Neutral" ~ 3,
      delivery_time == "Fast" ~ 4,
      delivery_time == "Very fast" ~ 5,
      TRUE ~ NA_real_
    ),
    portion_score = case_when(
      portion_size == "Too small" ~ 1,
      portion_size == "Small" ~ 2,
      portion_size == "Neutral" ~ 3,
      portion_size == "Satisfying" ~ 4,
      portion_size == "Very Satisfying" ~ 5,
      TRUE ~ NA_real_
    ),
    order_frequency_score = case_when(
      order_frequency == "Occasionally" ~ 1,
      order_frequency == "1-2 times Monthly" ~ 2,
      order_frequency == "1-2 times a week" ~ 3,
      order_frequency == "3-5 times a week" ~ 4,
      TRUE ~ NA_real_
    ),
    income_score = case_when(
      income_range == "Below ₦50,000" ~ 1,
      income_range == "₦50,000 – ₦100,000" ~ 2,
      income_range == "₦100,000 – ₦200,000" ~ 3,
      income_range == "Above ₦200,000" ~ 4,
      TRUE ~ NA_real_
    )
  )

variable_description <- tibble::tribble(
  ~Variable, ~Type, ~BusinessMeaning,
  "response_date", "Date", "Date the survey response was submitted",
  "income_range", "Categorical / ordinal", "Customer monthly income group",
  "order_frequency", "Categorical / ordinal", "How often customers buy home-made meals from Peej Kitchen",
  "most_ordered_meal", "Categorical", "Meal ordered most often by the respondent",
  "affordability", "Categorical / ordinal", "Customer perception of meal affordability",
  "price_increase_likelihood", "Categorical / ordinal", "Likelihood of continued purchase after price increase",
  "taste_quality", "Categorical / ordinal", "Customer rating of taste and meal quality",
  "delivery_time", "Categorical / ordinal", "Customer rating of delivery time",
  "portion_size", "Categorical / ordinal", "Customer rating of meal portion size",
  "most_important_factor", "Categorical", "Main factor influencing meal choice",
  "overall_satisfaction", "Categorical / ordinal", "Overall customer satisfaction rating"
)

variable_description %>%
  kable(caption = "Description of Variables in the Peej Kitchen Survey") %>%
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed"))

Description of Variables in the Peej Kitchen Survey
Variable	Type	BusinessMeaning
response_date	Date	Date the survey response was submitted
income_range	Categorical / ordinal	Customer monthly income group
order_frequency	Categorical / ordinal	How often customers buy home-made meals from Peej Kitchen
most_ordered_meal	Categorical	Meal ordered most often by the respondent
affordability	Categorical / ordinal	Customer perception of meal affordability
price_increase_likelihood	Categorical / ordinal	Likelihood of continued purchase after price increase
taste_quality	Categorical / ordinal	Customer rating of taste and meal quality
delivery_time	Categorical / ordinal	Customer rating of delivery time
portion_size	Categorical / ordinal	Customer rating of meal portion size
most_important_factor	Categorical	Main factor influencing meal choice
overall_satisfaction	Categorical / ordinal	Overall customer satisfaction rating

tibble(
  metric = c(
    "Number of observations",
    "Number of variables",
    "Earliest response date",
    "Latest response date"
  ),
  value = c(
    as.character(nrow(peej)),
    as.character(ncol(peej_raw)),
    format(min(peej$response_date, na.rm = TRUE), "%d %B %Y"),
    format(max(peej$response_date, na.rm = TRUE), "%d %B %Y")
  )
) %>%
  kable(caption = "Dataset Size and Survey Period") %>%
  kable_styling(
    full_width = FALSE,
    bootstrap_options = c("striped", "hover", "condensed")
  )

Dataset Size and Survey Period
metric	value
Number of observations	100
Number of variables	11
Earliest response date	05 May 2026
Latest response date	09 May 2026

5. Analysis Technique 1: Exploratory Data Analysis

5.1 Brief theory recap

Exploratory Data Analysis is the process of examining the dataset before formal modelling. It helps the analyst understand data structure, missing values, data quality problems, distributions, and early patterns. For Peej Kitchen, EDA helps answer basic operational questions such as: Which meals are most ordered? How satisfied are customers? Are there inconsistent responses that need cleaning before analysis?

5.2 Business justification

Peej Kitchen needs to understand the current state of customer demand and satisfaction before making decisions about pricing, promotion, menu improvement, and delivery operations. EDA is appropriate because this is survey data with a mix of categorical and ordinal variables.

skim(peej)

Data summary
Name	peej
Number of rows	100
Number of columns	20
_______________________
Column type frequency:
character	10
Date	1
numeric	8
POSIXct	1
________________________
Group variables	None

Variable type: character

skim_variable	complete_rate	min	max	n_unique
income_range	1	13	19	4
order_frequency	1	12	17	4
most_ordered_meal	1	8	11	5
affordability	1	8	15	5
price_increase_likelihood	1	6	13	5
taste_quality	1	4	9	3
delivery_time	1	4	9	5
portion_size	1	5	15	5
most_important_factor	1	5	13	5
overall_satisfaction	1	7	14	3

Variable type: Date

skim_variable	n_missing	complete_rate	min	max	median	n_unique
response_date	0	1	2026-05-05	2026-05-09	2026-05-06	5

Variable type: numeric

skim_variable	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
satisfaction_score	1	4.32	0.62	3	4	4	5	5	▁▁▇▁▆
quality_score	1	4.52	0.63	3	4	5	5	5	▁▁▅▁▇
affordability_score	1	3.38	0.86	1	3	3	4	5	▁▁▇▅▂
price_sensitivity_score	1	3.35	1.10	1	3	3	4	5	▂▂▇▇▃
delivery_score	1	3.57	0.74	1	3	4	4	5	▁▁▇▇▂
portion_score	1	3.57	0.91	1	3	4	4	5	▁▂▃▇▂
order_frequency_score	1	2.08	1.18	1	1	2	3	4	▇▂▁▅▂
income_score	1	2.81	1.29	1	1	3	4	4	▅▂▁▂▇

Variable type: POSIXct

skim_variable	n_missing	complete_rate	min	max	median	n_unique
timestamp	0	1	2026-05-05 19:18:29	2026-05-09 09:45:57	2026-05-06 02:54:39	100

miss_var_summary(peej) %>%
  kable(caption = "Missing Value Summary") %>%
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed"))

Missing Value Summary
variable	n_miss	pct_miss
timestamp	0	0
income_range	0	0
order_frequency	0	0
most_ordered_meal	0	0
affordability	0	0
price_increase_likelihood	0	0
taste_quality	0	0
delivery_time	0	0
portion_size	0	0
most_important_factor	0	0
overall_satisfaction	0	0
response_date	0	0
satisfaction_score	0	0
quality_score	0	0
affordability_score	0	0
price_sensitivity_score	0	0
delivery_score	0	0
portion_score	0	0
order_frequency_score	0	0
income_score	0	0

meal_counts <- peej %>%
  count(most_ordered_meal, sort = TRUE) %>%
  mutate(percent = n / sum(n))

factor_counts <- peej %>%
  count(most_important_factor, sort = TRUE) %>%
  mutate(percent = n / sum(n))

satisfaction_counts <- peej %>%
  count(overall_satisfaction, sort = TRUE) %>%
  mutate(percent = n / sum(n))

meal_counts %>%
  mutate(percent = percent(percent, accuracy = 0.1)) %>%
  kable(caption = "Most Ordered Meals") %>%
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed"))

Most Ordered Meals
most_ordered_meal	n	percent
Jollof Rice	61	61.0%
Fried Rice	21	21.0%
Egwusi Soup	9	9.0%
Oha Soup	7	7.0%
Ogbono Soup	2	2.0%

factor_counts %>%
  mutate(percent = percent(percent, accuracy = 0.1)) %>%
  kable(caption = "Most Important Factors Influencing Meal Choice") %>%
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed"))

Most Important Factors Influencing Meal Choice
most_important_factor	n	percent
Taste/Quality	75	75.0%
Price	9	9.0%
Variety	8	8.0%
Portion Size	7	7.0%
Delivery Time	1	1.0%

satisfaction_counts %>%
  mutate(percent = percent(percent, accuracy = 0.1)) %>%
  kable(caption = "Overall Satisfaction Distribution") %>%
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed"))

Overall Satisfaction Distribution
overall_satisfaction	n	percent
Satisfied	52	52.0%
Very Satisfied	40	40.0%
Neutral	8	8.0%

peej %>%
  summarise(
    average_satisfaction = mean(satisfaction_score, na.rm = TRUE),
    average_quality = mean(quality_score, na.rm = TRUE),
    average_affordability = mean(affordability_score, na.rm = TRUE),
    average_price_continuity = mean(price_sensitivity_score, na.rm = TRUE),
    average_delivery = mean(delivery_score, na.rm = TRUE),
    average_portion = mean(portion_score, na.rm = TRUE),
    average_order_frequency = mean(order_frequency_score, na.rm = TRUE)
  ) %>%
  pivot_longer(everything(), names_to = "metric", values_to = "average_score") %>%
  mutate(average_score = round(average_score, 2)) %>%
  kable(caption = "Average Scores for Ordinal Survey Variables") %>%
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed"))

Average Scores for Ordinal Survey Variables
metric	average_score
average_satisfaction	4.32
average_quality	4.52
average_affordability	3.38
average_price_continuity	3.35
average_delivery	3.57
average_portion	3.57
average_order_frequency	2.08

5.3 Data quality issues and treatment

The EDA revealed the following data quality issues:

Some response categories were written inconsistently. For example, 1-2 Monthly and 1-2 times Monthly both describe the same order frequency category, and some weekly categories used an en dash (1–2 times a week) instead of a hyphen. Each set of variants was standardised into one category.
Some text responses contained spelling or capitalisation inconsistencies. For example, Neural was corrected to Neutral, affordable was corrected to Affordable, and very Satisfying was corrected to Very Satisfying.
The survey variables were mainly categorical or ordinal. For correlation and regression analysis, ordered categories were converted into numeric scores in a transparent way.
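
The score conversion described above can also be sketched with an ordered set of factor levels, which makes unmatched labels easy to spot. This is a minimal illustration in base R, not the method used in this report (which uses case_when); the responses vector is hypothetical and deliberately contains one uncleaned label.

```r
# Ordered response labels, lowest to highest (the survey's satisfaction scale)
satisfaction_levels <- c("Very Dissatisfied", "Dissatisfied", "Neutral",
                         "Satisfied", "Very Satisfied")

# Hypothetical responses (not the survey data), including one uncleaned label
responses <- c("Satisfied", "Very Satisfied", "Neutral", "satisfied")

# The factor's integer codes are the 1-5 scores; unknown labels become NA
satisfaction_score <- as.integer(factor(responses, levels = satisfaction_levels))
satisfaction_score
# 4 5 3 NA

# Labels that failed to match any level, so they can be cleaned first
unmatched <- setdiff(unique(responses), satisfaction_levels)
unmatched
# "satisfied"
```

Any NA in the score column then signals a label that still needs cleaning, which is the same safeguard the TRUE ~ NA_real_ branch provides in the report's own code.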

5.4 Plain-language interpretation

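The EDA gives Peej Kitchen a clear first picture of customer behaviour. Jollof Rice dominates demand: 61% of respondents order it most often, followed by Fried Rice (21%), while the three soups together account for 18%. Taste/quality is the most important factor in meal choice for 75% of respondents; price (9%), variety (8%), portion size (7%), and delivery time (1%) are far behind. Satisfaction is high: 52% are Satisfied, 40% are Very Satisfied, 8% are Neutral, and no respondent reported dissatisfaction. On the 1-5 scale, taste/quality has the highest average score (4.52), while delivery time (3.57), portion size (3.57), affordability (3.38), and willingness to keep buying after a price increase (3.35) are noticeably lower. These results create the foundation for the visualisation, hypothesis testing, correlation, and regression sections.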

6. Analysis Technique 2: Data Visualisation

6.1 Brief theory recap

Data visualisation uses charts to communicate patterns in a dataset. The goal is not only to produce graphs but to tell a clear business story. For this case, the visualisation story focuses on demand, satisfaction, and the customer experience factors that may influence repeat purchases.

6.2 Business justification

For a catering business, charts are useful because they quickly show which meals are popular, what customers value, and where operational weaknesses may exist. Visualisations are especially helpful for making decisions about menu focus, quality control, pricing, and delivery improvement.

p1 <- ggplot(peej, aes(x = fct_infreq(most_ordered_meal))) +
  geom_bar(fill = "#2C7FB8") +
  coord_flip() +
  labs(
    title = "Most Frequently Ordered Meals",
    x = "Meal",
    y = "Number of Respondents"
  )

p2 <- ggplot(peej, aes(x = fct_infreq(most_important_factor))) +
  geom_bar(fill = "#41AB5D") +
  coord_flip() +
  labs(
    title = "Main Factor Influencing Meal Choice",
    x = "Factor",
    y = "Number of Respondents"
  )

p3 <- ggplot(peej, aes(x = overall_satisfaction)) +
  geom_bar(fill = "#F16913") +
  labs(
    title = "Overall Customer Satisfaction",
    x = "Satisfaction Level",
    y = "Number of Respondents"
  )

# Order the bars by the ordinal score rather than alphabetically
p4 <- ggplot(peej, aes(x = fct_reorder(affordability, affordability_score))) +
  geom_bar(fill = "#756BB1") +
  coord_flip() +
  labs(
    title = "Perceived Affordability of Meals",
    x = "Affordability Rating",
    y = "Number of Respondents"
  )

p5 <- ggplot(peej, aes(x = quality_score, y = satisfaction_score)) +
  geom_jitter(width = 0.15, height = 0.15, alpha = 0.55, colour = "#08519C") +
  geom_smooth(method = "lm", se = TRUE, colour = "#DE2D26") +
  labs(
    title = "Taste/Quality and Satisfaction",
    x = "Taste/Quality Score",
    y = "Satisfaction Score"
  )

(p1 | p2) / (p3 | p4) / p5 +
  plot_annotation(title = "Customer Demand and Satisfaction Story for Peej Kitchen")

ggplot(peej, aes(x = fct_reorder(most_ordered_meal, satisfaction_score, .fun = median), y = satisfaction_score)) +
  geom_boxplot(fill = "#9ECAE1") +
  coord_flip() +
  labs(
    title = "Satisfaction Score by Most Ordered Meal",
    x = "Meal",
    y = "Satisfaction Score"
  )

6.3 Plain-language interpretation

The visualisations help Peej Kitchen understand the demand story. The meal chart identifies the most frequently ordered meals, while the purchase-factor chart shows the main reasons customers choose meals. The satisfaction and affordability charts show how customers feel about the business overall. The relationship plot between taste/quality and satisfaction is especially important because it visually indicates whether customers who rate quality highly also tend to report higher satisfaction.

7. Analysis Technique 3: Hypothesis Testing

7.1 Brief theory recap

Hypothesis testing is used to determine whether an observed relationship or difference in the sample is likely to reflect a real pattern in the wider customer population. The null hypothesis usually states that there is no relationship or no difference, while the alternative hypothesis states that a relationship or difference exists.

7.2 Business justification

Peej Kitchen should not rely only on visual impressions. Hypothesis testing helps the business determine whether customer satisfaction is statistically associated with quality or whether satisfaction differs across meals. This supports more evidence-based decisions about quality improvement and menu strategy.

7.3 Hypothesis Test 1: Taste/Quality and Overall Satisfaction

Null hypothesis (H₀): Taste/quality rating and overall satisfaction are independent.

Alternative hypothesis (H₁): Taste/quality rating and overall satisfaction are associated.

Because both variables are categorical/ordinal, a chi-square test of independence is appropriate. If any expected cell count is below 5, the usual chi-square approximation becomes unreliable, so a Monte Carlo simulated p-value (10,000 replicates) is used instead.

quality_satisfaction_table <- table(peej$taste_quality, peej$overall_satisfaction)
quality_satisfaction_table

           
            Neutral Satisfied Very Satisfied
  Excellent       0        22             37
  Good            5        27              2
  Neutral         3         3              1

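# Run the standard test once only to inspect the expected counts. The
# small-count warning is expected here and is handled by the simulated
# p-value below.
chisq_initial <- suppressWarnings(chisq.test(quality_satisfaction_table))
use_simulated_p <- any(chisq_initial$expected < 5)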

chisq_quality <- chisq.test(
  quality_satisfaction_table,
  simulate.p.value = use_simulated_p,
  B = 10000
)

chisq_quality


    Pearson's Chi-squared test with simulated p-value (based on 10000
    replicates)

data:  quality_satisfaction_table
X-squared = 43.404, df = NA, p-value = 9.999e-05
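
To see why the simulated p-value was triggered, the expected counts can be recomputed by hand from the cross-tabulation reported above. The sketch below uses base R only and retypes the observed counts; it is a check on the arithmetic, not a replacement for chisq.test().

```r
# Observed counts copied from the cross-tabulation reported above
observed <- matrix(
  c(0, 22, 37,
    5, 27,  2,
    3,  3,  1),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    taste_quality = c("Excellent", "Good", "Neutral"),
    overall_satisfaction = c("Neutral", "Satisfied", "Very Satisfied")
  )
)

# Expected count under independence = row total * column total / n
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
round(expected, 2)

# Number of cells below the usual threshold of 5
cells_below_5 <- sum(expected < 5)
cells_below_5
# 5 of the 9 cells

# Pearson chi-square statistic recomputed by hand
chi_sq <- sum((observed - expected)^2 / expected)
round(chi_sq, 3)
# 43.404, matching the reported statistic
```

Five of the nine cells have expected counts below 5, which is why the code falls back on the simulated p-value.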

cramers_v_result <- cramers_v(quality_satisfaction_table)
cramers_v_result

tibble(
  test = "Chi-square test: Taste/Quality vs Overall Satisfaction",
  statistic = round(unname(chisq_quality$statistic), 3),
  p_value = round(chisq_quality$p.value, 4),
  simulated_p_value_used = use_simulated_p,
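  # Take the first column (the effect size estimate) by position, so the code
  # does not depend on the exact column name returned by cramers_v()
  cramers_v = round(as.numeric(cramers_v_result[[1]]), 3)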
) %>%
  kable(caption = "Hypothesis Test 1 Result") %>%
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed"))

Hypothesis Test 1 Result
test	statistic	p_value	simulated_p_value_used	cramers_v
Chi-square test: Taste/Quality vs Overall Satisfaction	43.404	1e-04	TRUE	0.448

7.4 Interpretation of Test 1

The simulated p-value is approximately 0.0001, well below 0.05, so the null hypothesis of independence is rejected: taste/quality and overall satisfaction are significantly associated. Cramér's V of 0.448 indicates a moderately strong association. The cross-tabulation shows the pattern clearly: 37 of the 59 respondents who rated taste/quality as Excellent (63%) were Very Satisfied, compared with only 2 of the 34 who rated it Good (6%). In business terms, customers' perception of taste and quality is not just a casual opinion; it is meaningfully connected to how satisfied they are with Peej Kitchen.
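
The reported Cramér's V of 0.448 can be reproduced by hand from the chi-square statistic. The sketch below computes both the classical formula and a bias-corrected version; the reported value matches the bias-corrected one. That cramers_v() applies this correction by default is an inference from the arithmetic, so treat it as an assumption to confirm against the installed package version.

```r
chi_sq <- 43.404  # Pearson chi-square reported above
n <- 100          # respondents
r <- 3            # taste/quality categories observed
k <- 3            # satisfaction categories observed

# Classical Cramer's V
v_unadjusted <- sqrt(chi_sq / (n * min(r - 1, k - 1)))

# Bias-corrected Cramer's V: shrink phi-squared and the table dimensions
phi2_adj <- max(0, chi_sq / n - (r - 1) * (k - 1) / (n - 1))
r_adj <- r - (r - 1)^2 / (n - 1)
k_adj <- k - (k - 1)^2 / (n - 1)
v_adjusted <- sqrt(phi2_adj / min(r_adj - 1, k_adj - 1))

round(c(unadjusted = v_unadjusted, adjusted = v_adjusted), 3)
# unadjusted 0.466, adjusted 0.448
```

Either version leads to the same business reading: the association between taste/quality and satisfaction is substantial.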

7.5 Hypothesis Test 2: Satisfaction Across Meal Categories

Null hypothesis (H₀): Overall satisfaction scores are the same across meal categories.

Alternative hypothesis (H₁): At least one meal category has a different satisfaction score.

The Kruskal-Wallis test is appropriate because satisfaction score is ordinal and the comparison involves more than two meal groups.

kruskal_meal <- kruskal.test(satisfaction_score ~ most_ordered_meal, data = peej)
kruskal_meal


    Kruskal-Wallis rank sum test

data:  satisfaction_score by most_ordered_meal
Kruskal-Wallis chi-squared = 7.1829, df = 4, p-value = 0.1265

kruskal_effect <- peej %>%
  kruskal_effsize(satisfaction_score ~ most_ordered_meal)

kruskal_effect

tibble(
  test = "Kruskal-Wallis test: Satisfaction across meals",
  statistic = round(unname(kruskal_meal$statistic), 3),
  p_value = round(kruskal_meal$p.value, 4),
  # kruskal_effsize() reports eta-squared based on the H statistic, not epsilon-squared
  effect_size_eta_squared_h = round(kruskal_effect$effsize, 3),
  effect_size_magnitude = kruskal_effect$magnitude
) %>%
  kable(caption = "Hypothesis Test 2 Result") %>%
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed"))

Hypothesis Test 2 Result
test	statistic	p_value	effect_size_eta_squared_h	effect_size_magnitude
Kruskal-Wallis test: Satisfaction across meals	7.183	0.1265	0.034	small

7.6 Interpretation of Test 2

The p-value is 0.1265, which is above 0.05, so the null hypothesis is not rejected: the data does not provide strong evidence that satisfaction differs by meal type, and the effect size (0.034) is small. This suggests that customer satisfaction may be driven more by general service factors such as taste, delivery, affordability, and portion size than by meal category alone. The result should be read with some caution, because several meal groups are small (Ogbono Soup 2, Oha Soup 7, Egwusi Soup 9 respondents), which limits the test's ability to detect real differences.
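
Both figures in the Test 2 result table can be checked from the reported H statistic alone. The sketch below assumes the effect size is eta-squared based on H, computed as (H - k + 1) / (n - k); that formula reproduces the reported 0.034.

```r
h_statistic <- 7.1829  # Kruskal-Wallis chi-squared reported above
k_groups <- 5          # meal categories
n_obs <- 100           # respondents

# p-value: upper tail of a chi-square distribution with k - 1 degrees of freedom
p_value <- pchisq(h_statistic, df = k_groups - 1, lower.tail = FALSE)

# Eta-squared based on the H statistic
eta_squared_h <- (h_statistic - k_groups + 1) / (n_obs - k_groups)

round(c(p_value = p_value, eta_squared_h = eta_squared_h), 4)
# p_value 0.1265, eta_squared_h 0.0335
```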

8. Analysis Technique 4: Correlation Analysis

8.1 Brief theory recap

Correlation analysis measures the direction and strength of association between numeric variables. Because the Peej Kitchen survey variables are ordinal scores, Spearman correlation is used. Spearman correlation is suitable when variables are ranked or ordinal and when the relationship may not be perfectly linear.
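
As a small illustration of what Spearman correlation does, it is the ordinary Pearson correlation applied to ranks, with tied ratings sharing an average rank. The ratings below are hypothetical and are not taken from the survey.

```r
# Hypothetical 1-5 ratings for eight respondents (not the survey data)
quality      <- c(5, 4, 5, 3, 4, 5, 3, 4)
satisfaction <- c(5, 4, 4, 3, 4, 5, 4, 3)

# Built-in Spearman correlation
rho_builtin <- cor(quality, satisfaction, method = "spearman")

# Same value by hand: Pearson correlation of the ranks
# (rank() gives tied values their average rank by default)
rho_by_hand <- cor(rank(quality), rank(satisfaction))

c(builtin = rho_builtin, by_hand = rho_by_hand)
```

Because only the ordering of the ratings matters, the result does not depend on treating the gap between "Good" and "Excellent" as equal to the gap between "Fair" and "Good".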

8.2 Business justification

Correlation analysis helps Peej Kitchen identify which customer experience variables move together. For example, if taste/quality is strongly correlated with satisfaction, the business should treat meal quality as a priority. If affordability is strongly correlated with willingness to continue buying after a price increase, pricing strategy should be handled carefully.

numeric_data <- peej %>%
  select(
    satisfaction_score,
    quality_score,
    affordability_score,
    price_sensitivity_score,
    delivery_score,
    portion_score,
    order_frequency_score,
    income_score
  )

cor_matrix <- cor(numeric_data, use = "complete.obs", method = "spearman")
round(cor_matrix, 2)

                        satisfaction_score quality_score affordability_score
satisfaction_score                    1.00          0.60                0.32
quality_score                         0.60          1.00                0.30
affordability_score                   0.32          0.30                1.00
price_sensitivity_score               0.29          0.30                0.47
delivery_score                        0.45          0.59                0.48
portion_score                         0.41          0.28                0.33
order_frequency_score                 0.12          0.10                0.16
income_score                          0.36          0.33                0.05
                        price_sensitivity_score delivery_score portion_score
satisfaction_score                         0.29           0.45          0.41
quality_score                              0.30           0.59          0.28
affordability_score                        0.47           0.48          0.33
price_sensitivity_score                    1.00           0.29          0.41
delivery_score                             0.29           1.00          0.22
portion_score                              0.41           0.22          1.00
order_frequency_score                      0.17           0.18          0.17
income_score                               0.35           0.21          0.23
                        order_frequency_score income_score
satisfaction_score                       0.12         0.36
quality_score                            0.10         0.33
affordability_score                      0.16         0.05
price_sensitivity_score                  0.17         0.35
delivery_score                           0.18         0.21
portion_score                            0.17         0.23
order_frequency_score                    1.00         0.20
income_score                             0.20         1.00

corrplot(
  cor_matrix,
  method = "color",
  type = "upper",
  addCoef.col = "black",
  tl.col = "black",
  tl.srt = 45,
  number.cex = 0.7
)

# Flatten the matrix to one row per variable pair, drop self-pairs, and keep each pair once
strong_correlations <- as.data.frame(as.table(cor_matrix)) %>%
  rename(variable_1 = Var1, variable_2 = Var2, correlation = Freq) %>%
  filter(variable_1 != variable_2) %>%
  mutate(
    # Order-independent key so that A-B and B-A collapse into a single pair
    pair = map2_chr(
      as.character(variable_1), as.character(variable_2),
      ~ paste(sort(c(.x, .y)), collapse = "---")
    )
  ) %>%
  distinct(pair, .keep_all = TRUE) %>%
  # Rank pairs by strength regardless of sign
  mutate(abs_correlation = abs(correlation)) %>%
  arrange(desc(abs_correlation)) %>%
  select(variable_1, variable_2, correlation) %>%
  mutate(correlation = round(correlation, 3))

head(strong_correlations, 10) %>%
  kable(caption = "Top Correlations Among Customer Experience Variables") %>%
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed"))

Top Correlations Among Customer Experience Variables
variable_1	variable_2	correlation
quality_score	satisfaction_score	0.596
delivery_score	quality_score	0.589
delivery_score	affordability_score	0.480
price_sensitivity_score	affordability_score	0.471
delivery_score	satisfaction_score	0.451
portion_score	satisfaction_score	0.408
portion_score	price_sensitivity_score	0.408
income_score	satisfaction_score	0.361
income_score	price_sensitivity_score	0.346
income_score	quality_score	0.332

8.3 Plain-language interpretation

The correlation matrix shows how the customer experience scores move together. A positive correlation means that as one score increases, the other tends to increase. The strongest relationship is between quality score and satisfaction score (0.596): customers who rate meal quality higher also tend to report higher overall satisfaction. Delivery and quality ratings are also closely linked (0.589), while delivery (0.451) and portion size (0.408) show moderate positive relationships with satisfaction. Order frequency is only weakly related to the other scores (0.10 to 0.20). However, correlation does not prove causation. To support a causal claim, Peej Kitchen would need a stronger design, such as tracking satisfaction before and after a deliberate quality improvement or price change.
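
One way to check whether a given correlation is distinguishable from zero is a Spearman rank test. The sketch below is illustrative only: it uses simulated 1-to-5 ratings rather than the survey responses, and the two vectors simply borrow the names quality_score and satisfaction_score. With the real data, the same call could presumably be pointed at the matching columns of peej.

```r
# Illustrative sketch with SIMULATED ratings (not the Peej Kitchen survey data).
set.seed(42)
n <- 100

# Hypothetical 1-5 quality ratings
quality_score <- sample(1:5, n, replace = TRUE)

# Hypothetical satisfaction ratings: quality plus a small random shift, kept within 1-5
satisfaction_score <- pmin(5, pmax(1, quality_score + sample(-1:1, n, replace = TRUE)))

# Spearman rank correlation test; exact = FALSE because tied ranks rule out an exact p-value
spearman_test <- cor.test(
  quality_score,
  satisfaction_score,
  method = "spearman",
  exact = FALSE
)

spearman_test$estimate  # rho: strength and direction of the monotonic relationship
spearman_test$p.value   # small values suggest rho differs from zero
```

A small p-value here would only indicate that the association is unlikely to be zero; it says nothing about which variable drives the other.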

9 9. Analysis Technique 5: Regression Analysis

9.1 Brief theory recap

Regression analysis estimates how a dependent variable changes when predictor variables change. In this study, linear regression is used to estimate how taste/quality, affordability, price sensitivity, delivery time, portion size, income, and order frequency are associated with overall satisfaction.
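
Written out, the model described above takes roughly the following form. The coefficient symbols and the error term are standard notation introduced here for illustration; the variable names correspond to the score columns used in the analysis, and i indexes respondents.

```latex
\begin{aligned}
\text{satisfaction}_i ={}& \beta_0 + \beta_1\,\text{quality}_i + \beta_2\,\text{affordability}_i + \beta_3\,\text{price\_sensitivity}_i \\
&+ \beta_4\,\text{delivery}_i + \beta_5\,\text{portion}_i + \beta_6\,\text{order\_frequency}_i + \beta_7\,\text{income}_i + \varepsilon_i
\end{aligned}
```

Each coefficient describes the expected change in the satisfaction score for a one-point change in that predictor, with the other predictors held constant.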

Although satisfaction is an ordinal survey score, it is treated as an approximate numeric outcome for this business analytics exercise. The model is interpreted cautiously as an explanatory tool rather than a perfect causal model.

9.2 Business justification

Regression is useful because Peej Kitchen needs to know which business factors are most strongly associated with satisfaction after accounting for other factors. This supports prioritisation. For example, if quality has the strongest positive coefficient, quality improvement should receive more management attention than less influential factors.

satisfaction_model <- lm(
  satisfaction_score ~ quality_score +
    affordability_score +
    price_sensitivity_score +
    delivery_score +
    portion_score +
    order_frequency_score +
    income_score,
  data = peej
)

summary(satisfaction_model)


Call:
lm(formula = satisfaction_score ~ quality_score + affordability_score + 
    price_sensitivity_score + delivery_score + portion_score + 
    order_frequency_score + income_score, data = peej)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.98115 -0.34210  0.03272  0.32318  1.82251 

Coefficients:
                         Estimate Std. Error t value Pr(>|t|)    
(Intercept)              1.468325   0.370825   3.960 0.000148 ***
quality_score            0.375905   0.097644   3.850 0.000218 ***
affordability_score      0.051828   0.071582   0.724 0.470878    
price_sensitivity_score -0.042315   0.054938  -0.770 0.443143    
delivery_score           0.121576   0.084974   1.431 0.155893    
portion_score            0.117663   0.062899   1.871 0.064570 .  
order_frequency_score   -0.002117   0.042524  -0.050 0.960395    
income_score             0.095901   0.042192   2.273 0.025355 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.4824 on 92 degrees of freedom
Multiple R-squared:  0.433, Adjusted R-squared:  0.3898 
F-statistic: 10.04 on 7 and 92 DF,  p-value: 2.907e-09

model_results <- tidy(satisfaction_model, conf.int = TRUE) %>%
  mutate(
    estimate = round(estimate, 3),
    std.error = round(std.error, 3),
    statistic = round(statistic, 3),
    p.value = round(p.value, 4),
    conf.low = round(conf.low, 3),
    conf.high = round(conf.high, 3)
  )

model_results %>%
  kable(caption = "Regression Coefficients Predicting Overall Satisfaction") %>%
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed"))

Regression Coefficients Predicting Overall Satisfaction
term	estimate	std.error	statistic	p.value	conf.low	conf.high
(Intercept)	1.468	0.371	3.960	0.0001	0.732	2.205
quality_score	0.376	0.098	3.850	0.0002	0.182	0.570
affordability_score	0.052	0.072	0.724	0.4709	-0.090	0.194
price_sensitivity_score	-0.042	0.055	-0.770	0.4431	-0.151	0.067
delivery_score	0.122	0.085	1.431	0.1559	-0.047	0.290
portion_score	0.118	0.063	1.871	0.0646	-0.007	0.243
order_frequency_score	-0.002	0.043	-0.050	0.9604	-0.087	0.082
income_score	0.096	0.042	2.273	0.0254	0.012	0.180

glance(satisfaction_model) %>%
  select(r.squared, adj.r.squared, sigma, statistic, p.value, df, df.residual) %>%
  mutate(across(where(is.numeric), ~ round(.x, 4))) %>%
  kable(caption = "Regression Model Fit") %>%
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed"))

Regression Model Fit
r.squared	adj.r.squared	sigma	statistic	p.value	df	df.residual
0.433	0.3898	0.4824	10.0353	0	7	92

# Draw the four standard lm diagnostic plots in a 2 x 2 grid
par(mfrow = c(2, 2))
plot(satisfaction_model)

# Restore the default single-panel layout
par(mfrow = c(1, 1))

9.3 Plain-language interpretation

The regression output should be interpreted by focusing on the sign, size, and p-value of each coefficient.

A positive coefficient means that an increase in that factor is associated with higher satisfaction, holding the other factors constant.
A negative coefficient means that an increase in that factor is associated with lower satisfaction, holding the other factors constant.
A p-value below 0.05 suggests that the factor is statistically significant in this model.
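The R-squared value shows the share of variation in satisfaction explained by the predictors included in the model. Here it is 0.433, so the seven predictors together explain about 43% of the variation (adjusted R-squared 0.39).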
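
The reading rules above can be sketched in a few lines of base R. This is a minimal illustration on simulated data, not the survey itself: the variable names are borrowed from the report, and satisfaction is deliberately constructed to depend on quality only, so the example shows how sign, size, p-value, and R-squared are pulled out of a fitted model.

```r
# Illustrative sketch with SIMULATED data (not the Peej Kitchen survey data).
set.seed(123)
n <- 100

# Hypothetical 1-5 ratings
quality_score  <- sample(1:5, n, replace = TRUE)
delivery_score <- sample(1:5, n, replace = TRUE)

# Satisfaction is built to depend on quality only; delivery has no true effect here
satisfaction_score <- 2 + 0.4 * quality_score + rnorm(n, sd = 0.5)

demo_model   <- lm(satisfaction_score ~ quality_score + delivery_score)
demo_summary <- summary(demo_model)

# Coefficient matrix: sign and size sit in "Estimate", p-values in "Pr(>|t|)"
coef_matrix <- coef(demo_summary)
estimates   <- coef_matrix[, "Estimate"]
p_values    <- coef_matrix[, "Pr(>|t|)"]

# Flag the terms that are statistically significant at the 5% level
significant_at_5pct <- p_values < 0.05

print(round(cbind(estimate = estimates, p_value = p_values), 4))
print(significant_at_5pct)

# Share of variation in the outcome explained by the predictors
demo_summary$r.squared
```

The same extraction would apply to satisfaction_model, since coef(summary(...)) returns the table that summary() prints.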

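For managerial decision-making, the most important predictors are those with meaningful positive coefficients and statistically significant p-values. In this model, taste/quality is the strongest predictor (coefficient 0.376, p = 0.0002), so Peej Kitchen should maintain strict quality control, recipe consistency, and ingredient standards. Income is also significant (0.096, p = 0.025), which suggests that higher-income respondents tend to report slightly higher satisfaction. Portion size is only marginally significant (0.118, p = 0.065) and delivery is not significant at the 5% level (0.122, p = 0.156), so improvements in portion standardisation and dispatch timing are worth monitoring but rest on weaker evidence than quality. Affordability, price sensitivity, and order frequency show no clear independent association with satisfaction once the other factors are held constant. The model as a whole is statistically significant (F = 10.04, p < 0.001).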
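
For reference, the fitted equation implied by the coefficient table can be written as follows, with estimates rounded to three decimals. The hat notation is added here for illustration and denotes the predicted satisfaction score.

```latex
\begin{aligned}
\widehat{\text{satisfaction}} ={}& 1.468 + 0.376\,\text{quality} + 0.052\,\text{affordability} - 0.042\,\text{price\_sensitivity} \\
&+ 0.122\,\text{delivery} + 0.118\,\text{portion} - 0.002\,\text{order\_frequency} + 0.096\,\text{income}
\end{aligned}
```

Read this way, a one-point increase in the quality score is associated with a predicted increase of about 0.376 points in satisfaction, holding the other scores constant, with a 95% confidence interval of 0.182 to 0.570.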

10 10. Integrated Findings

The five analytical techniques work together to give a complete picture of Peej Kitchen’s customer experience.

EDA summarised the customer survey and identified data quality issues that needed correction before analysis. Visualisation showed the main demand and satisfaction patterns in an easy-to-understand way. Hypothesis testing provided formal statistical evidence about whether quality and meal type are associated with satisfaction. Correlation analysis showed which customer experience variables move together. Regression analysis then combined the main predictors into one model to estimate which factors best explain overall satisfaction.

The overall recommendation is that Peej Kitchen should treat taste and meal quality as its main competitive advantage, while also monitoring affordability, price sensitivity, delivery experience, and portion size. The business should continue promoting high-demand meals, especially the meals with strong order counts and high satisfaction, while improving any meal or service area that shows lower ratings.

A practical action plan is:

Maintain quality consistency by standardising recipes, cooking process, and ingredient selection.
Use Jollof Rice and other high-demand meals as flagship offerings in promotions.
Avoid sudden price increases; if prices must increase, communicate the reason clearly and consider bundle options.
Monitor delivery time and portion size, because both are positively correlated with overall satisfaction (0.45 and 0.41) and shape the total customer experience.
Repeat this survey periodically to track changes in customer satisfaction and demand over time.

11 11. Limitations and Further Work

This study has some limitations. First, the sample size is 100, which is acceptable for the assignment but still limited for making broad conclusions about all potential customers. Second, the survey includes existing customers, potential customers, family, and friends, which may introduce response bias because some respondents may be more favourable toward the business. Third, most variables are ordinal categories, meaning that converting them to numeric scores requires judgement. Fourth, the analysis is based on self-reported survey responses rather than actual transaction records.

With more time and data, Peej Kitchen could improve the analysis by combining survey responses with actual sales records, delivery logs, repeat purchase history, and customer complaints. The business could also collect data over several months to study seasonality and repeat buying behaviour. A future study could use logistic regression to predict whether a customer is likely to continue buying after a price increase, or use customer segmentation to identify distinct customer groups.
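
As a rough indication of what the logistic regression idea could look like, the sketch below fits a binary model on simulated data. Everything in it is hypothetical: the outcome will_continue, the effect sizes, and the example customer are invented for illustration, and the survey's actual price-increase question would first need to be recoded into a yes/no outcome.

```r
# Illustrative sketch with SIMULATED data (not the Peej Kitchen survey data).
set.seed(2026)
n <- 100

# Hypothetical 1-5 ratings
affordability_score <- sample(1:5, n, replace = TRUE)
quality_score       <- sample(1:5, n, replace = TRUE)

# Hypothetical binary outcome: 1 = would keep buying after a price increase
true_log_odds <- -3 + 0.5 * affordability_score + 0.5 * quality_score
will_continue <- rbinom(n, size = 1, prob = plogis(true_log_odds))

# Logistic regression: models the log-odds of continuing to buy
retention_model <- glm(
  will_continue ~ affordability_score + quality_score,
  family = binomial(link = "logit")
)

# Odds ratios: values above 1 mean higher odds of continuing per one-point increase
odds_ratios <- exp(coef(retention_model))
print(round(odds_ratios, 2))

# Predicted probability of continuing for one hypothetical customer
new_customer <- data.frame(affordability_score = 4, quality_score = 5)
predicted_probability <- predict(retention_model, newdata = new_customer, type = "response")
print(predicted_probability)
```

In practice, the predicted probabilities could help flag which customer profiles are most at risk of leaving if prices rise.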

12 References

Adi, B. (2026). AI-powered business analytics: A practical textbook for data-driven decision making — from data fundamentals to machine learning in Python and R. Lagos Business School / markanalytics.online. https://markanalytics.online

Allaire, J. J., Teague, C., Scheidegger, C., Xie, Y., & Dervieux, C. (2022). Quarto (Version 1.x) [Computer software]. https://doi.org/10.5281/zenodo.5960048

Fagbemi, M. O. (2026). Peej Kitchen customer meal preference and satisfaction survey [Survey instrument and dataset]. Administered to existing customers, potential customers, family, and friends, May 2026. Ethical clearance: Respondents were informed that responses would be used anonymously for an MBA analytics assignment.

R Core Team. (2024). R: A language and environment for statistical computing (Version 4.x). R Foundation for Statistical Computing. https://www.R-project.org/

Wickham, H. (2016). ggplot2: Elegant graphics for data analysis. Springer. https://doi.org/10.1007/978-3-319-24277-4

Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L., François, R., Grolemund, G., Hayes, A., Henry, L., Hester, J., Kuhn, M., Pedersen, T. L., Miller, E., Bache, S. M., Müller, K., Ooms, J., Robinson, D., Seidel, D., Spinu, V., … Yutani, H. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686

13 Appendix: AI Usage Statement

I used ChatGPT as an AI coding and writing assistant to support the structure of the Quarto document, suggest R code for data cleaning and analysis, and improve the clarity of the business interpretation. I made the final analytical decisions, including the selection of the Peej Kitchen business problem, the choice of variables, the interpretation of outputs, and the business recommendations. The dataset was collected independently through a Google Forms survey administered to customers and people familiar with Peej Kitchen.

14 Appendix: Defence Preparation Notes

For the oral defence, I should be able to explain the following:

Why this dataset is real primary data: It was collected through Google Forms from customers, potential customers, family, and friends familiar with Peej Kitchen.
Why EDA was used: It helped me understand the structure, missing values, inconsistencies, and basic patterns in the survey data.
Why visualisation was used: It helped communicate meal preference, satisfaction, affordability, and quality patterns clearly.
Why hypothesis testing was used: It helped test whether relationships in the sample are statistically meaningful.
Why Spearman correlation was used: The survey variables are ordinal scores, so Spearman is more appropriate than Pearson.
Why regression was used: It helped estimate which customer experience factors predict overall satisfaction.
Main business implication: Peej Kitchen should protect taste/quality as its main advantage while improving delivery, portion consistency, and price communication.