Plato’s Pizza
Plato Pizza is a fictional Greek-inspired pizza place in New Jersey, United states of America.Things are going OK here at Plato’s, but there’s room for improvement. We’ve been collecting transaction data for the past year, but really haven’t been able to put it to good use, Mario Maven (Manager, Plato’s Pizza) want the following business questions to be answered from the insight derived from their dataset
The data contains four different dataset that contains datas about their order history and pizza types, They are as follows
About the dataset
This dataset contains 4 tables in CSV format
load packages needed R packages for this analysis
library(tidyverse)
library(lubridate)
library(scales)
load the dataset
order_details <- read.csv("order_details.csv")
orders <- read.csv("orders.csv")
pizza_types <- read.csv("pizza_types.csv")
pizzas <- read.csv("pizzas.csv")
view the individual dataset columns name
colnames(order_details)
## [1] "order_details_id" "order_id" "pizza_id" "quantity"
colnames(orders)
## [1] "order_id" "date" "time"
colnames(pizza_types)
## [1] "pizza_type_id" "name" "category" "ingredients"
colnames(pizzas)
## [1] "pizza_id" "pizza_type_id" "size" "price"
Preview individual dataset data type
str(order_details)
## 'data.frame': 48620 obs. of 4 variables:
## $ order_details_id: int 1 2 3 4 5 6 7 8 9 10 ...
## $ order_id : int 1 2 2 2 2 2 3 3 4 5 ...
## $ pizza_id : chr "hawaiian_m" "classic_dlx_m" "five_cheese_l" "ital_supr_l" ...
## $ quantity : int 1 1 1 1 1 1 1 1 1 1 ...
str(orders)
## 'data.frame': 21350 obs. of 3 variables:
## $ order_id: int 1 2 3 4 5 6 7 8 9 10 ...
## $ date : chr "2015-01-01" "2015-01-01" "2015-01-01" "2015-01-01" ...
## $ time : chr "11:38:36" "11:57:40" "12:12:28" "12:16:31" ...
str(pizza_types)
## 'data.frame': 32 obs. of 4 variables:
## $ pizza_type_id: chr "bbq_ckn" "cali_ckn" "ckn_alfredo" "ckn_pesto" ...
## $ name : chr "The Barbecue Chicken Pizza" "The California Chicken Pizza" "The Chicken Alfredo Pizza" "The Chicken Pesto Pizza" ...
## $ category : chr "Chicken" "Chicken" "Chicken" "Chicken" ...
## $ ingredients : chr "Barbecued Chicken, Red Peppers, Green Peppers, Tomatoes, Red Onions, Barbecue Sauce" "Chicken, Artichoke, Spinach, Garlic, Jalapeno Peppers, Fontina Cheese, Gouda Cheese" "Chicken, Red Onions, Red Peppers, Mushrooms, Asiago Cheese, Alfredo Sauce" "Chicken, Tomatoes, Red Peppers, Spinach, Garlic, Pesto Sauce" ...
str(pizzas)
## 'data.frame': 96 obs. of 4 variables:
## $ pizza_id : chr "bbq_ckn_s" "bbq_ckn_m" "bbq_ckn_l" "cali_ckn_s" ...
## $ pizza_type_id: chr "bbq_ckn" "bbq_ckn" "bbq_ckn" "cali_ckn" ...
## $ size : chr "S" "M" "L" "S" ...
## $ price : num 12.8 16.8 20.8 12.8 16.8 ...
Joining of orders table and order_details table using leftjoin function
order_details_v2 <- left_join(order_details,orders, by= c("order_id"))
joining of the pizzas table and pizza_type table using leftjoin function
pizzas_v2 <- left_join(pizzas,pizza_types, by = c("pizza_type_id"))
view the individual dataset columns name
colnames(pizzas_v2)
## [1] "pizza_id" "pizza_type_id" "size" "price"
## [5] "name" "category" "ingredients"
colnames(order_details_v2)
## [1] "order_details_id" "order_id" "pizza_id" "quantity"
## [5] "date" "time"
Joining of the pizzas_v2 data and order_details_v2 with the pizza_id dataframe using the leftjoin function
maven_pizza <- left_join(order_details_v2,pizzas_v2, by = c("pizza_id"))
Dataset columns name
colnames(maven_pizza)
## [1] "order_details_id" "order_id" "pizza_id" "quantity"
## [5] "date" "time" "pizza_type_id" "size"
## [9] "price" "name" "category" "ingredients"
All data sets have been combined and ready for processing, New dataframe for months, days of the week and days
maven_pizza$months <- format(as.Date(maven_pizza$date),"%B")
maven_pizza$days <- format(as.Date(maven_pizza$date), "%d")
maven_pizza$day_of_the_week <- format(as.Date(maven_pizza$date),"%A")
Dataset columns name
colnames(maven_pizza)
## [1] "order_details_id" "order_id" "pizza_id" "quantity"
## [5] "date" "time" "pizza_type_id" "size"
## [9] "price" "name" "category" "ingredients"
## [13] "months" "days" "day_of_the_week"
table(maven_pizza$price)
##
## 9.75 10.5 11 12 12.25 12.5 12.75 13.25 14.5 14.75 15.25 16 16.25
## 739 1001 570 5600 830 3328 2507 473 391 577 712 4474 1114
## 16.5 16.75 17.5 17.95 18.5 20.25 20.5 20.75 21 23.65 25.5 35.95
## 4040 4294 381 1273 1359 3032 1998 8685 190 480 544 28
table(maven_pizza$quantity)
##
## 1 2 3 4
## 47693 903 21 3
Total revenue
maven_pizza %>%
summarise(Revenue = sum(quantity*price))
## Revenue
## 1 817860
Total Order
maven_pizza %>%
summarise(Orders = sum(quantity))
## Orders
## 1 49574
Average Order price
maven_pizza %>%
summarise(Avg_order_price = mean(quantity*price))
## Avg_order_price
## 1 16.82147
Number of pizza types
maven_pizza %>%
summarise(types_of_pizza = n_distinct(name))
## types_of_pizza
## 1 32
What days of the week do we tend to be busiest both for number of orders and revenue
maven_pizza %>%
mutate(week_days = wday(date, label = TRUE)) %>%
group_by(week_days,category) %>%
summarise(busiest_days_by_orders = sum(quantity)) %>%
arrange(-busiest_days_by_orders) %>%
ggplot(aes(x = week_days, y = busiest_days_by_orders, fill = category))+
geom_col()+
theme(text = element_text(size = 9),element_line(size =1))+
labs(title = "Number of orders placed within the days of the Week")
## `summarise()` has grouped output by 'week_days'. You can override using the
## `.groups` argument.
From the insight, Thursday, Saturday and friday has the highest number of orders within the week days ranging from 7478,7493 and 8242 orders respectively. While on the other hands, Sunday and Monday has the lowest number of orders within the week days ranging from 6035 and 6485 respectively
maven_pizza %>%
mutate(week_days = wday(date, label = TRUE)) %>%
group_by(week_days,category) %>%
summarise(busiest_days_by_revenue = sum(quantity*price)) %>%
arrange(-busiest_days_by_revenue) %>%
ggplot(aes(x = week_days, y = busiest_days_by_revenue, fill = category))+
geom_col()+
scale_y_continuous(labels = function(x) format(x, scientific = FALSE))+
theme(text = element_text(size = 9),element_line(size =1))+
labs(title = "Amount of revenue generated within the days of the Week")
## `summarise()` has grouped output by 'week_days'. You can override using the
## `.groups` argument.
From the insight, Thursday, Saturday and friday generated the highest number of revenue within the week days ranging from $123528,$123182 and $136074 orders respectively. While on the other hands, Sunday and Monday has the lowest number of orders within the week days ranging from $99204 and $107330 respectively
Percentage of Order by pizza categories
maven_pizza %>%
group_by(category) %>%
summarize(Frequency = n()) %>%
mutate(Percent = paste(round(Frequency / dims[1] * 100,0),"%"))
Graphical representation of number of pizza ordered by pizza category
ggplot(maven_pizza, aes(x = category)) +
geom_bar(aes(y = (..count..)/sum(..count..))) +
scale_y_continuous(labels=percent)+
theme(legend.title = element_text(size = 12),
legend.text = element_text(size = 12))+
labs(title = "Percentage of pizza ordered by their category")
From our graph, classic pizza category has the most number of order placed which represent 14579(30%) of total order while chicken pizza category has the lowest number of order placed at 10815(22%) of the total orders
Graphical representation of number of pizza ordered by pizza size
ggplot(maven_pizza, aes(x = size)) +
geom_bar(aes(y = (..count..)/sum(..count..))) +
scale_y_continuous(labels=percent)+
theme(legend.title = element_text(size = 12),
legend.text = element_text(size = 12))+
labs(title = "Percentage of pizza ordered by their size")
Best and worst selling pizza
maven_pizza %>%
group_by(name) %>%
summarise(total_price = sum(quantity*price)) %>%
arrange(-total_price) %>%
print(n = 10)
## # A tibble: 32 × 2
## name total_price
## <chr> <dbl>
## 1 The Thai Chicken Pizza 43434.
## 2 The Barbecue Chicken Pizza 42768
## 3 The California Chicken Pizza 41410.
## 4 The Classic Deluxe Pizza 38180.
## 5 The Spicy Italian Pizza 34831.
## 6 The Southwest Chicken Pizza 34706.
## 7 The Italian Supreme Pizza 33477.
## 8 The Hawaiian Pizza 32273.
## 9 The Four Cheese Pizza 32266.
## 10 The Sicilian Pizza 30940.
## # … with 22 more rows
maven_pizza %>%
group_by(name) %>%
summarise(total_price = sum(quantity*price)) %>%
arrange(total_price) %>%
print(n = 10)
## # A tibble: 32 × 2
## name total_price
## <chr> <dbl>
## 1 The Brie Carre Pizza 11588.
## 2 The Green Garden Pizza 13956.
## 3 The Spinach Supreme Pizza 15278.
## 4 The Mediterranean Pizza 15360.
## 5 The Spinach Pesto Pizza 15596
## 6 The Calabrese Pizza 15934.
## 7 The Italian Vegetables Pizza 16019.
## 8 The Soppressata Pizza 16426.
## 9 The Chicken Pesto Pizza 16702.
## 10 The Chicken Alfredo Pizza 16900.
## # … with 22 more rows
Visualization of pizza sales by their revenue
maven_pizza %>%
group_by(name) %>%
summarise(total_revenue = sum(quantity*price)) %>%
ggplot(aes(x = name, y = total_revenue))+
geom_col()+
coord_flip() +
labs(title = "Pizza names and its total revenue")
Revenue by Months by category of pizza
maven_pizza %>%
group_by(months) %>%
summarise(monthly_revenue = sum(quantity*price)) %>%
arrange(-monthly_revenue) %>%
ggplot(aes(x = months, y = monthly_revenue ))+
geom_col()+
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))+
labs(title = "Monthly revenue by pizza category")
From the graph, revenue generation ranges from above 60k to 72k per month, The month of July recorded the highest amount in revenue at $72,558 while October recorded the lowest amount in revenue at $64028
How well are we utilizing our seating capacity
The seat available presently, seems to be enough from our analyses, we currently have 15 Tables and 60 Chairs . Averaging one order per table, we only have about 80% of the capacity only 2 to 7 times in a month, and likewise, Averaging one pizza per chair,we only recorded only once that the number of orders (64orders) was over the number of available seats (60 chairs)