Taco Sales Analysis

Author

Simranjeet Kaur

How does location affect price?

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.2     ✔ tibble    3.3.0
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.4     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
## Summarizing the Taco dataset

data <- read_csv("taco.csv")
Rows: 1000 Columns: 13
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (6): Restaurant_Name, Location, Order_Time, Delivery_Time, Taco_Size, Ta...
dbl (6): Order_ID, Delivery_Duration, Toppings_Count, Distance, Price, Tip
lgl (1): Weekend_Order

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
glimpse(data)
Rows: 1,000
Columns: 13
$ Order_ID          <dbl> 770487, 671858, 688508, 944962, 476417, 678856, 1836…
$ Restaurant_Name   <chr> "El Taco Loco", "El Taco Loco", "Taco Haven", "Spicy…
$ Location          <chr> "New York", "San Antonio", "Austin", "Dallas", "San …
$ Order_Time        <chr> "1/8/2024 14:55", "23-11-2024 17:11", "21-11-2024 20…
$ Delivery_Time     <chr> "1/8/2024 15:36", "23-11-2024 17:25", "21-11-2024 21…
$ Delivery_Duration <dbl> 41, 14, 38, 45, 15, 83, 45, 31, 17, 73, 64, 29, 11, …
$ Taco_Size         <chr> "Regular", "Regular", "Large", "Regular", "Large", "…
$ Taco_Type         <chr> "Chicken Taco", "Beef Taco", "Pork Taco", "Chicken T…
$ Toppings_Count    <dbl> 5, 1, 2, 2, 0, 0, 1, 3, 2, 1, 1, 4, 2, 1, 1, 2, 5, 4…
$ Distance          <dbl> 3.01, 6.20, 20.33, 3.00, 24.34, 16.70, 9.57, 9.80, 1…
$ Price             <dbl> 9.25, 4.25, 7.00, 5.50, 4.50, 3.00, 5.75, 6.75, 5.50…
$ Tip               <dbl> 2.22, 3.01, 0.02, 1.90, 1.14, 2.32, 0.63, 2.97, 0.33…
$ Weekend_Order     <lgl> FALSE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE…
ggplot(data = data)

ggplot(data)

ggplot(
  data = data,
  mapping = aes(x = Location, y = Price)
)

ggplot(
  data = data,
  mapping = aes(x = Location, y = Price)
) +
  geom_boxplot()

ggplot(
  data = data,
  mapping = aes(x = Location, y = Price, fill = Location)
) +
  geom_boxplot() +
  labs(title = "Price Distribution by Location", x = "Location", y = "Price") +
  coord_flip()

Taco prices vary by location. Cities such as San Jose and Los Angeles charge higher prices for tacos.
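A per-city table makes it easier to check what the boxplot shows. The following is a minimal sketch, assuming the `Location` and `Price` columns listed in the `glimpse()` output. The helper name `price_by_location` and the toy rows are made up for illustration and are not taken from taco.csv.

```r
library(dplyr)

# Hypothetical helper: order count, median and interquartile range of Price
# per Location, sorted from the highest to the lowest median.
price_by_location <- function(df) {
  df %>%
    group_by(Location) %>%
    summarise(
      n_orders     = n(),
      median_price = median(Price),
      iqr_price    = IQR(Price),
      .groups = "drop"
    ) %>%
    arrange(desc(median_price))
}

# Toy rows for illustration only (not values from taco.csv).
toy_prices <- data.frame(
  Location = c("A", "A", "A", "B", "B", "B"),
  Price    = c(4, 5, 6, 7, 8, 9)
)
price_by_location(toy_prices)
```

On the real data the call would be `price_by_location(data)`. The median is used here because it is the center line a boxplot draws.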

Which taco type earns more tips?

ggplot(data, aes(x = Taco_Type, y = Tip, fill = Taco_Type)) +
  stat_summary(fun = mean, geom = "bar") +
  labs(title = "Average Tip by Taco Type", x = "Taco Type", y = "Average Tip")

This chart suggests an association between taco type and average tip. Customers tip more, on average, for chicken and pork tacos than for beef tacos.
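A bar chart of means hides how many orders each bar rests on and how noisy the tips are. As a rough sketch, assuming the `Taco_Type` and `Tip` columns from the `glimpse()` output, the means can be tabulated together with a standard error. The helper name `tip_by_type` and the toy rows are hypothetical and not taken from taco.csv.

```r
library(dplyr)

# Hypothetical helper: order count, mean tip and standard error of the mean
# per taco type, sorted from the highest to the lowest mean tip.
tip_by_type <- function(df) {
  df %>%
    group_by(Taco_Type) %>%
    summarise(
      n_orders = n(),
      mean_tip = mean(Tip),
      se_tip   = sd(Tip) / sqrt(n()),
      .groups = "drop"
    ) %>%
    arrange(desc(mean_tip))
}

# Toy rows for illustration only (not values from taco.csv).
toy_tips <- data.frame(
  Taco_Type = c("X", "X", "Y", "Y"),
  Tip       = c(1, 3, 2, 6)
)
tip_by_type(toy_tips)
```

On the real data the call would be `tip_by_type(data)`. If the gaps between mean tips are small relative to the standard errors, the differences in the chart may be due to chance.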

How does the proportion of weekend orders vary across different locations?

ggplot(data, aes(x = Location, fill = Weekend_Order)) +
  geom_bar(position = "fill") +
  labs(title = "Proportion of Weekend Orders by Location",
       y = "Proportion", x = "Location") +
  scale_fill_manual(values = c("FALSE" = "orange", "TRUE" = "purple")) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

This chart shows that Austin, New York, and Phoenix have higher proportions of weekend orders than San Jose and Dallas. Overall, all cities have more weekday orders than weekend orders.
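The stacked bars can be turned into a table of weekend shares. This is a minimal sketch, assuming the logical `Weekend_Order` column and the `Location` column from the `glimpse()` output. The helper name `weekend_share_by_location` and the toy rows are hypothetical and not taken from taco.csv.

```r
library(dplyr)

# Hypothetical helper: share of weekend orders per Location,
# sorted from the highest to the lowest share.
weekend_share_by_location <- function(df) {
  df %>%
    group_by(Location) %>%
    summarise(
      n_orders      = n(),
      weekend_share = mean(Weekend_Order),  # mean of a logical = share of TRUE
      .groups = "drop"
    ) %>%
    arrange(desc(weekend_share))
}

# Toy rows for illustration only (not values from taco.csv).
toy_orders <- data.frame(
  Location      = c("A", "A", "A", "A", "B", "B"),
  Weekend_Order = c(TRUE, FALSE, FALSE, FALSE, TRUE, FALSE)
)
weekend_share_by_location(toy_orders)
```

On the real data the call would be `weekend_share_by_location(data)`. Assuming the weekend means Saturday and Sunday, orders spread evenly over the week would give a weekend share of 2/7, about 0.29, which is a useful baseline when comparing cities.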