This report analyzes which product categories contribute the largest total sales based on the bike_orderline_tbl dataset.
Since the actual dataset is not provided, we will simulate a small dataset that represents the same structure.
# Simulated dataset
bike_orderline_tbl <- tibble(
category_1 = c("Mountain", "Mountain", "Road", "Road", "Gravel", "Gravel"),
category_2 = c("Cross Country", "Trail", "Endurance", "Race", "Adventure", "Touring"),
frame_material = c("Aluminum", "Carbon", "Carbon", "Steel", "Aluminum", "Steel"),
total_price = c(50000, 75000, 60000, 80000, 55000, 40000)
)
bike_orderline_tbl
## # A tibble: 6 × 4
## category_1 category_2 frame_material total_price
## <chr> <chr> <chr> <dbl>
## 1 Mountain Cross Country Aluminum 50000
## 2 Mountain Trail Carbon 75000
## 3 Road Endurance Carbon 60000
## 4 Road Race Steel 80000
## 5 Gravel Adventure Aluminum 55000
## 6 Gravel Touring Steel 40000
We group data by product categories and frame material, then summarize total sales.
sales_summary_tbl <- bike_orderline_tbl %>%
group_by(category_1, category_2, frame_material) %>%
summarize(Sales = sum(total_price, na.rm = TRUE)) %>%
ungroup()
## `summarise()` has grouped output by 'category_1', 'category_2'. You can
## override using the `.groups` argument.
We format the Sales column as dollars and rename other columns for clarity.
sales_summary_tbl <- sales_summary_tbl %>%
mutate(Sales = scales::dollar(Sales)) %>%
rename(
"Prime category" = category_1,
"Secondary category" = category_2,
"Frame Material" = frame_material
) %>%
arrange(desc(Sales))
sales_summary_tbl
## # A tibble: 6 × 4
## `Prime category` `Secondary category` `Frame Material` Sales
## <chr> <chr> <chr> <chr>
## 1 Road Race Steel $80,000
## 2 Mountain Trail Carbon $75,000
## 3 Road Endurance Carbon $60,000
## 4 Gravel Adventure Aluminum $55,000
## 5 Mountain Cross Country Aluminum $50,000
## 6 Gravel Touring Steel $40,000
A simple bar chart to visualize total sales by prime category.
bike_orderline_tbl %>%
group_by(category_1) %>%
summarize(Sales = sum(total_price, na.rm = TRUE)) %>%
ggplot(aes(x = reorder(category_1, -Sales), y = Sales, fill = category_1)) +
geom_col(show.legend = FALSE) +
scale_y_continuous(labels = dollar) +
labs(title = "Total Sales by Prime Category", x = "Prime Category", y = "Sales ($)") +
theme_minimal()
✅ Conclusion:
Even with this simulated dataset, we can observe that the Road category
tends to have higher total sales, followed by Mountain and Gravel.
💡 Created for Financial Database Management and Application course.