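library(tidyverse)
library(lubridate) # dmy() and floor_date(); attached by tidyverse 2.0.0+, loaded explicitly for older versions
#install.packages("janitor")
#install.packages("furniture")
library(janitor)
library(furniture)
library(kableExtra)
library(scales)

# Read the raw sales data
hogs <- read_csv("hogsmeade_sales.csv")

My Report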
hogsmeade_clean <- hogs %>%
  remove_empty("rows") %>%
  remove_empty("cols") %>%
  clean_names() %>%
  mutate(order_date = dmy(order_date))

Welcome to the Hogsmeade Report
Introduction
This report analyzes the Hogsmeade data set and brings different parts of the data to life through visualizations. These visualizations give insight into the Hogsmeade business and make the data easier to understand.
The report uses a range of visualizations, including scatterplots, bar charts, tables, heatmaps and line graphs, each with an explanation. The graphs look at the delivery methods used, sales and profits, the best-performing products and categories in the shop, and more.
Comparing the Sales and Profits in the Hogsmeade Shop
This scatterplot compares sales and profit in the Hogsmeade shop. It shows a strong linear relationship with few outliers: as sales rise, profit rises with them. The plot suggests that most sales sit in the middle of the range and that profit increases when more money is spent. A scatterplot is a good way to show how sales and profit are linked. The pattern suggests Hogsmeade has a sound pricing strategy, since higher sales carry through to higher profit.
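The strength of the linear relationship described above can also be checked with a number rather than by eye. The sketch below is only an illustration: `toy_sales` is a made-up data frame (not the Hogsmeade data) that reuses the `sales` and `profit` column names, and the same two calls could be pointed at `hogsmeade_clean` instead.

```r
# Hypothetical toy data standing in for hogsmeade_clean (values are invented)
toy_sales <- data.frame(
  sales  = c(100, 200, 300, 400, 500),
  profit = c(62, 135, 190, 270, 330)
)

# Pearson correlation: values close to 1 indicate a strong positive linear relationship
sales_profit_cor <- cor(toy_sales$sales, toy_sales$profit)

# Simple linear fit: the slope estimates the extra profit per extra unit of sales
fit <- lm(profit ~ sales, data = toy_sales)
profit_per_sale <- unname(coef(fit)["sales"])

print(round(sales_profit_cor, 3))
print(round(profit_per_sale, 3))
```

A correlation near 1 would support the "strong linear relationship" reading, while a value closer to 0 would suggest the pattern in the plot is weaker than it looks.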
ggplot(data = hogsmeade_clean) +
  geom_point(mapping = aes(x = profit, y = sales), colour = "#0D6217") +
  ggtitle("Comparing Sales and Profit") +
  theme(plot.title = element_text(hjust = 0.5))

Most Popular Product Categories vs Least Popular Product Categories
This bar chart orders the product categories from most popular to least popular. There is a small difference between them, but overall they are very similar in popularity. The most popular category is brooms and the least popular is robes. Hogsmeade has a good range of product categories, and all of them seem to be doing well with steady interest in each. A bar chart works well here because it makes it easy to compare the categories and see which are performing better than others.
ggplot(data = hogsmeade_clean) +
  # fct_infreq() orders the bars from most to least frequent category
  geom_bar(mapping = aes(x = fct_infreq(product_category), fill = product_category)) +
  scale_fill_manual(values = c("#D3BC8D", "#740001", "#FDB913", "#C0C0C0", "#0D6217", "#0E1A40", "#946B2D")) +
  ggtitle("Most Popular to Least Popular Product Categories") +
  xlab("Product Category") +
  theme(legend.position = "none")

Orders By Delivery
This bar chart shows the different delivery methods and how many orders each one has. All delivery options are used frequently and almost evenly; the most used is Owl Post and the least used is Floo Network. Because each option is used nearly as much as the others, all of them are doing well. A bar chart is a good way of showing this data because it is easy to read at a glance.
ggplot(data = hogsmeade_clean) +
  geom_bar(mapping = aes(x = fct_infreq(delivery_method), fill = delivery_method)) +
  ggtitle("Orders by Delivery Method") +
  scale_fill_manual(values = c("#740001", "#FDB913", "#0D6217", "#0E1A40")) +
  labs(x = "Delivery Method", y = "Count") +
  theme(legend.position = "none")

Monthly Sales Over Four Years
This line graph shows monthly sales over the last four years. The pattern is clearly inconsistent from month to month and from year to year. Sales were strongest at the end of 2024 going into 2025, which could mean the shop's products are in growing demand. Each year sales spike once or twice and then fall back, which could reflect seasonal busy periods. The line graph suits this data well because it shows when the shop's quiet periods and sales peaks are likely to fall, which can help predict future sales.
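One way to test the seasonal explanation is to average sales by calendar month across all years and see whether the same months keep coming out on top. The sketch below is a minimal illustration under stated assumptions: `toy_monthly` is invented data shaped like the monthly totals this section plots (one row per month with a `month_year` date and a `total_sales` value), not the real Hogsmeade figures.

```r
# Hypothetical monthly totals (invented values), one row per month for two years
toy_monthly <- data.frame(
  month_year  = seq(as.Date("2023-01-01"), by = "month", length.out = 24),
  total_sales = c(10, 12, 11, 13, 12, 14, 13, 15, 14, 16, 15, 30,
                  11, 13, 12, 14, 13, 15, 14, 16, 15, 17, 16, 34)
)

# Extract the calendar month (1-12) so the same month in different years is grouped together
toy_monthly$month_num <- as.integer(format(toy_monthly$month_year, "%m"))

# Average sales per calendar month across all years
avg_by_month <- tapply(toy_monthly$total_sales, toy_monthly$month_num, mean)

# The calendar month with the highest average is a candidate seasonal peak
peak_month <- as.integer(names(which.max(avg_by_month)))

print(avg_by_month)
print(peak_month)
```

If the same one or two calendar months stand out on the real data, that would back up the idea that the spikes are seasonal rather than random.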
# Total sales for each calendar month
hogs_monthly <- hogsmeade_clean %>%
  mutate(month_year = floor_date(order_date, unit = "month")) %>%
  group_by(month_year) %>%
  summarise(total_sales = sum(sales, na.rm = TRUE))

ggplot(hogs_monthly, mapping = aes(x = month_year, y = total_sales)) +
  geom_line() +
  labs(x = "Month", y = "Total Sales",
       title = "Monthly Sales Over Four Years")

Discount by Product Category
This boxplot shows the discounts given in each product category. Artifacts, brooms and books have nearly the same discounts, ranging between 10% and 15% with a few near 20%. Other categories have little to no discount, which means discounts there are not applied consistently and look more like one-off or random discounts. Overall, the boxplot indicates that no single category gets the highest or most frequent discounts; discounting is not predictable and can change at any time.
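The ranges read off a boxplot can be backed up with a small numeric summary per category. The sketch below assumes a made-up data frame, `toy_discounts` (invented values, not the Hogsmeade data), that reuses the `product_category` and `discount` column names; the median corresponds to the line inside each box and the interquartile range roughly to the height of the box.

```r
# Hypothetical toy data (invented values) with the same column names as hogsmeade_clean
toy_discounts <- data.frame(
  product_category = rep(c("Brooms", "Books", "Robes"), each = 4),
  discount = c(0.10, 0.12, 0.15, 0.20,
               0.10, 0.11, 0.14, 0.15,
               0.00, 0.00, 0.00, 0.05)
)

# Median discount per category: the line inside each box
median_discount <- tapply(toy_discounts$discount, toy_discounts$product_category, median)

# Interquartile range per category: roughly the height of each box
iqr_discount <- tapply(toy_discounts$discount, toy_discounts$product_category, IQR)

print(round(median_discount, 3))
print(round(iqr_discount, 3))
```

A category with a median of zero and a small interquartile range would match the "little to no discount" description, while similar medians across categories would match the "nearly the same" one.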
ggplot(data = hogsmeade_clean) +
  geom_boxplot(mapping = aes(y = discount, x = product_category, fill = product_category)) +
  labs(title = "Discount by Product Category", y = "Discount", x = "Product category") +
  scale_fill_manual(values = c("#D3BC8D", "#740001", "#FDB913", "#C0C0C0", "#0D6217", "#0E1A40", "#946B2D")) +
  theme(legend.position = "none")

Summary Table of Regions
This table gives a summary of how each region is performing. Diagon Alley has the highest sales at 298,374 and a profit of 197,695. Although it is the best-performing region, the other regions are not far behind, as their figures are all similar. The average discount is nearly the same in every region, meaning discounts do not depend on where the customer is based.
hogsmeade_clean %>%
  group_by(region) %>%
  summarise(Total_Sales = sum(sales),
            Total_Profit = sum(profit),
            Avg_Discount = mean(discount),
            Avg_Units = mean(units_sold)) %>%
  # Format numbers to be more readable
  mutate(Total_Sales = comma(Total_Sales, accuracy = 1),
         Total_Profit = comma(Total_Profit, accuracy = 1),
         Avg_Discount = percent(Avg_Discount, accuracy = 0.1),
         Avg_Units = round(Avg_Units, 2)) %>%
  kable(col.names = c("Region", "Total Sales", "Total Profit", "Avg. Discount", "Avg. Units Sold"),
        caption = "Summary by Region")

| Region | Total Sales | Total Profit | Avg. Discount | Avg. Units Sold |
|---|---|---|---|---|
| Diagon Alley | 298,374 | 197,695 | 11.4% | 5.43 |
| Hogsmeade | 275,697 | 184,825 | 11.6% | 5.35 |
| Ministry Quarter | 274,479 | 182,334 | 11.0% | 5.28 |
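
How close the regions are can be worked out directly from the figures in the table above. The sketch below hard-codes the Total Sales and Total Profit values shown there; the data frame name `region_summary` and the derived columns `gap_pct` and `profit_pct` are illustrative names, not part of the original analysis.

```r
# Total Sales and Total Profit as reported in the Summary by Region table
region_summary <- data.frame(
  region       = c("Diagon Alley", "Hogsmeade", "Ministry Quarter"),
  total_sales  = c(298374, 275697, 274479),
  total_profit = c(197695, 184825, 182334)
)

# Percentage by which each region's sales fall short of the best-performing region
best_sales <- max(region_summary$total_sales)
region_summary$gap_pct <- round(100 * (best_sales - region_summary$total_sales) / best_sales, 1)

# Profit as a percentage of sales, to compare regions on the same footing
region_summary$profit_pct <- round(100 * region_summary$total_profit / region_summary$total_sales, 1)

print(region_summary)
```

On these figures the other two listed regions sit within about 8% of Diagon Alley's sales, which is consistent with the description of them as not far behind.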
Top 10 Products
This bar chart shows the ten best-selling products in the Hogsmeade shop. The top products are History of Magic and Flying Carpet, while the tenth is Floo Powder. The sales show that educational products bring in the most revenue, which suggests higher demand for them. Although some products lead, sales are steady across all ten, showing that a good variety of products is in demand at the shop.
hogsmeade_clean %>%
  group_by(product_name) %>%
  summarise(total_sales = sum(sales)) %>%
  slice_max(total_sales, n = 10) %>%
  ggplot(aes(x = fct_reorder(product_name, total_sales), y = total_sales)) +
  geom_col(fill = "#740001") +
  coord_flip() +
  # Adds white labels to each bar (x and y are inherited from ggplot())
  geom_text(aes(label = scales::comma(total_sales)),
            hjust = 1.2, color = "white") +
  labs(title = "Top 10 Products by Sales",
       x = "Product",
       y = "Total Sales") +
  theme(legend.position = "none") +
  scale_y_continuous(labels = scales::comma)

Heatmap Showing the Sales by Customer and Category Type
This heatmap shows total sales for each combination of customer type and product category. With the colour scale set in the code, yellow marks the lowest sales and silver the highest. Wands have some of the highest sales and are bought mainly by Hogwarts students, and artifacts are also very high in sales, indicating demand for more advanced products. Staff buy from a variety of categories, which suggests they need basic, everyday teaching supplies. Overall there is high demand for a variety of products to suit different customers' needs, and the heatmap also shows the different customer buying behaviours in the Hogsmeade shop. These figures are helpful for promoting each product to the customers it suits and driving sales further.
hogsmeade_clean %>%
  group_by(customer_type, product_category) %>%
  summarise(total_sales = sum(sales), .groups = "drop") %>%
  ggplot(aes(customer_type, product_category, fill = total_sales)) +
  # This code makes the heatmap
  geom_tile() +
  # Applies the chosen colours: the first colour marks the lowest sales, the last the highest
  scale_fill_gradientn(colours = c("#FDB913", "#C0C0C0")) +
  labs(title = "Sales Heatmap by Customer Type & Category", x = "Customer", y = "Product Category", fill = "Sales")

RStudio Meme
This meme represents what it was like for me trying to download RStudio to my laptop, even with instructions.