Description of the data set:
Chocolate_Sales <- read.csv("Chocolate Sales.csv")
str(Chocolate_Sales)
## 'data.frame': 1094 obs. of 7 variables:
## $ Sales.Person : chr "Jehu Rudeforth" "Van Tuxwell" "Gigi Bohling" "Jan Morforth" ...
## $ Country : chr "UK" "India" "India" "Australia" ...
## $ Customer.Segment: chr "Wholesale" "Retail" "Retail" "Wholesale" ...
## $ Product : chr "Mint Chip Choco" "85% Dark Bars" "Peanut Butter Cubes" "Peanut Butter Cubes" ...
## $ Date : chr "2022-01-04" "2022-08-01" "2022-07-07" "2022-04-27" ...
## $ Amount : int 5320 7896 4501 12726 13685 5376 13685 3080 3990 2835 ...
## $ Boxes.Shipped : int 180 94 91 342 184 38 176 73 59 102 ...
head(Chocolate_Sales)
## Sales.Person Country Customer.Segment Product Date
## 1 Jehu Rudeforth UK Wholesale Mint Chip Choco 2022-01-04
## 2 Van Tuxwell India Retail 85% Dark Bars 2022-08-01
## 3 Gigi Bohling India Retail Peanut Butter Cubes 2022-07-07
## 4 Jan Morforth Australia Wholesale Peanut Butter Cubes 2022-04-27
## 5 Jehu Rudeforth UK Wholesale Peanut Butter Cubes 2022-02-24
## 6 Van Tuxwell India Retail Smooth Sliky Salty 2022-06-06
## Amount Boxes.Shipped
## 1 5320 180
## 2 7896 94
## 3 4501 91
## 4 12726 342
## 5 13685 184
## 6 5376 38
There are 7 variables in this data set. Here is a description of each variable.
Sales Person - Name of the salesperson responsible for the transaction.
Country - Sales region or store location where the transaction took place.
Customer Segment - Type of customer (Retail, Wholesale) the sale was made to.
Product - Name of the chocolate product sold.
Date - The transaction date of the chocolate sale.
Amount - Total revenue generated from the sale.
Boxes Shipped - Number of chocolate boxes shipped in the order.
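As a minimal sketch of how the variable types described above could be checked and tidied, the snippet below rebuilds the six rows printed by `head()` as a small data frame, parses `Date` into the `Date` class, and treats the categorical columns as factors. The name `sample_sales` is hypothetical and used only for illustration; on the full data the same steps would be applied to `Chocolate_Sales`.

```r
# Hypothetical sample: the six rows printed by head(Chocolate_Sales) above
sample_sales <- data.frame(
  Sales.Person = c("Jehu Rudeforth", "Van Tuxwell", "Gigi Bohling",
                   "Jan Morforth", "Jehu Rudeforth", "Van Tuxwell"),
  Country = c("UK", "India", "India", "Australia", "UK", "India"),
  Customer.Segment = c("Wholesale", "Retail", "Retail",
                       "Wholesale", "Wholesale", "Retail"),
  Product = c("Mint Chip Choco", "85% Dark Bars", "Peanut Butter Cubes",
              "Peanut Butter Cubes", "Peanut Butter Cubes", "Smooth Sliky Salty"),
  Date = c("2022-01-04", "2022-08-01", "2022-07-07",
           "2022-04-27", "2022-02-24", "2022-06-06"),
  Amount = c(5320L, 7896L, 4501L, 12726L, 13685L, 5376L),
  Boxes.Shipped = c(180L, 94L, 91L, 342L, 184L, 38L),
  stringsAsFactors = FALSE
)

# Parse the transaction date and mark the categorical columns as factors
sample_sales$Date <- as.Date(sample_sales$Date, format = "%Y-%m-%d")
categorical <- c("Country", "Customer.Segment", "Product")
sample_sales[categorical] <- lapply(sample_sales[categorical], factor)

# Inspect the resulting structure and the two numeric variables
str(sample_sales)
summary(sample_sales[c("Amount", "Boxes.Shipped")])
```

Converting `Date` early means later steps that depend on time ordering do not need to re-parse the character column.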
Static visualization:
library(ggplot2)
ggplot(Chocolate_Sales, aes(x = Boxes.Shipped, y = Amount)) +
  geom_point(aes(col = Country), alpha = 0.8, size = 1) +
  geom_smooth(method = "lm", formula = y ~ x, col = "Black", linewidth = 0.7) +
  labs(title = "Number of Boxes Shipped vs Revenue",
       subtitle = "From Chocolate Sales dataset",
       y = "Revenue", x = "Number of chocolate boxes shipped",
       caption = "Chocolate Sales Demographics")
The scatter plot above shows the number of chocolate boxes shipped per order against the revenue the order generates, with points colored by country. The fitted regression line is nearly flat, which suggests little or no linear relationship between the two: the number of boxes shipped in an order is a poor predictor of its revenue. This may be because the products have widely varying prices per unit.
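The visual impression of a flat line can be backed up with numbers. The sketch below computes the Pearson correlation, the slope and R-squared of the same linear model that `geom_smooth(method = "lm")` fits, and the revenue per box for each order. It runs on the six rows printed by `head()` only, so its output says nothing about the full data set; `sample_sales` and `Revenue.Per.Box` are hypothetical names introduced for illustration, and `Chocolate_Sales` would take the place of `sample_sales` in a real check.

```r
# Hypothetical sample: Boxes.Shipped and Amount from the six rows shown by head()
sample_sales <- data.frame(
  Boxes.Shipped = c(180, 94, 91, 342, 184, 38),
  Amount = c(5320, 7896, 4501, 12726, 13685, 5376)
)

# Pearson correlation: values near 0 indicate a weak linear relationship
r <- cor(sample_sales$Boxes.Shipped, sample_sales$Amount)

# The same simple linear model that geom_smooth(method = "lm", formula = y ~ x) fits
fit <- lm(Amount ~ Boxes.Shipped, data = sample_sales)
slope <- unname(coef(fit)["Boxes.Shipped"])
r_squared <- summary(fit)$r.squared

# Revenue per box, to see how much the per-unit price varies between orders
sample_sales$Revenue.Per.Box <- sample_sales$Amount / sample_sales$Boxes.Shipped

print(c(correlation = r, slope = slope, r_squared = r_squared))
print(range(sample_sales$Revenue.Per.Box))
```

A wide range of revenue per box would be consistent with the explanation that unit prices differ a lot between products.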
Interactive chart:
library(plotly)
Static_Plot <- ggplot(Chocolate_Sales, aes(x = Boxes.Shipped, y = Amount)) +
  geom_point(aes(col = Country), alpha = 0.8, size = 1) +
  geom_smooth(method = "lm", formula = y ~ x, col = "Black", linewidth = 0.7) +
  labs(title = "Number of Boxes Shipped vs Revenue",
       subtitle = "From Chocolate Sales dataset",
       y = "Revenue", x = "Number of chocolate boxes shipped",
       caption = "Chocolate Sales Demographics")
Interactive_Plot <- ggplotly(Static_Plot)
Interactive_Plot
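This is an interactive version of the scatter plot shown above: ggplotly() converts the ggplot object into a plotly chart that supports hovering over points, zooming, and panning.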
Animation:
library(gganimate)
library(av)
library(gifski)
Chocolate_Sales$Date <- as.Date(Chocolate_Sales$Date, format = "%Y-%m-%d")
Animated_Plot <- ggplot(Chocolate_Sales, aes(x = Amount, y = Product, color = Customer.Segment)) +
  geom_point(alpha = 0.8) +
  labs(title = "Product vs Revenue Over Time: {frame_time}",
       x = "Revenue",
       y = "Product",
       color = "Customer Segment") +
  transition_time(Date) +
  ease_aes('linear') +
  theme_minimal()
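# Render once with explicit settings; printing Animated_Plot on its own
# would render the animation a second time with the default settings
animate(Animated_Plot, fps = 3, renderer = gifski_renderer())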
This animation shows the revenue generated by each chocolate product over time, with points colored by customer segment.
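Because individual transactions are scattered across many dates, one possible way to summarize revenue over time is to total it per product per month first. The following is a sketch under that assumption, not the approach used in the animation above; it uses base R only and runs on the six rows printed by `head()`. The names `sample_sales`, `Month`, and `monthly_revenue` are hypothetical.

```r
# Hypothetical sample: three columns from the six rows shown by head()
sample_sales <- data.frame(
  Product = c("Mint Chip Choco", "85% Dark Bars", "Peanut Butter Cubes",
              "Peanut Butter Cubes", "Peanut Butter Cubes", "Smooth Sliky Salty"),
  Date = as.Date(c("2022-01-04", "2022-08-01", "2022-07-07",
                   "2022-04-27", "2022-02-24", "2022-06-06")),
  Amount = c(5320, 7896, 4501, 12726, 13685, 5376)
)

# Bucket each transaction into its calendar month (first day of that month)
sample_sales$Month <- as.Date(format(sample_sales$Date, "%Y-%m-01"))

# Total revenue per product per month, ordered chronologically
monthly_revenue <- aggregate(Amount ~ Product + Month, data = sample_sales, FUN = sum)
monthly_revenue <- monthly_revenue[order(monthly_revenue$Month), ]

print(monthly_revenue)
```

A table like `monthly_revenue` could then be passed to the same plotting code, with `Month` in place of `Date` in `transition_time()`.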