Description of the data set:

Chocolate_Sales <- read.csv("Chocolate Sales.csv")

str(Chocolate_Sales)
## 'data.frame':    1094 obs. of  7 variables:
##  $ Sales.Person    : chr  "Jehu Rudeforth" "Van Tuxwell" "Gigi Bohling" "Jan Morforth" ...
##  $ Country         : chr  "UK" "India" "India" "Australia" ...
##  $ Customer.Segment: chr  "Wholesale" "Retail" "Retail" "Wholesale" ...
##  $ Product         : chr  "Mint Chip Choco" "85% Dark Bars" "Peanut Butter Cubes" "Peanut Butter Cubes" ...
##  $ Date            : chr  "2022-01-04" "2022-08-01" "2022-07-07" "2022-04-27" ...
##  $ Amount          : int  5320 7896 4501 12726 13685 5376 13685 3080 3990 2835 ...
##  $ Boxes.Shipped   : int  180 94 91 342 184 38 176 73 59 102 ...
head(Chocolate_Sales)
##     Sales.Person   Country Customer.Segment             Product       Date
## 1 Jehu Rudeforth        UK        Wholesale     Mint Chip Choco 2022-01-04
## 2    Van Tuxwell     India           Retail       85% Dark Bars 2022-08-01
## 3   Gigi Bohling     India           Retail Peanut Butter Cubes 2022-07-07
## 4   Jan Morforth Australia        Wholesale Peanut Butter Cubes 2022-04-27
## 5 Jehu Rudeforth        UK        Wholesale Peanut Butter Cubes 2022-02-24
## 6    Van Tuxwell     India           Retail  Smooth Sliky Salty 2022-06-06
##   Amount Boxes.Shipped
## 1   5320           180
## 2   7896            94
## 3   4501            91
## 4  12726           342
## 5  13685           184
## 6   5376            38

There are 7 variables in this data set. Here is a description of each variable:

Sales Person - Name of the salesperson responsible for the transaction.
Country - Sales region or store location where the transaction took place.
Customer Segment - Type of customer (Retail, Wholesale) the sale was made to.
Product - Name of the chocolate product sold.
Date - The transaction date of the chocolate sale.
Amount - Total revenue generated from the sale.
Boxes Shipped - Number of chocolate boxes shipped in the order.
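As a quick check on the variable descriptions, the sketch below reports each column's class, number of missing values, and number of distinct values. The helper name `describe_columns` is hypothetical and not part of the original analysis, and the toy data frame only reuses the first three rows printed by `head()` so that the block runs on its own.

```r
# Hypothetical helper (an assumption, not from the original report):
# summarise the class, missing values and distinct values of each column.
describe_columns <- function(df) {
  data.frame(
    variable  = names(df),
    class     = vapply(df, function(col) class(col)[1], character(1)),
    n_missing = vapply(df, function(col) sum(is.na(col)), integer(1)),
    n_unique  = vapply(df, function(col) length(unique(col)), integer(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Toy rows copied from the head() output, used only to demonstrate the call.
toy <- data.frame(
  Sales.Person     = c("Jehu Rudeforth", "Van Tuxwell", "Gigi Bohling"),
  Country          = c("UK", "India", "India"),
  Customer.Segment = c("Wholesale", "Retail", "Retail"),
  Product          = c("Mint Chip Choco", "85% Dark Bars", "Peanut Butter Cubes"),
  Date             = c("2022-01-04", "2022-08-01", "2022-07-07"),
  Amount           = c(5320L, 7896L, 4501L),
  Boxes.Shipped    = c(180L, 94L, 91L),
  stringsAsFactors = FALSE
)

column_summary <- describe_columns(toy)
print(column_summary)
```

On the full data, the same call would be `describe_columns(Chocolate_Sales)`. Note that Date is read in as character, so it needs converting with `as.Date()` before it can be used as a time variable.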

Static visualization:

library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.4.3
# Scatter plot of boxes shipped against revenue, one point per order
ggplot(Chocolate_Sales, aes(x = Boxes.Shipped, y = Amount)) +
  # Colour the points by country
  geom_point(aes(col = Country), alpha = 0.8, size = 1) +
  # Overlay a least-squares line fitted to all orders
  geom_smooth(method = "lm", formula = y ~ x, col = "black", linewidth = 0.7) +
  labs(title = "Number of Boxes Shipped vs Revenue",
       subtitle = "From Chocolate Sales dataset",
       y = "Revenue", x = "Number of chocolate boxes shipped",
       caption = "Chocolate Sales Demographics")

The scatter plot above shows the relationship between the number of chocolate boxes shipped per order and the revenue the order generates, with points colored by country. The fitted regression line is nearly flat, which suggests little or no linear relationship between the two variables: the number of boxes shipped in an order is a poor predictor of its revenue. One possible explanation is that the products have widely varying prices per box.
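The visual impression of a flat line can be backed up numerically. The sketch below is one possible way to do so, using base R only: it computes the Pearson correlation, the regression slope with its p-value, and R-squared. The helper name `linear_fit_summary` is hypothetical, and the six demonstration rows are simply those printed by `head()`, far too few to draw any conclusion from; they are there only so the block runs on its own.

```r
# Hypothetical helper (an assumption, not from the original report):
# correlation and simple linear regression summary for two numeric columns.
linear_fit_summary <- function(df, x = "Boxes.Shipped", y = "Amount") {
  fit   <- lm(df[[y]] ~ df[[x]])
  coefs <- summary(fit)$coefficients
  list(
    correlation   = cor(df[[x]], df[[y]]),
    slope         = unname(coefs[2, "Estimate"]),
    slope_p_value = unname(coefs[2, "Pr(>|t|)"]),
    r_squared     = summary(fit)$r.squared
  )
}

# The six rows shown by head(Chocolate_Sales), for demonstration only.
first_rows <- data.frame(
  Boxes.Shipped = c(180, 94, 91, 342, 184, 38),
  Amount        = c(5320, 7896, 4501, 12726, 13685, 5376)
)

fit_summary <- linear_fit_summary(first_rows)
str(fit_summary)

# Revenue per box, to look at how much the price per unit varies.
revenue_per_box <- first_rows$Amount / first_rows$Boxes.Shipped
print(range(revenue_per_box))
```

On the full data, `linear_fit_summary(Chocolate_Sales)` would give the figures for all 1094 orders. A correlation close to zero and a large slope p-value would be consistent with the flat regression line, and a wide spread in `Amount / Boxes.Shipped` would be consistent with the varying-price explanation.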

Interactive chart:

library(plotly)
## Warning: package 'plotly' was built under R version 4.4.3
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
# Store the same scatter plot in an object so it can be passed to ggplotly()
Static_Plot <- ggplot(Chocolate_Sales, aes(x = Boxes.Shipped, y = Amount)) +
  geom_point(aes(col = Country), alpha = 0.8, size = 1) +
  geom_smooth(method = "lm", formula = y ~ x, col = "black", linewidth = 0.7) +
  labs(title = "Number of Boxes Shipped vs Revenue",
       subtitle = "From Chocolate Sales dataset",
       y = "Revenue", x = "Number of chocolate boxes shipped",
       caption = "Chocolate Sales Demographics")

Interactive_Plot <- ggplotly(Static_Plot)

Interactive_Plot

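This is an interactive version of the scatter plot shown above. Hovering over a point displays its values, and clicking a country in the legend hides or shows that country's points.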

Animation:

library(gganimate)
## Warning: package 'gganimate' was built under R version 4.4.3
library(av)
## Warning: package 'av' was built under R version 4.4.3
library(gifski)
## Warning: package 'gifski' was built under R version 4.4.3
Chocolate_Sales$Date <- as.Date(Chocolate_Sales$Date, format = "%Y-%m-%d")

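# One point per sale: revenue on the x-axis, product on the y-axis
Animated_Plot <- ggplot(Chocolate_Sales,
                        aes(x = Amount, y = Product, color = Customer.Segment)) +
  geom_point(alpha = 0.8) +
  # {frame_time} is filled in with the date of the current frame
  labs(title = "Product vs Revenue Over Time: {frame_time}",
       x = "Revenue",
       y = "Product",
       color = "Customer Segment") +
  transition_time(Date) +  # step through the sales by transaction date
  ease_aes("linear") +
  theme_minimal()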

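# Render once at 3 frames per second. Printing Animated_Plot on its own
# would render the animation a second time with the default settings.
animate(Animated_Plot, fps = 3, renderer = gifski_renderer())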

This animation shows the revenue of individual sales for each chocolate product over time, with points colored by customer segment.