Olist a Brazilian e-commerce
Olist is a Brazilian department store (marketplace) that operates in e-commerce segment, but is not an e-commerce itself as she says. It operates as a SaaS (Software as a Service) technology company since 2015. It offers a marketplace solution (of e-commerce segment) to shopkeepers of all sizes (aand for most segements) to increase their sales whether they have online presence or not.
Olist’s solution consists of three aspects: Software, Contracts with the main marketplaces and Reputation sharing.
Olist says she
Please note Olist’s perspective ( a supply chain preview):: she prescribes there are many factors that can influence the sales of a shopkeeper (e.g: type of product, demand, seasonality, competitive pricing, terms, inventory etc)
Load packages and determine working directory
Load datasets
Customers <- read.csv("olist_customers_dataset.csv", header = TRUE)
Geolocation <- read.csv("olist_geolocation_dataset.csv", header = TRUE)
Items <- read.csv("olist_order_items_dataset.csv", header = TRUE)
Payments <- read.csv("olist_order_payments_dataset.csv", header = TRUE)
Reviews <- read.csv("olist_order_reviews_dataset.csv", header = TRUE)
Orders <- read.csv("olist_orders_dataset.csv", header = TRUE)
Products <- read.csv("olist_products_dataset.csv", header = TRUE)
Sellers <- read.csv("olist_sellers_dataset.csv", header = TRUE)
Pdttrans <- read.csv("product_category_name_translation.csv", header = TRUE) %>%
rename("product_category_name" = "ï..product_category_name")
On Kaggle, she shared its data of 100k orders from 2016 to 2018. There are 8+1 datasets to play with and explore. The data model has been described in image below and it is organised and normalised for each category.
Data Model
Orders dataset
kable(head(Orders), format = 'markdown', caption = 'Orders dataset')
order_id | customer_id | order_status | order_purchase_timestamp | order_approved_at | order_delivered_carrier_date | order_delivered_customer_date | order_estimated_delivery_date |
---|---|---|---|---|---|---|---|
e481f51cbdc54678b7cc49136f2d6af7 | 9ef432eb6251297304e76186b10a928d | delivered | 2017-10-02 10:56:33 | 2017-10-02 11:07:15 | 2017-10-04 19:55:00 | 2017-10-10 21:25:13 | 2017-10-18 00:00:00 |
53cdb2fc8bc7dce0b6741e2150273451 | b0830fb4747a6c6d20dea0b8c802d7ef | delivered | 2018-07-24 20:41:37 | 2018-07-26 03:24:27 | 2018-07-26 14:31:00 | 2018-08-07 15:27:45 | 2018-08-13 00:00:00 |
47770eb9100c2d0c44946d9cf07ec65d | 41ce2a54c0b03bf3443c3d931a367089 | delivered | 2018-08-08 08:38:49 | 2018-08-08 08:55:23 | 2018-08-08 13:50:00 | 2018-08-17 18:06:29 | 2018-09-04 00:00:00 |
949d5b44dbf5de918fe9c16f97b45f8a | f88197465ea7920adcdbec7375364d82 | delivered | 2017-11-18 19:28:06 | 2017-11-18 19:45:59 | 2017-11-22 13:39:59 | 2017-12-02 00:28:42 | 2017-12-15 00:00:00 |
ad21c59c0840e6cb83a9ceb5573f8159 | 8ab97904e6daea8866dbdbc4fb7aad2c | delivered | 2018-02-13 21:18:39 | 2018-02-13 22:20:29 | 2018-02-14 19:46:34 | 2018-02-16 18:17:02 | 2018-02-26 00:00:00 |
a4591c265e18cb1dcee52889e2d8acc3 | 503740e9ca751ccdda7ba28e9ab8f608 | delivered | 2017-07-09 21:57:05 | 2017-07-09 22:10:13 | 2017-07-11 14:58:04 | 2017-07-26 10:57:55 | 2017-08-01 00:00:00 |
customer_id | customer_unique_id | customer_zip_code_prefix | customer_city | customer_state |
---|---|---|---|---|
06b8999e2fba1a1fbc88172c00ba8bc7 | 861eff4711a542e4b93843c6dd7febb0 | 14409 | franca | SP |
18955e83d337fd6b2def6b18a428ac77 | 290c77bc529b7ac935b93aa66c333dc3 | 9790 | sao bernardo do campo | SP |
4e7b3e00288586ebd08712fdd0374a03 | 060e732b5b29e8181a18229c7b0b2b5e | 1151 | sao paulo | SP |
b2b6027bc5c5109e529d4dc6358b12c3 | 259dac757896d24d7702b9acbbff3f3c | 8775 | mogi das cruzes | SP |
4f2d8ab171c80ec8364f7c12e35b23ad | 345ecd01c38d18a9036ed96c73b8d066 | 13056 | campinas | SP |
879864dab9bc3047522c92c82e1212b8 | 4c93744516667ad3b8f1fb645a3116a4 | 89254 | jaragua do sul | SC |
kable(head(Geolocation), format = 'markdown', caption = 'Geolocation dataset')
geolocation_zip_code_prefix | geolocation_lat | geolocation_lng | geolocation_city | geolocation_state |
---|---|---|---|---|
1037 | -23.54562 | -46.63929 | sao paulo | SP |
1046 | -23.54608 | -46.64482 | sao paulo | SP |
1046 | -23.54613 | -46.64295 | sao paulo | SP |
1041 | -23.54439 | -46.63950 | sao paulo | SP |
1035 | -23.54158 | -46.64161 | sao paulo | SP |
1012 | -23.54776 | -46.63536 | são paulo | SP |
order_id | order_item_id | product_id | seller_id | shipping_limit_date | price | freight_value |
---|---|---|---|---|---|---|
00010242fe8c5a6d1ba2dd792cb16214 | 1 | 4244733e06e7ecb4970a6e2683c13e61 | 48436dade18ac8b2bce089ec2a041202 | 2017-09-19 09:45:35 | 58.90 | 13.29 |
00018f77f2f0320c557190d7a144bdd3 | 1 | e5f2d52b802189ee658865ca93d83a8f | dd7ddc04e1b6c2c614352b383efe2d36 | 2017-05-03 11:05:13 | 239.90 | 19.93 |
000229ec398224ef6ca0657da4fc703e | 1 | c777355d18b72b67abbeef9df44fd0fd | 5b51032eddd242adc84c38acab88f23d | 2018-01-18 14:48:30 | 199.00 | 17.87 |
00024acbcdf0a6daa1e931b038114c75 | 1 | 7634da152a4610f1595efa32f14722fc | 9d7a1d34a5052409006425275ba1c2b4 | 2018-08-15 10:10:18 | 12.99 | 12.79 |
00042b26cf59d7ce69dfabb4e55b4fd9 | 1 | ac6c3623068f30de03045865e4e10089 | df560393f3a51e74553ab94004ba5c87 | 2017-02-13 13:57:51 | 199.90 | 18.14 |
00048cc3ae777c65dbb7d2a0634bc1ea | 1 | ef92defde845ab8450f9d70c526ef70f | 6426d21aca402a131fc0a5d0960a3c90 | 2017-05-23 03:55:27 | 21.90 | 12.69 |
kable(head(Payments), format = 'markdown', caption = 'Payments dataset')
order_id | payment_sequential | payment_type | payment_installments | payment_value |
---|---|---|---|---|
b81ef226f3fe1789b1e8b2acac839d17 | 1 | credit_card | 8 | 99.33 |
a9810da82917af2d9aefd1278f1dcfa0 | 1 | credit_card | 1 | 24.39 |
25e8ea4e93396b6fa0d3dd708e76c1bd | 1 | credit_card | 1 | 65.71 |
ba78997921bbcdc1373bb41e913ab953 | 1 | credit_card | 8 | 107.78 |
42fdf880ba16b47b59251dd489d4441a | 1 | credit_card | 2 | 128.45 |
298fcdf1f73eb413e4d26d01b25bc1cd | 1 | credit_card | 2 | 96.12 |
product_id | product_category_name | product_name_lenght | product_description_lenght | product_photos_qty | product_weight_g | product_length_cm | product_height_cm | product_width_cm |
---|---|---|---|---|---|---|---|---|
1e9e8ef04dbcff4541ed26657ea517e5 | perfumaria | 40 | 287 | 1 | 225 | 16 | 10 | 14 |
3aa071139cb16b67ca9e5dea641aaa2f | artes | 44 | 276 | 1 | 1000 | 30 | 18 | 20 |
96bd76ec8810374ed1b65e291975717f | esporte_lazer | 46 | 250 | 1 | 154 | 18 | 9 | 15 |
cef67bcfe19066a932b7673e239eb23d | bebes | 27 | 261 | 1 | 371 | 26 | 4 | 26 |
9dc1a7de274444849c219cff195d0b71 | utilidades_domesticas | 37 | 402 | 4 | 625 | 20 | 17 | 13 |
41d3672d4792049fa1779bb35283ed13 | instrumentos_musicais | 60 | 745 | 1 | 200 | 38 | 5 | 11 |
review_id | order_id | review_score | review_comment_title | review_comment_message | review_creation_date | review_answer_timestamp |
---|---|---|---|---|---|---|
7bc2406110b926393aa56f80a40eba40 | 73fc7af87114b39712e6da79b0a377eb | 4 | 2018-01-18 00:00:00 | 2018-01-18 21:46:59 | ||
80e641a11e56f04c1ad469d5645fdfde | a548910a1c6147796b98fdf73dbeba33 | 5 | 2018-03-10 00:00:00 | 2018-03-11 03:05:13 | ||
228ce5500dc1d8e020d8d1322874b6f0 | f9e4b658b201a9f2ecdecbb34bed034b | 5 | 2018-02-17 00:00:00 | 2018-02-18 14:36:24 | ||
e64fb393e7b32834bb789ff8bb30750e | 658677c97b385a9be170737859d3511b | 5 | Recebi bem antes do prazo estipulado. | 2017-04-21 00:00:00 | 2017-04-21 22:02:06 | |
f7c4243c7fe1938f181bec41a392bdeb | 8e6bfb81e283fa7e4f11123a3fb894f1 | 5 | Parabéns lojas lannister adorei comprar pela Internet seguro e prático Parabéns a todos feliz Páscoa | 2018-03-01 00:00:00 | 2018-03-02 10:26:53 | |
15197aa66ff4d0650b5434f1b46cda19 | b18dcdf73be66366873cd26c5724d1dc | 1 | 2018-04-13 00:00:00 | 2018-04-16 00:39:37 |
seller_id | seller_zip_code_prefix | seller_city | seller_state |
---|---|---|---|
3442f8959a84dea7ee197c632cb2df15 | 13023 | campinas | SP |
d1b65fc7debc3361ea86b5f14c68d2e2 | 13844 | mogi guacu | SP |
ce3ad9de960102d0677a81f5d0bb7b2d | 20031 | rio de janeiro | RJ |
c0f3eea2e14555b6faeea3dd58c1b1c3 | 4195 | sao paulo | SP |
51a04a8a6bdcb23deccc82b0b80742cf | 12914 | braganca paulista | SP |
c240c4061717ac1806ae6ee72be3533b | 20920 | rio de janeiro | RJ |
kable(head(Pdttrans), format = 'markdown', caption = 'Translation dataset')
product_category_name | product_category_name_english |
---|---|
beleza_saude | health_beauty |
informatica_acessorios | computers_accessories |
automotivo | auto |
cama_mesa_banho | bed_bath_table |
moveis_decoracao | furniture_decor |
esporte_lazer | sports_leisure |
Olist has been simulate addition of Customers, Product Categories, and Sellers. As per the data, there has been a steady stream of customers and sellers getting registered Olist’s platform. This progress, in conjunction with regular increase in number of new product categories being offered has maintainedOlist’s growth momentum.
The graph shows, in 2017, there has been a positive trend line in number of New Customers [Customer Unique Identity] getting registered with Olist. In 2018, morethan 6000 were getting registered every month.
In my opinion, addition of + 50 (approximate) New Product Categories per month - as an offer - has galvanised the spurt in new customers getting registered. Consistency in this phenomenon was Olist’s byword for 20 months i.e from January 2017 to August 2018.
Sellers too maintained the similar trend as that of Customers, and within the time frame of this data, Olist ended up having + 3095 registered sellers on it’s platform
Seller.Gr <- inner_join(Orders,
left_join(Items, Sellers, by = 'seller_id'),
by = 'order_id')
Seller.Gr %>% select_at(vars(purchase_year, purchase_month, seller_id)) %>%
distinct_at(vars(purchase_year, purchase_month, seller_id)) %>%
arrange_at(vars(purchase_year, purchase_month)) %>%
count(purchase_year, purchase_month) %>%
ggplot(aes(x = factor(purchase_month), y =n)) +
geom_point(color = 'red', alpha = 0.63, shape = 21, aes(size = n), fill = alpha('cornsilk', 0.3)) +
geom_segment(aes(x = factor(purchase_month), xend = factor(purchase_month),
y = 0, yend = n), alpha = 0.45, linetyle = 3, color = 'red') +
coord_polar(start = 2) +
facet_wrap(.~purchase_year) +
theme_minimal() +
theme(axis.text = element_text(size = 8), panel.grid.major = element_line(),
legend.position = 'bottom') +
labs(title = 'Increase in number of sellers per month',
x = 'Month', y = 'Number of Sellers')
## Warning: Ignoring unknown parameters: linetyle
Buoyancy of Product Categories can be gauged from the fact how frequently a Product Category has been purchased by customers? OR which are the customer driven “FAST MOVING” product categories? This kernel of information may act as one of the parameters which a company (and in this case Olist) may probably use for portfolio building.
Lets first start with the case where all Product Categories were purchased multiple times. The graph below shows which categories are high frequency categories.
Please Note: It excludes all the categories that were purchased as: one product category per month per year
Case where only one order was placed for one product category in a month and in a given year
Case where orders were placed for various product categories and each product category was purchased more than once but less than six times. Plus, list below provides some most frequently ordered categories
Case where only one order was placed for one product category in a month and in a given year
Case where orders were placed for various product categories and each product category was purchased more than five but less than twenty one times. Plus, list below provides some most frequently ordered categories.
Case where orders were placed for various product categories and each product category was purchased more than twenty but less than ninety three times. Plus, list below provides some most frequently ordered categories.
Case where orders were placed for various product categories and each product category was purchased more than ninety two but less than two hundred and fifty one times. Plus, list below provides some most frequently ordered categories.
Case where orders were placed for various product categories and each product category was purchased more than two hundred and fifty times. Plus, list below provides some most frequently ordered categories
k means clustering has been used to create clusters and to discover significant features. This helped me look into; how those features interacted and what were the relations between them? The clusters were created to look into data in tabular form and to discover the polynomial equations.
Please Note: Price & Freight Value is amount charged per item and is not per order. Plus, an order may have more than one item
#assignments <- unnest(kclusts, augmented)
#p1 <- assignments %>% head(10000) %>%
#ggplot(aes(x = price, y = freight_value)) +
#geom_point(aes(color))
clusterings %>% ggplot(aes(x = k, y = tot.withinss)) +
geom_line() + geom_point()
The \(\frac{\text{Residual Deviance}}{\text{Degrees of freedom}} > 1 \text{ or} \text{ } \frac{\text{Variance}}{\text{Mean}} \neq 1\) for Poisson Regression
is at 3.207, so probably there is overdispersion. We adjust to take into account overdispersion and use Quasi-Poisson Regression
equation.
Quasi-Poisson: regression equation
\[\text{Freight Value}_\text{ Cluster 1} \approx 2.830 - 0.037x + \epsilon \]
\[ x \Rightarrow \text{Number of Items per Order}, \text{ } \epsilon \Rightarrow \text{error}\]
Interpretation: The predictor is significant as p-value \(\approx\) ZERO, the Dispersion parameter for Quasi-Poisson
family is taken to be at 3.984. The \(\beta = -0.037\) or \(\beta < 0\) implies with every unit increase in items per order the freight value will decrease and will be multiplied by -0.037.
IF we take 8 different parameters to determine the Freight Value then using Quasi-Poisson regression
- following predictors are significant with p-value \(\approx\) ZERO:
…and lastly the distance is yet to be taken into account.
NB: - strikethrough are non-significant parameters. Codes can be had from ‘Codes’ section.
##TG31 <- glm(freight_value ~ order_item_id, data = Tot.Grkm1, family="poisson or quasipoisson")
##summary(TG31)
Polynomial: regression equation
\[\text{Freight Value}_\text{ Cluster 1} \approx 16.22 - 96.55x - 20.80x^3 + 29.69x^4 - 20.78x^5 + \epsilon \]
\[ x \Rightarrow \text{Number of Items per Order}, \text{ } \epsilon \Rightarrow \text{error}\]
The \(\frac{\text{Residual Deviance}}{\text{Degrees of freedom}} > 1 \text{ or} \text{ } \frac{\text{Variance}}{\text{Mean}} \neq 1\) for Poisson Regression
is at 3.528, so probably there is overdispersion. We adjust to take into account overdispersion and use Quasi-Poisson Regression
equation.
Quasi-Poisson: regression equation
\[\text{Freight Value}_\text{ Cluster 2} \approx 2.867 - 0.018x + \epsilon \]
\[ x \Rightarrow \text{Number of Items per Order}, \text{ } \epsilon \Rightarrow \text{error}\]
Interpretation: The predictor is significant as p-value \(\approx\) ZERO, the Dispersion parameter for Quasi-Poisson
family is taken to be at 4.204. The \(\beta = -0.018\) or \(\beta < 0\) implies with every unit increase in items per order the freight value will decrease and will be multiplied by -0.018.
IF we take 8 different parameters to determine the Freight Value then using Quasi-Poisson regression
- following predictors are significant with p-value \(\approx\) ZERO:
…and lastly the distance is yet to be taken into account.
NB: - strikethrough are non-significant parameters. Codes can be had from ‘Codes’ section.
##TG32 <- glm(freight_value ~ order_item_id, data = Tot.Grkm2, family="poisson or quasipoisson")
##summary(TG32)
Polynomial: regression equation
\[\text{Freight Value}_\text{ Cluster 2} \approx 17.14 - 34.91x + \epsilon \]
\[ x \Rightarrow \text{Number of Items per Order}, \text{ } \epsilon \Rightarrow \text{error}\]
The \(\frac{\text{Residual Deviance}}{\text{Degrees of freedom}} > 1 \text{ or} \text{ } \frac{\text{Variance}}{\text{Mean}} \neq 1\) for Poisson Regression
is at 35.237, so probably there is overdispersion. We adjust to take into account overdispersion and use Quasi-Poisson Regression
equation.
Quasi-Poisson: regression equation
\[\text{Freight Value}_\text{ Cluster 3} \approx 4.525 - 0.032x + \epsilon \]
\[ x \Rightarrow \text{Number of Items per Order}, \text{ } \epsilon \Rightarrow \text{error}\]
Interpretation: The predictor is significant as p-value \(\approx\) ZERO, the Dispersion parameter for Quasi-Poisson
family is taken to be at 38.974. The \(\beta = -0.032\) or \(\beta < 0\) implies with every unit increase in items per order the freight value will decrease and will be multiplied by -0.032.
IF we take 8 different parameters to determine the Freight Value then using Quasi-Poisson regression
- following predictors are significant with p-value \(\approx\) ZERO:
…and lastly the distance is yet to be taken into account.
NB: - strikethrough are non-significant parameters. Codes can be had from ‘Codes’ section.
##TG33 <- glm(freight_value ~ order_item_id, data = Tot.Grkm3, family="poisson or quasipoisson")
##summary(TG33)
Polynomial: regression equation
\[\text{Freight Value}_\text{ Cluster 3} \approx 89.06 + \epsilon \]
\[ \epsilon \Rightarrow \text{error} \]
The \(\frac{\text{Residual Deviance}}{\text{Degrees of freedom}} > 1 \text{ or} \text{ } \frac{\text{Variance}}{\text{Mean}} \neq 1\) for Poisson Regression
is at 18.649, so probably there is overdispersion. We adjust to take into account overdispersion and use Quasi-Poisson Regression
equation.
Quasi-Poisson: regression equation
\[\text{Freight Value}_\text{ Cluster 4} \approx 3.964 - 0.028x + \epsilon \]
\[ x \Rightarrow \text{Number of Items per Order}, \text{ } \epsilon \Rightarrow \text{error}\]
Interpretation: The predictor is significant as p-value \(\approx\) ZERO, the Dispersion parameter for Quasi-Poisson
family is taken to be at 23.213. The \(\beta = -0.028\) or \(\beta < 0\) implies with every unit increase in items per order the freight value will decrease and will be multiplied by -0.028.
IF we take 8 different parameters to determine the Freight Value then using Quasi-Poisson regression
- following predictors are significant with p-value \(\approx\) ZERO:
…and lastly the distance is yet to be taken into account.
NB: - strikethrough are non-significant parameters. Codes can be had from ‘Codes’ section.
##TG34 <- glm(freight_value ~ order_item_id, data = Tot.Grkm4, family="poisson or quasipoisson")
##summary(TG34)
Polynomial: regression equation
\[\text{Freight Value}_\text{ Cluster 4} \approx 50.88 + 98.48x^2 - 90.17x^3 + \epsilon \]
\[ x \Rightarrow \text{Number of Items per Order}, \text{ } \epsilon \Rightarrow \text{error}\]
The \(\frac{\text{Residual Deviance}}{\text{Degrees of freedom}} > 1 \text{ or} \text{ } \frac{\text{Variance}}{\text{Mean}} \neq 1\) for Poisson Regression
is at 4.810, so probably there is overdispersion. We adjust to take into account overdispersion and use Quasi-Poisson Regression
equation.
Quasi-Poisson: regression equation
\[\text{Freight Value}_\text{ Cluster 5} \approx 3.169 - 0.068x + \epsilon \]
\[ x \Rightarrow \text{Number of Items per Order}, \text{ } \epsilon \Rightarrow \text{error}\]
Interpretation: The predictor is significant as p-value \(\approx\) ZERO, the Dispersion parameter for Quasi-Poisson
family is taken to be at 5.761. The \(\beta = -0.068\) or \(\beta < 0\) implies with every unit increase in items per order the freight value will decrease and will be multiplied by -0.068.
IF we take 8 different parameters to determine the Freight Value then using Quasi-Poisson regression
- following predictors are significant with p-value \(\approx\) ZERO:
…and lastly the distance is yet to be taken into account.
NB: - strikethrough are non-significant parameters. Codes can be had from ‘Codes’ section.
##TG35 <- glm(freight_value ~ order_item_id, data = Tot.Grkm5, family="poisson or quasipoisson")
##summary(TG35)
Polynomial: regression equation
\[\text{Freight Value}_\text{ Cluster 5} \approx 22.00 - 95.53x + 27.75x^2 + \epsilon \]
\[ x \Rightarrow \text{Number of Items per Order}, \text{ } \epsilon \Rightarrow \text{error}\]
The \(\frac{\text{Residual Deviance}}{\text{Degrees of freedom}} > 1 \text{ or} \text{ } \frac{\text{Variance}}{\text{Mean}} \neq 1\) for Poisson Regression
is at 9.993, so probably there is overdispersion. We adjust to take into account overdispersion and use Quasi-Poisson Regression
equation.
Quasi-Poisson: regression equation
\[\text{Freight Value}_\text{ Cluster 6} \approx 3.461 - 0.012x + \epsilon \]
\[ x \Rightarrow \text{Number of Items per Order}, \text{ } \epsilon \Rightarrow \text{error}\]
Interpretation: The predictor is significant as p-value \(\approx\) ZERO, the Dispersion parameter for Quasi-Poisson
family is taken to be at 12.102. The \(\beta = -0.012\) or \(\beta < 0\) implies with every unit increase in items per order the freight value will decrease and will be multiplied by -0.012.
IF we take 8 different parameters to determine the Freight Value then using Quasi-Poisson regression
- following predictors are significant with p-value \(\approx\) ZERO:
…and lastly the distance is yet to be taken into account.
NB: - strikethrough are non-significant parameters. Codes can be had from ‘Codes’ section.
##TG36 <- glm(freight_value ~ order_item_id, data = Tot.Grkm6, family="poisson or quasipoisson")
##summary(TG36)
Polynomial: regression equation
\[\text{Freight Value}_\text{ Cluster 6} \approx 31.41 + 49.12x^2 + \epsilon \]
\[ x \Rightarrow \text{Number of Items per Order}, \text{ } \epsilon \Rightarrow \text{error}\]
The graph below shows how the density of customers and sellers are spread across various states of Brazil. It is a stagnant presentation of graph “Customers.Sellers.States.gif”, that is pasted below along with it’s code.
Graph is the first pictorial insight into: how Customer Order Cycle (the time to deliver goods) has changed as Olist started growing. This graph is the stagnant presentation of the graph “Lead.Times.gif”, which has been pasted below along with it’s code.
Thanks for giving your time to this long notebook. Should you like it, please do upvote.
Appendix 01
The code of “Customers.Sellers.States.gif” graph has been provided below. The animated graph had been created with gganimate package but due to API version incompatibility the actual code has been provided below for reference.
#CS.states %>%
#ggplot(aes(y = customer_state, x = seller_state))+
#geom_jitter(fill = alpha("blue", 0.001), color = "white", alpha = 0.36, size = 2.1, shape = 21)+
#coord_polar(start = 0)+
#theme_minimal()+
#labs(title ="Order Status' Density :: ...{closest_state}...",
#x = "Seller's State", y = "Customer's State",
#subtitle = "States :: Sellers vs. Customers")+
#transition_states(order_status, transition_length = 5, state_length = 5)
State
Appendix 02
The actual graph had been created with gganimate package but due to API version incompatibility the code of “Lead.Times.gif” graph has been provided below for reference.
Average Lead Times per Quarter per Year
#Lead.Time %>% ggplot(aes(x = purchase_quart, y =Est.Lead.t))+
#geom_line(color = "red", alpha = 0.81, linetype = "longdash")+
#geom_segment(aes(xend = 2018.4, yend = Est.Lead.t),
#linetype = 2, colour = 'grey51')+
#geom_text(aes(x = 2018.3, label = Quart.Y, hjust = 1))+
#geom_point(size=3, color="red", fill=alpha("orange", 0.63),
#alpha=0.54, shape=21, stroke=1)+
#facet_wrap(~order_status)+
#coord_polar(start = 0)+
#transition_reveal(purchase_quart)+
#theme(axis.text.x = element_blank(),
#axis.ticks = element_blank(),
#panel.background = element_blank(),
#panel.grid.major.x = element_line(color = "coral", linetype = 3),
#panel.grid.major.y = element_line(color = "coral", linetype = 3),
#strip.background = element_blank())+
#labs(title = "Change in proposed average lead times per Quarter/Year",
#subtitle = "Proposed average number of days to delivery",
#x = "Purchasing Quarter/Year",
#y= " Estimated average lead time [... in Days]",
#caption = "derived from Thomas Lin Pedersen's Dec 2018 graph: ~github.com/thomasp85")
gif