This data set offers a lot of interesting information that can help us answer several questions:
What methods can we use to ensure the product arrives on time? What can we do to earn a good rating? Which factors correlate most strongly, and therefore matter most? How do our ratings compare to other companies?
I believe these questions are interesting because answering them helps determine what a great supply chain has. By looking at all the components, we can see what a company can do to make its supply chain great. I also find this data to be very rich, with many hidden patterns that we cannot see just by looking at it. Only an analysis can reveal them.
Process Optimization
To begin, we will look at the best method for delivering our products. We can identify the optimized path by looking at which mode of shipment, which warehouse block, and which level of importance are most effective for delivering our product on time. By comparing this to the most frequently chosen path, we can show the company that changing to the more effective path would benefit it by providing downstream value to customers through cost savings. The first step in determining what a great supply chain looks like is measuring the effectiveness and frequency of each path. We start with the completion rate of each mode of transportation to determine which is the most reliable way to ship our products.
| Mode of shipment | Shipments | Delivered on time | Completion rate (%) |
|---|---|---|---|
| Flight | 1777 | 1069 | 60.15757 |
| Road | 1760 | 1035 | 58.80682 |
| Ship | 7462 | 4459 | 59.75610 |
As we can see here, the three modes of transportation have very similar completion rates, all within about 1.4 percentage points of each other. Flight has the highest rate (60.16%), so it appears to be the most effective way to send our products. Assuming that the cost of shipping is similar across modes, flight would be the best way to send all products.
However, according to the graph, ship is by far the most commonly used mode of transportation. Since we don’t know which company this is, we have to assume this is either because it is not near an easily accessible airport or because its warehouse is located close to a harbor.
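The completion rates by shipment mode can be computed with a grouped summary. The following is a minimal sketch on toy data, not the report's original code: the column names `Mode_of_Shipment` and `on_time`, and the helper `completion_by()`, are assumptions made for illustration. The same helper could be applied to warehouse block or product importance.

```r
library(dplyr)

# Toy stand-in for the shipping data; column names are assumptions.
shipments <- tibble(
  Mode_of_Shipment = c("Flight", "Flight", "Road", "Ship", "Ship", "Ship"),
  on_time          = c(1, 0, 1, 1, 1, 0)  # 1 = counted as on time, 0 = not
)

# Hypothetical helper: count shipments and on-time deliveries per group,
# then derive the completion rate as a percentage.
completion_by <- function(data, group_col) {
  data |>
    group_by({{ group_col }}) |>
    summarise(
      n_shipments       = n(),
      delivered_on_time = sum(on_time),
      completion_rate   = 100 * delivered_on_time / n_shipments,
      .groups = "drop"
    )
}

mode_rates <- completion_by(shipments, Mode_of_Shipment)
print(mode_rates)
```

Keeping the shipment count next to the rate makes it easy to compare how effective a mode is with how often it is used.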
We can do a similar analysis to determine which warehouse is the best.
| Warehouse block | Shipments | Delivered on time | Completion rate (%) |
|---|---|---|---|
| A | 1833 | 1075 | 58.64703 |
| B | 1833 | 1104 | 60.22913 |
| C | 1833 | 1094 | 59.68358 |
| D | 1834 | 1096 | 59.76009 |
| F | 3666 | 2194 | 59.84725 |
Again, we see similar results for each of the warehouses. However, understanding why the volume is so much larger at warehouse block F than at the slightly more effective warehouse block B could help supply chain managers reduce costs for their downstream customers.
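Because the warehouse rates are so close, it may be worth checking whether the gap between blocks B and F is distinguishable from sampling noise. One possible check, sketched here as a suggestion and not as part of the original analysis, is a two-sample test of equal proportions using base R's `prop.test()` with the counts from the warehouse table.

```r
# Counts from the warehouse table: on-time deliveries and total shipments
on_time <- c(B = 1104, F = 2194)
totals  <- c(B = 1833, F = 3666)

# Two-sample test of equal proportions (stats package, loaded by default)
result <- prop.test(x = on_time, n = totals)

print(result$estimate)  # the two observed on-time proportions
print(result$p.value)   # a small p-value would suggest a real difference
```

The same call works for any pair of rows in the mode or importance tables by swapping in their counts.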
This graph helps us understand the main process by which this electronics company supplies goods to its customers. By following the main source of volume, we can determine the most frequently followed path. Right now, the main process flow runs through warehouse block F. However, by routing through the more effective warehouse block B, the company could reduce costs, even if the difference is marginal.
Finally, I believe that the priority of our products matters in our supply chain. Distinguishing between high importance and low importance packages could make a difference in what the optimal process would be.
We are going to do a similar analysis to determine the best process path for our products to take.
| Product importance | Shipments | Delivered on time | Completion rate (%) |
|---|---|---|---|
| high | 948 | 616 | 64.97890 |
| low | 5297 | 3140 | 59.27884 |
| medium | 4754 | 2807 | 59.04501 |
Here, the most effective route and the most frequently taken route differ. Products labeled as high importance reach their delivery point on time most often (64.98%). However, a delivery is far more commonly labeled as low importance, and low importance items are delivered on time less often (59.28%) than high importance items. Therefore, my recommendation to this company would be to start treating all deliveries as if they were high importance items, increasing the completion rate and thus delivering downstream value to customers.
We see how often each importance level is assigned and how often products at each level reached their destination on time. Again, instead of making a categorical distinction between products, the company should treat them more uniformly to improve process efficiency.
After finding the effectiveness and frequency of the different factors, we can come up with a plan for this company to use its most efficient options to achieve a higher on-time rate. Right now, the company uses ship, warehouse block F, and low importance for the majority of its products. However, by using flight, warehouse block B, and high importance for the majority of the products going through its process, it would create a more effective process, saving the company and its customers money.
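The comparison between the most frequent path and the most effective path can also be made on full combinations of mode, warehouse block, and importance. This is a minimal sketch on toy data; the column names `Mode_of_Shipment`, `Warehouse_block`, `Product_importance`, and `on_time` are assumptions for illustration.

```r
library(dplyr)

# Toy stand-in for the shipping data; column names are assumptions.
shipments <- tibble(
  Mode_of_Shipment   = c("Ship", "Ship", "Ship", "Flight", "Flight", "Road"),
  Warehouse_block    = c("F", "F", "F", "B", "B", "A"),
  Product_importance = c("low", "low", "low", "high", "high", "medium"),
  on_time            = c(1, 0, 0, 1, 1, 0)  # 1 = counted as on time
)

# One row per path: how often it is used and how often it arrives on time.
paths <- shipments |>
  group_by(Mode_of_Shipment, Warehouse_block, Product_importance) |>
  summarise(
    n_shipments     = n(),
    completion_rate = 100 * mean(on_time),
    .groups = "drop"
  )

# The path used most often versus the path with the best on-time rate.
most_frequent  <- paths |> slice_max(n_shipments, n = 1, with_ties = FALSE)
most_effective <- paths |> slice_max(completion_rate, n = 1, with_ties = FALSE)

print(most_frequent)
print(most_effective)
```

Grouping on all three columns at once checks the combination directly, since the best option in each column separately need not be the best combination. On real data, paths with very few shipments may need to be filtered out before ranking by rate.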
Rating Analysis
Next, we want to determine which factors contribute the most to the customer rating. We will do a similar analysis to see how the number of customer care calls, the number of prior purchases, gender, and discount offered affect the rating.
Our average customer rating follows. It is very low, and we need to identify where in the data we can improve.
| | Customer_rating | Customer_care_calls | Prior_purchases | Discount_offered | Gender |
|---|---|---|---|---|---|
| Customer_rating | 1.0000 | 0.0122 | 0.0132 | -0.0031 | 0.0028 |
| Customer_care_calls | 0.0122 | 1.0000 | 0.1808 | -0.1308 | 0.0025 |
| Prior_purchases | 0.0132 | 0.1808 | 1.0000 | -0.0828 | -0.0094 |
| Discount_offered | -0.0031 | -0.1308 | -0.0828 | 1.0000 | -0.0118 |
| Gender | 0.0028 | 0.0025 | -0.0094 | -0.0118 | 1.0000 |
From these correlations we can see which variables are most correlated with customer rating. The number of customer care calls (0.0122) and the number of prior purchases (0.0132) have the largest correlations with customer rating, although both are very close to zero. Improving on these factors could help us improve our company’s low rating. Gender (0.0028) and discount offered (-0.0031) are essentially uncorrelated with rating, so we do not need to focus on these factors.
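A correlation matrix like this one can be produced with base R's `cor()`. The sketch below uses toy values, and the 0/1 encoding of `Gender` is an assumption, since the report does not show how gender was converted to a number.

```r
# Toy stand-in for the customer data; values are made up for illustration.
customers <- data.frame(
  Customer_rating     = c(1, 2, 3, 4, 5, 3, 2, 4),
  Customer_care_calls = c(3, 4, 2, 5, 6, 3, 4, 2),
  Prior_purchases     = c(2, 3, 4, 5, 2, 3, 6, 4),
  Discount_offered    = c(10, 5, 44, 2, 7, 60, 3, 8),
  Gender              = c("F", "M", "F", "M", "M", "F", "F", "M")
)

# Assumption: encode Gender as 0/1 so it can enter a Pearson correlation.
numeric_data <- transform(customers, Gender = as.integer(Gender == "M"))

# Pairwise Pearson correlations, rounded to four decimals like the table.
corr <- round(cor(numeric_data), 4)
print(corr)

# Squared correlation with rating: the share of variance each factor explains.
print(round(corr["Customer_rating", ]^2, 4))
```

Squaring a coefficient helps put its size in perspective: a correlation of 0.0132 corresponds to an r² of about 0.00017.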
From this graph we can see some interesting relationships between the number of customer care calls and the rating the company receives. Interestingly, customers with 3 calls are much more likely to rate the company a 1 than customers with any other number of calls. The company should investigate why this is the case and whether there is any real relationship behind it. It appears that at 2, 5, 6, and 7 calls, customers are more likely to rate the company higher than at 3 and 4 calls. Understanding why this is the case and breaking down the internal process may help the company earn a better rating.
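The relationship between calls and ratings can also be read from a cross-tabulation with row shares, which shows the distribution of ratings within each call count. This is a minimal sketch on toy values using base R's `table()` and `prop.table()`.

```r
# Toy stand-in for the customer data; values are made up for illustration.
customers <- data.frame(
  Customer_care_calls = c(2, 2, 3, 3, 3, 4, 4, 5),
  Customer_rating     = c(5, 4, 1, 1, 2, 3, 1, 5)
)

# Counts of each rating for each number of customer care calls.
counts <- table(
  calls  = customers$Customer_care_calls,
  rating = customers$Customer_rating
)

# Row shares: within each call count, the fraction of customers per rating.
shares <- prop.table(counts, margin = 1)
print(round(shares, 2))
```

Using shares instead of raw counts keeps call counts with many customers from dominating the comparison. The same approach applies to `Prior_purchases`.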
Next, we should determine whether prior purchases influence the rating.
Here we can see what kind of rating a customer is likely to leave based on how many items they have purchased in the past. We see that customers with 2 or 5 prior purchases more often than not leave a lower review. This could mean that our customer retention is not as strong after those first few purchases. We want to be especially strong at the beginning of our relationship with a customer so that they continue to buy from us. Working on our process will help with customer satisfaction and increase our rating overall.
Working with these factors, we can see how small changes to the way we care for customers can help improve our rating.