ABSTRACT.

In this small project I’m going to analyse the time period between the order date and the shipped date. today’s lifestyle defined as expensive, which means that consumers don’t like to wait to filled their satisfactions. Resellers are in a race against time, minimize the delivery period as much as possible. My goal is to track after employees performance in order to enhance productivity.

I used the NORTHWIND database for demonstrate the process. The data contain 809 orders records

Data limitations: there might be not logical dates due to setting the database to recent years, but the differentiate are correct and that what I aiming to investigate. According to the dataset info I will assume that delivery time will be between 8 to 30 days.

KeyWords: Time period between Order Date and Shipped Date (TPOSD)

Part 1: Connecting SQL to R studio and import Database

I had to create an ODBC connection in order to send queries to sql. To do that I had to use the RODBC library and to add (defined) a new connection in the “ODBC Data Source Administrator (64-bit)”. * The code been hided to make the page cleaner

After Loading the database I had to make some manipulations on the query from SQL so R will except the columns’ formatting. Here are the outputs visualize in a table.

In the following table I loaded only 15 rows of the dataset which their time period between the Order Date and the Shipped date (TPOSD) is above 8 days just for demonstration. Inside the table I have marked all the orders which their time period between the orders date and day of delivery above 20 marked in yellow and above 27 days marked in red.

Random 15 Orders from 8 days and above
EmployeeID Full_Name OrderDate ShipDate ship_ord_dif Order Status
5 Steven Buchanan 2018-07-04 2018-07-16 12 Status OK
5 Steven Buchanan 2018-07-11 2018-07-23 12 Status OK
4 Margaret Peacock 2018-07-19 2018-07-29 10 Status OK
4 Margaret Peacock 2018-07-19 2018-07-30 11 Status OK
6 Michael Suyama 2018-07-24 2018-08-23 30 Ask emp to mng
2 Andrew Fuller 2018-07-25 2018-08-12 18 Status OK
5 Steven Buchanan 2018-07-31 2018-08-09 9 Status OK
6 Michael Suyama 2018-08-01 2018-08-30 29 Ask emp to mng
6 Michael Suyama 2018-08-06 2018-08-16 10 Status OK
2 Andrew Fuller 2018-08-14 2018-09-12 29 Ask emp to mng
8 Laura Callahan 2018-08-21 2018-08-30 9 Status OK
4 Margaret Peacock 2018-08-23 2018-09-03 11 Status OK
1 Nancy Davolio 2018-08-29 2018-09-11 13 Status OK
2 Andrew Fuller 2018-09-09 2018-09-18 9 Status OK
4 Margaret Peacock 2018-09-10 2018-10-09 29 Ask emp to mng

Just from random rows in the data it is possible to understand that there are few orders which are inside the delivery limitations zone (8 to 32 days).

Part 2: Getting first glance on the data by diagrams

Now I’m going to use all the dataset and demonstrate in a bar histogram chart and scatter plot diagram to get a better view of the situation, to find where are the most Orders are placed based on the time past till their delivery. I grouped the days till delivery as follow: “0-8”, “9-16”, “17-23”, “24+”.

From the bar histogram chart we can understand that most of the Orders are between 0 to 8 days, which is a good sign. But it is also possible to see that the last group (“24+”) little bit higher than the group next to it. This group is the most concern and need to be tracked and optimize so it would be as less as possible.

The following scatter plots chart represent the same data but in a different way. Each dot represent number of orders which they are in the same TPOSD. The more it goes left and up in the chart the better it is. the most top left dot represent a group of more than 100 orders which their TPOSD is around 5 days. the second highest dot placed between 80 - 100 orders and TPOSD around 5 days.

Part 3: Pareto 80 - 20

In this calculation I’m trying to show how many Orders refer to be the core of all the orders in total. In other words what is the amount of orders which their ratio equal to 80% of all the orders, and their TPOSD is less as possible.

The following table present that 665 Orders are evaluate to 82% of all the orders and they rate between 1 to 10 TPOSD.

TPOSD Count orders rolling sum % rolling sum of count orders
1 18 18 2
2 58 76 9
3 68 144 17
4 65 209 25
5 65 274 33
6 88 362 44
7 110 472 58
8 69 541 66
9 73 614 75
10 51 665 82
11 16 681 84
12 17 698 86
13 9 707 87
14 5 712 88
15 7 719 88
16 1 720 88
17 4 724 89
18 6 730 90
19 7 737 91
20 5 742 91
21 10 752 92
22 4 756 93
23 5 761 94
24 3 764 94
25 4 768 94
26 4 772 95
27 2 774 95
28 3 777 96
29 8 785 97
30 4 789 97
31 2 791 97
32 7 798 98
34 4 802 99
35 5 807 99
37 2 809 100