We are delighted to have you continue on in our interview process! Please complete the scenario below. The set of data needed to review is attached separately as an excel document. Feel free to manipulate the document. You will be presenting your solution and findings during your onsite interview with the management team.

Operational Scenario:

*See Excel Document Attached

We have 3 locations staffed with manual fraud reviewers: Atlantic City, Montreal, and Ibiza. Fraud reviewers in this position will evaluate orders for fraudulent activity and decide to cancel or pass the order. We have 13 total reviewers that we can allocate to any location. Each reviewer can look at 100 orders per hour and work 8 hours total per day.

Using the data in the attached spreadsheet please analyze and recommend your staffing recommendations for all 3 locations assuming that this data will remain constant year round. The spreadsheet contains one month of order data with order times and dates. In addition to a staffing recommendation please highlight any other trends that you see in our fraud data. Would you have any further questions on the ask? What would your next steps be? Please be ready to make your recommendation and highlight any trends that you find in a short 5- minute presentation using your preferred method to present (Power Point, Excel, etc…).

The Solution


Building a scheduling solution to maximize reviewers ability to detect fraud requires understanding the patterns in the data, namely 80,000 plus records from the month of August 2016, and consideration of the given constraints.

The data set contains 81,243 records of sales in the USA and Canada. Below is a sample of the first 6 entries of order records from the month of August in 2016. The “Fraudulent” column shows a 1 if the order is fraudulent and a 0 if it is not. I added a country column to make a comparison between the US and Canada. (Because four states were hard to place, they’re neither in the USA or Canada, I have excluded them from the data. Wayfair.ca was not one of the Websites where sales were made).

The Raw Data

OrderNumber OrderDate OrderDayWeek OrderHour Fraudulent Revenue State Country Website
9.12217e+15 2016-08-01 Monday 0 0 317.64 Oregon USA Wayfair
9.12217e+15 2016-08-01 Monday 0 1 24.99 New York USA AllModern
9.12223e+15 2016-08-01 Monday 0 0 3707.00 California USA Wayfair
9.12224e+15 2016-08-01 Monday 0 1 22.98 New York USA AllModern
9.12224e+15 2016-08-01 Monday 0 0 1470.00 New York USA Wayfair
9.12224e+15 2016-08-01 Monday 0 0 592.07 California USA Wayfair

Time Zones

I have made the assumption that the “Order Hour” is the time the customer placed the order in his or her own time zone. Since orders have been placed from all of over the USA and Canada, I added a column for the corrected time, i.e., EST, because the reviewers’ scheduled hours are in EST. In this way, I can get an accurate sense of when ordering is busiest and when fraudulent activity is most prevalent.

Below is the data with the standardized order time.

OrderNumber OrderDate OrderDayWeek OrderHour Fraudulent Revenue State Country Website Zone differenceZone orderEasternTime
9.12217e+15 2016-08-01 Monday 0 0 317.64 Oregon USA Wayfair Pacific Time Zone, Mountain Time Zone -3 21
9.12217e+15 2016-08-01 Monday 0 1 24.99 New York USA AllModern Eastern Time Zone 0 0
9.12223e+15 2016-08-01 Monday 0 0 3707.00 California USA Wayfair Pacific Time Zone -3 21
9.12224e+15 2016-08-01 Monday 0 1 22.98 New York USA AllModern Eastern Time Zone 0 0
9.12224e+15 2016-08-01 Monday 0 0 1470.00 New York USA Wayfair Eastern Time Zone 0 0
9.12224e+15 2016-08-01 Monday 0 0 592.07 California USA Wayfair Pacific Time Zone -3 21

Visualizing the data

Below is how many orders there were each day of the month.

There appears to be a weekly pattern to the data. We can look at the data weekly.

Day of the Week

Monday to Thursday are busier days and fraudulent orders are slightly higher on Sundays, Mondays, Wednesdays, and Thursdays.

Hour of the Day

Starting at about 10 am, we see an increase in orders. This needs to be considered in scheduling the reviewers. However, as we see in the frequency plot, fraud is higher in the very early hours, requiring vigilant reviewers in the corresponding shift.

Shifts

There are three shifts and one period between 9pm - 2am which is not covered. Therefore the shift that starts at 2am, needs to review all orders that took place between those hours. Also, there is overlap between the shifts. To solve the scheduling problem, I use hourly data and calculate the numbers of reviewers needed to sort through the cases to effectively detect fraudulent activity. I then assigned the hours to the shifts. In order to distribute the previewed orders, i.e.m those that came in between 9pm - 2am, I took the total number of orders and divided it by the number of hours in the 2am - 10am shift. I added this number to the number of orders that typically come in at each hour. With this, I was able to calculate the number of reviewers needed.

The data is shown below.

OrderDayWeek orderEasternTime totalOrders fraudOrders percentFraudulentOrders WorkersNeeded
Sunday 0 209 24 0.11 2.09
Sunday 1 117 16 0.14 1.17
Sunday 2 74 23 0.31 0.74
Sunday 3 43 13 0.30 0.43
Sunday 4 48 9 0.19 0.48
Sunday 5 34 13 0.38 0.34
Sunday 6 34 2 0.06 0.34
Sunday 7 93 7 0.08 0.93
Sunday 8 177 9 0.05 1.77
Sunday 9 293 12 0.04 2.93
Sunday 10 393 15 0.04 3.93
Sunday 11 390 22 0.06 3.90
Sunday 12 461 31 0.07 4.61
Sunday 13 450 36 0.08 4.50
Sunday 14 438 21 0.05 4.38
Sunday 15 465 18 0.04 4.65
Sunday 16 476 33 0.07 4.76
Sunday 17 471 19 0.04 4.71
Sunday 18 427 31 0.07 4.27
Sunday 19 456 25 0.05 4.56
Sunday 20 480 19 0.04 4.80
Sunday 21 475 34 0.07 4.75
Sunday 22 428 38 0.09 4.28
Sunday 23 351 42 0.12 3.51
Monday 0 309 48 0.16 3.09
Monday 1 153 17 0.11 1.53
Monday 2 112 19 0.17 1.12
Monday 3 78 25 0.32 0.78
Monday 4 66 12 0.18 0.66
Monday 5 56 8 0.14 0.56
Monday 6 103 10 0.10 1.03
Monday 7 200 16 0.08 2.00
Monday 8 396 23 0.06 3.96
Monday 9 647 18 0.03 6.47
Monday 10 907 67 0.07 9.07
Monday 11 1091 46 0.04 10.91
Monday 12 1188 56 0.05 11.88
Monday 13 1220 50 0.04 12.20
Monday 14 1408 110 0.08 14.08
Monday 15 1366 130 0.10 13.66
Monday 16 1161 59 0.05 11.61
Monday 17 1018 67 0.07 10.18
Monday 18 774 59 0.08 7.74
Monday 19 770 78 0.10 7.70
Monday 20 849 78 0.09 8.49
Monday 21 841 75 0.09 8.41
Monday 22 912 192 0.21 9.12
Monday 23 574 100 0.17 5.74
Tuesday 0 392 51 0.13 3.92
Tuesday 1 222 44 0.20 2.22
Tuesday 2 131 25 0.19 1.31
Tuesday 3 64 11 0.17 0.64
Tuesday 4 52 8 0.15 0.52
Tuesday 5 80 13 0.16 0.80
Tuesday 6 130 13 0.10 1.30
Tuesday 7 227 12 0.05 2.27
Tuesday 8 478 31 0.06 4.78
Tuesday 9 783 27 0.03 7.83
Tuesday 10 1129 56 0.05 11.29
Tuesday 11 1265 46 0.04 12.65
Tuesday 12 1444 64 0.04 14.44
Tuesday 13 1471 55 0.04 14.71
Tuesday 14 1347 49 0.04 13.47
Tuesday 15 1327 61 0.05 13.27
Tuesday 16 1116 92 0.08 11.16
Tuesday 17 954 80 0.08 9.54
Tuesday 18 795 54 0.07 7.95
Tuesday 19 726 47 0.06 7.26
Tuesday 20 871 59 0.07 8.71
Tuesday 21 883 38 0.04 8.83
Tuesday 22 796 39 0.05 7.96
Tuesday 23 577 26 0.05 5.77
Wednesday 0 270 15 0.06 2.70
Wednesday 1 192 32 0.17 1.92
Wednesday 2 120 37 0.31 1.20
Wednesday 3 87 38 0.44 0.87
Wednesday 4 88 33 0.38 0.88
Wednesday 5 87 34 0.39 0.87
Wednesday 6 100 11 0.11 1.00
Wednesday 7 200 12 0.06 2.00
Wednesday 8 375 21 0.06 3.75
Wednesday 9 644 27 0.04 6.44
Wednesday 10 753 28 0.04 7.53
Wednesday 11 825 34 0.04 8.25
Wednesday 12 980 55 0.06 9.80
Wednesday 13 1069 53 0.05 10.69
Wednesday 14 939 39 0.04 9.39
Wednesday 15 922 35 0.04 9.22
Wednesday 16 904 44 0.05 9.04
Wednesday 17 762 48 0.06 7.62
Wednesday 18 639 59 0.09 6.39
Wednesday 19 548 44 0.08 5.48
Wednesday 20 566 29 0.05 5.66
Wednesday 21 617 13 0.02 6.17
Wednesday 22 641 40 0.06 6.41
Wednesday 23 430 16 0.04 4.30
Thursday 0 293 32 0.11 2.93
Thursday 1 151 23 0.15 1.51
Thursday 2 96 21 0.22 0.96
Thursday 3 76 28 0.37 0.76
Thursday 4 57 24 0.42 0.57
Thursday 5 93 34 0.37 0.93
Thursday 6 103 10 0.10 1.03
Thursday 7 241 61 0.25 2.41
Thursday 8 419 61 0.15 4.19
Thursday 9 681 64 0.09 6.81
Thursday 10 797 88 0.11 7.97
Thursday 11 1014 99 0.10 10.14
Thursday 12 1053 89 0.08 10.53
Thursday 13 1041 53 0.05 10.41
Thursday 14 1015 56 0.06 10.15
Thursday 15 972 66 0.07 9.72
Thursday 16 975 109 0.11 9.75
Thursday 17 623 59 0.09 6.23
Thursday 18 469 32 0.07 4.69
Thursday 19 425 33 0.08 4.25
Thursday 20 475 26 0.05 4.75
Thursday 21 525 34 0.06 5.25
Thursday 22 516 48 0.09 5.16
Thursday 23 386 20 0.05 3.86
Friday 0 202 29 0.14 2.02
Friday 1 133 32 0.24 1.33
Friday 2 75 9 0.12 0.75
Friday 3 55 11 0.20 0.55
Friday 4 44 9 0.20 0.44
Friday 5 48 5 0.10 0.48
Friday 6 99 9 0.09 0.99
Friday 7 192 14 0.07 1.92
Friday 8 301 19 0.06 3.01
Friday 9 502 17 0.03 5.02
Friday 10 604 21 0.03 6.04
Friday 11 706 45 0.06 7.06
Friday 12 752 35 0.05 7.52
Friday 13 726 32 0.04 7.26
Friday 14 715 36 0.05 7.15
Friday 15 613 29 0.05 6.13
Friday 16 554 27 0.05 5.54
Friday 17 472 23 0.05 4.72
Friday 18 335 21 0.06 3.35
Friday 19 309 25 0.08 3.09
Friday 20 332 29 0.09 3.32
Friday 21 352 22 0.06 3.52
Friday 22 346 25 0.07 3.46
Friday 23 286 45 0.16 2.86
Saturday 0 162 18 0.11 1.62
Saturday 1 81 8 0.10 0.81
Saturday 2 95 8 0.08 0.95
Saturday 3 28 3 0.11 0.28
Saturday 4 40 3 0.08 0.40
Saturday 5 39 5 0.13 0.39
Saturday 6 47 12 0.26 0.47
Saturday 7 73 4 0.05 0.73
Saturday 8 164 3 0.02 1.64
Saturday 9 266 9 0.03 2.66
Saturday 10 303 10 0.03 3.03
Saturday 11 326 14 0.04 3.26
Saturday 12 394 23 0.06 3.94
Saturday 13 408 30 0.07 4.08
Saturday 14 460 25 0.05 4.60
Saturday 15 413 19 0.05 4.13
Saturday 16 427 36 0.08 4.27
Saturday 17 440 27 0.06 4.40
Saturday 18 349 34 0.10 3.49
Saturday 19 339 34 0.10 3.39
Saturday 20 343 25 0.07 3.43
Saturday 21 333 25 0.08 3.33
Saturday 22 298 28 0.09 2.98
Saturday 23 233 19 0.08 2.33

The Model

Using the above data, I then model a schedule using this data in Excel’s Solver Simplex-LP Add In. I did a schedule for each day of the week. Screenshots are shown below.

  1. Set Up




  1. Solver


3.Solution

The final solution for each of the 7 days in the week is shown below.


The ideal solution that the Solver arrived at requires 17 reviewers, 4 more than are available. Based of the observations and trends outlines in the bar charts, namely the busier days and busier hours, I made modifications to accommodate reviewer availability in each of the three geographic areas. The modified schedule is shown again here so the differences and modifications can be further regarded. In order to test if this schedule accommodates the company’s needs, I would need to build a forecasting model that can account for the intricacies involved with shift planning.



Additional observations

The following bar charts and maps highlight a few other observations that might be important in considering constraints when building schedules.

Orders can be made on various websites and we can see that some of them have a higher percentage of fraudulent activity, particularly in the earlier hours.

Finally, orders from certain states may need to be scrutinized more carefully. There are more fraudulent orders that come from CA, AZ, TX, FL, and NY while AZ has the highest percentage of them. This should put all reviewers on alert.