As a first pass, I load the data, drop the null columns, extra the month and the month name, and do a quick glance of the data. I see that the classes of the data were passed into R as intended, and the distribution of the data seems fairly normal.
orders <- data.table(read.xlsx2(file = "XLS_takehome_NA.xlsx",
sheetName = "Orders",
stringsAsFactors = F,
colClasses = c("Date", "Date", "character", "numeric")))
## clean up data
orders <- orders[, 1:4][order(dateordered)]
orders[, month := month(dateordered)]
orders$monthName <- month.name[ orders$month]
## descriptives
sapply(orders, class)
## dateordered datereturned orderstatus orders month
## "Date" "Date" "character" "numeric" "numeric"
## monthName
## "character"
summary(orders)
## dateordered datereturned orderstatus
## Min. :2016-08-01 Min. :2016-08-25 Length:213
## 1st Qu.:2016-09-08 1st Qu.:2016-11-13 Class :character
## Median :2016-10-19 Median :2016-12-11 Mode :character
## Mean :2016-10-18 Mean :2016-12-05
## 3rd Qu.:2016-11-28 3rd Qu.:2017-01-05
## Max. :2016-12-31 Max. :2017-01-28
## NA's :153
## orders month monthName
## Min. : 1.00 Min. : 8.0 Length:213
## 1st Qu.: 1.00 1st Qu.: 9.0 Class :character
## Median : 4.00 Median :10.0 Mode :character
## Mean : 5.13 Mean :10.1
## 3rd Qu.: 7.00 3rd Qu.:11.0
## Max. :36.00 Max. :12.0
##
At first glance, one may think that the rate of returns is rising– the sheer number of returns has increased over time. However in order to answer the first question, I did an aggregation to get the return rate by month. I did so by calculating the number of orders returned per the number of orders placed by the month in which that order was placed. According to these trends, the return rate has been steadily falling since August.
I believe that the return rate calculated here is indicative of how the company was performing or the quality of the mattress at the the time of sale. The plot, despite being out of order, shows the return rates over time.
returnRate <- orders[, .( orders = sum(orders[orderstatus == "complete"], na.rm = T),
returns = sum(orders[orderstatus == "returned"], na.rm = T),
returnRate = sum(orders[orderstatus == "returned"], na.rm = T)/
sum(orders[orderstatus == "complete"], na.rm = T)),
by = .(monthName, month)]
## clean up
returnRate[, returnRate := round(returnRate,2)]
print(returnRate)
## monthName month orders returns returnRate
## 1: August 8 139 10 0.07
## 2: September 9 134 9 0.07
## 3: October 10 170 10 0.06
## 4: November 11 214 12 0.06
## 5: December 12 376 19 0.05
## plot
plot_ly(data = returnRate,
x = ~monthName,
y = ~returnRate,
mode = 'lines',
colors = "Set1")
## No trace type specified:
## Based on info supplied, a 'bar' trace seems appropriate.
## Read more about this trace type -> https://plot.ly/r/reference/#bar
As a gut check, I wanted to aggregate the data in a different way. I aggregated the data by the date of return, which shows the sheer volume of mattresses being sold and returned by month, regardless of their sale date. While I believed this calculation to be less valuable, I created it to rest assured. Again, we can see that, although the number of returns is rising, the rate is not as telling when ignoring the month in which the order was placed.
returnsByMonth <- orders[, .(returns = sum(orders[orderstatus == "returned"], na.rm = T)),
by = monthName]
ordersByMonth <- orders[, .(orders = sum(orders[orderstatus == "complete"], na.rm = T)),
by = monthName]
## join
setkey(returnsByMonth, monthName)
setkey(ordersByMonth, monthName)
returnsByMonth <- ordersByMonth[returnsByMonth]
## clean up
returnsByMonth[!is.na(monthName), returnRate := returns/orders, by = monthName]
returnsByMonth <- returnsByMonth[c(1, 5, 4, 3, 2)]
print(returnsByMonth)
## monthName orders returns returnRate
## 1: August 139 10 0.07194
## 2: September 134 9 0.06716
## 3: October 170 10 0.05882
## 4: November 214 12 0.05607
## 5: December 376 19 0.05053
## plot
plot_ly(returnsByMonth, x = ~monthName) %>%
add_trace(y = ~orders, name = 'Orders',mode = 'lines') %>%
add_trace(y = ~returns, name = 'Returns', mode = 'lines')
## No trace type specified:
## Based on info supplied, a 'bar' trace seems appropriate.
## Read more about this trace type -> https://plot.ly/r/reference/#bar
## No trace type specified:
## Based on info supplied, a 'bar' trace seems appropriate.
## Read more about this trace type -> https://plot.ly/r/reference/#bar
Some examples of factors highlighted by the first aggregation include:
- Changing (bettering) quality of product over time
- Changing (bettering) customer experience at the point of sale (whether customers are being matched to the right product (mattress) at the point of sale)
Some examples of factors highlighted by the second aggregation include:
- Changes to consumer behavior in reaction to new competitors
- Changes to post-sale responsiveness of customer support
- Other factors not associated with date of point-of-sale
Using a package in R, I created and ran a SQL query to get the number of completed orders by date. If the audience also wanted to see an aggregation by month, I would add either a MONTH or a DATEPART to the query.
casper_orders <- orders
sqldf("SELECT SUM(orders) as totalOrders, dateordered
FROM casper_orders
WHERE orderstatus = 'complete'
GROUP BY dateordered;")
## totalOrders dateordered
## 1 4 2016-08-01
## 2 4 2016-08-02
## 3 8 2016-08-03
## 4 6 2016-08-04
## 5 1 2016-08-05
## 6 1 2016-08-06
## 7 5 2016-08-07
## 8 4 2016-08-08
## 9 9 2016-08-09
## 10 4 2016-08-10
## 11 12 2016-08-11
## 12 8 2016-08-12
## 13 4 2016-08-13
## 14 6 2016-08-14
## 15 3 2016-08-15
## 16 1 2016-08-16
## 17 3 2016-08-17
## 18 5 2016-08-18
## 19 5 2016-08-19
## 20 3 2016-08-20
## 21 3 2016-08-21
## 22 2 2016-08-22
## 23 4 2016-08-23
## 24 6 2016-08-24
## 25 3 2016-08-25
## 26 5 2016-08-26
## 27 3 2016-08-27
## 28 7 2016-08-28
## 29 3 2016-08-29
## 30 3 2016-08-30
## 31 4 2016-08-31
## 32 3 2016-09-01
## 33 6 2016-09-02
## 34 2 2016-09-03
## 35 8 2016-09-04
## 36 10 2016-09-05
## 37 7 2016-09-06
## 38 5 2016-09-07
## 39 9 2016-09-08
## 40 4 2016-09-09
## 41 3 2016-09-10
## 42 1 2016-09-11
## 43 2 2016-09-12
## 44 5 2016-09-13
## 45 2 2016-09-14
## 46 10 2016-09-15
## 47 1 2016-09-16
## 48 6 2016-09-17
## 49 2 2016-09-18
## 50 3 2016-09-19
## 51 4 2016-09-20
## 52 5 2016-09-21
## 53 4 2016-09-22
## 54 2 2016-09-23
## 55 7 2016-09-24
## 56 3 2016-09-25
## 57 3 2016-09-26
## 58 4 2016-09-27
## 59 2 2016-09-28
## 60 6 2016-09-29
## 61 5 2016-09-30
## 62 8 2016-10-01
## 63 2 2016-10-02
## 64 3 2016-10-03
## 65 3 2016-10-04
## 66 6 2016-10-05
## 67 8 2016-10-06
## 68 6 2016-10-07
## 69 4 2016-10-08
## 70 7 2016-10-09
## 71 6 2016-10-10
## 72 7 2016-10-11
## 73 6 2016-10-12
## 74 5 2016-10-13
## 75 5 2016-10-14
## 76 6 2016-10-15
## 77 3 2016-10-16
## 78 6 2016-10-17
## 79 5 2016-10-18
## 80 6 2016-10-19
## 81 8 2016-10-20
## 82 5 2016-10-21
## 83 10 2016-10-22
## 84 7 2016-10-23
## 85 2 2016-10-24
## 86 7 2016-10-25
## 87 7 2016-10-26
## 88 2 2016-10-27
## 89 7 2016-10-28
## 90 1 2016-10-29
## 91 6 2016-10-30
## 92 6 2016-10-31
## 93 7 2016-11-01
## 94 2 2016-11-02
## 95 6 2016-11-03
## 96 8 2016-11-04
## 97 5 2016-11-05
## 98 3 2016-11-07
## 99 2 2016-11-08
## 100 2 2016-11-09
## 101 10 2016-11-10
## 102 9 2016-11-11
## 103 4 2016-11-12
## 104 4 2016-11-13
## 105 7 2016-11-14
## 106 6 2016-11-15
## 107 6 2016-11-16
## 108 11 2016-11-17
## 109 11 2016-11-18
## 110 6 2016-11-19
## 111 8 2016-11-20
## 112 6 2016-11-21
## 113 5 2016-11-22
## 114 5 2016-11-23
## 115 13 2016-11-24
## 116 5 2016-11-25
## 117 5 2016-11-26
## 118 4 2016-11-27
## 119 9 2016-11-28
## 120 9 2016-11-29
## 121 36 2016-11-30
## 122 32 2016-12-01
## 123 15 2016-12-02
## 124 12 2016-12-03
## 125 11 2016-12-04
## 126 11 2016-12-05
## 127 7 2016-12-06
## 128 9 2016-12-07
## 129 5 2016-12-08
## 130 14 2016-12-09
## 131 14 2016-12-10
## 132 9 2016-12-11
## 133 13 2016-12-12
## 134 10 2016-12-13
## 135 15 2016-12-14
## 136 18 2016-12-15
## 137 13 2016-12-16
## 138 8 2016-12-17
## 139 10 2016-12-18
## 140 11 2016-12-19
## 141 13 2016-12-20
## 142 8 2016-12-21
## 143 10 2016-12-22
## 144 12 2016-12-23
## 145 4 2016-12-24
## 146 8 2016-12-25
## 147 12 2016-12-26
## 148 13 2016-12-27
## 149 23 2016-12-28
## 150 14 2016-12-29
## 151 14 2016-12-30
## 152 8 2016-12-31