The goal of this project is to determine profitability by sku.

The picks table will correspond to the data table in Domo which is the product of an ETL the orders_products_picked table and a calculated weighted average cost of inventory table. I can make the ETL run by sending the inventoryFlow table up to Domo. It usually takes around 2min to load into Domo so the pause is for 5min to be safe.

Load Data and Packages

I’ll load packages and source the inventoryFlow script.

## NULL

Now that the weighted average cost of inventory for each product at the time of each warehouse pick is appended to the orders_products_picked table I’ll download all the tables for the project from Domo.

Create Picks Table

Select the columns we want to use from the picks table. Group by orders_id and prodcuts_id. Summerise products_quantity as sum, products_weight by mean and weightedAvgCost by mean.

The weightedAvgCost here should probably be summerised by the weighted average instead of the mean but since the costs should be the same the mean should work.

Create Products Table

Next up is the orders_prodcuts table which will be called productsSel here. The first step will be to calculate the extended_products_price and extended_final_price for each sku.

After that we will replace the products_id of any product that has a products_parent_id with the products_parent_id.

If the sku has a products_parent_id and it has a case amount greater than 0 the products_quantity should be changed to equal the products_quantity * products_case_amount.

Once the new quantities are calculated the products_price and final_price should be changed so the price per unit is correct for the new number of units.

Kit Creation

Since kits are listed as their own sku in the products table but listed as the components of the kit in the picks table we need to transform the products table to take out the kit rows and replace them with the kit contents. In doing so we need to spread the price of the kit based on the costs of the items in the kit.

First we’ll make a list of all the orders that have a kit in them. Then create an empty data frame to hold the kits.

For each orders_id with a kit in it we’ll find the components of the kit in the bundles table. The bundles table can change over time so we’ll make sure we’re matching the date of the order with the date of the bundle unless the order is from a time before we started appending the bundles list by day.

I’ll flag the kit items as 2 and leave the kits flagged as 1 so we can filter out the kits skus later.

All the kits we made get tacked onto the end of the productsSel dataframe.

Full Join of Picks and Products

I’ll join the two tables now by orders_id and products_id.

Subset the Joined Table into Kits and Not Kits

We need to work on the orders where the order includes a kit. Here we will filter the joined list so we have a dataframe with only kits and one without kits. We’ll also get the list of all orders that include a kit to itterate over.

Split Prices by Costs

Here we’ll split the prices of the kits by the items in the kits proportionally to the costs of the items in the kits. We’ll append them onto an empty data frame created by filtering for NA orders_id which don’t exist.

Bind the kits onto the not kits

Dealing with drop ships

Drop-ships apparently have no qty picked because of that we have a price but no cost. This will join on the cost listed in the inventory table for the product for the matching date.

Since this wasn’t being tracked for this purpose before 2/5/19 I’ll use the costs on that date for drop ships which occured in the past.

I’ll bind those onto a data frame which includes only the skus where the qtyPicked != NA.

missingPicks1 <- priceAndCost %>%
        filter(is.na(qtyPicked)) %>%
        left_join(., costs, by = c("products_id", "mdy")) %>%
        filter(products_weight.y != 0) %>%
        mutate(products_quantity = ifelse(is.na(products_quantity), 0, products_quantity),
               products_price = ifelse(is.na(products_price), 0, products_price),
               qtyPicked = products_quantity,
               products_weight = products_weight.y,
               weightedAvgCost = products_cost,
               products_ttl_weight = products_weight * products_quantity,
               products_ttl_cost = weightedAvgCost * products_quantity) %>%
        select(-products_weight.x, -products_weight.y, -products_cost) %>%
        filter(products_quantity != 0 & qtyPicked != 0)

costsDate1 <- costs %>%
        filter(mdy == min(mdy))

missingPicks2 <- priceAndCost %>%
        filter(is.na(qtyPicked),
               mdy < ymd("2019-02-05")) %>%
        left_join(., costsDate1, by = c("products_id")) %>%
        mutate(products_quantity = ifelse(is.na(products_quantity), 0, products_quantity),
               products_price = ifelse(is.na(products_price), 0, products_price),
               qtyPicked = products_quantity,
               products_weight = products_weight.y,
               weightedAvgCost = products_cost,
               products_ttl_weight = products_weight * products_quantity,
               products_ttl_cost = weightedAvgCost * products_quantity) %>%
        select(-products_weight.x, -products_weight.y, -mdy.y, -products_cost) %>%
        rename(mdy = mdy.x) %>%
        filter(products_quantity != 0 & qtyPicked != 0)

orders_product_and_picks <- priceAndCost %>%
        filter(!is.na(qtyPicked)) %>%
        rbind(missingPicks1) %>%
        rbind(missingPicks2) %>%
        mutate(products_price = ifelse(is.na(products_price), 0, products_price),
               final_price = ifelse(is.na(final_price), 0 , final_price)) %>%
        mutate(extended_price = ifelse(final_price > 0,
                                       final_price * products_quantity,
                                       products_price * products_quantity)) %>%
        arrange(orders_id, products_id)

Combining orders with multiple shipments

We’ll group together orders with multiple shipments to calculate the total box cost and total shipping cost. That will get bound onto the totals data set. I’ll also drop the fields from the totals data set that aren’t used or aren’t necessary.

Join the totals to the product level

Define all the functions to split the totals

# Functions to split by weight
ShippingChargeSplit <- function(df) {
        x = numeric(length = length(df$products_ttl_weight))
        for (i in 1:length(x)) {
                x[i] = (df$products_ttl_weight[i]/sum(df$products_ttl_weight)) * df$Shipping[i]
        }
        return(x)
}

BoxCostSplit <- function(df) {
        x = numeric(length = length(df$products_ttl_weight))
        for (i in 1:length(x)) {
                x[i] = (df$products_ttl_weight[i]/sum(df$products_ttl_weight)) * df$box_cost[i]
        }
        return(x)
}

ShippingCostSplit <- function(df) {
        x = numeric(length = length(df$products_ttl_weight))
        for (i in 1:length(x)) {
                x[i] = (df$products_ttl_weight[i]/sum(df$products_ttl_weight)) * df$shipping_cost[i]
        }
        return(x)
}

# Functions to split by extended price
TaxSplit <- function(df) {
        x = numeric(length = length(df$extended_price))
        for (i in 1:length(x)) {
                x[i] = (df$extended_price[i]/sum(df$extended_price)) * df$Tax[i]
        }
        return(x)
}

CouponSplit <- function(df) {
        x = numeric(length = length(df$extended_price))
        for (i in 1:length(x)) {
                x[i] = (df$extended_price[i]/sum(df$extended_price)) * df$Coupon[i]
        }
        return(x)
}

DiscountSplit <- function(df) {
        x = numeric(length = length(df$extended_price))
        for (i in 1:length(x)) {
                x[i] = (df$extended_price[i]/sum(df$extended_price)) * df$Discount[i]
        }
        return(x)
}

Misc_RefundSplit <- function(df) {
        x = numeric(length = length(df$extended_price))
        for (i in 1:length(x)) {
                x[i] = (df$extended_price[i]/sum(df$extended_price)) * df$Misc_Refund[i]
        }
        return(x)
}

AddonSplit <- function(df) {
        x = numeric(length = length(df$extended_price))
        for (i in 1:length(x)) {
                x[i] = (df$extended_price[i]/sum(df$extended_price)) * df$Addon_Charge[i]
        }
        return(x)
}

CharitySplit <- function(df) {
        x = numeric(length = length(df$extended_price))
        for (i in 1:length(x)) {
                x[i] = (df$extended_price[i]/sum(df$extended_price)) * df$Charity[i]
        }
        return(x)
}

Split costs and charges by SKU

# Split the totals by the products in each order.  Shipping charges and costs
# should be split on weight, all other charges should be split by price.
costsAssigned <- mergedProductsAndTotals %>%
        group_by(orders_id) %>%
        nest() %>%
        mutate(ShippingChargeSplit = map(data, ShippingChargeSplit),
               BoxCostSplit = map(data, BoxCostSplit),
               ShippingCostSplit = map(data, ShippingCostSplit),
               TaxSplit = map(data, TaxSplit),
               CouponSplit = map(data, CouponSplit),
               DiscountSplit = map(data, DiscountSplit),
               Misc_RefundSplit = map(data, Misc_RefundSplit),
               AddonSplit = map(data, AddonSplit),
               CharitySplit = map(data, CharitySplit)) %>%
        unnest() %>%
        select(-Shipping, -Tax, -Coupon, -Discount, -Misc_Refund, -Addon_Charge, 
               -Charity, -box_cost, -shipping_cost) %>%
        mutate(ShippingChargeSplit = if_else(is.na(ShippingChargeSplit), 0, as.numeric(round(ShippingChargeSplit, 2))),
               BoxCostSplit = if_else(is.na(BoxCostSplit), 0, as.numeric(round(BoxCostSplit, 2))),
               ShippingCostSplit = if_else(is.na(ShippingCostSplit), 0, as.numeric(round(ShippingCostSplit, 2))),
               TaxSplit = if_else(is.na(TaxSplit), 0, as.numeric(round(TaxSplit, 2))),
               CouponSplit = if_else(is.na(CouponSplit), 0, as.numeric(round(CouponSplit, 2))),
               DiscountSplit = if_else(is.na(DiscountSplit), 0, as.numeric(round(DiscountSplit, 2))),
               Misc_RefundSplit = if_else(is.na(Misc_RefundSplit), 0, as.numeric(round(Misc_RefundSplit, 2))),
               AddonSplit = if_else(is.na(AddonSplit), 0, as.numeric(round(AddonSplit, 2))),
               CharitySplit = if_else(is.na(CharitySplit), 0, as.numeric(round(CharitySplit, 2))),
               revenue = extended_price + ShippingChargeSplit - CouponSplit - DiscountSplit - Misc_RefundSplit + AddonSplit,
               revenue = if_else(is.na(revenue), 0, as.numeric(round(revenue, 2))),
               cost = products_ttl_cost + ShippingCostSplit + BoxCostSplit,
               cost = if_else(is.na(cost), 0, as.numeric(round(cost, 2))))

#sum(costsAssigned$extended_price)
#sum(costsAssigned$final_price * costsAssigned$products_quantity)

Push to the cloud

Here I’ll push the final table up to Domo and also save this r-markdown file as an r-script file.