Homework 1:

Exploratory analysis and essay

Pre-work

Visit the following website and explore the range of sizes of this data set (from 100 to 5 million records): https://excelbianalytics.com/wp/downloads-18-sample-csv-files-data-sets-for-testing-sales/ or (new) https://www.kaggle.com/datasets

1). Select 2 files to download
2). Based on your computer’s capabilities (memory, CPU), select 2 files you can handle (recommended one small, one large)
3). Download the files
4). Review the structure and content of the tables, and think about the data sets (structure, size, dependencies, labels, etc)
5). Consider the similarities and differences in the two data sets you have downloaded
6). Think about how to analyze and predict an outcome based on the data sets available

Based on the data you have, think which two machine learning algorithms presented so far could be used to analyze the data.

#library(knitr)
#install.packages("tinytex")
#tinytex::install_tinytex()

Solution:

I selected the sales data set and used two sizes: small (100 sales records) and large (50,000 sales records). Please advise how to upload large files to GitHub, which appears to have a 25 MB size limit.

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.2     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.3     ✔ tibble    3.2.1
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
df_100_small <- read.csv("https://raw.githubusercontent.com/tponnada/hello-world/master/100%20Sales%20Records.csv")

df_50000_large <- read.csv("https://raw.githubusercontent.com/tponnada/hello-world/master/50000%20Sales%20Records.csv")

Exploratory Analysis

Once files are loaded, we proceed with reviewing the structure and content of the tables and consider the similarities/differences between the two data sets to understand which two machine learning algorithms can be applied in this context.

A visual examination shows that the small and the large data sets have the same 14 variables, in the same order and with the same data types (similarities); the difference comes down to the number of rows in each file. The variable types are accurate for most of the 14 variables. The exceptions are the two date variables (Order.Date and Ship.Date), which are read in as character and need conversion to Date in both data sets. This is done in the data clean-up section below.
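The visual comparison can also be done programmatically. The following is a minimal sketch on two toy data frames (small_df and large_df are hypothetical stand-ins with made-up rows, not the downloaded files); it checks that the column names and column classes agree and reports the row counts, which is where the two files are expected to differ.

```r
# Toy stand-ins for the two files (hypothetical values, not the real data).
small_df <- data.frame(Region     = c("Asia", "Europe"),
                       Order.Date = c("5/28/2010", "2/1/2013"),
                       Units.Sold = c(9925L, 5062L))
large_df <- data.frame(Region     = c("Europe", "Asia", "Europe"),
                       Order.Date = c("8/31/2015", "6/22/2017", "8/12/2010"),
                       Units.Sold = c(3604L, 4848L, 1975L))

# Same column names in the same order?
same_names <- identical(colnames(small_df), colnames(large_df))

# Same class for every column?
small_types <- sapply(small_df, class)
large_types <- sapply(large_df, class)
same_types  <- identical(small_types, large_types)

# Row counts are the expected difference between the two files.
row_counts <- c(small = nrow(small_df), large = nrow(large_df))

same_names
same_types
row_counts
```

The same three checks could be run on df_100_small and df_50000_large once they are loaded.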

head(df_100_small)
##                              Region               Country       Item.Type
## 1             Australia and Oceania                Tuvalu       Baby Food
## 2 Central America and the Caribbean               Grenada          Cereal
## 3                            Europe                Russia Office Supplies
## 4                Sub-Saharan Africa Sao Tome and Principe          Fruits
## 5                Sub-Saharan Africa                Rwanda Office Supplies
## 6             Australia and Oceania       Solomon Islands       Baby Food
##   Sales.Channel Order.Priority Order.Date  Order.ID Ship.Date Units.Sold
## 1       Offline              H  5/28/2010 669165933 6/27/2010       9925
## 2        Online              C  8/22/2012 963881480 9/15/2012       2804
## 3       Offline              L   5/2/2014 341417157  5/8/2014       1779
## 4        Online              C  6/20/2014 514321792  7/5/2014       8102
## 5       Offline              L   2/1/2013 115456712  2/6/2013       5062
## 6        Online              C   2/4/2015 547995746 2/21/2015       2974
##   Unit.Price Unit.Cost Total.Revenue Total.Cost Total.Profit
## 1     255.28    159.42    2533654.00 1582243.50    951410.50
## 2     205.70    117.11     576782.80  328376.44    248406.36
## 3     651.21    524.96    1158502.59  933903.84    224598.75
## 4       9.33      6.92      75591.66   56065.84     19525.82
## 5     651.21    524.96    3296425.02 2657347.52    639077.50
## 6     255.28    159.42     759202.72  474115.08    285087.64
head(df_50000_large)
##               Region   Country Item.Type Sales.Channel Order.Priority
## 1 Sub-Saharan Africa   Namibia Household       Offline              M
## 2             Europe   Iceland Baby Food        Online              H
## 3             Europe    Russia      Meat        Online              L
## 4             Europe  Moldova       Meat        Online              L
## 5             Europe     Malta    Cereal        Online              M
## 6               Asia Indonesia      Meat        Online              H
##   Order.Date  Order.ID  Ship.Date Units.Sold Unit.Price Unit.Cost Total.Revenue
## 1  8/31/2015 897751939 10/12/2015       3604     668.27    502.54     2408445.1
## 2 11/20/2010 599480426   1/9/2011       8435     255.28    159.42     2153286.8
## 3  6/22/2017 538911855  6/25/2017       4848     421.89    364.69     2045322.7
## 4  2/28/2012 459845054  3/20/2012       7225     421.89    364.69     3048155.2
## 5  8/12/2010 626391351  9/13/2010       1975     205.70    117.11      406257.5
## 6  8/20/2010 472974574  8/27/2010       2542     421.89    364.69     1072444.4
##   Total.Cost Total.Profit
## 1  1811154.2     597290.9
## 2  1344707.7     808579.1
## 3  1768017.1     277305.6
## 4  2634885.2     413270.0
## 5   231292.2     174965.2
## 6   927042.0     145402.4
colnames(df_100_small)
##  [1] "Region"         "Country"        "Item.Type"      "Sales.Channel" 
##  [5] "Order.Priority" "Order.Date"     "Order.ID"       "Ship.Date"     
##  [9] "Units.Sold"     "Unit.Price"     "Unit.Cost"      "Total.Revenue" 
## [13] "Total.Cost"     "Total.Profit"
colnames(df_50000_large)
##  [1] "Region"         "Country"        "Item.Type"      "Sales.Channel" 
##  [5] "Order.Priority" "Order.Date"     "Order.ID"       "Ship.Date"     
##  [9] "Units.Sold"     "Unit.Price"     "Unit.Cost"      "Total.Revenue" 
## [13] "Total.Cost"     "Total.Profit"

Data clean-up

df_100_small[['Order Date']] <- as.Date(df_100_small[['Order.Date']], "%m/%d/%Y")
df_100_small[['Ship Date']] <- as.Date(df_100_small[['Ship.Date']], "%m/%d/%Y")

df_50000_large[['Order Date']] <- as.Date(df_50000_large[['Order.Date']], "%m/%d/%Y")
df_50000_large[['Ship Date']] <- as.Date(df_50000_large[['Ship.Date']], "%m/%d/%Y")
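To illustrate what the conversion does, here is a small sketch on two order/ship date pairs taken from the head() output above. The variable names (order_chr, ship_chr, days_to_ship) are introduced only for this example. Once the strings are Date objects, date arithmetic such as the number of days between order and shipment becomes possible.

```r
# Date strings in the month/day/year layout used by the sales files.
order_chr <- c("5/28/2010", "2/1/2013")
ship_chr  <- c("6/27/2010", "2/6/2013")

# "%m/%d/%Y" = month / day / four-digit year.
order_date <- as.Date(order_chr, format = "%m/%d/%Y")
ship_date  <- as.Date(ship_chr,  format = "%m/%d/%Y")

class(order_date)

# A string that does not match the format becomes NA, so this is a quick sanity check.
anyNA(order_date)

# Subtracting two Date vectors gives the difference in days.
days_to_ship <- as.numeric(ship_date - order_date)
days_to_ship
```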

Data summary and visualization

Both the small and the large data sets cover a little over seven years of sales history (order dates from 2010 to 2017), broken down by region and country and, within each country, by item type and sales channel.

summary(df_100_small)
##     Region            Country           Item.Type         Sales.Channel     
##  Length:100         Length:100         Length:100         Length:100        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##  Order.Priority      Order.Date           Order.ID          Ship.Date        
##  Length:100         Length:100         Min.   :114606559   Length:100        
##  Class :character   Class :character   1st Qu.:338922488   Class :character  
##  Mode  :character   Mode  :character   Median :557708561   Mode  :character  
##                                        Mean   :555020412                     
##                                        3rd Qu.:790755081                     
##                                        Max.   :994022214                     
##    Units.Sold     Unit.Price       Unit.Cost      Total.Revenue    
##  Min.   : 124   Min.   :  9.33   Min.   :  6.92   Min.   :   4870  
##  1st Qu.:2836   1st Qu.: 81.73   1st Qu.: 35.84   1st Qu.: 268721  
##  Median :5382   Median :179.88   Median :107.28   Median : 752314  
##  Mean   :5129   Mean   :276.76   Mean   :191.05   Mean   :1373488  
##  3rd Qu.:7369   3rd Qu.:437.20   3rd Qu.:263.33   3rd Qu.:2212045  
##  Max.   :9925   Max.   :668.27   Max.   :524.96   Max.   :5997055  
##    Total.Cost       Total.Profit       Order Date           Ship Date         
##  Min.   :   3612   Min.   :   1258   Min.   :2010-02-02   Min.   :2010-02-25  
##  1st Qu.: 168868   1st Qu.: 121444   1st Qu.:2012-02-14   1st Qu.:2012-02-24  
##  Median : 363566   Median : 290768   Median :2013-07-12   Median :2013-08-11  
##  Mean   : 931806   Mean   : 441682   Mean   :2013-09-16   Mean   :2013-10-09  
##  3rd Qu.:1613870   3rd Qu.: 635829   3rd Qu.:2015-04-07   3rd Qu.:2015-04-28  
##  Max.   :4509794   Max.   :1719922   Max.   :2017-05-22   Max.   :2017-06-17
summary(df_50000_large)
##     Region            Country           Item.Type         Sales.Channel     
##  Length:50000       Length:50000       Length:50000       Length:50000      
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##  Order.Priority      Order.Date           Order.ID          Ship.Date        
##  Length:50000       Length:50000       Min.   :100013196   Length:50000      
##  Class :character   Class :character   1st Qu.:324007046   Class :character  
##  Mode  :character   Mode  :character   Median :550422394   Mode  :character  
##                                        Mean   :549733027                     
##                                        3rd Qu.:776782381                     
##                                        Max.   :999999463                     
##    Units.Sold      Unit.Price       Unit.Cost      Total.Revenue    
##  Min.   :    1   Min.   :  9.33   Min.   :  6.92   Min.   :     28  
##  1st Qu.: 2498   1st Qu.: 81.73   1st Qu.: 35.84   1st Qu.: 276487  
##  Median : 5018   Median :154.06   Median : 97.44   Median : 781325  
##  Mean   : 5000   Mean   :265.65   Mean   :187.32   Mean   :1323716  
##  3rd Qu.: 7493   3rd Qu.:421.89   3rd Qu.:263.33   3rd Qu.:1808642  
##  Max.   :10000   Max.   :668.27   Max.   :524.96   Max.   :6682032  
##    Total.Cost       Total.Profit         Order Date        
##  Min.   :     21   Min.   :      7.2   Min.   :2010-01-01  
##  1st Qu.: 160637   1st Qu.:  94150.9   1st Qu.:2011-11-15  
##  Median : 467104   Median : 279536.4   Median :2013-10-09  
##  Mean   : 933157   Mean   : 390558.7   Mean   :2013-10-11  
##  3rd Qu.:1190390   3rd Qu.: 564286.7   3rd Qu.:2015-09-04  
##  Max.   :5249075   Max.   :1738178.4   Max.   :2017-07-28  
##    Ship Date         
##  Min.   :2010-01-02  
##  1st Qu.:2011-12-11  
##  Median :2013-11-02  
##  Mean   :2013-11-05  
##  3rd Qu.:2015-09-30  
##  Max.   :2017-09-16
hist(df_100_small$`Total.Profit`)

hist(df_50000_large$`Total.Profit`)

Machine-learning algorithms (Algorithm 1: Multiple Linear regression)

Since the outcome of interest in the sales data set is numeric, I decided to use linear regression, a supervised machine learning approach that generates a numeric prediction. Here, for example, the question we pose could be: what is the profit, given units sold, total cost, region and item type? To answer it, we use a multiple regression model with units sold, total cost, region and item type as the independent variables and total profit as the dependent variable.

The model we obtain for the smaller dataset of 100 records is

Total Profit = 4.588e+04 + (4.290e+01 * Units.Sold) + (2.575e-01 * Total.Cost) + (4.477e+04 * RegionAustralia and Oceania) + (-4.772e+04 * RegionCentral America and the Caribbean) + (1.701e+04 * RegionEurope) + (1.190e+05 * RegionMiddle East and North Africa) + (-3.704e+04 * RegionNorth America) + (3.986e+03 * RegionSub-Saharan Africa) + (-3.080e+05 * Item.TypeBeverages) + (-7.325e+03 * Item.TypeCereal) + (4.769e+04 * Item.TypeClothes) + (3.265e+05 * Item.TypeCosmetics) + (-3.035e+05 * Item.TypeFruits) + (-7.427e+04 * Item.TypeHousehold) + (-4.952e+05 * Item.TypeMeat) + (-2.683e+05 * Item.TypeOffice Supplies) + (-1.967e+05 * Item.TypePersonal Care) + (-9.106e+04 * Item.TypeSnacks) + (-6.136e+04 * Item.TypeVegetables)

For a fictional order of 200 units of Clothes in the Australia and Oceania region, with 100 entered for Total.Cost, the model formula gives a profit of $146,946 for model 1.

The model we obtain for the larger dataset of 50,000 records is

Total Profit = 1.161e+05 + (3.764e+01 * Units.Sold) + (2.163e-01 * Total.Cost) + (4.219e+03 * RegionAustralia and Oceania) + (1.636e+03 * RegionCentral America and the Caribbean) + (1.178e+03 * RegionEurope) + (3.432e+03 * RegionMiddle East and North Africa) + (-1.296e+03 * RegionNorth America) + (2.888e+03 * RegionSub-Saharan Africa) + (-2.628e+05 * Item.TypeBeverages) + (9.335e+03 * Item.TypeCereal) + (2.177e+04 * Item.TypeClothes) + (2.779e+05 * Item.TypeCosmetics) + (-3.025e+05 * Item.TypeFruits) + (-2.196e+04 * Item.TypeHousehold) + (-4.123e+05 * Item.TypeMeat) + (-2.425e+05 * Item.TypeOffice Supplies) + (-2.449e+05 * Item.TypePersonal Care) + (-1.360e+05 * Item.TypeSnacks) + (-8.900e+04 * Item.TypeVegetables)

For a fictional order of 200 units of Clothes in the Australia and Oceania region, with 100 entered for Total.Cost, the model formula gives a profit of $149,639 for model 2, close to the model 1 value but not the same. Both models fit well: R^2 is 0.9615 (adjusted 0.9523) for model 1 and 0.9233 (adjusted 0.9232) for model 2.
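The two quoted predictions can be reproduced by hand. The sketch below is only an arithmetic check, not the author's code: predict_profit() is a hypothetical helper, the coefficients are the rounded values printed by summary() for each model, and it assumes that 100 is the value plugged in for Total.Cost, which is what reproduces the quoted figures. The baseline levels (Region = Asia, Item.Type = Baby Food) have no dummy term and are absorbed in the intercept.

```r
# Hypothetical helper: intercept + numeric terms + the two active dummy terms.
predict_profit <- function(intercept, b_units, b_cost, b_region, b_item,
                           units_sold, total_cost) {
  intercept + b_units * units_sold + b_cost * total_cost + b_region + b_item
}

# Model 1 (100 records): Australia and Oceania, Clothes.
pred_small <- predict_profit(4.588e+04, 4.290e+01, 2.575e-01,
                             4.477e+04, 4.769e+04,
                             units_sold = 200, total_cost = 100)

# Model 2 (50,000 records): same fictional order.
pred_large <- predict_profit(1.161e+05, 3.764e+01, 2.163e-01,
                             4.219e+03, 2.177e+04,
                             units_sold = 200, total_cost = 100)

round(c(model1 = pred_small, model2 = pred_large))
# model1 model2
# 146946 149639
```

Passing total_cost = 200 * 100 instead would give the prediction for a per-unit cost of 100. With the fitted objects available, predict() with a one-row newdata data frame should give the same answers up to coefficient rounding.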

sales_mod1 <- lm(data = df_100_small, Total.Profit ~ Units.Sold + Total.Cost + Region + Item.Type)

summary(sales_mod1)
## 
## Call:
## lm(formula = Total.Profit ~ Units.Sold + Total.Cost + Region + 
##     Item.Type, data = df_100_small)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -274707  -48314   -1709   59132  192626 
## 
## Coefficients:
##                                           Estimate Std. Error t value Pr(>|t|)
## (Intercept)                              4.588e+04  5.681e+04   0.808  0.42175
## Units.Sold                               4.290e+01  5.468e+00   7.846 1.62e-11
## Total.Cost                               2.575e-01  2.162e-02  11.910  < 2e-16
## RegionAustralia and Oceania              4.477e+04  4.457e+04   1.004  0.31818
## RegionCentral America and the Caribbean -4.772e+04  4.882e+04  -0.978  0.33124
## RegionEurope                             1.701e+04  3.802e+04   0.447  0.65582
## RegionMiddle East and North Africa       1.190e+05  4.464e+04   2.666  0.00928
## RegionNorth America                     -3.704e+04  6.653e+04  -0.557  0.57925
## RegionSub-Saharan Africa                 3.986e+03  3.478e+04   0.115  0.90904
## Item.TypeBeverages                      -3.080e+05  5.418e+04  -5.684 2.06e-07
## Item.TypeCereal                         -7.325e+03  5.495e+04  -0.133  0.89429
## Item.TypeClothes                         4.769e+04  4.912e+04   0.971  0.33460
## Item.TypeCosmetics                       3.265e+05  4.818e+04   6.776 1.90e-09
## Item.TypeFruits                         -3.035e+05  5.291e+04  -5.736 1.66e-07
## Item.TypeHousehold                      -7.427e+04  6.288e+04  -1.181  0.24104
## Item.TypeMeat                           -4.952e+05  8.266e+04  -5.991 5.68e-08
## Item.TypeOffice Supplies                -2.683e+05  5.751e+04  -4.665 1.22e-05
## Item.TypePersonal Care                  -1.967e+05  5.241e+04  -3.753  0.00033
## Item.TypeSnacks                         -9.106e+04  7.023e+04  -1.297  0.19848
## Item.TypeVegetables                     -6.136e+04  5.752e+04  -1.067  0.28931
##                                            
## (Intercept)                                
## Units.Sold                              ***
## Total.Cost                              ***
## RegionAustralia and Oceania                
## RegionCentral America and the Caribbean    
## RegionEurope                               
## RegionMiddle East and North Africa      ** 
## RegionNorth America                        
## RegionSub-Saharan Africa                   
## Item.TypeBeverages                      ***
## Item.TypeCereal                            
## Item.TypeClothes                           
## Item.TypeCosmetics                      ***
## Item.TypeFruits                         ***
## Item.TypeHousehold                         
## Item.TypeMeat                           ***
## Item.TypeOffice Supplies                ***
## Item.TypePersonal Care                  ***
## Item.TypeSnacks                            
## Item.TypeVegetables                        
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 95730 on 80 degrees of freedom
## Multiple R-squared:  0.9615, Adjusted R-squared:  0.9523 
## F-statistic: 105.1 on 19 and 80 DF,  p-value: < 2.2e-16
sales_mod2 <- lm(data = df_50000_large, Total.Profit ~ Units.Sold + Total.Cost + Region + Item.Type)

summary(sales_mod2)
## 
## Call:
## lm(formula = Total.Profit ~ Units.Sold + Total.Cost + Region + 
##     Item.Type, data = df_50000_large)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -397586  -59147     -83   58771  397660 
## 
## Coefficients:
##                                           Estimate Std. Error  t value Pr(>|t|)
## (Intercept)                              1.161e+05  2.152e+03   53.962  < 2e-16
## Units.Sold                               3.764e+01  2.379e-01  158.240  < 2e-16
## Total.Cost                               2.163e-01  9.240e-04  234.117  < 2e-16
## RegionAustralia and Oceania              4.219e+03  2.054e+03    2.054   0.0400
## RegionCentral America and the Caribbean  1.636e+03  1.871e+03    0.874   0.3819
## RegionEurope                             1.178e+03  1.531e+03    0.770   0.4415
## RegionMiddle East and North Africa       3.432e+03  1.811e+03    1.895   0.0581
## RegionNorth America                     -1.296e+03  3.385e+03   -0.383   0.7020
## RegionSub-Saharan Africa                 2.888e+03  1.525e+03    1.894   0.0583
## Item.TypeBeverages                      -2.628e+05  2.380e+03 -110.453  < 2e-16
## Item.TypeCereal                          9.335e+03  2.317e+03    4.029 5.61e-05
## Item.TypeClothes                         2.177e+04  2.376e+03    9.162  < 2e-16
## Item.TypeCosmetics                       2.779e+05  2.351e+03  118.177  < 2e-16
## Item.TypeFruits                         -3.025e+05  2.405e+03 -125.774  < 2e-16
## Item.TypeHousehold                      -2.196e+04  2.794e+03   -7.859 3.95e-15
## Item.TypeMeat                           -4.123e+05  2.483e+03 -166.027  < 2e-16
## Item.TypeOffice Supplies                -2.425e+05  2.858e+03  -84.853  < 2e-16
## Item.TypePersonal Care                  -2.449e+05  2.354e+03 -104.026  < 2e-16
## Item.TypeSnacks                         -1.360e+05  2.324e+03  -58.523  < 2e-16
## Item.TypeVegetables                     -8.900e+04  2.324e+03  -38.299  < 2e-16
##                                            
## (Intercept)                             ***
## Units.Sold                              ***
## Total.Cost                              ***
## RegionAustralia and Oceania             *  
## RegionCentral America and the Caribbean    
## RegionEurope                               
## RegionMiddle East and North Africa      .  
## RegionNorth America                        
## RegionSub-Saharan Africa                .  
## Item.TypeBeverages                      ***
## Item.TypeCereal                         ***
## Item.TypeClothes                        ***
## Item.TypeCosmetics                      ***
## Item.TypeFruits                         ***
## Item.TypeHousehold                      ***
## Item.TypeMeat                           ***
## Item.TypeOffice Supplies                ***
## Item.TypePersonal Care                  ***
## Item.TypeSnacks                         ***
## Item.TypeVegetables                     ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 104700 on 49980 degrees of freedom
## Multiple R-squared:  0.9233, Adjusted R-squared:  0.9232 
## F-statistic: 3.165e+04 on 19 and 49980 DF,  p-value: < 2.2e-16

Machine-learning algorithms (Algorithm 2: Binomial Logistic regression)

The second machine learning algorithm I decided to use was logistic regression. Instead of modeling the response variable directly, as in linear regression, logistic regression models the probability of a particular response value. As determined earlier, there are 14 variables: 5 categorical variables (Region, Country, Item Type, Sales Channel, Order Priority), 2 dates (Order Date, Ship Date) and 7 numeric variables (Order ID, which is an identifier rather than a measurement, plus Units Sold, Unit Price, Unit Cost, Total Revenue, Total Cost and Total Profit). The dataset is relatively clean with no missing numeric values. Here, I use Sales.Channel as the dependent variable.

Splitting the Data

Using the base R sample() function that we introduced in Chapter 3, we partition each data set into training and test sets using a 75 percent to 25 percent split. We call the new data sets df_100_small_train and df_100_small_test, and df_50000_large_train and df_50000_large_test, respectively. The class proportions printed further down show that Offline and Online stay close to 50 percent each across the full, training and test sets. The balance is nearly exact for the large data set, while the small test set drifts to 56/44 because it holds only 25 records. Neither data set has a class imbalance problem.

Building the logistic regression model

To train a binomial logistic regression model using the glm() function, we pass three main arguments to it. The first argument (data) is the training data (df_100_small_train for the small data set). The second argument (family) is the type of regression model we intend to build. We set it to binomial, which tells glm() that we intend to build a binomial logistic regression model using the logit link function. Instead of family = binomial, we could also write family = binomial(link = "logit"). The last argument is the formula for the prediction problem, where we specify which features (predictors) to use to predict the class (response). For our model, the formula newcode ~ . tells glm() to use every other column in the training set (.) to predict newcode, the 0/1 recoding of Sales.Channel (Offline = 0, Online = 1).
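One caveat with the . shorthand is that it pulls in every remaining column, including the one the response was derived from; the summary output further down lists Sales.ChannelOnline among the predictors. A minimal sketch of the alternative, naming the predictors explicitly, is shown here on simulated data. The data frame toy, its values, and the choice of Units.Sold and Order.Priority as predictors are assumptions made only for illustration, not the author's model.

```r
set.seed(42)
n <- 200

# Simulated stand-in for the sales data (hypothetical values).
toy <- data.frame(
  Sales.Channel  = sample(c("Offline", "Online"), n, replace = TRUE),
  Units.Sold     = sample(1:10000, n, replace = TRUE),
  Order.Priority = sample(c("C", "H", "L", "M"), n, replace = TRUE)
)

# 0/1 response derived from Sales.Channel, as in the report.
toy$newcode <- as.numeric(toy$Sales.Channel == "Online")

# Name the predictors explicitly so Sales.Channel itself stays out of the model.
toy_mod <- glm(newcode ~ Units.Sold + Order.Priority,
               data = toy, family = binomial)

# The coefficient names confirm which terms entered the model.
names(coef(toy_mod))
```

Because the simulated response is random, the fitted coefficients carry no meaning; the point is only that the response column does not appear among the predictors.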

Results of the model

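In linear regression, we interpreted the model coefficients as the average change in the value of the response as a result of a unit change in a particular predictor. In logistic regression, we instead interpret a coefficient as the change in the log-odds of the response for a unit change in the predictor. For a dummy variable such as RegionAustralia and Oceania, a unit change means belonging to that region rather than to the baseline region (Asia). For example, a coefficient of -4.623e-06 for RegionAustralia and Oceania means that the log-odds of Sales.Channel being Online (newcode = 1) are lower by 4.623e-06 for that region than for the baseline. In the output below, however, every estimate shown has a standard error in the hundreds of thousands or millions and a p-value of 1, so none of these coefficients is distinguishable from zero. The likely cause is that the formula newcode ~ . keeps Sales.Channel itself, along with Country and the character date columns, as predictors: Sales.ChannelOnline appears in the coefficient table and determines newcode exactly.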
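Log-odds are easier to read after converting them to an odds ratio or a probability. The sketch below uses a made-up coefficient (beta = 0.7 is hypothetical and not taken from the model output) to show the two standard conversions: exp() for the odds ratio and plogis(), the inverse logit, for the probability.

```r
# Hypothetical coefficient, chosen only for illustration.
beta <- 0.7

# exp(beta) is the odds ratio: the factor by which the odds are multiplied
# when the predictor increases by one unit.
odds_ratio <- exp(beta)

# plogis() is the inverse logit: it maps log-odds to a probability.
p_before <- plogis(0)         # log-odds of 0 correspond to a probability of 0.5
p_after  <- plogis(0 + beta)  # probability after adding beta to the log-odds

# Odds after the change; starting from odds of 1, this equals the odds ratio.
odds_after <- p_after / (1 - p_after)

c(odds_ratio = odds_ratio, p_before = p_before, p_after = p_after)
```

Starting from even odds, a positive coefficient pushes the probability above 0.5 and a negative one pushes it below.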

df_100_small <- df_100_small %>%
  mutate(newcode = case_when(
    Sales.Channel == 'Offline' ~ '0',
    Sales.Channel == 'Online' ~ '1'))

df_100_small$newcode <- as.numeric(df_100_small$newcode)

df_100_small %>%
  keep(is.numeric) %>%
  summary()
##     Order.ID           Units.Sold     Unit.Price       Unit.Cost     
##  Min.   :114606559   Min.   : 124   Min.   :  9.33   Min.   :  6.92  
##  1st Qu.:338922488   1st Qu.:2836   1st Qu.: 81.73   1st Qu.: 35.84  
##  Median :557708561   Median :5382   Median :179.88   Median :107.28  
##  Mean   :555020412   Mean   :5129   Mean   :276.76   Mean   :191.05  
##  3rd Qu.:790755081   3rd Qu.:7369   3rd Qu.:437.20   3rd Qu.:263.33  
##  Max.   :994022214   Max.   :9925   Max.   :668.27   Max.   :524.96  
##  Total.Revenue       Total.Cost       Total.Profit        newcode   
##  Min.   :   4870   Min.   :   3612   Min.   :   1258   Min.   :0.0  
##  1st Qu.: 268721   1st Qu.: 168868   1st Qu.: 121444   1st Qu.:0.0  
##  Median : 752314   Median : 363566   Median : 290768   Median :0.5  
##  Mean   :1373488   Mean   : 931806   Mean   : 441682   Mean   :0.5  
##  3rd Qu.:2212045   3rd Qu.:1613870   3rd Qu.: 635829   3rd Qu.:1.0  
##  Max.   :5997055   Max.   :4509794   Max.   :1719922   Max.   :1.0
df_50000_large <- df_50000_large %>%
  mutate(newcode = case_when(
    Sales.Channel == 'Offline' ~ '0',
    Sales.Channel == 'Online' ~ '1'))

df_50000_large$newcode <- as.numeric(df_50000_large$newcode)

df_50000_large %>%
  keep(is.numeric) %>%
  summary()
##     Order.ID           Units.Sold      Unit.Price       Unit.Cost     
##  Min.   :100013196   Min.   :    1   Min.   :  9.33   Min.   :  6.92  
##  1st Qu.:324007046   1st Qu.: 2498   1st Qu.: 81.73   1st Qu.: 35.84  
##  Median :550422394   Median : 5018   Median :154.06   Median : 97.44  
##  Mean   :549733027   Mean   : 5000   Mean   :265.65   Mean   :187.32  
##  3rd Qu.:776782381   3rd Qu.: 7493   3rd Qu.:421.89   3rd Qu.:263.33  
##  Max.   :999999463   Max.   :10000   Max.   :668.27   Max.   :524.96  
##  Total.Revenue       Total.Cost       Total.Profit          newcode      
##  Min.   :     28   Min.   :     21   Min.   :      7.2   Min.   :0.0000  
##  1st Qu.: 276487   1st Qu.: 160637   1st Qu.:  94150.9   1st Qu.:0.0000  
##  Median : 781325   Median : 467104   Median : 279536.4   Median :1.0000  
##  Mean   :1323716   Mean   : 933157   Mean   : 390558.7   Mean   :0.5007  
##  3rd Qu.:1808642   3rd Qu.:1190390   3rd Qu.: 564286.7   3rd Qu.:1.0000  
##  Max.   :6682032   Max.   :5249075   Max.   :1738178.4   Max.   :1.0000
set.seed(1234)

sample_set_small <- sample(nrow(df_100_small), round(nrow(df_100_small)*.75), replace = FALSE)
df_100_small_train <- df_100_small[sample_set_small, ]
df_100_small_test <- df_100_small[-sample_set_small, ]

round(prop.table(table(select(df_100_small, Sales.Channel), exclude = NULL)), 4) * 100
## Sales.Channel
## Offline  Online 
##      50      50
round(prop.table(table(select(df_100_small_train, Sales.Channel), exclude = NULL)), 4) * 100
## Sales.Channel
## Offline  Online 
##      48      52
round(prop.table(table(select(df_100_small_test, Sales.Channel), exclude = NULL)), 4) * 100
## Sales.Channel
## Offline  Online 
##      56      44
set.seed(1234)

sample_set_large <- sample(nrow(df_50000_large), round(nrow(df_50000_large)*.75), replace = FALSE)
df_50000_large_train <- df_50000_large[sample_set_large, ]
df_50000_large_test <- df_50000_large[-sample_set_large, ]

round(prop.table(table(select(df_50000_large, Sales.Channel), exclude = NULL)), 4) * 100
## Sales.Channel
## Offline  Online 
##   49.93   50.07
round(prop.table(table(select(df_50000_large_train, Sales.Channel), exclude = NULL)), 4) * 100
## Sales.Channel
## Offline  Online 
##   49.89   50.11
round(prop.table(table(select(df_50000_large_test, Sales.Channel), exclude = NULL)), 4) * 100
## Sales.Channel
## Offline  Online 
##   50.07   49.93
df_100_small_mod <- glm(data = df_100_small_train, family = binomial, formula = newcode ~ .)
summary(df_100_small_mod)
## 
## Call:
## glm(formula = newcode ~ ., family = binomial, data = df_100_small_train)
## 
## Coefficients: (162 not defined because of singularities)
##                                           Estimate Std. Error z value Pr(>|z|)
## (Intercept)                             -2.657e+01  1.468e+06       0        1
## RegionAustralia and Oceania             -4.623e-06  1.424e+06       0        1
## RegionCentral America and the Caribbean -3.941e-13  8.723e+05       0        1
## RegionEurope                            -1.963e-07  2.195e+06       0        1
## RegionMiddle East and North Africa      -3.856e-07  8.723e+05       0        1
## RegionNorth America                      4.618e-06  2.518e+06       0        1
## RegionSub-Saharan Africa                -4.410e-06  1.007e+06       0        1
## CountryAngola                            4.410e-06  8.723e+05       0        1
## CountryAustralia                        -4.619e-06  3.901e+06       0        1
## CountryAustria                           2.014e-07  1.424e+06       0        1
## CountryAzerbaijan                        3.903e-07  8.723e+05       0        1
## CountryBulgaria                         -2.765e-09  2.195e+06       0        1
## CountryBurkina Faso                     -4.823e-06  2.937e+06       0        1
## CountryCameroon                          4.208e-06  1.126e+06       0        1
## CountryCape Verde                        4.412e-06  1.126e+06       0        1
## CountryComoros                          -2.124e-07  1.816e+06       0        1
## CountryCote d'Ivoire                     9.030e-06  1.816e+06       0        1
## CountryDemocratic Republic of the Congo -4.116e-07  4.643e+06       0        1
## CountryDjibouti                          4.415e-06  7.122e+05       0        1
## CountryEast Timor                        7.809e-09  1.007e+06       0        1
## CountryFederated States of Micronesia   -2.050e-07  3.341e+06       0        1
## CountryFiji                              9.243e-06  4.419e+06       0        1
## CountryFrance                           -8.001e-10  2.467e+06       0        1
## CountryGabon                             9.029e-06  1.884e+06       0        1
## CountryGrenada                          -2.003e-07  1.332e+06       0        1
## CountryHaiti                             5.101e-09  8.723e+05       0        1
## CountryHonduras                                 NA         NA      NA       NA
## CountryIceland                           2.067e-07  8.723e+05       0        1
## CountryKenya                            -4.075e-07  3.264e+06       0        1
## CountryKiribati                          4.224e-06  1.745e+06       0        1
## CountryKuwait                           -1.201e-08  7.122e+05       0        1
## CountryLaos                             -4.432e-09  1.511e+06       0        1
## CountryLesotho                           4.227e-06  1.332e+06       0        1
## CountryLibya                            -4.417e-06  5.036e+05       0        1
## CountryMacedonia                         4.816e-06  1.511e+06       0        1
## CountryMadagascar                        4.412e-06  1.126e+06       0        1
## CountryMali                              4.223e-06  1.234e+06       0        1
## CountryMauritania                        9.024e-06  2.937e+06       0        1
## CountryMexico                                   NA         NA      NA       NA
## CountryMoldova                           4.610e-06  1.424e+06       0        1
## CountryMonaco                           -4.426e-06  3.225e+06       0        1
## CountryMyanmar                          -1.995e-07  8.723e+05       0        1
## CountryNiger                             8.834e-06  1.007e+06       0        1
## CountryNorway                           -4.427e-06  4.671e+06       0        1
## CountryPortugal                         -9.045e-06  4.419e+06       0        1
## CountryRepublic of the Congo             9.029e-06  2.195e+06       0        1
## CountryRomania                          -4.418e-06  2.467e+06       0        1
## CountryRussia                            1.919e-07  1.424e+06       0        1
## CountryRwanda                            4.406e-06  1.007e+06       0        1
## CountrySamoa                             4.423e-06  1.332e+06       0        1
## CountrySan Marino                       -4.424e-06  4.671e+06       0        1
## CountrySao Tome and Principe             8.635e-06  2.137e+06       0        1
## CountrySenegal                          -4.185e-07  2.980e+06       0        1
## CountrySierra Leone                      4.406e-06  1.007e+06       0        1
## CountrySlovenia                         -4.427e-06  4.698e+06       0        1
## CountrySolomon Islands                   4.617e-06  1.234e+06       0        1
## CountrySouth Sudan                       1.365e-05  4.214e+06       0        1
## CountrySpain                             1.963e-07  2.195e+06       0        1
## CountrySri Lanka                         5.101e-09  1.332e+06       0        1
## CountrySwitzerland                       2.014e-07  1.234e+06       0        1
## CountrySyria                                    NA         NA      NA       NA
## CountryThe Gambia                       -2.124e-07  1.745e+06       0        1
## CountryTurkmenistan                             NA         NA      NA       NA
## CountryTuvalu                                   NA         NA      NA       NA
## CountryZambia                                   NA         NA      NA       NA
## Item.TypeBeverages                       4.619e-06  3.868e+06       0        1
## Item.TypeCereal                         -8.763e-14  7.122e+05       0        1
## Item.TypeClothes                        -4.624e-06  2.252e+06       0        1
## Item.TypeCosmetics                      -4.628e-06  2.137e+06       0        1
## Item.TypeFruits                         -4.434e-06  2.617e+06       0        1
## Item.TypeHousehold                      -4.623e-06  1.424e+06       0        1
## Item.TypeMeat                           -4.623e-06  1.511e+06       0        1
## Item.TypeOffice Supplies                -4.618e-06  2.195e+06       0        1
## Item.TypePersonal Care                  -9.241e-06  3.525e+06       0        1
## Item.TypeSnacks                         -4.824e-06  3.225e+06       0        1
## Item.TypeVegetables                             NA         NA      NA       NA
## Sales.ChannelOnline                      5.313e+01  1.424e+06       0        1
## Order.PriorityH                          4.618e-06  2.195e+06       0        1
## Order.PriorityL                          4.618e-06  2.467e+06       0        1
## Order.PriorityM                          4.618e-06  2.137e+06       0        1
## Order.Date1/14/2017                             NA         NA      NA       NA
## Order.Date1/4/2011                              NA         NA      NA       NA
## Order.Date10/11/2013                     9.532e-09  1.007e+06       0        1
## Order.Date10/13/2013                            NA         NA      NA       NA
## Order.Date10/13/2014                            NA         NA      NA       NA
## Order.Date10/14/2014                            NA         NA      NA       NA
## Order.Date10/21/2012                            NA         NA      NA       NA
## Order.Date10/23/2016                            NA         NA      NA       NA
## Order.Date10/28/2014                            NA         NA      NA       NA
## Order.Date10/30/2010                     4.804e-06  1.670e+06       0        1
## Order.Date11/14/2015                            NA         NA      NA       NA
## Order.Date11/19/2016                            NA         NA      NA       NA
## Order.Date11/22/2011                            NA         NA      NA       NA
## Order.Date11/26/2010                            NA         NA      NA       NA
## Order.Date11/26/2011                            NA         NA      NA       NA
## Order.Date11/6/2014                             NA         NA      NA       NA
## Order.Date11/7/2011                             NA         NA      NA       NA
## Order.Date12/23/2010                            NA         NA      NA       NA
## Order.Date12/29/2013                            NA         NA      NA       NA
## Order.Date12/30/2010                            NA         NA      NA       NA
## Order.Date12/31/2016                            NA         NA      NA       NA
## Order.Date12/6/2016                             NA         NA      NA       NA
## Order.Date2/1/2013                              NA         NA      NA       NA
## Order.Date2/16/2012                             NA         NA      NA       NA
## Order.Date2/17/2012                             NA         NA      NA       NA
## Order.Date2/2/2010                              NA         NA      NA       NA
## Order.Date2/23/2015                             NA         NA      NA       NA
## Order.Date2/25/2017                             NA         NA      NA       NA
## Order.Date2/3/2014                              NA         NA      NA       NA
## Order.Date2/4/2015                              NA         NA      NA       NA
## Order.Date2/6/2010                              NA         NA      NA       NA
## Order.Date2/8/2017                              NA         NA      NA       NA
## Order.Date3/11/2017                             NA         NA      NA       NA
## Order.Date3/18/2012                             NA         NA      NA       NA
## Order.Date3/29/2016                             NA         NA      NA       NA
## Order.Date4/18/2014                             NA         NA      NA       NA
## Order.Date4/23/2011                             NA         NA      NA       NA
## Order.Date4/23/2012                             NA         NA      NA       NA
## Order.Date4/23/2013                             NA         NA      NA       NA
## Order.Date4/25/2015                             NA         NA      NA       NA
## Order.Date4/30/2012                             NA         NA      NA       NA
## Order.Date4/7/2014                              NA         NA      NA       NA
## Order.Date5/14/2014                             NA         NA      NA       NA
## Order.Date5/2/2014                              NA         NA      NA       NA
## Order.Date5/22/2017                             NA         NA      NA       NA
## Order.Date5/26/2011                             NA         NA      NA       NA
## Order.Date5/28/2010                             NA         NA      NA       NA
## Order.Date5/29/2012                             NA         NA      NA       NA
## Order.Date5/7/2010                              NA         NA      NA       NA
## Order.Date5/7/2016                              NA         NA      NA       NA
## Order.Date6/1/2016                              NA         NA      NA       NA
## Order.Date6/13/2012                             NA         NA      NA       NA
## Order.Date6/20/2014                             NA         NA      NA       NA
## Order.Date6/26/2013                             NA         NA      NA       NA
## Order.Date6/30/2010                             NA         NA      NA       NA
## Order.Date6/30/2016                             NA         NA      NA       NA
## Order.Date6/7/2012                              NA         NA      NA       NA
## Order.Date6/8/2012                              NA         NA      NA       NA
## Order.Date7/14/2015                             NA         NA      NA       NA
## Order.Date7/17/2012                             NA         NA      NA       NA
## Order.Date7/18/2014                             NA         NA      NA       NA
## Order.Date7/20/2013                             NA         NA      NA       NA
## Order.Date7/26/2011                             NA         NA      NA       NA
## Order.Date7/30/2015                             NA         NA      NA       NA
## Order.Date7/31/2012                             NA         NA      NA       NA
## Order.Date7/31/2015                             NA         NA      NA       NA
## Order.Date7/7/2014                              NA         NA      NA       NA
## Order.Date7/8/2012                              NA         NA      NA       NA
## Order.Date8/14/2015                             NA         NA      NA       NA
## Order.Date8/18/2013                             NA         NA      NA       NA
## Order.Date8/2/2014                              NA         NA      NA       NA
## Order.Date8/22/2012                             NA         NA      NA       NA
## Order.Date9/15/2011                             NA         NA      NA       NA
## Order.Date9/17/2012                             NA         NA      NA       NA
## Order.ID                                        NA         NA      NA       NA
## Ship.Date1/20/2011                              NA         NA      NA       NA
## Ship.Date1/23/2017                              NA         NA      NA       NA
## Ship.Date1/28/2014                              NA         NA      NA       NA
## Ship.Date1/31/2011                              NA         NA      NA       NA
## Ship.Date1/5/2011                               NA         NA      NA       NA
## Ship.Date1/7/2012                               NA         NA      NA       NA
## Ship.Date10/20/2012                             NA         NA      NA       NA
## Ship.Date10/23/2011                             NA         NA      NA       NA
## Ship.Date11/10/2014                             NA         NA      NA       NA
## Ship.Date11/14/2014                             NA         NA      NA       NA
## Ship.Date11/15/2011                             NA         NA      NA       NA
## Ship.Date11/15/2014                             NA         NA      NA       NA
## Ship.Date11/16/2013                             NA         NA      NA       NA
## Ship.Date11/17/2010                             NA         NA      NA       NA
## Ship.Date11/18/2015                             NA         NA      NA       NA
## Ship.Date11/25/2013                             NA         NA      NA       NA
## Ship.Date11/25/2016                             NA         NA      NA       NA
## Ship.Date11/30/2012                             NA         NA      NA       NA
## Ship.Date12/12/2014                             NA         NA      NA       NA
## Ship.Date12/14/2016                             NA         NA      NA       NA
## Ship.Date12/18/2016                             NA         NA      NA       NA
## Ship.Date12/25/2010                             NA         NA      NA       NA
## Ship.Date12/3/2011                              NA         NA      NA       NA
## Ship.Date12/31/2016                             NA         NA      NA       NA
## Ship.Date2/13/2017                              NA         NA      NA       NA
## Ship.Date2/21/2015                              NA         NA      NA       NA
## Ship.Date2/25/2010                              NA         NA      NA       NA
## Ship.Date2/25/2017                              NA         NA      NA       NA
## Ship.Date2/28/2012                              NA         NA      NA       NA
## Ship.Date2/6/2013                               NA         NA      NA       NA
## Ship.Date3/18/2010                              NA         NA      NA       NA
## Ship.Date3/2/2015                               NA         NA      NA       NA
## Ship.Date3/20/2012                              NA         NA      NA       NA
## Ship.Date3/20/2014                              NA         NA      NA       NA
## Ship.Date3/28/2017                              NA         NA      NA       NA
## Ship.Date4/19/2014                              NA         NA      NA       NA
## Ship.Date4/27/2011                              NA         NA      NA       NA
## Ship.Date4/29/2016                              NA         NA      NA       NA
## Ship.Date4/7/2012                               NA         NA      NA       NA
## Ship.Date5/10/2010                              NA         NA      NA       NA
## Ship.Date5/10/2016                              NA         NA      NA       NA
## Ship.Date5/18/2012                              NA         NA      NA       NA
## Ship.Date5/20/2013                              NA         NA      NA       NA
## Ship.Date5/28/2015                              NA         NA      NA       NA
## Ship.Date5/30/2014                              NA         NA      NA       NA
## Ship.Date5/8/2014                               NA         NA      NA       NA
## Ship.Date6/2/2012                               NA         NA      NA       NA
## Ship.Date6/27/2010                              NA         NA      NA       NA
## Ship.Date6/27/2012                              NA         NA      NA       NA
## Ship.Date6/28/2014                              NA         NA      NA       NA
## Ship.Date6/29/2016                              NA         NA      NA       NA
## Ship.Date6/3/2012                               NA         NA      NA       NA
## Ship.Date6/5/2017                               NA         NA      NA       NA
## Ship.Date6/8/2012                               NA         NA      NA       NA
## Ship.Date7/1/2013                               NA         NA      NA       NA
## Ship.Date7/11/2014                              NA         NA      NA       NA
## Ship.Date7/15/2011                              NA         NA      NA       NA
## Ship.Date7/24/2012                              NA         NA      NA       NA
## Ship.Date7/26/2016                              NA         NA      NA       NA
## Ship.Date7/27/2012                              NA         NA      NA       NA
## Ship.Date7/30/2014                              NA         NA      NA       NA
## Ship.Date7/5/2014                               NA         NA      NA       NA
## Ship.Date7/9/2012                               NA         NA      NA       NA
## Ship.Date8/1/2010                               NA         NA      NA       NA
## Ship.Date8/19/2014                              NA         NA      NA       NA
## Ship.Date8/25/2015                              NA         NA      NA       NA
## Ship.Date8/7/2013                               NA         NA      NA       NA
## Ship.Date8/8/2015                               NA         NA      NA       NA
## Ship.Date9/11/2012                              NA         NA      NA       NA
## Ship.Date9/15/2012                              NA         NA      NA       NA
## Ship.Date9/18/2013                              NA         NA      NA       NA
## Ship.Date9/3/2011                               NA         NA      NA       NA
## Ship.Date9/3/2015                               NA         NA      NA       NA
## Ship.Date9/30/2015                              NA         NA      NA       NA
## Units.Sold                                      NA         NA      NA       NA
## Unit.Price                                      NA         NA      NA       NA
## Unit.Cost                                       NA         NA      NA       NA
## Total.Revenue                                   NA         NA      NA       NA
## Total.Cost                                      NA         NA      NA       NA
## Total.Profit                                    NA         NA      NA       NA
## `Order Date`                                    NA         NA      NA       NA
## `Ship Date`                                     NA         NA      NA       NA
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 1.0385e+02  on 74  degrees of freedom
## Residual deviance: 4.3512e-10  on  0  degrees of freedom
## AIC: 150
## 
## Number of Fisher Scoring iterations: 25
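The summary above reports 0 residual degrees of freedom, a residual deviance near zero, standard errors on the order of 1e+06 with p-values of 1, and 25 Fisher scoring iterations (the `glm` default cap). That pattern is what a saturated fit looks like: factors with close to one level per row, such as date strings read as text, use up every observation. The following is a minimal sketch on simulated data, not the sales data; `toy`, `y`, `id_like`, `x`, `fit_saturated` and `fit_simple` are hypothetical names introduced only for illustration.

```r
# Simulated illustration only; none of these names come from the sales data.
set.seed(1)
n <- 20
toy <- data.frame(
  y       = rep(c(0, 1), length.out = n),  # binary outcome
  id_like = factor(seq_len(n)),            # one factor level per row, like an unparsed date
  x       = rnorm(n)                       # ordinary numeric predictor
)

# Saturated fit: the factor alone supplies n coefficients (intercept + n - 1 dummies).
# glm() emits convergence / fitted-probability warnings here; they are suppressed.
fit_saturated <- suppressWarnings(
  glm(y ~ id_like + x, data = toy, family = binomial)
)

# Same outcome without the one-level-per-row factor.
fit_simple <- glm(y ~ x, data = toy, family = binomial)

df.residual(fit_saturated)       # residual degrees of freedom of the saturated fit
df.residual(fit_simple)          # residual degrees of freedom with one numeric predictor
sum(is.na(coef(fit_saturated)))  # coefficients not estimated because of singularities
```

If the same mechanism is at work in the small model, parsing the date columns with `as.Date()` or leaving them and `Order.ID` out of the formula should leave some residual degrees of freedom.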
# Fit a binomial (logistic) GLM on the large training set.
df_50000_large_mod <- glm(newcode ~ Region + Country + Item.Type + Order.Priority + Units.Sold + Unit.Price + Unit.Cost + Total.Cost + Total.Profit + Total.Revenue, data = df_50000_large_train, family = binomial)
summary(df_50000_large_mod)
## 
## Call:
## glm(formula = newcode ~ Region + Country + Item.Type + Order.Priority + 
##     Units.Sold + Unit.Price + Unit.Cost + Total.Cost + Total.Profit + 
##     Total.Revenue, family = binomial, data = df_50000_large_train)
## 
## Coefficients: (9 not defined because of singularities)
##                                           Estimate Std. Error z value Pr(>|z|)
## (Intercept)                              2.374e-02  1.464e-01   0.162  0.87118
## RegionAustralia and Oceania              1.535e-02  2.013e-01   0.076  0.93920
## RegionCentral America and the Caribbean -7.638e-02  1.892e-01  -0.404  0.68647
## RegionEurope                             1.417e-01  1.984e-01   0.715  0.47487
## RegionMiddle East and North Africa       7.617e-02  2.010e-01   0.379  0.70478
## RegionNorth America                     -9.866e-02  1.968e-01  -0.501  0.61613
## RegionSub-Saharan Africa                -1.229e-01  2.028e-01  -0.606  0.54449
## CountryAlbania                          -2.136e-01  1.982e-01  -1.078  0.28122
## CountryAlgeria                          -2.131e-01  2.018e-01  -1.056  0.29088
## CountryAndorra                           1.789e-02  1.996e-01   0.090  0.92858
## CountryAngola                            1.572e-01  2.011e-01   0.782  0.43421
## CountryAntigua and Barbuda              -3.808e-02  1.859e-01  -0.205  0.83766
## CountryArmenia                           1.403e-02  2.063e-01   0.068  0.94578
## CountryAustralia                         8.874e-03  1.997e-01   0.044  0.96455
## CountryAustria                          -5.898e-02  1.968e-01  -0.300  0.76443
## CountryAzerbaijan                        2.470e-01  2.021e-01   1.222  0.22155
## CountryBahrain                          -1.895e-03  1.988e-01  -0.010  0.99240
## CountryBangladesh                        8.350e-02  1.943e-01   0.430  0.66732
## CountryBarbados                          1.415e-02  1.918e-01   0.074  0.94119
## CountryBelarus                          -3.705e-01  2.040e-01  -1.817  0.06929
## CountryBelgium                          -1.249e-01  1.970e-01  -0.634  0.52600
## CountryBelize                            4.445e-02  1.927e-01   0.231  0.81755
## CountryBenin                             2.852e-01  2.061e-01   1.384  0.16646
## CountryBhutan                            3.837e-02  2.024e-01   0.190  0.84962
## CountryBosnia and Herzegovina           -4.190e-01  2.048e-01  -2.045  0.04081
## CountryBotswana                          8.647e-03  2.030e-01   0.043  0.96603
## CountryBrunei                            2.173e-01  2.018e-01   1.077  0.28156
## CountryBulgaria                         -2.880e-01  1.998e-01  -1.441  0.14949
## CountryBurkina Faso                      1.669e-01  2.041e-01   0.818  0.41329
## CountryBurundi                           5.060e-02  2.046e-01   0.247  0.80466
## CountryCambodia                         -2.868e-02  1.979e-01  -0.145  0.88478
## CountryCameroon                         -2.309e-02  2.055e-01  -0.112  0.91055
## CountryCanada                            1.184e-01  1.998e-01   0.592  0.55358
## CountryCape Verde                       -9.923e-02  1.989e-01  -0.499  0.61788
## CountryCentral African Republic          5.488e-01  2.023e-01   2.713  0.00667
## CountryChad                              1.067e-01  2.033e-01   0.525  0.59971
## CountryChina                            -6.176e-02  1.922e-01  -0.321  0.74793
## CountryComoros                           2.778e-02  2.089e-01   0.133  0.89423
## CountryCosta Rica                        6.503e-02  1.968e-01   0.330  0.74108
## CountryCote d'Ivoire                     3.310e-01  2.108e-01   1.570  0.11645
## CountryCroatia                          -1.324e-01  2.006e-01  -0.660  0.50912
## CountryCuba                              2.854e-01  1.929e-01   1.479  0.13901
## CountryCyprus                           -5.920e-02  1.968e-01  -0.301  0.76360
## CountryCzech Republic                   -1.109e-01  2.057e-01  -0.539  0.58960
## CountryDemocratic Republic of the Congo  2.654e-01  1.980e-01   1.340  0.18012
## CountryDenmark                           7.771e-02  1.978e-01   0.393  0.69439
## CountryDjibouti                         -6.464e-02  2.015e-01  -0.321  0.74833
## CountryDominica                          1.515e-01  1.944e-01   0.779  0.43578
## CountryDominican Republic                3.748e-02  1.868e-01   0.201  0.84098
## CountryEast Timor                        3.410e-02  2.015e-01   0.169  0.86564
## CountryEgypt                             6.554e-02  2.001e-01   0.327  0.74330
## CountryEl Salvador                       9.484e-02  1.958e-01   0.484  0.62818
## CountryEquatorial Guinea                -6.020e-02  2.047e-01  -0.294  0.76869
## CountryEritrea                           2.777e-01  2.064e-01   1.346  0.17833
## CountryEstonia                          -1.138e-01  1.996e-01  -0.570  0.56853
## CountryEthiopia                          2.783e-01  2.058e-01   1.352  0.17633
## CountryFederated States of Micronesia    6.232e-02  2.062e-01   0.302  0.76246
## CountryFiji                              1.460e-01  2.046e-01   0.714  0.47539
## CountryFinland                          -1.780e-01  1.924e-01  -0.925  0.35488
## CountryFrance                            2.870e-02  1.927e-01   0.149  0.88161
## CountryGabon                             2.643e-01  1.992e-01   1.327  0.18455
## CountryGeorgia                          -1.176e-01  2.024e-01  -0.581  0.56135
## CountryGermany                          -1.508e-01  2.001e-01  -0.754  0.45112
## CountryGhana                            -5.567e-02  2.059e-01  -0.270  0.78689
## CountryGreece                           -7.896e-02  2.025e-01  -0.390  0.69661
## CountryGreenland                        -6.347e-03  1.957e-01  -0.032  0.97413
## CountryGrenada                           2.254e-01  1.851e-01   1.218  0.22325
## CountryGuatemala                        -7.898e-02  1.905e-01  -0.415  0.67849
## CountryGuinea                            4.396e-01  1.979e-01   2.221  0.02635
## CountryGuinea-Bissau                     9.889e-02  2.069e-01   0.478  0.63258
## CountryHaiti                             1.128e-01  1.981e-01   0.570  0.56899
## CountryHonduras                         -5.709e-03  1.876e-01  -0.030  0.97572
## CountryHungary                          -6.010e-02  2.020e-01  -0.298  0.76605
## CountryIceland                          -3.084e-01  2.023e-01  -1.525  0.12733
## CountryIndia                             1.160e-01  2.062e-01   0.563  0.57358
## CountryIndonesia                         5.622e-02  1.977e-01   0.284  0.77610
## CountryIran                              3.026e-02  2.075e-01   0.146  0.88403
## CountryIraq                             -2.236e-01  1.978e-01  -1.131  0.25822
## CountryIreland                          -6.117e-02  1.964e-01  -0.312  0.75542
## CountryIsrael                           -2.567e-01  2.063e-01  -1.244  0.21354
## CountryItaly                             3.288e-02  1.962e-01   0.168  0.86694
## CountryJamaica                           1.057e-01  1.915e-01   0.552  0.58110
## CountryJapan                            -1.562e-01  2.051e-01  -0.762  0.44634
## CountryJordan                            1.635e-01  2.075e-01   0.788  0.43059
## CountryKazakhstan                       -3.814e-02  2.036e-01  -0.187  0.85140
## CountryKenya                             7.189e-02  2.036e-01   0.353  0.72401
## CountryKiribati                         -1.466e-01  2.020e-01  -0.726  0.46793
## CountryKosovo                           -2.232e-01  2.020e-01  -1.105  0.26919
## CountryKuwait                            2.410e-02  1.996e-01   0.121  0.90388
## CountryKyrgyzstan                        3.586e-01  2.010e-01   1.784  0.07446
## CountryLaos                             -1.179e-01  2.017e-01  -0.585  0.55869
## CountryLatvia                           -1.653e-01  1.970e-01  -0.839  0.40122
## CountryLebanon                          -9.530e-02  2.025e-01  -0.471  0.63795
## CountryLesotho                           3.547e-01  2.025e-01   1.751  0.07992
## CountryLiberia                           1.925e-01  2.032e-01   0.948  0.34330
## CountryLibya                            -5.811e-02  2.006e-01  -0.290  0.77201
## CountryLiechtenstein                    -3.346e-01  2.029e-01  -1.649  0.09919
## CountryLithuania                        -2.707e-01  1.963e-01  -1.379  0.16790
## CountryLuxembourg                       -2.015e-01  1.989e-01  -1.013  0.31093
## CountryMacedonia                        -3.071e-01  1.960e-01  -1.567  0.11708
## CountryMadagascar                        2.436e-01  2.028e-01   1.201  0.22973
## CountryMalawi                            1.040e-01  2.097e-01   0.496  0.62000
## CountryMalaysia                          8.798e-03  1.960e-01   0.045  0.96419
## CountryMaldives                         -2.596e-02  1.925e-01  -0.135  0.89272
## CountryMali                              1.755e-02  2.037e-01   0.086  0.93133
## CountryMalta                            -9.153e-02  1.936e-01  -0.473  0.63638
## CountryMarshall Islands                  2.568e-01  2.057e-01   1.249  0.21177
## CountryMauritania                       -1.121e-01  2.010e-01  -0.558  0.57693
## CountryMauritius                         8.079e-02  2.024e-01   0.399  0.68979
## CountryMexico                           -3.733e-02  1.936e-01  -0.193  0.84712
## CountryMoldova                           3.828e-02  2.002e-01   0.191  0.84838
## CountryMonaco                            3.742e-02  2.086e-01   0.179  0.85762
## CountryMongolia                         -1.339e-01  1.991e-01  -0.673  0.50120
## CountryMontenegro                       -1.794e-02  2.000e-01  -0.090  0.92855
## CountryMorocco                           1.341e-01  2.051e-01   0.654  0.51316
## CountryMozambique                       -2.576e-02  2.018e-01  -0.128  0.89841
## CountryMyanmar                          -1.139e-01  1.990e-01  -0.572  0.56700
## CountryNamibia                           4.303e-01  1.991e-01   2.161  0.03067
## CountryNauru                             1.113e-02  2.025e-01   0.055  0.95618
## CountryNepal                             2.864e-01  2.055e-01   1.393  0.16354
## CountryNetherlands                      -2.393e-01  1.943e-01  -1.232  0.21803
## CountryNew Zealand                      -1.524e-01  2.053e-01  -0.742  0.45789
## CountryNicaragua                        -7.772e-02  1.895e-01  -0.410  0.68171
## CountryNiger                             7.651e-02  2.069e-01   0.370  0.71148
## CountryNigeria                           1.623e-01  2.024e-01   0.802  0.42262
## CountryNorth Korea                      -5.370e-02  1.989e-01  -0.270  0.78716
## CountryNorway                           -1.441e-01  1.965e-01  -0.734  0.46322
## CountryOman                             -3.809e-02  2.040e-01  -0.187  0.85191
## CountryPakistan                         -7.631e-02  2.079e-01  -0.367  0.71360
## CountryPalau                            -2.007e-02  2.050e-01  -0.098  0.92204
## CountryPanama                           -2.233e-01  1.884e-01  -1.185  0.23583
## CountryPapua New Guinea                  8.896e-02  2.060e-01   0.432  0.66582
## CountryPhilippines                       1.891e-01  2.006e-01   0.942  0.34597
## CountryPoland                            1.445e-01  2.038e-01   0.709  0.47829
## CountryPortugal                         -2.381e-01  2.034e-01  -1.171  0.24171
## CountryQatar                            -1.886e-02  2.001e-01  -0.094  0.92494
## CountryRepublic of the Congo             1.046e-01  2.074e-01   0.505  0.61388
## CountryRomania                          -2.981e-01  2.002e-01  -1.489  0.13648
## CountryRussia                           -7.115e-02  1.986e-01  -0.358  0.72020
## CountryRwanda                            3.024e-01  2.037e-01   1.484  0.13775
## CountrySaint Kitts and Nevis             1.877e-01  1.861e-01   1.009  0.31314
## CountrySaint Lucia                       1.496e-02  1.857e-01   0.081  0.93581
## CountrySaint Vincent and the Grenadines -8.930e-02  1.922e-01  -0.465  0.64211
## CountrySamoa                            -1.442e-02  1.934e-01  -0.075  0.94058
## CountrySan Marino                       -2.398e-01  1.975e-01  -1.214  0.22484
## CountrySao Tome and Principe             1.649e-02  1.987e-01   0.083  0.93386
## CountrySaudi Arabia                      7.664e-02  1.958e-01   0.391  0.69552
## CountrySenegal                           1.306e-01  2.026e-01   0.644  0.51934
## CountrySerbia                           -5.147e-01  2.013e-01  -2.557  0.01056
## CountrySeychelles                        3.801e-02  2.009e-01   0.189  0.84996
## CountrySierra Leone                      1.626e-01  2.024e-01   0.803  0.42175
## CountrySingapore                         5.478e-02  1.924e-01   0.285  0.77580
## CountrySlovakia                         -9.291e-02  1.991e-01  -0.467  0.64078
## CountrySlovenia                         -2.897e-01  2.027e-01  -1.429  0.15302
## CountrySolomon Islands                   7.083e-02  2.025e-01   0.350  0.72655
## CountrySomalia                           3.343e-02  1.998e-01   0.167  0.86713
## CountrySouth Africa                     -9.526e-02  2.012e-01  -0.473  0.63592
## CountrySouth Korea                       1.429e-01  1.967e-01   0.726  0.46761
## CountrySouth Sudan                      -3.754e-02  2.011e-01  -0.187  0.85195
## CountrySpain                            -3.600e-01  2.028e-01  -1.776  0.07581
## CountrySri Lanka                         4.060e-02  1.962e-01   0.207  0.83609
## CountrySudan                             9.619e-02  2.000e-01   0.481  0.63054
## CountrySwaziland                         5.879e-02  2.086e-01   0.282  0.77804
## CountrySweden                           -9.805e-02  2.041e-01  -0.480  0.63100
## CountrySwitzerland                      -2.209e-01  2.034e-01  -1.086  0.27735
## CountrySyria                            -1.902e-01  2.075e-01  -0.917  0.35921
## CountryTaiwan                           -1.159e-01  1.959e-01  -0.592  0.55391
## CountryTajikistan                        8.166e-02  1.977e-01   0.413  0.67957
## CountryTanzania                         -2.997e-01  2.100e-01  -1.427  0.15351
## CountryThailand                          7.130e-02  1.932e-01   0.369  0.71210
## CountryThe Bahamas                       3.842e-01  1.899e-01   2.023  0.04308
## CountryThe Gambia                       -8.433e-02  2.015e-01  -0.418  0.67560
## CountryTogo                              1.355e-01  2.082e-01   0.651  0.51524
## CountryTonga                            -1.714e-01  2.043e-01  -0.839  0.40142
## CountryTrinidad and Tobago                      NA         NA      NA       NA
## CountryTunisia                          -1.947e-01  2.012e-01  -0.968  0.33321
## CountryTurkey                           -2.739e-01  2.100e-01  -1.305  0.19197
## CountryTurkmenistan                     -9.191e-02  1.980e-01  -0.464  0.64247
## CountryTuvalu                            4.688e-02  2.023e-01   0.232  0.81672
## CountryUganda                            2.067e-01  2.018e-01   1.024  0.30568
## CountryUkraine                          -3.154e-01  1.958e-01  -1.611  0.10722
## CountryUnited Arab Emirates             -7.991e-02  1.984e-01  -0.403  0.68706
## CountryUnited Kingdom                   -1.020e-01  2.009e-01  -0.508  0.61142
## CountryUnited States of America                 NA         NA      NA       NA
## CountryUzbekistan                        5.642e-02  1.953e-01   0.289  0.77271
## CountryVanuatu                                  NA         NA      NA       NA
## CountryVatican City                             NA         NA      NA       NA
## CountryVietnam                                  NA         NA      NA       NA
## CountryYemen                             2.538e-01  2.126e-01   1.194  0.23257
## CountryZambia                            9.347e-02  2.051e-01   0.456  0.64853
## CountryZimbabwe                                 NA         NA      NA       NA
## Item.TypeBeverages                      -3.091e-02  5.893e-02  -0.524  0.60000
## Item.TypeCereal                         -5.367e-02  5.149e-02  -1.042  0.29726
## Item.TypeClothes                        -8.740e-02  5.289e-02  -1.652  0.09848
## Item.TypeCosmetics                      -9.111e-02  5.917e-02  -1.540  0.12364
## Item.TypeFruits                         -5.489e-02  6.107e-02  -0.899  0.36882
## Item.TypeHousehold                      -4.888e-02  6.193e-02  -0.789  0.42990
## Item.TypeMeat                           -8.891e-02  6.876e-02  -1.293  0.19595
## Item.TypeOffice Supplies                -2.834e-02  6.752e-02  -0.420  0.67468
## Item.TypePersonal Care                  -4.829e-02  5.755e-02  -0.839  0.40144
## Item.TypeSnacks                         -1.933e-02  5.331e-02  -0.363  0.71691
## Item.TypeVegetables                     -7.770e-03  5.247e-02  -0.148  0.88227
## Order.PriorityH                          1.668e-02  2.938e-02   0.568  0.57031
## Order.PriorityL                          4.043e-05  2.933e-02   0.001  0.99890
## Order.PriorityM                         -3.654e-02  2.934e-02  -1.245  0.21303
## Units.Sold                               4.020e-06  6.451e-06   0.623  0.53312
## Unit.Price                                      NA         NA      NA       NA
## Unit.Cost                                       NA         NA      NA       NA
## Total.Cost                              -1.228e-08  2.969e-08  -0.414  0.67917
## Total.Profit                             5.080e-08  9.919e-08   0.512  0.60854
## Total.Revenue                                   NA         NA      NA       NA
##                                           
## (Intercept)                               
## RegionAustralia and Oceania               
## RegionCentral America and the Caribbean   
## RegionEurope                              
## RegionMiddle East and North Africa        
## RegionNorth America                       
## RegionSub-Saharan Africa                  
## CountryAlbania                            
## CountryAlgeria                            
## CountryAndorra                            
## CountryAngola                             
## CountryAntigua and Barbuda                
## CountryArmenia                            
## CountryAustralia                          
## CountryAustria                            
## CountryAzerbaijan                         
## CountryBahrain                            
## CountryBangladesh                         
## CountryBarbados                           
## CountryBelarus                          . 
## CountryBelgium                            
## CountryBelize                             
## CountryBenin                              
## CountryBhutan                             
## CountryBosnia and Herzegovina           * 
## CountryBotswana                           
## CountryBrunei                             
## CountryBulgaria                           
## CountryBurkina Faso                       
## CountryBurundi                            
## CountryCambodia                           
## CountryCameroon                           
## CountryCanada                             
## CountryCape Verde                         
## CountryCentral African Republic         **
## CountryChad                               
## CountryChina                              
## CountryComoros                            
## CountryCosta Rica                         
## CountryCote d'Ivoire                      
## CountryCroatia                            
## CountryCuba                               
## CountryCyprus                             
## CountryCzech Republic                     
## CountryDemocratic Republic of the Congo   
## CountryDenmark                            
## CountryDjibouti                           
## CountryDominica                           
## CountryDominican Republic                 
## CountryEast Timor                         
## CountryEgypt                              
## CountryEl Salvador                        
## CountryEquatorial Guinea                  
## CountryEritrea                            
## CountryEstonia                            
## CountryEthiopia                           
## CountryFederated States of Micronesia     
## CountryFiji                               
## CountryFinland                            
## CountryFrance                             
## CountryGabon                              
## CountryGeorgia                            
## CountryGermany                            
## CountryGhana                              
## CountryGreece                             
## CountryGreenland                          
## CountryGrenada                            
## CountryGuatemala                          
## CountryGuinea                           * 
## CountryGuinea-Bissau                      
## CountryHaiti                              
## CountryHonduras                           
## CountryHungary                            
## CountryIceland                            
## CountryIndia                              
## CountryIndonesia                          
## CountryIran                               
## CountryIraq                               
## CountryIreland                            
## CountryIsrael                             
## CountryItaly                              
## CountryJamaica                            
## CountryJapan                              
## CountryJordan                             
## CountryKazakhstan                         
## CountryKenya                              
## CountryKiribati                           
## CountryKosovo                             
## CountryKuwait                             
## CountryKyrgyzstan                       . 
## CountryLaos                               
## CountryLatvia                             
## CountryLebanon                            
## CountryLesotho                          . 
## CountryLiberia                            
## CountryLibya                              
## CountryLiechtenstein                    . 
## CountryLithuania                          
## CountryLuxembourg                         
## CountryMacedonia                          
## CountryMadagascar                         
## CountryMalawi                             
## CountryMalaysia                           
## CountryMaldives                           
## CountryMali                               
## CountryMalta                              
## CountryMarshall Islands                   
## CountryMauritania                         
## CountryMauritius                          
## CountryMexico                             
## CountryMoldova                            
## CountryMonaco                             
## CountryMongolia                           
## CountryMontenegro                         
## CountryMorocco                            
## CountryMozambique                         
## CountryMyanmar                            
## CountryNamibia                          * 
## CountryNauru                              
## CountryNepal                              
## CountryNetherlands                        
## CountryNew Zealand                        
## CountryNicaragua                          
## CountryNiger                              
## CountryNigeria                            
## CountryNorth Korea                        
## CountryNorway                             
## CountryOman                               
## CountryPakistan                           
## CountryPalau                              
## CountryPanama                             
## CountryPapua New Guinea                   
## CountryPhilippines                        
## CountryPoland                             
## CountryPortugal                           
## CountryQatar                              
## CountryRepublic of the Congo              
## CountryRomania                            
## CountryRussia                             
## CountryRwanda                             
## CountrySaint Kitts and Nevis              
## CountrySaint Lucia                        
## CountrySaint Vincent and the Grenadines   
## CountrySamoa                              
## CountrySan Marino                         
## CountrySao Tome and Principe              
## CountrySaudi Arabia                       
## CountrySenegal                            
## CountrySerbia                           * 
## CountrySeychelles                         
## CountrySierra Leone                       
## CountrySingapore                          
## CountrySlovakia                           
## CountrySlovenia                           
## CountrySolomon Islands                    
## CountrySomalia                            
## CountrySouth Africa                       
## CountrySouth Korea                        
## CountrySouth Sudan                        
## CountrySpain                            . 
## CountrySri Lanka                          
## CountrySudan                              
## CountrySwaziland                          
## CountrySweden                             
## CountrySwitzerland                        
## CountrySyria                              
## CountryTaiwan                             
## CountryTajikistan                         
## CountryTanzania                           
## CountryThailand                           
## CountryThe Bahamas                      * 
## CountryThe Gambia                         
## CountryTogo                               
## CountryTonga                              
## CountryTrinidad and Tobago                
## CountryTunisia                            
## CountryTurkey                             
## CountryTurkmenistan                       
## CountryTuvalu                             
## CountryUganda                             
## CountryUkraine                            
## CountryUnited Arab Emirates               
## CountryUnited Kingdom                     
## CountryUnited States of America           
## CountryUzbekistan                         
## CountryVanuatu                            
## CountryVatican City                       
## CountryVietnam                            
## CountryYemen                              
## CountryZambia                             
## CountryZimbabwe                           
## Item.TypeBeverages                        
## Item.TypeCereal                           
## Item.TypeClothes                        . 
## Item.TypeCosmetics                        
## Item.TypeFruits                           
## Item.TypeHousehold                        
## Item.TypeMeat                             
## Item.TypeOffice Supplies                  
## Item.TypePersonal Care                    
## Item.TypeSnacks                           
## Item.TypeVegetables                       
## Order.PriorityH                           
## Order.PriorityL                           
## Order.PriorityM                           
## Units.Sold                                
## Unit.Price                                
## Unit.Cost                                 
## Total.Cost                                
## Total.Profit                              
## Total.Revenue                             
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 51986  on 37499  degrees of freedom
## Residual deviance: 51778  on 37298  degrees of freedom
## AIC: 52182
## 
## Number of Fisher Scoring iterations: 3
exp(coef(df_100_small_mod)["RegionAustralia and Oceania"])
## RegionAustralia and Oceania 
##                   0.9999954
exp(coef(df_100_small_mod)["RegionCentral America and the Caribbean"])
## RegionCentral America and the Caribbean 
##                                       1
exp(coef(df_100_small_mod)["RegionEurope"])
## RegionEurope 
##    0.9999998

Predict results using the logistic regression model

The test dataset for the 100-record model is very small, so predictions are made on the test set of the larger dataset (12,500 records). The results show the probability that Sales.Channel = “Online” for each observation. For example, record 3 (“Russia”, in Europe) has a 50% probability of using the online sales channel, while record 21 (“Antigua and Barbuda”) has a 46% probability.
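To make the link between a fitted logistic model and these probabilities concrete, here is a minimal sketch. It uses R's built-in mtcars data as a stand-in (an assumption for illustration, not the sales data), and the object names toy_mod, log_odds, prob_manual and prob_direct are hypothetical. It shows that predict(type = "response") is the inverse logit of the linear predictor.

```r
# Stand-in data: built-in mtcars, with transmission type (am) as the binary outcome
toy_mod <- glm(am ~ wt, data = mtcars, family = binomial)

# Linear predictor (log-odds) for each row
log_odds <- predict(toy_mod, newdata = mtcars, type = "link")

# The inverse logit turns log-odds into probabilities
prob_manual <- plogis(log_odds)  # same as 1 / (1 + exp(-log_odds))
prob_direct <- predict(toy_mod, newdata = mtcars, type = "response")

# Both routes give the same probabilities
all.equal(prob_manual, prob_direct)

# A probability of exactly 0.5 corresponds to log-odds of 0
qlogis(0.5)
```

Read this way, a predicted probability close to 0.5 means the linear predictor is close to zero, that is, the predictors barely move the odds in either direction.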

Let’s compare our model’s predicted values for Sales.Channel with the actual values in the test data set. To do this, we create a confusion matrix, which cross-tabulates the predicted and actual values. Using the base R table() function, we can create a simple confusion matrix. It shows that our model has a prediction accuracy of 49.1 percent, which is no better than guessing at random between the two channels.

filter(df_100_small_test, Country %in% c("Bangladesh", "Belize", "Brunei", "Costa Rica", "Iran", "Kyrgyzstan", "Lebanon", "Lithuania", "Malaysia", "Mongolia", "Mozambique", "New Zealand", "Nicaragua", "Pakistan", "Saudi Arabia", "Slovakia", "United Kingdom"))
##                               Region        Country       Item.Type
## 1                               Asia     Kyrgyzstan      Vegetables
## 2                               Asia     Bangladesh         Clothes
## 3                               Asia       Mongolia   Personal Care
## 4              Australia and Oceania    New Zealand          Fruits
## 5  Central America and the Caribbean     Costa Rica   Personal Care
## 6                               Asia         Brunei Office Supplies
## 7                             Europe       Slovakia      Vegetables
## 8       Middle East and North Africa   Saudi Arabia          Cereal
## 9                             Europe United Kingdom       Household
## 10 Central America and the Caribbean         Belize         Clothes
## 11                            Europe      Lithuania Office Supplies
## 12      Middle East and North Africa       Pakistan       Cosmetics
## 13      Middle East and North Africa        Lebanon         Clothes
## 14      Middle East and North Africa           Iran       Cosmetics
## 15 Central America and the Caribbean      Nicaragua       Beverages
## 16                              Asia       Malaysia          Fruits
## 17                Sub-Saharan Africa     Mozambique       Household
##    Sales.Channel Order.Priority Order.Date  Order.ID  Ship.Date Units.Sold
## 1         Online              H  6/24/2011 814711606  7/12/2011        124
## 2         Online              L  1/13/2017 187310731   3/1/2017       8263
## 3        Offline              C  2/19/2014 832401311  2/23/2014       4901
## 4         Online              H   9/8/2014 142278373  10/4/2014       2187
## 5        Offline              L   5/8/2017 456767165  5/21/2017       6409
## 6         Online              L   4/1/2012 320009267   5/8/2012       6708
## 7         Online              H  10/6/2012 759224212 11/10/2012        171
## 8         Online              M  3/25/2013 844530045  3/28/2013       4063
## 9         Online              L   1/5/2012 955357205  2/14/2012        282
## 10       Offline              M  7/25/2016 807025039   9/7/2016       5498
## 11       Offline              H 10/24/2010 166460740 11/17/2010       8287
## 12       Offline              L   7/5/2013 231145322  8/16/2013       9892
## 13        Online              L  9/18/2012 663110148  10/8/2012       7884
## 14        Online              H 11/15/2016 286959302  12/8/2016       6489
## 15       Offline              C   2/8/2011 963392674  3/21/2011       8156
## 16       Offline              L 11/11/2011 810711038 12/28/2011       6267
## 17       Offline              L  2/10/2012 665095412  2/15/2012       5367
##    Unit.Price Unit.Cost Total.Revenue Total.Cost Total.Profit Order Date
## 1      154.06     90.93      19103.44   11275.32      7828.12 2011-06-24
## 2      109.28     35.84     902980.64  296145.92    606834.72 2017-01-13
## 3       81.73     56.67     400558.73  277739.67    122819.06 2014-02-19
## 4        9.33      6.92      20404.71   15134.04      5270.67 2014-09-08
## 5       81.73     56.67     523807.57  363198.03    160609.54 2017-05-08
## 6      651.21    524.96    4368316.68 3521431.68    846885.00 2012-04-01
## 7      154.06     90.93      26344.26   15549.03     10795.23 2012-10-06
## 8      205.70    117.11     835759.10  475817.93    359941.17 2013-03-25
## 9      668.27    502.54     188452.14  141716.28     46735.86 2012-01-05
## 10     109.28     35.84     600821.44  197048.32    403773.12 2016-07-25
## 11     651.21    524.96    5396577.27 4350343.52   1046233.75 2010-10-24
## 12     437.20    263.33    4324782.40 2604860.36   1719922.04 2013-07-05
## 13     109.28     35.84     861563.52  282562.56    579000.96 2012-09-18
## 14     437.20    263.33    2836990.80 1708748.37   1128242.43 2016-11-15
## 15      47.45     31.79     387002.20  259279.24    127722.96 2011-02-08
## 16       9.33      6.92      58471.11   43367.64     15103.47 2011-11-11
## 17     668.27    502.54    3586605.09 2697132.18    889472.91 2012-02-10
##     Ship Date newcode
## 1  2011-07-12       1
## 2  2017-03-01       1
## 3  2014-02-23       0
## 4  2014-10-04       1
## 5  2017-05-21       0
## 6  2012-05-08       1
## 7  2012-11-10       1
## 8  2013-03-28       1
## 9  2012-02-14       1
## 10 2016-09-07       0
## 11 2010-11-17       0
## 12 2013-08-16       0
## 13 2012-10-08       1
## 14 2016-12-08       1
## 15 2011-03-21       0
## 16 2011-12-28       0
## 17 2012-02-15       0
df_100_small_test1 <- df_100_small_test %>%
  filter(!Country %in% c('Bangladesh', 'Belize', 'Brunei', 'Costa Rica', 'Iran', 'Kyrgyzstan', 'Lebanon', 'Lithuania', 'Malaysia', 'Mongolia', 'Mozambique', 'New Zealand', 'Nicaragua', 'Pakistan', 'Saudi Arabia', 'Slovakia', 'United Kingdom'))

#df_100_small_mod_pred1 <- predict(df_100_small_mod, df_100_small_test1, type = 'response')
#head(df_100_small_mod_pred1)

df_50000_large_mod_pred1 <- predict(df_50000_large_mod, df_50000_large_test, type = 'response')
head(df_50000_large_mod_pred1)
##         3         4        15        19        21        28 
## 0.5043303 0.5330945 0.4523842 0.4962811 0.4600560 0.4679573
df_50000_large_mod_pred2 <- ifelse(df_50000_large_mod_pred1 >= 0.5, 1, 0)
head(df_50000_large_mod_pred2)
##  3  4 15 19 21 28 
##  1  1  0  0  0  0
df_50000_large_mod_pred1_table <- table(df_50000_large_test$Sales.Channel, df_50000_large_mod_pred2)
df_50000_large_mod_pred1_table
##          df_50000_large_mod_pred2
##              0    1
##   Offline 3050 3209
##   Online  3158 3083
sum(diag(df_50000_large_mod_pred1_table)) / nrow(df_50000_large_test)
## [1] 0.49064
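
The confusion matrix above can be unpacked a little further. The sketch below re-enters the four counts by hand (the object name cm and the derived variable names are only for illustration) and computes overall accuracy, the share of actual Online orders predicted as Online, the share of actual Offline orders predicted as Offline, and a majority-class baseline for comparison.

```r
# Counts copied from the confusion matrix above
# (rows = actual channel, columns = predicted class; 1 = Online)
cm <- matrix(c(3050, 3158, 3209, 3083), nrow = 2,
             dimnames = list(actual = c("Offline", "Online"),
                             predicted = c("0", "1")))

# Overall accuracy: correct predictions over all predictions
accuracy <- sum(diag(cm)) / sum(cm)

# Sensitivity: actual Online orders that were predicted as Online
sensitivity <- cm["Online", "1"] / sum(cm["Online", ])

# Specificity: actual Offline orders that were predicted as Offline
specificity <- cm["Offline", "0"] / sum(cm["Offline", ])

# Baseline: accuracy from always predicting the more common actual class
baseline <- max(rowSums(cm)) / sum(cm)

round(c(accuracy = accuracy, sensitivity = sensitivity,
        specificity = specificity, baseline = baseline), 4)
```

Always predicting Offline would be right for 6,259 of the 12,500 test records (about 50.1 percent), so the fitted model does not beat this simple baseline.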

Improving the model

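I tried to improve the model further but ran into multicollinearity that I could not fully trace. Unit price and unit cost, as well as total revenue and total cost, were correlated, but dropping Unit.Price, Unit.Cost, and Total.Revenue did not resolve the issue: the refitted model below still reports six coefficients as not defined because of singularities. A likely remaining cause is that every Country belongs to exactly one Region, so the Country dummy variables already determine Region.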

library(stats)
library(corrplot)
## corrplot 0.92 loaded
df_50000_large_train %>%
  keep(is.numeric) %>%
  cor() %>%
  corrplot()
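
A correlation plot only covers numeric columns, so it cannot reveal collinearity among factor terms. As a hedged illustration, the sketch below builds a small hypothetical data frame (toy, with made-up region and country labels, not the sales data) in which each country sits in exactly one region, and lists the coefficients that glm() reports as NA because they are aliased.

```r
# Hypothetical toy data: each country belongs to exactly one region (nested factors)
toy <- data.frame(
  region  = factor(c("A", "A", "A", "A", "B", "B", "B", "B")),
  country = factor(c("a1", "a1", "a2", "a2", "b1", "b1", "b2", "b2")),
  y       = c(0, 1, 0, 1, 1, 0, 1, 0)
)

# With both factors in the model, the region dummy is the sum of its country dummies
fit_both <- glm(y ~ region + country, data = toy, family = binomial)

# Aliased (not estimable) coefficients come back as NA
aliased <- names(coef(fit_both))[is.na(coef(fit_both))]
aliased

# Dropping the redundant factor removes the singularity
fit_country <- glm(y ~ country, data = toy, family = binomial)
anyNA(coef(fit_country))
```

The same check, names(coef(model))[is.na(coef(model))], can be run on any fitted glm object to see exactly which terms were dropped.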

library(car)
## Loading required package: carData
## 
## Attaching package: 'car'
## The following object is masked from 'package:dplyr':
## 
##     recode
## The following object is masked from 'package:purrr':
## 
##     some
df_50000_large_mod1 <- glm(newcode ~ Region + Country + Item.Type + Order.Priority + Units.Sold + Total.Cost + Total.Profit, data = df_50000_large_train, family = binomial)
summary(df_50000_large_mod1)
## 
## Call:
## glm(formula = newcode ~ Region + Country + Item.Type + Order.Priority + 
##     Units.Sold + Total.Cost + Total.Profit, family = binomial, 
##     data = df_50000_large_train)
## 
## Coefficients: (6 not defined because of singularities)
##                                           Estimate Std. Error z value Pr(>|z|)
## (Intercept)                              2.374e-02  1.464e-01   0.162  0.87118
## RegionAustralia and Oceania              1.535e-02  2.013e-01   0.076  0.93920
## RegionCentral America and the Caribbean -7.638e-02  1.892e-01  -0.404  0.68647
## RegionEurope                             1.417e-01  1.984e-01   0.715  0.47487
## RegionMiddle East and North Africa       7.617e-02  2.010e-01   0.379  0.70478
## RegionNorth America                     -9.866e-02  1.968e-01  -0.501  0.61613
## RegionSub-Saharan Africa                -1.229e-01  2.028e-01  -0.606  0.54449
## CountryAlbania                          -2.136e-01  1.982e-01  -1.078  0.28122
## CountryAlgeria                          -2.131e-01  2.018e-01  -1.056  0.29088
## CountryAndorra                           1.789e-02  1.996e-01   0.090  0.92858
## CountryAngola                            1.572e-01  2.011e-01   0.782  0.43421
## CountryAntigua and Barbuda              -3.808e-02  1.859e-01  -0.205  0.83766
## CountryArmenia                           1.403e-02  2.063e-01   0.068  0.94578
## CountryAustralia                         8.874e-03  1.997e-01   0.044  0.96455
## CountryAustria                          -5.898e-02  1.968e-01  -0.300  0.76443
## CountryAzerbaijan                        2.470e-01  2.021e-01   1.222  0.22155
## CountryBahrain                          -1.895e-03  1.988e-01  -0.010  0.99240
## CountryBangladesh                        8.350e-02  1.943e-01   0.430  0.66732
## CountryBarbados                          1.415e-02  1.918e-01   0.074  0.94119
## CountryBelarus                          -3.705e-01  2.040e-01  -1.817  0.06929
## CountryBelgium                          -1.249e-01  1.970e-01  -0.634  0.52600
## CountryBelize                            4.445e-02  1.927e-01   0.231  0.81755
## CountryBenin                             2.852e-01  2.061e-01   1.384  0.16646
## CountryBhutan                            3.837e-02  2.024e-01   0.190  0.84962
## CountryBosnia and Herzegovina           -4.190e-01  2.048e-01  -2.045  0.04081
## CountryBotswana                          8.647e-03  2.030e-01   0.043  0.96603
## CountryBrunei                            2.173e-01  2.018e-01   1.077  0.28156
## CountryBulgaria                         -2.880e-01  1.998e-01  -1.441  0.14949
## CountryBurkina Faso                      1.669e-01  2.041e-01   0.818  0.41329
## CountryBurundi                           5.060e-02  2.046e-01   0.247  0.80466
## CountryCambodia                         -2.868e-02  1.979e-01  -0.145  0.88478
## CountryCameroon                         -2.309e-02  2.055e-01  -0.112  0.91055
## CountryCanada                            1.184e-01  1.998e-01   0.592  0.55358
## CountryCape Verde                       -9.923e-02  1.989e-01  -0.499  0.61788
## CountryCentral African Republic          5.488e-01  2.023e-01   2.713  0.00667
## CountryChad                              1.067e-01  2.033e-01   0.525  0.59971
## CountryChina                            -6.176e-02  1.922e-01  -0.321  0.74793
## CountryComoros                           2.778e-02  2.089e-01   0.133  0.89423
## CountryCosta Rica                        6.503e-02  1.968e-01   0.330  0.74108
## CountryCote d'Ivoire                     3.310e-01  2.108e-01   1.570  0.11645
## CountryCroatia                          -1.324e-01  2.006e-01  -0.660  0.50912
## CountryCuba                              2.854e-01  1.929e-01   1.479  0.13901
## CountryCyprus                           -5.920e-02  1.968e-01  -0.301  0.76360
## CountryCzech Republic                   -1.109e-01  2.057e-01  -0.539  0.58960
## CountryDemocratic Republic of the Congo  2.654e-01  1.980e-01   1.340  0.18012
## CountryDenmark                           7.771e-02  1.978e-01   0.393  0.69439
## CountryDjibouti                         -6.464e-02  2.015e-01  -0.321  0.74833
## CountryDominica                          1.515e-01  1.944e-01   0.779  0.43578
## CountryDominican Republic                3.748e-02  1.868e-01   0.201  0.84098
## CountryEast Timor                        3.410e-02  2.015e-01   0.169  0.86564
## CountryEgypt                             6.554e-02  2.001e-01   0.327  0.74330
## CountryEl Salvador                       9.484e-02  1.958e-01   0.484  0.62818
## CountryEquatorial Guinea                -6.020e-02  2.047e-01  -0.294  0.76869
## CountryEritrea                           2.777e-01  2.064e-01   1.346  0.17833
## CountryEstonia                          -1.138e-01  1.996e-01  -0.570  0.56853
## CountryEthiopia                          2.783e-01  2.058e-01   1.352  0.17633
## CountryFederated States of Micronesia    6.232e-02  2.062e-01   0.302  0.76246
## CountryFiji                              1.460e-01  2.046e-01   0.714  0.47539
## CountryFinland                          -1.780e-01  1.924e-01  -0.925  0.35488
## CountryFrance                            2.870e-02  1.927e-01   0.149  0.88161
## CountryGabon                             2.643e-01  1.992e-01   1.327  0.18455
## CountryGeorgia                          -1.176e-01  2.024e-01  -0.581  0.56135
## CountryGermany                          -1.508e-01  2.001e-01  -0.754  0.45112
## CountryGhana                            -5.567e-02  2.059e-01  -0.270  0.78689
## CountryGreece                           -7.896e-02  2.025e-01  -0.390  0.69661
## CountryGreenland                        -6.347e-03  1.957e-01  -0.032  0.97413
## CountryGrenada                           2.254e-01  1.851e-01   1.218  0.22325
## CountryGuatemala                        -7.898e-02  1.905e-01  -0.415  0.67849
## CountryGuinea                            4.396e-01  1.979e-01   2.221  0.02635
## CountryGuinea-Bissau                     9.889e-02  2.069e-01   0.478  0.63258
## CountryHaiti                             1.128e-01  1.981e-01   0.570  0.56899
## CountryHonduras                         -5.709e-03  1.876e-01  -0.030  0.97572
## CountryHungary                          -6.010e-02  2.020e-01  -0.298  0.76605
## CountryIceland                          -3.084e-01  2.023e-01  -1.525  0.12733
## CountryIndia                             1.160e-01  2.062e-01   0.563  0.57358
## CountryIndonesia                         5.622e-02  1.977e-01   0.284  0.77610
## CountryIran                              3.026e-02  2.075e-01   0.146  0.88403
## CountryIraq                             -2.236e-01  1.978e-01  -1.131  0.25822
## CountryIreland                          -6.117e-02  1.964e-01  -0.312  0.75542
## CountryIsrael                           -2.567e-01  2.063e-01  -1.244  0.21354
## CountryItaly                             3.288e-02  1.962e-01   0.168  0.86694
## CountryJamaica                           1.057e-01  1.915e-01   0.552  0.58110
## CountryJapan                            -1.562e-01  2.051e-01  -0.762  0.44634
## CountryJordan                            1.635e-01  2.075e-01   0.788  0.43059
## CountryKazakhstan                       -3.814e-02  2.036e-01  -0.187  0.85140
## CountryKenya                             7.189e-02  2.036e-01   0.353  0.72401
## CountryKiribati                         -1.466e-01  2.020e-01  -0.726  0.46793
## CountryKosovo                           -2.232e-01  2.020e-01  -1.105  0.26919
## CountryKuwait                            2.410e-02  1.996e-01   0.121  0.90388
## CountryKyrgyzstan                        3.586e-01  2.010e-01   1.784  0.07446
## CountryLaos                             -1.179e-01  2.017e-01  -0.585  0.55869
## CountryLatvia                           -1.653e-01  1.970e-01  -0.839  0.40122
## CountryLebanon                          -9.530e-02  2.025e-01  -0.471  0.63795
## CountryLesotho                           3.547e-01  2.025e-01   1.751  0.07992
## CountryLiberia                           1.925e-01  2.032e-01   0.948  0.34330
## CountryLibya                            -5.811e-02  2.006e-01  -0.290  0.77201
## CountryLiechtenstein                    -3.346e-01  2.029e-01  -1.649  0.09919
## CountryLithuania                        -2.707e-01  1.963e-01  -1.379  0.16790
## CountryLuxembourg                       -2.015e-01  1.989e-01  -1.013  0.31093
## CountryMacedonia                        -3.071e-01  1.960e-01  -1.567  0.11708
## CountryMadagascar                        2.436e-01  2.028e-01   1.201  0.22973
## CountryMalawi                            1.040e-01  2.097e-01   0.496  0.62000
## CountryMalaysia                          8.798e-03  1.960e-01   0.045  0.96419
## CountryMaldives                         -2.596e-02  1.925e-01  -0.135  0.89272
## CountryMali                              1.755e-02  2.037e-01   0.086  0.93133
## CountryMalta                            -9.153e-02  1.936e-01  -0.473  0.63638
## CountryMarshall Islands                  2.568e-01  2.057e-01   1.249  0.21177
## CountryMauritania                       -1.121e-01  2.010e-01  -0.558  0.57693
## CountryMauritius                         8.079e-02  2.024e-01   0.399  0.68979
## CountryMexico                           -3.733e-02  1.936e-01  -0.193  0.84712
## CountryMoldova                           3.828e-02  2.002e-01   0.191  0.84838
## CountryMonaco                            3.742e-02  2.086e-01   0.179  0.85762
## CountryMongolia                         -1.339e-01  1.991e-01  -0.673  0.50120
## CountryMontenegro                       -1.794e-02  2.000e-01  -0.090  0.92855
## CountryMorocco                           1.341e-01  2.051e-01   0.654  0.51316
## CountryMozambique                       -2.576e-02  2.018e-01  -0.128  0.89841
## CountryMyanmar                          -1.139e-01  1.990e-01  -0.572  0.56700
## CountryNamibia                           4.303e-01  1.991e-01   2.161  0.03067
## CountryNauru                             1.113e-02  2.025e-01   0.055  0.95618
## CountryNepal                             2.864e-01  2.055e-01   1.393  0.16354
## CountryNetherlands                      -2.393e-01  1.943e-01  -1.232  0.21803
## CountryNew Zealand                      -1.524e-01  2.053e-01  -0.742  0.45789
## CountryNicaragua                        -7.772e-02  1.895e-01  -0.410  0.68171
## CountryNiger                             7.651e-02  2.069e-01   0.370  0.71148
## CountryNigeria                           1.623e-01  2.024e-01   0.802  0.42262
## CountryNorth Korea                      -5.370e-02  1.989e-01  -0.270  0.78716
## CountryNorway                           -1.441e-01  1.965e-01  -0.734  0.46322
## CountryOman                             -3.809e-02  2.040e-01  -0.187  0.85191
## CountryPakistan                         -7.631e-02  2.079e-01  -0.367  0.71360
## CountryPalau                            -2.007e-02  2.050e-01  -0.098  0.92204
## CountryPanama                           -2.233e-01  1.884e-01  -1.185  0.23583
## CountryPapua New Guinea                  8.896e-02  2.060e-01   0.432  0.66582
## CountryPhilippines                       1.891e-01  2.006e-01   0.942  0.34597
## CountryPoland                            1.445e-01  2.038e-01   0.709  0.47829
## CountryPortugal                         -2.381e-01  2.034e-01  -1.171  0.24171
## CountryQatar                            -1.886e-02  2.001e-01  -0.094  0.92494
## CountryRepublic of the Congo             1.046e-01  2.074e-01   0.505  0.61388
## CountryRomania                          -2.981e-01  2.002e-01  -1.489  0.13648
## CountryRussia                           -7.115e-02  1.986e-01  -0.358  0.72020
## CountryRwanda                            3.024e-01  2.037e-01   1.484  0.13775
## CountrySaint Kitts and Nevis             1.877e-01  1.861e-01   1.009  0.31314
## CountrySaint Lucia                       1.496e-02  1.857e-01   0.081  0.93581
## CountrySaint Vincent and the Grenadines -8.930e-02  1.922e-01  -0.465  0.64211
## CountrySamoa                            -1.442e-02  1.934e-01  -0.075  0.94058
## CountrySan Marino                       -2.398e-01  1.975e-01  -1.214  0.22484
## CountrySao Tome and Principe             1.649e-02  1.987e-01   0.083  0.93386
## CountrySaudi Arabia                      7.664e-02  1.958e-01   0.391  0.69552
## CountrySenegal                           1.306e-01  2.026e-01   0.644  0.51934
## CountrySerbia                           -5.147e-01  2.013e-01  -2.557  0.01056
## CountrySeychelles                        3.801e-02  2.009e-01   0.189  0.84996
## CountrySierra Leone                      1.626e-01  2.024e-01   0.803  0.42175
## CountrySingapore                         5.478e-02  1.924e-01   0.285  0.77580
## CountrySlovakia                         -9.291e-02  1.991e-01  -0.467  0.64078
## CountrySlovenia                         -2.897e-01  2.027e-01  -1.429  0.15302
## CountrySolomon Islands                   7.083e-02  2.025e-01   0.350  0.72655
## CountrySomalia                           3.343e-02  1.998e-01   0.167  0.86713
## CountrySouth Africa                     -9.526e-02  2.012e-01  -0.473  0.63592
## CountrySouth Korea                       1.429e-01  1.967e-01   0.726  0.46761
## CountrySouth Sudan                      -3.754e-02  2.011e-01  -0.187  0.85195
## CountrySpain                            -3.600e-01  2.028e-01  -1.776  0.07581
## CountrySri Lanka                         4.060e-02  1.962e-01   0.207  0.83609
## CountrySudan                             9.619e-02  2.000e-01   0.481  0.63054
## CountrySwaziland                         5.879e-02  2.086e-01   0.282  0.77804
## CountrySweden                           -9.805e-02  2.041e-01  -0.480  0.63100
## CountrySwitzerland                      -2.209e-01  2.034e-01  -1.086  0.27735
## CountrySyria                            -1.902e-01  2.075e-01  -0.917  0.35921
## CountryTaiwan                           -1.159e-01  1.959e-01  -0.592  0.55391
## CountryTajikistan                        8.166e-02  1.977e-01   0.413  0.67957
## CountryTanzania                         -2.997e-01  2.100e-01  -1.427  0.15351
## CountryThailand                          7.130e-02  1.932e-01   0.369  0.71210
## CountryThe Bahamas                       3.842e-01  1.899e-01   2.023  0.04308
## CountryThe Gambia                       -8.433e-02  2.015e-01  -0.418  0.67560
## CountryTogo                              1.355e-01  2.082e-01   0.651  0.51524
## CountryTonga                            -1.714e-01  2.043e-01  -0.839  0.40142
## CountryTrinidad and Tobago                      NA         NA      NA       NA
## CountryTunisia                          -1.947e-01  2.012e-01  -0.968  0.33321
## CountryTurkey                           -2.739e-01  2.100e-01  -1.305  0.19197
## CountryTurkmenistan                     -9.191e-02  1.980e-01  -0.464  0.64247
## CountryTuvalu                            4.688e-02  2.023e-01   0.232  0.81672
## CountryUganda                            2.067e-01  2.018e-01   1.024  0.30568
## CountryUkraine                          -3.154e-01  1.958e-01  -1.611  0.10722
## CountryUnited Arab Emirates             -7.991e-02  1.984e-01  -0.403  0.68706
## CountryUnited Kingdom                   -1.020e-01  2.009e-01  -0.508  0.61142
## CountryUnited States of America                 NA         NA      NA       NA
## CountryUzbekistan                        5.642e-02  1.953e-01   0.289  0.77271
## CountryVanuatu                                  NA         NA      NA       NA
## CountryVatican City                             NA         NA      NA       NA
## CountryVietnam                                  NA         NA      NA       NA
## CountryYemen                             2.538e-01  2.126e-01   1.194  0.23257
## CountryZambia                            9.347e-02  2.051e-01   0.456  0.64853
## CountryZimbabwe                                 NA         NA      NA       NA
## Item.TypeBeverages                      -3.091e-02  5.893e-02  -0.524  0.60000
## Item.TypeCereal                         -5.367e-02  5.149e-02  -1.042  0.29726
## Item.TypeClothes                        -8.740e-02  5.289e-02  -1.652  0.09848
## Item.TypeCosmetics                      -9.111e-02  5.917e-02  -1.540  0.12364
## Item.TypeFruits                         -5.489e-02  6.107e-02  -0.899  0.36882
## Item.TypeHousehold                      -4.888e-02  6.193e-02  -0.789  0.42990
## Item.TypeMeat                           -8.891e-02  6.876e-02  -1.293  0.19595
## Item.TypeOffice Supplies                -2.834e-02  6.752e-02  -0.420  0.67468
## Item.TypePersonal Care                  -4.829e-02  5.755e-02  -0.839  0.40144
## Item.TypeSnacks                         -1.933e-02  5.331e-02  -0.363  0.71691
## Item.TypeVegetables                     -7.770e-03  5.247e-02  -0.148  0.88227
## Order.PriorityH                          1.668e-02  2.938e-02   0.568  0.57031
## Order.PriorityL                          4.043e-05  2.933e-02   0.001  0.99890
## Order.PriorityM                         -3.654e-02  2.934e-02  -1.245  0.21303
## Units.Sold                               4.020e-06  6.451e-06   0.623  0.53312
## Total.Cost                              -1.228e-08  2.969e-08  -0.414  0.67917
## Total.Profit                             5.080e-08  9.919e-08   0.512  0.60854
##                                           
## (Intercept)                               
## RegionAustralia and Oceania               
## RegionCentral America and the Caribbean   
## RegionEurope                              
## RegionMiddle East and North Africa        
## RegionNorth America                       
## RegionSub-Saharan Africa                  
## CountryAlbania                            
## CountryAlgeria                            
## CountryAndorra                            
## CountryAngola                             
## CountryAntigua and Barbuda                
## CountryArmenia                            
## CountryAustralia                          
## CountryAustria                            
## CountryAzerbaijan                         
## CountryBahrain                            
## CountryBangladesh                         
## CountryBarbados                           
## CountryBelarus                          . 
## CountryBelgium                            
## CountryBelize                             
## CountryBenin                              
## CountryBhutan                             
## CountryBosnia and Herzegovina           * 
## CountryBotswana                           
## CountryBrunei                             
## CountryBulgaria                           
## CountryBurkina Faso                       
## CountryBurundi                            
## CountryCambodia                           
## CountryCameroon                           
## CountryCanada                             
## CountryCape Verde                         
## CountryCentral African Republic         **
## CountryChad                               
## CountryChina                              
## CountryComoros                            
## CountryCosta Rica                         
## CountryCote d'Ivoire                      
## CountryCroatia                            
## CountryCuba                               
## CountryCyprus                             
## CountryCzech Republic                     
## CountryDemocratic Republic of the Congo   
## CountryDenmark                            
## CountryDjibouti                           
## CountryDominica                           
## CountryDominican Republic                 
## CountryEast Timor                         
## CountryEgypt                              
## CountryEl Salvador                        
## CountryEquatorial Guinea                  
## CountryEritrea                            
## CountryEstonia                            
## CountryEthiopia                           
## CountryFederated States of Micronesia     
## CountryFiji                               
## CountryFinland                            
## CountryFrance                             
## CountryGabon                              
## CountryGeorgia                            
## CountryGermany                            
## CountryGhana                              
## CountryGreece                             
## CountryGreenland                          
## CountryGrenada                            
## CountryGuatemala                          
## CountryGuinea                           * 
## CountryGuinea-Bissau                      
## CountryHaiti                              
## CountryHonduras                           
## CountryHungary                            
## CountryIceland                            
## CountryIndia                              
## CountryIndonesia                          
## CountryIran                               
## CountryIraq                               
## CountryIreland                            
## CountryIsrael                             
## CountryItaly                              
## CountryJamaica                            
## CountryJapan                              
## CountryJordan                             
## CountryKazakhstan                         
## CountryKenya                              
## CountryKiribati                           
## CountryKosovo                             
## CountryKuwait                             
## CountryKyrgyzstan                       . 
## CountryLaos                               
## CountryLatvia                             
## CountryLebanon                            
## CountryLesotho                          . 
## CountryLiberia                            
## CountryLibya                              
## CountryLiechtenstein                    . 
## CountryLithuania                          
## CountryLuxembourg                         
## CountryMacedonia                          
## CountryMadagascar                         
## CountryMalawi                             
## CountryMalaysia                           
## CountryMaldives                           
## CountryMali                               
## CountryMalta                              
## CountryMarshall Islands                   
## CountryMauritania                         
## CountryMauritius                          
## CountryMexico                             
## CountryMoldova                            
## CountryMonaco                             
## CountryMongolia                           
## CountryMontenegro                         
## CountryMorocco                            
## CountryMozambique                         
## CountryMyanmar                            
## CountryNamibia                          * 
## CountryNauru                              
## CountryNepal                              
## CountryNetherlands                        
## CountryNew Zealand                        
## CountryNicaragua                          
## CountryNiger                              
## CountryNigeria                            
## CountryNorth Korea                        
## CountryNorway                             
## CountryOman                               
## CountryPakistan                           
## CountryPalau                              
## CountryPanama                             
## CountryPapua New Guinea                   
## CountryPhilippines                        
## CountryPoland                             
## CountryPortugal                           
## CountryQatar                              
## CountryRepublic of the Congo              
## CountryRomania                            
## CountryRussia                             
## CountryRwanda                             
## CountrySaint Kitts and Nevis              
## CountrySaint Lucia                        
## CountrySaint Vincent and the Grenadines   
## CountrySamoa                              
## CountrySan Marino                         
## CountrySao Tome and Principe              
## CountrySaudi Arabia                       
## CountrySenegal                            
## CountrySerbia                           * 
## CountrySeychelles                         
## CountrySierra Leone                       
## CountrySingapore                          
## CountrySlovakia                           
## CountrySlovenia                           
## CountrySolomon Islands                    
## CountrySomalia                            
## CountrySouth Africa                       
## CountrySouth Korea                        
## CountrySouth Sudan                        
## CountrySpain                            . 
## CountrySri Lanka                          
## CountrySudan                              
## CountrySwaziland                          
## CountrySweden                             
## CountrySwitzerland                        
## CountrySyria                              
## CountryTaiwan                             
## CountryTajikistan                         
## CountryTanzania                           
## CountryThailand                           
## CountryThe Bahamas                      * 
## CountryThe Gambia                         
## CountryTogo                               
## CountryTonga                              
## CountryTrinidad and Tobago                
## CountryTunisia                            
## CountryTurkey                             
## CountryTurkmenistan                       
## CountryTuvalu                             
## CountryUganda                             
## CountryUkraine                            
## CountryUnited Arab Emirates               
## CountryUnited Kingdom                     
## CountryUnited States of America           
## CountryUzbekistan                         
## CountryVanuatu                            
## CountryVatican City                       
## CountryVietnam                            
## CountryYemen                              
## CountryZambia                             
## CountryZimbabwe                           
## Item.TypeBeverages                        
## Item.TypeCereal                           
## Item.TypeClothes                        . 
## Item.TypeCosmetics                        
## Item.TypeFruits                           
## Item.TypeHousehold                        
## Item.TypeMeat                             
## Item.TypeOffice Supplies                  
## Item.TypePersonal Care                    
## Item.TypeSnacks                           
## Item.TypeVegetables                       
## Order.PriorityH                           
## Order.PriorityL                           
## Order.PriorityM                           
## Units.Sold                                
## Total.Cost                                
## Total.Profit                              
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 51986  on 37499  degrees of freedom
## Residual deviance: 51778  on 37298  degrees of freedom
## AIC: 52182
## 
## Number of Fisher Scoring iterations: 3
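The null and residual deviances reported above can be turned into an overall likelihood-ratio test and a deviance-based pseudo R-squared. The sketch below is an illustration only: it copies the four numbers from the summary output by hand, and the variable names (`null_dev`, `resid_dev`, `lr_stat`, `pseudo_r2`) are hypothetical and not part of the original analysis. For ungrouped binary data the deviance-based ratio coincides with McFadden's pseudo R-squared.

```r
# Deviance figures copied by hand from the summary output above
null_dev  <- 51986; null_df  <- 37499
resid_dev <- 51778; resid_df <- 37298

# Likelihood-ratio (deviance) statistic and its degrees of freedom
lr_stat <- null_dev - resid_dev   # drop in deviance from adding all predictors
lr_df   <- null_df - resid_df     # number of estimated (non-aliased) coefficients

# Upper-tail chi-squared p-value for the test against the intercept-only model
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# Deviance-based pseudo R-squared (share of null deviance explained)
pseudo_r2 <- 1 - resid_dev / null_dev

print(c(lr_stat = lr_stat, lr_df = lr_df, p_value = p_value, pseudo_r2 = pseudo_r2))
```

The statistic works out to 208 on 201 degrees of freedom. A p-value above the usual 0.05 threshold together with a pseudo R-squared close to zero would suggest that, taken together, the predictors explain little beyond an intercept-only model, which is consistent with how few individual coefficients carry significance stars.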
#vif(df_50000_large_mod1)
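The `NA` rows in the coefficient table (for example `CountryZimbabwe` and `CountryVietnam`) are what R reports for aliased coefficients, that is, columns of the design matrix that are exact linear combinations of others. One plausible cause here, stated as an assumption, is that every country belongs to exactly one region, so `Country` fully determines `Region`. `car::vif()` stops with an error when a model contains aliased coefficients, which may be why the call above is commented out. The sketch below reproduces the effect on hypothetical toy data (`toy`, `fit_both`, and `fit_country` are illustrative names, not objects from the original analysis) and uses base R only.

```r
set.seed(1)

# Hypothetical toy data: each country sits in exactly one region,
# so Country fully determines Region (assumed to mirror the sales data).
toy <- data.frame(
  Country = factor(rep(c("A", "B", "C", "D"), each = 25)),
  Region  = factor(rep(c("R1", "R1", "R2", "R2"), each = 25)),
  y       = rbinom(100, size = 1, prob = 0.5)
)

# Fitting both nested factors makes the design matrix rank deficient
fit_both <- glm(y ~ Region + Country, data = toy, family = binomial)

# Coefficients that could not be estimated are reported as NA
aliased <- names(coef(fit_both))[is.na(coef(fit_both))]
print(aliased)

# Dropping the redundant Region term leaves a full-rank model
fit_country <- glm(y ~ Country, data = toy, family = binomial)
print(anyNA(coef(fit_country)))
```

On the real model, the same idea would mean refitting with either `Region` or `Country` removed before requesting variance inflation factors. Which of the two to keep is a modelling choice that this sketch does not settle.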