Assignment

Visit the following website and explore the range of sizes of this dataset (from 100 to 5 million records).

https://eforexcel.com/wp/downloads-18-sample-csv-files-data-sets-for-testing-sales/

Based on your computer’s capabilities (memory, CPU), select 2 files you can handle (recommended: one small, one large).

Review the structure and content of the tables, and think about which two machine learning algorithms presented so far could be used to analyze the data, and how they can be applied in the suggested environment of the datasets.

Write a short essay explaining your selection. Then, select one of the 2 algorithms and explore how to analyze and predict an outcome based on the data available. This will be an exploratory exercise, so feel free to show errors and warnings that arise during the analysis. Test the code with both datasets selected and compare the results. Which result will you trust if you need to make a business decision? Do you think an analysis could be prone to errors when using too much data, or when using the least amount possible?

Develop your exploratory analysis of the data and the essay in the following 2 weeks. You’ll have until March 17 to submit both.

Choosing two files

I have chosen the smallest file (100 sales records) and the 10000-record file.

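# load the packages used below: dplyr (%>%, mutate, select), dummies (dummy.data.frame), class (knn)
library(dplyr)
library(dummies)
library(class)
# define the filename-manual procedure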
filename1 <- "C:/Users/Lisa/OneDrive/CUNY/622/HW1/Sales100.csv"
# load the CSV file from the local directory
dataset100 <- read.csv(filename1, header=TRUE)
dataset100$Region<-as.factor(dataset100$Region)
dataset100$Country<-as.factor(dataset100$Country)
dataset100$Item.Type<-as.factor(dataset100$Item.Type)
dataset100$Sales.Channel<-as.factor(dataset100$Sales.Channel)
dataset100$Order.Priority<-as.factor(dataset100$Order.Priority)
dataset100$Ship.Date <- as.Date(dataset100$Ship.Date, "%m/%d/%Y")
dataset100$Order.Date <- as.Date(dataset100$Order.Date, "%m/%d/%Y")

dataset100<-dataset100 %>% mutate(Days=Ship.Date-Order.Date, Order.Day=format(dataset100$Order.Date, format="%a"), Order.Month=format(dataset100$Order.Date, format="%b"),Order.Year=format(dataset100$Order.Date, format="%Y"))

dataset100$Days<-as.numeric(dataset100$Days)

dataset100$Order.Day<-as.factor(dataset100$Order.Day)
dataset100$Order.Month<-as.factor(dataset100$Order.Month)
dataset100$Order.Year<-as.numeric(dataset100$Order.Year)

dataset100<-dataset100 %>% select(Region,Country, Item.Type, Sales.Channel, Order.Priority, Units.Sold, Unit.Price, Unit.Cost, Total.Cost, Total.Profit, Total.Revenue, Days, Order.Day, Order.Month, Order.Year)

dataset100_2<-dataset100

dim(dataset100)
## [1] 100  15
str(dataset100)
## 'data.frame':    100 obs. of  15 variables:
##  $ Region        : Factor w/ 7 levels "Asia","Australia and Oceania",..: 2 3 4 7 7 2 7 7 7 7 ...
##  $ Country       : Factor w/ 76 levels "Albania","Angola",..: 74 23 56 60 57 66 2 10 54 62 ...
##  $ Item.Type     : Factor w/ 12 levels "Baby Food","Beverages",..: 1 3 9 6 9 1 7 12 10 3 ...
##  $ Sales.Channel : Factor w/ 2 levels "Offline","Online": 1 2 1 2 1 2 1 2 1 2 ...
##  $ Order.Priority: Factor w/ 4 levels "C","H","L","M": 2 1 3 1 3 1 4 2 4 2 ...
##  $ Units.Sold    : int  9925 2804 1779 8102 5062 2974 4187 8082 6070 6593 ...
##  $ Unit.Price    : num  255.28 205.7 651.21 9.33 651.21 ...
##  $ Unit.Cost     : num  159.42 117.11 524.96 6.92 524.96 ...
##  $ Total.Cost    : num  1582244 328376 933904 56066 2657348 ...
##  $ Total.Profit  : num  951411 248406 224599 19526 639078 ...
##  $ Total.Revenue : num  2533654 576783 1158503 75592 3296425 ...
##  $ Days          : num  30 24 6 15 5 17 4 10 42 42 ...
##  $ Order.Day     : Factor w/ 7 levels "Fri","Mon","Sat",..: 1 7 1 1 1 7 3 6 6 1 ...
##  $ Order.Month   : Factor w/ 12 levels "Apr","Aug","Dec",..: 9 2 9 7 4 4 1 6 6 1 ...
##  $ Order.Year    : num  2010 2012 2014 2014 2013 ...
summary(dataset100)
##                                Region                    Country  
##  Asia                             :11   The Gambia           : 4  
##  Australia and Oceania            :11   Australia            : 3  
##  Central America and the Caribbean: 7   Djibouti             : 3  
##  Europe                           :22   Mexico               : 3  
##  Middle East and North Africa     :10   Sao Tome and Principe: 3  
##  North America                    : 3   Sierra Leone         : 3  
##  Sub-Saharan Africa               :36   (Other)              :81  
##            Item.Type  Sales.Channel Order.Priority   Units.Sold  
##  Clothes        :13   Offline:50    C:22           Min.   : 124  
##  Cosmetics      :13   Online :50    H:30           1st Qu.:2836  
##  Office Supplies:12                 L:27           Median :5382  
##  Fruits         :10                 M:21           Mean   :5129  
##  Personal Care  :10                                3rd Qu.:7369  
##  Household      : 9                                Max.   :9925  
##  (Other)        :33                                              
##    Unit.Price       Unit.Cost        Total.Cost       Total.Profit    
##  Min.   :  9.33   Min.   :  6.92   Min.   :   3612   Min.   :   1258  
##  1st Qu.: 81.73   1st Qu.: 35.84   1st Qu.: 168868   1st Qu.: 121444  
##  Median :179.88   Median :107.28   Median : 363566   Median : 290768  
##  Mean   :276.76   Mean   :191.05   Mean   : 931806   Mean   : 441682  
##  3rd Qu.:437.20   3rd Qu.:263.33   3rd Qu.:1613870   3rd Qu.: 635829  
##  Max.   :668.27   Max.   :524.96   Max.   :4509794   Max.   :1719922  
##                                                                       
##  Total.Revenue          Days       Order.Day  Order.Month   Order.Year  
##  Min.   :   4870   Min.   : 0.00   Fri:19    Feb    :13   Min.   :2010  
##  1st Qu.: 268721   1st Qu.: 9.75   Mon:14    Jul    :12   1st Qu.:2012  
##  Median : 752314   Median :23.50   Sat:17    May    :11   Median :2013  
##  Mean   :1373488   Mean   :23.36   Sun:11    Oct    :11   Mean   :2013  
##  3rd Qu.:2212045   3rd Qu.:36.25   Thu:10    Jun    :10   3rd Qu.:2015  
##  Max.   :5997055   Max.   :50.00   Tue:18    Apr    : 9   Max.   :2017  
##                                    Wed:11    (Other):34

Set up for the larger file

# define the filename-manual procedure
filename2 <- "C:/Users/Lisa/OneDrive/CUNY/622/HW1/Sales10000.csv"
# load the CSV file from the local directory
dataset10000 <- read.csv(filename2, header=TRUE)
dataset10000$Region<-as.factor(dataset10000$Region)
dataset10000$Country<-as.factor(dataset10000$Country)
dataset10000$Item.Type<-as.factor(dataset10000$Item.Type)
dataset10000$Sales.Channel<-as.factor(dataset10000$Sales.Channel)
dataset10000$Order.Priority<-as.factor(dataset10000$Order.Priority)
dataset10000$Ship.Date <- as.Date(dataset10000$Ship.Date, "%m/%d/%Y")
dataset10000$Order.Date <- as.Date(dataset10000$Order.Date, "%m/%d/%Y")

dataset10000<-dataset10000 %>% mutate(Days=Ship.Date-Order.Date, Order.Day=format(dataset10000$Order.Date, format="%a"), Order.Month=format(dataset10000$Order.Date, format="%b"),Order.Year=format(dataset10000$Order.Date, format="%Y"))

dataset10000$Days<-as.numeric(dataset10000$Days)

dataset10000$Order.Day<-as.factor(dataset10000$Order.Day)
dataset10000$Order.Month<-as.factor(dataset10000$Order.Month)
dataset10000$Order.Year<-as.numeric(dataset10000$Order.Year)

dataset10000<-dataset10000 %>% select(Region,Country, Item.Type, Sales.Channel, Order.Priority, Units.Sold, Unit.Price, Unit.Cost, Total.Cost, Total.Profit, Total.Revenue, Days, Order.Day, Order.Month, Order.Year)

dataset10000_2<-dataset10000

dim(dataset10000)
## [1] 10000    15
str(dataset10000)
## 'data.frame':    10000 obs. of  15 variables:
##  $ Region        : Factor w/ 7 levels "Asia","Australia and Oceania",..: 7 4 5 7 4 7 1 1 7 3 ...
##  $ Country       : Factor w/ 185 levels "Afghanistan",..: 30 86 123 39 38 151 85 31 48 65 ...
##  $ Item.Type     : Factor w/ 12 levels "Baby Food","Beverages",..: 9 2 12 7 2 2 12 1 8 9 ...
##  $ Sales.Channel : Factor w/ 2 levels "Offline","Online": 2 2 1 2 2 1 2 2 2 2 ...
##  $ Order.Priority: Factor w/ 4 levels "C","H","L","M": 3 1 1 1 1 2 3 1 3 1 ...
##  $ Units.Sold    : int  4484 1075 6515 7683 3491 9880 4825 3330 2431 6197 ...
##  $ Unit.Price    : num  651.2 47.5 154.1 668.3 47.5 ...
##  $ Unit.Cost     : num  525 31.8 90.9 502.5 31.8 ...
##  $ Total.Cost    : num  2353921 34174 592409 3861015 110979 ...
##  $ Total.Profit  : num  566105 16835 411292 1273304 54669 ...
##  $ Total.Revenue : num  2920026 51009 1003701 5134318 165648 ...
##  $ Days          : num  16 26 19 25 39 42 28 32 50 16 ...
##  $ Order.Day     : Factor w/ 7 levels "Fri","Mon","Sat",..: 5 2 5 6 6 6 4 2 1 3 ...
##  $ Order.Month   : Factor w/ 12 levels "Apr","Aug","Dec",..: 5 3 5 12 11 6 4 1 10 6 ...
##  $ Order.Year    : num  2011 2015 2011 2012 2015 ...
summary(dataset10000)
##                                Region               Country    
##  Asia                             :1469   Lithuania     :  72  
##  Australia and Oceania            : 797   United Kingdom:  72  
##  Central America and the Caribbean:1019   Moldova       :  71  
##  Europe                           :2633   Croatia       :  70  
##  Middle East and North Africa     :1264   Seychelles    :  70  
##  North America                    : 215   Botswana      :  69  
##  Sub-Saharan Africa               :2603   (Other)       :9576  
##            Item.Type    Sales.Channel  Order.Priority   Units.Sold   
##  Personal Care  : 888   Offline:4939   C:2555         Min.   :    2  
##  Household      : 875   Online :5061   H:2503         1st Qu.: 2531  
##  Clothes        : 872                  L:2494         Median : 4962  
##  Baby Food      : 842                  M:2448         Mean   : 5003  
##  Office Supplies: 837                                 3rd Qu.: 7472  
##  Vegetables     : 836                                 Max.   :10000  
##  (Other)        :4850                                                
##    Unit.Price       Unit.Cost        Total.Cost       Total.Profit      
##  Min.   :  9.33   Min.   :  6.92   Min.   :    125   Min.   :     43.4  
##  1st Qu.:109.28   1st Qu.: 56.67   1st Qu.: 164786   1st Qu.:  98329.1  
##  Median :205.70   Median :117.11   Median : 481606   Median : 289099.0  
##  Mean   :268.14   Mean   :188.81   Mean   : 938266   Mean   : 395089.3  
##  3rd Qu.:437.20   3rd Qu.:364.69   3rd Qu.:1183822   3rd Qu.: 566422.7  
##  Max.   :668.27   Max.   :524.96   Max.   :5241726   Max.   :1738178.4  
##                                                                         
##  Total.Revenue          Days       Order.Day   Order.Month     Order.Year  
##  Min.   :    168   Min.   : 0.00   Fri:1440   Jul    : 926   Min.   :2010  
##  1st Qu.: 288551   1st Qu.:12.00   Mon:1437   Mar    : 917   1st Qu.:2011  
##  Median : 800051   Median :25.00   Sat:1381   Jan    : 908   Median :2013  
##  Mean   :1333355   Mean   :25.06   Sun:1467   May    : 897   Mean   :2013  
##  3rd Qu.:1819143   3rd Qu.:37.00   Thu:1406   Jun    : 873   3rd Qu.:2015  
##  Max.   :6680027   Max.   :50.00   Tue:1422   Apr    : 850   Max.   :2017  
##                                    Wed:1447   (Other):4629
length(unique(dataset10000$Country))
## [1] 185
# Note there are 185 countries in the 10000 set.
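
The two files go through identical preparation steps, so the steps could be wrapped in one function and applied to each file. The following is only a sketch in base R: `prepare_sales` is a hypothetical helper that is not part of the analysis above, and the one-row `toy` data frame is made up to mirror the column names that `read.csv` produces.

```r
# Hypothetical helper (not in the original analysis): applies the same
# preparation steps to any of the sales files.
prepare_sales <- function(df) {
  factor_cols <- c("Region", "Country", "Item.Type",
                   "Sales.Channel", "Order.Priority")
  df[factor_cols] <- lapply(df[factor_cols], as.factor)

  df$Order.Date <- as.Date(df$Order.Date, "%m/%d/%Y")
  df$Ship.Date  <- as.Date(df$Ship.Date,  "%m/%d/%Y")

  # Derived variables: shipping delay in days and parts of the order date
  df$Days        <- as.numeric(df$Ship.Date - df$Order.Date)
  df$Order.Day   <- as.factor(format(df$Order.Date, "%a"))
  df$Order.Month <- as.factor(format(df$Order.Date, "%b"))
  df$Order.Year  <- as.numeric(format(df$Order.Date, "%Y"))

  # Drop the identifier and the raw dates
  df[, !(names(df) %in% c("Order.ID", "Order.Date", "Ship.Date"))]
}

# Made-up single row with the same column names as the CSV files
toy <- data.frame(
  Region = "Europe", Country = "France", Item.Type = "Fruits",
  Sales.Channel = "Online", Order.Priority = "H",
  Order.Date = "5/28/2010", Order.ID = 1L, Ship.Date = "6/27/2010",
  Units.Sold = 10L, Unit.Price = 9.33, Unit.Cost = 6.92,
  Total.Revenue = 93.30, Total.Cost = 69.20, Total.Profit = 24.10,
  stringsAsFactors = FALSE
)
toy_prepared <- prepare_sales(toy)
str(toy_prepared)
```

With such a helper, each file would be prepared with a single call, for example `prepare_sales(read.csv(filename1, header=TRUE))`, and the two datasets could not drift apart through a copy-and-paste slip.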

Overview

The goal is to predict Sales.Channel (Online or Offline) from the following predictors:

  • Region
  • Country
  • Item.Type
  • Order.Priority
  • Units.Sold
  • Unit.Price
  • Unit.Cost
  • Total.Cost
  • Total.Profit
  • Total.Revenue
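  • Days: derived from the order and ship dates (see Variable work)
  • Order.Day: derived from the order date
  • Order.Month: derived from the order date
  • Order.Year: derived from the order date

Order.ID, Order.Date and Ship.Date were not used as predictors.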

counts1 <- table(dataset100$Sales.Channel)
barplot(counts1, main="Sales Channel file 100 ",
   xlab="Sales channel")

counts2 <- table(dataset10000$Sales.Channel)
barplot(counts2, main="Sales Channel file 10000 ",
   xlab="Sales channel")
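
The bar plots show the class counts. Turning the counts into shares gives the accuracy of always guessing the larger class, which is the baseline any classifier has to beat. This short sketch types in the counts from the `summary()` output above by hand; the variable names are my own.

```r
# Class counts copied from the summary() output of each dataset
counts100   <- c(Offline = 50,   Online = 50)
counts10000 <- c(Offline = 4939, Online = 5061)

# prop.table() divides each count by the total
share100   <- prop.table(counts100)
share10000 <- prop.table(counts10000)

# Accuracy of always predicting the most frequent class
baseline100   <- max(share100)
baseline10000 <- max(share10000)

round(share10000, 4)
c(small = baseline100, large = baseline10000)
```

Both files are almost perfectly balanced, so a useful model has to do clearly better than about 50%.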

Variable work

Order.Date and Ship.Date were converted from character to Date format, and the following variables were created:

  • Days: Ship.Date - Order.Date, the number of days between order and shipment
  • Order.Day: Mon, Tue, …
  • Order.Month: Jan, Feb, …
  • Order.Year: 2010, 2011, …

The Order.Date, Ship.Date and Order.ID variables were then dropped.

KNN

Let’s predict Sales.Channel using KNN.

There are no missing values.

KNN relies on Euclidean distance, so the numeric features have to be normalized to a common scale. The categorical variables are converted to dummy variables, and the class labels are split out of the dataset.

normalize <-function (x) {
  return ((x-min(x))/(max(x) - min(x)))
}

dataset100<-dataset100 %>% 
  mutate (Units.Sold=normalize(Units.Sold)) %>%
  mutate (Unit.Price=normalize(Unit.Price)) %>%
  mutate (Unit.Cost=normalize(Unit.Cost)) %>%
  mutate (Total.Revenue=normalize(Total.Revenue)) %>%
  mutate (Total.Profit=normalize(Total.Profit)) %>%
  mutate (Total.Cost=normalize(Total.Cost)) %>%
  mutate (Days=normalize(Days)) %>%
  mutate (Order.Year=normalize(Order.Year))
  
summary(dataset100)
##                                Region                    Country  
##  Asia                             :11   The Gambia           : 4  
##  Australia and Oceania            :11   Australia            : 3  
##  Central America and the Caribbean: 7   Djibouti             : 3  
##  Europe                           :22   Mexico               : 3  
##  Middle East and North Africa     :10   Sao Tome and Principe: 3  
##  North America                    : 3   Sierra Leone         : 3  
##  Sub-Saharan Africa               :36   (Other)              :81  
##            Item.Type  Sales.Channel Order.Priority   Units.Sold    
##  Clothes        :13   Offline:50    C:22           Min.   :0.0000  
##  Cosmetics      :13   Online :50    H:30           1st Qu.:0.2767  
##  Office Supplies:12                 L:27           Median :0.5365  
##  Fruits         :10                 M:21           Mean   :0.5106  
##  Personal Care  :10                                3rd Qu.:0.7392  
##  Household      : 9                                Max.   :1.0000  
##  (Other)        :33                                                
##    Unit.Price       Unit.Cost         Total.Cost       Total.Profit    
##  Min.   :0.0000   Min.   :0.00000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.1099   1st Qu.:0.05583   1st Qu.:0.03667   1st Qu.:0.06993  
##  Median :0.2588   Median :0.19372   Median :0.07988   Median :0.16845  
##  Mean   :0.4059   Mean   :0.35543   Mean   :0.20598   Mean   :0.25626  
##  3rd Qu.:0.6493   3rd Qu.:0.49496   3rd Qu.:0.35734   3rd Qu.:0.36922  
##  Max.   :1.0000   Max.   :1.00000   Max.   :1.00000   Max.   :1.00000  
##                                                                        
##  Total.Revenue          Days        Order.Day  Order.Month   Order.Year    
##  Min.   :0.00000   Min.   :0.0000   Fri:19    Feb    :13   Min.   :0.0000  
##  1st Qu.:0.04403   1st Qu.:0.1950   Mon:14    Jul    :12   1st Qu.:0.2857  
##  Median :0.12474   Median :0.4700   Sat:17    May    :11   Median :0.4286  
##  Mean   :0.22840   Mean   :0.4672   Sun:11    Oct    :11   Mean   :0.4614  
##  3rd Qu.:0.36834   3rd Qu.:0.7250   Thu:10    Jun    :10   3rd Qu.:0.7143  
##  Max.   :1.00000   Max.   :1.0000   Tue:18    Apr    : 9   Max.   :1.0000  
##                                     Wed:11    (Other):34
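
To see why normalization matters for Euclidean distance, here is a small illustration. It uses the first three Days and Total.Revenue values shown in the `str(dataset100)` output (rounded as displayed); the `toy` data frame and the other variable names are only for this sketch.

```r
normalize <- function(x) (x - min(x)) / (max(x) - min(x))

# First three rows of two columns, as displayed by str(dataset100)
toy <- data.frame(
  Days          = c(30, 24, 6),
  Total.Revenue = c(2533654, 576783, 1158503)
)

# Distances on the raw scale versus distances using Total.Revenue alone
raw_dist     <- as.matrix(dist(toy))
revenue_only <- as.matrix(dist(toy["Total.Revenue"]))

# Largest difference between the two distance matrices
max(abs(raw_dist - revenue_only))

# After min-max scaling, both columns lie in [0, 1] and both contribute
scaled      <- as.data.frame(lapply(toy, normalize))
scaled_dist <- as.matrix(dist(scaled))
round(scaled_dist, 3)
```

On the raw scale the distances are practically identical to the differences in Total.Revenue, so Days would have no say in which neighbours are "nearest". After scaling, each numeric column can contribute at most 1 to the squared distance.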

Categorical variables dummy code

First, split off the dataset class labels:

dataset100_labels<-dataset100 %>% select(Sales.Channel)

dataset100<-dataset100 %>% select(-Sales.Channel)
colnames(dataset100)
##  [1] "Region"         "Country"        "Item.Type"      "Order.Priority"
##  [5] "Units.Sold"     "Unit.Price"     "Unit.Cost"      "Total.Cost"    
##  [9] "Total.Profit"   "Total.Revenue"  "Days"           "Order.Day"     
## [13] "Order.Month"    "Order.Year"

Creating dummy variables:

dataset100<-dummy.data.frame(data=dataset100, sep="_")
## Warning in model.matrix.default(~x - 1, model.frame(~x - 1), contrasts = FALSE):
## non-list contrasts argument ignored

## Warning in model.matrix.default(~x - 1, model.frame(~x - 1), contrasts = FALSE):
## non-list contrasts argument ignored

## Warning in model.matrix.default(~x - 1, model.frame(~x - 1), contrasts = FALSE):
## non-list contrasts argument ignored

## Warning in model.matrix.default(~x - 1, model.frame(~x - 1), contrasts = FALSE):
## non-list contrasts argument ignored

## Warning in model.matrix.default(~x - 1, model.frame(~x - 1), contrasts = FALSE):
## non-list contrasts argument ignored

## Warning in model.matrix.default(~x - 1, model.frame(~x - 1), contrasts = FALSE):
## non-list contrasts argument ignored
#colnames(dataset100)
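
The warning text shows that `dummy.data.frame` calls `model.matrix` with `contrasts = FALSE`, an argument that `model.matrix` ignores. As a possible alternative, the same kind of one-column-per-level coding can be sketched in base R by passing full (identity) contrast matrices through `contrasts.arg`. The `toy` data frame below is made up, and this is not the method used in the analysis above.

```r
# Made-up data with two factors and one numeric column
toy <- data.frame(
  Item.Type      = factor(c("Fruits", "Meat", "Fruits", "Snacks")),
  Order.Priority = factor(c("H", "L", "M", "H")),
  Units.Sold     = c(10, 20, 30, 40)
)

# One identity contrast matrix per factor, so every level keeps its own column
is_fac <- vapply(toy, is.factor, logical(1))
full_contrasts <- lapply(toy[is_fac], contrasts, contrasts = FALSE)

# "- 1" drops the intercept column; numeric columns pass through unchanged
dummies <- model.matrix(~ . - 1, data = toy, contrasts.arg = full_contrasts)
colnames(dummies)
```

Each factor contributes as many 0/1 columns as it has levels, and every row has exactly one 1 within each factor's block of columns.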

Splitting the data

set.seed(1234)
sample_index<-sample(nrow(dataset100), round(nrow(dataset100)*.75), replace=FALSE)

dataset100_train<-dataset100[sample_index,]
dataset100_test<-dataset100[-sample_index,]

#split class labels

dataset100_train_labels<-as.factor(dataset100_labels[sample_index,])

dataset100_test_labels<-as.factor(dataset100_labels[-sample_index,])

Classify using KNN from class package

dataset100_pred1<-knn(
  train =dataset100_train,
  test=dataset100_test,
  cl=dataset100_train_labels,
  k=8)

#head(dataset100_pred1)

Evaluating the model

Let’s look at how our model actually did in predicting the right label.

dataset100_pred1_table<-table(dataset100_test_labels, dataset100_pred1)

dataset100_pred1_table
##                       dataset100_pred1
## dataset100_test_labels Offline Online
##                Offline       7      7
##                Online        6      5
sum(diag(dataset100_pred1_table))/nrow(dataset100_test)
## [1] 0.48

The accuracy is 48% (12 of 25 test observations predicted correctly), which is slightly worse than a coin toss.
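
With only 25 test observations, it is worth asking how precisely the accuracy is estimated. The sketch below rebuilds the confusion matrix from the counts printed above and uses `binom.test` from base R to get an exact confidence interval for the accuracy. The variable names are my own, and the interval is only a rough guide.

```r
# Confusion matrix typed in from the output above (rows = actual, columns = predicted)
conf_mat <- matrix(
  c(7, 6,    # predicted Offline: actual Offline, actual Online
    7, 5),   # predicted Online:  actual Offline, actual Online
  nrow = 2,
  dimnames = list(actual    = c("Offline", "Online"),
                  predicted = c("Offline", "Online"))
)

correct  <- sum(diag(conf_mat))
n_test   <- sum(conf_mat)
accuracy <- correct / n_test

# Exact binomial 95% confidence interval for the accuracy
ci <- binom.test(correct, n_test)$conf.int
accuracy
ci
```

The interval is wide and includes 0.5, so on this test set the model cannot be told apart from random guessing in either direction.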

Look at the larger dataset KNN

normalize <-function (x) {
  return ((x-min(x))/(max(x) - min(x)))
}

dataset10000<-dataset10000 %>% 
  mutate (Units.Sold=normalize(Units.Sold)) %>%
  mutate (Unit.Price=normalize(Unit.Price)) %>%
  mutate (Unit.Cost=normalize(Unit.Cost)) %>%
  mutate (Total.Revenue=normalize(Total.Revenue)) %>%
  mutate (Total.Profit=normalize(Total.Profit)) %>%
  mutate (Total.Cost=normalize(Total.Cost)) %>%
  mutate (Days=normalize(Days)) %>%
  mutate (Order.Year=normalize(Order.Year))
  
summary(dataset10000)
##                                Region               Country    
##  Asia                             :1469   Lithuania     :  72  
##  Australia and Oceania            : 797   United Kingdom:  72  
##  Central America and the Caribbean:1019   Moldova       :  71  
##  Europe                           :2633   Croatia       :  70  
##  Middle East and North Africa     :1264   Seychelles    :  70  
##  North America                    : 215   Botswana      :  69  
##  Sub-Saharan Africa               :2603   (Other)       :9576  
##            Item.Type    Sales.Channel  Order.Priority   Units.Sold    
##  Personal Care  : 888   Offline:4939   C:2555         Min.   :0.0000  
##  Household      : 875   Online :5061   H:2503         1st Qu.:0.2529  
##  Clothes        : 872                  L:2494         Median :0.4961  
##  Baby Food      : 842                  M:2448         Mean   :0.5002  
##  Office Supplies: 837                                 3rd Qu.:0.7471  
##  Vegetables     : 836                                 Max.   :1.0000  
##  (Other)        :4850                                                 
##    Unit.Price       Unit.Cost         Total.Cost       Total.Profit    
##  Min.   :0.0000   Min.   :0.00000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.1517   1st Qu.:0.09604   1st Qu.:0.03141   1st Qu.:0.05655  
##  Median :0.2980   Median :0.21271   Median :0.09186   Median :0.16630  
##  Mean   :0.3928   Mean   :0.35111   Mean   :0.17898   Mean   :0.22728  
##  3rd Qu.:0.6493   3rd Qu.:0.69062   3rd Qu.:0.22583   3rd Qu.:0.32585  
##  Max.   :1.0000   Max.   :1.00000   Max.   :1.00000   Max.   :1.00000  
##                                                                        
##  Total.Revenue          Days        Order.Day   Order.Month     Order.Year    
##  Min.   :0.00000   Min.   :0.0000   Fri:1440   Jul    : 926   Min.   :0.0000  
##  1st Qu.:0.04317   1st Qu.:0.2400   Mon:1437   Mar    : 917   1st Qu.:0.1429  
##  Median :0.11975   Median :0.5000   Sat:1381   Jan    : 908   Median :0.4286  
##  Mean   :0.19958   Mean   :0.5012   Sun:1467   May    : 897   Mean   :0.4771  
##  3rd Qu.:0.27231   3rd Qu.:0.7400   Thu:1406   Jun    : 873   3rd Qu.:0.7143  
##  Max.   :1.00000   Max.   :1.0000   Tue:1422   Apr    : 850   Max.   :1.0000  
##                                     Wed:1447   (Other):4629
dataset10000_labels<-dataset10000 %>% select(Sales.Channel)

dataset10000<-dataset10000 %>% select(-Sales.Channel)
#colnames(dataset100)

dataset10000<-dummy.data.frame(data=dataset10000, sep="_")
## Warning in model.matrix.default(~x - 1, model.frame(~x - 1), contrasts = FALSE):
## non-list contrasts argument ignored

## Warning in model.matrix.default(~x - 1, model.frame(~x - 1), contrasts = FALSE):
## non-list contrasts argument ignored

## Warning in model.matrix.default(~x - 1, model.frame(~x - 1), contrasts = FALSE):
## non-list contrasts argument ignored

## Warning in model.matrix.default(~x - 1, model.frame(~x - 1), contrasts = FALSE):
## non-list contrasts argument ignored

## Warning in model.matrix.default(~x - 1, model.frame(~x - 1), contrasts = FALSE):
## non-list contrasts argument ignored

## Warning in model.matrix.default(~x - 1, model.frame(~x - 1), contrasts = FALSE):
## non-list contrasts argument ignored
#colnames(dataset100)

set.seed(1234)
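# NOTE: the sample size below is taken from nrow(dataset100), not nrow(dataset10000),
# so only 75 rows are drawn for training and the remaining 9925 rows become the test set.
sample_index<-sample(nrow(dataset10000), round(nrow(dataset100)*.75), replace=FALSE)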

dataset10000_train<-dataset10000[sample_index,]
dataset10000_test<-dataset10000[-sample_index,]

#split class labels

dataset10000_train_labels<-as.factor(dataset10000_labels[sample_index,])

dataset10000_test_labels<-as.factor(dataset10000_labels[-sample_index,])


dataset10000_pred1<-knn(
  train =dataset10000_train,
  test=dataset10000_test,
  cl=dataset10000_train_labels,
  k=75)

#head(dataset100_pred1)

dataset10000_pred1_table<-table(dataset10000_test_labels, dataset10000_pred1)

dataset10000_pred1_table
##                         dataset10000_pred1
## dataset10000_test_labels Offline Online
##                  Offline    4897      0
##                  Online     5028      0
sum(diag(dataset10000_pred1_table))/nrow(dataset10000_test)
## [1] 0.4934005

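The larger dataset also has an accuracy below 50% (49.3%), again worse than a coin toss. Note, however, that the model predicted Offline for every test observation. The split above sized the training sample from dataset100, so only 75 rows were used for training, and with k=75 every prediction is simply the majority class of those 75 rows. This result therefore reflects the setup more than the effect of the extra data.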
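
A split sized from the dataset being split, with k kept well below the training size, could look like the sketch below. `split_indices` is a hypothetical helper, and the square root of the training size is just one common rule of thumb for a starting k, not a value taken from the analysis above. I have not rerun the KNN model with this split, so no new accuracy is reported here.

```r
# Hypothetical helper: draw training row indices for a dataset with n rows
split_indices <- function(n, train_share = 0.75, seed = 1234) {
  set.seed(seed)
  sample(n, round(n * train_share), replace = FALSE)
}

n_large   <- 10000                      # rows in the larger file
train_idx <- split_indices(n_large)

n_train <- length(train_idx)            # 75% of the rows
n_test  <- n_large - n_train            # the remaining 25%

# Rule-of-thumb starting point for k (an assumption, to be tuned)
k_guess <- round(sqrt(n_train))

c(train = n_train, test = n_test, k = k_guess)
```

Passing the same `n` to both the sampling and the size calculation removes the chance of mixing up the two datasets.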

Logistic model

Let’s predict Sales.Channel using a logistic model

set.seed(12345)
sample_set<-sample(nrow(dataset100_2), round(nrow(dataset100_2)*.75), replace=FALSE)

dataset100_2_train<-dataset100_2[sample_set,]
dataset100_2_test<-dataset100_2[-sample_set,]

dim(dataset100_2_train)
## [1] 75 15
dim(dataset100_2_test)
## [1] 25 15
glm.log<-glm(Sales.Channel ~ Region + Country + Item.Type + Order.Priority +
          Units.Sold + Unit.Price +
          Unit.Cost + Total.Cost +
          Total.Profit + Total.Revenue +
          Days + Order.Day + Order.Month + Order.Year,
        data=dataset100_2_train, family=binomial)

summary(glm.log)
## 
## Call:
## glm(formula = Sales.Channel ~ Region + Country + Item.Type + 
##     Order.Priority + Units.Sold + Unit.Price + Unit.Cost + Total.Cost + 
##     Total.Profit + Total.Revenue + Days + Order.Day + Order.Month + 
##     Order.Year, family = binomial, data = dataset100_2_train)
## 
## Deviance Residuals: 
##  [1]  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
## [26]  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
## [51]  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
## 
## Coefficients: (31 not defined because of singularities)
##                                           Estimate Std. Error z value Pr(>|z|)
## (Intercept)                              2.657e+01  1.181e+06       0        1
## RegionAustralia and Oceania              9.028e-06  1.332e+06       0        1
## RegionCentral America and the Caribbean  2.657e+01  1.573e+06       0        1
## RegionEurope                             5.313e+01  1.234e+06       0        1
## RegionMiddle East and North Africa       5.313e+01  1.234e+06       0        1
## RegionNorth America                     -2.657e+01  6.662e+05       0        1
## RegionSub-Saharan Africa                 2.657e+01  9.079e+05       0        1
## CountryAngola                           -8.830e-06  1.511e+06       0        1
## CountryAustralia                         7.970e+01  1.573e+06       0        1
## CountryAustria                           5.011e-06  1.007e+06       0        1
## CountryAzerbaijan                        5.313e+01  1.424e+06       0        1
## CountryBangladesh                        7.970e+01  1.038e+06       0        1
## CountryBelize                            2.657e+01  1.689e+06       0        1
## CountryBrunei                            7.970e+01  9.079e+05       0        1
## CountryBulgaria                          5.313e+01  1.126e+06       0        1
## CountryCape Verde                        2.657e+01  1.154e+06       0        1
## CountryCosta Rica                       -5.313e+01  1.424e+06       0        1
## CountryCote d'Ivoire                     2.657e+01  1.356e+06       0        1
## CountryDjibouti                          2.657e+01  8.352e+05       0        1
## CountryEast Timor                       -2.657e+01  9.079e+05       0        1
## CountryFederated States of Micronesia    7.970e+01  1.651e+06       0        1
## CountryFiji                             -1.383e-05  1.745e+06       0        1
## CountryFrance                            5.313e+01  1.007e+06       0        1
## CountryGrenada                          -2.657e+01  1.490e+06       0        1
## CountryHaiti                             2.657e+01  1.208e+06       0        1
## CountryHonduras                         -1.018e-08  1.332e+06       0        1
## CountryIran                              5.313e+01  1.007e+06       0        1
## CountryKenya                             1.063e+02  1.234e+06       0        1
## CountryKiribati                          7.970e+01  2.238e+06       0        1
## CountryKuwait                            2.657e+01  9.753e+05       0        1
## CountryKyrgyzstan                        1.594e+02  1.234e+06       0        1
## CountryLaos                              5.313e+01  1.126e+06       0        1
## CountryLebanon                           2.657e+01  9.079e+05       0        1
## CountryLesotho                           2.657e+01  1.259e+06       0        1
## CountryLibya                            -4.627e-06  5.036e+05       0        1
## CountryMacedonia                        -5.313e+01  5.036e+05       0        1
## CountryMali                              7.970e+01  1.934e+06       0        1
## CountryMexico                                   NA         NA      NA       NA
## CountryMoldova                          -2.657e+01  1.038e+06       0        1
## CountryMongolia                         -5.313e+01  1.126e+06       0        1
## CountryMozambique                       -2.657e+01  9.079e+05       0        1
## CountryMyanmar                           2.657e+01  7.555e+05       0        1
## CountryNew Zealand                       7.970e+01  1.532e+06       0        1
## CountryNicaragua                                NA         NA      NA       NA
## CountryNiger                             2.657e+01  9.079e+05       0        1
## CountryPakistan                         -2.657e+01  1.154e+06       0        1
## CountryPortugal                          1.846e-05  1.745e+06       0        1
## CountryRepublic of the Congo            -2.657e+01  1.651e+06       0        1
## CountryRomania                           5.313e+01  1.007e+06       0        1
## CountryRussia                           -2.657e+01  9.079e+05       0        1
## CountryRwanda                            2.001e-07  1.234e+06       0        1
## CountrySamoa                             1.063e+02  1.332e+06       0        1
## CountrySan Marino                       -2.657e+01  1.612e+06       0        1
## CountrySao Tome and Principe             2.083e-07  1.332e+06       0        1
## CountrySenegal                           2.657e+01  9.079e+05       0        1
## CountrySierra Leone                      2.657e+01  1.038e+06       0        1
## CountrySlovenia                         -2.657e+01  1.098e+06       0        1
## CountrySouth Sudan                      -7.970e+01  1.154e+06       0        1
## CountrySpain                            -5.313e+01  1.234e+06       0        1
## CountrySri Lanka                         5.313e+01  1.332e+06       0        1
## CountrySwitzerland                      -4.017e-06  1.234e+06       0        1
## CountrySyria                                    NA         NA      NA       NA
## CountryThe Gambia                       -2.657e+01  8.352e+05       0        1
## CountryTurkmenistan                             NA         NA      NA       NA
## CountryTuvalu                                   NA         NA      NA       NA
## CountryUnited Kingdom                    9.417e-06  1.234e+06       0        1
## CountryZambia                                   NA         NA      NA       NA
## Item.TypeBeverages                      -7.970e+01  1.490e+06       0        1
## Item.TypeCereal                          9.028e-06  1.234e+06       0        1
## Item.TypeClothes                        -5.313e+01  1.424e+06       0        1
## Item.TypeCosmetics                      -5.313e+01  1.234e+06       0        1
## Item.TypeFruits                         -2.657e+01  1.447e+06       0        1
## Item.TypeHousehold                      -2.657e+01  9.079e+05       0        1
## Item.TypeMeat                            5.313e+01  5.036e+05       0        1
## Item.TypeOffice Supplies                -5.313e+01  1.332e+06       0        1
## Item.TypePersonal Care                   9.023e-06  1.126e+06       0        1
## Item.TypeSnacks                          4.404e-06  7.122e+05       0        1
## Item.TypeVegetables                     -1.063e+02  1.332e+06       0        1
## Order.PriorityH                         -5.313e+01  5.036e+05       0        1
## Order.PriorityL                         -2.657e+01  5.631e+05       0        1
## Order.PriorityM                         -5.313e+01  1.007e+06       0        1
## Units.Sold                                      NA         NA      NA       NA
## Unit.Price                                      NA         NA      NA       NA
## Unit.Cost                                       NA         NA      NA       NA
## Total.Cost                                      NA         NA      NA       NA
## Total.Profit                                    NA         NA      NA       NA
## Total.Revenue                                   NA         NA      NA       NA
## Days                                            NA         NA      NA       NA
## Order.DayMon                                    NA         NA      NA       NA
## Order.DaySat                                    NA         NA      NA       NA
## Order.DaySun                                    NA         NA      NA       NA
## Order.DayThu                                    NA         NA      NA       NA
## Order.DayTue                                    NA         NA      NA       NA
## Order.DayWed                                    NA         NA      NA       NA
## Order.MonthAug                                  NA         NA      NA       NA
## Order.MonthDec                                  NA         NA      NA       NA
## Order.MonthFeb                                  NA         NA      NA       NA
## Order.MonthJan                                  NA         NA      NA       NA
## Order.MonthJul                                  NA         NA      NA       NA
## Order.MonthJun                                  NA         NA      NA       NA
## Order.MonthMar                                  NA         NA      NA       NA
## Order.MonthMay                                  NA         NA      NA       NA
## Order.MonthNov                                  NA         NA      NA       NA
## Order.MonthOct                                  NA         NA      NA       NA
## Order.MonthSep                                  NA         NA      NA       NA
## Order.Year                                      NA         NA      NA       NA
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 1.0396e+02  on 74  degrees of freedom
## Residual deviance: 4.3512e-10  on  0  degrees of freedom
## AIC: 150
## 
## Number of Fisher Scoring iterations: 25
# Use coef() to access the coefficients
#coef(glm.log)
#summary(glm.log)$coef
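The NA rows in the summary above are coefficients R could not estimate: either the column is an exact linear combination of columns already in the model, or no observations are left to estimate it. The sketch below reproduces the first case on simulated data. The names `cost`, `profit` and `revenue` are made up for illustration and are not read from the sales file; the only assumption is that revenue equals cost plus profit.

```r
# Simulated data: revenue is an exact linear combination of cost and profit
set.seed(1)
n <- 200
cost    <- runif(n, 10, 100)
profit  <- runif(n, 1, 50)
revenue <- cost + profit
y <- rbinom(n, 1, 0.5)   # outcome unrelated to the predictors

fit <- glm(y ~ cost + profit + revenue, family = binomial)

# The redundant column gets an NA coefficient, as in the summary output
coef(fit)

# Names of the coefficients that could not be estimated
aliased <- names(coef(fit))[is.na(coef(fit))]
aliased
```

Dropping the redundant column from the formula gives the same fitted values without the NA row.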

The predict() function is used to predict the probability that Sales.Channel is Online given the predictors (contrasts() shows that Online is the level coded 1). type="response" tells R to return probabilities rather than log-odds.

glm.probs<- predict(glm.log, type="response")
glm.probs[1:10]
##           14           51           80           90           92           24 
## 2.900701e-12 1.000000e+00 1.000000e+00 2.900701e-12 2.900701e-12 1.000000e+00 
##           58           93           75           88 
## 2.900701e-12 1.000000e+00 2.900701e-12 2.900701e-12
contrasts(dataset100_2$Sales.Channel)
##         Online
## Offline      0
## Online       1

R created a dummy variable with Online coded as 1, so the predicted probabilities are probabilities of Online.
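A small check on simulated data (the data frame `d` and its columns are hypothetical) illustrates both points: the probabilities from `type = "response"` are the inverse logit of the linear predictor, and they refer to the factor level that `contrasts()` codes as 1.

```r
set.seed(2)
d <- data.frame(x = rnorm(100))
# Two-level factor; levels sort alphabetically, so "Offline" is the baseline
d$channel <- factor(ifelse(runif(100) < plogis(d$x), "Online", "Offline"))

fit <- glm(channel ~ x, data = d, family = binomial)

contrasts(d$channel)                            # "Online" is coded 1

p_response <- predict(fit, type = "response")   # probabilities of "Online"
p_link     <- predict(fit, type = "link")       # log-odds of "Online"

# Probabilities are the inverse logit of the log-odds
same <- all.equal(unname(p_response), unname(plogis(p_link)))
same
```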

The probabilities must be converted to a class label, Offline or Online.

Logistic training set results

glm.pred<-rep("Offline",nrow(dataset100_2_train))
glm.pred[glm.probs>.5]="Online"
table(glm.pred,dataset100_2_train$Sales.Channel)
##          
## glm.pred  Offline Online
##   Offline      37      0
##   Online        0     38
mean(glm.pred ==dataset100_2_train$Sales.Channel)
## [1] 1

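This means the model classified 100% of the training observations correctly, which is too good to be true. The summary shows why: the residual deviance is essentially zero on 0 degrees of freedom, so the model has as many estimable parameters as the 75 training rows and simply memorizes them.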

The sample is probably too small to be a good representation of the population. For example, it has only 100 observations, while the 10000-observation dataset contains 135 distinct countries. So even if every observation came from a different country, not all of the countries could be represented. We need to use a larger dataset.
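To see why a perfect score on so little data says nothing, the sketch below fits a logistic regression to pure noise, with one factor level per row to mimic a sample in which almost every order comes from a different country. The data and names are simulated for illustration. Training accuracy is still 100%.

```r
set.seed(3)
n <- 40
noise <- data.frame(
  y  = factor(sample(c("Offline", "Online"), n, replace = TRUE)),
  id = factor(seq_len(n))      # one level per row, like a near-unique Country
)

# glm() warns that fitted probabilities are numerically 0 or 1; that warning
# is the symptom of a saturated fit and is silenced only to keep output short
fit <- suppressWarnings(glm(y ~ id, data = noise, family = binomial))

df.residual(fit)               # 0: as many parameters as observations

pred <- ifelse(fitted(fit) > 0.5, "Online", "Offline")
train_acc <- mean(pred == noise$y)
train_acc                      # 1, even though y is random
```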

Run on the test data

Let’s check the test data.

glm.log_test<-glm(Sales.Channel ~ Region + Country + Item.Type + Order.Priority +
          Units.Sold + Unit.Price +
          Unit.Cost + Total.Cost +
          Total.Profit + Total.Revenue +
          Days + Order.Day + Order.Month + Order.Year,
          data=dataset100_2_test, family=binomial)

summary(glm.log_test)
## 
## Call:
## glm(formula = Sales.Channel ~ Region + Country + Item.Type + 
##     Order.Priority + Units.Sold + Unit.Price + Unit.Cost + Total.Cost + 
##     Total.Profit + Total.Revenue + Days + Order.Day + Order.Month + 
##     Order.Year, family = binomial, data = dataset100_2_test)
## 
## Deviance Residuals: 
##  [1]  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
## 
## Coefficients: (36 not defined because of singularities)
##                                           Estimate Std. Error z value Pr(>|z|)
## (Intercept)                              2.557e+01  4.830e+05       0        1
## RegionAustralia and Oceania             -9.757e-08  3.055e+05       0        1
## RegionEurope                            -8.657e-08  5.291e+05       0        1
## RegionMiddle East and North Africa       5.113e+01  4.320e+05       0        1
## RegionNorth America                     -5.113e+01  5.291e+05       0        1
## RegionSub-Saharan Africa                -5.113e+01  3.055e+05       0        1
## CountryBurkina Faso                      5.113e+01  5.291e+05       0        1
## CountryCameroon                          5.113e+01  3.055e+05       0        1
## CountryComoros                           5.113e+01  4.320e+05       0        1
## CountryDemocratic Republic of the Congo  1.023e+02  4.320e+05       0        1
## CountryGabon                             5.239e-09  5.291e+05       0        1
## CountryIceland                          -7.809e-09  3.055e+05       0        1
## CountryLithuania                        -5.113e+01  5.291e+05       0        1
## CountryMadagascar                        5.239e-09  5.291e+05       0        1
## CountryMalaysia                         -5.113e+01  5.291e+05       0        1
## CountryMali                              5.113e+01  5.291e+05       0        1
## CountryMauritania                       -7.127e-16  3.055e+05       0        1
## CountryMexico                                   NA         NA      NA       NA
## CountryMonaco                           -5.113e+01  3.055e+05       0        1
## CountryMyanmar                           1.165e-08  5.291e+05       0        1
## CountryNorway                           -1.060e-08  3.055e+05       0        1
## CountryRwanda                            5.239e-09  5.291e+05       0        1
## CountrySaudi Arabia                             NA         NA      NA       NA
## CountrySierra Leone                             NA         NA      NA       NA
## CountrySlovakia                                 NA         NA      NA       NA
## CountrySolomon Islands                   4.548e-09  5.291e+05       0        1
## CountryTurkmenistan                             NA         NA      NA       NA
## Item.TypeBeverages                      -5.113e+01  3.055e+05       0        1
## Item.TypeCereal                         -5.113e+01  5.291e+05       0        1
## Item.TypeClothes                                NA         NA      NA       NA
## Item.TypeCosmetics                              NA         NA      NA       NA
## Item.TypeFruits                                 NA         NA      NA       NA
## Item.TypeOffice Supplies                 5.239e-09  4.320e+05       0        1
## Item.TypePersonal Care                          NA         NA      NA       NA
## Item.TypeVegetables                             NA         NA      NA       NA
## Order.PriorityH                                 NA         NA      NA       NA
## Order.PriorityL                                 NA         NA      NA       NA
## Order.PriorityM                                 NA         NA      NA       NA
## Units.Sold                                      NA         NA      NA       NA
## Unit.Price                                      NA         NA      NA       NA
## Unit.Cost                                       NA         NA      NA       NA
## Total.Cost                                      NA         NA      NA       NA
## Total.Profit                                    NA         NA      NA       NA
## Total.Revenue                                   NA         NA      NA       NA
## Days                                            NA         NA      NA       NA
## Order.DayMon                                    NA         NA      NA       NA
## Order.DaySat                                    NA         NA      NA       NA
## Order.DaySun                                    NA         NA      NA       NA
## Order.DayThu                                    NA         NA      NA       NA
## Order.DayTue                                    NA         NA      NA       NA
## Order.DayWed                                    NA         NA      NA       NA
## Order.MonthDec                                  NA         NA      NA       NA
## Order.MonthFeb                                  NA         NA      NA       NA
## Order.MonthJan                                  NA         NA      NA       NA
## Order.MonthJul                                  NA         NA      NA       NA
## Order.MonthJun                                  NA         NA      NA       NA
## Order.MonthMar                                  NA         NA      NA       NA
## Order.MonthMay                                  NA         NA      NA       NA
## Order.MonthNov                                  NA         NA      NA       NA
## Order.MonthOct                                  NA         NA      NA       NA
## Order.Year                                      NA         NA      NA       NA
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 3.4617e+01  on 24  degrees of freedom
## Residual deviance: 3.9425e-10  on  0  degrees of freedom
## AIC: 50
## 
## Number of Fisher Scoring iterations: 24
glm.probs_test<- predict(glm.log_test, type="response") 

glm.pred_test<-rep("Offline",nrow(dataset100_2_test))
glm.pred_test[glm.probs_test>.5]="Online"
table(glm.pred_test,dataset100_2_test$Sales.Channel)
##              
## glm.pred_test Offline Online
##       Offline      13      0
##       Online        0     12
mean(glm.pred_test ==dataset100_2_test$Sales.Channel)
## [1] 1

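The test data set also had a 100% accuracy rate! Note, however, that glm.log_test was fit on the 25 test rows themselves, so this is again the in-sample accuracy of a saturated model (0 residual degrees of freedom) and not a measure of how well glm.log predicts unseen data.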

Definitely time to move on to a larger dataset for more realistic modeling results.
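For comparison, here is a sketch of a genuine hold-out check on simulated data (all names are illustrative). The model is fit on the training rows only, and `predict()` receives the test rows through `newdata`, so the accuracy reflects rows the model has not seen.

```r
set.seed(4)
n <- 400
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
sim$channel <- factor(ifelse(runif(n) < plogis(1.5 * sim$x1), "Online", "Offline"))

# 75/25 split into training and test rows
idx   <- sample(nrow(sim), round(nrow(sim) * .75))
train <- sim[idx, ]
test  <- sim[-idx, ]

fit <- glm(channel ~ x1 + x2, data = train, family = binomial)

# Score the held-out rows with the model trained on the other rows
probs_test <- predict(fit, newdata = test, type = "response")
pred_test  <- ifelse(probs_test > .5, "Online", "Offline")

table(pred_test, test$channel)
test_acc <- mean(pred_test == test$channel)
test_acc
```

In this analysis the analogous call would be `predict(glm.log, newdata=dataset100_2_test, type="response")`. It was not run here, and with a fit this rank-deficient R may warn or fail on it.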

Logistic on the 10000 dataset

set.seed(12345)
sample_set<-sample(nrow(dataset10000_2), round(nrow(dataset10000_2)*.75), replace=FALSE)

dataset10000_2_train<-dataset10000_2[sample_set,]
dataset10000_2_test<-dataset10000_2[-sample_set,]

dim(dataset10000_2_train)
## [1] 7500   15
dim(dataset10000_2_test)
## [1] 2500   15
glm.log2<-glm(Sales.Channel ~ Region + Country + Item.Type + Order.Priority +
          Units.Sold + Unit.Price +
          Unit.Cost + Total.Cost +
          Total.Profit + Total.Revenue +
          Days + Order.Day + Order.Month + Order.Year,
          data=dataset10000_2_train, family=binomial)

summary(glm.log2)
## 
## Call:
## glm(formula = Sales.Channel ~ Region + Country + Item.Type + 
##     Order.Priority + Units.Sold + Unit.Price + Unit.Cost + Total.Cost + 
##     Total.Profit + Total.Revenue + Days + Order.Day + Order.Month + 
##     Order.Year, family = binomial, data = dataset10000_2_train)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.6883  -1.1553   0.8458   1.1528   1.5899  
## 
## Coefficients: (9 not defined because of singularities)
##                                           Estimate Std. Error z value Pr(>|z|)
## (Intercept)                             -1.489e+01  2.203e+01  -0.676  0.49930
## RegionAustralia and Oceania              5.589e-02  4.600e-01   0.121  0.90330
## RegionCentral America and the Caribbean -6.958e-02  4.518e-01  -0.154  0.87762
## RegionEurope                             2.494e-01  4.276e-01   0.583  0.55973
## RegionMiddle East and North Africa       1.504e-01  4.348e-01   0.346  0.72940
## RegionNorth America                      6.237e-01  4.358e-01   1.431  0.15232
## RegionSub-Saharan Africa                 7.453e-01  4.425e-01   1.684  0.09210
## CountryAlbania                          -2.393e-01  4.327e-01  -0.553  0.58028
## CountryAlgeria                           2.986e-01  4.488e-01   0.665  0.50575
## CountryAndorra                           2.824e-02  4.583e-01   0.062  0.95087
## CountryAngola                           -1.263e+00  4.742e-01  -2.663  0.00776
## CountryAntigua and Barbuda              -1.970e-01  4.648e-01  -0.424  0.67172
## CountryArmenia                           1.385e-01  4.379e-01   0.316  0.75174
## CountryAustralia                         6.731e-01  4.839e-01   1.391  0.16424
## CountryAustria                          -8.794e-02  4.169e-01  -0.211  0.83294
## CountryAzerbaijan                        4.393e-01  4.414e-01   0.995  0.31963
## CountryBahrain                           8.235e-02  4.176e-01   0.197  0.84367
## CountryBangladesh                        2.378e-01  4.370e-01   0.544  0.58627
## CountryBarbados                          9.515e-01  4.764e-01   1.997  0.04581
## CountryBelarus                           3.672e-01  4.344e-01   0.845  0.39796
## CountryBelgium                           1.649e-02  4.434e-01   0.037  0.97033
## CountryBelize                            6.439e-01  5.023e-01   1.282  0.19991
## CountryBenin                            -8.527e-01  4.281e-01  -1.992  0.04637
## CountryBhutan                            3.606e-01  4.355e-01   0.828  0.40764
## CountryBosnia and Herzegovina           -5.324e-01  4.267e-01  -1.248  0.21214
## CountryBotswana                         -1.802e-01  4.339e-01  -0.415  0.67790
## CountryBrunei                            1.884e-01  4.587e-01   0.411  0.68130
## CountryBulgaria                         -2.539e-01  4.559e-01  -0.557  0.57762
## CountryBurkina Faso                      1.546e-02  4.731e-01   0.033  0.97393
## CountryBurundi                          -8.295e-01  4.280e-01  -1.938  0.05261
## CountryCambodia                          6.896e-01  4.263e-01   1.618  0.10573
## CountryCameroon                         -6.862e-01  4.729e-01  -1.451  0.14677
## CountryCanada                           -8.176e-01  4.170e-01  -1.960  0.04994
## CountryCape Verde                        3.051e-02  5.107e-01   0.060  0.95236
## CountryCentral African Republic         -2.663e-01  4.558e-01  -0.584  0.55911
## CountryChad                             -2.478e-01  4.474e-01  -0.554  0.57966
## CountryChina                             3.247e-01  4.403e-01   0.737  0.46088
## CountryComoros                          -1.354e-01  4.720e-01  -0.287  0.77425
## CountryCosta Rica                        5.990e-01  4.546e-01   1.318  0.18760
## CountryCote d'Ivoire                    -5.445e-01  4.720e-01  -1.154  0.24866
## CountryCroatia                           3.323e-01  4.078e-01   0.815  0.41517
## CountryCuba                              3.209e-01  4.610e-01   0.696  0.48641
## CountryCyprus                           -8.766e-02  4.473e-01  -0.196  0.84464
## CountryCzech Republic                   -4.699e-01  4.741e-01  -0.991  0.32164
## CountryDemocratic Republic of the Congo -7.377e-01  4.375e-01  -1.686  0.09171
## CountryDenmark                           4.379e-01  4.241e-01   1.032  0.30184
## CountryDjibouti                         -6.572e-01  4.462e-01  -1.473  0.14078
## CountryDominica                          1.730e-01  4.583e-01   0.377  0.70580
## CountryDominican Republic                1.805e-01  4.941e-01   0.365  0.71483
## CountryEast Timor                        1.193e+00  5.181e-01   2.303  0.02128
## CountryEgypt                             2.672e-01  4.313e-01   0.620  0.53554
## CountryEl Salvador                       2.825e-01  4.375e-01   0.646  0.51851
## CountryEquatorial Guinea                -8.096e-01  4.645e-01  -1.743  0.08133
## CountryEritrea                          -1.201e-01  4.700e-01  -0.256  0.79833
## CountryEstonia                          -4.276e-01  4.389e-01  -0.974  0.33001
## CountryEthiopia                         -6.283e-01  4.326e-01  -1.453  0.14636
## CountryFederated States of Micronesia    1.131e-01  4.562e-01   0.248  0.80418
## CountryFiji                              5.509e-01  4.491e-01   1.226  0.22001
## CountryFinland                           5.300e-04  4.291e-01   0.001  0.99901
## CountryFrance                           -2.764e-02  4.234e-01  -0.065  0.94795
## CountryGabon                            -8.523e-01  4.718e-01  -1.806  0.07089
## CountryGeorgia                          -1.184e-01  4.408e-01  -0.269  0.78814
## CountryGermany                           2.032e-02  4.290e-01   0.047  0.96221
## CountryGhana                            -4.965e-01  4.433e-01  -1.120  0.26276
## CountryGreece                            2.506e-02  4.373e-01   0.057  0.95429
## CountryGreenland                        -5.182e-01  4.702e-01  -1.102  0.27043
## CountryGrenada                          -9.627e-02  4.436e-01  -0.217  0.82817
## CountryGuatemala                         2.739e-01  4.458e-01   0.614  0.53895
## CountryGuinea                           -1.612e-01  4.487e-01  -0.359  0.71936
## CountryGuinea-Bissau                    -3.664e-01  4.691e-01  -0.781  0.43473
## CountryHaiti                             9.655e-02  4.651e-01   0.208  0.83557
## CountryHonduras                          4.383e-01  4.510e-01   0.972  0.33110
## CountryHungary                          -3.531e-01  4.499e-01  -0.785  0.43260
## CountryIceland                           3.965e-02  4.260e-01   0.093  0.92583
## CountryIndia                             6.465e-01  4.221e-01   1.532  0.12557
## CountryIndonesia                        -6.348e-02  4.542e-01  -0.140  0.88885
## CountryIran                              4.868e-02  4.120e-01   0.118  0.90594
## CountryIraq                              2.382e-01  4.244e-01   0.561  0.57458
## CountryIreland                           2.908e-01  4.312e-01   0.674  0.50009
## CountryIsrael                           -2.813e-01  4.246e-01  -0.663  0.50765
## CountryItaly                            -4.895e-02  4.626e-01  -0.106  0.91573
## CountryJamaica                           1.379e-01  4.618e-01   0.299  0.76516
## CountryJapan                             2.901e-01  4.457e-01   0.651  0.51518
## CountryJordan                            3.304e-01  4.399e-01   0.751  0.45255
## CountryKazakhstan                        7.268e-01  4.437e-01   1.638  0.10140
## CountryKenya                            -3.955e-01  4.340e-01  -0.911  0.36212
## CountryKiribati                         -1.146e-01  4.518e-01  -0.254  0.79970
## CountryKosovo                            6.104e-02  4.105e-01   0.149  0.88180
## CountryKuwait                           -1.022e-01  4.368e-01  -0.234  0.81501
## CountryKyrgyzstan                        3.270e-03  4.309e-01   0.008  0.99395
## CountryLaos                              4.132e-01  4.500e-01   0.918  0.35848
## CountryLatvia                           -6.617e-02  4.439e-01  -0.149  0.88151
## CountryLebanon                           3.054e-01  4.487e-01   0.681  0.49608
## CountryLesotho                          -8.525e-01  4.434e-01  -1.923  0.05453
## CountryLiberia                          -6.694e-01  4.520e-01  -1.481  0.13859
## CountryLibya                             1.642e-01  4.506e-01   0.364  0.71559
## CountryLiechtenstein                     1.296e-01  4.109e-01   0.315  0.75255
## CountryLithuania                         5.930e-01  4.297e-01   1.380  0.16758
## CountryLuxembourg                        5.580e-01  4.307e-01   1.295  0.19519
## CountryMacedonia                        -1.348e-01  4.475e-01  -0.301  0.76318
## CountryMadagascar                       -6.766e-01  4.500e-01  -1.504  0.13265
## CountryMalawi                           -3.297e-01  4.227e-01  -0.780  0.43536
## CountryMalaysia                         -1.946e-02  4.700e-01  -0.041  0.96697
## CountryMaldives                          5.267e-01  4.344e-01   1.213  0.22526
## CountryMali                             -5.687e-01  5.001e-01  -1.137  0.25542
## CountryMalta                             7.839e-01  4.589e-01   1.708  0.08758
## CountryMarshall Islands                  1.018e-01  4.687e-01   0.217  0.82803
## CountryMauritania                        4.907e-02  4.629e-01   0.106  0.91558
## CountryMauritius                        -4.689e-01  4.410e-01  -1.063  0.28764
## CountryMexico                           -3.170e-01  4.453e-01  -0.712  0.47651
## CountryMoldova                          -3.422e-01  4.004e-01  -0.855  0.39267
## CountryMonaco                            9.916e-02  4.550e-01   0.218  0.82747
## CountryMongolia                          2.051e-01  4.399e-01   0.466  0.64108
## CountryMontenegro                        7.217e-02  4.069e-01   0.177  0.85922
## CountryMorocco                           4.810e-01  4.280e-01   1.124  0.26106
## CountryMozambique                       -1.032e+00  4.754e-01  -2.170  0.03000
## CountryMyanmar                           3.605e-01  4.428e-01   0.814  0.41551
## CountryNamibia                          -4.220e-01  4.722e-01  -0.894  0.37153
## CountryNauru                             5.760e-01  4.703e-01   1.225  0.22071
## CountryNepal                             7.992e-01  4.764e-01   1.678  0.09343
## CountryNetherlands                      -5.212e-02  4.166e-01  -0.125  0.90044
## CountryNew Zealand                       2.009e-01  4.632e-01   0.434  0.66452
## CountryNicaragua                         5.918e-01  4.840e-01   1.223  0.22146
## CountryNiger                            -5.841e-01  4.405e-01  -1.326  0.18480
## CountryNigeria                          -8.956e-01  4.405e-01  -2.033  0.04205
## CountryNorth Korea                       2.341e-01  4.424e-01   0.529  0.59674
## CountryNorway                           -3.744e-01  4.411e-01  -0.849  0.39604
## CountryOman                              3.843e-01  4.530e-01   0.848  0.39618
## CountryPakistan                         -1.943e-01  4.709e-01  -0.413  0.67995
## CountryPalau                             3.479e-01  4.898e-01   0.710  0.47760
## CountryPanama                            9.361e-02  4.539e-01   0.206  0.83662
## CountryPapua New Guinea                  5.144e-01  4.812e-01   1.069  0.28509
## CountryPhilippines                      -2.248e-01  4.456e-01  -0.505  0.61385
## CountryPoland                           -1.671e-02  4.468e-01  -0.037  0.97016
## CountryPortugal                         -3.991e-01  4.143e-01  -0.963  0.33540
## CountryQatar                             4.984e-02  4.315e-01   0.116  0.90805
## CountryRepublic of the Congo            -6.655e-01  4.586e-01  -1.451  0.14674
## CountryRomania                           4.990e-01  4.417e-01   1.130  0.25865
## CountryRussia                           -2.279e-01  4.276e-01  -0.533  0.59415
## CountryRwanda                           -4.575e-01  4.407e-01  -1.038  0.29921
## CountrySaint Kitts and Nevis             5.154e-01  4.424e-01   1.165  0.24397
## CountrySaint Lucia                       4.607e-01  5.038e-01   0.914  0.36049
## CountrySaint Vincent and the Grenadines  4.277e-01  4.938e-01   0.866  0.38640
## CountrySamoa                             3.956e-01  4.671e-01   0.847  0.39698
## CountrySan Marino                       -5.267e-01  4.593e-01  -1.147  0.25146
## CountrySao Tome and Principe            -3.625e-01  4.519e-01  -0.802  0.42246
## CountrySaudi Arabia                      3.262e-01  4.628e-01   0.705  0.48092
## CountrySenegal                          -6.519e-01  4.227e-01  -1.542  0.12298
## CountrySerbia                           -1.925e-01  4.244e-01  -0.453  0.65021
## CountrySeychelles                       -4.212e-01  4.219e-01  -0.998  0.31808
## CountrySierra Leone                     -5.858e-01  4.579e-01  -1.279  0.20081
## CountrySingapore                        -3.813e-02  4.696e-01  -0.081  0.93528
## CountrySlovakia                         -3.963e-01  4.922e-01  -0.805  0.42063
## CountrySlovenia                         -2.354e-01  4.278e-01  -0.550  0.58212
## CountrySolomon Islands                   5.210e-01  4.612e-01   1.130  0.25863
## CountrySomalia                           2.123e-01  4.340e-01   0.489  0.62476
## CountrySouth Africa                     -6.290e-01  4.553e-01  -1.382  0.16709
## CountrySouth Korea                       1.743e-01  4.254e-01   0.410  0.68198
## CountrySouth Sudan                      -7.927e-01  4.978e-01  -1.592  0.11128
## CountrySpain                             6.553e-01  4.520e-01   1.450  0.14716
## CountrySri Lanka                         1.160e-01  4.356e-01   0.266  0.79002
## CountrySudan                            -1.128e+00  4.799e-01  -2.351  0.01871
## CountrySwaziland                        -3.879e-01  4.384e-01  -0.885  0.37635
## CountrySweden                            2.491e-01  4.492e-01   0.555  0.57913
## CountrySwitzerland                      -1.638e-01  4.517e-01  -0.363  0.71691
## CountrySyria                             2.622e-01  4.744e-01   0.553  0.58054
## CountryTaiwan                            5.800e-01  4.220e-01   1.375  0.16927
## CountryTajikistan                        9.231e-01  4.818e-01   1.916  0.05535
## CountryTanzania                         -3.599e-01  4.587e-01  -0.785  0.43267
## CountryThailand                          2.533e-01  4.425e-01   0.572  0.56707
## CountryThe Bahamas                      -3.801e-02  4.587e-01  -0.083  0.93395
## CountryThe Gambia                       -5.532e-02  4.670e-01  -0.118  0.90571
## CountryTogo                             -8.430e-01  4.556e-01  -1.850  0.06429
## CountryTonga                            -9.238e-02  4.943e-01  -0.187  0.85173
## CountryTrinidad and Tobago                      NA         NA      NA       NA
## CountryTunisia                          -2.721e-01  4.341e-01  -0.627  0.53081
## CountryTurkey                           -4.081e-01  4.581e-01  -0.891  0.37305
## CountryTurkmenistan                     -1.848e-01  4.394e-01  -0.421  0.67407
## CountryTuvalu                            2.326e-01  4.633e-01   0.502  0.61559
## CountryUganda                            4.805e-02  4.481e-01   0.107  0.91460
## CountryUkraine                          -4.707e-02  4.192e-01  -0.112  0.91059
## CountryUnited Arab Emirates             -5.428e-01  4.380e-01  -1.239  0.21523
## CountryUnited Kingdom                   -5.522e-02  4.017e-01  -0.137  0.89066
## CountryUnited States of America                 NA         NA      NA       NA
## CountryUzbekistan                        5.696e-01  4.432e-01   1.285  0.19871
## CountryVanuatu                                  NA         NA      NA       NA
## CountryVatican City                             NA         NA      NA       NA
## CountryVietnam                                  NA         NA      NA       NA
## CountryYemen                            -2.095e-01  4.521e-01  -0.463  0.64315
## CountryZambia                           -3.338e-02  4.314e-01  -0.077  0.93832
## CountryZimbabwe                                 NA         NA      NA       NA
## Item.TypeBeverages                      -2.617e-01  1.351e-01  -1.936  0.05283
## Item.TypeCereal                          7.359e-02  1.160e-01   0.635  0.52569
## Item.TypeClothes                        -1.350e-01  1.174e-01  -1.150  0.25021
## Item.TypeCosmetics                       5.541e-02  1.333e-01   0.416  0.67762
## Item.TypeFruits                         -2.580e-01  1.406e-01  -1.835  0.06647
## Item.TypeHousehold                      -4.127e-02  1.382e-01  -0.299  0.76528
## Item.TypeMeat                           -2.611e-01  1.595e-01  -1.636  0.10174
## Item.TypeOffice Supplies                -1.228e-01  1.542e-01  -0.797  0.42572
## Item.TypePersonal Care                  -1.492e-01  1.286e-01  -1.160  0.24612
## Item.TypeSnacks                         -1.603e-01  1.201e-01  -1.334  0.18205
## Item.TypeVegetables                     -2.212e-01  1.188e-01  -1.862  0.06261
## Order.PriorityH                         -9.341e-02  6.718e-02  -1.390  0.16438
## Order.PriorityL                         -5.083e-02  6.718e-02  -0.757  0.44932
## Order.PriorityM                         -6.427e-02  6.726e-02  -0.956  0.33931
## Units.Sold                               2.302e-05  1.495e-05   1.540  0.12349
## Unit.Price                                      NA         NA      NA       NA
## Unit.Cost                                       NA         NA      NA       NA
## Total.Cost                               9.646e-08  6.917e-08   1.395  0.16312
## Total.Profit                            -5.653e-07  2.316e-07  -2.441  0.01463
## Total.Revenue                                   NA         NA      NA       NA
## Days                                     2.868e-04  1.626e-03   0.176  0.86001
## Order.DayMon                             5.713e-02  8.863e-02   0.645  0.51918
## Order.DaySat                             8.407e-02  8.936e-02   0.941  0.34684
## Order.DaySun                             7.435e-02  8.813e-02   0.844  0.39882
## Order.DayThu                             2.778e-02  8.933e-02   0.311  0.75581
## Order.DayTue                             1.913e-02  8.853e-02   0.216  0.82893
## Order.DayWed                             1.111e-01  8.876e-02   1.252  0.21050
## Order.MonthAug                           8.742e-02  1.169e-01   0.748  0.45447
## Order.MonthDec                           2.089e-02  1.181e-01   0.177  0.85962
## Order.MonthFeb                           1.929e-01  1.164e-01   1.657  0.09756
## Order.MonthJan                           1.568e-02  1.134e-01   0.138  0.89008
## Order.MonthJul                           2.007e-01  1.136e-01   1.767  0.07717
## Order.MonthJun                           1.342e-01  1.141e-01   1.176  0.23954
## Order.MonthMar                           9.468e-02  1.124e-01   0.842  0.39974
## Order.MonthMay                          -5.180e-02  1.139e-01  -0.455  0.64936
## Order.MonthNov                           1.568e-01  1.210e-01   1.295  0.19520
## Order.MonthOct                           7.551e-02  1.182e-01   0.639  0.52294
## Order.MonthSep                           1.055e-01  1.200e-01   0.879  0.37928
## Order.Year                               7.297e-03  1.094e-02   0.667  0.50476
##                                           
## (Intercept)                               
## RegionAustralia and Oceania               
## RegionCentral America and the Caribbean   
## RegionEurope                              
## RegionMiddle East and North Africa        
## RegionNorth America                       
## RegionSub-Saharan Africa                . 
## CountryAlbania                            
## CountryAlgeria                            
## CountryAndorra                            
## CountryAngola                           **
## CountryAntigua and Barbuda                
## CountryArmenia                            
## CountryAustralia                          
## CountryAustria                            
## CountryAzerbaijan                         
## CountryBahrain                            
## CountryBangladesh                         
## CountryBarbados                         * 
## CountryBelarus                            
## CountryBelgium                            
## CountryBelize                             
## CountryBenin                            * 
## CountryBhutan                             
## CountryBosnia and Herzegovina             
## CountryBotswana                           
## CountryBrunei                             
## CountryBulgaria                           
## CountryBurkina Faso                       
## CountryBurundi                          . 
## CountryCambodia                           
## CountryCameroon                           
## CountryCanada                           * 
## CountryCape Verde                         
## CountryCentral African Republic           
## CountryChad                               
## CountryChina                              
## CountryComoros                            
## CountryCosta Rica                         
## CountryCote d'Ivoire                      
## CountryCroatia                            
## CountryCuba                               
## CountryCyprus                             
## CountryCzech Republic                     
## CountryDemocratic Republic of the Congo . 
## CountryDenmark                            
## CountryDjibouti                           
## CountryDominica                           
## CountryDominican Republic                 
## CountryEast Timor                       * 
## CountryEgypt                              
## CountryEl Salvador                        
## CountryEquatorial Guinea                . 
## CountryEritrea                            
## CountryEstonia                            
## CountryEthiopia                           
## CountryFederated States of Micronesia     
## CountryFiji                               
## CountryFinland                            
## CountryFrance                             
## CountryGabon                            . 
## CountryGeorgia                            
## CountryGermany                            
## CountryGhana                              
## CountryGreece                             
## CountryGreenland                          
## CountryGrenada                            
## CountryGuatemala                          
## CountryGuinea                             
## CountryGuinea-Bissau                      
## CountryHaiti                              
## CountryHonduras                           
## CountryHungary                            
## CountryIceland                            
## CountryIndia                              
## CountryIndonesia                          
## CountryIran                               
## CountryIraq                               
## CountryIreland                            
## CountryIsrael                             
## CountryItaly                              
## CountryJamaica                            
## CountryJapan                              
## CountryJordan                             
## CountryKazakhstan                         
## CountryKenya                              
## CountryKiribati                           
## CountryKosovo                             
## CountryKuwait                             
## CountryKyrgyzstan                         
## CountryLaos                               
## CountryLatvia                             
## CountryLebanon                            
## CountryLesotho                          . 
## CountryLiberia                            
## CountryLibya                              
## CountryLiechtenstein                      
## CountryLithuania                          
## CountryLuxembourg                         
## CountryMacedonia                          
## CountryMadagascar                         
## CountryMalawi                             
## CountryMalaysia                           
## CountryMaldives                           
## CountryMali                               
## CountryMalta                            . 
## CountryMarshall Islands                   
## CountryMauritania                         
## CountryMauritius                          
## CountryMexico                             
## CountryMoldova                            
## CountryMonaco                             
## CountryMongolia                           
## CountryMontenegro                         
## CountryMorocco                            
## CountryMozambique                       * 
## CountryMyanmar                            
## CountryNamibia                            
## CountryNauru                              
## CountryNepal                            . 
## CountryNetherlands                        
## CountryNew Zealand                        
## CountryNicaragua                          
## CountryNiger                              
## CountryNigeria                          * 
## CountryNorth Korea                        
## CountryNorway                             
## CountryOman                               
## CountryPakistan                           
## CountryPalau                              
## CountryPanama                             
## CountryPapua New Guinea                   
## CountryPhilippines                        
## CountryPoland                             
## CountryPortugal                           
## CountryQatar                              
## CountryRepublic of the Congo              
## CountryRomania                            
## CountryRussia                             
## CountryRwanda                             
## CountrySaint Kitts and Nevis              
## CountrySaint Lucia                        
## CountrySaint Vincent and the Grenadines   
## CountrySamoa                              
## CountrySan Marino                         
## CountrySao Tome and Principe              
## CountrySaudi Arabia                       
## CountrySenegal                            
## CountrySerbia                             
## CountrySeychelles                         
## CountrySierra Leone                       
## CountrySingapore                          
## CountrySlovakia                           
## CountrySlovenia                           
## CountrySolomon Islands                    
## CountrySomalia                            
## CountrySouth Africa                       
## CountrySouth Korea                        
## CountrySouth Sudan                        
## CountrySpain                              
## CountrySri Lanka                          
## CountrySudan                            * 
## CountrySwaziland                          
## CountrySweden                             
## CountrySwitzerland                        
## CountrySyria                              
## CountryTaiwan                             
## CountryTajikistan                       . 
## CountryTanzania                           
## CountryThailand                           
## CountryThe Bahamas                        
## CountryThe Gambia                         
## CountryTogo                             . 
## CountryTonga                              
## CountryTrinidad and Tobago                
## CountryTunisia                            
## CountryTurkey                             
## CountryTurkmenistan                       
## CountryTuvalu                             
## CountryUganda                             
## CountryUkraine                            
## CountryUnited Arab Emirates               
## CountryUnited Kingdom                     
## CountryUnited States of America           
## CountryUzbekistan                         
## CountryVanuatu                            
## CountryVatican City                       
## CountryVietnam                            
## CountryYemen                              
## CountryZambia                             
## CountryZimbabwe                           
## Item.TypeBeverages                      . 
## Item.TypeCereal                           
## Item.TypeClothes                          
## Item.TypeCosmetics                        
## Item.TypeFruits                         . 
## Item.TypeHousehold                        
## Item.TypeMeat                             
## Item.TypeOffice Supplies                  
## Item.TypePersonal Care                    
## Item.TypeSnacks                           
## Item.TypeVegetables                     . 
## Order.PriorityH                           
## Order.PriorityL                           
## Order.PriorityM                           
## Units.Sold                                
## Unit.Price                                
## Unit.Cost                                 
## Total.Cost                                
## Total.Profit                            * 
## Total.Revenue                             
## Days                                      
## Order.DayMon                              
## Order.DaySat                              
## Order.DaySun                              
## Order.DayThu                              
## Order.DayTue                              
## Order.DayWed                              
## Order.MonthAug                            
## Order.MonthDec                            
## Order.MonthFeb                          . 
## Order.MonthJan                            
## Order.MonthJul                          . 
## Order.MonthJun                            
## Order.MonthMar                            
## Order.MonthMay                            
## Order.MonthNov                            
## Order.MonthOct                            
## Order.MonthSep                            
## Order.Year                                
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 10396  on 7499  degrees of freedom
## Residual deviance: 10198  on 7279  degrees of freedom
## AIC: 10640
## 
## Number of Fisher Scoring iterations: 4
# fitted probabilities of "Online" for the rows the model was trained on
glm.probs2 <- predict(glm.log2, type = "response")

contrasts(dataset10000_2$Sales.Channel)
##         Online
## Offline      0
## Online       1
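# label every row "Offline", then switch to "Online" where the probability exceeds 0.5
glm.pred2 <- rep("Offline", nrow(dataset10000_2_train))
glm.pred2[glm.probs2 > .5] <- "Online"
# confusion matrix: predicted class (rows) against actual class (columns)
table(glm.pred2, dataset10000_2_train$Sales.Channel)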
##          
## glm.pred2 Offline Online
##   Offline    2062   1597
##   Online     1650   2191
mean(glm.pred2 == dataset10000_2_train$Sales.Channel)
## [1] 0.5670667

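This means the model correctly classified about 57% of the training observations (4,253 of 7,500). Because the two channels are almost evenly split in the training data, that is only modestly better than the roughly 50% expected from guessing.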
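As a cross-check, the accuracy can be recomputed by hand from the confusion matrix. The sketch below copies the four counts from the `table()` output above; the names `conf_mat`, `accuracy`, and `baseline` are introduced here for illustration only and are not part of the original analysis.

```r
# Counts copied from the table() output above:
# rows are predictions, columns are the actual Sales.Channel values.
conf_mat <- matrix(c(2062, 1650, 1597, 2191), nrow = 2,
                   dimnames = list(predicted = c("Offline", "Online"),
                                   actual    = c("Offline", "Online")))

# Accuracy is the share of observations on the diagonal.
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)

# Baseline: always predict whichever actual class is more frequent.
baseline <- max(colSums(conf_mat)) / sum(conf_mat)

print(round(c(accuracy = accuracy, baseline = baseline), 4))
```

Comparing the two numbers shows how much of the 57% is real signal: the majority-class baseline is already about 50.5%.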

Run on the test data of the 10000 Sales file

Let’s check the test data.
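The code that follows in this section fits a fresh model to the test split. Another common way to check test data is to keep the model fitted on the training rows and score the test rows with `predict(..., newdata = )`. The sketch below shows that pattern on simulated data, not on the Sales files; `sim`, `train`, `test`, `fit`, and the single predictor `x` are all hypothetical, and the 75/25 split is an assumption that mirrors the 7,500-row training set.

```r
set.seed(622)  # arbitrary seed so the simulated example is reproducible

# Simulated stand-in data (hypothetical): a two-level outcome that
# depends weakly on one numeric predictor.
n   <- 1000
sim <- data.frame(x = rnorm(n))
sim$channel <- factor(ifelse(runif(n) < plogis(0.8 * sim$x), "Online", "Offline"),
                      levels = c("Offline", "Online"))

# 75/25 train/test split
train_idx <- sample(n, size = 0.75 * n)
train <- sim[train_idx, ]
test  <- sim[-train_idx, ]

# Fit on the training rows only
fit <- glm(channel ~ x, data = train, family = binomial)

# Score the held-out rows with the training-fitted model
test_probs <- predict(fit, newdata = test, type = "response")
test_pred  <- ifelse(test_probs > 0.5, "Online", "Offline")

# Confusion matrix and accuracy on rows the model never saw
table(predicted = test_pred, actual = test$channel)
mean(test_pred == test$channel)
```

With a factor such as `Country`, this pattern also requires that every level appearing in the test rows was present in the training rows; otherwise `predict()` stops with an error about new factor levels.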

# refit the same logistic regression formula on the test split
glm.log_test2 <- glm(Sales.Channel ~ Region + Country + Item.Type + Order.Priority +
                       Units.Sold + Unit.Price + Unit.Cost + Total.Cost +
                       Total.Profit + Total.Revenue +
                       Days + Order.Day + Order.Month + Order.Year,
                     data = dataset10000_2_test, family = binomial)

summary(glm.log_test2)
## 
## Call:
## glm(formula = Sales.Channel ~ Region + Country + Item.Type + 
##     Order.Priority + Units.Sold + Unit.Price + Unit.Cost + Total.Cost + 
##     Total.Profit + Total.Revenue + Days + Order.Day + Order.Month + 
##     Order.Year, family = binomial, data = dataset10000_2_test)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.0024  -1.1087   0.5845   1.0877   2.0149  
## 
## Coefficients: (9 not defined because of singularities)
##                                           Estimate Std. Error z value Pr(>|z|)
## (Intercept)                              3.826e+01  4.074e+01   0.939   0.3477
## RegionAustralia and Oceania              4.006e-01  8.236e-01   0.486   0.6266
## RegionCentral America and the Caribbean -4.623e-01  8.472e-01  -0.546   0.5853
## RegionEurope                            -8.319e-01  8.550e-01  -0.973   0.3306
## RegionMiddle East and North Africa       1.848e-01  7.610e-01   0.243   0.8082
## RegionNorth America                     -3.200e-01  7.727e-01  -0.414   0.6788
## RegionSub-Saharan Africa                 1.607e-01  7.625e-01   0.211   0.8330
## CountryAlbania                           7.401e-01  8.510e-01   0.870   0.3845
## CountryAlgeria                          -1.041e+00  7.792e-01  -1.336   0.1817
## CountryAndorra                           8.006e-01  8.047e-01   0.995   0.3198
## CountryAngola                           -4.364e-01  7.225e-01  -0.604   0.5459
## CountryAntigua and Barbuda              -7.688e-02  9.369e-01  -0.082   0.9346
## CountryArmenia                          -5.427e-01  1.070e+00  -0.507   0.6119
## CountryAustralia                        -8.536e-01  8.048e-01  -1.061   0.2888
## CountryAustria                           2.691e-01  8.979e-01   0.300   0.7644
## CountryAzerbaijan                       -5.454e-01  7.422e-01  -0.735   0.4624
## CountryBahrain                          -5.096e-01  7.566e-01  -0.674   0.5006
## CountryBangladesh                        2.364e-01  7.228e-01   0.327   0.7436
## CountryBarbados                         -3.943e-01  9.079e-01  -0.434   0.6640
## CountryBelarus                           1.858e-01  9.000e-01   0.206   0.8365
## CountryBelgium                          -1.581e-01  1.096e+00  -0.144   0.8853
## CountryBelize                           -2.126e-01  8.441e-01  -0.252   0.8012
## CountryBenin                            -8.352e-01  7.957e-01  -1.050   0.2939
## CountryBhutan                           -8.967e-01  8.021e-01  -1.118   0.2636
## CountryBosnia and Herzegovina            2.372e-02  1.016e+00   0.023   0.9814
## CountryBotswana                         -4.807e-01  6.808e-01  -0.706   0.4802
## CountryBrunei                            8.619e-01  8.731e-01   0.987   0.3235
## CountryBulgaria                          8.705e-01  8.858e-01   0.983   0.3257
## CountryBurkina Faso                     -3.201e-01  7.608e-01  -0.421   0.6740
## CountryBurundi                          -1.094e+00  7.442e-01  -1.470   0.1415
## CountryCambodia                          2.244e-01  7.524e-01   0.298   0.7655
## CountryCameroon                         -1.821e+00  8.494e-01  -2.144   0.0320
## CountryCanada                           -5.098e-02  7.711e-01  -0.066   0.9473
## CountryCape Verde                       -8.689e-01  7.337e-01  -1.184   0.2363
## CountryCentral African Republic         -9.248e-01  7.982e-01  -1.159   0.2466
## CountryChad                             -1.063e-01  9.072e-01  -0.117   0.9067
## CountryChina                            -5.000e-01  7.919e-01  -0.631   0.5278
## CountryComoros                          -6.388e-01  7.456e-01  -0.857   0.3916
## CountryCosta Rica                        1.788e-01  8.866e-01   0.202   0.8402
## CountryCote d'Ivoire                     3.194e-02  8.288e-01   0.039   0.9693
## CountryCroatia                           2.867e-01  8.163e-01   0.351   0.7254
## CountryCuba                              6.699e-01  9.017e-01   0.743   0.4575
## CountryCyprus                            9.295e-01  8.721e-01   1.066   0.2865
## CountryCzech Republic                    2.535e-01  8.320e-01   0.305   0.7606
## CountryDemocratic Republic of the Congo -9.230e-01  7.519e-01  -1.227   0.2197
## CountryDenmark                           3.231e-02  8.731e-01   0.037   0.9705
## CountryDjibouti                          9.193e-01  8.340e-01   1.102   0.2704
## CountryDominica                         -1.017e+00  9.299e-01  -1.094   0.2740
## CountryDominican Republic               -5.841e-02  8.094e-01  -0.072   0.9425
## CountryEast Timor                       -5.749e-01  8.303e-01  -0.692   0.4887
## CountryEgypt                            -8.048e-01  8.061e-01  -0.998   0.3181
## CountryEl Salvador                       8.381e-01  8.498e-01   0.986   0.3240
## CountryEquatorial Guinea                 2.092e-01  8.938e-01   0.234   0.8149
## CountryEritrea                           1.416e+00  1.196e+00   1.184   0.2365
## CountryEstonia                           1.585e-01  8.232e-01   0.193   0.8473
## CountryEthiopia                          2.614e-01  7.656e-01   0.341   0.7327
## CountryFederated States of Micronesia   -3.483e-01  8.958e-01  -0.389   0.6974
## CountryFiji                              5.331e-01  1.015e+00   0.525   0.5993
## CountryFinland                           1.872e+00  1.033e+00   1.812   0.0699
## CountryFrance                            4.219e-01  8.509e-01   0.496   0.6200
## CountryGabon                            -3.162e-01  7.737e-01  -0.409   0.6828
## CountryGeorgia                           8.536e-01  8.249e-01   1.035   0.3008
## CountryGermany                           7.248e-01  8.842e-01   0.820   0.4124
## CountryGhana                             3.008e-01  7.976e-01   0.377   0.7061
## CountryGreece                           -5.336e-02  8.664e-01  -0.062   0.9509
## CountryGreenland                         2.269e-01  8.680e-01   0.261   0.7937
## CountryGrenada                           3.201e-01  8.544e-01   0.375   0.7079
## CountryGuatemala                         9.443e-01  8.874e-01   1.064   0.2873
## CountryGuinea                           -2.913e-01  6.899e-01  -0.422   0.6729
## CountryGuinea-Bissau                    -9.337e-01  7.026e-01  -1.329   0.1839
## CountryHaiti                            -3.156e-01  8.743e-01  -0.361   0.7181
## CountryHonduras                         -4.138e-01  8.550e-01  -0.484   0.6284
## CountryHungary                           9.500e-01  8.353e-01   1.137   0.2554
## CountryIceland                           5.269e-01  8.386e-01   0.628   0.5298
## CountryIndia                            -2.456e-02  8.115e-01  -0.030   0.9759
## CountryIndonesia                        -9.844e-01  7.624e-01  -1.291   0.1966
## CountryIran                             -7.833e-02  8.404e-01  -0.093   0.9257
## CountryIraq                              1.047e-01  8.919e-01   0.117   0.9065
## CountryIreland                           1.826e+00  9.305e-01   1.962   0.0497
## CountryIsrael                            5.798e-01  9.661e-01   0.600   0.5484
## CountryItaly                             2.063e-01  8.839e-01   0.233   0.8154
## CountryJamaica                          -7.420e-01  9.448e-01  -0.785   0.4322
## CountryJapan                            -3.668e-01  7.504e-01  -0.489   0.6250
## CountryJordan                            3.637e-01  7.958e-01   0.457   0.6476
## CountryKazakhstan                       -1.532e+00  9.920e-01  -1.545   0.1224
## CountryKenya                            -1.198e+00  7.743e-01  -1.547   0.1218
## CountryKiribati                         -1.059e+00  7.563e-01  -1.401   0.1613
## CountryKosovo                           -1.104e-01  8.643e-01  -0.128   0.8983
## CountryKuwait                           -8.392e-01  7.204e-01  -1.165   0.2440
## CountryKyrgyzstan                       -5.593e-01  8.224e-01  -0.680   0.4965
## CountryLaos                             -1.054e-01  7.888e-01  -0.134   0.8937
## CountryLatvia                           -3.771e-01  9.739e-01  -0.387   0.6986
## CountryLebanon                          -5.097e-01  7.852e-01  -0.649   0.5162
## CountryLesotho                          -5.407e-01  7.520e-01  -0.719   0.4721
## CountryLiberia                          -1.248e+00  8.889e-01  -1.404   0.1602
## CountryLibya                            -4.971e-01  7.417e-01  -0.670   0.5027
## CountryLiechtenstein                     5.299e-01  8.421e-01   0.629   0.5291
## CountryLithuania                         2.728e-01  7.662e-01   0.356   0.7218
## CountryLuxembourg                       -8.894e-01  1.038e+00  -0.857   0.3914
## CountryMacedonia                        -5.711e-01  8.882e-01  -0.643   0.5202
## CountryMadagascar                       -1.057e+00  7.849e-01  -1.347   0.1781
## CountryMalawi                           -1.074e+00  7.458e-01  -1.440   0.1498
## CountryMalaysia                         -6.450e-01  7.252e-01  -0.889   0.3738
## CountryMaldives                         -1.618e-01  9.516e-01  -0.170   0.8650
## CountryMali                             -9.796e-02  9.102e-01  -0.108   0.9143
## CountryMalta                            -1.311e+00  1.257e+00  -1.043   0.2969
## CountryMarshall Islands                 -2.440e+00  1.266e+00  -1.928   0.0539
## CountryMauritania                       -9.526e-01  7.501e-01  -1.270   0.2041
## CountryMauritius                        -7.874e-02  7.723e-01  -0.102   0.9188
## CountryMexico                           -2.409e-02  8.430e-01  -0.029   0.9772
## CountryMoldova                           9.004e-02  8.542e-01   0.105   0.9161
## CountryMonaco                            1.445e+00  8.868e-01   1.629   0.1033
## CountryMongolia                         -6.154e-01  7.613e-01  -0.808   0.4189
## CountryMontenegro                        5.110e-01  8.198e-01   0.623   0.5331
## CountryMorocco                          -6.976e-01  6.910e-01  -1.010   0.3127
## CountryMozambique                       -8.577e-01  7.152e-01  -1.199   0.2305
## CountryMyanmar                          -2.439e-01  7.612e-01  -0.320   0.7487
## CountryNamibia                          -2.932e-01  7.337e-01  -0.400   0.6895
## CountryNauru                            -1.447e+00  8.680e-01  -1.667   0.0955
## CountryNepal                            -8.305e-01  8.126e-01  -1.022   0.3067
## CountryNetherlands                       4.737e-01  9.177e-01   0.516   0.6057
## CountryNew Zealand                      -4.041e-01  7.898e-01  -0.512   0.6089
## CountryNicaragua                         3.341e-01  9.193e-01   0.363   0.7163
## CountryNiger                            -1.971e+00  8.438e-01  -2.336   0.0195
## CountryNigeria                           3.309e-01  8.787e-01   0.377   0.7065
## CountryNorth Korea                      -4.894e-01  7.497e-01  -0.653   0.5138
## CountryNorway                            1.262e+00  1.069e+00   1.181   0.2375
## CountryOman                             -3.798e-01  7.371e-01  -0.515   0.6064
## CountryPakistan                          2.425e-01  8.021e-01   0.302   0.7624
## CountryPalau                            -6.201e-01  7.684e-01  -0.807   0.4197
## CountryPanama                            1.137e+00  1.047e+00   1.086   0.2775
## CountryPapua New Guinea                 -5.583e-01  8.199e-01  -0.681   0.4959
## CountryPhilippines                      -3.232e-02  7.989e-01  -0.040   0.9677
## CountryPoland                           -5.750e-01  8.818e-01  -0.652   0.5144
## CountryPortugal                          1.828e+00  1.032e+00   1.771   0.0765
## CountryQatar                            -6.761e-01  7.551e-01  -0.895   0.3705
## CountryRepublic of the Congo             7.342e-02  7.371e-01   0.100   0.9207
## CountryRomania                          -1.908e-02  8.724e-01  -0.022   0.9826
## CountryRussia                           -1.990e-01  8.636e-01  -0.230   0.8178
## CountryRwanda                           -4.771e-01  6.791e-01  -0.703   0.4823
## CountrySaint Kitts and Nevis            -6.664e-01  8.429e-01  -0.791   0.4292
## CountrySaint Lucia                       6.715e-01  9.027e-01   0.744   0.4570
## CountrySaint Vincent and the Grenadines  2.765e-01  8.427e-01   0.328   0.7428
## CountrySamoa                            -3.970e-01  7.975e-01  -0.498   0.6186
## CountrySan Marino                       -2.947e-01  8.880e-01  -0.332   0.7399
## CountrySao Tome and Principe             3.787e-01  7.980e-01   0.475   0.6351
## CountrySaudi Arabia                      2.173e-01  7.627e-01   0.285   0.7757
## CountrySenegal                          -3.456e-01  7.941e-01  -0.435   0.6634
## CountrySerbia                            4.790e-01  8.275e-01   0.579   0.5627
## CountrySeychelles                       -1.418e-01  7.215e-01  -0.197   0.8442
## CountrySierra Leone                      2.627e-01  8.756e-01   0.300   0.7641
## CountrySingapore                         6.322e-01  8.835e-01   0.716   0.4743
## CountrySlovakia                         -2.057e-01  8.600e-01  -0.239   0.8109
## CountrySlovenia                          8.744e-01  8.014e-01   1.091   0.2752
## CountrySolomon Islands                  -1.356e+00  8.937e-01  -1.517   0.1292
## CountrySomalia                          -3.547e-01  7.672e-01  -0.462   0.6438
## CountrySouth Africa                     -4.640e-02  8.285e-01  -0.056   0.9553
## CountrySouth Korea                       8.686e-01  8.639e-01   1.005   0.3147
## CountrySouth Sudan                      -1.999e-01  7.086e-01  -0.282   0.7778
## CountrySpain                             9.002e-01  8.854e-01   1.017   0.3093
## CountrySri Lanka                        -1.266e+00  8.232e-01  -1.538   0.1241
## CountrySudan                            -3.619e-02  7.145e-01  -0.051   0.9596
## CountrySwaziland                        -1.386e+00  7.220e-01  -1.919   0.0550
## CountrySweden                           -6.336e-01  9.405e-01  -0.674   0.5006
## CountrySwitzerland                       1.459e+00  8.135e-01   1.794   0.0729
## CountrySyria                            -1.042e+00  7.796e-01  -1.336   0.1814
## CountryTaiwan                           -1.059e+00  7.925e-01  -1.336   0.1816
## CountryTajikistan                        6.685e-01  1.012e+00   0.661   0.5087
## CountryTanzania                         -5.805e-01  7.861e-01  -0.739   0.4602
## CountryThailand                          1.446e+00  9.461e-01   1.529   0.1263
## CountryThe Bahamas                      -7.063e-01  9.587e-01  -0.737   0.4613
## CountryThe Gambia                       -1.516e+01  3.103e+02  -0.049   0.9610
## CountryTogo                             -5.742e-01  7.137e-01  -0.805   0.4210
## CountryTonga                            -1.173e+00  8.418e-01  -1.393   0.1635
## CountryTrinidad and Tobago                      NA         NA      NA       NA
## CountryTunisia                          -8.475e-01  7.651e-01  -1.108   0.2680
## CountryTurkey                           -1.340e+00  8.847e-01  -1.515   0.1298
## CountryTurkmenistan                     -5.935e-01  7.591e-01  -0.782   0.4343
## CountryTuvalu                           -6.961e-01  7.677e-01  -0.907   0.3645
## CountryUganda                           -1.296e+00  8.127e-01  -1.594   0.1109
## CountryUkraine                           7.304e-01  8.952e-01   0.816   0.4146
## CountryUnited Arab Emirates             -1.457e+00  7.210e-01  -2.021   0.0433
## CountryUnited Kingdom                    1.253e+00  8.470e-01   1.480   0.1390
## CountryUnited States of America                 NA         NA      NA       NA
## CountryUzbekistan                       -1.062e+00  8.305e-01  -1.279   0.2010
## CountryVanuatu                                  NA         NA      NA       NA
## CountryVatican City                             NA         NA      NA       NA
## CountryVietnam                                  NA         NA      NA       NA
## CountryYemen                             8.418e-01  9.521e-01   0.884   0.3766
## CountryZambia                           -6.463e-01  8.358e-01  -0.773   0.4393
## CountryZimbabwe                                 NA         NA      NA       NA
## Item.TypeBeverages                       2.477e-01  2.537e-01   0.977   0.3288
## Item.TypeCereal                          4.283e-02  2.207e-01   0.194   0.8461
## Item.TypeClothes                        -2.379e-01  2.225e-01  -1.069   0.2849
## Item.TypeCosmetics                      -4.978e-04  2.502e-01  -0.002   0.9984
## Item.TypeFruits                          4.334e-01  2.670e-01   1.623   0.1046
## Item.TypeHousehold                      -4.746e-02  2.648e-01  -0.179   0.8577
## Item.TypeMeat                            8.467e-02  2.903e-01   0.292   0.7706
## Item.TypeOffice Supplies                 2.977e-01  2.838e-01   1.049   0.2942
## Item.TypePersonal Care                  -5.350e-02  2.406e-01  -0.222   0.8241
## Item.TypeSnacks                          9.295e-02  2.320e-01   0.401   0.6887
## Item.TypeVegetables                     -2.573e-01  2.199e-01  -1.170   0.2419
## Order.PriorityH                         -1.365e-01  1.221e-01  -1.118   0.2635
## Order.PriorityL                         -1.238e-02  1.235e-01  -0.100   0.9202
## Order.PriorityM                         -5.833e-02  1.247e-01  -0.468   0.6400
## Units.Sold                              -9.806e-06  2.776e-05  -0.353   0.7239
## Unit.Price                                      NA         NA      NA       NA
## Unit.Cost                                       NA         NA      NA       NA
## Total.Cost                              -1.821e-07  1.264e-07  -1.440   0.1498
## Total.Profit                             4.648e-07  4.267e-07   1.089   0.2760
## Total.Revenue                                   NA         NA      NA       NA
## Days                                    -6.387e-04  3.002e-03  -0.213   0.8315
## Order.DayMon                            -8.152e-02  1.640e-01  -0.497   0.6191
## Order.DaySat                            -6.101e-02  1.685e-01  -0.362   0.7173
## Order.DaySun                             1.534e-01  1.637e-01   0.937   0.3488
## Order.DayThu                             1.396e-01  1.642e-01   0.850   0.3953
## Order.DayTue                             8.353e-02  1.683e-01   0.496   0.6198
## Order.DayWed                             5.096e-02  1.612e-01   0.316   0.7519
## Order.MonthAug                           1.585e-01  2.133e-01   0.743   0.4574
## Order.MonthDec                           5.356e-02  2.058e-01   0.260   0.7947
## Order.MonthFeb                           3.081e-02  2.181e-01   0.141   0.8876
## Order.MonthJan                          -1.883e-01  2.083e-01  -0.904   0.3662
## Order.MonthJul                           5.245e-03  2.037e-01   0.026   0.9795
## Order.MonthJun                           6.941e-02  2.109e-01   0.329   0.7421
## Order.MonthMar                          -2.218e-01  2.104e-01  -1.054   0.2920
## Order.MonthMay                           2.601e-02  2.082e-01   0.125   0.9006
## Order.MonthNov                          -6.481e-02  2.217e-01  -0.292   0.7700
## Order.MonthOct                          -3.519e-01  2.165e-01  -1.625   0.1041
## Order.MonthSep                           1.591e-03  2.181e-01   0.007   0.9942
## Order.Year                              -1.878e-02  2.022e-02  -0.929   0.3531
##                                          
## (Intercept)                              
## RegionAustralia and Oceania              
## RegionCentral America and the Caribbean  
## RegionEurope                             
## RegionMiddle East and North Africa       
## RegionNorth America                      
## RegionSub-Saharan Africa                 
## CountryAlbania                           
## CountryAlgeria                           
## CountryAndorra                           
## CountryAngola                            
## CountryAntigua and Barbuda               
## CountryArmenia                           
## CountryAustralia                         
## CountryAustria                           
## CountryAzerbaijan                        
## CountryBahrain                           
## CountryBangladesh                        
## CountryBarbados                          
## CountryBelarus                           
## CountryBelgium                           
## CountryBelize                            
## CountryBenin                             
## CountryBhutan                            
## CountryBosnia and Herzegovina            
## CountryBotswana                          
## CountryBrunei                            
## CountryBulgaria                          
## CountryBurkina Faso                      
## CountryBurundi                           
## CountryCambodia                          
## CountryCameroon                         *
## CountryCanada                            
## CountryCape Verde                        
## CountryCentral African Republic          
## CountryChad                              
## CountryChina                             
## CountryComoros                           
## CountryCosta Rica                        
## CountryCote d'Ivoire                     
## CountryCroatia                           
## CountryCuba                              
## CountryCyprus                            
## CountryCzech Republic                    
## CountryDemocratic Republic of the Congo  
## CountryDenmark                           
## CountryDjibouti                          
## CountryDominica                          
## CountryDominican Republic                
## CountryEast Timor                        
## CountryEgypt                             
## CountryEl Salvador                       
## CountryEquatorial Guinea                 
## CountryEritrea                           
## CountryEstonia                           
## CountryEthiopia                          
## CountryFederated States of Micronesia    
## CountryFiji                              
## CountryFinland                          .
## CountryFrance                            
## CountryGabon                             
## CountryGeorgia                           
## CountryGermany                           
## CountryGhana                             
## CountryGreece                            
## CountryGreenland                         
## CountryGrenada                           
## CountryGuatemala                         
## CountryGuinea                            
## CountryGuinea-Bissau                     
## CountryHaiti                             
## CountryHonduras                          
## CountryHungary                           
## CountryIceland                           
## CountryIndia                             
## CountryIndonesia                         
## CountryIran                              
## CountryIraq                              
## CountryIreland                          *
## CountryIsrael                            
## CountryItaly                             
## CountryJamaica                           
## CountryJapan                             
## CountryJordan                            
## CountryKazakhstan                        
## CountryKenya                             
## CountryKiribati                          
## CountryKosovo                            
## CountryKuwait                            
## CountryKyrgyzstan                        
## CountryLaos                              
## CountryLatvia                            
## CountryLebanon                           
## CountryLesotho                           
## CountryLiberia                           
## CountryLibya                             
## CountryLiechtenstein                     
## CountryLithuania                         
## CountryLuxembourg                        
## CountryMacedonia                         
## CountryMadagascar                        
## CountryMalawi                            
## CountryMalaysia                          
## CountryMaldives                          
## CountryMali                              
## CountryMalta                             
## CountryMarshall Islands                 .
## CountryMauritania                        
## CountryMauritius                         
## CountryMexico                            
## CountryMoldova                           
## CountryMonaco                            
## CountryMongolia                          
## CountryMontenegro                        
## CountryMorocco                           
## CountryMozambique                        
## CountryMyanmar                           
## CountryNamibia                           
## CountryNauru                            .
## CountryNepal                             
## CountryNetherlands                       
## CountryNew Zealand                       
## CountryNicaragua                         
## CountryNiger                            *
## CountryNigeria                           
## CountryNorth Korea                       
## CountryNorway                            
## CountryOman                              
## CountryPakistan                          
## CountryPalau                             
## CountryPanama                            
## CountryPapua New Guinea                  
## CountryPhilippines                       
## CountryPoland                            
## CountryPortugal                         .
## CountryQatar                             
## CountryRepublic of the Congo             
## CountryRomania                           
## CountryRussia                            
## CountryRwanda                            
## CountrySaint Kitts and Nevis             
## CountrySaint Lucia                       
## CountrySaint Vincent and the Grenadines  
## CountrySamoa                             
## CountrySan Marino                        
## CountrySao Tome and Principe             
## CountrySaudi Arabia                      
## CountrySenegal                           
## CountrySerbia                            
## CountrySeychelles                        
## CountrySierra Leone                      
## CountrySingapore                         
## CountrySlovakia                          
## CountrySlovenia                          
## CountrySolomon Islands                   
## CountrySomalia                           
## CountrySouth Africa                      
## CountrySouth Korea                       
## CountrySouth Sudan                       
## CountrySpain                             
## CountrySri Lanka                         
## CountrySudan                             
## CountrySwaziland                        .
## CountrySweden                            
## CountrySwitzerland                      .
## CountrySyria                             
## CountryTaiwan                            
## CountryTajikistan                        
## CountryTanzania                          
## CountryThailand                          
## CountryThe Bahamas                       
## CountryThe Gambia                        
## CountryTogo                              
## CountryTonga                             
## CountryTrinidad and Tobago               
## CountryTunisia                           
## CountryTurkey                            
## CountryTurkmenistan                      
## CountryTuvalu                            
## CountryUganda                            
## CountryUkraine                           
## CountryUnited Arab Emirates             *
## CountryUnited Kingdom                    
## CountryUnited States of America          
## CountryUzbekistan                        
## CountryVanuatu                           
## CountryVatican City                      
## CountryVietnam                           
## CountryYemen                             
## CountryZambia                            
## CountryZimbabwe                          
## Item.TypeBeverages                       
## Item.TypeCereal                          
## Item.TypeClothes                         
## Item.TypeCosmetics                       
## Item.TypeFruits                          
## Item.TypeHousehold                       
## Item.TypeMeat                            
## Item.TypeOffice Supplies                 
## Item.TypePersonal Care                   
## Item.TypeSnacks                          
## Item.TypeVegetables                      
## Order.PriorityH                          
## Order.PriorityL                          
## Order.PriorityM                          
## Units.Sold                               
## Unit.Price                               
## Unit.Cost                                
## Total.Cost                               
## Total.Profit                             
## Total.Revenue                            
## Days                                     
## Order.DayMon                             
## Order.DaySat                             
## Order.DaySun                             
## Order.DayThu                             
## Order.DayTue                             
## Order.DayWed                             
## Order.MonthAug                           
## Order.MonthDec                           
## Order.MonthFeb                           
## Order.MonthJan                           
## Order.MonthJul                           
## Order.MonthJun                           
## Order.MonthMar                           
## Order.MonthMay                           
## Order.MonthNov                           
## Order.MonthOct                           
## Order.MonthSep                           
## Order.Year                               
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 3464.9  on 2499  degrees of freedom
## Residual deviance: 3217.0  on 2279  degrees of freedom
## AIC: 3659
## 
## Number of Fisher Scoring iterations: 13
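
As a rough check on the summary above, the reported deviances can be compared directly. The sketch below uses only the numbers printed in the output (null deviance 3464.9 on 2499 degrees of freedom, residual deviance 3217.0 on 2279 degrees of freedom). The variable names are illustrative and are not part of the original analysis.

```r
# Values copied from the summary output above
null_dev  <- 3464.9; null_df  <- 2499
resid_dev <- 3217.0; resid_df <- 2279

# Likelihood-ratio statistic: drop in deviance relative to the intercept-only model
lr_stat <- null_dev - resid_dev
lr_df   <- null_df - resid_df      # coefficients added beyond the intercept

# Upper-tail chi-square p-value for the model as a whole
lr_p <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# For a binomial glm on 0/1 outcomes, AIC = residual deviance + 2 * (number of coefficients)
n_coef    <- lr_df + 1             # add the intercept back
aic_check <- resid_dev + 2 * n_coef

print(c(lr_stat = lr_stat, lr_df = lr_df, lr_p = lr_p, aic_check = aic_check))
```

Here aic_check comes to 3659, which matches the AIC printed in the summary. The model uses 221 coefficients on 2,500 rows, most of them Country levels, which is worth keeping in mind when reading the accuracy figure.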

Logistic test dataset results

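# With no newdata argument, predict() returns fitted probabilities for the rows used to fit glm.log_test2
glm.probs_test2 <- predict(glm.log_test2, type = "response")

# Label every row "Offline" by default, then switch to "Online" where the probability exceeds 0.5
glm.pred_test2 <- rep("Offline", nrow(dataset10000_2_test))
glm.pred_test2[glm.probs_test2 > 0.5] <- "Online"
table(glm.pred_test2, dataset10000_2_test$Sales.Channel)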
##               
## glm.pred_test2 Offline Online
##        Offline     739    451
##        Online      488    822
mean(glm.pred_test2 == dataset10000_2_test$Sales.Channel)
## [1] 0.6244
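
To put the 0.6244 figure in context, the confusion matrix above can be broken down further. This is a minimal sketch that re-enters the four printed counts by hand; the names conf_mat, sensitivity, specificity, and baseline are illustrative, and "Online" is treated as the positive class as an assumption.

```r
# Counts copied from the confusion matrix above (rows = predicted, columns = actual)
conf_mat <- matrix(c(739, 488, 451, 822), nrow = 2,
                   dimnames = list(predicted = c("Offline", "Online"),
                                   actual    = c("Offline", "Online")))

# Overall share of correct predictions (diagonal over total)
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)

# Share of actual Online orders predicted as Online
sensitivity <- conf_mat["Online", "Online"] / sum(conf_mat[, "Online"])

# Share of actual Offline orders predicted as Offline
specificity <- conf_mat["Offline", "Offline"] / sum(conf_mat[, "Offline"])

# Accuracy of always guessing the more frequent actual class
baseline <- max(colSums(conf_mat)) / sum(conf_mat)

print(round(c(accuracy = accuracy, sensitivity = sensitivity,
              specificity = specificity, baseline = baseline), 4))
```

Comparing accuracy against baseline shows how much the model adds over always predicting the more common sales channel.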

Conclusion

The logistic model performs better on the larger dataset, because the 100-record dataset is too sparse and exhibits complete separation. On the 10,000-record file, the correct prediction rate is approximately 62% (0.6244).
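
Complete separation, the problem noted for the 100-record file, can be reproduced on a small example. The sketch below uses hypothetical toy data (not the sales data) in which a single predictor splits the outcome perfectly, and it collects the warnings that glm() raises in that situation.

```r
# Toy data (hypothetical): y is 0 whenever x <= 5 and 1 whenever x > 5
toy <- data.frame(x = 1:10, y = as.integer(1:10 > 5))

# Collect warning messages instead of letting them print to the console
seen <- character(0)
fit <- withCallingHandlers(
  glm(y ~ x, data = toy, family = binomial),
  warning = function(w) {
    seen <<- c(seen, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

print(seen)        # warnings raised while fitting
print(coef(fit))   # coefficient estimates grow very large under separation
```

With few rows and many factor levels, as in the 100-record file, it becomes likely that some combination of predictors splits the outcome perfectly in this way, so the coefficient estimates and their standard errors should not be trusted.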