Objective of the Task: The broad objective of this task was to understand how specific product types perform against each other. The outcomes will help the sales team understand how product types might affect sales across the enterprise.

To do this, historical sales data was analysed in order to predict sales volume for a list of new product types. The focus was on predicting sales of four product types (PCs, Laptops, Netbooks and Smartphones) and then assessing the impact that service reviews and customer reviews have on the sales of different product types.

Deliverables: The key deliverables for this task were to:
1. Identify the algorithms tested
2. Select an algorithm and state the reason why it was selected
3. Produce charts that show the impact of customer and service reviews on sales volume
4. Export the predicted findings to an Excel sheet
5. Predict the sales of 4 specific product types - PCs, Laptops, Netbooks and Smartphones

Methodology: Regression was used to build machine learning models for this analysis, using three popular algorithms. Predictions were made with all three algorithms, and the one that performed best on the provided dataset was identified. The detailed steps taken and the code used are presented below.

#import the existing dataset

existing_products <- read.csv("C:/Users/gebruiker/Desktop/Ubiqum_1/existingproductattributes2017.csv")

#import the new products dataset

new_products <- read.csv("C:/Users/gebruiker/Desktop/Ubiqum_1/newproductattributes2017.csv")

#load the caret package - do this at the start of every session so that its functions are available for the task

library(caret)
## Loading required package: lattice
## Loading required package: ggplot2

#dummify the data - typical datasets don't contain only numeric values. Most data will contain a mixture of numeric and nominal data. Dummifying makes it possible to use both (numeric and nominal data) when developing regression models and making predictions.

#How to dummify the data - convert categorical variables (factor and character variables) to binary variables using the process below

#dummify the data - step 1 - build a dummy-variable encoder (a dummyVars object) from the existing products data

newDataFrame <- dummyVars(" ~ .", data = existing_products)

#step 2 - apply the encoder (newDataFrame) to the existing products data with predict() and store the result as a data frame called readyData

readyData <- data.frame(predict(newDataFrame, newdata = existing_products))
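To make the effect of dummifying concrete, here is a minimal sketch on a made-up four-row data frame. It uses base R's model.matrix() instead of caret's dummyVars(), purely so that it runs without any packages; the toy values and the names toy and encoded are assumptions for illustration, not part of the analysis.

```r
# Toy data standing in for the product attributes (hypothetical values)
toy <- data.frame(
  ProductType = factor(c("PC", "Laptop", "PC", "Netbook")),
  Price       = c(949, 410, 399, 329)
)

# "- 1" drops the intercept, so the factor is expanded into one
# 0/1 indicator column per level instead of (levels - 1) columns
encoded <- as.data.frame(model.matrix(~ . - 1, data = toy))

# Every column is now numeric; each row has exactly one product-type flag set
str(encoded)
```

Note that model.matrix() names the indicator columns without a separator (ProductTypePC), whereas the dummyVars() output above uses a dot (ProductType.PC).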

#cross-check to ensure there are no nominal variables-check the structure

str(readyData)
## 'data.frame':    80 obs. of  29 variables:
##  $ ProductType.Accessories     : num  0 0 0 0 0 1 1 1 1 1 ...
##  $ ProductType.Display         : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ ProductType.ExtendedWarranty: num  0 0 0 0 0 0 0 0 0 0 ...
##  $ ProductType.GameConsole     : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ ProductType.Laptop          : num  0 0 0 1 1 0 0 0 0 0 ...
##  $ ProductType.Netbook         : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ ProductType.PC              : num  1 1 1 0 0 0 0 0 0 0 ...
##  $ ProductType.Printer         : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ ProductType.PrinterSupplies : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ ProductType.Smartphone      : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ ProductType.Software        : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ ProductType.Tablet          : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ ProductNum                  : num  101 102 103 104 105 106 107 108 109 110 ...
##  $ Price                       : num  949 2250 399 410 1080 ...
##  $ x5StarReviews               : num  3 2 3 49 58 83 11 33 16 10 ...
##  $ x4StarReviews               : num  3 1 0 19 31 30 3 19 9 1 ...
##  $ x3StarReviews               : num  2 0 0 8 11 10 0 12 2 1 ...
##  $ x2StarReviews               : num  0 0 0 3 7 9 0 5 0 0 ...
##  $ x1StarReviews               : num  0 0 0 9 36 40 1 9 2 0 ...
##  $ PositiveServiceReview       : num  2 1 1 7 7 12 3 5 2 2 ...
##  $ NegativeServiceReview       : num  0 0 0 8 20 5 0 3 1 0 ...
##  $ Recommendproduct            : num  0.9 0.9 0.9 0.8 0.7 0.3 0.9 0.7 0.8 0.9 ...
##  $ BestSellersRank             : num  1967 4806 12076 109 268 ...
##  $ ShippingWeight              : num  25.8 50 17.4 5.7 7 1.6 7.3 12 1.8 0.75 ...
##  $ ProductDepth                : num  23.9 35 10.5 15 12.9 ...
##  $ ProductWidth                : num  6.62 31.75 8.3 9.9 0.3 ...
##  $ ProductHeight               : num  16.9 19 10.2 1.3 8.9 ...
##  $ ProfitMargin                : num  0.15 0.25 0.08 0.08 0.09 0.05 0.05 0.05 0.05 0.05 ...
##  $ Volume                      : num  12 8 12 196 232 332 44 132 64 40 ...

#check for missing data - summary() prints an NA's row under every column that contains missing values

summary(readyData)
##  ProductType.Accessories ProductType.Display ProductType.ExtendedWarranty
##  Min.   :0.000           Min.   :0.0000      Min.   :0.000               
##  1st Qu.:0.000           1st Qu.:0.0000      1st Qu.:0.000               
##  Median :0.000           Median :0.0000      Median :0.000               
##  Mean   :0.325           Mean   :0.0625      Mean   :0.125               
##  3rd Qu.:1.000           3rd Qu.:0.0000      3rd Qu.:0.000               
##  Max.   :1.000           Max.   :1.0000      Max.   :1.000               
##                                                                          
##  ProductType.GameConsole ProductType.Laptop ProductType.Netbook
##  Min.   :0.000           Min.   :0.0000     Min.   :0.000      
##  1st Qu.:0.000           1st Qu.:0.0000     1st Qu.:0.000      
##  Median :0.000           Median :0.0000     Median :0.000      
##  Mean   :0.025           Mean   :0.0375     Mean   :0.025      
##  3rd Qu.:0.000           3rd Qu.:0.0000     3rd Qu.:0.000      
##  Max.   :1.000           Max.   :1.0000     Max.   :1.000      
##                                                                
##  ProductType.PC ProductType.Printer ProductType.PrinterSupplies
##  Min.   :0.00   Min.   :0.00        Min.   :0.0000             
##  1st Qu.:0.00   1st Qu.:0.00        1st Qu.:0.0000             
##  Median :0.00   Median :0.00        Median :0.0000             
##  Mean   :0.05   Mean   :0.15        Mean   :0.0375             
##  3rd Qu.:0.00   3rd Qu.:0.00        3rd Qu.:0.0000             
##  Max.   :1.00   Max.   :1.00        Max.   :1.0000             
##                                                                
##  ProductType.Smartphone ProductType.Software ProductType.Tablet
##  Min.   :0.00           Min.   :0.000        Min.   :0.0000    
##  1st Qu.:0.00           1st Qu.:0.000        1st Qu.:0.0000    
##  Median :0.00           Median :0.000        Median :0.0000    
##  Mean   :0.05           Mean   :0.075        Mean   :0.0375    
##  3rd Qu.:0.00           3rd Qu.:0.000        3rd Qu.:0.0000    
##  Max.   :1.00           Max.   :1.000        Max.   :1.0000    
##                                                                
##    ProductNum        Price         x5StarReviews    x4StarReviews   
##  Min.   :101.0   Min.   :   3.60   Min.   :   0.0   Min.   :  0.00  
##  1st Qu.:120.8   1st Qu.:  52.66   1st Qu.:  10.0   1st Qu.:  2.75  
##  Median :140.5   Median : 132.72   Median :  50.0   Median : 22.00  
##  Mean   :142.6   Mean   : 247.25   Mean   : 176.2   Mean   : 40.20  
##  3rd Qu.:160.2   3rd Qu.: 352.49   3rd Qu.: 306.5   3rd Qu.: 33.00  
##  Max.   :200.0   Max.   :2249.99   Max.   :2801.0   Max.   :431.00  
##                                                                     
##  x3StarReviews    x2StarReviews    x1StarReviews     PositiveServiceReview
##  Min.   :  0.00   Min.   :  0.00   Min.   :   0.00   Min.   :  0.00       
##  1st Qu.:  2.00   1st Qu.:  1.00   1st Qu.:   2.00   1st Qu.:  2.00       
##  Median :  7.00   Median :  3.00   Median :   8.50   Median :  5.50       
##  Mean   : 14.79   Mean   : 13.79   Mean   :  37.67   Mean   : 51.75       
##  3rd Qu.: 11.25   3rd Qu.:  7.00   3rd Qu.:  15.25   3rd Qu.: 42.00       
##  Max.   :162.00   Max.   :370.00   Max.   :1654.00   Max.   :536.00       
##                                                                           
##  NegativeServiceReview Recommendproduct BestSellersRank ShippingWeight   
##  Min.   :  0.000       Min.   :0.100    Min.   :    1   Min.   : 0.0100  
##  1st Qu.:  1.000       1st Qu.:0.700    1st Qu.:    7   1st Qu.: 0.5125  
##  Median :  3.000       Median :0.800    Median :   27   Median : 2.1000  
##  Mean   :  6.225       Mean   :0.745    Mean   : 1126   Mean   : 9.6681  
##  3rd Qu.:  6.250       3rd Qu.:0.900    3rd Qu.:  281   3rd Qu.:11.2050  
##  Max.   :112.000       Max.   :1.000    Max.   :17502   Max.   :63.0000  
##                                         NA's   :15                       
##   ProductDepth      ProductWidth    ProductHeight     ProfitMargin   
##  Min.   :  0.000   Min.   : 0.000   Min.   : 0.000   Min.   :0.0500  
##  1st Qu.:  4.775   1st Qu.: 1.750   1st Qu.: 0.400   1st Qu.:0.0500  
##  Median :  7.950   Median : 6.800   Median : 3.950   Median :0.1200  
##  Mean   : 14.425   Mean   : 7.819   Mean   : 6.259   Mean   :0.1545  
##  3rd Qu.: 15.025   3rd Qu.:11.275   3rd Qu.:10.300   3rd Qu.:0.2000  
##  Max.   :300.000   Max.   :31.750   Max.   :25.800   Max.   :0.4000  
##                                                                      
##      Volume     
##  Min.   :    0  
##  1st Qu.:   40  
##  Median :  200  
##  Mean   :  705  
##  3rd Qu.: 1226  
##  Max.   :11204  
## 
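Scanning a long summary() printout for an NA's row is easy to get wrong. As a hedged alternative, the missing values can be counted per column directly. The sketch below uses a small made-up data frame (toy and na_counts are illustrative names) rather than the real readyData.

```r
# Toy data with missing values in one column (hypothetical values)
toy <- data.frame(
  Price           = c(949, 410, 399, 329),
  BestSellersRank = c(1967, NA, 109, NA),
  Volume          = c(12, 196, 12, 232)
)

# is.na() gives a logical matrix; colSums() counts the TRUEs per column
na_counts <- colSums(is.na(toy))

# Show only the columns that actually contain missing values
na_counts[na_counts > 0]
```

Applied to readyData, the same two lines should list only BestSellersRank, in line with the NA's : 15 entry in the summary above.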

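#delete BestSellersRank, the only column with missing data (15 NA's in the summary above); ProductHeight is removed here as well, although the summary shows no NA's for it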

readyData$ProductHeight <- NULL
readyData$BestSellersRank <- NULL
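Instead of naming the columns by hand, the columns containing missing values can be dropped programmatically. This is a sketch on made-up data, with toy, complete_cols and toy_clean as illustrative names. It removes only columns that really contain NA's, so on readyData it would not remove ProductHeight.

```r
# Toy data with missing values in one column (hypothetical values)
toy <- data.frame(
  Price           = c(949, 410, 399, 329),
  BestSellersRank = c(1967, NA, 109, NA),
  Volume          = c(12, 196, 12, 232)
)

# TRUE for every column without a single NA
complete_cols <- colSums(is.na(toy)) == 0

# Keep only the complete columns; drop = FALSE keeps the result a data frame
toy_clean <- toy[, complete_cols, drop = FALSE]

names(toy_clean)
```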

#Find the correlation between the relevant independent variables and the dependent variable

corrData <- cor(readyData)
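Because the full correlation matrix is large, it can help to pull out only the Volume column and rank the predictors by the strength of their correlation with it. The sketch below does this on a small made-up data frame; toy, corr_toy, vol_corr and ranked are illustrative names, and the values are not the real dataset.

```r
# Toy data (hypothetical values); Volume is the dependent variable
toy <- data.frame(
  x5StarReviews         = c(3, 49, 58, 83, 11),
  PositiveServiceReview = c(2, 7, 7, 12, 3),
  Price                 = c(949, 410, 1080, 115, 300),
  Volume                = c(12, 196, 232, 332, 44)
)

corr_toy <- cor(toy)

# Correlation of every variable with Volume, excluding Volume itself
vol_corr <- corr_toy[, "Volume"]
vol_corr <- vol_corr[names(vol_corr) != "Volume"]

# Sort by absolute value so strong negative correlations rank high too
ranked <- vol_corr[order(abs(vol_corr), decreasing = TRUE)]
ranked
```

The same indexing, corrData[, "Volume"], should work on the real matrix, since cor() keeps the column names of readyData.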

#print the correlation matrix

corrData
##                              ProductType.Accessories ProductType.Display
## ProductType.Accessories                  1.000000000         -0.17916128
## ProductType.Display                     -0.179161283          1.00000000
## ProductType.ExtendedWarranty            -0.262265264         -0.09759001
## ProductType.GameConsole                 -0.111111111         -0.04134491
## ProductType.Laptop                      -0.136963567         -0.05096472
## ProductType.Netbook                     -0.111111111         -0.04134491
## ProductType.PC                          -0.159188978         -0.05923489
## ProductType.Printer                     -0.291491544         -0.10846523
## ProductType.PrinterSupplies             -0.136963567         -0.05096472
## ProductType.Smartphone                  -0.159188978         -0.05923489
## ProductType.Software                    -0.197582993         -0.07352146
## ProductType.Tablet                      -0.136963567         -0.05096472
## ProductNum                              -0.338862490          0.08407390
## Price                                   -0.384906124          0.23172981
## x5StarReviews                            0.127803771         -0.03758386
## x4StarReviews                            0.156715126         -0.00293832
## x3StarReviews                            0.110608918         -0.03849540
## x2StarReviews                            0.033055555         -0.02708636
## x1StarReviews                           -0.041647041         -0.03628464
## PositiveServiceReview                    0.002699224         -0.09438421
## NegativeServiceReview                   -0.148034357         -0.01861755
## Recommendproduct                         0.058505351          0.07239820
## ShippingWeight                          -0.341367875          0.10374059
## ProductDepth                             0.191398963          0.01528395
## ProductWidth                            -0.154462467          0.28447123
## ProfitMargin                            -0.626935212          0.03906690
## Volume                                   0.127803771         -0.03758386
##                              ProductType.ExtendedWarranty
## ProductType.Accessories                       -0.26226526
## ProductType.Display                           -0.09759001
## ProductType.ExtendedWarranty                   1.00000000
## ProductType.GameConsole                       -0.06052275
## ProductType.Laptop                            -0.07460471
## ProductType.Netbook                           -0.06052275
## ProductType.PC                                -0.08671100
## ProductType.Printer                           -0.15877684
## ProductType.PrinterSupplies                   -0.07460471
## ProductType.Smartphone                        -0.08671100
## ProductType.Software                          -0.10762440
## ProductType.Tablet                            -0.07460471
## ProductNum                                    -0.08607897
## Price                                         -0.09780278
## x5StarReviews                                  0.07086528
## x4StarReviews                                 -0.09946665
## x3StarReviews                                 -0.09934446
## x2StarReviews                                 -0.09348376
## x1StarReviews                                 -0.05189306
## PositiveServiceReview                          0.62710951
## NegativeServiceReview                          0.01528844
## Recommendproduct                               0.14451833
## ShippingWeight                                -0.23680262
## ProductDepth                                  -0.15707124
## ProductWidth                                  -0.43640441
## ProfitMargin                                   0.80226723
## Volume                                         0.07086528
##                              ProductType.GameConsole ProductType.Laptop
## ProductType.Accessories                 -0.111111111       -0.136963567
## ProductType.Display                     -0.041344912       -0.050964719
## ProductType.ExtendedWarranty            -0.060522753       -0.074604710
## ProductType.GameConsole                  1.000000000       -0.031606977
## ProductType.Laptop                      -0.031606977        1.000000000
## ProductType.Netbook                     -0.025641026       -0.031606977
## ProductType.PC                          -0.036735918       -0.045283341
## ProductType.Printer                     -0.067267279       -0.082918499
## ProductType.PrinterSupplies             -0.031606977       -0.038961039
## ProductType.Smartphone                  -0.036735918       -0.045283341
## ProductType.Software                    -0.045596075       -0.056205010
## ProductType.Tablet                      -0.031606977       -0.038961039
## ProductNum                               0.340268975       -0.187367237
## Price                                   -0.015543759        0.296140664
## x5StarReviews                            0.388298241       -0.069799582
## x4StarReviews                            0.344636607       -0.052974299
## x3StarReviews                            0.258709076       -0.045679827
## x2StarReviews                            0.074429824       -0.038007390
## x1StarReviews                            0.003300983       -0.021994035
## PositiveServiceReview                   -0.014267327       -0.085716506
## NegativeServiceReview                    0.081949267        0.052417536
## Recommendproduct                         0.126534828       -0.011740126
## ShippingWeight                          -0.006072795       -0.055573162
## ProductDepth                            -0.019260720       -0.005033888
## ProductWidth                             0.022014271       -0.042331972
## ProfitMargin                            -0.006230125       -0.081632386
## Volume                                   0.388298241       -0.069799582
##                              ProductType.Netbook ProductType.PC
## ProductType.Accessories              -0.11111111    -0.15918898
## ProductType.Display                  -0.04134491    -0.05923489
## ProductType.ExtendedWarranty         -0.06052275    -0.08671100
## ProductType.GameConsole              -0.02564103    -0.03673592
## ProductType.Laptop                   -0.03160698    -0.04528334
## ProductType.Netbook                   1.00000000    -0.03673592
## ProductType.PC                       -0.03673592     1.00000000
## ProductType.Printer                  -0.06726728    -0.09637388
## ProductType.PrinterSupplies          -0.03160698    -0.04528334
## ProductType.Smartphone               -0.03673592    -0.05263158
## ProductType.Software                 -0.04559608    -0.06532553
## ProductType.Tablet                   -0.03160698    -0.04528334
## ProductNum                            0.22272699    -0.26383058
## Price                                 0.05587061     0.54711260
## x5StarReviews                        -0.07001054    -0.10289168
## x4StarReviews                        -0.08017983    -0.12221649
## x3StarReviews                        -0.05874134    -0.10093459
## x2StarReviews                        -0.04311403    -0.06931004
## x1StarReviews                        -0.02819859    -0.04287299
## PositiveServiceReview                -0.07750629    -0.10938596
## NegativeServiceReview                -0.04759253    -0.08835918
## Recommendproduct                     -0.36327741     0.09356725
## ShippingWeight                       -0.06005908     0.31738315
## ProductDepth                         -0.03192360     0.05401162
## ProductWidth                          0.06221219     0.20211260
## ProfitMargin                         -0.06160902    -0.02380242
## Volume                               -0.07001054    -0.10289168
##                              ProductType.Printer
## ProductType.Accessories             -0.291491544
## ProductType.Display                 -0.108465229
## ProductType.ExtendedWarranty        -0.158776837
## ProductType.GameConsole             -0.067267279
## ProductType.Laptop                  -0.082918499
## ProductType.Netbook                 -0.067267279
## ProductType.PC                      -0.096373885
## ProductType.Printer                  1.000000000
## ProductType.PrinterSupplies         -0.082918499
## ProductType.Smartphone              -0.096373885
## ProductType.Software                -0.119617833
## ProductType.Tablet                  -0.082918499
## ProductNum                           0.249589095
## Price                               -0.037212288
## x5StarReviews                       -0.149200679
## x4StarReviews                       -0.133159178
## x3StarReviews                       -0.121109706
## x2StarReviews                       -0.087025567
## x1StarReviews                       -0.060204072
## PositiveServiceReview               -0.184785826
## NegativeServiceReview                0.008126681
## Recommendproduct                    -0.149914932
## ShippingWeight                       0.757676417
## ProductDepth                         0.029243566
## ProductWidth                         0.555981505
## ProfitMargin                        -0.055691552
## Volume                              -0.149200679
##                              ProductType.PrinterSupplies
## ProductType.Accessories                      -0.13696357
## ProductType.Display                          -0.05096472
## ProductType.ExtendedWarranty                 -0.07460471
## ProductType.GameConsole                      -0.03160698
## ProductType.Laptop                           -0.03896104
## ProductType.Netbook                          -0.03160698
## ProductType.PC                               -0.04528334
## ProductType.Printer                          -0.08291850
## ProductType.PrinterSupplies                   1.00000000
## ProductType.Smartphone                       -0.04528334
## ProductType.Software                         -0.05620501
## ProductType.Tablet                           -0.03896104
## ProductNum                                   -0.09325018
## Price                                        -0.11477363
## x5StarReviews                                -0.09040334
## x4StarReviews                                -0.11100268
## x3StarReviews                                -0.09486115
## x2StarReviews                                -0.05819149
## x1StarReviews                                -0.03866021
## PositiveServiceReview                        -0.09649048
## NegativeServiceReview                        -0.08180838
## Recommendproduct                             -0.07882656
## ShippingWeight                               -0.07272702
## ProductDepth                                 -0.06686404
## ProductWidth                                 -0.18418343
## ProfitMargin                                  0.27675370
## Volume                                       -0.09040334
##                              ProductType.Smartphone ProductType.Software
## ProductType.Accessories                -0.159188978         -0.197582993
## ProductType.Display                    -0.059234888         -0.073521462
## ProductType.ExtendedWarranty           -0.086710997         -0.107624401
## ProductType.GameConsole                -0.036735918         -0.045596075
## ProductType.Laptop                     -0.045283341         -0.056205010
## ProductType.Netbook                    -0.036735918         -0.045596075
## ProductType.PC                         -0.052631579         -0.065325533
## ProductType.Printer                    -0.096373885         -0.119617833
## ProductType.PrinterSupplies            -0.045283341         -0.056205010
## ProductType.Smartphone                  1.000000000         -0.065325533
## ProductType.Software                   -0.065325533          1.000000000
## ProductType.Tablet                     -0.045283341         -0.056205010
## ProductNum                              0.431369468         -0.214914072
## Price                                   0.001358954         -0.058780827
## x5StarReviews                          -0.038508275          0.001196472
## x4StarReviews                          -0.073264624          0.154461169
## x3StarReviews                          -0.054335058          0.234863478
## x2StarReviews                          -0.037891166          0.350735807
## x1StarReviews                          -0.031436074          0.393364234
## PositiveServiceReview                  -0.093917237         -0.041827585
## NegativeServiceReview                  -0.056081857          0.406129939
## Recommendproduct                       -0.023391813         -0.065325533
## ShippingWeight                         -0.133107177         -0.167879959
## ProductDepth                           -0.074189350         -0.071848214
## ProductWidth                           -0.110745236         -0.183007483
## ProfitMargin                           -0.053555434          0.091501867
## Volume                                 -0.038508275          0.001196472
##                              ProductType.Tablet   ProductNum        Price
## ProductType.Accessories            -0.136963567 -0.338862490 -0.384906124
## ProductType.Display                -0.050964719  0.084073899  0.231729810
## ProductType.ExtendedWarranty       -0.074604710 -0.086078971 -0.097802784
## ProductType.GameConsole            -0.031606977  0.340268975 -0.015543759
## ProductType.Laptop                 -0.038961039 -0.187367237  0.296140664
## ProductType.Netbook                -0.031606977  0.222726991  0.055870610
## ProductType.PC                     -0.045283341 -0.263830575  0.547112596
## ProductType.Printer                -0.082918499  0.249589095 -0.037212288
## ProductType.PrinterSupplies        -0.038961039 -0.093250185 -0.114773627
## ProductType.Smartphone             -0.045283341  0.431369468  0.001358954
## ProductType.Software               -0.056205010 -0.214914072 -0.058780827
## ProductType.Tablet                  1.000000000  0.332753315  0.131659520
## ProductNum                          0.332753315  1.000000000 -0.039748728
## Price                               0.131659520 -0.039748728  1.000000000
## x5StarReviews                      -0.050941908  0.166120763 -0.142343990
## x4StarReviews                      -0.002433448  0.119400607 -0.165283699
## x3StarReviews                       0.005639815  0.090200642 -0.150537613
## x2StarReviews                      -0.013498120 -0.004533099 -0.110681189
## x1StarReviews                      -0.026603829 -0.063063850 -0.083957332
## PositiveServiceReview              -0.081913925 -0.057748062 -0.142143291
## NegativeServiceReview              -0.049409024 -0.019427155 -0.060790373
## Recommendproduct                    0.088889522  0.003886211  0.068930357
## ShippingWeight                     -0.098414265  0.081238782  0.416777401
## ProductDepth                       -0.036157465  0.036187970  0.010967649
## ProductWidth                        0.039281194  0.126793427  0.382397533
## ProfitMargin                        0.026452306  0.039715141  0.099669405
## Volume                             -0.050941908  0.166120763 -0.142343990
##                              x5StarReviews x4StarReviews x3StarReviews
## ProductType.Accessories        0.127803771  0.1567151258   0.110608918
## ProductType.Display           -0.037583856 -0.0029383203  -0.038495398
## ProductType.ExtendedWarranty   0.070865276 -0.0994666496  -0.099344457
## ProductType.GameConsole        0.388298241  0.3446366067   0.258709076
## ProductType.Laptop            -0.069799582 -0.0529742995  -0.045679827
## ProductType.Netbook           -0.070010545 -0.0801798318  -0.058741337
## ProductType.PC                -0.102891676 -0.1222164888  -0.100934593
## ProductType.Printer           -0.149200679 -0.1331591777  -0.121109706
## ProductType.PrinterSupplies   -0.090403335 -0.1110026840  -0.094861150
## ProductType.Smartphone        -0.038508275 -0.0732646241  -0.054335058
## ProductType.Software           0.001196472  0.1544611686   0.234863478
## ProductType.Tablet            -0.050941908 -0.0024334484   0.005639815
## ProductNum                     0.166120763  0.1194006067   0.090200642
## Price                         -0.142343990 -0.1652836990  -0.150537613
## x5StarReviews                  1.000000000  0.8790063940   0.763373189
## x4StarReviews                  0.879006394  1.0000000000   0.937214175
## x3StarReviews                  0.763373189  0.9372141751   1.000000000
## x2StarReviews                  0.487279328  0.6790056214   0.861480050
## x1StarReviews                  0.255023904  0.4449417168   0.679276158
## PositiveServiceReview          0.622260219  0.4834212832   0.418517393
## NegativeServiceReview          0.309418989  0.5332221777   0.684096619
## Recommendproduct               0.169541264  0.0714153315  -0.056613257
## ShippingWeight                -0.188023980 -0.1949140938  -0.171842042
## ProductDepth                   0.066105249 -0.0317207111  -0.049376503
## ProductWidth                  -0.143436609 -0.0006476125  -0.018838926
## ProfitMargin                  -0.013448603 -0.1466538020  -0.128706922
## Volume                         1.000000000  0.8790063940   0.763373189
##                              x2StarReviews x1StarReviews
## ProductType.Accessories        0.033055555  -0.041647041
## ProductType.Display           -0.027086357  -0.036284641
## ProductType.ExtendedWarranty  -0.093483762  -0.051893064
## ProductType.GameConsole        0.074429824   0.003300983
## ProductType.Laptop            -0.038007390  -0.021994035
## ProductType.Netbook           -0.043114035  -0.028198592
## ProductType.PC                -0.069310042  -0.042872993
## ProductType.Printer           -0.087025567  -0.060204072
## ProductType.PrinterSupplies   -0.058191494  -0.038660213
## ProductType.Smartphone        -0.037891166  -0.031436074
## ProductType.Software           0.350735807   0.393364234
## ProductType.Tablet            -0.013498120  -0.026603829
## ProductNum                    -0.004533099  -0.063063850
## Price                         -0.110681189  -0.083957332
## x5StarReviews                  0.487279328   0.255023904
## x4StarReviews                  0.679005621   0.444941717
## x3StarReviews                  0.861480050   0.679276158
## x2StarReviews                  1.000000000   0.951912978
## x1StarReviews                  0.951912978   1.000000000
## PositiveServiceReview          0.308901370   0.200035288
## NegativeServiceReview          0.864754808   0.884728323
## Recommendproduct              -0.197917979  -0.246092974
## ShippingWeight                -0.128685586  -0.095656192
## ProductDepth                  -0.042636007  -0.034639801
## ProductWidth                  -0.065799979  -0.101139826
## ProfitMargin                  -0.090093715  -0.031227760
## Volume                         0.487279328   0.255023904
##                              PositiveServiceReview NegativeServiceReview
## ProductType.Accessories                0.002699224          -0.148034357
## ProductType.Display                   -0.094384206          -0.018617554
## ProductType.ExtendedWarranty           0.627109511           0.015288441
## ProductType.GameConsole               -0.014267327           0.081949267
## ProductType.Laptop                    -0.085716506           0.052417536
## ProductType.Netbook                   -0.077506288          -0.047592529
## ProductType.PC                        -0.109385958          -0.088359185
## ProductType.Printer                   -0.184785826           0.008126681
## ProductType.PrinterSupplies           -0.096490485          -0.081808383
## ProductType.Smartphone                -0.093917237          -0.056081857
## ProductType.Software                  -0.041827585           0.406129939
## ProductType.Tablet                    -0.081913925          -0.049409024
## ProductNum                            -0.057748062          -0.019427155
## Price                                 -0.142143291          -0.060790373
## x5StarReviews                          0.622260219           0.309418989
## x4StarReviews                          0.483421283           0.533222178
## x3StarReviews                          0.418517393           0.684096619
## x2StarReviews                          0.308901370           0.864754808
## x1StarReviews                          0.200035288           0.884728323
## PositiveServiceReview                  1.000000000           0.265549747
## NegativeServiceReview                  0.265549747           1.000000000
## Recommendproduct                       0.232828810          -0.188329242
## ShippingWeight                        -0.270738543          -0.111793874
## ProductDepth                          -0.050526592          -0.067410452
## ProductWidth                          -0.339093728          -0.097207127
## ProfitMargin                           0.423591716           0.042035630
## Volume                                 0.622260219           0.309418989
##                              Recommendproduct ShippingWeight ProductDepth
## ProductType.Accessories           0.058505351   -0.341367875  0.191398963
## ProductType.Display               0.072398196    0.103740595  0.015283953
## ProductType.ExtendedWarranty      0.144518328   -0.236802620 -0.157071240
## ProductType.GameConsole           0.126534828   -0.006072795 -0.019260720
## ProductType.Laptop               -0.011740126   -0.055573162 -0.005033888
## ProductType.Netbook              -0.363277411   -0.060059077 -0.031923597
## ProductType.PC                    0.093567251    0.317383148  0.054011618
## ProductType.Printer              -0.149914932    0.757676417  0.029243566
## ProductType.PrinterSupplies      -0.078826557   -0.072727018 -0.066864039
## ProductType.Smartphone           -0.023391813   -0.133107177 -0.074189350
## ProductType.Software             -0.065325533   -0.167879959 -0.071848214
## ProductType.Tablet                0.088889522   -0.098414265 -0.036157465
## ProductNum                        0.003886211    0.081238782  0.036187970
## Price                             0.068930357    0.416777401  0.010967649
## x5StarReviews                     0.169541264   -0.188023980  0.066105249
## x4StarReviews                     0.071415331   -0.194914094 -0.031720711
## x3StarReviews                    -0.056613257   -0.171842042 -0.049376503
## x2StarReviews                    -0.197917979   -0.128685586 -0.042636007
## x1StarReviews                    -0.246092974   -0.095656192 -0.034639801
## PositiveServiceReview             0.232828810   -0.270738543 -0.050526592
## NegativeServiceReview            -0.188329242   -0.111793874 -0.067410452
## Recommendproduct                  1.000000000   -0.126043887  0.090358266
## ShippingWeight                   -0.126043887    1.000000000  0.065596924
## ProductDepth                      0.090358266    0.065596924  1.000000000
## ProductWidth                      0.011091086    0.692473518 -0.006008512
## ProfitMargin                      0.095760642   -0.079215379 -0.207176026
## Volume                            0.169541264   -0.188023980  0.066105249
##                               ProductWidth ProfitMargin       Volume
## ProductType.Accessories      -0.1544624673 -0.626935212  0.127803771
## ProductType.Display           0.2844712255  0.039066904 -0.037583856
## ProductType.ExtendedWarranty -0.4364044058  0.802267233  0.070865276
## ProductType.GameConsole       0.0220142711 -0.006230125  0.388298241
## ProductType.Laptop           -0.0423319723 -0.081632386 -0.069799582
## ProductType.Netbook           0.0622121883 -0.061609018 -0.070010545
## ProductType.PC                0.2021125967 -0.023802415 -0.102891676
## ProductType.Printer           0.5559815049 -0.055691552 -0.149200679
## ProductType.PrinterSupplies  -0.1841834287  0.276753698 -0.090403335
## ProductType.Smartphone       -0.1107452361 -0.053555434 -0.038508275
## ProductType.Software         -0.1830074830  0.091501867  0.001196472
## ProductType.Tablet            0.0392811944  0.026452306 -0.050941908
## ProductNum                    0.1267934273  0.039715141  0.166120763
## Price                         0.3823975328  0.099669405 -0.142343990
## x5StarReviews                -0.1434366092 -0.013448603  1.000000000
## x4StarReviews                -0.0006476125 -0.146653802  0.879006394
## x3StarReviews                -0.0188389256 -0.128706922  0.763373189
## x2StarReviews                -0.0657999794 -0.090093715  0.487279328
## x1StarReviews                -0.1011398264 -0.031227760  0.255023904
## PositiveServiceReview        -0.3390937285  0.423591716  0.622260219
## NegativeServiceReview        -0.0972071272  0.042035630  0.309418989
## Recommendproduct              0.0110910859  0.095760642  0.169541264
## ShippingWeight                0.6924735181 -0.079215379 -0.188023980
## ProductDepth                 -0.0060085117 -0.207176026  0.066105249
## ProductWidth                  1.0000000000 -0.291436397 -0.143436609
## ProfitMargin                 -0.2914363968  1.000000000 -0.013448603
## Volume                       -0.1434366092 -0.013448603  1.000000000

#note: Correlation values fall between -1 and 1. Variables with a strong positive relationship have values close to 1, and variables with a strong negative relationship have values close to -1.
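
#note: in a matrix this size it can help to list only the strongly correlated pairs. The sketch below is a minimal base R example on made-up data; the toy data frame, its values and the 0.85 cutoff are assumptions for illustration and are not taken from the project files.

```r
# Toy data (hypothetical values, not the project dataset)
toy <- data.frame(
  x5StarReviews = c(10, 20, 30, 40, 50),
  x1StarReviews = c(5, 3, 4, 1, 2),
  Volume        = c(40, 80, 120, 160, 200)
)

corToy <- cor(toy)

# Keep each pair once (upper triangle) and flag |r| above the chosen cutoff
cutoff <- 0.85
idx <- which(abs(corToy) > cutoff & upper.tri(corToy), arr.ind = TRUE)
strongPairs <- data.frame(
  var1 = rownames(corToy)[idx[, "row"]],
  var2 = colnames(corToy)[idx[, "col"]],
  r    = corToy[idx]
)
strongPairs
```

#in this toy data only the x5StarReviews / Volume pair passes the cutoff, because Volume was built as a multiple of x5StarReviews.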

#visualize the correlation matrix using a heat map

install.packages("corrplot")

#load corrplot - a correlation matrix heat map creator

library(corrplot)
## corrplot 0.84 loaded

#call the corrplot matrix for the data

corrplot(corrData)

#blue (cooler) colors show positive relationships and red (warmer) colors show negative relationships

#create training and test sets; set the seed first so the random split is reproducible

set.seed(123)

#assign names and calculate the training size and test size

trainSize <- round(nrow(readyData)*0.7)
testSize <- round(nrow(readyData)- trainSize)

#check training and test size

trainSize
## [1] 56
testSize
## [1] 24

#sample the row indices for the training set

training_indices<-sample(seq_len(nrow(readyData)),size =trainSize)

#assign the training and test data to trainSet and testSet

trainSet<-readyData[training_indices,]
testSet<-readyData[-training_indices,]
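
#note: a quick sanity check on the split is that the two sets have the expected sizes and share no rows. A minimal sketch, assuming an 80-row data frame like the one implied by the sizes above (toyData and its columns are made up for illustration):

```r
# Hypothetical stand-in for readyData: 80 rows, matching 56 + 24 above
toyData <- data.frame(id = 1:80, Volume = seq(4, 320, by = 4))

set.seed(123)
trainSize <- round(nrow(toyData) * 0.7)
trainIdx  <- sample(seq_len(nrow(toyData)), size = trainSize)

toyTrain <- toyData[trainIdx, ]
toyTest  <- toyData[-trainIdx, ]

# The two sets should cover every row exactly once
nrow(toyTrain)
nrow(toyTest)
length(intersect(toyTrain$id, toyTest$id))
```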

#run linear regression model

readydata_LM<-lm(Volume ~ ., trainSet)

#check the outcome of the linear regression model

readydata_LM
## 
## Call:
## lm(formula = Volume ~ ., data = trainSet)
## 
## Coefficients:
##                  (Intercept)       ProductType.Accessories  
##                   -1.377e-12                     7.275e-13  
##          ProductType.Display  ProductType.ExtendedWarranty  
##                    4.620e-13                    -3.395e-13  
##      ProductType.GameConsole            ProductType.Laptop  
##                    1.053e-12                     5.793e-13  
##          ProductType.Netbook                ProductType.PC  
##                   -6.979e-14                     5.547e-13  
##          ProductType.Printer   ProductType.PrinterSupplies  
##                    1.100e-13                    -4.381e-13  
##       ProductType.Smartphone          ProductType.Software  
##                   -3.830e-14                     3.388e-13  
##           ProductType.Tablet                    ProductNum  
##                           NA                     8.616e-15  
##                        Price                 x5StarReviews  
##                   -3.079e-16                     4.000e+00  
##                x4StarReviews                 x3StarReviews  
##                    6.992e-17                     9.073e-15  
##                x2StarReviews                 x1StarReviews  
##                    8.772e-16                    -1.159e-15  
##        PositiveServiceReview         NegativeServiceReview  
##                    1.256e-15                     3.137e-17  
##             Recommendproduct                ShippingWeight  
##                   -2.794e-13                     5.957e-15  
##                 ProductDepth                  ProductWidth  
##                    4.160e-16                    -2.053e-14  
##                 ProfitMargin  
##                    2.073e-12

#get a summary of the fitted model

summary(readydata_LM)
## Warning in summary.lm(readydata_LM): essentially perfect fit: summary may
## be unreliable
## 
## Call:
## lm(formula = Volume ~ ., data = trainSet)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -3.331e-12 -1.152e-13  2.310e-14  1.943e-13  1.396e-12 
## 
## Coefficients: (1 not defined because of singularities)
##                                Estimate Std. Error    t value Pr(>|t|)    
## (Intercept)                  -1.377e-12  2.002e-12 -6.880e-01    0.497    
## ProductType.Accessories       7.275e-13  9.713e-13  7.490e-01    0.460    
## ProductType.Display           4.620e-13  7.612e-13  6.070e-01    0.548    
## ProductType.ExtendedWarranty -3.395e-13  1.549e-12 -2.190e-01    0.828    
## ProductType.GameConsole       1.053e-12  9.956e-13  1.058e+00    0.299    
## ProductType.Laptop            5.793e-13  9.961e-13  5.820e-01    0.565    
## ProductType.Netbook          -6.979e-14  9.880e-13 -7.100e-02    0.944    
## ProductType.PC                5.547e-13  9.691e-13  5.720e-01    0.571    
## ProductType.Printer           1.100e-13  1.186e-12  9.300e-02    0.927    
## ProductType.PrinterSupplies  -4.381e-13  1.318e-12 -3.320e-01    0.742    
## ProductType.Smartphone       -3.830e-14  7.363e-13 -5.200e-02    0.959    
## ProductType.Software          3.388e-13  1.160e-12  2.920e-01    0.772    
## ProductType.Tablet                   NA         NA         NA       NA    
## ProductNum                    8.616e-15  8.942e-15  9.640e-01    0.343    
## Price                        -3.080e-16  8.915e-16 -3.450e-01    0.732    
## x5StarReviews                 4.000e+00  1.387e-15  2.885e+15   <2e-16 ***
## x4StarReviews                 6.992e-17  1.175e-14  6.000e-03    0.995    
## x3StarReviews                 9.073e-15  3.695e-14  2.460e-01    0.808    
## x2StarReviews                 8.772e-16  4.077e-14  2.200e-02    0.983    
## x1StarReviews                -1.159e-15  8.343e-15 -1.390e-01    0.890    
## PositiveServiceReview         1.256e-15  2.245e-15  5.600e-01    0.580    
## NegativeServiceReview         3.137e-17  2.693e-14  1.000e-03    0.999    
## Recommendproduct             -2.794e-13  6.903e-13 -4.050e-01    0.688    
## ShippingWeight                5.957e-15  2.197e-14  2.710e-01    0.788    
## ProductDepth                  4.160e-16  3.175e-15  1.310e-01    0.897    
## ProductWidth                 -2.053e-14  4.488e-14 -4.570e-01    0.651    
## ProfitMargin                  2.073e-12  4.011e-12  5.170e-01    0.609    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.273e-13 on 30 degrees of freedom
## Multiple R-squared:      1,  Adjusted R-squared:      1 
## F-statistic: 1.291e+31 on 25 and 30 DF,  p-value: < 2.2e-16
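
#note: the "essentially perfect fit" warning, the coefficient of 4 on x5StarReviews and the correlation of 1 between x5StarReviews and Volume in the matrix above all suggest that Volume is an exact multiple of x5StarReviews in this data. One common response is to flag such predictors and refit without them. The source does not say whether this was done, so the sketch below is only an illustration on made-up data (toy, leaky and the 0.9999 threshold are assumptions).

```r
# Toy data (hypothetical): Volume is built as exactly 4 * x5StarReviews
toy <- data.frame(
  x5StarReviews = c(3, 7, 12, 25, 40, 61),
  Price         = c(99, 250, 120, 480, 60, 310)
)
toy$Volume <- 4 * toy$x5StarReviews

# Predictors whose correlation with the target is (numerically) 1
predictors    <- setdiff(names(toy), "Volume")
corWithTarget <- sapply(toy[predictors], function(x) cor(x, toy$Volume))
leaky         <- predictors[abs(corWithTarget) > 0.9999]
leaky

# Refit without the flagged predictor(s)
kept  <- setdiff(predictors, leaky)
toyLM <- lm(reformulate(kept, response = "Volume"), data = toy)
names(coef(toyLM))
```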

#non-parametric machine learning models - support vector machine, random forest and k-NN.

#we will work with two of these models

#Random Forest (rf)

#set the seed so the random split is reproducible

set.seed(123)

#assign names and calculate the training size and test size

trainSize <- round(nrow(readyData)*0.7)
testSize <- round(nrow(readyData)- trainSize)

#check training and test size

trainSize
## [1] 56
testSize
## [1] 24

#sample the row indices for the training set

training_indices<-sample(seq_len(nrow(readyData)),size =trainSize)

#assign the training and test data to trainSet and testSet

trainSet<-readyData[training_indices,]
testSet<-readyData[-training_indices,]

#10 fold cross validation

fit_Control <- trainControl(method = "repeatedcv", number = 10, repeats = 1)
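
#note: in 10-fold cross validation each training row is held out once while the model is fitted on the remaining folds. The sketch below shows the idea in base R for the 56 training rows; it is an assumption-laden illustration (foldId and fitSizes are made-up names), and caret builds its folds with its own routine, so the sample sizes it reports can differ from these.

```r
set.seed(123)
n <- 56   # training rows, as reported above
k <- 10

# Shuffle fold labels 1..k so fold sizes differ by at most one row
foldId <- sample(rep(seq_len(k), length.out = n))

# Rows used for fitting in each resample (everything outside the held-out fold)
fitSizes <- sapply(seq_len(k), function(f) sum(foldId != f))
fitSizes
```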

#train the Random Forest regression model with tuneLength = 1 (trains with a single mtry value)

readydata_rf <- train(Volume ~ ., data = trainSet, method = "rf", trControl=fit_Control, tuneLength = 1, importance = T)

#training results

readydata_rf
## Random Forest 
## 
## 56 samples
## 26 predictors
## 
## No pre-processing
## Resampling: Cross-Validated (10 fold, repeated 1 times) 
## Summary of sample sizes: 51, 50, 50, 50, 52, 49, ... 
## Resampling results:
## 
##   RMSE      Rsquared   MAE     
##   744.0774  0.9094836  365.2133
## 
## Tuning parameter 'mtry' was held constant at a value of 5

#see a summary of the model created

summary(readydata_rf)
##                 Length Class      Mode     
## call              5    -none-     call     
## type              1    -none-     character
## predicted        56    -none-     numeric  
## mse             500    -none-     numeric  
## rsq             500    -none-     numeric  
## oob.times        56    -none-     numeric  
## importance       52    -none-     numeric  
## importanceSD     26    -none-     numeric  
## localImportance   0    -none-     NULL     
## proximity         0    -none-     NULL     
## ntree             1    -none-     numeric  
## mtry              1    -none-     numeric  
## forest           11    -none-     list     
## coefs             0    -none-     NULL     
## y                56    -none-     numeric  
## test              0    -none-     NULL     
## inbag             0    -none-     NULL     
## xNames           26    -none-     character
## problemType       1    -none-     character
## tuneValue         1    data.frame list     
## obsLevels         1    -none-     logical  
## param             1    -none-     list

#predict Volume for the test set

readydata_rf_predict<-predict(object = readydata_rf, newdata=testSet, na.action = na.pass)

#to see all the predictions

readydata_rf_predict
##          2          3          4         11         16         20 
##   51.45960   30.86053  237.20227  175.33970   68.04251  104.56920 
##         22         24         28         30         33         35 
## 2416.66453  132.83141   66.16314   38.20800   48.83653 1228.60438 
##         37         44         46         47         48         52 
## 1228.60438  460.19996 1230.46023  867.05756 4179.61370  342.12703 
##         59         60         61         63         70         76 
##   89.79021  312.41008  345.58541   33.01707   22.49693  403.67973

#this tells the most important independent variables

varImp(readydata_rf)
## rf variable importance
## 
##   only 20 most important variables shown (out of 26)
## 
##                             Overall
## x5StarReviews                100.00
## PositiveServiceReview         98.02
## x4StarReviews                 72.82
## x3StarReviews                 70.14
## x1StarReviews                 63.92
## x2StarReviews                 62.28
## ProductNum                    51.37
## Recommendproduct              50.21
## NegativeServiceReview         43.74
## ShippingWeight                35.70
## Price                         29.18
## ProfitMargin                  28.66
## ProductType.Printer           25.57
## ProductType.PrinterSupplies   24.23
## ProductType.Display           20.56
## ProductType.Tablet            17.83
## ProductWidth                  17.58
## ProductType.GameConsole       14.95
## ProductType.Netbook           14.48
## ProductType.Accessories       11.16

#Error check 1 - performance on the test set (postResample takes the predictions first, then the observed values)

postResample(pred = readydata_rf_predict, obs = testSet$Volume)
##       RMSE   Rsquared        MAE 
## 456.600407   0.873047 167.923326
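
#note: the three numbers can be reproduced by hand. The sketch assumes, as the caret documentation describes for postResample, that R-squared is the squared correlation between predictions and observed values; the obs and pred vectors are made up for illustration.

```r
# Hypothetical observed and predicted volumes (not the project results)
obs  <- c(12, 40, 88, 300, 1232)
pred <- c(20, 36, 100, 280, 1200)

rmse <- sqrt(mean((pred - obs)^2))   # root mean squared error
rsq  <- cor(pred, obs)^2             # squared Pearson correlation
mae  <- mean(abs(pred - obs))        # mean absolute error

c(RMSE = rmse, Rsquared = rsq, MAE = mae)
```

#RMSE penalizes large misses more heavily than MAE, so a wide gap between the two points to a few badly predicted products.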

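#Error check 2 - test of trainSet result
#note: the call below scores the 56 train-set predictions against the 24 values in testSet$Volume, so R recycles the shorter vector, warns, and R-squared comes back as NA; a train-set check needs trainSet$Volume as the observed values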

readydata_rf_predict2 <- predict(object = readydata_rf, 
                                  newdata = trainSet)


postResample(testSet$Volume, readydata_rf_predict2)
## Warning in pred - obs: longer object length is not a multiple of shorter
## object length

## Warning in pred - obs: longer object length is not a multiple of shorter
## object length
##      RMSE  Rsquared       MAE 
## 1432.1413        NA  875.6286
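
#note: the warnings above come from comparing vectors of different lengths. A small guard can turn that silent recycling into an error. The sketch below uses made-up vectors with the same lengths as in this run, and checkedRMSE is a hypothetical helper, not a caret function.

```r
# Hypothetical vectors with the same lengths as in the run above
obsTest   <- rep(c(10, 20, 30), length.out = 24)     # stands in for testSet$Volume
predTrain <- rep(c(12, 18, 33, 9), length.out = 56)  # stands in for train-set predictions

# Helper (hypothetical name) that refuses mismatched inputs
checkedRMSE <- function(pred, obs) {
  stopifnot(length(pred) == length(obs))
  sqrt(mean((pred - obs)^2))
}

# The mismatched call now stops instead of recycling
result <- tryCatch(checkedRMSE(predTrain, obsTest),
                   error = function(e) "length mismatch")
result
```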

#Error check 3 - confusion matrix - not working
#confusionMatrix(table(testSet$Volume, readydata_rf_predict))

#because calling the confusion matrix directly gives an error, use the union function to build a shared set of levels

U <- union(readydata_rf_predict, testSet$Volume)

#build a table that treats both vectors as factors over the shared levels
#readydata_conf_matrix <- table(factor(readydata_rf_predict, U), factor(testSet$Volume, U))

#run the confusion matrix
#confusionMatrix(readydata_conf_matrix)

#KNN

#set the seed so the random split is reproducible
#note: in the original run set.seed was entered without parentheses, which prints the function's source instead of seeding the generator, so the split and the results below were not produced from a freshly set seed

set.seed(123)

#assign names and calculate the training size and test size

trainSize <- round(nrow(readyData)*0.7)
testSize <- round(nrow(readyData)- trainSize)

#check training and test size

trainSize
## [1] 56
testSize
## [1] 24

#sample the row indices for the training set

training_indices<-sample(seq_len(nrow(readyData)),size =trainSize)

#assign the training and test data to trainSet and testSet

trainSet<-readyData[training_indices,]
testSet<-readyData[-training_indices,]

#10 fold cross validation

fit_Control <- trainControl(method = "repeatedcv", number = 10, repeats = 1)

#train the k-NN regression model with tuneLength = 1 (trains with a single k value)

readydata_KNN <- train(Volume ~ ., data = trainSet, method = "knn", trControl=fit_Control, tuneLength = 1)

#training results

readydata_KNN
## k-Nearest Neighbors 
## 
## 56 samples
## 26 predictors
## 
## No pre-processing
## Resampling: Cross-Validated (10 fold, repeated 1 times) 
## Summary of sample sizes: 52, 50, 51, 52, 51, 49, ... 
## Resampling results:
## 
##   RMSE      Rsquared   MAE     
##   409.2678  0.9129648  237.8632
## 
## Tuning parameter 'k' was held constant at a value of 5

#see a summary of the model created

summary(readydata_KNN)
##             Length Class      Mode     
## learn        2     -none-     list     
## k            1     -none-     numeric  
## theDots      0     -none-     list     
## xNames      26     -none-     character
## problemType  1     -none-     character
## tuneValue    1     data.frame list     
## obsLevels    1     -none-     logical  
## param        0     -none-     list

#predict Volume for the test set

readydata_KNN_predict<-predict(object = readydata_KNN, newdata=testSet, na.action = na.pass)

#to see all the predictions

readydata_KNN_predict
##  [1]  231.2   90.4   44.0 1227.2 1150.4   44.8  284.0   94.4   40.8   44.0
## [11] 1278.4 1278.4 1278.4 1322.4  240.8 2794.4 1439.2  129.6   46.4   46.4
## [21]   97.6   85.6   90.4   90.4
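
#note: with k = 5, each value above should be the mean of Volume over the five training rows closest to the test row, which would explain why the predictions move in steps of 0.8 if the volumes are multiples of 4. A minimal one-predictor sketch of that averaging on made-up data (trainX, trainY and knnPredict are hypothetical names; caret's own implementation measures distance over all 26 predictors):

```r
# Toy training data (hypothetical): one predictor, Volume in multiples of 4
trainX <- c(1, 2, 3, 4, 5, 6, 7, 8)
trainY <- c(4, 8, 12, 20, 40, 80, 120, 200)

# Predict by averaging the targets of the k closest training points
knnPredict <- function(x0, k = 5) {
  d <- abs(trainX - x0)              # Euclidean distance in one dimension
  nearest <- order(d)[seq_len(k)]    # indices of the k smallest distances
  mean(trainY[nearest])
}

knnPredict(3)
```

#because the model above reports "No pre-processing", predictors with large ranges dominate the distance; centering and scaling the predictors before k-NN is a common remedy.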

#this tells the most important independent variables

varImp(readydata_KNN)
## loess r-squared variable importance
## 
##   only 20 most important variables shown (out of 26)
## 
##                               Overall
## x5StarReviews                100.0000
## x4StarReviews                 81.7153
## x3StarReviews                 64.5191
## ProductType.GameConsole       46.6173
## x2StarReviews                 38.8412
## NegativeServiceReview         34.5710
## PositiveServiceReview         32.0117
## ProductNum                    31.1917
## ShippingWeight                 8.9524
## ProductDepth                   7.8388
## Recommendproduct               4.4946
## Price                          3.9370
## ProfitMargin                   3.3200
## ProductWidth                   3.2377
## ProductType.Printer            1.9986
## ProductType.PC                 1.8028
## ProductType.Netbook            0.6405
## ProductType.PrinterSupplies    0.6001
## ProductType.Accessories        0.5932
## ProductType.ExtendedWarranty   0.4183

#Error check 1 - performance on the test set (postResample takes the predictions first, then the observed values)

postResample(pred = readydata_KNN_predict, obs = testSet$Volume)
##         RMSE     Rsquared          MAE 
## 1723.3695754    0.6703293  440.4333333

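#Error check 2 - test of trainSet result
#note: the call below scores the 56 train-set predictions against the 24 values in testSet$Volume, so R recycles the shorter vector, warns, and R-squared comes back as NA; a train-set check needs trainSet$Volume as the observed values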

readydata_KNN_predict2 <- predict(object = readydata_KNN, 
                                  newdata = trainSet)

postResample(testSet$Volume, readydata_KNN_predict2)
## Warning in pred - obs: longer object length is not a multiple of shorter
## object length

## Warning in pred - obs: longer object length is not a multiple of shorter
## object length
##      RMSE  Rsquared       MAE 
## 2241.0706        NA  965.9857

#Error check 3 - confusion matrix - not yet working
#confusionMatrix(table(testSet$Volume, readydata_KNN_predict))

#because calling the confusion matrix directly gives an error, use the union function to build a shared set of levels

U <- union(readydata_KNN_predict, testSet$Volume)

#build a table that treats both vectors as factors over the shared levels

readydata_conf_matrix <- table(factor(readydata_KNN_predict, U), factor(testSet$Volume, U))

#run the confusion matrix

confusionMatrix(readydata_conf_matrix)
## Confusion Matrix and Statistics
## 
##         
##          231.2 90.4 44 1227.2 1150.4 44.8 284 94.4 40.8 1278.4 1322.4
##   231.2      0    0  0      0      0    0   0    0    0      0      0
##   90.4       0    0  0      0      0    0   0    0    0      0      0
##   44         0    0  0      0      0    0   0    0    0      0      0
##   1227.2     0    0  0      0      0    0   0    0    0      0      0
##   1150.4     0    0  0      0      0    0   0    0    0      0      0
##   44.8       0    0  0      0      0    0   0    0    0      0      0
##   284        0    0  0      0      0    0   0    0    0      0      0
##   94.4       0    0  0      0      0    0   0    0    0      0      0
##   40.8       0    0  0      0      0    0   0    0    0      0      0
##   1278.4     0    0  0      0      0    0   0    0    0      0      0
##   1322.4     0    0  0      0      0    0   0    0    0      0      0
##   240.8      0    0  0      0      0    0   0    0    0      0      0
##   2794.4     0    0  0      0      0    0   0    0    0      0      0
##   1439.2     0    0  0      0      0    0   0    0    0      0      0
##   129.6      0    0  0      0      0    0   0    0    0      0      0
##   46.4       0    0  0      0      0    0   0    0    0      0      0
##   97.6       0    0  0      0      0    0   0    0    0      0      0
##   85.6       0    0  0      0      0    0   0    0    0      0      0
##   12         0    0  0      0      0    0   0    0    0      0      0
##   196        0    0  0      0      0    0   0    0    0      0      0
##   64         0    0  0      0      0    0   0    0    0      0      0
##   1252       0    0  0      0      0    0   0    0    0      0      0
##   680        0    0  0      0      0    0   0    0    0      0      0
##   60         0    0  0      0      0    0   0    0    0      0      0
##   308        0    0  0      0      0    0   0    0    0      0      0
##   88         0    0  0      0      0    0   0    0    0      0      0
##   0          0    0  0      0      0    0   0    0    0      0      0
##   20         0    0  0      0      0    0   0    0    0      0      0
##   1232       0    0  0      0      0    0   0    0    0      0      0
##   11204      0    0  0      0      0    0   0    0    0      0      0
##   1896       0    0  0      0      0    0   0    0    0      0      0
##   232        0    0  0      0      0    0   0    0    0      0      0
##   32         0    0  0      0      0    0   0    0    0      0      0
##   8          0    0  0      0      0    0   0    0    0      0      0
##   16         0    0  0      0      0    0   0    0    0      0      0
##         
##          240.8 2794.4 1439.2 129.6 46.4 97.6 85.6 12 196 64 1252 680 60
##   231.2      0      0      0     0    0    0    0  1   0  0    0   0  0
##   90.4       0      0      0     0    0    0    0  1   1  0    0   0  0
##   44         0      0      0     0    0    0    0  0   0  1    0   0  0
##   1227.2     0      0      0     0    0    0    0  0   0  0    1   0  0
##   1150.4     0      0      0     0    0    0    0  0   0  0    0   1  0
##   44.8       0      0      0     0    0    0    0  0   0  0    0   0  1
##   284        0      0      0     0    0    0    0  0   0  0    0   0  0
##   94.4       0      0      0     0    0    0    0  0   0  0    0   0  0
##   40.8       0      0      0     0    0    0    0  0   0  0    0   0  0
##   1278.4     0      0      0     0    0    0    0  0   0  0    0   0  0
##   1322.4     0      0      0     0    0    0    0  0   0  0    0   0  0
##   240.8      0      0      0     0    0    0    0  0   0  0    0   0  0
##   2794.4     0      0      0     0    0    0    0  0   0  0    0   0  0
##   1439.2     0      0      0     0    0    0    0  0   0  0    0   0  0
##   129.6      0      0      0     0    0    0    0  0   0  0    0   0  0
##   46.4       0      0      0     0    0    0    0  0   0  0    0   0  0
##   97.6       0      0      0     0    0    0    0  0   0  0    0   0  0
##   85.6       0      0      0     0    0    0    0  0   0  0    0   0  0
##   12         0      0      0     0    0    0    0  0   0  0    0   0  0
##   196        0      0      0     0    0    0    0  0   0  0    0   0  0
##   64         0      0      0     0    0    0    0  0   0  0    0   0  0
##   1252       0      0      0     0    0    0    0  0   0  0    0   0  0
##   680        0      0      0     0    0    0    0  0   0  0    0   0  0
##   60         0      0      0     0    0    0    0  0   0  0    0   0  0
##   308        0      0      0     0    0    0    0  0   0  0    0   0  0
##   88         0      0      0     0    0    0    0  0   0  0    0   0  0
##   0          0      0      0     0    0    0    0  0   0  0    0   0  0
##   20         0      0      0     0    0    0    0  0   0  0    0   0  0
##   1232       0      0      0     0    0    0    0  0   0  0    0   0  0
##   11204      0      0      0     0    0    0    0  0   0  0    0   0  0
##   1896       0      0      0     0    0    0    0  0   0  0    0   0  0
##   232        0      0      0     0    0    0    0  0   0  0    0   0  0
##   32         0      0      0     0    0    0    0  0   0  0    0   0  0
##   8          0      0      0     0    0    0    0  0   0  0    0   0  0
##   16         0      0      0     0    0    0    0  0   0  0    0   0  0
##         
##          308 88 0 20 1232 11204 1896 232 32 8 16
##   231.2    0  0 0  0    0     0    0   0  0 0  0
##   90.4     0  1 0  0    0     0    0   0  0 0  0
##   44       0  0 0  1    0     0    0   0  0 0  0
##   1227.2   0  0 0  0    0     0    0   0  0 0  0
##   1150.4   0  0 0  0    0     0    0   0  0 0  0
##   44.8     0  0 0  0    0     0    0   0  0 0  0
##   284      1  0 0  0    0     0    0   0  0 0  0
##   94.4     0  1 0  0    0     0    0   0  0 0  0
##   40.8     0  0 1  0    0     0    0   0  0 0  0
##   1278.4   0  0 0  0    3     0    0   0  0 0  0
##   1322.4   0  0 0  0    1     0    0   0  0 0  0
##   240.8    0  1 0  0    0     0    0   0  0 0  0
##   2794.4   0  0 0  0    0     1    0   0  0 0  0
##   1439.2   0  0 0  0    0     0    1   0  0 0  0
##   129.6    0  0 0  0    0     0    0   1  0 0  0
##   46.4     0  0 0  0    0     0    0   0  1 1  0
##   97.6     0  0 0  0    0     0    0   0  1 0  0
##   85.6     0  0 0  0    0     0    0   0  0 0  1
##   12       0  0 0  0    0     0    0   0  0 0  0
##   196      0  0 0  0    0     0    0   0  0 0  0
##   64       0  0 0  0    0     0    0   0  0 0  0
##   1252     0  0 0  0    0     0    0   0  0 0  0
##   680      0  0 0  0    0     0    0   0  0 0  0
##   60       0  0 0  0    0     0    0   0  0 0  0
##   308      0  0 0  0    0     0    0   0  0 0  0
##   88       0  0 0  0    0     0    0   0  0 0  0
##   0        0  0 0  0    0     0    0   0  0 0  0
##   20       0  0 0  0    0     0    0   0  0 0  0
##   1232     0  0 0  0    0     0    0   0  0 0  0
##   11204    0  0 0  0    0     0    0   0  0 0  0
##   1896     0  0 0  0    0     0    0   0  0 0  0
##   232      0  0 0  0    0     0    0   0  0 0  0
##   32       0  0 0  0    0     0    0   0  0 0  0
##   8        0  0 0  0    0     0    0   0  0 0  0
##   16       0  0 0  0    0     0    0   0  0 0  0
## 
## Overall Statistics
##                                      
##                Accuracy : 0          
##                  95% CI : (0, 0.1425)
##     No Information Rate : 0.1667     
##     P-Value [Acc > NIR] : 1          
##                                      
##                   Kappa : 0          
##                                      
##  Mcnemar's Test P-Value : NA         
## 
## Statistics by Class:
## 
##                      Class: 231.2 Class: 90.4 Class: 44 Class: 1227.2
## Sensitivity                    NA          NA        NA            NA
## Specificity               0.95833       0.875   0.91667       0.95833
## Pos Pred Value                 NA          NA        NA            NA
## Neg Pred Value                 NA          NA        NA            NA
## Prevalence                0.00000       0.000   0.00000       0.00000
## Detection Rate            0.00000       0.000   0.00000       0.00000
## Detection Prevalence      0.04167       0.125   0.08333       0.04167
## Balanced Accuracy              NA          NA        NA            NA
##                      Class: 1150.4 Class: 44.8 Class: 284 Class: 94.4
## Sensitivity                     NA          NA         NA          NA
## Specificity                0.95833     0.95833    0.95833     0.95833
## Pos Pred Value                  NA          NA         NA          NA
## Neg Pred Value                  NA          NA         NA          NA
## Prevalence                 0.00000     0.00000    0.00000     0.00000
## Detection Rate             0.00000     0.00000    0.00000     0.00000
## Detection Prevalence       0.04167     0.04167    0.04167     0.04167
## Balanced Accuracy               NA          NA         NA          NA
##                      Class: 40.8 Class: 1278.4 Class: 1322.4 Class: 240.8
## Sensitivity                   NA            NA            NA           NA
## Specificity              0.95833         0.875       0.95833      0.95833
## Pos Pred Value                NA            NA            NA           NA
## Neg Pred Value                NA            NA            NA           NA
## Prevalence               0.00000         0.000       0.00000      0.00000
## Detection Rate           0.00000         0.000       0.00000      0.00000
## Detection Prevalence     0.04167         0.125       0.04167      0.04167
## Balanced Accuracy             NA            NA            NA           NA
##                      Class: 2794.4 Class: 1439.2 Class: 129.6 Class: 46.4
## Sensitivity                     NA            NA           NA          NA
## Specificity                0.95833       0.95833      0.95833     0.91667
## Pos Pred Value                  NA            NA           NA          NA
## Neg Pred Value                  NA            NA           NA          NA
## Prevalence                 0.00000       0.00000      0.00000     0.00000
## Detection Rate             0.00000       0.00000      0.00000     0.00000
## Detection Prevalence       0.04167       0.04167      0.04167     0.08333
## Balanced Accuracy               NA            NA           NA          NA
##                      Class: 97.6 Class: 85.6 Class: 12 Class: 196
## Sensitivity                   NA          NA   0.00000    0.00000
## Specificity              0.95833     0.95833   1.00000    1.00000
## Pos Pred Value                NA          NA       NaN        NaN
## Neg Pred Value                NA          NA   0.91667    0.95833
## Prevalence               0.00000     0.00000   0.08333    0.04167
## Detection Rate           0.00000     0.00000   0.00000    0.00000
## Detection Prevalence     0.04167     0.04167   0.00000    0.00000
## Balanced Accuracy             NA          NA   0.50000    0.50000
##                      Class: 64 Class: 1252 Class: 680 Class: 60 Class: 308
## Sensitivity            0.00000     0.00000    0.00000   0.00000    0.00000
## Specificity            1.00000     1.00000    1.00000   1.00000    1.00000
## Pos Pred Value             NaN         NaN        NaN       NaN        NaN
## Neg Pred Value         0.95833     0.95833    0.95833   0.95833    0.95833
## Prevalence             0.04167     0.04167    0.04167   0.04167    0.04167
## Detection Rate         0.00000     0.00000    0.00000   0.00000    0.00000
## Detection Prevalence   0.00000     0.00000    0.00000   0.00000    0.00000
## Balanced Accuracy      0.50000     0.50000    0.50000   0.50000    0.50000
##                      Class: 88 Class: 0 Class: 20 Class: 1232 Class: 11204
## Sensitivity              0.000  0.00000   0.00000      0.0000      0.00000
## Specificity              1.000  1.00000   1.00000      1.0000      1.00000
## Pos Pred Value             NaN      NaN       NaN         NaN          NaN
## Neg Pred Value           0.875  0.95833   0.95833      0.8333      0.95833
## Prevalence               0.125  0.04167   0.04167      0.1667      0.04167
## Detection Rate           0.000  0.00000   0.00000      0.0000      0.00000
## Detection Prevalence     0.000  0.00000   0.00000      0.0000      0.00000
## Balanced Accuracy        0.500  0.50000   0.50000      0.5000      0.50000
##                      Class: 1896 Class: 232 Class: 32 Class: 8 Class: 16
## Sensitivity              0.00000    0.00000   0.00000  0.00000   0.00000
## Specificity              1.00000    1.00000   1.00000  1.00000   1.00000
## Pos Pred Value               NaN        NaN       NaN      NaN       NaN
## Neg Pred Value           0.95833    0.95833   0.91667  0.95833   0.95833
## Prevalence               0.04167    0.04167   0.08333  0.04167   0.04167
## Detection Rate           0.00000    0.00000   0.00000  0.00000   0.00000
## Detection Prevalence     0.00000    0.00000   0.00000  0.00000   0.00000
## Balanced Accuracy        0.50000    0.50000   0.50000  0.50000   0.50000

#svm

#set a seed so that the random train/test split is reproducible

set.seed(123)

#calculate the training size (70% of the rows) and the test size (the remaining rows)

trainSize <- round(nrow(readyData)*0.7)
testSize <- round(nrow(readyData)- trainSize)

#check training and test size

trainSize
## [1] 56
testSize
## [1] 24

#randomly sample the row indices that form the training set

training_indices <- sample(seq_len(nrow(readyData)), size = trainSize)

#assign the training and test data to the names trainSet and testSet

trainSet<-readyData[training_indices,]
testSet<-readyData[-training_indices,]
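
#The split above can be checked on a small stand-in data frame. The sketch below is only an illustration: `toy`, its `id` column and the random Volume values are hypothetical and are not part of the original dataset. It shows that indexing with the sampled rows and with their negation gives two sets that do not overlap.

```r
# Toy data frame standing in for readyData (80 rows, hypothetical columns)
set.seed(123)
toy <- data.frame(id = 1:80, Volume = round(runif(80, 0, 2000)))

train_size <- round(nrow(toy) * 0.7)             # 70% of the rows for training
train_idx  <- sample(seq_len(nrow(toy)), size = train_size)

toy_train <- toy[train_idx, ]                    # rows drawn for training
toy_test  <- toy[-train_idx, ]                   # every remaining row

# The two sets should not share any row
shared_rows <- intersect(toy_train$id, toy_test$id)
c(train = nrow(toy_train), test = nrow(toy_test), shared = length(shared_rows))
```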

#10 fold cross validation

fit_Control <- trainControl(method = "repeatedcv", number = 10, repeats = 1)
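
#To make the idea of 10 fold cross validation concrete, the sketch below deals 56 row indices into 10 folds using base R only. It is an assumption-laden illustration: caret builds its folds with its own routine (which takes the outcome into account), so the exact fold sizes reported by caret can differ from the ones produced here.

```r
set.seed(123)
n <- 56                                    # number of rows in the training set
k <- 10                                    # number of folds

# Shuffle the row indices, then deal them into k folds of near-equal size
fold_id <- rep(seq_len(k), length.out = n)
folds   <- split(sample(seq_len(n)), fold_id)

# For fold i the model is fitted on all other rows and scored on fold i,
# so the number of rows used for fitting is n minus the fold size
analysis_sizes <- n - lengths(folds)
analysis_sizes
```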

#train a Support Vector Machine regression model with a linear kernel (the cost parameter C is left at caret's default)

readydata_svm <- caret::train(Volume ~ ., 
                       data = trainSet, 
                       method = 'svmLinear', 
                       trControl=fit_Control)
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
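
#The warning above probably means that at least one predictor takes a single value in the rows being fitted (for example a dummy column for a product type that does not occur in that sample), so it cannot be scaled. The sketch below shows one way to find such columns with base R. The data frame `toy` and its column names are hypothetical.

```r
# Hypothetical predictors; TypeDummy never changes, so it cannot be scaled
toy <- data.frame(
  Price     = c(10, 25, 40, 55),
  TypeDummy = c(0, 0, 0, 0),
  Reviews   = c(3, 8, 1, 12)
)

# A column is constant when it holds exactly one distinct value
is_constant <- vapply(toy, function(col) length(unique(col)) == 1, logical(1))
names(toy)[is_constant]

# Drop the constant columns before fitting
toy_clean <- toy[, !is_constant, drop = FALSE]
names(toy_clean)
```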

#training results - print the fitted model and its cross validation metrics

readydata_svm
## Support Vector Machines with Linear Kernel 
## 
## 56 samples
## 26 predictors
## 
## No pre-processing
## Resampling: Cross-Validated (10 fold, repeated 1 times) 
## Summary of sample sizes: 51, 50, 50, 50, 52, 49, ... 
## Resampling results:
## 
##   RMSE     Rsquared   MAE     
##   297.188  0.9449585  203.8192
## 
## Tuning parameter 'C' was held constant at a value of 1
summary(readydata_svm)
## Length  Class   Mode 
##      1   ksvm     S4

#predict Volume for the test set

readydata_svm_predict<-predict(object = readydata_svm, newdata=testSet, na.action = na.pass)

#to see all the predictions

readydata_svm_predict
##          2          3          4         11         16         20 
## -111.34558 -139.76812   69.13923 -139.55534  -77.01048 -102.91078 
##         22         24         28         30         33         35 
## 2050.59371   13.55979   74.91169  -18.07959  172.93467 1058.55421 
##         37         44         46         47         48         52 
## 1064.13237  363.76145 1433.87597 1244.44290 3850.96163  326.11444 
##         59         60         61         63         70         76 
##  105.29633  415.54477  424.60908  106.74523  -41.21731  446.69317
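
#Several of the predictions above are negative, which is not possible for a sales volume. One possible post-processing step, which is an assumption and was not part of the original analysis, is to floor the predictions at zero with pmax. The sketch uses the first three predicted values shown above.

```r
# First three predictions from the output above
pred <- c(-111.34558, -139.76812, 69.13923)

# pmax compares each element with 0 and keeps the larger of the two
pred_floored <- pmax(pred, 0)
pred_floored
```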

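#rank the independent variables by importance (as the output header shows, the score here is a loess r-squared computed for each predictor against Volume)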

varImp(readydata_svm)
## loess r-squared variable importance
## 
##   only 20 most important variables shown (out of 26)
## 
##                              Overall
## x5StarReviews               100.0000
## x4StarReviews                94.4107
## PositiveServiceReview        81.5216
## x3StarReviews                71.0288
## x1StarReviews                61.2867
## x2StarReviews                47.6658
## NegativeServiceReview        32.1437
## ProductType.GameConsole      15.2493
## ProductNum                   14.1920
## ProductWidth                  7.7526
## ShippingWeight                7.6467
## ProductDepth                  6.4082
## Price                         3.8756
## Recommendproduct              3.5714
## ProductType.Printer           2.2693
## ProfitMargin                  2.2596
## ProductType.Accessories       1.1128
## ProductType.PrinterSupplies   0.7925
## ProductType.PC                0.7197
## ProductType.Laptop            0.5254
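
#The "loess r-squared" score can be imitated with base R to see what it measures. The sketch below is a rough approximation and not caret's exact code: it fits a loess smoother of the outcome on each predictor separately, computes an r-squared from the residuals, and rescales the scores to 0-100. The data and the column names strong, weak and noise are hypothetical.

```r
set.seed(123)
n <- 60
toy <- data.frame(strong = rnorm(n), weak = rnorm(n), noise = rnorm(n))
toy$y <- 5 * toy$strong + 1 * toy$weak + rnorm(n, sd = 0.5)

# R-squared of a loess smoother of y on a single predictor x
loess_r2 <- function(x, y) {
  fit <- loess(y ~ x)
  1 - sum(residuals(fit)^2) / sum((y - mean(y))^2)
}

r2 <- vapply(toy[c("strong", "weak", "noise")], loess_r2, numeric(1), y = toy$y)

# Rescale so the best predictor scores 100 and the worst scores 0
importance <- (r2 - min(r2)) / (max(r2) - min(r2)) * 100
sort(importance, decreasing = TRUE)
```

#Each predictor is scored on its own, so this kind of ranking does not depend on the fitted SVM and ignores interactions between predictors.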

#Error check 1 - performance on the testSet (postResample expects the predictions first and the observed values second)

postResample(readydata_svm_predict, testSet$Volume)
##        RMSE    Rsquared         MAE 
## 391.4194368   0.8898941 206.9953504
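
#The three metrics reported by postResample can be reproduced by hand. The sketch below uses small made-up vectors (obs and pred are hypothetical) and computes R-squared as the squared correlation between predictions and observations, which is the definition caret uses by default for this function.

```r
obs  <- c(10, 20, 30, 40)     # hypothetical observed volumes
pred <- c(12, 18, 33, 41)     # hypothetical predicted volumes

rmse <- sqrt(mean((pred - obs)^2))   # root mean squared error
mae  <- mean(abs(pred - obs))        # mean absolute error
rsq  <- cor(pred, obs)^2             # squared correlation

c(RMSE = rmse, Rsquared = rsq, MAE = mae)
```

#RMSE penalises large errors more heavily than MAE, so a big gap between the two usually points to a few badly predicted products.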

#Error check 2 - performance on the trainSet (compare the training predictions with trainSet$Volume)

readydata_svm_predict2 <- predict(object = readydata_svm, 
                             newdata = trainSet)
postResample(readydata_svm_predict2, trainSet$Volume)

#note: comparing these 56 training predictions against the 24 values of testSet$Volume instead recycles the shorter vector (warning: longer object length is not a multiple of shorter object length) and gives meaningless metrics (RMSE 1808.1496, Rsquared NA, MAE 929.0069)

#Error check 3 - confusion matrix - not yet working: round(prop.table(confusionMatrix(testSet$Volume, readydata_svm_predict)$table))

confusionMatrix(table(testSet$Volume, readydata_svm_predict))

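#calling confusionMatrix directly gives an error, most likely because the predicted and observed values do not share the same set of levels, so use the union function to collect every value that occurs in either vector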

U <- union(readydata_svm_predict, testSet$Volume)

#build a table from both vectors converted to factors that share the levels collected in U

readydata_conf_matrix <- table(factor(readydata_svm_predict, U), factor(testSet$Volume, U))
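
#The effect of the shared levels can be seen on a small example. The vectors pred and obs below are hypothetical. Without shared levels the table is not square, while with the union of the values used as levels every value gets both a row and a column.

```r
pred <- c(8, 12, 20, 12)      # hypothetical predicted values
obs  <- c(8, 12, 12, 8)       # hypothetical observed values

# Raw table: 3 distinct predicted values against 2 distinct observed values
raw <- table(pred, obs)
dim(raw)

# Shared levels: every value seen in either vector
lv <- union(pred, obs)
square <- table(factor(pred, lv), factor(obs, lv))
dim(square)

# Exact matches sit on the diagonal of the square table
sum(diag(square))
```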

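#run the confusion matrix (Volume is continuous, so a prediction almost never equals an observed value exactly and nearly all cells are zero)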

confusionMatrix(readydata_conf_matrix)
## Confusion Matrix and Statistics
## 
##                    
##                     -111.345580871333 -139.768119082221 69.1392320836945
##   -111.345580871333                 0                 0                0
##   -139.768119082221                 0                 0                0
##   69.1392320836945                  0                 0                0
##   -139.555340084582                 0                 0                0
##   -77.0104840810835                 0                 0                0
##   -102.910781300902                 0                 0                0
##   2050.5937087161                   0                 0                0
##   13.5597926256926                  0                 0                0
##   74.9116912054817                  0                 0                0
##   -18.0795931178567                 0                 0                0
##   172.934673095712                  0                 0                0
##   1058.5542072308                   0                 0                0
##   1064.13237335324                  0                 0                0
##   363.76145495496                   0                 0                0
##   1433.87597119037                  0                 0                0
##   1244.44290005606                  0                 0                0
##   3850.96162627287                  0                 0                0
##   326.114439752832                  0                 0                0
##   105.296328373813                  0                 0                0
##   415.544773264371                  0                 0                0
##   424.609077423724                  0                 0                0
##   106.745231243858                  0                 0                0
##   -41.2173078999225                 0                 0                0
##   446.693167676331                  0                 0                0
##   8                                 0                 0                0
##   12                                0                 0                0
##   196                               0                 0                0
##   84                                0                 0                0
##   32                                0                 0                0
##   80                                0                 0                0
##   1576                              0                 0                0
##   116                               0                 0                0
##   88                                0                 0                0
##   24                                0                 0                0
##   20                                0                 0                0
##   1232                              0                 0                0
##   368                               0                 0                0
##   1464                              0                 0                0
##   836                               0                 0                0
##   2140                              0                 0                0
##   204                               0                 0                0
##   296                               0                 0                0
##   232                               0                 0                0
##   4                                 0                 0                0
##   248                               0                 0                0
##                    
##                     -139.555340084582 -77.0104840810835 -102.910781300902
##   -111.345580871333                 0                 0                 0
##   -139.768119082221                 0                 0                 0
##   69.1392320836945                  0                 0                 0
##   -139.555340084582                 0                 0                 0
##   -77.0104840810835                 0                 0                 0
##   -102.910781300902                 0                 0                 0
##   2050.5937087161                   0                 0                 0
##   13.5597926256926                  0                 0                 0
##   74.9116912054817                  0                 0                 0
##   -18.0795931178567                 0                 0                 0
##   172.934673095712                  0                 0                 0
##   1058.5542072308                   0                 0                 0
##   1064.13237335324                  0                 0                 0
##   363.76145495496                   0                 0                 0
##   1433.87597119037                  0                 0                 0
##   1244.44290005606                  0                 0                 0
##   3850.96162627287                  0                 0                 0
##   326.114439752832                  0                 0                 0
##   105.296328373813                  0                 0                 0
##   415.544773264371                  0                 0                 0
##   424.609077423724                  0                 0                 0
##   106.745231243858                  0                 0                 0
##   -41.2173078999225                 0                 0                 0
##   446.693167676331                  0                 0                 0
##   8                                 0                 0                 0
##   12                                0                 0                 0
##   196                               0                 0                 0
##   84                                0                 0                 0
##   32                                0                 0                 0
##   80                                0                 0                 0
##   1576                              0                 0                 0
##   116                               0                 0                 0
##   88                                0                 0                 0
##   24                                0                 0                 0
##   20                                0                 0                 0
##   1232                              0                 0                 0
##   368                               0                 0                 0
##   1464                              0                 0                 0
##   836                               0                 0                 0
##   2140                              0                 0                 0
##   204                               0                 0                 0
##   296                               0                 0                 0
##   232                               0                 0                 0
##   4                                 0                 0                 0
##   248                               0                 0                 0
##                    
##                     2050.5937087161 13.5597926256926 74.9116912054817
##   -111.345580871333               0                0                0
##   -139.768119082221               0                0                0
##   69.1392320836945                0                0                0
##   -139.555340084582               0                0                0
##   -77.0104840810835               0                0                0
##   -102.910781300902               0                0                0
##   2050.5937087161                 0                0                0
##   13.5597926256926                0                0                0
##   74.9116912054817                0                0                0
##   -18.0795931178567               0                0                0
##   172.934673095712                0                0                0
##   1058.5542072308                 0                0                0
##   1064.13237335324                0                0                0
##   363.76145495496                 0                0                0
##   1433.87597119037                0                0                0
##   1244.44290005606                0                0                0
##   3850.96162627287                0                0                0
##   326.114439752832                0                0                0
##   105.296328373813                0                0                0
##   415.544773264371                0                0                0
##   424.609077423724                0                0                0
##   106.745231243858                0                0                0
##   -41.2173078999225               0                0                0
##   446.693167676331                0                0                0
##   8                               0                0                0
##   12                              0                0                0
##   196                             0                0                0
##   84                              0                0                0
##   32                              0                0                0
##   80                              0                0                0
##   1576                            0                0                0
##   116                             0                0                0
##   88                              0                0                0
##   24                              0                0                0
##   20                              0                0                0
##   1232                            0                0                0
##   368                             0                0                0
##   1464                            0                0                0
##   836                             0                0                0
##   2140                            0                0                0
##   204                             0                0                0
##   296                             0                0                0
##   232                             0                0                0
##   4                               0                0                0
##   248                             0                0                0
##                    
##                     -18.0795931178567 172.934673095712 1058.5542072308
##   -111.345580871333                 0                0               0
##   -139.768119082221                 0                0               0
##   69.1392320836945                  0                0               0
##   -139.555340084582                 0                0               0
##   -77.0104840810835                 0                0               0
##   -102.910781300902                 0                0               0
##   2050.5937087161                   0                0               0
##   13.5597926256926                  0                0               0
##   74.9116912054817                  0                0               0
##   -18.0795931178567                 0                0               0
##   172.934673095712                  0                0               0
##   1058.5542072308                   0                0               0
##   1064.13237335324                  0                0               0
##   363.76145495496                   0                0               0
##   1433.87597119037                  0                0               0
##   1244.44290005606                  0                0               0
##   3850.96162627287                  0                0               0
##   326.114439752832                  0                0               0
##   105.296328373813                  0                0               0
##   415.544773264371                  0                0               0
##   424.609077423724                  0                0               0
##   106.745231243858                  0                0               0
##   -41.2173078999225                 0                0               0
##   446.693167676331                  0                0               0
##   8                                 0                0               0
##   12                                0                0               0
##   196                               0                0               0
##   84                                0                0               0
##   32                                0                0               0
##   80                                0                0               0
##   1576                              0                0               0
##   116                               0                0               0
##   88                                0                0               0
##   24                                0                0               0
##   20                                0                0               0
##   1232                              0                0               0
##   368                               0                0               0
##   1464                              0                0               0
##   836                               0                0               0
##   2140                              0                0               0
##   204                               0                0               0
##   296                               0                0               0
##   232                               0                0               0
##   4                                 0                0               0
##   248                               0                0               0
##                    
##                     1064.13237335324 363.76145495496 1433.87597119037
##   -111.345580871333                0               0                0
##   -139.768119082221                0               0                0
##   69.1392320836945                 0               0                0
##   -139.555340084582                0               0                0
##   -77.0104840810835                0               0                0
##   -102.910781300902                0               0                0
##   2050.5937087161                  0               0                0
##   13.5597926256926                 0               0                0
##   74.9116912054817                 0               0                0
##   -18.0795931178567                0               0                0
##   172.934673095712                 0               0                0
##   1058.5542072308                  0               0                0
##   1064.13237335324                 0               0                0
##   363.76145495496                  0               0                0
##   1433.87597119037                 0               0                0
##   1244.44290005606                 0               0                0
##   3850.96162627287                 0               0                0
##   326.114439752832                 0               0                0
##   105.296328373813                 0               0                0
##   415.544773264371                 0               0                0
##   424.609077423724                 0               0                0
##   106.745231243858                 0               0                0
##   -41.2173078999225                0               0                0
##   446.693167676331                 0               0                0
##   8                                0               0                0
##   12                               0               0                0
##   196                              0               0                0
##   84                               0               0                0
##   32                               0               0                0
##   80                               0               0                0
##   1576                             0               0                0
##   116                              0               0                0
##   88                               0               0                0
##   24                               0               0                0
##   20                               0               0                0
##   1232                             0               0                0
##   368                              0               0                0
##   1464                             0               0                0
##   836                              0               0                0
##   2140                             0               0                0
##   204                              0               0                0
##   296                              0               0                0
##   232                              0               0                0
##   4                                0               0                0
##   248                              0               0                0
##                    
##                     1244.44290005606 3850.96162627287 326.114439752832
##   -111.345580871333                0                0                0
##   -139.768119082221                0                0                0
##   69.1392320836945                 0                0                0
##   -139.555340084582                0                0                0
##   -77.0104840810835                0                0                0
##   -102.910781300902                0                0                0
##   2050.5937087161                  0                0                0
##   13.5597926256926                 0                0                0
##   74.9116912054817                 0                0                0
##   -18.0795931178567                0                0                0
##   172.934673095712                 0                0                0
##   1058.5542072308                  0                0                0
##   1064.13237335324                 0                0                0
##   363.76145495496                  0                0                0
##   1433.87597119037                 0                0                0
##   1244.44290005606                 0                0                0
##   3850.96162627287                 0                0                0
##   326.114439752832                 0                0                0
##   105.296328373813                 0                0                0
##   415.544773264371                 0                0                0
##   424.609077423724                 0                0                0
##   106.745231243858                 0                0                0
##   -41.2173078999225                0                0                0
##   446.693167676331                 0                0                0
##   8                                0                0                0
##   12                               0                0                0
##   196                              0                0                0
##   84                               0                0                0
##   32                               0                0                0
##   80                               0                0                0
##   1576                             0                0                0
##   116                              0                0                0
##   88                               0                0                0
##   24                               0                0                0
##   20                               0                0                0
##   1232                             0                0                0
##   368                              0                0                0
##   1464                             0                0                0
##   836                              0                0                0
##   2140                             0                0                0
##   204                              0                0                0
##   296                              0                0                0
##   232                              0                0                0
##   4                                0                0                0
##   248                              0                0                0
##                    
##                     105.296328373813 415.544773264371 424.609077423724
##   -111.345580871333                0                0                0
##   -139.768119082221                0                0                0
##   69.1392320836945                 0                0                0
##   -139.555340084582                0                0                0
##   -77.0104840810835                0                0                0
##   -102.910781300902                0                0                0
##   2050.5937087161                  0                0                0
##   13.5597926256926                 0                0                0
##   74.9116912054817                 0                0                0
##   -18.0795931178567                0                0                0
##   172.934673095712                 0                0                0
##   1058.5542072308                  0                0                0
##   1064.13237335324                 0                0                0
##   363.76145495496                  0                0                0
##   1433.87597119037                 0                0                0
##   1244.44290005606                 0                0                0
##   3850.96162627287                 0                0                0
##   326.114439752832                 0                0                0
##   105.296328373813                 0                0                0
##   415.544773264371                 0                0                0
##   424.609077423724                 0                0                0
##   106.745231243858                 0                0                0
##   -41.2173078999225                0                0                0
##   446.693167676331                 0                0                0
##   8                                0                0                0
##   12                               0                0                0
##   196                              0                0                0
##   84                               0                0                0
##   32                               0                0                0
##   80                               0                0                0
##   1576                             0                0                0
##   116                              0                0                0
##   88                               0                0                0
##   24                               0                0                0
##   20                               0                0                0
##   1232                             0                0                0
##   368                              0                0                0
##   1464                             0                0                0
##   836                              0                0                0
##   2140                             0                0                0
##   204                              0                0                0
##   296                              0                0                0
##   232                              0                0                0
##   4                                0                0                0
##   248                              0                0                0
##                    
##                     106.745231243858 -41.2173078999225 446.693167676331 8
##   -111.345580871333                0                 0                0 1
##   -139.768119082221                0                 0                0 0
##   69.1392320836945                 0                 0                0 0
##   -139.555340084582                0                 0                0 0
##   -77.0104840810835                0                 0                0 0
##   -102.910781300902                0                 0                0 0
##   2050.5937087161                  0                 0                0 0
##   13.5597926256926                 0                 0                0 0
##   74.9116912054817                 0                 0                0 0
##   -18.0795931178567                0                 0                0 0
##   172.934673095712                 0                 0                0 0
##   1058.5542072308                  0                 0                0 0
##   1064.13237335324                 0                 0                0 0
##   363.76145495496                  0                 0                0 0
##   1433.87597119037                 0                 0                0 0
##   1244.44290005606                 0                 0                0 0
##   3850.96162627287                 0                 0                0 0
##   326.114439752832                 0                 0                0 0
##   105.296328373813                 0                 0                0 0
##   415.544773264371                 0                 0                0 0
##   424.609077423724                 0                 0                0 0
##   106.745231243858                 0                 0                0 0
##   -41.2173078999225                0                 0                0 0
##   446.693167676331                 0                 0                0 0
##   8                                0                 0                0 0
##   12                               0                 0                0 0
##   196                              0                 0                0 0
##   84                               0                 0                0 0
##   32                               0                 0                0 0
##   80                               0                 0                0 0
##   1576                             0                 0                0 0
##   116                              0                 0                0 0
##   88                               0                 0                0 0
##   24                               0                 0                0 0
##   20                               0                 0                0 0
##   1232                             0                 0                0 0
##   368                              0                 0                0 0
##   1464                             0                 0                0 0
##   836                              0                 0                0 0
##   2140                             0                 0                0 0
##   204                              0                 0                0 0
##   296                              0                 0                0 0
##   232                              0                 0                0 0
##   4                                0                 0                0 0
##   248                              0                 0                0 0
##                    
##                     12 196 84 32 80 1576 116 88 24 20 1232 368 1464 836
##   -111.345580871333  0   0  0  0  0    0   0  0  0  0    0   0    0   0
##   -139.768119082221  1   0  0  0  0    0   0  0  0  0    0   0    0   0
##   69.1392320836945   0   1  0  0  0    0   0  0  0  0    0   0    0   0
##   -139.555340084582  0   0  1  0  0    0   0  0  0  0    0   0    0   0
##   -77.0104840810835  0   0  0  1  0    0   0  0  0  0    0   0    0   0
##   -102.910781300902  0   0  0  0  1    0   0  0  0  0    0   0    0   0
##   2050.5937087161    0   0  0  0  0    1   0  0  0  0    0   0    0   0
##   13.5597926256926   0   0  0  0  0    0   1  0  0  0    0   0    0   0
##   74.9116912054817   0   0  0  0  0    0   0  1  0  0    0   0    0   0
##   -18.0795931178567  0   0  0  0  0    0   0  0  1  0    0   0    0   0
##   172.934673095712   0   0  0  0  0    0   0  0  0  1    0   0    0   0
##   1058.5542072308    0   0  0  0  0    0   0  0  0  0    1   0    0   0
##   1064.13237335324   0   0  0  0  0    0   0  0  0  0    1   0    0   0
##   363.76145495496    0   0  0  0  0    0   0  0  0  0    0   1    0   0
##   1433.87597119037   0   0  0  0  0    0   0  0  0  0    0   0    1   0
##   1244.44290005606   0   0  0  0  0    0   0  0  0  0    0   0    0   1
##   3850.96162627287   0   0  0  0  0    0   0  0  0  0    0   0    0   0
##   326.114439752832   0   0  0  0  0    0   0  0  0  0    0   0    0   0
##   105.296328373813   0   0  1  0  0    0   0  0  0  0    0   0    0   0
##   415.544773264371   0   0  0  0  0    0   0  0  0  0    0   0    0   0
##   424.609077423724   0   0  0  0  0    0   0  0  0  0    0   0    0   0
##   106.745231243858   0   0  0  1  0    0   0  0  0  0    0   0    0   0
##   -41.2173078999225  0   0  0  0  0    0   0  0  0  0    0   0    0   0
##   446.693167676331   0   0  0  0  0    0   0  0  0  0    0   0    0   0
##   8                  0   0  0  0  0    0   0  0  0  0    0   0    0   0
##   12                 0   0  0  0  0    0   0  0  0  0    0   0    0   0
##   196                0   0  0  0  0    0   0  0  0  0    0   0    0   0
##   84                 0   0  0  0  0    0   0  0  0  0    0   0    0   0
##   32                 0   0  0  0  0    0   0  0  0  0    0   0    0   0
##   80                 0   0  0  0  0    0   0  0  0  0    0   0    0   0
##   1576               0   0  0  0  0    0   0  0  0  0    0   0    0   0
##   116                0   0  0  0  0    0   0  0  0  0    0   0    0   0
##   88                 0   0  0  0  0    0   0  0  0  0    0   0    0   0
##   24                 0   0  0  0  0    0   0  0  0  0    0   0    0   0
##   20                 0   0  0  0  0    0   0  0  0  0    0   0    0   0
##   1232               0   0  0  0  0    0   0  0  0  0    0   0    0   0
##   368                0   0  0  0  0    0   0  0  0  0    0   0    0   0
##   1464               0   0  0  0  0    0   0  0  0  0    0   0    0   0
##   836                0   0  0  0  0    0   0  0  0  0    0   0    0   0
##   2140               0   0  0  0  0    0   0  0  0  0    0   0    0   0
##   204                0   0  0  0  0    0   0  0  0  0    0   0    0   0
##   296                0   0  0  0  0    0   0  0  0  0    0   0    0   0
##   232                0   0  0  0  0    0   0  0  0  0    0   0    0   0
##   4                  0   0  0  0  0    0   0  0  0  0    0   0    0   0
##   248                0   0  0  0  0    0   0  0  0  0    0   0    0   0
##                    
##                     2140 204 296 232 4 248
##   -111.345580871333    0   0   0   0 0   0
##   -139.768119082221    0   0   0   0 0   0
##   69.1392320836945     0   0   0   0 0   0
##   -139.555340084582    0   0   0   0 0   0
##   -77.0104840810835    0   0   0   0 0   0
##   -102.910781300902    0   0   0   0 0   0
##   2050.5937087161      0   0   0   0 0   0
##   13.5597926256926     0   0   0   0 0   0
##   74.9116912054817     0   0   0   0 0   0
##   -18.0795931178567    0   0   0   0 0   0
##   172.934673095712     0   0   0   0 0   0
##   1058.5542072308      0   0   0   0 0   0
##   1064.13237335324     0   0   0   0 0   0
##   363.76145495496      0   0   0   0 0   0
##   1433.87597119037     0   0   0   0 0   0
##   1244.44290005606     0   0   0   0 0   0
##   3850.96162627287     1   0   0   0 0   0
##   326.114439752832     0   1   0   0 0   0
##   105.296328373813     0   0   0   0 0   0
##   415.544773264371     0   0   1   0 0   0
##   424.609077423724     0   0   0   1 0   0
##   106.745231243858     0   0   0   0 0   0
##   -41.2173078999225    0   0   0   0 1   0
##   446.693167676331     0   0   0   0 0   1
##   8                    0   0   0   0 0   0
##   12                   0   0   0   0 0   0
##   196                  0   0   0   0 0   0
##   84                   0   0   0   0 0   0
##   32                   0   0   0   0 0   0
##   80                   0   0   0   0 0   0
##   1576                 0   0   0   0 0   0
##   116                  0   0   0   0 0   0
##   88                   0   0   0   0 0   0
##   24                   0   0   0   0 0   0
##   20                   0   0   0   0 0   0
##   1232                 0   0   0   0 0   0
##   368                  0   0   0   0 0   0
##   1464                 0   0   0   0 0   0
##   836                  0   0   0   0 0   0
##   2140                 0   0   0   0 0   0
##   204                  0   0   0   0 0   0
##   296                  0   0   0   0 0   0
##   232                  0   0   0   0 0   0
##   4                    0   0   0   0 0   0
##   248                  0   0   0   0 0   0
## 
## Overall Statistics
##                                      
##                Accuracy : 0          
##                  95% CI : (0, 0.1425)
##     No Information Rate : 0.0833     
##     P-Value [Acc > NIR] : 1          
##                                      
##                   Kappa : 0          
##                                      
##  Mcnemar's Test P-Value : NA         
## 
## Statistics by Class:
## 
##                      Class: -111.345580871333 Class: -139.768119082221
## Sensitivity                                NA                       NA
## Specificity                           0.95833                  0.95833
## Pos Pred Value                             NA                       NA
## Neg Pred Value                             NA                       NA
## Prevalence                            0.00000                  0.00000
## Detection Rate                        0.00000                  0.00000
## Detection Prevalence                  0.04167                  0.04167
## Balanced Accuracy                          NA                       NA
##                      Class: 69.1392320836945 Class: -139.555340084582
## Sensitivity                               NA                       NA
## Specificity                          0.95833                  0.95833
## Pos Pred Value                            NA                       NA
## Neg Pred Value                            NA                       NA
## Prevalence                           0.00000                  0.00000
## Detection Rate                       0.00000                  0.00000
## Detection Prevalence                 0.04167                  0.04167
## Balanced Accuracy                         NA                       NA
##                      Class: -77.0104840810835 Class: -102.910781300902
## Sensitivity                                NA                       NA
## Specificity                           0.95833                  0.95833
## Pos Pred Value                             NA                       NA
## Neg Pred Value                             NA                       NA
## Prevalence                            0.00000                  0.00000
## Detection Rate                        0.00000                  0.00000
## Detection Prevalence                  0.04167                  0.04167
## Balanced Accuracy                          NA                       NA
##                      Class: 2050.5937087161 Class: 13.5597926256926
## Sensitivity                              NA                      NA
## Specificity                         0.95833                 0.95833
## Pos Pred Value                           NA                      NA
## Neg Pred Value                           NA                      NA
## Prevalence                          0.00000                 0.00000
## Detection Rate                      0.00000                 0.00000
## Detection Prevalence                0.04167                 0.04167
## Balanced Accuracy                        NA                      NA
##                      Class: 74.9116912054817 Class: -18.0795931178567
## Sensitivity                               NA                       NA
## Specificity                          0.95833                  0.95833
## Pos Pred Value                            NA                       NA
## Neg Pred Value                            NA                       NA
## Prevalence                           0.00000                  0.00000
## Detection Rate                       0.00000                  0.00000
## Detection Prevalence                 0.04167                  0.04167
## Balanced Accuracy                         NA                       NA
##                      Class: 172.934673095712 Class: 1058.5542072308
## Sensitivity                               NA                     NA
## Specificity                          0.95833                0.95833
## Pos Pred Value                            NA                     NA
## Neg Pred Value                            NA                     NA
## Prevalence                           0.00000                0.00000
## Detection Rate                       0.00000                0.00000
## Detection Prevalence                 0.04167                0.04167
## Balanced Accuracy                         NA                     NA
##                      Class: 1064.13237335324 Class: 363.76145495496
## Sensitivity                               NA                     NA
## Specificity                          0.95833                0.95833
## Pos Pred Value                            NA                     NA
## Neg Pred Value                            NA                     NA
## Prevalence                           0.00000                0.00000
## Detection Rate                       0.00000                0.00000
## Detection Prevalence                 0.04167                0.04167
## Balanced Accuracy                         NA                     NA
##                      Class: 1433.87597119037 Class: 1244.44290005606
## Sensitivity                               NA                      NA
## Specificity                          0.95833                 0.95833
## Pos Pred Value                            NA                      NA
## Neg Pred Value                            NA                      NA
## Prevalence                           0.00000                 0.00000
## Detection Rate                       0.00000                 0.00000
## Detection Prevalence                 0.04167                 0.04167
## Balanced Accuracy                         NA                      NA
##                      Class: 3850.96162627287 Class: 326.114439752832
## Sensitivity                               NA                      NA
## Specificity                          0.95833                 0.95833
## Pos Pred Value                            NA                      NA
## Neg Pred Value                            NA                      NA
## Prevalence                           0.00000                 0.00000
## Detection Rate                       0.00000                 0.00000
## Detection Prevalence                 0.04167                 0.04167
## Balanced Accuracy                         NA                      NA
##                      Class: 105.296328373813 Class: 415.544773264371
## Sensitivity                               NA                      NA
## Specificity                          0.95833                 0.95833
## Pos Pred Value                            NA                      NA
## Neg Pred Value                            NA                      NA
## Prevalence                           0.00000                 0.00000
## Detection Rate                       0.00000                 0.00000
## Detection Prevalence                 0.04167                 0.04167
## Balanced Accuracy                         NA                      NA
##                      Class: 424.609077423724 Class: 106.745231243858
## Sensitivity                               NA                      NA
## Specificity                          0.95833                 0.95833
## Pos Pred Value                            NA                      NA
## Neg Pred Value                            NA                      NA
## Prevalence                           0.00000                 0.00000
## Detection Rate                       0.00000                 0.00000
## Detection Prevalence                 0.04167                 0.04167
## Balanced Accuracy                         NA                      NA
##                      Class: -41.2173078999225 Class: 446.693167676331
## Sensitivity                                NA                      NA
## Specificity                           0.95833                 0.95833
## Pos Pred Value                             NA                      NA
## Neg Pred Value                             NA                      NA
## Prevalence                            0.00000                 0.00000
## Detection Rate                        0.00000                 0.00000
## Detection Prevalence                  0.04167                 0.04167
## Balanced Accuracy                          NA                      NA
##                      Class: 8 Class: 12 Class: 196 Class: 84 Class: 32
## Sensitivity           0.00000   0.00000    0.00000   0.00000   0.00000
## Specificity           1.00000   1.00000    1.00000   1.00000   1.00000
## Pos Pred Value            NaN       NaN        NaN       NaN       NaN
## Neg Pred Value        0.95833   0.95833    0.95833   0.91667   0.91667
## Prevalence            0.04167   0.04167    0.04167   0.08333   0.08333
## Detection Rate        0.00000   0.00000    0.00000   0.00000   0.00000
## Detection Prevalence  0.00000   0.00000    0.00000   0.00000   0.00000
## Balanced Accuracy     0.50000   0.50000    0.50000   0.50000   0.50000
##                      Class: 80 Class: 1576 Class: 116 Class: 88 Class: 24
## Sensitivity            0.00000     0.00000    0.00000   0.00000   0.00000
## Specificity            1.00000     1.00000    1.00000   1.00000   1.00000
## Pos Pred Value             NaN         NaN        NaN       NaN       NaN
## Neg Pred Value         0.95833     0.95833    0.95833   0.95833   0.95833
## Prevalence             0.04167     0.04167    0.04167   0.04167   0.04167
## Detection Rate         0.00000     0.00000    0.00000   0.00000   0.00000
## Detection Prevalence   0.00000     0.00000    0.00000   0.00000   0.00000
## Balanced Accuracy      0.50000     0.50000    0.50000   0.50000   0.50000
##                      Class: 20 Class: 1232 Class: 368 Class: 1464
## Sensitivity            0.00000     0.00000    0.00000     0.00000
## Specificity            1.00000     1.00000    1.00000     1.00000
## Pos Pred Value             NaN         NaN        NaN         NaN
## Neg Pred Value         0.95833     0.91667    0.95833     0.95833
## Prevalence             0.04167     0.08333    0.04167     0.04167
## Detection Rate         0.00000     0.00000    0.00000     0.00000
## Detection Prevalence   0.00000     0.00000    0.00000     0.00000
## Balanced Accuracy      0.50000     0.50000    0.50000     0.50000
##                      Class: 836 Class: 2140 Class: 204 Class: 296
## Sensitivity             0.00000     0.00000    0.00000    0.00000
## Specificity             1.00000     1.00000    1.00000    1.00000
## Pos Pred Value              NaN         NaN        NaN        NaN
## Neg Pred Value          0.95833     0.95833    0.95833    0.95833
## Prevalence              0.04167     0.04167    0.04167    0.04167
## Detection Rate          0.00000     0.00000    0.00000    0.00000
## Detection Prevalence    0.00000     0.00000    0.00000    0.00000
## Balanced Accuracy       0.50000     0.50000    0.50000    0.50000
##                      Class: 232 Class: 4 Class: 248
## Sensitivity             0.00000  0.00000    0.00000
## Specificity             1.00000  1.00000    1.00000
## Pos Pred Value              NaN      NaN        NaN
## Neg Pred Value          0.95833  0.95833    0.95833
## Prevalence              0.04167  0.04167    0.04167
## Detection Rate          0.00000  0.00000    0.00000
## Detection Prevalence    0.00000  0.00000    0.00000
## Balanced Accuracy       0.50000  0.50000    0.50000
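
The accuracy of 0 reported above is expected and does not by itself mean the model failed. A confusion matrix treats every distinct number as its own class, so a continuous prediction such as 69.14 can never exactly match an observed volume such as 196. For a regression model, error-based metrics are more informative. The sketch below is an illustration and not part of the original workflow: it takes the 24 prediction/observation pairs as they can be read from the table above (predictions rounded to two decimals) and passes them to caret's postResample(). The vector names predicted and observed are assumptions introduced here.

```r
library(caret)

# Prediction/observation pairs read from the confusion matrix above
# (predictions rounded to two decimals); the vector names are illustrative
predicted <- c(-111.35, -139.77, 69.14, -139.56, -77.01, -102.91, 2050.59,
               13.56, 74.91, -18.08, 172.93, 1058.55, 1064.13, 363.76,
               1433.88, 1244.44, 3850.96, 326.11, 105.30, 415.54, 424.61,
               106.75, -41.22, 446.69)
observed  <- c(8, 12, 196, 84, 32, 80, 1576,
               116, 88, 24, 20, 1232, 1232, 368,
               1464, 836, 2140, 204, 84, 296, 232,
               32, 4, 248)

# Exact-match accuracy is what the confusion matrix measures
exact_match <- mean(predicted == observed)

# RMSE, R-squared and MAE are the usual regression metrics
metrics <- postResample(pred = predicted, obs = observed)

print(exact_match)
print(metrics)
```

Several of the predictions are negative, which is not possible for a sales volume. That is a property of the fitted model, not of the metric, and is worth keeping in mind when the algorithms are compared.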

#Error check 4 - ggplot (this can be used to compare the final results of all the models; it is done at the end of the process)
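
One way to make such a comparison is a predicted-versus-observed scatter plot. This is a minimal sketch, assuming a small data frame (called results here, a hypothetical name) that holds a handful of the prediction/observation pairs from the output above. Points on the dashed line would be perfect predictions.

```r
library(ggplot2)

# A few prediction/observation pairs from the output above (rounded);
# the data frame name and column names are illustrative
results <- data.frame(
  observed  = c(8, 196, 1576, 368, 1464, 2140),
  predicted = c(-111.35, 69.14, 2050.59, 363.76, 1433.88, 3850.96)
)

# Scatter plot with a 45-degree reference line
p <- ggplot(results, aes(x = observed, y = predicted)) +
  geom_point() +
  geom_abline(slope = 1, intercept = 0, linetype = "dashed") +
  labs(x = "Observed volume", y = "Predicted volume")

print(p)
```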

#the tested models are then applied to the second dataset (the new products)

new_products <- read.csv("C:/Users/gebruiker/Desktop/Ubiqum_1/newproductattributes2017.csv")

#load the caret package

library(caret)

#dummify the data - typical datasets don’t contain only numeric values; most contain a mixture of numeric and nominal data. Dummifying makes both usable for developing regression models and making predictions.

#How to dummify the data - convert categorical variables (factor and character variables) to binary variables using the process below

#dummify the data - step 1 - create a dummy-variable encoder from the new products data

newDataFrame <- dummyVars(" ~ .", data = new_products)

#next, apply the encoder (newDataFrame) to the new products data and store the fully numeric result in a data frame called readyData_newprod

readyData_newprod <- data.frame(predict(newDataFrame, newdata = new_products))

#cross-check the structure to ensure that no nominal variables remain

str(readyData_newprod)
## 'data.frame':    24 obs. of  29 variables:
##  $ ProductType.Accessories     : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ ProductType.Display         : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ ProductType.ExtendedWarranty: num  0 0 0 0 0 0 0 0 0 0 ...
##  $ ProductType.GameConsole     : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ ProductType.Laptop          : num  0 0 1 1 1 0 0 0 0 0 ...
##  $ ProductType.Netbook         : num  0 0 0 0 0 1 1 1 1 0 ...
##  $ ProductType.PC              : num  1 1 0 0 0 0 0 0 0 0 ...
##  $ ProductType.Printer         : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ ProductType.PrinterSupplies : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ ProductType.Smartphone      : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ ProductType.Software        : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ ProductType.Tablet          : num  0 0 0 0 0 0 0 0 0 1 ...
##  $ ProductNum                  : num  171 172 173 175 176 178 180 181 183 186 ...
##  $ Price                       : num  699 860 1199 1199 1999 ...
##  $ x5StarReviews               : num  96 51 74 7 1 19 312 23 3 296 ...
##  $ x4StarReviews               : num  26 11 10 2 1 8 112 18 4 66 ...
##  $ x3StarReviews               : num  14 10 3 1 1 4 28 7 0 30 ...
##  $ x2StarReviews               : num  14 10 3 1 3 1 31 22 1 21 ...
##  $ x1StarReviews               : num  25 21 11 1 0 10 47 18 0 36 ...
##  $ PositiveServiceReview       : num  12 7 11 2 0 2 28 5 1 28 ...
##  $ NegativeServiceReview       : num  3 5 5 1 1 4 16 16 0 9 ...
##  $ Recommendproduct            : num  0.7 0.6 0.8 0.6 0.3 0.6 0.7 0.4 0.7 0.8 ...
##  $ BestSellersRank             : num  2498 490 111 4446 2820 ...
##  $ ShippingWeight              : num  19.9 27 6.6 13 11.6 5.8 4.6 4.8 4.3 3 ...
##  $ ProductDepth                : num  20.63 21.89 8.94 16.3 16.81 ...
##  $ ProductWidth                : num  19.2 27 12.8 10.8 10.9 ...
##  $ ProductHeight               : num  8.39 9.13 0.68 1.4 0.88 1.2 0.95 1.5 0.97 0.37 ...
##  $ ProfitMargin                : num  0.25 0.2 0.1 0.15 0.23 0.08 0.09 0.11 0.09 0.1 ...
##  $ Volume                      : num  0 0 0 0 0 0 0 0 0 0 ...
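
The dummy-encoding step above is easier to follow on a very small example. The sketch below uses a toy data frame with made-up values (the names toy, encoder and toy_ready are hypothetical) and the same two caret calls, dummyVars() and predict(), to turn one nominal column into one binary column per level.

```r
library(caret)

# Toy data frame (hypothetical values) with one nominal and one numeric column
toy <- data.frame(
  ProductType = factor(c("PC", "Laptop", "Netbook", "PC")),
  Price       = c(699, 1199, 329, 860)
)

# Build the encoder, then apply it to get an all-numeric data frame
encoder   <- dummyVars(" ~ .", data = toy)
toy_ready <- data.frame(predict(encoder, newdata = toy))

# Each level of ProductType is now its own 0/1 column; Price is unchanged
str(toy_ready)
```

Because the models were trained on the dummified existing products, the new data needs the same columns in the same form. Comparing names(readyData_newprod) with the column names of the training data frame is a quick way to confirm this.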

#Check for missing data

summary(readyData_newprod)
##  ProductType.Accessories ProductType.Display ProductType.ExtendedWarranty
##  Min.   :0.00000         Min.   :0.00000     Min.   :0.00000             
##  1st Qu.:0.00000         1st Qu.:0.00000     1st Qu.:0.00000             
##  Median :0.00000         Median :0.00000     Median :0.00000             
##  Mean   :0.08333         Mean   :0.04167     Mean   :0.04167             
##  3rd Qu.:0.00000         3rd Qu.:0.00000     3rd Qu.:0.00000             
##  Max.   :1.00000         Max.   :1.00000     Max.   :1.00000             
##  ProductType.GameConsole ProductType.Laptop ProductType.Netbook
##  Min.   :0.00000         Min.   :0.000      Min.   :0.0000     
##  1st Qu.:0.00000         1st Qu.:0.000      1st Qu.:0.0000     
##  Median :0.00000         Median :0.000      Median :0.0000     
##  Mean   :0.08333         Mean   :0.125      Mean   :0.1667     
##  3rd Qu.:0.00000         3rd Qu.:0.000      3rd Qu.:0.0000     
##  Max.   :1.00000         Max.   :1.000      Max.   :1.0000     
##  ProductType.PC    ProductType.Printer ProductType.PrinterSupplies
##  Min.   :0.00000   Min.   :0.00000     Min.   :0.00000            
##  1st Qu.:0.00000   1st Qu.:0.00000     1st Qu.:0.00000            
##  Median :0.00000   Median :0.00000     Median :0.00000            
##  Mean   :0.08333   Mean   :0.04167     Mean   :0.04167            
##  3rd Qu.:0.00000   3rd Qu.:0.00000     3rd Qu.:0.00000            
##  Max.   :1.00000   Max.   :1.00000     Max.   :1.00000            
##  ProductType.Smartphone ProductType.Software ProductType.Tablet
##  Min.   :0.0000         Min.   :0.00000      Min.   :0.00000   
##  1st Qu.:0.0000         1st Qu.:0.00000      1st Qu.:0.00000   
##  Median :0.0000         Median :0.00000      Median :0.00000   
##  Mean   :0.1667         Mean   :0.04167      Mean   :0.08333   
##  3rd Qu.:0.0000         3rd Qu.:0.00000      3rd Qu.:0.00000   
##  Max.   :1.0000         Max.   :1.00000      Max.   :1.00000   
##    ProductNum        Price        x5StarReviews     x4StarReviews   
##  Min.   :171.0   Min.   :   8.5   Min.   :   0.00   Min.   :  0.00  
##  1st Qu.:179.5   1st Qu.: 130.0   1st Qu.:  16.00   1st Qu.:  2.00  
##  Median :193.5   Median : 275.0   Median :  46.00   Median : 10.50  
##  Mean   :219.5   Mean   : 425.6   Mean   : 178.50   Mean   : 48.04  
##  3rd Qu.:301.2   3rd Qu.: 486.5   3rd Qu.:  99.25   3rd Qu.: 26.00  
##  Max.   :307.0   Max.   :1999.0   Max.   :1525.00   Max.   :437.00  
##  x3StarReviews    x2StarReviews    x1StarReviews    PositiveServiceReview
##  Min.   :  0.00   Min.   :  0.00   Min.   :  0.00   Min.   : 0.00        
##  1st Qu.:  1.75   1st Qu.:  1.00   1st Qu.:  1.75   1st Qu.: 2.00        
##  Median :  4.50   Median :  4.00   Median : 13.00   Median : 5.00        
##  Mean   : 21.92   Mean   : 17.50   Mean   : 27.58   Mean   :13.46        
##  3rd Qu.: 16.75   3rd Qu.: 20.25   3rd Qu.: 35.25   3rd Qu.:12.50        
##  Max.   :224.00   Max.   :160.00   Max.   :247.00   Max.   :90.00        
##  NegativeServiceReview Recommendproduct BestSellersRank   
##  Min.   : 0.000        Min.   :0.3000   Min.   :    1.00  
##  1st Qu.: 1.000        1st Qu.:0.6000   1st Qu.:   93.25  
##  Median : 3.500        Median :0.7000   Median :  750.50  
##  Mean   : 5.667        Mean   :0.6708   Mean   : 3957.62  
##  3rd Qu.: 7.500        3rd Qu.:0.8000   3rd Qu.: 3150.00  
##  Max.   :23.000        Max.   :1.0000   Max.   :44465.00  
##  ShippingWeight    ProductDepth     ProductWidth    ProductHeight   
##  Min.   : 0.200   Min.   : 0.000   Min.   : 0.000   Min.   : 0.000  
##  1st Qu.: 0.900   1st Qu.: 5.225   1st Qu.: 5.832   1st Qu.: 0.400  
##  Median : 4.450   Median : 8.000   Median : 9.950   Median : 0.985  
##  Mean   : 7.802   Mean   : 9.094   Mean   :10.408   Mean   : 3.541  
##  3rd Qu.: 9.575   3rd Qu.:11.425   3rd Qu.:12.875   3rd Qu.: 2.888  
##  Max.   :42.000   Max.   :21.890   Max.   :27.010   Max.   :25.800  
##   ProfitMargin        Volume 
##  Min.   :0.0500   Min.   :0  
##  1st Qu.:0.0975   1st Qu.:0  
##  Median :0.1150   Median :0  
##  Mean   :0.1817   Mean   :0  
##  3rd Qu.:0.2000   3rd Qu.:0  
##  Max.   :0.9000   Max.   :0

#replace any missing values with 0 (the summary above reports no missing values in this dataset)

readyData_newprod[is.na(readyData_newprod)] <- 0
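
Before overwriting missing values it can help to count them per column, since replacing an NA with 0 keeps the row but makes the model treat the gap as a real zero. A minimal sketch on a toy data frame with made-up values (toy and na_counts are hypothetical names):

```r
# Toy data frame (hypothetical values) with two missing entries
toy <- data.frame(
  Price           = c(699, NA, 1199),
  BestSellersRank = c(2498, 490, NA)
)

# Count missing values per column before changing anything
na_counts <- colSums(is.na(toy))
print(na_counts)

# Same replacement as above: every NA becomes 0
toy[is.na(toy)] <- 0
print(anyNA(toy))
```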

#Compute the correlation matrix for all variables. Volume is 0 for every new product, so its standard deviation is zero and its correlations are NA (hence the warning below)

corrData <- cor(readyData_newprod)
## Warning in cor(readyData_newprod): the standard deviation is zero
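
The warning arises because Volume is 0 for every new product, so it has no variance to correlate with anything. One possible way to avoid it, sketched here on a toy data frame with made-up values (toy, constant_cols and corr_toy are hypothetical names), is to leave constant columns out before calling cor().

```r
# Toy data frame (hypothetical values); Volume is constant,
# as it is in the new products data
toy <- data.frame(
  Price         = c(699, 860, 1199, 1999),
  x5StarReviews = c(96, 51, 74, 7),
  Volume        = c(0, 0, 0, 0)
)

# Identify columns whose standard deviation is zero
constant_cols <- names(toy)[sapply(toy, sd) == 0]
print(constant_cols)

# Correlate only the columns that vary: no warning and no NA entries
corr_toy <- cor(toy[, !(names(toy) %in% constant_cols)])
print(corr_toy)
```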

#print the correlation matrix

corrData
##                              ProductType.Accessories ProductType.Display
## ProductType.Accessories                  1.000000000         -0.06286946
## ProductType.Display                     -0.062869461          1.00000000
## ProductType.ExtendedWarranty            -0.062869461         -0.04347826
## ProductType.GameConsole                 -0.090909091         -0.06286946
## ProductType.Laptop                      -0.113960576         -0.07881104
## ProductType.Netbook                     -0.134839972         -0.09325048
## ProductType.PC                          -0.090909091         -0.06286946
## ProductType.Printer                     -0.062869461         -0.04347826
## ProductType.PrinterSupplies             -0.062869461         -0.04347826
## ProductType.Smartphone                  -0.134839972         -0.09325048
## ProductType.Software                    -0.062869461         -0.04347826
## ProductType.Tablet                      -0.090909091         -0.06286946
## ProductNum                               0.450781996         -0.07033263
## Price                                   -0.265412932         -0.12759805
## x5StarReviews                           -0.130951120         -0.10465503
## x4StarReviews                           -0.143700957         -0.10258130
## x3StarReviews                           -0.118301893         -0.09736158
## x2StarReviews                           -0.142932035         -0.11160107
## x1StarReviews                           -0.123548795         -0.10884094
## PositiveServiceReview                   -0.164968075         -0.12404259
## NegativeServiceReview                   -0.257361129         -0.16075767
## Recommendproduct                         0.050438089          0.03488117
## BestSellersRank                         -0.129158661         -0.08259333
## ShippingWeight                          -0.207289431          0.02272106
## ProductDepth                            -0.004891727          0.16260679
## ProductWidth                            -0.017290702          0.24028011
## ProductHeight                           -0.127310054          0.13555836
## ProfitMargin                            -0.188703553         -0.16108668
## Volume                                            NA                  NA
##                              ProductType.ExtendedWarranty
## ProductType.Accessories                       -0.06286946
## ProductType.Display                           -0.04347826
## ProductType.ExtendedWarranty                   1.00000000
## ProductType.GameConsole                       -0.06286946
## ProductType.Laptop                            -0.07881104
## ProductType.Netbook                           -0.09325048
## ProductType.PC                                -0.06286946
## ProductType.Printer                           -0.04347826
## ProductType.PrinterSupplies                   -0.04347826
## ProductType.Smartphone                        -0.09325048
## ProductType.Software                          -0.04347826
## ProductType.Tablet                            -0.06286946
## ProductNum                                     0.32885257
## Price                                         -0.14547070
## x5StarReviews                                 -0.10705400
## x4StarReviews                                 -0.10044605
## x3StarReviews                                 -0.09291922
## x2StarReviews                                 -0.10522386
## x1StarReviews                                 -0.11309531
## PositiveServiceReview                         -0.13399919
## NegativeServiceReview                         -0.09186153
## Recommendproduct                              -0.32389658
## BestSellersRank                               -0.08984429
## ShippingWeight                                -0.15732285
## ProductDepth                                  -0.32814546
## ProductWidth                                  -0.34771780
## ProductHeight                                 -0.12771428
## ProfitMargin                                   0.26711841
## Volume                                                 NA
##                              ProductType.GameConsole ProductType.Laptop
## ProductType.Accessories                  -0.09090909        -0.11396058
## ProductType.Display                      -0.06286946        -0.07881104
## ProductType.ExtendedWarranty             -0.06286946        -0.07881104
## ProductType.GameConsole                   1.00000000        -0.11396058
## ProductType.Laptop                       -0.11396058         1.00000000
## ProductType.Netbook                      -0.13483997        -0.16903085
## ProductType.PC                           -0.09090909        -0.11396058
## ProductType.Printer                      -0.06286946        -0.07881104
## ProductType.PrinterSupplies              -0.06286946        -0.07881104
## ProductType.Smartphone                   -0.13483997        -0.16903085
## ProductType.Software                     -0.06286946        -0.07881104
## ProductType.Tablet                       -0.09090909        -0.11396058
## ProductNum                                0.18416094        -0.30895915
## Price                                    -0.05693774         0.84212919
## x5StarReviews                             0.70678916        -0.16433710
## x4StarReviews                             0.39044978        -0.16917232
## x3StarReviews                             0.25748059        -0.16306209
## x2StarReviews                             0.17520701        -0.17532157
## x1StarReviews                             0.14713072        -0.18186756
## PositiveServiceReview                     0.46131073        -0.16468675
## NegativeServiceReview                     0.34038085        -0.20814145
## Recommendproduct                          0.30983398        -0.22581247
## BestSellersRank                          -0.12465575        -0.06174655
## ShippingWeight                            0.19145439         0.09745391
## ProductDepth                             -0.09098613         0.32200363
## ProductWidth                             -0.03902988         0.06613537
## ProductHeight                             0.20514324        -0.16700053
## ProfitMargin                             -0.08255780        -0.04804971
## Volume                                            NA                 NA
##                              ProductType.Netbook ProductType.PC
## ProductType.Accessories              -0.13483997    -0.09090909
## ProductType.Display                  -0.09325048    -0.06286946
## ProductType.ExtendedWarranty         -0.09325048    -0.06286946
## ProductType.GameConsole              -0.13483997    -0.09090909
## ProductType.Laptop                   -0.16903085    -0.11396058
## ProductType.Netbook                   1.00000000    -0.13483997
## ProductType.PC                       -0.13483997     1.00000000
## ProductType.Printer                  -0.09325048    -0.06286946
## ProductType.PrinterSupplies          -0.09325048    -0.06286946
## ProductType.Smartphone               -0.20000000    -0.13483997
## ProductType.Software                 -0.09325048    -0.06286946
## ProductType.Tablet                   -0.13483997    -0.09090909
## ProductNum                           -0.31800113    -0.26387239
## Price                                -0.04900114     0.22856832
## x5StarReviews                        -0.11480263    -0.09105873
## x4StarReviews                        -0.05743602    -0.09121216
## x3StarReviews                        -0.11592140    -0.06370102
## x2StarReviews                        -0.05129092    -0.05071782
## x1StarReviews                        -0.08060067    -0.02819578
## PositiveServiceReview                -0.09520556    -0.05698897
## NegativeServiceReview                 0.24627629    -0.08301972
## Recommendproduct                     -0.18168574    -0.03602721
## BestSellersRank                      -0.02631946    -0.08097427
## ShippingWeight                       -0.12991915     0.46825592
## ProductDepth                         -0.04595216     0.63481575
## ProductWidth                         -0.01489812     0.61459487
## ProductHeight                        -0.18457706     0.27215539
## ProfitMargin                         -0.23397272     0.07666082
## Volume                                        NA             NA
##                              ProductType.Printer
## ProductType.Accessories              -0.06286946
## ProductType.Display                  -0.04347826
## ProductType.ExtendedWarranty         -0.04347826
## ProductType.GameConsole              -0.06286946
## ProductType.Laptop                   -0.07881104
## ProductType.Netbook                  -0.09325048
## ProductType.PC                       -0.06286946
## ProductType.Printer                   1.00000000
## ProductType.PrinterSupplies          -0.04347826
## ProductType.Smartphone               -0.09325048
## ProductType.Software                 -0.04347826
## ProductType.Tablet                   -0.06286946
## ProductNum                            0.32124904
## Price                                -0.10080023
## x5StarReviews                        -0.05427668
## x4StarReviews                        -0.08549925
## x3StarReviews                        -0.08403452
## x2StarReviews                        -0.10522386
## x1StarReviews                        -0.10458657
## PositiveServiceReview                -0.08421621
## NegativeServiceReview                -0.16075767
## Recommendproduct                      0.15447375
## BestSellersRank                      -0.08904873
## ShippingWeight                        0.70771569
## ProductDepth                          0.29612026
## ProductWidth                          0.43739305
## ProductHeight                         0.80275615
## ProfitMargin                          0.87883997
## Volume                                        NA
##                              ProductType.PrinterSupplies
## ProductType.Accessories                      -0.06286946
## ProductType.Display                          -0.04347826
## ProductType.ExtendedWarranty                 -0.04347826
## ProductType.GameConsole                      -0.06286946
## ProductType.Laptop                           -0.07881104
## ProductType.Netbook                          -0.09325048
## ProductType.PC                               -0.06286946
## ProductType.Printer                          -0.04347826
## ProductType.PrinterSupplies                   1.00000000
## ProductType.Smartphone                       -0.09325048
## ProductType.Software                         -0.04347826
## ProductType.Tablet                           -0.06286946
## ProductNum                                    0.32505081
## Price                                        -0.18076038
## x5StarReviews                                -0.10405529
## x4StarReviews                                -0.10258130
## x3StarReviews                                -0.09736158
## x2StarReviews                                -0.11160107
## x1StarReviews                                -0.11734967
## PositiveServiceReview                        -0.12404259
## NegativeServiceReview                        -0.19520575
## Recommendproduct                              0.39365892
## BestSellersRank                              -0.06697762
## ShippingWeight                               -0.14076709
## ProductDepth                                 -0.15854725
## ProductWidth                                 -0.25083178
## ProductHeight                                 0.09949362
## ProfitMargin                                  0.14477410
## Volume                                                NA
##                              ProductType.Smartphone ProductType.Software
## ProductType.Accessories                -0.134839972          -0.06286946
## ProductType.Display                    -0.093250481          -0.04347826
## ProductType.ExtendedWarranty           -0.093250481          -0.04347826
## ProductType.GameConsole                -0.134839972          -0.06286946
## ProductType.Laptop                     -0.169030851          -0.07881104
## ProductType.Netbook                    -0.200000000          -0.09325048
## ProductType.PC                         -0.134839972          -0.06286946
## ProductType.Printer                    -0.093250481          -0.04347826
## ProductType.PrinterSupplies            -0.093250481          -0.04347826
## ProductType.Smartphone                  1.000000000          -0.09325048
## ProductType.Software                   -0.093250481           1.00000000
## ProductType.Tablet                     -0.134839972          -0.06286946
## ProductNum                             -0.203846876           0.31744728
## Price                                  -0.240853254          -0.15842514
## x5StarReviews                          -0.136026645          -0.08966148
## x4StarReviews                          -0.129564971          -0.06414668
## x3StarReviews                          -0.051608842          -0.08403452
## x2StarReviews                           0.010258184          -0.10522386
## x1StarReviews                           0.008364221          -0.08331472
## PositiveServiceReview                  -0.121898713          -0.09417280
## NegativeServiceReview                  -0.049255257          -0.12630960
## Recommendproduct                       -0.245810121           0.15447375
## BestSellersRank                         0.648309435          -0.08718484
## ShippingWeight                         -0.309679173          -0.15732285
## ProductDepth                           -0.497347475          -0.03946767
## ProductWidth                           -0.372632025          -0.11385499
## ProductHeight                          -0.243556636          -0.09164953
## ProfitMargin                           -0.155252927           0.02242979
## Volume                                           NA                   NA
##                              ProductType.Tablet  ProductNum        Price
## ProductType.Accessories            -0.090909091  0.45078200 -0.265412932
## ProductType.Display                -0.062869461 -0.07033263 -0.127598046
## ProductType.ExtendedWarranty       -0.062869461  0.32885257 -0.145470703
## ProductType.GameConsole            -0.090909091  0.18416094 -0.056937736
## ProductType.Laptop                 -0.113960576 -0.30895915  0.842129187
## ProductType.Netbook                -0.134839972 -0.31800113 -0.049001142
## ProductType.PC                     -0.090909091 -0.26387239  0.228568319
## ProductType.Printer                -0.062869461  0.32124904 -0.100800228
## ProductType.PrinterSupplies        -0.062869461  0.32505081 -0.180760378
## ProductType.Smartphone             -0.134839972 -0.20384688 -0.240853254
## ProductType.Software               -0.062869461  0.31744728 -0.158425140
## ProductType.Tablet                  1.000000000 -0.18141227 -0.007520556
## ProductNum                         -0.181412267  1.00000000 -0.502843414
## Price                              -0.007520556 -0.50284341  1.000000000
## x5StarReviews                       0.382446649  0.14369493 -0.073827536
## x4StarReviews                       0.628193172 -0.02964380 -0.110872859
## x3StarReviews                       0.675016683 -0.05761888 -0.120417139
## x2StarReviews                       0.673163776 -0.14449492 -0.120403166
## x1StarReviews                       0.700793372 -0.19989038 -0.145942983
## PositiveServiceReview               0.655673113 -0.06973952 -0.087276184
## NegativeServiceReview               0.514722258 -0.27624009 -0.075491928
## Recommendproduct                    0.223368680  0.32962103 -0.335467972
## BestSellersRank                    -0.129503774 -0.21143500 -0.064065637
## ShippingWeight                     -0.175120605  0.04608903  0.305381720
## ProductDepth                       -0.142903660 -0.20489592  0.554406040
## ProductWidth                       -0.089754643 -0.24980215  0.299762108
## ProductHeight                      -0.164596972  0.19876714 -0.083221972
## ProfitMargin                       -0.056021367  0.44070709 -0.039665042
## Volume                                       NA          NA           NA
##                              x5StarReviews x4StarReviews x3StarReviews
## ProductType.Accessories        -0.13095112   -0.14370096   -0.11830189
## ProductType.Display            -0.10465503   -0.10258130   -0.09736158
## ProductType.ExtendedWarranty   -0.10705400   -0.10044605   -0.09291922
## ProductType.GameConsole         0.70678916    0.39044978    0.25748059
## ProductType.Laptop             -0.16433710   -0.16917232   -0.16306209
## ProductType.Netbook            -0.11480263   -0.05743602   -0.11592140
## ProductType.PC                 -0.09105873   -0.09121216   -0.06370102
## ProductType.Printer            -0.05427668   -0.08549925   -0.08403452
## ProductType.PrinterSupplies    -0.10405529   -0.10258130   -0.09736158
## ProductType.Smartphone         -0.13602664   -0.12956497   -0.05160884
## ProductType.Software           -0.08966148   -0.06414668   -0.08403452
## ProductType.Tablet              0.38244665    0.62819317    0.67501668
## ProductNum                      0.14369493   -0.02964380   -0.05761888
## Price                          -0.07382754   -0.11087286   -0.12041714
## x5StarReviews                   1.00000000    0.86156376    0.78144956
## x4StarReviews                   0.86156376    1.00000000    0.97642080
## x3StarReviews                   0.78144956    0.97642080    1.00000000
## x2StarReviews                   0.71083333    0.95261395    0.98478962
## x1StarReviews                   0.61377988    0.91396774    0.95019130
## PositiveServiceReview           0.88571578    0.98200009    0.94939773
## NegativeServiceReview           0.66453818    0.80016317    0.75033197
## Recommendproduct                0.38626309    0.31062215    0.26292639
## BestSellersRank                -0.14310283   -0.12340182   -0.06364895
## ShippingWeight                  0.14592069   -0.02366589   -0.05810772
## ProductDepth                   -0.11059532   -0.15395982   -0.16836843
## ProductWidth                   -0.15108379   -0.16341943   -0.15862034
## ProductHeight                  -0.03516902   -0.10534722   -0.13693604
## ProfitMargin                   -0.03045462   -0.04346523   -0.03102536
## Volume                                  NA            NA            NA
##                              x2StarReviews x1StarReviews
## ProductType.Accessories       -0.142932035  -0.123548795
## ProductType.Display           -0.111601068  -0.108840937
## ProductType.ExtendedWarranty  -0.105223864  -0.113095306
## ProductType.GameConsole        0.175207010   0.147130723
## ProductType.Laptop            -0.175321566  -0.181867555
## ProductType.Netbook           -0.051290920  -0.080600675
## ProductType.PC                -0.050717819  -0.028195783
## ProductType.Printer           -0.105223864  -0.104586568
## ProductType.PrinterSupplies   -0.111601068  -0.117349675
## ProductType.Smartphone         0.010258184   0.008364221
## ProductType.Software          -0.105223864  -0.083314724
## ProductType.Tablet             0.673163776   0.700793372
## ProductNum                    -0.144494923  -0.199890381
## Price                         -0.120403166  -0.145942983
## x5StarReviews                  0.710833335   0.613779876
## x4StarReviews                  0.952613948   0.913967738
## x3StarReviews                  0.984789621   0.950191299
## x2StarReviews                  1.000000000   0.971015430
## x1StarReviews                  0.971015430   1.000000000
## PositiveServiceReview          0.922510464   0.893580388
## NegativeServiceReview          0.806746949   0.779582027
## Recommendproduct               0.172124249   0.171672785
## BestSellersRank                0.004361332  -0.039696208
## ShippingWeight                -0.104655234  -0.142093238
## ProductDepth                  -0.190425489  -0.195358134
## ProductWidth                  -0.169516140  -0.123405315
## ProductHeight                 -0.165810554  -0.106701931
## ProfitMargin                  -0.060938012  -0.077581647
## Volume                                  NA            NA
##                              PositiveServiceReview NegativeServiceReview
## ProductType.Accessories               -0.164968075         -0.2573611292
## ProductType.Display                   -0.124042590         -0.1607576747
## ProductType.ExtendedWarranty          -0.133999186         -0.0918615284
## ProductType.GameConsole                0.461310727          0.3403808484
## ProductType.Laptop                    -0.164686747         -0.2081414510
## ProductType.Netbook                   -0.095205564          0.2462762861
## ProductType.PC                        -0.056988971         -0.0830197191
## ProductType.Printer                   -0.084216207         -0.1607576747
## ProductType.PrinterSupplies           -0.124042590         -0.1952057478
## ProductType.Smartphone                -0.121898713         -0.0492552572
## ProductType.Software                  -0.094172803         -0.1263096015
## ProductType.Tablet                     0.655673113          0.5147222585
## ProductNum                            -0.069739521         -0.2762400896
## Price                                 -0.087276184         -0.0754919276
## x5StarReviews                          0.885715784          0.6645381774
## x4StarReviews                          0.982000086          0.8001631738
## x3StarReviews                          0.949397729          0.7503319690
## x2StarReviews                          0.922510464          0.8067469492
## x1StarReviews                          0.893580388          0.7795820270
## PositiveServiceReview                  1.000000000          0.8196544641
## NegativeServiceReview                  0.819654464          1.0000000000
## Recommendproduct                       0.360831992          0.0539570729
## BestSellersRank                       -0.134995058         -0.0009529939
## ShippingWeight                         0.007449719         -0.1073175650
## ProductDepth                          -0.141295323         -0.2155576801
## ProductWidth                          -0.123905533         -0.1382365188
## ProductHeight                         -0.067617496         -0.1463361480
## ProfitMargin                          -0.065353674         -0.1892907418
## Volume                                          NA                    NA
##                              Recommendproduct BestSellersRank
## ProductType.Accessories            0.05043809   -0.1291586611
## ProductType.Display                0.03488117   -0.0825933261
## ProductType.ExtendedWarranty      -0.32389658   -0.0898442865
## ProductType.GameConsole            0.30983398   -0.1246557539
## ProductType.Laptop                -0.22581247   -0.0617465535
## ProductType.Netbook               -0.18168574   -0.0263194606
## ProductType.PC                    -0.03602721   -0.0809742676
## ProductType.Printer                0.15447375   -0.0890487266
## ProductType.PrinterSupplies        0.39365892   -0.0669776214
## ProductType.Smartphone            -0.24581012    0.6483094350
## ProductType.Software               0.15447375   -0.0871848433
## ProductType.Tablet                 0.22336868   -0.1295037744
## ProductNum                         0.32962103   -0.2114349960
## Price                             -0.33546797   -0.0640656369
## x5StarReviews                      0.38626309   -0.1431028328
## x4StarReviews                      0.31062215   -0.1234018174
## x3StarReviews                      0.26292639   -0.0636489532
## x2StarReviews                      0.17212425    0.0043613320
## x1StarReviews                      0.17167279   -0.0396962079
## PositiveServiceReview              0.36083199   -0.1349950580
## NegativeServiceReview              0.05395707   -0.0009529939
## Recommendproduct                   1.00000000   -0.1542109832
## BestSellersRank                   -0.15421098    1.0000000000
## ShippingWeight                     0.11079815   -0.2033580728
## ProductDepth                       0.01437657   -0.2917761162
## ProductWidth                       0.09762164   -0.2473962751
## ProductHeight                      0.26833286   -0.1949204105
## ProfitMargin                       0.08156030   -0.1473475029
## Volume                                     NA              NA
##                              ShippingWeight ProductDepth ProductWidth
## ProductType.Accessories        -0.207289431 -0.004891727  -0.01729070
## ProductType.Display             0.022721058  0.162606786   0.24028011
## ProductType.ExtendedWarranty   -0.157322848 -0.328145456  -0.34771780
## ProductType.GameConsole         0.191454389 -0.090986127  -0.03902988
## ProductType.Laptop              0.097453914  0.322003629   0.06613537
## ProductType.Netbook            -0.129919151 -0.045952159  -0.01489812
## ProductType.PC                  0.468255916  0.634815755   0.61459487
## ProductType.Printer             0.707715687  0.296120264   0.43739305
## ProductType.PrinterSupplies    -0.140767086 -0.158547255  -0.25083178
## ProductType.Smartphone         -0.309679173 -0.497347475  -0.37263202
## ProductType.Software           -0.157322848 -0.039467667  -0.11385499
## ProductType.Tablet             -0.175120605 -0.142903660  -0.08975464
## ProductNum                      0.046089027 -0.204895921  -0.24980215
## Price                           0.305381720  0.554406040   0.29976211
## x5StarReviews                   0.145920693 -0.110595322  -0.15108379
## x4StarReviews                  -0.023665887 -0.153959818  -0.16341943
## x3StarReviews                  -0.058107720 -0.168368426  -0.15862034
## x2StarReviews                  -0.104655234 -0.190425489  -0.16951614
## x1StarReviews                  -0.142093238 -0.195358134  -0.12340531
## PositiveServiceReview           0.007449719 -0.141295323  -0.12390553
## NegativeServiceReview          -0.107317565 -0.215557680  -0.13823652
## Recommendproduct                0.110798147  0.014376568   0.09762164
## BestSellersRank                -0.203358073 -0.291776116  -0.24739628
## ShippingWeight                  1.000000000  0.756791718   0.77092781
## ProductDepth                    0.756791718  1.000000000   0.85162710
## ProductWidth                    0.770927807  0.851627104   1.00000000
## ProductHeight                   0.795171973  0.486154560   0.67409332
## ProfitMargin                    0.666513549  0.252162759   0.27358454
## Volume                                   NA           NA           NA
##                              ProductHeight ProfitMargin Volume
## ProductType.Accessories        -0.12731005  -0.18870355     NA
## ProductType.Display             0.13555836  -0.16108668     NA
## ProductType.ExtendedWarranty   -0.12771428   0.26711841     NA
## ProductType.GameConsole         0.20514324  -0.08255780     NA
## ProductType.Laptop             -0.16700053  -0.04804971     NA
## ProductType.Netbook            -0.18457706  -0.23397272     NA
## ProductType.PC                  0.27215539   0.07666082     NA
## ProductType.Printer             0.80275615   0.87883997     NA
## ProductType.PrinterSupplies     0.09949362   0.14477410     NA
## ProductType.Smartphone         -0.24355664  -0.15525293     NA
## ProductType.Software           -0.09164953   0.02242979     NA
## ProductType.Tablet             -0.16459697  -0.05602137     NA
## ProductNum                      0.19876714   0.44070709     NA
## Price                          -0.08322197  -0.03966504     NA
## x5StarReviews                  -0.03516902  -0.03045462     NA
## x4StarReviews                  -0.10534722  -0.04346523     NA
## x3StarReviews                  -0.13693604  -0.03102536     NA
## x2StarReviews                  -0.16581055  -0.06093801     NA
## x1StarReviews                  -0.10670193  -0.07758165     NA
## PositiveServiceReview          -0.06761750  -0.06535367     NA
## NegativeServiceReview          -0.14633615  -0.18929074     NA
## Recommendproduct                0.26833286   0.08156030     NA
## BestSellersRank                -0.19492041  -0.14734750     NA
## ShippingWeight                  0.79517197   0.66651355     NA
## ProductDepth                    0.48615456   0.25216276     NA
## ProductWidth                    0.67409332   0.27358454     NA
## ProductHeight                   1.00000000   0.72202181     NA
## ProfitMargin                    0.72202181   1.00000000     NA
## Volume                                  NA           NA      1

#note: Correlation values fall between -1 and 1. Variables with a strong positive relationship have correlation values close to 1, and variables with a strong negative relationship have values close to -1.
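
#note: a matrix of this size is hard to scan by eye. The sketch below shows one possible way to list only the strongly related pairs. It uses three values copied from the output above; the helper name strong_pairs and the 0.9 cutoff are assumptions made for illustration, not part of the original analysis.

```r
# Toy correlation matrix built from three values in the printed output
# (for example x4StarReviews vs x3StarReviews = 0.97642080).
vars <- c("x5StarReviews", "x4StarReviews", "x3StarReviews")
corr_toy <- matrix(c(1.00000000, 0.86156376, 0.78144956,
                     0.86156376, 1.00000000, 0.97642080,
                     0.78144956, 0.97642080, 1.00000000),
                   nrow = 3, dimnames = list(vars, vars))

# Hypothetical helper: list variable pairs whose absolute correlation exceeds a cutoff.
strong_pairs <- function(corr_matrix, cutoff = 0.9) {
  # upper.tri() keeps each pair once and skips the diagonal; which() drops NA cells
  keep <- which(abs(corr_matrix) > cutoff & upper.tri(corr_matrix), arr.ind = TRUE)
  data.frame(var1 = rownames(corr_matrix)[keep[, "row"]],
             var2 = colnames(corr_matrix)[keep[, "col"]],
             correlation = corr_matrix[keep],
             stringsAsFactors = FALSE)
}

strong_toy <- strong_pairs(corr_toy, cutoff = 0.9)
strong_toy
```

#Pairs listed this way carry largely the same information, which is one reason to consider dropping one variable of each pair before modelling.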

#correlation matrix using a heat map

install.packages("corrplot")

library(corrplot)

#plot the correlation matrix

corrplot(corrData)

#blue (cooler) colors show a positive relationship and red (warmer) colors indicate more negative relationships
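
#note: the Volume row and column of the matrix above are NA, so they carry no colour information in the plot. One possible cause, assumed here rather than confirmed by the source, is that Volume does not vary in the data passed to cor() (for instance if every value is 0), because a correlation is undefined when one variable is constant. A minimal sketch with made-up values:

```r
# Made-up data: 'Volume' is constant, 'Price' and 'Reviews' vary.
toy <- data.frame(Price   = c(100, 250, 400, 180),
                  Reviews = c(10, 40, 55, 20),
                  Volume  = c(0, 0, 0, 0))

# cor() warns that the standard deviation is zero; the warning is suppressed for the demo.
toy_corr <- suppressWarnings(cor(toy))

# Correlations involving the constant column are NA.
toy_corr["Price", "Volume"]

# Columns with zero variance can be detected before calling cor().
constant_cols <- names(toy)[sapply(toy, function(col) sd(col) == 0)]
constant_cols
```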

#predict the sales volume of the new products with the random forest (rf) model. The object argument takes the trained model and newdata takes the dataset to predict on

readydata_rf_predict_newprod<-predict(object = readydata_rf, newdata=readyData_newprod, na.action = na.pass)

#to see all the predictions

readydata_rf_predict_newprod
##          1          2          3          4          5          6 
##  418.09453  242.73587  257.22520   50.01307   67.44080   83.58013 
##          7          8          9         10         11         12 
## 1363.87747  295.09227   28.71653 1096.55453 4773.67787  454.93600 
##         13         14         15         16         17         18 
##  902.34453  190.33154  509.13293 2042.42013  445.60173  492.42451 
##         19         20         21         22         23         24 
##  586.58076  516.15794  531.28488  475.64000  468.25987 4724.95827

#predict the sales volume of the new products with the KNN model. The object argument takes the trained model and newdata takes the dataset to predict on

readydata_KNN_predict_newprod<-predict(object = readydata_KNN, newdata=readyData_newprod, na.action = na.pass)

#to see all the predictions

readydata_KNN_predict_newprod
##  [1]  240.8  168.8  231.2  231.2  216.0   90.4 1120.0   87.2   90.4  514.4
## [11] 1652.0  168.8  275.2  107.2  210.4 1484.0   44.8  136.0  220.0   60.8
## [21]  156.0   62.4   36.8 2781.6

#predict the sales volume of the new products with the SVM model. The object argument takes the trained model and newdata takes the dataset to predict on

readydata_svm_predict_newprod<-predict(object = readydata_svm, newdata=readyData_newprod, na.action = na.pass)

#to see all the predictions

readydata_svm_predict_newprod
##          1          2          3          4          5          6 
##  538.73341  248.94260  257.90869   74.24331   28.49271   78.93764 
##          7          8          9         10         11         12 
## 1549.46347   61.99109   53.95642 1139.30168 6651.44163  484.42319 
##         13         14         15         16         17         18 
##  806.98151  261.57497  371.18396 1695.71787   62.87584  508.63044 
##         19         20         21         22         23         24 
##  427.34255  525.66379  880.88309  482.80142  601.98731 6586.05181
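
#note: the three printed vectors are easier to compare side by side. The sketch below uses only the first three predictions of each model, copied from the output above; the data frame and column names are assumptions made for illustration.

```r
# First three predictions per model, copied from the printed output above.
rf_pred  <- c(418.09453, 242.73587, 257.22520)
knn_pred <- c(240.8, 168.8, 231.2)
svm_pred <- c(538.73341, 248.94260, 257.90869)

# One row per new product, one column per model.
pred_compare <- data.frame(product_row = seq_along(rf_pred),
                           rf  = rf_pred,
                           knn = knn_pred,
                           svm = svm_pred)

# Spread between the highest and lowest prediction for each product.
model_cols <- c("rf", "knn", "svm")
pred_compare$spread <- apply(pred_compare[model_cols], 1, max) -
  apply(pred_compare[model_cols], 1, min)

pred_compare
```

#A large spread flags products on which the models disagree and whose prediction deserves a closer look.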

#steps to add the predictions to the dataset. How: the new products data is copied into a new object named new prod plus predictions. First we do this for random forest

newprod_pluspredictionsrf <- readyData_newprod

#Then add the final predictions from the model (rf) as a pred column

newprod_pluspredictionsrf$pred <- predict(object = readydata_rf, newdata = readyData_newprod)

#here we add the stored rf predictions as a predictions column in the new prod plus predictions object (it holds the same values as pred)

newprod_pluspredictionsrf$predictions <- readydata_rf_predict_newprod

#Create a csv file and write it to your hard drive. Note: You may need to use your computer’s search function to locate your output file

write.csv(newprod_pluspredictionsrf, file="newprod_pluspredictionsrf.csv", row.names = TRUE)

#steps to add the predictions to the dataset. How: the new products data is copied into a new object named new prod plus predictions. Next we do the same for KNN

newprod_pluspredictionsKNN <- readyData_newprod

#Then add the final predictions from the model (KNN) as a pred column

newprod_pluspredictionsKNN$pred <- predict(object = readydata_KNN, newdata = readyData_newprod)

#here we add the stored KNN predictions as a predictions column in the new prod plus predictions object (it holds the same values as pred)

newprod_pluspredictionsKNN$predictions <- readydata_KNN_predict_newprod

#Create a csv file and write it to your hard drive. Note: You may need to use your computer’s search function to locate your output file

write.csv(newprod_pluspredictionsKNN, file="newprod_pluspredictionsKNN.csv", row.names = TRUE)

#steps to add the predictions to the dataset. How: the new products data is copied into a new object named new prod plus predictions. Next we do the same for svm

newprod_pluspredictionssvm <- readyData_newprod

#Then add the final predictions from the model (svm) as a pred column

newprod_pluspredictionssvm$pred <- predict(object = readydata_svm, newdata = readyData_newprod)

#here we add the stored svm predictions as a predictions column in the new prod plus predictions object (it holds the same values as pred)

newprod_pluspredictionssvm$predictions <- readydata_svm_predict_newprod

#Create a csv file and write it to your hard drive. Note: You may need to use your computer’s search function to locate your output file

write.csv(newprod_pluspredictionssvm, file="newprod_pluspredictionssvm.csv", row.names = TRUE)
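
#note: the three exports above differ only in the model used. As a possible alternative, the sketch below puts one prediction column per model into a single table and writes one file. All values, the pred_ column prefix and the temporary file path are assumptions made for illustration.

```r
# Stand-ins for readyData_newprod and the three prediction vectors (made-up values).
newprod_toy <- data.frame(ProductNum = c(171, 172, 173), Price = c(699, 860, 1199))
preds <- list(rf  = c(418.1, 242.7, 257.2),
              knn = c(240.8, 168.8, 231.2),
              svm = c(538.7, 248.9, 257.9))

# Add one prediction column per model instead of keeping three copies of the data.
combined <- newprod_toy
for (model_name in names(preds)) {
  combined[[paste0("pred_", model_name)]] <- preds[[model_name]]
}

# Write to a temporary file for the demo; a real output path would go here.
out_file <- tempfile(fileext = ".csv")
write.csv(combined, file = out_file, row.names = FALSE)

# Read the file back to confirm the columns arrived intact.
check <- read.csv(out_file)
names(check)
```

#A CSV file written this way opens directly in Excel, which covers the export deliverable without an extra package.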

#load dplyr (and magrittr for the %>% pipe)

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(magrittr)

#since the product type was dummified, we use the product number to recover the product type of each row, so that the points in the graphs can be coloured by product type and the outliers identified. Here we use the dplyr package: the ProductType column from existing_products is joined to the testSet by ProductNum, and ProductType is then used in the graphs to label each data point.

testSet <- testSet %>%
  left_join(existing_products %>%
            select(ProductNum, ProductType), 
            by = "ProductNum")
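
#note: the join above relies on each ProductNum appearing only once in existing_products. A duplicated key would add rows to testSet and break its row-by-row alignment with the prediction vectors used in the plots. A minimal base R sketch of the same lookup with that check, using made-up data and hypothetical object names:

```r
# Made-up stand-ins for existing_products and testSet.
products_toy <- data.frame(ProductNum  = c(101, 102, 103, 104),
                           ProductType = c("PC", "Laptop", "Netbook", "Smartphone"),
                           stringsAsFactors = FALSE)
test_toy <- data.frame(ProductNum = c(103, 101), Volume = c(20, 12))

# The lookup is only safe when the key is unique.
stopifnot(!anyDuplicated(products_toy$ProductNum))

# match() returns, for each test row, the position of its ProductNum in the lookup
# table, so the row order and row count of test_toy are unchanged.
test_toy$ProductType <- products_toy$ProductType[
  match(test_toy$ProductNum, products_toy$ProductNum)]

test_toy
```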

#here we use ggplot to visualize the findings in the testSet. The focus is on the test set because we want to see the level of error that may have occurred. geom_abline draws a reference line and needs an intercept and a slope; with slope = 1 and intercept = 0 it marks where predicted and actual volume are equal, so points far from the line are large errors. Color is used to identify the product types. This is for rf (random forest)

rf_plot <- ggplot(data = testSet) +
  geom_point(mapping = aes(x = readydata_rf_predict, y = Volume, col = ProductType)) +
  geom_abline(slope = 1, intercept = 0) + 
  labs(title = "rf model error with all variables") + 
  theme(legend.position="bottom", legend.title = element_blank())

#display the plot

rf_plot

#the same plot for KNN: predicted against actual volume in the testSet, with geom_abline(slope = 1, intercept = 0) as the line of perfect prediction and color identifying the product types

KNN_plot <- ggplot(data = testSet) +
  geom_point(mapping = aes(x = readydata_KNN_predict, y = Volume, col = ProductType)) +
  geom_abline(slope = 1, intercept = 0) +
  labs(title = "KNN model error with all variables") + 
  theme(legend.position="bottom", legend.title = element_blank())

#display the plot

KNN_plot

#the same plot for svm: predicted against actual volume in the testSet, with geom_abline(slope = 1, intercept = 0) as the line of perfect prediction and color identifying the product types

svm_plot <- ggplot(data = testSet) +
  geom_point(mapping = aes(x = readydata_svm_predict, y = Volume, col = ProductType)) +
  geom_abline(slope = 1, intercept = 0) +
  labs(title = "svm model error with all variables") + 
  theme(legend.position="bottom", legend.title = element_blank())

#display the plot

svm_plot
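
#note: the plots show the errors visually; the same comparison can be summarised in numbers. The sketch below computes RMSE, R-squared and MAE in base R. The helper name error_metrics and the toy values are assumptions made for illustration. caret's postResample(pred, obs) reports the same three quantities and is probably the more usual route when caret is already loaded.

```r
# Hypothetical helper: RMSE, R-squared and MAE for predictions against observed values.
# R-squared is computed here as the squared Pearson correlation.
error_metrics <- function(predicted, observed) {
  c(RMSE     = sqrt(mean((predicted - observed)^2)),
    Rsquared = cor(predicted, observed)^2,
    MAE      = mean(abs(predicted - observed)))
}

# Made-up values standing in for a prediction vector and testSet$Volume.
observed_toy  <- c(10, 20, 30, 40)
predicted_toy <- c(12, 18, 33, 39)

metrics_toy <- error_metrics(predicted_toy, observed_toy)
round(metrics_toy, 3)
```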

#Reviews vs volume of sales: each chart plots a review count from the testSet against the volume predicted by the svm model

Review_Vol_of_sale <- ggplot(data = testSet) +
  geom_point(mapping = aes(x = PositiveServiceReview, y = readydata_svm_predict, col = ProductType)) +
  labs(title = "Positive Service Reviews VS Volume of Sale") + 
  theme(legend.position="bottom", legend.title = element_blank())
Review_Vol_of_sale

NReview_Vol_of_sale <- ggplot(data = testSet) +
  geom_point(mapping = aes(x = NegativeServiceReview, y = readydata_svm_predict, col = ProductType)) +
  labs(title = "Negative Service Reviews VS Volume of Sale") + 
  theme(legend.position="bottom", legend.title = element_blank())
NReview_Vol_of_sale

x5star_Review_Vol_of_sale <- ggplot(data = testSet) +
  geom_point(mapping = aes(x = x5StarReviews, y = readydata_svm_predict, col = ProductType)) +
  labs(title = "x5star_Review VS Volume of Sale") + 
  theme(legend.position="bottom", legend.title = element_blank())
x5star_Review_Vol_of_sale

Findings: In line with the deliverables, the algorithms tested were linear regression, random forest (RF), KNN and SVM. The linear regression model was not used because its R-squared value was equal to 1. Although this may seem good, it most likely indicates overfitting, so the other models were tested: RF, KNN and SVM. After testing, the KNN model gave an error rating of 0.67, while the RF and SVM models gave values of 0.87 and 0.88. In addition, the graphs generated from the models' error metrics showed a closer convergence between the predicted data and the test set volume.

The last set of charts loosely shows the impact of service reviews on sales volume. They suggest that the higher the number of reviews, the more likely sales are to occur.

Finally, the four key products of focus were PC, Laptops, Netbooks and Smartphones. Based on predictions from the SVM model, Smartphones are likely to produce the highest sales volume at about 1924, followed by Netbooks at 1744, PC at 787 and Laptops at 360. This ranking matched that of the random forest model, which predicted Smartphones: 2056, Netbooks: 1771, PC: 660 and Laptops: 374.
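
Per-type figures like these can be obtained by grouping the row-level predictions by product type. The sketch below is one possible way to do that and assumes the figures are totals per type, which the report does not state. The table, its values and the object names are made up for illustration.

```r
# Made-up stand-in for a dummified new-products table with row-level predictions.
newprod_toy <- data.frame(ProductType.Laptop     = c(1, 0, 0, 1),
                          ProductType.PC         = c(0, 1, 0, 0),
                          ProductType.Smartphone = c(0, 0, 1, 0),
                          predictions            = c(100, 300, 400, 150))

# Recover a single product-type label from the dummy columns:
# max.col() gives the index of the column holding the 1 in each row.
type_cols <- grep("^ProductType\\.", names(newprod_toy), value = TRUE)
type_index <- max.col(newprod_toy[type_cols], ties.method = "first")
newprod_toy$ProductType <- sub("^ProductType\\.", "", type_cols[type_index])

# Total predicted volume per product type, largest first.
totals <- aggregate(predictions ~ ProductType, data = newprod_toy, FUN = sum)
totals <- totals[order(-totals$predictions), ]
totals
```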