Design of experiments comparing two algorithms

Experiment

This experiment uses a random effects model to compare two algorithms: Single Nest and Multiple Nest.

Set-up: There are 20 items to be added to different nests (or a single nest). The objective is to achieve the minimum no-choice probability value.

Response Variables:

No-Choice probability for Single nest algorithm

No-Choice probability for Multiple nest algorithm

Factors:

Number of nests (n): three levels (2, 3, 4)

Alpha values (alpha): three levels (0.01, 0.1, 1)

Preference weights of items (PW): three levels (uniformly distributed (UD) between 5-15, 0-20, 15-20)

Dissimilarity parameter values (lambda): one level (UD between 0-1)

Cardinality threshold value (z): three levels (5, 10, 15)

Sequence of adding items to nests, in terms of the dissimilarity parameter (Sq.): three levels (high to low, low to high, random)

With five three-level factors and one single-level factor, a full factorial design has 3^5 = 243 experiments.
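The run count can be checked by enumerating the design. The following is a minimal Python sketch, not the code used for the report (whose output looks like R). It assumes the coded levels -1, 0, 1 and that the first factor varies fastest, which is consistent with the design matrix shown under "The data"; the names `LEVELS` and `full_factorial` are illustrative.

```python
from itertools import product

# Coded levels for each factor; lambda has a single level (0).
LEVELS = {
    "n": (-1, 0, 1),
    "alpha": (-1, 0, 1),
    "PW": (-1, 0, 1),
    "lambda": (0,),
    "z": (-1, 0, 1),
    "Sq.": (-1, 0, 1),
}

def full_factorial(levels):
    """Return every combination of levels, with the first factor varying fastest."""
    names = list(levels)
    # product() varies its last argument fastest, so pass the factors in
    # reverse order and flip each combination back afterwards.
    combos = product(*(levels[name] for name in reversed(names)))
    return [dict(zip(names, reversed(combo))) for combo in combos]

design = full_factorial(LEVELS)
print(len(design))   # 243
print(design[0])     # all three-level factors at -1, lambda at 0
```

Because lambda contributes only one level, it multiplies the run count by 1 and does not enlarge the design.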

The data

The first and last records of the design matrix, with the related no-choice probability outcomes for both algorithms (SNO for single nest, MNO for multiple nest), are shown below. Factor levels are coded as -1, 0 and 1.

##    n alpha PW lambda  z Sq.         SNO         MNO
## 1 -1    -1 -1      0 -1  -1 0.014542040 0.023100000
## 2  0    -1 -1      0 -1  -1 0.003709265 0.000327392
## 3  1    -1 -1      0 -1  -1 0.008469536 0.000643947
## 4 -1     0 -1      0 -1  -1 0.038573490 0.033926000
## 5  0     0 -1      0 -1  -1 0.009745830 0.000666522
## 6  1     0 -1      0 -1  -1 0.022067270 0.000708219

##      n alpha PW lambda z Sq.         SNO         MNO
## 238 -1     0  1      0 1   1 0.009321043 0.007688557
## 239  0     0  1      0 1   1 0.002315941 0.000117456
## 240  1     0  1      0 1   1 0.005780915 0.000147514
## 241 -1     1  1      0 1   1 0.009321043 0.007688557
## 242  0     1  1      0 1   1 0.002315941 0.000117456
## 243  1     1  1      0 1   1 0.005780915 0.000147514

Basic Comparison

There are two response variables: the minimum no-choice probability value achieved by each algorithm. A summary of the no-choice probability values attained by both algorithms across the 243 experiments is given here:

For the single nest algorithm:

##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## 0.0001678 0.0049700 0.0097460 0.0274800 0.0170300 0.1688000

For the multiple nest algorithm:

##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## 0.0001072 0.0001980 0.0003776 0.0057120 0.0104700 0.0339300

Comparing the two outcomes run by run, the multiple nest algorithm achieves the lower no-choice probability in about 93% of the cases (225 of 243 runs). The 18 cases in which the single nest algorithm performs better are:

##      n alpha PW lambda  z Sq.
## 1   -1    -1 -1      0 -1  -1
## 10  -1    -1  0      0 -1  -1
## 19  -1    -1  1      0 -1  -1
## 28  -1    -1 -1      0  0  -1
## 37  -1    -1  0      0  0  -1
## 46  -1    -1  1      0  0  -1
## 79  -1     1  1      0  1  -1
## 82  -1    -1 -1      0 -1   0
## 91  -1    -1  0      0 -1   0
## 100 -1    -1  1      0 -1   0
## 109 -1    -1 -1      0  0   0
## 127 -1    -1  1      0  0   0
## 163 -1    -1 -1      0 -1   1
## 172 -1    -1  0      0 -1   1
## 181 -1    -1  1      0 -1   1
## 190 -1    -1 -1      0  0   1
## 199 -1    -1  0      0  0   1
## 208 -1    -1  1      0  0   1

The single nest algorithm performs better only in the two-nest case (n = -1). In all but one of these cases (run 79, where alpha = 1), alpha is 0.01, which points to a very low cardinality context effect. In all of the above cases the dissimilarity parameter of one of the two nests was found to be greater than 0.93. This corresponds to a scenario with minimal within-nest similarity (high dissimilarity), which leads to a lower no-choice probability value. This is in line with our proposition that high dissimilarity between added items and items in the assortment leads to lower no-choice values. As a result, the multiple nest algorithm gives a better performance.
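The pattern in the table above can be checked directly from the run numbers. The following Python sketch is illustrative only: it assumes the run order of the design matrix (n varies fastest, then alpha, PW, z and Sq.), and `decode_run` is a hypothetical helper, not part of the original analysis.

```python
# Run numbers listed in the table above (single nest algorithm performs better).
SN_BETTER_RUNS = [1, 10, 19, 28, 37, 46, 79, 82, 91, 100, 109, 127,
                  163, 172, 181, 190, 199, 208]
TOTAL_RUNS = 243
FACTORS = ("n", "alpha", "PW", "z", "Sq.")  # three-level factors, fastest first

def decode_run(run):
    """Recover the coded levels (-1, 0, 1) of a 1-based run number."""
    index = run - 1
    levels = {}
    for name in FACTORS:
        index, digit = divmod(index, 3)  # peel off one base-3 digit per factor
        levels[name] = digit - 1
    return levels

decoded = {run: decode_run(run) for run in SN_BETTER_RUNS}
share_sn = len(SN_BETTER_RUNS) / TOTAL_RUNS
all_two_nests = all(lv["n"] == -1 for lv in decoded.values())
not_lowest_alpha = [run for run, lv in decoded.items() if lv["alpha"] != -1]

print(f"SN better in {share_sn:.1%} of runs")                   # 7.4%
print("all two-nest cases:", all_two_nests)                     # True
print("runs with alpha above lowest level:", not_lowest_alpha)  # [79]
```

The 18 listed runs amount to roughly 7.4% of the 243 experiments; every one of them is a two-nest run, and run 79 is the only one whose alpha is not at its lowest level.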

Analyzing the Experiment

The first step is to plot both response variables to check for trends or anomalies.

Plot for the Single Nest Algorithm:

Plot for the Multiple Nest Algorithm:

No-choice probability values deviate from normality more for the single nest algorithm than for the multiple nest algorithm. The variability in the values is also greater for the single nest algorithm, where most of the values are concentrated at the extremes (high or low). The multiple nest algorithm, on the other hand, shows a gradual transition from low to high values. However, low values are more numerous and the higher values appear as outliers.

Impact of factors on response

Single Nest Algorithm

Results: Variability in no-choice probability is higher when the number of nests is highest. When items are added from similar to dissimilar (the sequence of adding items to nests runs from low to high dissimilarity), the no-choice values show greater variability. The mean is comparable across all cases.

Multiple Nest Algorithm

Results: There is a marked difference in the median and variability of the 2-nest option compared with 3 and 4 nests. As the cardinality threshold increases, the variability in no-choice probability decreases.

Model

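For each algorithm we fit a model with all main effects and interactions up to third order (three-way). The dissimilarity parameter (lambda) has a single level and is therefore not included as a term.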
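A model of this form has 25 terms, which can be enumerated to check the residual degrees of freedom reported in the two tables below. This is a Python sketch under the assumption that a factor uses 1 degree of freedom when entered as a numeric coded variable and 2 when entered as a three-level categorical factor; the function names are illustrative.

```python
from itertools import combinations

FACTORS = ("n", "alpha", "PW", "z", "Sq.")
N_RUNS = 243

def model_terms(factors, max_order=3):
    """All main effects and interactions up to max_order, as tuples of names."""
    return [term
            for order in range(1, max_order + 1)
            for term in combinations(factors, order)]

def residual_df(terms, df_per_factor, n_runs=N_RUNS):
    """Residual df: (n_runs - 1) minus the df consumed by the model terms."""
    # An interaction's df is the product of the df of the factors involved.
    model_df = sum(df_per_factor ** len(term) for term in terms)
    return n_runs - 1 - model_df

terms = model_terms(FACTORS)
print(len(terms))                            # 25 (5 main + 10 two-way + 10 three-way)
print(residual_df(terms, df_per_factor=1))   # 217
print(residual_df(terms, df_per_factor=2))   # 112
```

The values 217 and 112 match the Residuals rows of the single nest and multiple nest tables respectively, which suggests the factors were entered with one degree of freedom each in the first fit and two in the second.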

Single Nest Algorithm

##               Df Sum Sq Mean Sq F value   Pr(>F)    
## n              1 0.0743 0.07433  35.579 9.85e-09 ***
## alpha          1 0.0009 0.00091   0.438    0.509    
## PW             1 0.0007 0.00074   0.353    0.553    
## z              1 0.0040 0.00399   1.908    0.169    
## Sq.            1 0.0000 0.00004   0.020    0.888    
## n:alpha        1 0.0001 0.00008   0.040    0.842    
## n:PW           1 0.0000 0.00004   0.020    0.889    
## n:z            1 0.0001 0.00007   0.036    0.850    
## n:Sq.          1 0.0000 0.00002   0.010    0.920    
## alpha:PW       1 0.0000 0.00001   0.006    0.939    
## alpha:z        1 0.0009 0.00095   0.454    0.501    
## alpha:Sq.      1 0.0000 0.00002   0.011    0.915    
## PW:z           1 0.0000 0.00002   0.012    0.914    
## PW:Sq.         1 0.0000 0.00000   0.002    0.963    
## z:Sq.          1 0.0000 0.00001   0.005    0.943    
## n:alpha:PW     1 0.0000 0.00000   0.001    0.980    
## n:alpha:z      1 0.0001 0.00013   0.060    0.806    
## n:alpha:Sq.    1 0.0000 0.00001   0.003    0.955    
## n:PW:z         1 0.0000 0.00000   0.000    0.989    
## n:PW:Sq.       1 0.0000 0.00002   0.008    0.930    
## n:z:Sq.        1 0.0000 0.00002   0.011    0.915    
## alpha:PW:z     1 0.0000 0.00000   0.002    0.962    
## alpha:PW:Sq.   1 0.0000 0.00000   0.000    0.996    
## alpha:z:Sq.    1 0.0000 0.00000   0.001    0.974    
## PW:z:Sq.       1 0.0000 0.00001   0.004    0.951    
## Residuals    217 0.4534 0.00209                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Results: Under the single nest algorithm, only the number of nests is significant, i.e. the variability in the number of nests can explain variability in the no-choice probability.
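Each F value in the table above is the term's mean square divided by the residual mean square. A short Python sketch, using the rounded figures printed in the table (so the results agree only approximately with the printed F values):

```python
def f_ratio(sum_sq, df, resid_sum_sq, resid_df):
    """F statistic of an ANOVA term: its mean square over the residual mean square."""
    return (sum_sq / df) / (resid_sum_sq / resid_df)

# Residuals row of the single nest table (rounded as printed).
RESID_SS, RESID_DF = 0.4534, 217

f_n = f_ratio(0.07433, 1, RESID_SS, RESID_DF)  # number of nests
f_z = f_ratio(0.00399, 1, RESID_SS, RESID_DF)  # cardinality threshold

print(round(f_n, 1), round(f_z, 1))
```

The ratio for n is far larger than for any other term, which is why n is the only term flagged as significant.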

Multiple Nest Algorithm

##                  Df   Sum Sq  Mean Sq   F value   Pr(>F)    
## n1                2 0.014150 0.007075 92596.664  < 2e-16 ***
## alpha1            2 0.000062 0.000031   405.935  < 2e-16 ***
## PW1               2 0.000179 0.000090  1171.821  < 2e-16 ***
## z1                2 0.001218 0.000609  7971.036  < 2e-16 ***
## Sq.1              2 0.000009 0.000004    58.229  < 2e-16 ***
## n1:alpha1         4 0.000101 0.000025   331.526  < 2e-16 ***
## n1:PW1            4 0.000318 0.000079  1039.866  < 2e-16 ***
## n1:z1             4 0.002141 0.000535  7006.244  < 2e-16 ***
## n1:Sq.1           4 0.000012 0.000003    38.924  < 2e-16 ***
## alpha1:PW1        4 0.000002 0.000000     5.986 0.000211 ***
## alpha1:z1         4 0.000122 0.000030   398.515  < 2e-16 ***
## alpha1:Sq.1       4 0.000000 0.000000     0.835 0.505882    
## PW1:z1            4 0.000024 0.000006    78.409  < 2e-16 ***
## PW1:Sq.1          4 0.000001 0.000000     1.942 0.108347    
## z1:Sq.1           4 0.000007 0.000002    24.013 2.33e-14 ***
## n1:alpha1:PW1     8 0.000004 0.000000     6.254 1.13e-06 ***
## n1:alpha1:z1      8 0.000205 0.000026   334.909  < 2e-16 ***
## n1:alpha1:Sq.1    8 0.000000 0.000000     0.193 0.991382    
## n1:PW1:z1         8 0.000043 0.000005    69.820  < 2e-16 ***
## n1:PW1:Sq.1       8 0.000001 0.000000     0.946 0.481873    
## n1:z1:Sq.1        8 0.000016 0.000002    25.830  < 2e-16 ***
## alpha1:PW1:z1     8 0.000004 0.000000     6.041 1.90e-06 ***
## alpha1:PW1:Sq.1   8 0.000000 0.000000     0.052 0.999930    
## alpha1:z1:Sq.1    8 0.000000 0.000000     0.563 0.805947    
## PW1:z1:Sq.1       8 0.000000 0.000000     0.550 0.816192    
## Residuals       112 0.000009 0.000000                       
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Results: Under the multiple nest algorithm, all main effects and several interaction effects are significant. Together, the main effects and the two-way interactions between the factors explain most of the overall variability in no-choice probability.

Re-Model based on factor significance

Based on the significant main and interaction effects, we re-fit reduced models:

##              Df   Sum Sq  Mean Sq F value Pr(>F)    
## n1            2 0.014150 0.007075   379.3 <2e-16 ***
## Residuals   240 0.004477 0.000019                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##              Df   Sum Sq  Mean Sq  F value   Pr(>F)    
## n1            2 0.014150 0.007075 4679.999  < 2e-16 ***
## alpha1        2 0.000062 0.000031   20.517 6.95e-09 ***
## PW1           2 0.000179 0.000090   59.226  < 2e-16 ***
## z1            2 0.001218 0.000609  402.870  < 2e-16 ***
## Sq.1          2 0.000009 0.000004    2.943   0.0548 .  
## n1:alpha1     4 0.000101 0.000025   16.756 5.61e-12 ***
## n1:PW1        4 0.000318 0.000079   52.557  < 2e-16 ***
## n1:z1         4 0.002141 0.000535  354.108  < 2e-16 ***
## alpha1:z1     4 0.000122 0.000030   20.142 4.11e-14 ***
## Residuals   216 0.000327 0.000002                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Adjusted R squared (the adjusted R2 is the share of variation explained by the model, penalized for the number of terms, so that terms which do not improve the fit lower its value).

## [1] 0.9803593

## [1] 0.7576535

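76% of the variability in no-choice probability under the single nest algorithm is attributed to the variability in the number of nests (second value above, from the model with n1 as its only term). The n1 sum of squares in that table, 0.014150, is the same as in the multiple nest fit.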

98% of the variability in no-choice probability under the multiple nest algorithm (first value above) is explained by the variability in:

Number of nests
Alpha
Preference weights of added items
Cardinality threshold
Sequence of adding items (marginal, p = 0.055)
Interactions between n and alpha, n and preference weights, n and z, and z and alpha
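The two adjusted R2 values can be recomputed from the sums of squares printed in the two re-fitted tables above. The Python sketch below uses the rounded table entries, so it reproduces the reported values only to about three decimal places; `adjusted_r2` is an illustrative helper.

```python
def adjusted_r2(resid_ss, resid_df, total_ss, total_df):
    """Adjusted R^2 = 1 - (residual mean square) / (total mean square)."""
    return 1.0 - (resid_ss / resid_df) / (total_ss / total_df)

TOTAL_DF = 243 - 1  # number of runs minus one

# Sums of squares as printed in the two re-fitted tables (rounded).
n_only = {"n1": 0.014150, "Residuals": 0.004477}
reduced = {
    "n1": 0.014150, "alpha1": 0.000062, "PW1": 0.000179, "z1": 0.001218,
    "Sq.1": 0.000009, "n1:alpha1": 0.000101, "n1:PW1": 0.000318,
    "n1:z1": 0.002141, "alpha1:z1": 0.000122, "Residuals": 0.000327,
}

r2_n_only = adjusted_r2(n_only["Residuals"], 240, sum(n_only.values()), TOTAL_DF)
r2_reduced = adjusted_r2(reduced["Residuals"], 216, sum(reduced.values()), TOTAL_DF)

print(round(r2_n_only, 3), round(r2_reduced, 3))  # 0.758 0.98
```

Dividing each sum of squares by its degrees of freedom is what distinguishes the adjusted value from the plain R2, which would use the raw sums of squares instead.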

Comparing the Algorithms (SN VS MN)

Based on the results of the 243 experiments and the analysis above, we can compare the two algorithms on the following criteria:

Achieving minimum no-choice probability: As mentioned before, the Single Nest (SN) algorithm performs better than the Multiple Nest (MN) algorithm in only about 7% of the cases (18 of 243). Analyzing these cases suggests that SN outperforms MN only in a two-nest scenario in which one of the nests (to which items are being added) has a high dissimilarity value. This situation is equivalent to adding one item to a new nest, since the level of similarity that binds the nest is negligible.
Distribution of no-choice probability values: The SN algorithm yields either very high or very low no-choice probabilities, and a notable number of values exceed 0.1. MN, on the other hand, yields values that appear roughly exponentially distributed between 0 and 0.035, with a large number concentrated between 0 and 0.005, which is much lower than for SN.
Explained variation by factors: About 76% of the variability in the response variable (no-choice probability) for the SN algorithm is reported as explained by a single variable, the number of nests; the remaining variables are not significant. For the MN algorithm, 98% of the variability is explained by a specific set of factors and interactions. Configuring an assortment for a lower no-choice probability is therefore easier with MN than with SN. In the latter, about 24% of the variability in the response variable is unexplained, making it more susceptible to higher no-choice values.

Uzma Mushtaque

April 21, 2016
