Basic Stat and Models for DataH1.txt

Basic Stat

1. How many observations do we have for each of the 36 firm pairs?

# load data
dat <- read.csv('/Users/chengnie/Google Drive/Research/UTD/project/SponsorSearch/Jing_Files/Strategic Group - To Jiahui/Data and Analysis/01_digital camera/Cheng/DataH1.txt')

# Firm1 and Firm2 follow the order of the 36 firm list in the Python code, so a firm pair never appears in both orders
dat$FirmPair <- paste(dat$Firm1,dat$Firm2)
# lapply(split(dat,dat$FirmPair),nrow)
x<-sapply(split(dat,dat$FirmPair),nrow)
# x
# a data.frame prints in a nicer format
y<-as.data.frame(x[order(x,decreasing=TRUE)])

We have 9701 observations from 1275 sessions. Only 33 of the 36 potential firm pairs appear in the data:

y
##                       x[order(x, decreasing = TRUE)]
## canon bestbuy                                   1179
## amazon bestbuy                                  1134
## amazon olympus                                   948
## canon target                                     448
## amazon become                                    430
## amazon target                                    430
## olympus target                                   429
## bestbuy become                                   427
## bestbuy target                                   423
## canon ebay                                       337
## amazon ebay                                      326
## bestbuy ebay                                     320
## kodak officemax                                  318
## olympus ebay                                     313
## kodak samsung                                    278
## bestbuy samsung                                  272
## amazon samsung                                   256
## samsung sony                                     216
## kodak sony                                       215
## officemax sony                                   196
## kodak ecamerafilms                               190
## ebay target                                      142
## nextag sony                                      133
## nextag samsung                                    89
## samsung ecamerafilms                              73
## become target                                     53
## nextag philips                                    43
## kodak philips                                     39
## philips ecamerafilms                              13
## philips officemax                                 13
## kodak electronicsexpo                             10
## ebay samsung                                       6
## philips sony                                       2
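
The same per-pair counts can also be obtained with table(), which skips the split/sapply step and allows explicit column names in the printed result. A minimal sketch on a small made-up data frame; the object `toy` and its rows are hypothetical and are not taken from DataH1.txt:

```r
# Hypothetical toy data standing in for dat; only Firm1 and Firm2 matter here
toy <- data.frame(
  Firm1 = c("amazon", "amazon", "canon", "amazon", "canon", "canon"),
  Firm2 = c("bestbuy", "bestbuy", "ebay", "ebay", "ebay", "ebay"),
  stringsAsFactors = FALSE
)
toy$FirmPair <- paste(toy$Firm1, toy$Firm2)

# Count rows per pair and sort from most to least frequent
pair_counts <- sort(table(toy$FirmPair), decreasing = TRUE)

# Build a data frame with explicit column names for printing
pair_df <- data.frame(
  FirmPair = names(pair_counts),
  n = as.integer(pair_counts),
  stringsAsFactors = FALSE
)
print(pair_df)
```

Applied to the real data, the same two steps would use dat$FirmPair in place of toy$FirmPair.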

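2. Means of the variables within.group and Co.visit

The mean of within.group is 0.3633646, i.e. about 36.3% of observations are within-group pairs.

The mean of Co.visit is 0.0258736, i.e. about 2.6% of observations are co-visits.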

3. Two-way contingency table

(table2way <- xtabs(~Co.visit + within.group, data = dat))
##         within.group
## Co.visit    0    1
##        0 6101 3349
##        1   75  176
# http://www.statmethods.net/stats/frequencies.html
summary(table2way) # chi-square test of independence
## Call: xtabs(formula = ~Co.visit + within.group, data = dat)
## Number of cases in table: 9701 
## Number of factors: 2 
## Test for independence of all factors:
##  Chisq = 127.12, df = 1, p-value = 1.746e-29
chisq.test(table2way)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  table2way
## X-squared = 125.6279, df = 1, p-value < 2.2e-16
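
The two statistics above differ only in the continuity correction: summary() on the table reports the uncorrected Pearson statistic, while chisq.test() applies Yates' correction by default for 2x2 tables. A short sketch that rebuilds the table from the printed counts and runs both variants; the object names `tab`, `plain` and `yates` are illustrative:

```r
# Rebuild the 2x2 table from the printed counts (filled column by column)
tab <- matrix(c(6101, 75, 3349, 176), nrow = 2,
              dimnames = list(Co.visit = c("0", "1"),
                              within.group = c("0", "1")))

# Pearson chi-square without and with Yates' continuity correction
plain <- chisq.test(tab, correct = FALSE)
yates <- chisq.test(tab, correct = TRUE)

print(unname(plain$statistic))
print(unname(yates$statistic))
```

Either way the conclusion is the same here, since both statistics are far in the tail of a chi-square distribution with 1 degree of freedom.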

3.1 Co-visit rate at each level of within.group (0 or 1)

sapply(split(dat$Co.visit,dat$within.group),mean)
##          0          1 
## 0.01214378 0.04992908

The result shows that the co-visit rate is higher for within-group pairs (about 5.0%) than for between-group pairs (about 1.2%).
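
As a cross-check, the means in section 2 and the conditional rates in section 3.1 can all be recovered from the four cell counts of the contingency table. A minimal sketch using the printed counts; `tab`, `mean_within`, `mean_covisit` and `rate_by_group` are illustrative names:

```r
# 2x2 table of printed counts: rows are Co.visit, columns are within.group
tab <- matrix(c(6101, 75, 3349, 176), nrow = 2,
              dimnames = list(Co.visit = c("0", "1"),
                              within.group = c("0", "1")))
n <- sum(tab)

# Marginal means of the two 0/1 variables
mean_within  <- sum(tab[, "1"]) / n   # share of within-group observations
mean_covisit <- sum(tab["1", ]) / n   # share of co-visit observations

# Co-visit rate within each level of within.group (column proportions)
rate_by_group <- prop.table(tab, margin = 2)["1", ]

print(c(mean_within = mean_within, mean_covisit = mean_covisit))
print(rate_by_group)
```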

Model-based evidence

Logit model 1: basic benchmark without fixed effects

\(\mathrm{logit}\,\Pr(CoVisit = 1) = \beta_0 + \beta_1 \cdot Within\)

# http://www.ats.ucla.edu/stat/r/dae/logit.htm
logit1 <- glm(Co.visit ~ within.group, data = dat, family = "binomial")
summary(logit1)
## 
## Call:
## glm(formula = Co.visit ~ within.group, family = "binomial", data = dat)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.3201  -0.3201  -0.1563  -0.1563   2.9702  
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept)   -4.3987     0.1162  -37.86   <2e-16 ***
## within.group   1.4528     0.1396   10.41   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 2330.0  on 9700  degrees of freedom
## Residual deviance: 2208.8  on 9699  degrees of freedom
## AIC: 2212.8
## 
## Number of Fisher Scoring iterations: 7
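
With a single binary predictor, the logit coefficients have a closed form in terms of the 2x2 table: the intercept is the log odds of a co-visit when within.group is 0, and the slope is the log odds ratio. The sketch below recomputes them from the printed counts and checks them against a glm fit on the aggregated counts; the data frame `agg` and the other object names are illustrative:

```r
# Aggregated counts taken from the contingency table
agg <- data.frame(
  within.group = c(0, 1),
  covisit      = c(75, 176),
  no_covisit   = c(6101, 3349)
)

# Closed-form estimates for a logit model with one binary predictor
b0    <- log(75 / 6101)                       # log odds when within.group = 0
b1    <- log((176 / 3349) / (75 / 6101))      # log odds ratio
se_b1 <- sqrt(1/75 + 1/6101 + 1/176 + 1/3349) # standard error of b1

# Same model fitted on grouped binomial data
fit <- glm(cbind(covisit, no_covisit) ~ within.group,
           data = agg, family = binomial)

odds_ratio <- exp(b1)
print(c(b0 = b0, b1 = b1, se_b1 = se_b1, odds_ratio = odds_ratio))
print(coef(fit))
```

On the odds scale, the slope corresponds to an odds ratio of roughly 4.3, meaning the odds of a co-visit are about four times higher for within-group pairs in this benchmark model.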

Logit model 2: adding firm-pair fixed effects

\(\mathrm{logit}\,\Pr(CoVisit = 1) = \beta_0 + \beta_1 \cdot Within + \alpha_{FirmPair}\)

logit2 <- glm(Co.visit ~ within.group + FirmPair, data = dat, family = "binomial")
summary(logit2)
## 
## Call:
## glm(formula = Co.visit ~ within.group + FirmPair, family = "binomial", 
##     data = dat)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.4838  -0.2880  -0.1845  -0.1189   3.4818  
## 
## Coefficients: (1 not defined because of singularities)
##                                Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                     -4.6681     0.5023  -9.293  < 2e-16 ***
## within.group                     0.6979     0.7121   0.980 0.327079    
## FirmPairamazon bestbuy           1.2790     0.5192   2.464 0.013755 *  
## FirmPairamazon ebay              1.8839     0.5347   3.523 0.000427 ***
## FirmPairamazon olympus           0.6034     0.5621   1.074 0.283030    
## FirmPairamazon samsung           0.5250     0.7116   0.738 0.460617    
## FirmPairamazon target            1.0011     0.5521   1.813 0.069777 .  
## FirmPairbecome target          -14.8979  1477.1774  -0.010 0.991953    
## FirmPairbestbuy become           0.2326     0.6743   0.345 0.730167    
## FirmPairbestbuy ebay             0.8084     0.5787   1.397 0.162442    
## FirmPairbestbuy samsung          0.1720     0.7677   0.224 0.822680    
## FirmPairbestbuy target           1.2445     0.5438   2.289 0.022107 *  
## FirmPaircanon bestbuy            0.9147     0.5388   1.698 0.089533 .  
## FirmPaircanon ebay               0.8150     0.6311   1.291 0.196558    
## FirmPaircanon target            -0.7390     0.8687  -0.851 0.394905    
## FirmPairebay samsung           -14.8979  4390.3074  -0.003 0.997292    
## FirmPairebay target             -0.9785     1.1233  -0.871 0.383720    
## FirmPairkodak ecamerafilms     -14.8979   780.1783  -0.019 0.984765    
## FirmPairkodak electronicsexpo  -14.8979  3400.7175  -0.004 0.996505    
## FirmPairkodak officemax        -14.8979   603.0553  -0.025 0.980291    
## FirmPairkodak philips          -15.5958  1722.0203  -0.009 0.992774    
## FirmPairkodak samsung           -0.9570     0.8708  -1.099 0.271807    
## FirmPairkodak sony              -1.3957     1.1222  -1.244 0.213619    
## FirmPairnextag philips         -14.8979  1639.9717  -0.009 0.992752    
## FirmPairnextag samsung           0.1908     1.1241   0.170 0.865218    
## FirmPairnextag sony             -0.2147     1.1225  -0.191 0.848340    
## FirmPairofficemax sony         -14.8979   768.1439  -0.019 0.984526    
## FirmPairolympus ebay             0.3211     0.7110   0.452 0.651567    
## FirmPairolympus target          -1.3910     1.1201  -1.242 0.214309    
## FirmPairphilips ecamerafilms   -14.8979  2982.6266  -0.005 0.996015    
## FirmPairphilips officemax      -14.8979  2982.6266  -0.005 0.996015    
## FirmPairphilips sony           -15.5958  7604.2355  -0.002 0.998364    
## FirmPairsamsung ecamerafilms   -14.8979  1258.6621  -0.012 0.990556    
## FirmPairsamsung sony                 NA         NA      NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 2330  on 9700  degrees of freedom
## Residual deviance: 2090  on 9668  degrees of freedom
## AIC: 2156
## 
## Number of Fisher Scoring iterations: 18

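However, \(\beta_1\) is no longer significant once we control for FirmPair fixed effects (estimate 0.6979, standard error 0.7121, p = 0.327). The summary also reports one coefficient as not defined because of singularities, which suggests that within.group is collinear with the FirmPair dummies, so \(\beta_1\) may not be separately identified in this specification.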
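
The "1 not defined because of singularities" note in the model 2 summary is what glm reports when one column of the design matrix is a linear combination of the others. A minimal sketch of how this arises when a 0/1 indicator is constant within each fixed-effect level; the data are simulated and the pair labels are hypothetical, so this illustrates the mechanism rather than reproducing the results above:

```r
set.seed(1)

# Hypothetical pair labels; 200 simulated observations per pair
pairs <- c("A B", "A C", "B C", "B D")
toy <- data.frame(FirmPair = rep(pairs, each = 200),
                  stringsAsFactors = FALSE)

# within.group is fixed per pair: 0 for the first two pairs, 1 for the last two
toy$within.group <- ifelse(toy$FirmPair %in% c("B C", "B D"), 1, 0)

# Simulated outcome, unrelated to the predictors
toy$Co.visit <- rbinom(nrow(toy), size = 1, prob = 0.3)

# Number of distinct within.group values inside each pair
# (all equal to 1 means within.group is a function of FirmPair)
distinct_per_pair <- tapply(toy$within.group, toy$FirmPair,
                            function(v) length(unique(v)))

# Logit with pair fixed effects: one coefficient cannot be estimated
fit <- glm(Co.visit ~ within.group + FirmPair, data = toy, family = binomial)
n_aliased <- sum(is.na(coef(fit)))

print(distinct_per_pair)
print(coef(fit))
```

Running the same tapply() check on dat$within.group and dat$FirmPair would show whether within.group varies inside any firm pair in the real data.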