Two different market definitions are used:
Define a market for a firm as the set of competitors within a 2.5-mile radius of the focal firm (the distance band approach). I exclude other markets that overlap with the focal market to remove indirect effects from firms that are near but not in the distance band.
Use density-based spatial clustering of applications with noise (DBSCAN) to define markets. Details are explained later.
The major difference between the two approaches is that the distance band approach is more flexible: each firm has its own set of markets, determined by where firms are located. DBSCAN, in contrast, creates non-overlapping markets for all firms, so every firm faces the same set of markets.
I use a reduced-form model. Depending on the results of DBSCAN, I may incorporate a structural approach.
I use data on hotels in Houston, TX from 2014 Q1 to 2014 Q4. The data set (from Source Strategic Inc) is quarterly, so there are four time points. Because of entry and exit in the market, the panel is unbalanced.
##
## ================================================================================
## Statistic N Mean St. Dev. Min Pctl(25) Pctl(75) Max
## --------------------------------------------------------------------------------
## adr 1,880 85.849 52.001 16.990 44.512 118.453 400.750
## room_sold 1,880 9,459.916 10,004.500 299.520 3,031.440 11,238.840 114,480.000
## room 1,880 111.690 117.694 6 37 131 1,200
## rating 1,880 1.763 1.659 0 0 3 6
## nhotel 1,880 20.755 10.918 1 13 29 46
## avmmc 1,880 26.689 29.279 0 0 46.1 132
## avmmc2 1,880 1.192 1.334 0 0 2 9
## hi.sales 1,880 0.138 0.119 0.038 0.081 0.148 1.000
## n.same.brand 1,880 0.070 0.294 0 0 0 3
## n.same.chain 1,880 0.853 1.412 0 0 1 7
## n.brand.city 1,880 5.166 5.154 0 0 9 20
## n.chain.city 1,880 18.226 18.375 0 0 34 55
## --------------------------------------------------------------------------------
As mentioned earlier, I create a market for each hotel in Houston, TX. Using a radius of 2.5 miles around a focal hotel, I define the set of rivals in that hotel’s market. The average number of hotels in a market defined with this approach (nhotel) is 20.75. This shows that a large number of hotels are located in a small geographic area in Houston, TX.
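The distance band construction can be sketched as follows. This is a minimal illustration, not the code used for the analysis: it assumes hotel locations are given as latitude and longitude in degrees, uses the haversine great-circle distance, and the hotel ids and coordinates are hypothetical. It counts rivals only (the focal hotel is excluded), which may differ from how nhotel is counted in the data.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius (approximation)


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def distance_band_markets(hotels, radius_miles=2.5):
    """Map each hotel id to the set of rival ids within radius_miles of it."""
    markets = {}
    for hid, (lat, lon) in hotels.items():
        markets[hid] = {
            other
            for other, (olat, olon) in hotels.items()
            if other != hid
            and haversine_miles(lat, lon, olat, olon) <= radius_miles
        }
    return markets


# Hypothetical coordinates, not taken from the data set.
hotels = {
    "h1": (29.7600, -95.3700),
    "h2": (29.7700, -95.3700),  # roughly 0.7 miles north of h1
    "h3": (29.7600, -95.3400),  # roughly 1.8 miles east of h1
    "h4": (29.9000, -95.3700),  # roughly 9.7 miles north of h1
}
markets = distance_band_markets(hotels)
n_rivals = {hid: len(rivals) for hid, rivals in markets.items()}
```

In this toy example, h1, h2, and h3 fall inside each other's bands, while h4 has an empty market.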
Applying this approach to define other markets raises two possible problems, especially when the other markets are too close to the focal market: double counting of the same rivals, and potential indirect effects on the focal market. First, Figure 1 illustrates potential double counting. The left circle represents the market of firm 1, and the right circle represents the market of firm 4. The labels in the figure mark the location of each firm. Firm 2 belongs to both firm 1’s market (the left circle) and firm 4’s market (the right circle). Because firm 1 also lies within the right circle, firm 2 is a rival of firm 1 in the left circle and, at the same time, a rival of firm 1 in the right circle.
Figure 2 illustrates the possible indirect effect of rivals that have no direct contact with the focal firm. If the market of firm 4 is included, the indirect effect of its rivals (firm 5) on the focal firm (firm 1), transmitted through the nearest rival (firm 2), is not negligible.
For these reasons, I exclude markets that overlap with the focal market when calculating multimarket contacts.
I use two different measures of average multimarket contact (MMC). Both measures use the same formula to calculate the average MMC, but they differ in which other markets they recognize for a given focal market. Figure 3 shows how to calculate AVMMC. Suppose we calculate the multimarket contacts of firm 1 in the left circle (market A). For AVMMC, I treat the distance bands of all hotels as independent markets. This means firm 1 appears in three right circles: market B (firm 4’s market), market C (firm 1’s second market), and market D (firm 3’s market). Thus, there are three other markets. In market B, firm 1 has a contact with firm 2, while in markets C and D firm 1 has a contact with firm 3. Thus, the average multimarket contact for firm 1 in market A is \(3/2 = 1.5\): the total number of contacts with rivals in the other markets (B, C, D) divided by the number of rivals in the focal market (A), so AVMMC = 1.5.
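The AVMMC arithmetic above can be reproduced with a short sketch. The market memberships below are my reading of the text (the figure itself is not reproduced here), and the function name, the data layout, and the convention of returning 0 for a firm with no focal-market rivals are assumptions made for illustration.

```python
def average_mmc(focal_firm, focal_market, other_markets):
    """Contacts with focal-market rivals in other markets, per focal-market rival.

    focal_market:  set of firm ids in the focal market (including focal_firm)
    other_markets: dict mapping a market name to the set of firm ids in it
    """
    rivals = set(focal_market) - {focal_firm}
    if not rivals:
        return 0.0  # assumed convention for a firm with no rivals
    contacts = sum(
        len(rivals & set(members))
        for members in other_markets.values()
        if focal_firm in members  # only markets where the focal firm is present
    )
    return contacts / len(rivals)


# Market A contains firm 1 and its two rivals, firms 2 and 3.
market_a = {1, 2, 3}
# Firm 1 meets firm 2 in market B, and firm 3 in markets C and D.
other_markets = {"B": {1, 2, 4}, "C": {1, 3}, "D": {1, 3}}

avmmc = average_mmc(1, market_a, other_markets)  # (1 + 1 + 1) / 2
```

Firm 4 appears in market B but is not a rival of firm 1 in market A, so it does not add a contact.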
The second measure of MMC only considers other markets in which the focal firm is itself the focal firm. Figure 4 explains this. Suppose one is interested in the average MMC of firm 1 in market A. Rather than considering every distance band that contains firm 1, only market C (firm 1’s second focal market) is treated as another market for firm 1. Thus, the average MMC of firm 1 in market A is 0.5, since firm 1 only has a contact with firm 3 in market C (AVMMC2 = 0.5).
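The two measures can be written with one formula that differs only in the set of other markets. The notation here is introduced for illustration and does not appear in the original: \(R_A\) is the set of firm 1's rivals in market A, \(\mathcal{M}\) is the set of other markets that are counted, and \(c_m\) is the number of rivals from \(R_A\) that firm 1 meets in market \(m\).

```latex
\[
\text{AvgMMC}_{1,A} \;=\; \frac{1}{|R_A|} \sum_{m \in \mathcal{M}} c_m ,
\qquad |R_A| = 2 .
\]
\[
\text{AVMMC:}\quad \mathcal{M} = \{B, C, D\}
\;\Rightarrow\; \frac{1 + 1 + 1}{2} = 1.5 ,
\qquad
\text{AVMMC2:}\quad \mathcal{M} = \{C\}
\;\Rightarrow\; \frac{1}{2} = 0.5 .
\]
```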
##
## ======================================================================
## Dependent variable:
## --------------------------------------
## adr
## OLS instrumental
## variable
## (1) (2)
## ----------------------------------------------------------------------
## avmmc 0.160*** 0.239***
## (0.028) (0.037)
##
## rating 26.246*** 25.954***
## (0.669) (0.677)
##
## room 0.006 0.011
## (0.007) (0.007)
##
## hi.sales -27.548*** -27.286***
## (6.194) (6.215)
##
## cbd 47.500*** 48.975***
## (3.379) (3.421)
##
## air -13.078*** -13.458***
## (3.723) (3.737)
##
## factor(qtr)2 5.152** 5.264**
## (2.078) (2.085)
##
## factor(qtr)3 -1.106 -0.987
## (2.073) (2.080)
##
## factor(qtr)4 1.374 1.374
## (2.068) (2.075)
##
## Constant 24.337*** 21.120***
## (2.504) (2.707)
##
## ----------------------------------------------------------------------
## Observations 1,274 1,274
## R2 0.715 0.714
## Adjusted R2 0.713 0.712
## Residual Std. Error (df = 1264) 26.134 26.219
## F Statistic 353.101*** (df = 9; 1264)
## ======================================================================
## Note: *p<0.1; **p<0.05; ***p<0.01
##
## ======================================================================
## Dependent variable:
## --------------------------------------
## adr
## OLS instrumental
## variable
## (1) (2)
## ----------------------------------------------------------------------
## avmmc2 4.048*** 5.197***
## (0.597) (0.777)
##
## rating 26.320*** 26.174***
## (0.662) (0.666)
##
## room 0.008 0.012
## (0.007) (0.007)
##
## hi.sales -30.632*** -31.358***
## (6.175) (6.192)
##
## cbd 47.200*** 47.959***
## (3.346) (3.367)
##
## air -12.561*** -12.631***
## (3.703) (3.709)
##
## factor(qtr)2 5.258** 5.352***
## (2.068) (2.072)
##
## factor(qtr)3 -0.916 -0.795
## (2.064) (2.067)
##
## factor(qtr)4 1.491 1.524
## (2.058) (2.061)
##
## Constant 23.336*** 21.212***
## (2.487) (2.654)
##
## ----------------------------------------------------------------------
## Observations 1,274 1,274
## R2 0.718 0.717
## Adjusted R2 0.716 0.715
## Residual Std. Error (df = 1264) 26.009 26.047
## F Statistic 357.878*** (df = 9; 1264)
## ======================================================================
## Note: *p<0.1; **p<0.05; ***p<0.01
DBSCAN aims to identify dense regions, where density is measured by the number of points close to a given point.
DBSCAN requires two parameters: a distance (\(\epsilon\), or eps) and a minimum number of points (minPts). Given a point i, its neighborhood is the set of points within radius \(\epsilon\). If this neighborhood contains at least minPts points, the point is a core point and forms a cluster. The other points in the cluster are border points. If a border point has at least minPts points in its own neighborhood, it is also a core point and forms its own cluster. Because these clusters are close enough to each other, they are merged into one. Any remaining point (neither core nor border) is noise.
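The procedure described above can be sketched in plain Python. This is a simplified illustration, not the implementation used for the analysis: it assumes planar Euclidean distance on raw coordinates, counts a point as part of its own neighborhood, and uses a brute-force neighbor search. The example points are hypothetical.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return (labels, is_core); labels are 0, 1, ... for clusters, -1 for noise."""
    n = len(points)
    neighbors = [region_query(points, i, eps) for i in range(n)]
    is_core = [len(nb) >= min_pts for nb in neighbors]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None or not is_core[i]:
            continue
        cluster += 1                 # start a new cluster at an unvisited core point
        labels[i] = cluster
        stack = list(neighbors[i])
        while stack:
            j = stack.pop()
            if labels[j] is None:
                labels[j] = cluster  # border or core point joins the cluster
                if is_core[j]:
                    stack.extend(neighbors[j])  # only core points expand the cluster
    labels = [NOISE if lab is None else lab for lab in labels]
    return labels, is_core


# Hypothetical points: four core points, one border point, one noise point.
points = [(0.0, 0.0), (0.01, 0.0), (0.0, 0.01), (0.01, 0.01),
          (0.025, 0.01), (1.0, 1.0)]
labels, is_core = dbscan(points, eps=0.02, min_pts=4)
```

Here the fifth point lies within eps of two core points but has only three points in its own neighborhood, so it is a border point; the last point is noise.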
Figure 4. Core(X), Border(Y), and Noise(Z) points in DBSCAN
Since DBSCAN requires two pre-determined parameters (eps and minPts), I explain how each is chosen.
minPts: Hotels, in general, consider four to five hotels as major competitors in the local market (cite needed). Based on this, I choose four for minPts. In addition, if minPts is greater than 4, there would be much difference in clustering results (cite needed).
eps: Given minPts = 4, I create four plots of the distribution of distances from each hotel (point) to its four nearest hotels, using the data set of hotels in Houston, TX in 2014. These plots are called k-NN distance plots, and each represents one quarter of 2014. Since there are about 470 hotels per quarter, the number of distances calculated in these plots is about 1,880. In all four plots, most of the distances fall between 0 and 0.02. Thus, setting eps to 0.02 is reasonable, since this value includes most hotels (points) as core or border points.
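The quantities behind a k-NN distance plot can be computed as in the sketch below. It is an illustration under stated assumptions: distances are planar Euclidean distances on raw coordinates, the helper names are hypothetical, and the example points are made up rather than drawn from the hotel data.

```python
import math


def knn_distances(points, k):
    """For each point, the sorted distances to its k nearest other points."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[:k])
    return result


def share_within(points, k, eps):
    """Fraction of all k-NN distances that are no larger than eps."""
    flat = [d for row in knn_distances(points, k) for d in row]
    return sum(d <= eps for d in flat) / len(flat)


# Hypothetical points: five close together and one far away.
points = [(0.0, 0.0), (0.01, 0.0), (0.0, 0.01), (0.01, 0.01),
          (0.005, 0.005), (1.0, 1.0)]

# Sorted distance to the 4th nearest neighbor: the curve a k-NN distance plot shows.
kth_sorted = sorted(row[-1] for row in knn_distances(points, k=4))
share = share_within(points, k=4, eps=0.02)  # 20 of the 24 distances
```

A candidate eps is then judged by how large this share is, or by where the sorted curve bends sharply upward.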
In sum, I set minPts = 4 as well as eps = 0.02.
Using the parameter values determined earlier (eps = 0.02, minPts = 4), I conduct clustering using DBSCAN.
The results of the clustering are summarized in the following figure and table. In sum, for each quarter, 21 clusters are created, with 58 or 59 noise points (points that do not belong to any cluster).