(Playing along at home? Download the .Rmd for a list of libraries you’ll need!)
In February of 2017, the Super Bowl took place in Houston, Texas, for the first time since 2004. While the event is ostensibly about football, cities often plan for years ahead of time before hosting the Big Game in order to effectively handle the influx of sports fans, celebrities, major media companies, high rollers, and haggard townies looking to party in the national spotlight.
The TABC (Texas Alcoholic Beverage Commission) Mixed Beverage Tax is a fixed tax charged on alcoholic beverages sold by a holder of a Mixed Beverage License in the state of Texas. There exists a record of how much every license holder pays per month, and since the tax is fixed, you can calculate the revenue upon which the tax was paid very simply. This data is also public, which is wonderful. With it, you can do all sorts of things, including, I hope, estimate the effect of hosting the Super Bowl on bar sales. For this, we’ll need a big set of data in addition to just February 2017’s sales, let’s start with 6 years and see how it goes.
If you’re inclined, you can get the data for yourself here: * https://www.comptroller.texas.gov/transparency/open-data/search-datasets/
Or, you can pull it from a few related projects on my github, http://github.com/ianwells.
Let’s get started! What does the data look like?
tabc <- read.csv('tabc.csv')
tabc$date <- as.Date(tabc$date)
tabc$rev <- as.numeric(as.character(tabc$rev))
tabc$yoy_rev <- as.numeric(as.character(tabc$yoy_rev))
tabc$mom_rev <- as.numeric(as.character(tabc$mom_rev))
tabc$lat <- as.numeric(as.character(tabc$lat))
tabc$lng <- as.numeric(as.character(tabc$lng))
tabc$rev[tabc$rev == 0]<-NA
tabc$yoy_rev[tabc$yoy_rev == 0]<-NA
tabc$mom_rev[tabc$mom_rev == 0]<-NA
tabc$yoy_diff <- tabc$rev - tabc$yoy_rev
tabc$mom_diff <- tabc$rev - tabc$mom_rev
tabc.sb <- filter(tabc, date == as.Date('2017-02-01'))kable(head(sample_n(tabc,6)))
Here we have revenues for a few random bars and months within the data. We also do a couple special things - we convert the address on file for the license holder to a latitude-longitude so that we can map it, and we also estimate the neighborhood the bar is in by seeing which neighborhood-polygon (generously provided by http://zetashapes.com/) it lies in.
To show you what this data looks like for a particular bar, let’s pick on Poison Girl, an excellent bar in the Montrose neighborhood. Great bourbon selection.
ggplot(data = filter(tabc,name == 'POISON GIRL'), mapping = aes(x=date,y=rev)) + geom_line(color='RED')Not bad! Poison girl, which opened well before 2011 by the way, saw increasing revenues over much of the last six years, but didn’t see much of a change during the Super Bowl.
Let’s look at somebody a little more Super Bowl-y, like say, The Four Seasons Hotel, a prestigious accommodation in downtown Houston.
ggplot(data = filter(tabc,name == 'SCLV HOUSTON BEVERAGE LLC'), mapping = aes(x=date,y=rev)) + geom_line(color = 'RED')That’s what a bump in sales from the Super Bowl looks like. You’ll notice that sometimes the name in the TABC records doesn’t exactly match the name of the business - you can bust out a map and see where the address of record is and find a more familiar name. Trust me that ‘SCLV HOUSTON BEVERAGE LLC’ is the Four Seasons.
Alright, let’s do some math.
So I’d like to answer a couple questions here:
How much extra bar money came into the city for the Super Bowl? Where was the most extra money spent? *What portion of the extra money went to the little guy?
My approach is going to look like this:
Make a model for bar sales, per bar. Examine the year-over-year, month-over-month, and vs-model change in revenues for the whole city for February. Figure who captured that change in revenues, and make some plots Interpret to see what kind of establishment made the most extra money
Here’s what bar sales in Houston look like:
tabc.m <- group_by(tabc,date)
tabc.m.s <- summarize(tabc.m,sum(rev,na.rm = TRUE))
names(tabc.m.s) <- c('date','rev')
ggplot(data = tabc.m.s, mapping = aes(x=date,y=rev)) + geom_line(color = 'RED')Just looking at total bar sales for the whole city, you probably wouldn’t notice anything special about February 2017. What we can do though, is see about forecasting sales for February 2017, and if the real sales are different, we can try to attribute that to the Super Bowl.
One way to forecast sales is to just take a comparable month (say from a year ago, to accommodate seasonal changes month-to-month) and call it your forecast sales. For a mature bar, you might reason, February this year should look the same as February last year, if you assume nothing has changed. If you know that one thing is different between the two months, like say the Super Bowl happened, then maybe you can attribute any difference solely to the influence of the Super Bowl.
This is, of course, a poor way to predict sales. A lot of things can change in a year, not to mention a bar might not have even been open in February 2016. Even if you look at the whole city’s sales things can change a lot year-over-year, Super Bowls notwithstanding.
Another way to do it is just by looking at month-over-month sales, but that only works if you can expect every month to sort of look the same as a baseline. This of course isn’t true for every month, since bar sales are very seasonal, with major drinking holidays or recurring events causing predictable spikes in sales. Some months are more similar than others though - you could do worse than comparing your Februaries to last August, for example, but this really depends on the bar.
A better way would be to use some sort of model to forecast sales for a given month. We’ve used ETS models before, which take into account seasonal effects as well as long-term trends in data to come up with a fairly solid prediction for sales. We can argue that if our sales don’t match the model, it’s due to some exogenous thing that occurred that month, which of course would be the Super Bowl.
A disadvantage there is that we need a few months data to get a good model built that way, so we won’t be able to use it for very new bars. For those, we’ll use either year-over-year or month-over-month numbers.
Alright. So how much extra money got spent in February 2017, vs February 2016?
tabc.m.s[74,] - tabc.m.s[73,] #12.1 million extra over Jan 2017## date rev
## 1 31 days 12120668
tabc.m.s[74,] - tabc.m.s[62,] #9.1 million extra over Feb 2016## date rev
## 1 366 days 9096839
tabc.m.s.pre <- filter(tabc.m.s,date <= '2017-1-01')
tabc.m.s.post <- filter(tabc.m.s,date >= '2017-2-01') #this is just to illustrate, it can be off by one month
tabc.m.ts <- ts(tabc.m.s.pre$rev, start = c(2011, 1), frequency = 12)
tabc.m.post.ts <- ts(tabc.m.s.post$rev, start = c(2017, 2), frequency = 12)
plot(stl(tabc.m.ts, s.window="periodic"))Took a quick break to show you the periodic components in the revenue signal. Check out the broad upward trend, in addition to the seasonal peaks. Now let’s make the model.
tabc.ets <- ets(tabc.m.ts, model = 'ZZZ')
forecast.ets <- predict(tabc.ets, n.ahead = 1, prediction.interval = T, level = 0.9)
plot(forecast.ets); points(tabc.m.post.ts,col='red',lwd = 2)The full forecast, let’s zoom in:
plot(forecast.ets,xlim=c(2016,2017.5)); points(tabc.m.post.ts,col='red',lwd = 2)That red circle is the real sales figure for Feburary 2017, the blue line is the prediction not including the Super Bowl, and the grey bars are the level that the model is 90% sure the prediction fits in. Notice how the red circle sits confidently above the grey bars - we’re more than 90% sure the real Feburary 2017 is a statistical outlier, a good indication that the Super Bowl did impact bar sales. Can we get a number for that?
effect.ets <- tabc.m.post.ts - forecast.ets$mean
effect.ets## Feb
## 2017 8965273
There you go, about 8.9 million extra dollars, according to our ETS model. A little less than the year-over-year difference, probably due to a slight upward trend in bar sales over that time frame. Also worth point out is that ETS tells us there’s a bit of a range on that number - a few million in either direction.
So, where did it go? Where did that money get spent? Let’s see if we can do that again, per bar. We’ll calculate a composite difference - we’ll use the ETS model if we can, followed by the year-over-year difference if we have it, resorting to month-over-month as a last resort. We’ll call that number - the extra cash a bar made during February 2017 - the Composite Difference, or composite_diff. The raw revenue for that month we’ll call rev.
bars <- unique(tabc.sb$name) #fun fact: 1814 unique licenseholders in Houston during the SB
bars.effect <- data.frame(matrix(nrow=length(bars),ncol=2))
names(bars.effect) <- c('name','model_diff')
for (i in 1:length(bars))
{
this_bar <- bars[i]
tabc.b <- filter(tabc, name == this_bar)
tabc.b.pre <- filter(tabc.b,date <= '2017-1-01')
tabc.b.post <- filter(tabc.b,date >= '2017-2-01')
tryCatch({
tabc.b.ts <- ts(tabc.b.pre$rev, start = c(2011, 1), frequency = 12)
tabc.b.post.ts <- ts(tabc.b.post$rev, start = c(2017, 2), frequency = 12)
tabc.b.ets <- ets(tabc.b.ts, model = 'ZZZ')
forecast.b <- predict(tabc.b.ets, n.ahead = 1, prediction.interval = T, level = 0.9)
data.frame(forecast.b)
effect <- tabc.b.post.ts - forecast.b$mean
bars.effect[i,]<-c(as.character(this_bar),effect[1])
},warning=function(w){bars.effect[i,]<-c(as.character(this_bar),NA)
},error=function(e) {bars.effect[i,]<-c(as.character(this_bar),NA)
})
}
bars.effect$model_diff <- as.numeric(bars.effect$model_diff)
tabc.sb <- left_join(tabc.sb, bars.effect,by= "name")
tabc.sb$composite_diff <- coalesce(tabc.sb$model_diff,tabc.sb$yoy_diff,tabc.sb$mom_diff)
tabc.sb.o <- arrange(tabc.sb, desc(composite_diff))
ggplot(tabc.sb.o, aes(composite_diff)) + geom_density() + xlim(-10000,250000)That composite difference is pretty spread out - this distribution has a long, long tail. If you saw an extra $25,000 during the Super Bowl, you did better than nearly 95% of the bars in Houston.
ggmap(get_map(location = "houston", zoom = 12)) + geom_density2d(data = filter(tabc.sb, composite_diff > 100000),
aes(x = lng, y = lat), size = 0.2) + stat_density2d(data = filter(tabc.sb, composite_diff > 100000),
aes(x = lng, y = lat, fill = ..level.., alpha = 0.01), size = 0.01,
bins = 16, geom = "polygon") +
scale_alpha(range = c(0, 0.3), guide = FALSE)## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=houston&zoom=12&size=640x640&scale=2&maptype=terrain&language=en-EN&sensor=false
## Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=houston&sensor=false
This is where the big money went, geographically speaking - no surprises here, those are also major concentrations of bars and fun stuff.
Let’s see who those bars are - the fortunate few who really raked in the cash.
kable(select(head(tabc.sb.o,100),name,composite_diff,rev))| name | composite_diff | rev |
|---|---|---|
| ARAMARK SPORTS & ENTERTAINMENT | 951025.97 | 1001301.94 |
| UNDERDOGS SPORTS PUB | 700337.95 | 742274.93 |
| ARAMARK SPORTS & ENTERTAINMENT | 449250.00 | 548942.99 |
| B & B BUTCHERS | 434117.01 | 738415.97 |
| HILTON AMERICAS - HOUSTON | 385909.42 | 1169312.99 |
| ROSE CLUB | 312567.02 | 657964.93 |
| WESTIN HOTEL MEMORIAL CITY | 297004.03 | 393892.99 |
| ARAMARK SPORTS AND ENTERTAINME | 291438.06 | 422144.93 |
| SCLV HOUSTON BEVERAGE LLC | 235174.03 | 556938.96 |
| TREASURES | 230877.85 | 486988.96 |
| AXELRAD BEER GARDEN | 219447.01 | 364145.97 |
| ART OF THE SPIRITS INC | 218657.01 | 263915.97 |
| LITTLE WOODROW’S KIRBY ICE HOU | 193071.94 | 515144.93 |
| TMC BEVERAGES OF HOUSTON INC | 179075.79 | 499580.90 |
| MARRIOTT MARQUIS HOUSTON | 174584.03 | 812937.91 |
| PAPPAS BROS GRILL | 171441.94 | 533973.88 |
| THE GROVE | 171288.91 | 408612.99 |
| DEL FRISCO’S DOUBLE EAGLE STEA | 168806.66 | 434622.99 |
| IRISH COWBOY | 167694.92 | 352933.88 |
| XOCHI | 162161.94 | 183064.93 |
| CONCRETE COWBOY HOUSTON | 160934.93 | 324324.93 |
| HOTEL ZA ZA | 156826.72 | 511943.88 |
| CHILI’S CHI01 | 140140.00 | 283930.90 |
| 1720 MAIN RECEPTION HALL | 135830.00 | 241861.94 |
| MCINTYRE’S | 135652.98 | 309495.97 |
| PAPPAS BROS STEAKHOUSE | 131270.34 | 565896.87 |
| REPUTATION | 124888.06 | 266774.93 |
| PROSPECT PARK | 123094.03 | 299701.94 |
| PAPPADEAUX SEAFOOD KITCHEN | 117768.95 | 139617.91 |
| AURA DYNAMIC NIGHTLIFE | 117498.06 | 277852.99 |
| PHM BEVERAGE INC | 113472.07 | 224438.96 |
| THE DUNLAVY | 113324.92 | 196317.91 |
| VIC & ANTHONY’S STEAKHOUSE | 109285.43 | 514420.90 |
| STEAK 45 | 105828.96 | 726250.90 |
| LE GRAND COMPTOIR | 103507.02 | 224490.90 |
| MR PEEPLES RESTAURANT & ENTERT | 102487.01 | 149207.91 |
| ROYAL SONESTA HOTEL HOUSTON | 97656.73 | 233208.96 |
| BOOTS ’N SHOOTS | 91041.94 | 257427.91 |
| GROTTO RISTORANTE | 84134.03 | 220160.90 |
| PHILIPPE RESTAURANT AND LOUNGE | 83557.02 | 180704.93 |
| 806 HOSPITALITY SERVICES, LLC | 82701.94 | 225927.91 |
| CHI CHI’S RICHMOND | 80311.05 | 300948.96 |
| MAJORCA BISTRO & TAPAS | 71708.96 | 239070.90 |
| THE REMINGTON GRILL | 70218.87 | 203718.96 |
| LITTLE WOODROW’S EAST END | 69510.01 | 171952.99 |
| STEREO LIVE | 68374.88 | 207901.94 |
| MORTON’S OF CHICAGO | 63875.03 | 221483.88 |
| THE PUB GALLERIA | 63358.95 | 77081.94 |
| HUNGRY’S CAFE & BISTRO | 62912.99 | 79300.90 |
| BRASSERIE DUE PARC | 62697.01 | 65303.88 |
| WAREHOUSE LIVE | 59934.85 | 151886.87 |
| LITTLE WOODROW’S IN MIDTOWN | 59291.05 | 483138.96 |
| EDDIE V’S #8509 | 59235.13 | 204746.87 |
| LUCKY’S PUB | 58694.03 | 177800.90 |
| MO’S IRISH PUB | 57720.00 | 158700.90 |
| DHR WEST LOOP BEVCO, INC. | 55960.67 | 159640.00 |
| THE FLYING SAUCER DRAUGHT EMPO | 54334.86 | 227605.97 |
| BELVEDERE | 52343.46 | 160552.99 |
| CAFE TOULOUSE | 51562.09 | 155962.99 |
| PAPPASITOS CANTINA | 51417.01 | 226785.97 |
| TRULUCK’S STEAK & STONE CRAB | 51279.62 | 299780.90 |
| CLUB ONYX HOUSTON | 49315.61 | 123537.91 |
| THE ALLEY KAT | 48401.94 | 95110.90 |
| MILE HIGH GENTLEMEN’S CLUB | 48103.13 | 69470.00 |
| THE ASTORIAN | 48102.09 | 83628.96 |
| BB’S CAFE | 47470.90 | 49510.90 |
| BAYOU CITY EVENT CENTER | 44665.97 | 64271.94 |
| THE BELLTOWER ON 34TH | 44285.36 | 89450.90 |
| SC BEVERAGE COMPANY | 44045.97 | 124880.90 |
| THE MOONSHINERS | 43759.10 | 119200.00 |
| LE COLONIAL RESTAURANT | 42762.09 | 371495.97 |
| MCGONIGEL’S MUCKY DUCK | 40462.99 | 50288.96 |
| SALC, INC. | 40352.67 | 147032.99 |
| DAILY GRILL | 40053.37 | 110275.97 |
| HAROLD’S | 39800.00 | 70992.99 |
| PISTOLEROS | 39465.97 | 90324.93 |
| THE OCEANAIRE SEAFOOD ROOM | 39106.53 | 147540.00 |
| CATTLEMAN’S CLUB/CACTUS JACKS’ | 38803.91 | 134993.88 |
| THE COMMONER/THE BOULEVARDIER | 38508.06 | 109968.96 |
| FOUR FAMILIES OF HOUSTON | 38025.95 | 720426.87 |
| SWAGGER | 37536.51 | 123244.93 |
| SULLIVANS STEAKHOUSE | 37164.02 | 110120.90 |
| OGDEN’S POUR SOCIETY | 36498.06 | 96540.00 |
| VALENCIA ORANGES IV, LLC | 36038.06 | 156194.93 |
| CENTERFOLDS | 35812.75 | 224201.94 |
| BRENNANS OF HOUSTON INC | 34881.15 | 319962.99 |
| MOLLY’S PUB | 34775.97 | 74310.90 |
| BAR 5015 | 34603.12 | 112605.97 |
| THE FEDERAL GRILL | 34548.06 | 96340.00 |
| HYATT REGENCY HOUSTON GALLERIA | 33801.94 | 74437.91 |
| SKYBOX SPORTS BAR | 33728.95 | 139721.94 |
| ELEVATED EVENTS LLC | 33472.99 | 84732.99 |
| MAGNOLIA SPIRITS HOUSTON INC | 33452.57 | 98560.90 |
| THE CAPITAL GRILLE #8007 | 33234.03 | 181072.99 |
| BELGIUM BEER BAR BEL01 | 33094.03 | 44788.96 |
| THE DUCK OFF SPORTS BAR | 32629.10 | 38000.00 |
| GATEWAY BEVERAGES, INC. | 32267.91 | 89236.87 |
| CASA BLANCA | 31955.07 | 37351.94 |
| YARD HOUSE | 31809.80 | 362070.00 |
| HOSPITALITY INTERNATIONAL INC | 31625.08 | 79742.99 |
And there you have it! The top 100 money-making liquor license holders in Houston during the Super Bowl.
Aramark is the company that has the catering and liquor license for NRG Stadium and a lot of the other official Super Bowl events around town. Underdogs Sports Pub is a small local bar that somehow handled the liquor for Club Nomadic, a traveling superclub that hosted a Taylor Swift concert during Super Bowl weekend as well as a number of other high-profile events. Rose Club is otherwise known in town as Clé. SCLV Houston Beverage LLC is the Four Seasons Hotel. Treasures is a gentleman’s club that Drake sings about sometimes (TMC Beverages of Houston is also known as The Men’s Club). I can’t explain how a Chili’s got on the list, but that Chili’s must have been pretty lit.
Let’s use this to answer that question we had - how much of this extra money went to big companies, which make money on the Superbowl every year? We’ll focus on the top 20 bars.
top20 <- tabc.sb.o[1:20,]
top20$local <- TRUE
corps <- c(1,2,3,5,7,8,9,15)
top20$local[corps] <- FALSE
plot <- ggplot(top20, aes(x="", y=composite_diff, fill=local)) +
geom_bar(width = 1, stat = "identity") +
coord_polar("y", start=0)
plotcorps.made <- sum(tabc.sb.o[corps,12]) #3486243
all.made <- sum(tabc.sb.o[1:20,12]) #6139314So for the Top 20 bars, which made about 6 million of the extra dough, 8 of them were non-local and they made about 55% of the excess revenues.
Fun! That’s all we’re going to do today.
The Comptroller of Public Accounts cannot vouch for the data or analysis derived from data after it has been retrieved from the Comptroller’s Web site.
Some numbers in the R may vary slightly from the numbers in the text, as some models are rolled at render-time and may involve some randomness.
The MIT License (MIT)
Copyright (c) 2017 Ian Wells
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.