IS607 - Final Project - TTB Data and Analysis

Matt Moramarco

December 18, 2014

Background

This project explores the publicly available reports on the beverage alcohol industry published by the Alcohol and Tobacco Tax and Trade Bureau (TTB).

The motivation is the development of a startup that provides monthly analysis and reporting on this data, so that people in the wine industry can use it for business decisions.

The primary focus is on data acquisition, and on turning that acquisition into a repeatable process that yields easy-to-consume data for building reporting deliverables.

Data Acquisition and Issues

Data Loading

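-- Clear the landing table so the load can be re-run without duplicating rows
DELETE FROM land_wine_ttb;
-- Bulk-load the TTB wine CSV; the first line of the file is a header row
COPY land_wine_ttb
FROM '/Users/mmoramarco/Projects/cuny/is607_data_acquisition/Final Project/ttb_wine.csv'
WITH DELIMITER ','
CSV HEADER;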
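
One way to make this load repeatable is to generate the statements from parameters instead of hard-coding the table and file path. The following is a minimal sketch in R, not the project's actual loader: `build_load_sql` is a hypothetical helper, and `ttb_wine.csv` stands in for whatever absolute path the PostgreSQL server can read.

```r
# Hypothetical helper (not from the original project): assembles the
# DELETE + COPY statements for a given landing table and CSV path.
build_load_sql <- function(table, csv_path) {
  # Accept only plain identifiers so the table name is safe to interpolate
  stopifnot(grepl("^[A-Za-z_][A-Za-z0-9_]*$", table))
  # Double any single quotes so the path stays a valid SQL string literal
  safe_path <- gsub("'", "''", csv_path, fixed = TRUE)
  c(
    sprintf("DELETE FROM %s;", table),
    sprintf("COPY %s FROM '%s' WITH DELIMITER ',' CSV HEADER;", table, safe_path)
  )
}

# Placeholder path; a server-side COPY needs a path the server can read
statements <- build_load_sql("land_wine_ttb", "ttb_wine.csv")
cat(statements, sep = "\n")
```

Each element of `statements` could then be sent over a database connection from R, so a new monthly file only requires a different `csv_path`.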

Data Validation

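library(RPostgreSQL)
# Connect to the local database that holds the landing table
drv <- dbDriver("PostgreSQL")
con <- dbConnect(drv, dbname = "ttb_data")
# Pull a few rows to confirm the load worked; dbGetQuery() sends the query,
# fetches all rows, and clears the result set in one call
db_results <- dbGetQuery(con, "SELECT * FROM land_wine_ttb LIMIT 3")
print(db_results)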
##     date                                         measure     month
## 1 201210    Still Wine Bulk Removed From Fermentors (WG) 209020808
## 2 201210     Still Wine Bullk Increse by Sweetening (WG)   2422597
## 3 201210 Still Wine Bulk Increase by Adding Spirits (WG)   1822963
##   monthprioryear       ytd ytdprioryear
## 1      211136221 437750041    426968365
## 2        2689228  21485301     22457992
## 3        2050951   8348775      7869998
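
Beyond eyeballing a few rows, the same check can be automated. The sketch below is only an illustration under stated assumptions: `validate_ttb` is a hypothetical function, the rules it applies (dates in YYYYMM form, numeric columns fully populated, year-to-date at least as large as the month) are guesses about what a valid row looks like, and the sample data frame simply re-enters the three rows printed above so that it runs without a database.

```r
# The three rows printed above, copied verbatim (including the spelling in
# the measure names) so the sketch runs without a database connection.
sample_rows <- data.frame(
  date = c(201210, 201210, 201210),
  measure = c("Still Wine Bulk Removed From Fermentors (WG)",
              "Still Wine Bullk Increse by Sweetening (WG)",
              "Still Wine Bulk Increase by Adding Spirits (WG)"),
  month = c(209020808, 2422597, 1822963),
  monthprioryear = c(211136221, 2689228, 2050951),
  ytd = c(437750041, 21485301, 8348775),
  ytdprioryear = c(426968365, 22457992, 7869998),
  stringsAsFactors = FALSE
)

# Hypothetical checks; the rules are assumptions, not the author's own.
validate_ttb <- function(df) {
  expected <- c("date", "measure", "month", "monthprioryear", "ytd", "ytdprioryear")
  missing_cols <- setdiff(expected, names(df))
  if (length(missing_cols) > 0) {
    stop("missing columns: ", paste(missing_cols, collapse = ", "))
  }
  numeric_cols <- setdiff(expected, c("date", "measure"))
  c(
    # Assumes date is a year followed by a two-digit month (01-12)
    date_is_yyyymm = all(grepl("^[0-9]{4}(0[1-9]|1[0-2])$", as.character(df$date))),
    # Every volume column should be numeric with no missing values
    numerics_present = all(vapply(df[numeric_cols],
                                  function(x) is.numeric(x) && !anyNA(x),
                                  logical(1))),
    # Assumes a year-to-date total is never smaller than the current month
    ytd_at_least_month = all(df$ytd >= df$month)
  )
}

print(validate_ttb(sample_rows))
```

In practice the function would be called on the full result of a query against `land_wine_ttb` rather than on hand-entered rows, and any `FALSE` entry would flag the load for review.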

Preliminary Analysis

Result Plot

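library(ggplot2)
# share_melt is a long-format data frame with columns `variable` and `value`;
# its construction is not shown in this section
ggplot(share_melt, aes(x = variable, y = value, fill = variable)) +
  geom_bar(stat = "identity") +
  xlab("Category") + ylab("Change in % Share") +
  ggtitle("Change in % Share of Consumption")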

Conclusion and Next Steps