In the early 19th century, alcohol consumption of various kinds increased in America because of an overabundance of corn in the Western region. Today alcohol is a common part of the American diet, and it is more readily available than ever. We do not have data from the earlier period, but the World Health Organization (WHO) has recent data, from 2000 to 2010, that we can use to examine whether there is an increasing trend in alcohol consumption in America. The data also contain information on other countries, so we will also look at how consumption varies across the globe.

Data Source: http://apps.who.int/gho/data/node.main.A1026?lang=en

Uploaded to Github: https://raw.githubusercontent.com/pauluck/602/master/al.csv

1. Load the data file using the pandas library. Show the first few lines of data.

Country Data Source Beverage Types 2013 2012 2011 2010 2009 2008 2007 2006 2005 2004 2003 2002 2001 2000
Afghanistan Data source All types 0.00 0.00 0.03 0.02 0.03 0.01 0.01 0.01 0.00 0.00 0.00
Afghanistan Data source Beer 0.00 0.00 0.01 0.01 0.01 0.00 0.00 0.00 0.00 0.00 0.00
Afghanistan Data source Wine 0.00 0.00 0.00 0.01 0.01 0.00 0.00 0.00 0.00 0.00 0.00
Afghanistan Data source Spirits 0.00 0.00 0.02 0.00 0.02 0.01 0.01 0.01 0.00 0.00 0.00
Afghanistan Data source Other alcoholic beverages 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Albania Data source All types 4.96 4.98 5.58 5.36 5.22 5.04 4.91 4.41 4.27 3.94 4.54 3.96
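
A minimal sketch of the loading step, assuming the GitHub copy is a comma-separated file with a single header row (its exact layout is not shown here, so the column names below are assumptions). `load_alcohol_data` is a hypothetical helper name, and the in-memory sample only mirrors the first Afghanistan rows so the sketch runs offline.

```python
import io

import pandas as pd

# URL of the CSV copy given above.
DATA_URL = "https://raw.githubusercontent.com/pauluck/602/master/al.csv"


def load_alcohol_data(source):
    """Read the consumption table from a URL, path, or file-like object.

    Assumes a comma-separated file with one header row.
    """
    return pd.read_csv(source)


# Tiny stand-in with the same shape as the table above (assumed headers).
sample = io.StringIO(
    "Country,Data Source,Beverage Types,2010,2009\n"
    "Afghanistan,Data source,All types,0.00,0.00\n"
    "Afghanistan,Data source,Beer,0.00,0.00\n"
)
al = load_alcohol_data(sample)
print(al.head())  # show the first few lines
```

Passing `DATA_URL` instead of `sample` would read the real file, provided the assumptions about its layout hold.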

2. Drop the columns Data.Source and 2011-2013 because most of the data is missing there.

| Country | Beverage Types | 2010 | 2009 | 2008 | 2007 | 2006 | 2005 | 2004 | 2003 | 2002 | 2001 | 2000 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Afghanistan | All types | 0.00 | 0.00 | 0.03 | 0.02 | 0.03 | 0.01 | 0.01 | 0.01 | 0.00 | 0.00 | 0.00 |
| Afghanistan | Beer | 0.00 | 0.00 | 0.01 | 0.01 | 0.01 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Afghanistan | Wine | 0.00 | 0.00 | 0.00 | 0.01 | 0.01 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Afghanistan | Spirits | 0.00 | 0.00 | 0.02 | 0.00 | 0.02 | 0.01 | 0.01 | 0.01 | 0.00 | 0.00 | 0.00 |
| Afghanistan | Other alcoholic beverages | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Albania | All types | 4.98 | 5.58 | 5.36 | 5.22 | 5.04 | 4.91 | 4.41 | 4.27 | 3.94 | 4.54 | 3.96 |
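
The drop can be sketched as below. The header spellings are assumptions (the text says `Data.Source`, the printed table says `Data Source`), so the hypothetical helper `drop_sparse_columns` ignores names that are absent rather than failing.

```python
import pandas as pd


def drop_sparse_columns(df, columns=("Data Source", "2013", "2012", "2011")):
    """Return a copy of df without the listed columns.

    errors="ignore" skips names that are not present, because the exact
    header spelling in the CSV is an assumption.
    """
    return df.drop(columns=list(columns), errors="ignore")


# Two-row stand-in using the Afghanistan values shown above; None marks
# the empty 2011-2013 cells.
raw = pd.DataFrame({
    "Country": ["Afghanistan", "Afghanistan"],
    "Data Source": ["Data source", "Data source"],
    "Beverage Types": ["All types", "Beer"],
    "2013": [None, None], "2012": [None, None], "2011": [None, None],
    "2010": [0.00, 0.00], "2009": [0.00, 0.00], "2008": [0.03, 0.01],
})
al = drop_sparse_columns(raw)
print(list(al.columns))
```

Before dropping, `raw.isna().mean()` gives the share of missing cells per column, which is one way to confirm that the 2011-2013 columns are mostly empty.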

3. Pull out US data.

| | Country | Beverage Types | 2010 | 2009 | 2008 | 2007 | 2006 | 2005 | 2004 | 2003 | 2002 | 2001 | 2000 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 940 | United States of America | All types | 8.55 | 8.67 | 8.74 | 8.74 | 8.63 | 8.52 | 8.48 | 8.40 | 8.36 | 8.25 | 8.21 |
| 941 | United States of America | Beer | 4.28 | 4.43 | 4.54 | 4.54 | 4.54 | 4.50 | 4.58 | 4.58 | 4.66 | 4.66 | 4.62 |
| 942 | United States of America | Wine | 1.48 | 1.44 | 1.44 | 1.44 | 1.40 | 1.36 | 1.32 | 1.29 | 1.25 | 1.17 | 1.17 |
| 943 | United States of America | Spirits | 2.80 | 2.80 | 2.76 | 2.76 | 2.69 | 2.65 | 2.57 | 2.54 | 2.46 | 2.42 | 2.42 |
| 944 | United States of America | Other alcoholic beverages | No data | No data | No data | No data | No data | No data | No data | No data | No data | No data | 0.00 |
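
A sketch of the filtering step, assuming the column headers `Country` and `Beverage Types` and year columns named by their digits. Because some cells hold the text "No data", the hypothetical helper `select_country` also converts the year columns to numbers so later averages and plots work.

```python
import pandas as pd


def select_country(df, country):
    """Return the rows for one country with year columns made numeric."""
    rows = df[df["Country"] == country].copy()
    year_cols = [c for c in rows.columns if str(c).isdigit()]
    # "No data" and other non-numeric entries become NaN.
    rows[year_cols] = rows[year_cols].apply(pd.to_numeric, errors="coerce")
    return rows


# Stand-in frame built from values shown in the tables above.
al = pd.DataFrame({
    "Country": ["Albania", "United States of America",
                "United States of America"],
    "Beverage Types": ["All types", "Beer", "Other alcoholic beverages"],
    "2010": ["4.98", "4.28", "No data"],
    "2000": ["3.96", "4.62", "0.00"],
})
us = select_country(al, "United States of America")
print(us)
```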

4. What is the consumption trend in the US?

I will use the matplotlib library to make a line plot across all the years so we can visualise the trend.

x = list of years to map

ybeer = list of beer numbers for US
ywine = list of wine numbers for US
yspirits = list of spirits numbers for US

matplotlib.pyplot.plot(x, ybeer, 'r-', x, ywine, 'b-', x, yspirits, 'g-')
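
The plan above could look like the following sketch. The values are copied from the US rows in step 3, reordered so that 2000 comes first; the axis labels and the `Agg` backend (chosen so the sketch runs without a display) are my assumptions.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt

# Years and US values from the step 3 table, oldest year first.
years = list(range(2000, 2011))
ybeer = [4.62, 4.66, 4.66, 4.58, 4.58, 4.50, 4.54, 4.54, 4.54, 4.43, 4.28]
ywine = [1.17, 1.17, 1.25, 1.29, 1.32, 1.36, 1.40, 1.44, 1.44, 1.44, 1.48]
yspirits = [2.42, 2.42, 2.46, 2.54, 2.57, 2.65, 2.69, 2.76, 2.76, 2.80, 2.80]

fig, ax = plt.subplots()
ax.plot(years, ybeer, "r-", label="Beer")        # red line
ax.plot(years, ywine, "b-", label="Wine")        # blue line
ax.plot(years, yspirits, "g-", label="Spirits")  # green line
ax.set_xlabel("Year")
ax.set_ylabel("Consumption (as reported by WHO)")
ax.set_title("US alcohol consumption by beverage type, 2000-2010")
ax.legend()
```

Calling `fig.savefig("us_trend.png")` afterwards would write the figure to disk; the file name is only an example.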

5. How does American consumption compare with that of its neighbor, Canada?

First, I will pull out the rows for Canada and remove the unwanted ones. Then I will take the average of each beverage type and compare the two countries with a stacked bar plot using matplotlib.

x = ['beer','wine','spirits']

cn = data containing Canada

us = data containing US

avgbeerUS = US average beer consumption
avgwineUS = US average wine consumption
avgspiritsUS = US average spirits consumption
avgUS = list of all US averages

avgbeercn = Canada average beer consumption
avgwinecn = Canada average wine consumption
avgspiritscn = Canada average spirits consumption
avgCN = list of all CN averages

matplotlib.pyplot.bar(x, avgUS, width, color='r')
matplotlib.pyplot.bar(x, avgCN, width, bottom=avgUS, color='y')
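
A sketch of the averaging and stacking, under the assumption that each country's rows have already been filtered and their year columns converted to numbers. `beverage_averages` and `plot_stacked` are hypothetical helper names, and the column header `Beverage Types` is taken from the printed tables.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import pandas as pd

BEVERAGES = ["Beer", "Wine", "Spirits"]


def beverage_averages(rows, beverages=BEVERAGES):
    """Average each beverage's numeric year columns across all years."""
    indexed = rows.set_index("Beverage Types")
    years = indexed[[c for c in indexed.columns if str(c).isdigit()]]
    return years.loc[beverages].mean(axis=1)


def plot_stacked(avg_us, avg_cn):
    """Draw Canada's averages stacked on top of the US averages."""
    fig, ax = plt.subplots()
    labels = list(avg_us.index)
    ax.bar(labels, avg_us.values, width=0.5, color="r", label="US")
    # bottom= lifts the second series so the bars stack instead of overlap.
    ax.bar(labels, avg_cn.values, width=0.5, bottom=avg_us.values,
           color="y", label="Canada")
    ax.set_ylabel("Average consumption")
    ax.legend()
    return ax
```

With `us` and `cn` holding the two countries' rows, `plot_stacked(beverage_averages(us), beverage_averages(cn))` would produce the chart.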

6. Which country drinks the most wine? Beer? Spirits?

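I will use the pandas idxmax() function on each category to find the country with the largest 2010 value.

al[al['Beverage Types'] == 'Beer'].set_index('Country')['2010'].idxmax()
al[al['Beverage Types'] == 'Wine'].set_index('Country')['2010'].idxmax()
al[al['Beverage Types'] == 'Spirits'].set_index('Country')['2010'].idxmax()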
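
A runnable sketch of this lookup, assuming the headers `Country` and `Beverage Types` and a year column named `2010`. `top_country` is a hypothetical helper; it coerces the column to numbers first so that "No data" entries are skipped rather than compared as text.

```python
import pandas as pd


def top_country(df, beverage, year="2010"):
    """Return (country, value) with the largest value for one beverage."""
    rows = df[df["Beverage Types"] == beverage]
    values = pd.to_numeric(rows.set_index("Country")[year], errors="coerce")
    country = values.idxmax()  # label of the maximum; NaN entries are skipped
    return country, values[country]


# Stand-in frame with the 2010 beer and wine values shown in the tables above.
al = pd.DataFrame({
    "Country": ["Afghanistan", "United States of America"] * 2,
    "Beverage Types": ["Beer", "Beer", "Wine", "Wine"],
    "2010": [0.00, 4.28, 0.00, 1.48],
})
print(top_country(al, "Beer"))
```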

7. Create a plot showing overall world consumption.

I plan to take the average and standard deviation of each type of alcohol and compare each country against them. If a country falls more than 2 standard deviations below the average, its consumption is considered very low; if it falls more than 2 standard deviations above the average, it is placed in the high consumption category; all other countries are treated as medium.

The goal is to show a map of the world that highlights these consumption levels.

avgb = average of all consumption
sd = standard deviation of all consumption

high = countries with high consumption
med = countries with med consumption
low = countries with low consumption

plot data using **mpl_toolkits.basemap**
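
The classification that feeds the map can be sketched as follows. This reads the rule as "more than `k` standard deviations below the mean is low, more than `k` above is high, everything else is med", which is my interpretation of the plan; `classify_consumption` is a hypothetical helper, and the map drawing itself is not covered here.

```python
import pandas as pd


def classify_consumption(values, k=2.0):
    """Label each country low / med / high relative to mean +/- k std devs.

    values: numeric Series indexed by country (one beverage, one year).
    """
    avg = values.mean()
    sd = values.std()  # pandas default: sample standard deviation (ddof=1)
    labels = pd.Series("med", index=values.index)
    labels[values < avg - k * sd] = "low"
    labels[values > avg + k * sd] = "high"
    return labels
```

The returned labels can then be grouped, for example `labels[labels == "high"].index`, to get the list of countries to highlight in each colour.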