Appendix 1: Data from MySQL:

For each dataset the query was saved into seperate .CSV files, uploaded to GitHub, and imported into MySQL; from MySQL they were prepared for import into R. This section will detail the data files (original .CSV’s), the source, and the queries that were used to create our final datasets: honey_state and honey_county. The queries past the import wizard can be found here, and the query has been saved in GitHub.

Data Import:

Simply enough, each .CSV was imported into a schema (titled HISP – a former title of the project) using the MySQL data import wizard.

National Agriculture Statistics Service:

The first set of data we gathered came from the National Agriculture Statistics Service website, which houses a large amount of data related to United States Agriculture. The website stores census and survey data on a number of subjects (the main categories are animals, products, crops, demographics, economics, and environment) across a number of geographic levels (national, state, county, and zip code) for a number of years. The data can be access by the website UI allowing the user to select each subject, geographic level, and year which can then be exported into a .CSV file, or the entire database can be downloaded as a text file after extracting the database .GZ. For our purposes, it was simple enough to query the database using the website rather than have to deal with the large and cumbersome dataset.

To import the data into R (rather than MySQL) we’d use the following code (all data has been backed-up into my personal GitHub account):

#County Data
county1 <- read.csv("https://raw.githubusercontent.com/chrisgmartin/HoneyIShrunkTheBees/master/Data/CAHoneyCensus_County_2012.csv", header = TRUE) #Honey Census 2012 with no other years
county2 <- read.csv("https://raw.githubusercontent.com/chrisgmartin/HoneyIShrunkTheBees/master/Data/USAChemicalandFertilizerTotals_County_allyears.csv", header = TRUE)
county3 <- read.csv("https://raw.githubusercontent.com/chrisgmartin/HoneyIShrunkTheBees/master/Data/USACropTotalsCensus_County_allyears.csv", header = TRUE)
county4 <- read.csv("https://raw.githubusercontent.com/chrisgmartin/HoneyIShrunkTheBees/master/Data/USAHoneyCensus_County_allyears.csv", header = TRUE)
county5 <- read.csv("https://raw.githubusercontent.com/chrisgmartin/HoneyIShrunkTheBees/master/Data/USAHorticultureCensus_County_allyears.csv", header = TRUE)

#State Data
state1 <- read.csv("https://raw.githubusercontent.com/chrisgmartin/HoneyIShrunkTheBees/master/Data/USAHoneySurvey_State_allyears.csv", header = TRUE) #State Survey (no county survey)
state2 <- read.csv("https://raw.githubusercontent.com/chrisgmartin/HoneyIShrunkTheBees/master/Data/USAChemicalandFertilizerTotals_State_allyears.csv", header = TRUE)
state3 <- read.csv("https://raw.githubusercontent.com/chrisgmartin/HoneyIShrunkTheBees/master/Data/USACropTotalsCensus_State_allyears.csv", header = TRUE)
state4 <- read.csv("https://raw.githubusercontent.com/chrisgmartin/HoneyIShrunkTheBees/master/Data/USAHoneyCensus_State_allyears.csv", header = TRUE)
state5 <- read.csv("https://raw.githubusercontent.com/chrisgmartin/HoneyIShrunkTheBees/master/Data/USAHorticultureCensus_State_allyears.csv", header = TRUE)

#Final Data
honey_county <- read.csv('https://raw.githubusercontent.com/chrisgmartin/HoneyIShrunkTheBees/master/Data/honey_county.csv', header = TRUE)
honey_state <- read.csv('https://raw.githubusercontent.com/chrisgmartin/HoneyIShrunkTheBees/master/Data/honey_state.csv', header = TRUE)
FIPS <- read.csv('https://raw.githubusercontent.com/chrisgmartin/HoneyIShrunkTheBees/master/Data/FIPS.csv', header = TRUE)

With a connection to MySQ estabilshed, these files can be queried by the following table names:

  • cahoneycensus_county_2012

  • USAChemicalandFertilizerTotals_County_allyears

  • USAChemicalandFertilizerTotals_State_allyears

  • USACropTotalsCensus_County_allyears.csv

  • USACropTotalsCensus_State_allyears.csv

  • USAHoneyCensus_County_allyears.csv

  • USAHoneyCensus_State_allyears.csv

  • USAHoneySurvey_State_allyears.csv

  • USAHorticultureCensus_County_allyears.csv

  • USAHorticultureCensus_State_allyears.csv

And the final datasets can be queried with two simple tables titled:

  • honey_county: which lists all of the county level statistics

  • honey_state: which lists all of the state level statistics