setwd("/home/rstudio-user/R/Assignment_5/Data")
FruitYearbookSupplyandUtilization_GTables <- read_excel("/home/rstudio-user/R/Assignment_5/Data/FruitYearbookSupplyandUtilization_GTables.xlsx")
The data, “U.S. supply and utilization: fresh, canned, juice, dried; per capita use, U.S. population,” is sourced from the USDA and can be accessed at the following URL: https://www.ers.usda.gov/data-products/fruit-and-tree-nuts-data/fruit-and-tree-nuts-yearbook-tables/#General
This report focuses on “Table G-1 – Fresh apples: Supply and utilization, 1980/81 to date” updated on 10/29/2020.
In this table, the columns from Utilized production through Domestic are in million pounds, and Per capita use is in pounds.
fruitSU <- slice_max(FruitYearbookSupplyandUtilization_GTables, Season, n=10)
knitr::kable(fruitSU, digits = 2)
| Season | Utilized production | Imports | Total supply | Exports | Domestic | Per capita use |
|---|---|---|---|---|---|---|
| 2019/20 | 7430.2 | 237.65 | 7667.85 | 1900.65 | 5767.20 | 17.46 |
| 2018/19 | 6865.1 | 322.72 | 7187.82 | 1636.39 | 5551.44 | 16.91 |
| 2017/18 | 7815.8 | 296.13 | 8111.93 | 2220.65 | 5891.27 | 18.06 |
| 2016/17 | 7745.1 | 377.08 | 8122.18 | 1912.54 | 6209.63 | 19.15 |
| 2015/16 | 6928.1 | 414.61 | 7342.71 | 1715.23 | 5627.48 | 17.48 |
| 2014/15 | 7909.0 | 360.14 | 8269.14 | 2286.17 | 5982.98 | 18.71 |
| 2013/14 | 6918.7 | 469.99 | 7388.69 | 1858.56 | 5530.13 | 17.43 |
| 2012/13 | 6594.9 | 430.17 | 7025.07 | 1969.17 | 5055.90 | 16.05 |
| 2011/12 | 6312.9 | 381.06 | 6693.96 | 1854.94 | 4839.02 | 15.47 |
| 2010/11 | 6248.8 | 328.68 | 6577.48 | 1823.15 | 4754.33 | 15.31 |
The first graph signals that the US is a strong producer of apples. In each season shown, the country exports roughly four to eight times as much as it imports. Plotting imports and exports side by side makes the gap between the two easy to see, but it is hard to track how that gap changes from year to year. Graph 1.2 was created to look deeper into this relationship by plotting the difference between imports and exports for each season. Over the 10 seasons of sample data, the relationship appears steady, with alternating boom and bust seasons.
fruitIE <- select(fruitSU, "Season", "Imports", "Exports")
fruitIE_long <- reshape2::melt(fruitIE, id.vars = "Season", value.name = "value")
# First graph: imports and exports side by side for each season.
ggplot(fruitIE_long, aes(x = Season, y = value, fill = variable)) +
  geom_col(position = "dodge") +
  ylab("Fresh apples (million pounds)") +
  theme(axis.text.x = element_text(angle = 45))
# Graph 1.2: imports minus exports per season (negative bars mean exports exceed imports).
ggplot(fruitIE, aes(x = Season, y = Imports - Exports)) +
  geom_col(position = "identity", color = "red", fill = "darkgreen") +
  theme(axis.text.x = element_text(angle = 45)) +
  ylab("Imports-Exports (million pounds)")
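The import and export comparison discussed above can also be checked numerically. The following is a minimal sketch in base R, using the values from the ten-season table; the names `trade`, `export_import_ratio`, and `net_exports` are illustrative and do not appear in the original analysis.

```r
# Values copied from the ten-season table above (million pounds).
trade <- data.frame(
  Season  = c("2019/20", "2018/19", "2017/18", "2016/17", "2015/16",
              "2014/15", "2013/14", "2012/13", "2011/12", "2010/11"),
  Imports = c(237.65, 322.72, 296.13, 377.08, 414.61,
              360.14, 469.99, 430.17, 381.06, 328.68),
  Exports = c(1900.65, 1636.39, 2220.65, 1912.54, 1715.23,
              2286.17, 1858.56, 1969.17, 1854.94, 1823.15)
)

# Exports expressed as a multiple of imports, season by season.
trade$export_import_ratio <- trade$Exports / trade$Imports

# Net trade position; positive values mean exports exceed imports.
trade$net_exports <- trade$Exports - trade$Imports

print(trade[, c("Season", "export_import_ratio", "net_exports")], digits = 3)
```

Looking at the ratio alongside the difference separates two questions: how large exports are relative to imports, and how many million pounds the gap amounts to in each season.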
In this graph, whose y-axis doesn’t start at zero, you can see that per capita consumption of fresh apples is tracking upward. The data span a range of about 4 lbs.
fruitU <- select(fruitSU, "Season", "Per capita use")
fruitU_ <- rename(fruitU, Per_capita_use = "Per capita use")
# Per capita use by season; the y-axis keeps ggplot2's default limits, so it does not start at zero.
ggplot(fruitU_, aes(x = Season, y = Per_capita_use)) +
  geom_point(shape = 15, size = 5) +
  ylab("Per capita use (pounds)")
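The range and direction of the per capita figures can be checked with a few lines of base R. This is a rough sketch using the values from the table; the names `per_capita`, `spread`, and `slope` are illustrative, and the straight-line fit is an assumption made here to summarize the direction of change, not a method used in the original analysis.

```r
# Per capita use (pounds) copied from the table, in chronological order.
per_capita <- data.frame(
  Season = c("2010/11", "2011/12", "2012/13", "2013/14", "2014/15",
             "2015/16", "2016/17", "2017/18", "2018/19", "2019/20"),
  Per_capita_use = c(15.31, 15.47, 16.05, 17.43, 18.71,
                     17.48, 19.15, 18.06, 16.91, 17.46)
)

# Spread between the highest and lowest season.
spread <- diff(range(per_capita$Per_capita_use))

# Simple linear fit against the season index; the slope is the
# average change in pounds per season over the ten seasons.
per_capita$index <- seq_len(nrow(per_capita))
fit <- lm(Per_capita_use ~ index, data = per_capita)
slope <- unname(coef(fit)["index"])

print(spread)
print(slope)
```

A single slope hides the season-to-season swings visible in the plot, so it is best read as a summary of direction only.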
The first problem encountered was the layout of the Excel file from the USDA website. The file is a multi-sheet workbook rather than a CSV, so it had many tabs that did not read into R. The file also used multiple rows for header labels, which I could not figure out how to combine when importing the data. I edited the original data file, removing the formatting and the stacked headers above the data, so that it would read into R correctly. The headers in the original data contained spaces, which confuses R when referencing a column. I tried to rename the column with underscores, but I could not get R to recognize the heading with spaces until I created a new table. I also had an issue knitting the document: it returned an error on Line 26 that it could not find my database. As a result, I moved my project into home>R and set that as the working directory using the setwd() function.
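Two of the issues described above, the multiple tabs with stacked header rows and the column names containing spaces, might be handled in code rather than by editing the workbook. The sketch below is hedged: `read_g_table` is a hypothetical helper, and the sheet name, the number of rows to skip, and the column names passed to it are assumptions that would need to be checked against the actual USDA file. The `sheet`, `skip`, and `col_names` arguments of `readxl::read_excel()` are real. The second half uses a small made-up data frame, `demo`, to show backtick quoting.

```r
# Hypothetical helper: reads one tab of a workbook, skips the stacked
# header rows, and supplies column names directly. The caller must
# provide the sheet, the number of rows to skip, and the names; none
# of these are known from the original report.
read_g_table <- function(path, sheet, skip, col_names) {
  readxl::read_excel(path, sheet = sheet, skip = skip, col_names = col_names)
}

# Column names with spaces can be referenced with backticks.
# check.names = FALSE keeps the space instead of converting it to a dot.
demo <- data.frame(
  Season = c("2018/19", "2019/20"),
  `Per capita use` = c(16.91, 17.46),
  check.names = FALSE
)

# Backticks let R treat the whole name as one identifier.
mean_use <- mean(demo$`Per capita use`)

# Renaming in base R, so later code can use a name without spaces.
names(demo)[names(demo) == "Per capita use"] <- "Per_capita_use"
print(names(demo))
```

A call such as `read_g_table(path, sheet = 1, skip = 3, col_names = my_names)` would be one way to use the helper, where the values shown are placeholders rather than the settings the USDA file requires.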