12 07 2022

BRIEF

The dashboard visualizes production statistics for primary and processed crops. The statistics are gathered from FAO (link shown below). In the source, the following selections were made: all countries in the ‘Countries’ section, Production Quantity in the ‘Elements’ section, Crops Primary > (List) and Crops Processed > (List) in the ‘Items’ section, and 2000-2020 in the ‘Years’ section. The dashboard consists of two visual tabs. The application can be found here: https://c94mnx-mehmet-0l0k.shinyapps.io/Food_Analysis/

APP STRUCTURE

The application consists of two R files, ui.R and server.R. In ui.R, the app downloads a zip file containing a CSV file that was gathered from FAO and uploaded to GitHub. The relevant column names are changed to Country and Tonnes. The first tab uses a googleVis map. The googleVis package can highlight countries by name, which suits our data well, but some country names in googleVis do not match the names in the data, so those names are edited to match googleVis.

Data Mutating and Creating Variables

...
data$Area[data$Area=="Bahamas"] <- "The Bahamas"
# Sorted choices for the select inputs
select_items <- sort(unique(data$Item))
select_years <- unique(data$Year)
select_countries <- sort(unique(data$Country))
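
The renaming and name-matching steps can be sketched on a toy data frame. This is only an illustration, not the app's actual code: the toy rows are made up, the mapping of Area to Country and Value to Tonnes is an assumption, and name_fixes is a hypothetical lookup in which only the Bahamas entry comes from the app.

```r
# Toy stand-in for the FAO CSV (values are made up; the real file has more columns).
data <- data.frame(
  Area  = c("Bahamas", "Turkey", "Bahamas"),
  Item  = c("Bananas", "Wheat", "Maize"),
  Year  = c(2000, 2000, 2001),
  Value = c(10, 200, 5),
  stringsAsFactors = FALSE
)

# Assumption: the renamed columns are Area -> Country and Value -> Tonnes.
names(data)[names(data) == "Area"]  <- "Country"
names(data)[names(data) == "Value"] <- "Tonnes"

# Hypothetical lookup: name in the data -> name expected by googleVis.
# Only the Bahamas entry is taken from the app.
name_fixes <- c("Bahamas" = "The Bahamas")
to_fix <- data$Country %in% names(name_fixes)
data$Country[to_fix] <- unname(name_fixes[data$Country[to_fix]])

# Sorted choices for the select inputs.
select_items     <- sort(unique(data$Item))
select_countries <- sort(unique(data$Country))
```

Keeping the corrections in one named vector means a further mismatch can be handled by adding an entry rather than another assignment line.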

FIRST TAB: MAP OF PRODUCTION

This tab has a sidebar with two inputs: Food and Year. The application filters the data by these inputs, sorts the result by tonnes produced, and draws the choropleth map. The choices for the inputs were created earlier in ui.R. The sidebar also shows a table of the top producer countries:

  sidebarLayout(
               sidebarPanel(
                 selectInput("food_var",
                             label = "Select a Food",
                             choices = select_items),
                 selectInput("year_var",
                             label = "Select a Year",
                             choices = select_years),
                 h3("Top Producer Countries:"),
                 dataTableOutput("toptable")
               ),
               
               # Show a plot of the world map
               mainPanel(
                 htmlOutput("distPlot")
               )
             )

ui.R thus declares that distPlot will be rendered. In server.R, the application generates the choropleth map from the input selections:

  output$distPlot <- renderGvis({
    # Choropleth of tonnes produced for the selected year and food
    gvisGeoChart(filter(data, Year == input$year_var, Item == input$food_var),
                 locationvar = 'Country', colorvar = 'Tonnes',
                 options = list(projection = "kavrayskiy-vii", width = 900, height = 1000))
  })

The data is filtered by “input$year_var” and “input$food_var” and then plotted. The same filtered data is sorted by tonnes in descending order, and the top ten rows are displayed as a table in the sidebar.

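  output$toptable <- renderDataTable({
    # Rows for the selected year and food
    selected <- filter(data, Year == input$year_var, Item == input$food_var)
    # Sort by production (descending); keep columns 4 and 12 (country, tonnes)
    selected <- selected[order(selected$Tonnes, decreasing = TRUE), c(4, 12)]
    # Show at most the ten largest producers
    head(selected, 10)
  })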
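
The filter, sort, and truncate logic behind the table can be tried outside Shiny. The sketch below uses base R only; top_producers is a hypothetical helper, and the toy data (countries A, B, C and their tonnes) is made up for illustration.

```r
# Toy data with the column names used in the app (values are made up).
data <- data.frame(
  Country = c("A", "B", "C", "A", "B"),
  Item    = c("Wheat", "Wheat", "Wheat", "Maize", "Wheat"),
  Year    = c(2000, 2000, 2000, 2000, 2001),
  Tonnes  = c(30, 50, 10, 99, 70),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the top n producers of one food in one year.
top_producers <- function(df, food, year, n = 10) {
  rows <- df[df$Item == food & df$Year == year, c("Country", "Tonnes")]
  rows <- rows[order(rows$Tonnes, decreasing = TRUE), ]
  head(rows, n)  # head() returns fewer rows when fewer than n exist
}

top <- top_producers(data, "Wheat", 2000)
print(top)
```

Using head() instead of indexing with 1:10 avoids padding the table with NA rows when a food has fewer than ten producers in the selected year.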

SECOND TAB: COUNTRY SUMMARY

In this tab the user chooses a country and a year from the sidebar. The main panel shows two graphs based on these selections. The ui.R code is:

tabPanel("Country Summary",
             # Sidebar with country and year input
             titlePanel("Food Production By Countries"),
             sidebarLayout(
               sidebarPanel(
                 selectInput("country_var",
                             label = "Select a Country",
                             choices = select_countries),
                 selectInput("year_var2",
                             label = "Select a Year",
                             choices = select_years)
                 
               ),
               mainPanel(
                 # Bar plot 
                 h3("Top 10 Food Production"),
                 plotlyOutput(outputId = "p"),
                 # Scatter Plot with Trend Lines Added
                 h3("Total Food Production By Years"),
                 plotlyOutput(outputId = "p2")
               )
             )
)

The server.R file renders the p and p2 plots based on the selections. Let’s look at p:

output$p <- renderPlotly({
    # Rows for the selected country and year, largest production first
    selected <- filter(data, Country == input$country_var,
                       Year == input$year_var2)
    top10 <- head(selected[order(selected$Tonnes, decreasing = TRUE), ], 10)
    # One bar per food item, colored by item
    plot_ly(top10, x = ~Item, y = ~Tonnes, color = ~Item,
            colors = "BrBG", type = "bar") %>% hide_legend()
  })

In this tab, the top 10 food types are displayed in a bar plot. The second plot shows the total food production of the selected country by year. Two lines are added to this scatter plot: the first is a linear regression line and the second is a polynomial regression line (power=2). In some cases the polynomial line fits better. For this purpose we have two reactive functions. The regression function calculates the total food production of the selected country for each year and creates a new data frame holding the years and the totals. It then fits an lm() model (year is the only regressor), predicts the total production for each year, and returns the predictions.

regression <- reactive({
    # Total production per year for the selected country
    totals <- sapply(2000:2020, function(i) {
      sum(filter(data, Year == i, Country == input$country_var)$Tonnes)
    })
    newdf <- data.frame(index = 2000:2020, tonnes = totals)
    # Linear model with year as the only regressor
    model <- lm(tonnes ~ index, newdf)
    # Fitted values for 2000-2020
    predict(model)
  })

Similarly, the polynomial regression line needs another reactive function in server.R:

polynomial <- reactive({
    # Total production per year for the selected country
    totals <- sapply(2000:2020, function(i) {
      sum(filter(data, Year == i, Country == input$country_var)$Tonnes)
    })
    newdf <- data.frame(index = 2000:2020, tonnes = totals)
    # Second-degree polynomial in year
    model <- lm(tonnes ~ poly(index, 2, raw = TRUE), newdf)
    # Fitted values for 2000-2020
    predict(model)
  })
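
To see why the polynomial line sometimes fits better, the two models can be compared on toy data outside Shiny. This is a minimal sketch, not the app's code: the data frame toy and its values are made up so that the yearly totals follow a curved trend with a small wiggle, and aggregate() stands in for the per-year summing done in the reactive functions.

```r
# Toy rows: two items per year, built so yearly totals follow a curved trend.
years <- 2000:2020
t <- years - 2000
yearly_total <- 1000 + 50 * t + 5 * t^2 + 20 * (-1)^t
toy <- data.frame(
  Year   = rep(years, each = 2),
  Tonnes = rep(yearly_total / 2, each = 2)
)

# Total production per year.
totals <- aggregate(Tonnes ~ Year, data = toy, FUN = sum)

# Straight line versus second-degree polynomial in year.
linear_fit <- lm(Tonnes ~ Year, data = totals)
poly_fit   <- lm(Tonnes ~ poly(Year, 2, raw = TRUE), data = totals)

# Share of variance explained by each model.
r2_linear <- summary(linear_fit)$r.squared
r2_poly   <- summary(poly_fit)$r.squared
cat("linear R^2:", r2_linear, " polynomial R^2:", r2_poly, "\n")
```

Because the linear model is nested in the quadratic one, the polynomial R-squared can never be lower; how large the gap is indicates whether the curvature is worth showing.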

Finally, the app plots the scatter points together with the linear and polynomial regression lines.

  output$p2 <- renderPlotly({
    # Total production per year for the selected country
    totals <- sapply(2000:2020, function(i) {
      sum(filter(data, Year == i, Country == input$country_var)$Tonnes)
    })
    # Scatter points for the totals, then the two fitted lines
    plot_ly(x = 2000:2020, y = totals, color = 2000:2020,
            colors = c("#56B1F7", "#132B43"), size = 22, name = "Total Production",
            mode = "markers", type = "scatter") %>%
      add_trace(y = regression(), name = "Regression Line",
                line = list(color = "purple", width = 1), mode = "lines") %>%
      add_trace(y = polynomial(), name = "Polynomial Line (Power=2)",
                line = list(color = "brown", width = 1), mode = "lines")
  })