Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: From Data to Viz: AREA CHART


Objective
The original visualisation uses an area chart to show how several variables evolve over time, here the popularity of nine American baby names. The same visualisation is also used at https://www.r-graph-gallery.com/223-faceting-with-ggplot2.html as an example of faceted plots, to explain what faceting is.
The target audience is anyone interested in these data visualisation techniques. Because the title is “Popularity of American baby names in previous 30 years”, it also includes anyone with at least a passing interest in baby names in America.

The visualisation above has the following three main issues:

  • Misleading
    The title says “Popularity of American baby names in previous 30 years”, but the code (available alongside the original visualisation) limits the data to an arbitrary list of nine female names. It is not clear why these nine names were selected, so a viewer may assume they are the most popular girls' names in America. The Y axis is also misleading because of range issues.

  • Failing to answer a practical question
    The visualisation does not answer the practical question implied by its title - “Which name is popular in America at any given time?” The compressed X axis makes it difficult even to tell which of these nine names was popular at a given point in time.

  • Arrangement
    The panels use the default alphabetical ordering instead of a meaningful ordering, such as by rank.
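
The arrangement issue can be illustrated with a minimal sketch in base R. The data frame below is a made-up toy example (the names and counts are hypothetical, not taken from the babynames data). It shows that a factor defaults to alphabetical level order, and how the levels can be reordered by count so that a plot would list the names by rank.

```r
# Toy data: hypothetical names and counts, for illustration only
toy <- data.frame(
  name = c("Mary", "Anna", "Emma", "Clara"),
  n    = c(300, 120, 450, 80),
  stringsAsFactors = FALSE
)

# Default behaviour: factor levels are sorted alphabetically
alphabetical <- factor(toy$name)
print(levels(alphabetical))

# Rank-based ordering: levels follow descending count
by_rank <- factor(toy$name, levels = toy$name[order(toy$n, decreasing = TRUE)])
print(levels(by_rank))
```

ggplot2 draws discrete axes and facets in factor-level order, so setting the levels explicitly is one common way to control the arrangement.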

Reference
  • The original data visualisation was found at https://www.data-to-viz.com/graph/area.html.

  • The data behind the original visualisation come from the babynames package in R. This package contains three datasets provided by the USA Social Security Administration. The dataset used here is babynames: for each year from 1880 to 2017, the number of children of each sex given each name. All names with more than 5 uses are given. (Source: http://www.ssa.gov/oact/babynames/limits.html)

Code

The following code was used to fix the issues identified in the original.

To see the Shiny app output, scroll inside the window. (Resize to a smaller window if you see "Warning: Error in : invalid quartz() device size".)

library(babynames)
library(shiny)
library(tidyverse)



# ================================================================================================
# ================================================================================================
# ui
# ================================================================================================
# ================================================================================================
ui <- fluidPage(

  fluidRow(

    # Filters: year slider and sex selector
    column(width = 12,
           wellPanel(titlePanel("Top 10 popular American baby names by year - 1888 to 2017"),
                     column(12,
                            sliderInput(inputId = "year",
                                        label = h3("Select year"),
                                        min = 1888,
                                        max = 2017,
                                        value = 2005,
                                        sep = "")  # show 2005 rather than 2,005
                     ),
                     column(4, offset = 0,
                            radioButtons(inputId = "sex",
                                         label = h3("Select gender"),
                                         choices = list("male" = "M", "female" = "F"),
                                         selected = "M")
                     )
           )
    )
  ),

  # Main plot
  column(12,
         plotOutput("distPlot")
  )
)

  

# ================================================================================================
# ================================================================================================
# Server 
# ================================================================================================
# ================================================================================================


server <- function(input, output, session) {
  
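  # Reactive expression: top 10 names for the selected sex and year
  don_reactive <- reactive({

    don_filter <- babynames %>%
      filter(sex == input$sex, year == input$year) %>%
      slice_max(n, n = 10) %>%
      arrange(n)
    # Fix the factor levels so bars are drawn in count order, not alphabetically
    don_filter$name <- factor(don_filter$name, levels = don_filter$name)
    # Return the data frame (print() also wrote it to the console on every update)
    don_filter

  })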
  
  

  output$distPlot <- renderPlot({

    # Colour-blind-friendly palette
    cbPalette <- c("#999999", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7")

    # Build the title; paste() already separates the parts with single spaces
    plot_title <- paste("Top 10",
                        case_when(
                          input$sex == "M" ~ "boy",
                          input$sex == "F" ~ "girl"),
                        "baby names in", input$year)

    # One bar layer only: the original drew the bars twice (geom_col and geom_bar)
    ggplot(data = don_reactive(), aes(x = name, y = n)) +
      geom_col(colour = "black", fill = cbPalette[3]) +
      geom_text(aes(label = name), position = position_stack(vjust = 0.5)) +
      geom_label(aes(label = n)) +
      coord_flip() +
      scale_x_discrete(labels = NULL, breaks = NULL) +
      ggtitle(plot_title) +
      theme(
        plot.title = element_text(color = "black", size = 26, face = "bold.italic"),
        axis.title = element_blank(),
        axis.text.y = element_blank()
      )
  })
}


# ================================================================================================
# ================================================================================================
# Shiny app construction
# ================================================================================================
# ================================================================================================
shinyApp(ui=ui, server=server)
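
The filtering step inside the reactive expression can be checked outside Shiny. The sketch below is a base R stand-in for the filter, slice_max and arrange steps, not the code used in the app: top_names() is a hypothetical helper introduced here for illustration, and the toy data frame uses made-up names and counts rather than real babynames values.

```r
# Hypothetical helper: base R stand-in for the filter + slice_max + arrange steps
top_names <- function(data, sex, year, k = 10) {
  subset_rows <- data[data$sex == sex & data$year == year, ]
  # Sort by descending count and keep the first k rows (ties are cut at k)
  ranked <- subset_rows[order(subset_rows$n, decreasing = TRUE), ]
  top <- head(ranked, k)
  # Re-sort ascending so the largest bar ends up on top after coord_flip()
  top <- top[order(top$n), ]
  top$name <- factor(top$name, levels = top$name)
  top
}

# Toy data: hypothetical names and counts, not real babynames values
toy <- data.frame(
  year = c(2000, 2000, 2000, 2000, 2001),
  sex  = c("F", "F", "F", "M", "F"),
  name = c("Ava", "Mia", "Zoe", "Leo", "Ava"),
  n    = c(50, 70, 20, 90, 55),
  stringsAsFactors = FALSE
)

result <- top_names(toy, sex = "F", year = 2000, k = 2)
print(result)
```

Unlike this helper, dplyr's slice_max() keeps ties by default (with_ties = TRUE), so the pipeline in the app can return more than 10 rows when several names share the count at the cutoff.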


Data Reference

Reconstruction

The following plot fixes the main issues in the original.