---
title: "Assignment 2"
subtitle: "Deconstruct, Reconstruct Web Report"
author: "Naresh Shankar (s3927631)"
output: html_document
urlcolor: blue
---

Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: NCAA Sponsorship & Participation Rates Report 1981-82 - 2017-18.


Objective

The original visualisation aims to show that women's participation in college athletics has lagged behind men's, and that men have historically been given more opportunities to compete. The target audience is the general public, who can use it to understand where women's participation stands today.

The visualisation chosen had the following three main issues:

  • A bar graph is used to compare the two groups. Because the data track participation over consecutive years, a line graph would be a better choice: “It plots a series of related values that depict a change in Y as a function of X” (Slutsky, 2014).
  • The background colour is pink, which is distracting and makes the data harder to read at a glance.
  • The y-axis has no label, so it is unclear whether the values are numbers of students or something else.

Reference

Code

The following code was used to fix the issues identified in the original. Note that I first copied the dataset from the PDF report into an Excel file; the steps below start from that file.

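# Load plotting and reshaping packages (tidyr also provides the %>% pipe)
library(ggplot2)
library(tidyr)
# Avoid scientific notation in axis and text labels
options(scipen = 10000)
# Read the Excel copy of the NCAA table, skipping the first row of the sheet
df_sports <- readxl::read_xlsx("~/My_R_Projects/Data_Visualisation/Assessment 2/NCAA_Dataset.xlsx", skip = 1)
# Keep only the part of Year before the first hyphen (the starting year of the season)
df_sports$Year <- gsub("\\-.*", "", df_sports$Year)
# Reshape from wide (Men, Women columns) to long (one row per year and sex)
df_sports_long <- df_sports %>% gather("sex", "TotalStudents", c("Men", "Women"))
# Base plot: year as a discrete x-axis, one series per sex
df_sports_plot <- ggplot(data = df_sports_long, aes(x = as.factor(Year), y = TotalStudents, group = sex))
# Line graph replaces the original bars; blue and orange distinguish the two series
df_sports_plot <- df_sports_plot + geom_line(aes(color = sex)) + scale_color_manual(values = c("blue", "orange"))
df_sports_plot <- df_sports_plot + geom_point(size = 1)
# Start the y-axis at zero and drop overlapping year labels on the x-axis
df_sports_plot <- df_sports_plot + scale_y_continuous(limits = c(0, 300000)) + scale_x_discrete(guide = guide_axis(check.overlap = TRUE))
# Add the title, source caption and axis labels, including the y-axis label missing from the original
df_sports_plot <- df_sports_plot + labs(title = "Student Participation in College Athletics", subtitle = "Gender-wise Athletics Participation", y = "No. of Students", x = "Year", caption = "Source: NCAA Sports Sponsorship and Participation Rates Report (Oct 2018)")
# Print the participant count next to each point, skipping labels that would overlap
df_sports_plot <- df_sports_plot + geom_text(aes(label = TotalStudents), check_overlap = TRUE, nudge_x = 0.25, nudge_y = 0.25, hjust = "inward", vjust = "bottom", size = 3)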
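
To make the two data preparation steps above easier to follow, here is a minimal sketch run on a made-up two-row table. The values are invented purely for illustration and are not NCAA figures, the season format "1981-82" is an assumption based on the report title, and the names `toy` and `toy_long` are hypothetical.

```r
library(tidyr)

# Made-up values for illustration only; these are NOT the NCAA figures
toy <- data.frame(
  Year  = c("1981-82", "1982-83"),
  Men   = c(100, 110),
  Women = c(40, 50),
  stringsAsFactors = FALSE
)

# Drop everything from the first hyphen onwards, leaving the starting year
toy$Year <- gsub("\\-.*", "", toy$Year)

# Wide to long: the Men and Women columns become (sex, TotalStudents) pairs
toy_long <- toy %>% gather("sex", "TotalStudents", c("Men", "Women"))

print(toy_long)
```

After reshaping, each year appears once per sex, which is the layout ggplot2 needs in order to draw one line per group from a single `sex` column.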

Data Reference

Other Coding References

Reconstruction

The graph has been redrawn as a line chart that shows how men's and women's participation has grown over the years. At first glance, the gap between the two groups is clearly visible and easy to comment on. This is how the public is likely to read the information and form their own analysis. The following plot fixes the main issues in the original.

Data Reference