Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: https://howmuch.net/articles/wolds-highest-paid-athletes-in-top-sports


Objective

In recent years, sports personalities have gained huge popularity, not only in terms of their brand value but also in terms of the money they are paid. The top athletes who receive the largest pay cheques mostly play sports such as football, basketball and tennis. The goal of the original data visualisation is to illustrate how athletes from various sports fared in terms of the money they earned in 2017.

The original visualisation is aimed at sports analysts at news channels and newspapers, and shows them how much these athletes earn from various endorsements. Sponsors of the different sports leagues also benefit from the stardom of the players, so they can be viewed as a second target audience.

The visualisation chosen had the following three main issues:

  • The first issue is visual bombardment. The graph contains a lot of unnecessary data and visual elements, which makes it difficult to read, and the many graphic cues draw the audience's attention away from the data. The players' faces and personal details are irrelevant to the comparison, and the grid lines make it harder to read the players' earnings.

  • The second issue is a poor colour scheme. The visualisation looks ambiguous and lacks clarity because colour is not varied effectively: identical colours are used for different elements, which makes them hard for users to tell apart. For example, several of the details beneath the graph could be difficult for the audience to distinguish by colour.

  • The third issue is an unsuitable choice of graph. Athletes who earn more are given a large circle and those who earn less a small circle, but circle sizes are hard to compare, so we cannot tell from the graph how much more one athlete earns than another. Hence the visualisation is not self-explanatory.
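The third issue can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes the original bubbles encode pay by circle area (the source does not state this), and it uses the highest and lowest pay values from the top-20 dataset shown in the Code section. The variable names are hypothetical.

```r
# Pay in USD millions for the first and last of the top 20 athletes
pay_top <- 93     # Cristiano Ronaldo
pay_bottom <- 36  # Fernando Alonso

# Bar chart: bar length is proportional to the value itself
length_ratio <- pay_top / pay_bottom

# Bubble chart (ASSUMPTION: area encodes the value):
# area is proportional to radius^2, so radius is proportional to sqrt(value)
radius_ratio <- sqrt(pay_top / pay_bottom)

round(c(length_ratio = length_ratio, radius_ratio = radius_ratio), 2)
# length_ratio = 2.58, radius_ratio = 1.61
```

Under that assumption, a 2.58-fold difference in pay appears as a circle only about 1.6 times as wide, whereas a bar shows the full 2.58-fold difference in length.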

Code

The following code was used to fix the issues identified in the original.

library(ggplot2)
library(dplyr)
library(tidyr)
### Reading dataset
highest_paid_athlete<-read.csv("athletes1.csv")

# use trimws() to remove leading and trailing whitespace (spaces or tabs)
highest_paid_athlete$Sport <- trimws(highest_paid_athlete$Sport)
highest_paid_athlete$Name <- trimws(highest_paid_athlete$Name)

### selecting the top 20 highest-paid athletes (head() keeps the first 20 rows, so the file is assumed to be sorted by rank)
hpa_1 <- head(highest_paid_athlete, 20)
str(hpa_1)
## 'data.frame':    20 obs. of  6 variables:
##  $ Rank                   : num  1 2 3 4 5 6 6 8 9 10 ...
##  $ Name                   : chr  "CRISTIANO RONALDO" "LEBRON JAMES" "LIONEL MESSI" "ROGER FEDERER" ...
##  $ X.Pay....mln.          : num  93 86 80 64 61 50 50 47 47 46 ...
##  $ Salary.Winnings....mln.: num  58 31 53 6 27 16 47 12 27 38 ...
##  $ Endorsements....mln.   : num  35 55 27 58 34 34 3 35 20 8 ...
##  $ Sport                  : chr  "SOCCER" "BASKETBALL" "SOCCER" "TENNIS" ...
# changing column names for better understanding
names(hpa_1)[names(hpa_1) == "Rank"] <- "Rank_of_Players"
names(hpa_1)[names(hpa_1) == "Name"] <- "Name_of_Players"
names(hpa_1)[names(hpa_1) == "X.Pay....mln."] <- "Payment"
# "Salary/winnings" is a non-syntactic name; refer to it with backticks, e.g. hpa_1$`Salary/winnings`
names(hpa_1)[names(hpa_1) == "Salary.Winnings....mln."] <- "Salary/winnings"
names(hpa_1)[names(hpa_1) == "Endorsements....mln."] <- "Endorsements"
# "Sport" already has a clear name, so it is left unchanged

# create a new column: percentage of total pay that comes from endorsements, rounded to a whole number
hpa_1_new <- hpa_1 %>%
  mutate(Percent_Earnings_from_endorsements = round(Endorsements / Payment * 100, 0))
hpa_1_new
##    Rank_of_Players   Name_of_Players Payment Salary/winnings Endorsements
## 1                1 CRISTIANO RONALDO      93              58           35
## 2                2      LEBRON JAMES      86              31           55
## 3                3      LIONEL MESSI      80              53           27
## 4                4     ROGER FEDERER      64               6           58
## 5                5      KEVIN DURANT      61              27           34
## 6                6      RORY MCILROY      50              16           34
## 7                6       ANDREW LUCK      50              47            3
## 8                8     STEPHEN CURRY      47              12           35
## 9                9      JAMES HARDEN      47              27           20
## 10              10    LEWIS HAMILTON      46              38            8
## 11              11        DREW BREES      45              31           14
## 12              12    PHIL MICKELSON      44               4           40
## 13              13 RUSSELL WESTBROOK      39              27           12
## 14              14  SEBASTIAN VETTEL      39              38            1
## 15              15    DAMIAN LILLARD      38              24           14
## 16              16    NOVAK DJOKOVIC      38              10           28
## 17              17       TIGER WOODS      37               0           37
## 18              18            NEYMAR      37              15           22
## 19              19       DWYANE WADE      36              23           13
## 20              20   FERNANDO ALONSO      36              34            2
##          Sport Percent_Earnings_from_endorsements
## 1       SOCCER                                 38
## 2   BASKETBALL                                 64
## 3       SOCCER                                 34
## 4       TENNIS                                 91
## 5   BASKETBALL                                 56
## 6         GOLF                                 68
## 7     FOOTBALL                                  6
## 8   BASKETBALL                                 74
## 9   BASKETBALL                                 43
## 10 AUTO RACING                                 17
## 11    FOOTBALL                                 31
## 12        GOLF                                 91
## 13  BASKETBALL                                 31
## 14 AUTO RACING                                  3
## 15  BASKETBALL                                 37
## 16      TENNIS                                 74
## 17        GOLF                                100
## 18      SOCCER                                 59
## 19  BASKETBALL                                 36
## 20 AUTO RACING                                  6
# plot: horizontal bars ordered by total pay, labelled with pay and endorsement share
plot_bar <- ggplot(hpa_1_new, aes(x = reorder(Name_of_Players, Payment), y = Payment, fill = Sport)) +
  geom_col() +  # equivalent to geom_bar(stat = "identity")
  coord_flip() +
  geom_text(aes(label = paste0(Payment, "M {", Percent_Earnings_from_endorsements, "%}")),
            hjust = 1, size = 2.5, fontface = "bold") +
  labs(title = "2017, Highest-paid athletes  ",
       subtitle = "with regard to percentage earnings from endorsements",
       x = "Players",
       y = "Athletes Payment-USD Millions (Percentage earnings from endorsements)",
       fill = "Sports") +
  scale_fill_manual(values = rev(c("#999999", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2"))) +
  theme_minimal() +
  theme(panel.grid.major = element_blank(),
        plot.title = element_text(hjust = 0.4, size = 15, face = "bold"),
        plot.subtitle = element_text(hjust = 0.4, size = 15, face = "bold"),
        axis.title.x = element_text(vjust = -1, size = 11, face = "bold"),
        axis.title.y = element_text(size = 11, face = "bold"),
        legend.title = element_text(face = "bold"))
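The plot passes an unnamed vector of six colours to scale_fill_manual(), so the match between colour and sport depends on the order of the scale (usually alphabetical for a character column). As a hedged sketch, not part of the original code, the palette could be checked and named explicitly in base R. The names sports, fill_palette and sport_colours are hypothetical; the sports are those listed in the Sport column above.

```r
# The six sports that appear in the top-20 table
sports <- c("SOCCER", "BASKETBALL", "TENNIS", "GOLF", "FOOTBALL", "AUTO RACING")

# Same palette, in the same (reversed) order, as in the plot code
fill_palette <- rev(c("#999999", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2"))

# A manual scale needs at least one colour per category
stopifnot(length(fill_palette) >= length(unique(sports)))

# Name the colours by sport in alphabetical order, which mirrors how the
# unnamed vector is expected to be assigned by default
sport_colours <- setNames(fill_palette, sort(unique(sports)))
sport_colours
```

Passing sport_colours as the values argument of scale_fill_manual() should keep each sport on the same colour even if the data are later filtered or reordered.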

Data Reference

  • The dataset of the world’s highest-paid athletes in top sports (2017) is taken from Link

Reconstruction

The following plot corrects the key problems with the original by presenting the information as a horizontal bar chart. It shows each player's total pay together with the percentage of that pay that comes from endorsements. We can see that some players with lower total pay still earn most of it from endorsements; Tiger Woods, for example, earned all of his USD 37 million that way. This suggests that well-recognised players with big personalities can attract endorsements even when they are not in their best form.
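The observation about lower-paid players can be checked against the data. The following base R sketch uses the pay and endorsement values copied from the table in the Code section. The two thresholds (pay below the top-20 median, endorsement share above 50%) are illustrative assumptions, not part of the original analysis, and the names athletes, share and endorsement_driven are hypothetical.

```r
# Values copied from the top-20 table (USD millions)
athletes <- data.frame(
  Name_of_Players = c("CRISTIANO RONALDO", "LEBRON JAMES", "LIONEL MESSI",
                      "ROGER FEDERER", "KEVIN DURANT", "RORY MCILROY",
                      "ANDREW LUCK", "STEPHEN CURRY", "JAMES HARDEN",
                      "LEWIS HAMILTON", "DREW BREES", "PHIL MICKELSON",
                      "RUSSELL WESTBROOK", "SEBASTIAN VETTEL", "DAMIAN LILLARD",
                      "NOVAK DJOKOVIC", "TIGER WOODS", "NEYMAR",
                      "DWYANE WADE", "FERNANDO ALONSO"),
  Payment = c(93, 86, 80, 64, 61, 50, 50, 47, 47, 46,
              45, 44, 39, 39, 38, 38, 37, 37, 36, 36),
  Endorsements = c(35, 55, 27, 58, 34, 34, 3, 35, 20, 8,
                   14, 40, 12, 1, 14, 28, 37, 22, 13, 2),
  stringsAsFactors = FALSE
)

# Percentage of total pay that comes from endorsements
athletes$share <- round(athletes$Endorsements / athletes$Payment * 100)

# ASSUMED thresholds: pay below the median of the top 20, share above 50%
low_pay <- athletes$Payment < median(athletes$Payment)
high_share <- athletes$share > 50

endorsement_driven <- athletes[low_pay & high_share,
                               c("Name_of_Players", "Payment", "share")]
endorsement_driven
```

With these thresholds the result contains Phil Mickelson, Novak Djokovic, Tiger Woods and Neymar, all of whom sit in the lower half of the chart while earning more than half of their pay from endorsements.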