INTRODUCTION

The Grammar of Graphics is implemented in R by the ggplot2 package (Hadley Wickham). We construct plots by layering grammatical elements on top of each other, and we use aesthetic mappings to link variables in the data to visual properties such as position, color and fill.
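To make the layering idea concrete, here is a minimal sketch that builds one plot layer by layer. The `survey` data frame below is a synthetic stand-in (an assumption, since the real data is not shown); only the column names salary, age, elevel and brand are taken from the exercises, and the brand levels "A" and "B" are hypothetical.

```r
library(ggplot2)

# Hypothetical stand-in for the `survey` data frame used in the exercises
set.seed(42)
survey <- data.frame(
  salary = runif(200, min = 20000, max = 150000),
  age    = sample(20:80, 200, replace = TRUE),
  elevel = factor(sample(0:4, 200, replace = TRUE)),
  brand  = factor(sample(c("A", "B"), 200, replace = TRUE))
)

# Step 1: data and aesthetic mappings (nothing is drawn yet)
p <- ggplot(survey, aes(x = age, y = salary, colour = brand))

# Step 2: a geometry that draws the mapped data
p <- p + geom_point(alpha = 0.6)

# Step 3: labels
p <- p + labs(title = "Salary by age and brand", x = "Age", y = "Salary")

p
```

Because every element is added with `+`, any step can be swapped or removed without touching the others.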

BASIC

1. Data, Aesthetics & Geometries

1.1. Display a histogram of the salary

# Histogram of the salary
ggplot(survey, aes(x=salary)) + geom_histogram(color="darkblue", fill="lightblue", bins=20)


1.2. Now display a bar plot of the brand

# Barplot of the brand
ggplot(survey, aes(x=brand, fill=brand)) + geom_bar()


1.3. Display a combination of brand with salary. Try with different combinations

# Histogram of salary and brand
ggplot(survey, aes(x=salary, fill=brand)) + geom_histogram(color="black", bins=20)

ADVANCED

2. Facets

2.1. Display one salary histogram per level of education.
- You will need facet_grid() or facet_wrap()

# Histogram of the salary
ggplot(survey, aes(x=salary)) + geom_histogram(color="darkblue", fill="lightblue", bins=20 ) + facet_wrap(~elevel)


2.2. Something happens with the x-axis labels… Any idea how to fix it?

# Histogram of the salary
ggplot(survey, aes(x=salary)) + geom_histogram(color="darkblue", fill="lightblue", bins=20 ) + facet_wrap(~elevel,scales = "free_x" )
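The solutions above use facet_wrap(); the hint also mentions facet_grid(). As a sketch of the alternative, the example below lays the panels out in a grid with brand in rows and education level in columns. The data frame is a synthetic stand-in with hypothetical values, not the real survey.

```r
library(ggplot2)

# Hypothetical stand-in for the `survey` data frame
set.seed(1)
survey <- data.frame(
  salary = runif(300, min = 20000, max = 150000),
  elevel = factor(sample(0:4, 300, replace = TRUE)),
  brand  = factor(sample(c("A", "B"), 300, replace = TRUE))
)

# Rows = brand, columns = education level.
# With scales = "free_x", each column of panels gets its own x scale.
p_grid <- ggplot(survey, aes(x = salary)) +
  geom_histogram(color = "darkblue", fill = "lightblue", bins = 20) +
  facet_grid(brand ~ elevel, scales = "free_x")

p_grid
```

facet_wrap() is usually the better fit for a single faceting variable, while facet_grid() suits the crossing of two variables.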

3. Statistics

3.1. Display the relationship between age, salary & brand.
- Try to see some patterns using geom_smooth()

ggplot(survey, aes(x=age, y=salary, col=brand)) + 
  geom_point() +
  geom_smooth() 
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
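The message above shows that geom_smooth() chose a smoothing method on its own. If a specific model is wanted, the method can be set explicitly; the sketch below assumes a straight-line fit per brand is of interest and uses synthetic stand-in data with hypothetical brand levels.

```r
library(ggplot2)

# Hypothetical stand-in for the `survey` data frame
set.seed(2)
survey <- data.frame(
  age    = sample(20:80, 300, replace = TRUE),
  salary = runif(300, min = 20000, max = 150000),
  brand  = factor(sample(c("A", "B"), 300, replace = TRUE))
)

# One linear fit per brand, with the default confidence band
p_lm <- ggplot(survey, aes(x = age, y = salary, colour = brand)) +
  geom_point(alpha = 0.4) +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE)

p_lm
```

Stating both method and formula also silences the informational message.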


3.2. Display a boxplot of salary by brand
- Add the median

# Boxplot of salary and brand
ggplot(survey, aes(x=brand, y=salary, fill=brand)) + 
  geom_boxplot()+
  stat_summary(fun=median, colour="black", geom="text", 
               vjust=-0.7, aes(label=round(after_stat(y), digits=1)))
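An alternative to stat_summary() is to compute the medians first and pass them to geom_text() as a separate data set, which also makes the values easy to inspect. This is only a sketch on synthetic stand-in data; the `medians` helper table is a name introduced here for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for the `survey` data frame
set.seed(3)
survey <- data.frame(
  salary = runif(200, min = 20000, max = 150000),
  brand  = factor(sample(c("A", "B"), 200, replace = TRUE))
)

# Median salary per brand (one row per brand)
medians <- aggregate(salary ~ brand, data = survey, FUN = median)

# The text layer reuses the x and y mappings but reads from `medians`
p_box <- ggplot(survey, aes(x = brand, y = salary, fill = brand)) +
  geom_boxplot() +
  geom_text(data = medians, aes(label = round(salary, 1)), vjust = -0.7)

p_box
```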

4. Coordinates

4.1. Display a pie chart of the brand (you will need “coord_polar()”)

# Pie chart of the brand
ggplot(survey, aes(x="",fill=brand)) + geom_bar() + coord_polar(theta="y")
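A pie chart is often easier to read with the share printed on each slice. The sketch below precomputes the shares and then applies the same coord_polar() idea; the data is a synthetic stand-in, and `brand_counts` and `share` are names introduced here for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for the `survey` data frame
set.seed(4)
survey <- data.frame(
  brand = factor(sample(c("A", "B"), 200, replace = TRUE))
)

# Count the rows per brand and convert the counts to shares
brand_counts <- as.data.frame(table(brand = survey$brand))
brand_counts$share <- brand_counts$Freq / sum(brand_counts$Freq)

# A single stacked bar bent into a circle, labelled at the middle of each slice
p_pie <- ggplot(brand_counts, aes(x = "", y = share, fill = brand)) +
  geom_col(width = 1) +
  coord_polar(theta = "y") +
  geom_text(aes(label = scales::percent(share, accuracy = 1)),
            position = position_stack(vjust = 0.5))

p_pie
```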

5. Themes

5.1. Display the histogram of salary and brand.
- Add a title
- Change the x scale (a break every 15,000) and rotate the x labels
- Change the y scale to percentages

# Histogram of salary and brand
ggplot(survey, aes(x=salary, fill=brand)) + 
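  # Map y to each bin's share of all observations so the percent labels are meaningful
  geom_histogram(aes(y=after_stat(count/sum(count))), color="black", bins=10) +
  labs(title="Relationship between brand and salary", y="percentage")+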
  scale_x_continuous(breaks=seq(20000, 150000, 15000))+
  theme(axis.text.x = element_text(angle=60, hjust=1)) +
  scale_y_continuous(labels = scales::percent)


5.2. You can also change the background
- Change the theme (package ggthemes)

# Histogram of salary and brand
ggplot(survey, aes(x=salary, fill=brand)) + 
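  # Map y to each bin's share of all observations so the percent labels are meaningful
  geom_histogram(aes(y=after_stat(count/sum(count))), color="black", bins=10) +
  labs(title="Relationship between brand and salary", y="percentage")+
  scale_x_continuous(breaks=seq(20000, 150000, 15000))+
  scale_y_continuous(labels = scales::percent)+ 
  # Apply the complete theme first; a later theme() call then keeps the rotated labels
  theme_economist() +
  theme(axis.text.x = element_text(angle=60, hjust=1))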

NEXT STEPS

Plotly

Age, salary and brand

# Set the trace type and mode explicitly instead of letting plotly guess them
P <- plot_ly(data = survey, x = ~age, y = ~salary, color = ~brand,
             type = "scatter", mode = "markers")
P
## Warning in RColorBrewer::brewer.pal(N, "Set2"): minimal value for n is 3, returning requested palette with 3 different levels


Highcharter

Brand pie chart

# Count the rows per brand
newdf <- survey %>%
  count(brand)

# hc_add_series() replaces the deprecated hc_add_series_labels_values()
highchart() %>% 
  hc_add_series(data = newdf, type = "pie",
                mapping = hcaes(name = brand, y = n))