Data Science Stream
Preparation
library(palmerpenguins)
library(plotly)
Creating Interactive Box Plots in RStudio
penguins_box <- plot_ly(data = penguins, y = ~body_mass_g, type = "box")
penguins_box
penguins_box <- plot_ly(data = penguins,
                        y = ~body_mass_g,
                        type = "box",
                        x0 = "body mass (g)")
penguins_box
penguins_box <- plot_ly(data = penguins,
                        y = ~body_mass_g,
                        color = ~sex,
                        type = "box")
penguins_box
We observe that the distributions of body mass values for both male and female penguins are positively skewed. In each box the median is not equidistant from the first and third quartiles, lying closer to the first quartile, so the distributions are clearly not symmetrical.
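The skewness seen in the box plots can also be checked numerically. The following is a minimal sketch, assuming we drop penguins with a missing sex or body mass; the names penguins_complete and quartile_gaps are introduced here for illustration only. A larger gap above the median than below it is consistent with positive skew.

```r
library(palmerpenguins)

# Keep only penguins with a recorded sex and body mass
penguins_complete <- penguins[!is.na(penguins$sex) & !is.na(penguins$body_mass_g), ]

# For each sex, measure how far the median is from the first and third quartiles
quartile_gaps <- sapply(split(penguins_complete$body_mass_g, penguins_complete$sex),
                        function(mass) {
  q <- quantile(mass, probs = c(0.25, 0.5, 0.75))
  c(median_minus_q1 = q[[2]] - q[[1]],
    q3_minus_median = q[[3]] - q[[2]])
})
quartile_gaps
```

Each column of quartile_gaps corresponds to one sex; comparing the two rows within a column shows on which side of the median the box is longer.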
penguins_box <- plot_ly(data = penguins,
                        x = ~species, y = ~body_mass_g,
                        color = ~sex, type = "box")
penguins_box
penguins_box %>% layout(boxmode = "group")
We observe that the male penguins of each species have higher median body mass values than the females. The difference between the male and female distributions is particularly large for Gentoo penguins, whereas the male and female Chinstrap penguins are relatively close in median body mass. Interestingly, the female Gentoo penguins are generally much heavier than both the female and male penguins of the other species.
Another point of interest is that the distributions of body mass, when split across species and sex, no longer appear as skewed. The male Adelie and Chinstrap penguins have slightly skewed body mass distributions, but the other groups appear to have roughly symmetric distributions.
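The medians read off the grouped box plots can be tabulated directly. This is a minimal sketch using base R's aggregate(); the name group_medians is introduced here for illustration, and rows with missing values are dropped by the formula interface by default.

```r
library(palmerpenguins)

# Median body mass (g) for each species and sex combination;
# the formula interface omits rows with missing values by default
group_medians <- aggregate(body_mass_g ~ species + sex,
                           data = penguins, FUN = median)

# Sort from lightest to heaviest group to make comparisons easier
group_medians[order(group_medians$body_mass_g), ]
```

Sorting by median makes it easy to see where the female Gentoo penguins sit relative to the males of the other species.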
Piping
# Add a plot title
penguins_box %>% layout(title = "Box Plots of Penguin Body Mass Data",
                        boxmode = "group")
# Add a legend title as well
penguins_box %>% layout(title = "Box Plots of Penguin Body Mass Data",
                        boxmode = "group",
                        legend = list(title = list(text = "Sex")))
# Label the axes instead of using a plot title
penguins_box %>% layout(xaxis = list(title = "Penguin Species"),
                        yaxis = list(title = "Penguin Body Mass (grams)"),
                        boxmode = "group",
                        legend = list(title = list(text = "Sex")))
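To make the role of the pipe explicit, the sketch below compares a piped call with the equivalent nested call. It assumes the grouped box plot from earlier in this lab; the names piped and nested are introduced here for illustration only.

```r
library(palmerpenguins)
library(plotly)

penguins_box <- plot_ly(data = penguins,
                        x = ~species, y = ~body_mass_g,
                        color = ~sex, type = "box")

# Piped form: penguins_box is passed as the first argument of layout()
piped <- penguins_box %>% layout(boxmode = "group")

# Equivalent nested form without the pipe
nested <- layout(penguins_box, boxmode = "group")

# penguins_box itself is unchanged; assign the result to keep the layout
piped
```

Because layout() returns a new plot object rather than modifying penguins_box, the layout settings are only kept if the result is assigned to a name.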
Creating Interactive Violin Plots in RStudio
# Violin plot of body mass with an embedded box plot
penguins_violin <- plot_ly(data = penguins,
                           y = ~body_mass_g,
                           type = "violin",
                           x0 = "body mass (g)",
                           box = list(visible = TRUE))
penguins_violin
# One violin per species
penguins_violin <- plot_ly(data = penguins,
                           x = ~species,
                           y = ~body_mass_g,
                           type = "violin",
                           box = list(visible = TRUE))
penguins_violin
# Note you could replace split = ~sex with color = ~sex here
penguins_violin <- plot_ly(data = penguins,
                           x = ~species,
                           y = ~body_mass_g,
                           split = ~sex,
                           type = "violin",
                           box = list(visible = TRUE))
penguins_violin
penguins_violin %>% layout(violinmode = "group")
penguins_violin %>% layout(title = "Violin Plots of Penguin Body Mass Data",
                           violinmode = "group")
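As the code comment above notes, color = ~sex can be used in place of split = ~sex. A minimal sketch of that variant follows; the name penguins_violin_color is introduced here for illustration, and the exact colours chosen by plotly may differ from those of the split version.

```r
library(palmerpenguins)
library(plotly)

# Same grouped violin plot, but mapping sex to color instead of split
penguins_violin_color <- plot_ly(data = penguins,
                                 x = ~species,
                                 y = ~body_mass_g,
                                 color = ~sex,
                                 type = "violin",
                                 box = list(visible = TRUE))
penguins_violin_color %>% layout(violinmode = "group")
```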
Extension: Creating your own plotly plots
Example code for the creation of violin plots with all the specified characteristics is shown below:
# Bill length by sex, split by species and coloured by island
violin_fig <- plot_ly(data = penguins,
                      x = ~sex, y = ~bill_length_mm,
                      type = "violin",
                      split = ~species,
                      color = ~island,
                      text = ~island,
                      box = list(visible = TRUE))
violin_fig %>% layout(title = "Violin Plots of Penguin Bill Length Data",
                      yaxis = list(title = "Bill Length (mm)"),
                      violinmode = "group")
That’s everything for this lab.
References
Horst, Allison Marie, Alison Presmanes Hill, and Kristen B. Gorman. 2020. palmerpenguins: Palmer Archipelago (Antarctica) Penguin Data. https://doi.org/10.5281/zenodo.3960218.
Sievert, Carson. 2020. Interactive Web-Based Data Visualization with R, plotly, and shiny. Chapman and Hall/CRC. https://plotly-r.com.
These notes have been prepared by Rupert Kuveke. The copyright for the material in these notes resides with the author named above, with the Department of Mathematical and Physical Sciences and with La Trobe University. Copyright in this work is vested in La Trobe University including all La Trobe University branding and naming. Unless otherwise stated, material within this work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives (CC BY-NC-ND) License.
