---
title: "STM1001: Computer Lab 2B Solutions"
output:
  bookdown::html_document2: 
    toc: true
    toc_float: true
    code_download: true
    theme: readable
    code_folding: show
bibliography: STM1001_DS_CL_references.bib 
link-citations: yes
---

<style>
#TOC {
  background: url("https://www.latrobe.edu.au/_media/la-trobe-api/v5/img/logo.svg");
  background-size: contain;
  padding-top: 80px !important;
  background-repeat: no-repeat;
}
</style>

### Data Science Stream {-}

### Topic 2B: Data Visualisation I {-}

<br>

Example R code solutions for the [Data Science Computer Lab 2](https://rpubs.com/LTU_STM1001/DSMCL2), which uses data from @penguins and the `plotly` R package [@plotly], are presented below.

<br>

# Palmer Penguins Data Set {#penguins}

##

```{r, include = F}
# Install packages if missing
install.packages(setdiff("palmerpenguins", rownames(installed.packages())), repos = "http://cran.us.r-project.org")
install.packages(setdiff("plotly", rownames(installed.packages())), repos = "http://cran.us.r-project.org")
```

```{r class.source = "fold-show", eval = F, echo = T}
# Install package
install.packages("palmerpenguins")
```

##

```{r class.source = "fold-show", eval = T, echo = T}
# Load the `palmerpenguins` package into your current R working environment
library(palmerpenguins)
# Summarise the data in the `palmerpenguins` package
summary(penguins)
```

# Creating Interactive Histograms in RStudio

```{r class.source = "fold-show", eval = T, echo = T}
hist(penguins$body_mass_g, breaks = 19)
```

##

```{r class.source = "fold-show", eval = F, echo = F, include = F}
install.packages("plotly", repos = "http://cran.us.r-project.org")
```

```{r class.source = "fold-show", eval = F, echo = T}
install.packages("plotly")
```

```{r class.source = "fold-show", eval = T, echo = T, message = F, warning = F}
library(plotly)
```

## 

```{r class.source = "fold-show", eval = T, echo = T, warning = F, message = F}
penguin_hist_base <- plot_ly(data = penguins, 
                             x = ~body_mass_g, 
                             type = "histogram")

penguin_hist_base <- penguin_hist_base %>% layout(yaxis = list(title = 'count'))
```

A brief explanation of the code is provided in the `Code` chunk below.

```{r class.source = "fold-hide", eval = F, echo = T}
# Here, we are creating a plotly object called "penguin_hist_base"
penguin_hist_base <- plot_ly(data = penguins, # We are using the penguins data
                             x = ~body_mass_g, # and plotting the body_mass_g variable
                             type = "histogram") # as a histogram

# The code below is used to modify the layout of the histogram
# to include a label for the y-axis
penguin_hist_base <- penguin_hist_base %>% layout(yaxis = list(title = 'count'))
```
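
The number of bins in the histogram above is chosen automatically by `plotly`. As a minimal sketch, not part of the lab itself, the bin count can be influenced with the `nbinsx` trace attribute, which sets the maximum number of bins. The object name `penguin_hist_bins` and the value 20 are assumptions made for this illustration only.

```r
library(palmerpenguins)
library(plotly)

# Hypothetical variation on penguin_hist_base: request at most 20 bins.
# plotly treats nbinsx as a maximum, so it may settle on fewer bins.
penguin_hist_bins <- plot_ly(data = penguins,
                             x = ~body_mass_g,
                             type = "histogram",
                             nbinsx = 20)

# Label the y-axis, as in the solution above
penguin_hist_bins <- penguin_hist_bins %>% layout(yaxis = list(title = 'count'))
penguin_hist_bins
```

Hovering over a bar in the resulting plot shows the range of body mass values that the bar covers, which makes it easy to see the effect of changing `nbinsx`.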

## {#basehist}

```{r class.source = "fold-show", eval = T, echo = T, warning = F, message = F, fig.align = "center"}
penguin_hist_base
```

##

No answer required.

##

```{r class.source = "fold-show", eval = T, echo = T, warning = F, message = F}
penguin_hist <- plot_ly(data = penguins, 
                        x = ~body_mass_g, 
                        color = ~island, 
                        type = "histogram", alpha = 0.6)

penguin_hist <- penguin_hist %>% layout(yaxis = list(title = 'count'), 
                                        barmode ="overlay")
```

A brief explanation of the code is provided in the `Code` chunk below.

```{r class.source = "fold-hide", eval = F, echo = T}
# Here, we are creating a plotly object called "penguin_hist"
penguin_hist <- plot_ly(data = penguins, # We are using the penguins data
                        x = ~body_mass_g, # and plotting the body_mass_g variable
                        color = ~island, type = "histogram", alpha = 0.6)
# We are producing a histogram for this data, with bars coloured differently,
# depending on the island on which the penguin is located

# The code below is used to modify the layout of the histogram
# This includes adding a label to the y-axis
# and setting the histograms to be layered over each other
# (hence the alpha = 0.6 above to change the opacity)
penguin_hist <- penguin_hist %>% layout(yaxis = list(title = 'count'), 
                                        barmode ="overlay")
```
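
For comparison, the sketch below swaps `barmode = "overlay"` for `barmode = "stack"`, so that the bars for each island are stacked on top of each other rather than layered. This is an optional variation, not something the lab asks for, and the object name `penguin_hist_stack` is made up for this illustration.

```r
library(palmerpenguins)
library(plotly)

# Same data and colour grouping as penguin_hist.
# Stacked bars do not overlap, so alpha is left at its default value.
penguin_hist_stack <- plot_ly(data = penguins,
                              x = ~body_mass_g,
                              color = ~island,
                              type = "histogram")

# Stack the island histograms instead of overlaying them
penguin_hist_stack <- penguin_hist_stack %>% layout(yaxis = list(title = 'count'),
                                                    barmode = "stack")
penguin_hist_stack
```

A stacked histogram makes the overall distribution easier to read, while the overlaid version makes it easier to compare the shapes of the individual island distributions.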

## {#islandshist}

```{r class.source = "fold-show", eval = T, echo = T, warning = F, message = F, fig.align = "center"}
penguin_hist
```

##

No answer required.

# Creating Interactive Scatter Plots in RStudio {#scatter} 

## {#simplescatter}

No answer required.

## {#scatterbase}

```{r class.source = "fold-show", eval = F, echo = T}
penguins_scatter <- plot_ly(data = penguins, 
                            x = ~body_mass_g, y = ~flipper_length_mm)
penguins_scatter
```

```{r class.source = "fold-show", eval = T, echo = F, warning = F}
penguins_scatter <- plot_ly(data = penguins, x = ~body_mass_g, y = ~flipper_length_mm, type = "scatter", mode = "markers")
suppressMessages(penguins_scatter)
```

## {#scattercolour}

```{r class.source = "fold-show", eval = F, echo = T}
penguins_scatter2 <- plot_ly(data = penguins, x = ~body_mass_g, y = ~flipper_length_mm, 
                             color = ~sex)
penguins_scatter2
```

```{r class.source = "fold-show", eval = T, echo = F, warning = F, fig.align = "center"}
penguins_scatter2 <- plot_ly(data = penguins, x = ~body_mass_g, y = ~flipper_length_mm, color = ~sex, 
                             type = "scatter", mode = "markers")
suppressMessages(penguins_scatter2)
```

## {#scattercolours}

An example result for an arbitrary selection of colours is shown below.

```{r class.source = "fold-show", eval = F, echo = T}
penguins_scatter_colours <- plot_ly(data = penguins, 
                                    x = ~body_mass_g, y = ~flipper_length_mm, 
                                    color = ~sex, colors = c("cyan", "orange"))
penguins_scatter_colours
```

```{r class.source = "fold-show", eval = T, echo = F, warning = F, fig.align = "center"}
penguins_scatter_colours <- plot_ly(data = penguins, 
                                    x = ~body_mass_g, y = ~flipper_length_mm, 
                                    color = ~sex, colors = c("cyan", "orange"),
                                    type = "scatter", mode = "markers")
penguins_scatter_colours
```

##

For brevity only the result for the `Set2` `colors` specification is shown below.

```{r class.source = "fold-show", eval = F, echo = T}
penguins_scatter_colours <- plot_ly(data = penguins, 
                                    x = ~body_mass_g, y = ~flipper_length_mm, 
                                    color = ~sex, colors = "Set2")
penguins_scatter_colours
```

```{r class.source = "fold-show", eval = T, echo = F, warning = F, fig.align = "center"}
penguins_scatter_colours <- plot_ly(data = penguins, 
                                    x = ~body_mass_g, y = ~flipper_length_mm, 
                                    color = ~sex, colors = "Set2",
                                    type = "scatter", mode = "markers")
penguins_scatter_colours
```
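
Palette names such as `Set1`, `Set2` and `Set3` come from ColorBrewer. As a small sketch, assuming the `RColorBrewer` package is installed (it is normally installed alongside `plotly`), the available qualitative palettes can be listed as follows. The object name `qualitative_palettes` is made up for this illustration.

```r
library(RColorBrewer)

# brewer.pal.info is a data frame with one row per ColorBrewer palette.
# Qualitative ("qual") palettes suit categorical variables such as sex.
qualitative_palettes <- subset(brewer.pal.info, category == "qual")
qualitative_palettes
```

The row names of the resulting table are the palette names that can be passed to the `colors` argument.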

##

```{r class.source = "fold-show", eval = F, echo = T}
penguins_scatter2 <- plot_ly(data = penguins, 
                             x = ~body_mass_g, y = ~flipper_length_mm, 
                             color = ~sex, colors = "Set1",
                             type = "scatter", mode = "markers")
penguins_scatter2
```

```{r class.source = "fold-show", eval = T, echo = T}
penguins_scatter2 <- plot_ly(data = penguins, x = ~body_mass_g, y = ~flipper_length_mm, 
                             color = ~sex, colors = "Set1",
                             type = "scatter", mode = "lines")
penguins_scatter2
```

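Note that with `mode = "lines"`, `plotly` draws a line between the individual data points, in the order in which they appear in the data. This is clearly not what we want for a scatter plot, so `mode = "markers"` is the appropriate choice here.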

##

```{r class.source = "fold-show", eval = T, echo = T, warning = F}
penguins_scatter2 <- plot_ly(data = penguins, x = ~body_mass_g, y = ~flipper_length_mm, 
                             color = ~sex, colors = "Set1", text = ~species,
                             type = "scatter", mode = "markers")
penguins_scatter2
```
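
The `text` argument is not limited to a single variable. As an optional sketch, not required by the lab, the hover label below combines `species` and `island` using `paste0()`. The object name `penguins_scatter_hover` and the label wording are assumptions made for this illustration.

```r
library(palmerpenguins)
library(plotly)

# Hypothetical variation: build a two-line hover label from species and island.
# "<br>" inserts a line break inside the hover text.
penguins_scatter_hover <- plot_ly(data = penguins, x = ~body_mass_g, y = ~flipper_length_mm,
                                  color = ~sex, colors = "Set1",
                                  text = ~paste0("Species: ", species, "<br>Island: ", island),
                                  type = "scatter", mode = "markers")
penguins_scatter_hover
```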


## {#scattersymbol}

```{r class.source = "fold-show", eval = F, echo = T}
penguins_scatter3 <- plot_ly(data = penguins, x = ~body_mass_g, y = ~flipper_length_mm, 
                             color = ~sex, colors = "Set1", symbol = ~species, 
                             type = "scatter", mode = "markers")
penguins_scatter3
```

```{r class.source = "fold-show", eval = T, echo = F, warning = F, message = F, fig.align = "center"}
penguins_scatter3 <- plot_ly(data = remove_missing(penguins), x = ~body_mass_g, y = ~flipper_length_mm, 
                             color = ~sex, colors = "Set1", symbol = ~species, 
                             type = "scatter", mode = "markers")

penguins_scatter3
```

##

Here we have used the symbols `cross`, `diamond` and `star`.

```{r class.source = "fold-show", eval = F, echo = T}
penguins_scatter3 <- plot_ly(data = penguins, x = ~body_mass_g, y = ~flipper_length_mm, 
                             color = ~sex, colors = "Set1", symbol = ~species,
                             symbols = c("cross", "diamond", "star"),
                             type = "scatter", mode = "markers")
penguins_scatter3
```

```{r class.source = "fold-show", eval = T, echo = F, warning = F, message = F, fig.align = "center"}
penguins_scatter3 <- plot_ly(data = remove_missing(penguins), x = ~body_mass_g, y = ~flipper_length_mm, 
                             color = ~sex, colors = "Set1", symbol = ~species, 
                             symbols = c("cross", "diamond", "star"),
                             type = "scatter", mode = "markers")

penguins_scatter3
```

##

```{r class.source = "fold-show", eval = F, echo = T}
penguins_scatter3 <- plot_ly(data = penguins, x = ~body_mass_g, y = ~flipper_length_mm, 
                             color = ~sex, colors = "Set1", symbol = ~species,
                             symbols = c("cross", "diamond", "star"),
                             type = "scatter", mode = "markers",
                             marker = list(size = 8))
penguins_scatter3
```

```{r class.source = "fold-show", eval = T, echo = F, warning = F, message = F, fig.align = "center"}
penguins_scatter3 <- plot_ly(data = remove_missing(penguins), x = ~body_mass_g, y = ~flipper_length_mm, 
                             color = ~sex, colors = "Set1", symbol = ~species, 
                             symbols = c("cross", "diamond", "star"),
                             type = "scatter", mode = "markers",
                             marker = list(size = 8))

penguins_scatter3
```

# Creating your own `plotly` Scatter Plot {#scatterpersonal}

##

```{r class.source = "fold-show", eval = T, echo = T, message = F, warning = F, fig.align = "center"}
penguins_scatter_new <- plot_ly(data = penguins, 
                                x = ~body_mass_g, y = ~bill_length_mm,
                                type = "scatter", mode = "markers")
penguins_scatter_new
```

##

```{r class.source = "fold-show", eval = T, echo = T, message = F, warning = F, fig.align = "center"}
penguins_scatter_new2 <- plot_ly(data = penguins, 
                                 x = ~body_mass_g, y = ~bill_length_mm,
                                 color = ~island,
                                 type = "scatter", mode = "markers")
penguins_scatter_new2
```

##

```{r class.source = "fold-show", eval = T, echo = T, message = F, warning = F, fig.align = "center"}
penguins_scatter_new3 <- plot_ly(data = penguins, 
                                 x = ~body_mass_g, y = ~bill_length_mm,
                                 color = ~island, symbol = ~species,
                                 type = "scatter", mode = "markers")
penguins_scatter_new3
```

##

```{r class.source = "fold-show", eval = T, echo = T, message = F, warning = F, fig.align = "center"}
penguins_scatter_new4 <- plot_ly(data = penguins, 
                                 x = ~body_mass_g, y = ~bill_length_mm,
                                 color = ~island, symbol = ~species, 
                                 symbols = c("cross", "diamond", "star"),
                                 type = "scatter", mode = "markers",
                                 marker = list(size=8))
penguins_scatter_new4
```

##

It does seem that penguins living on different islands have noticeably different `body_mass_g` and `bill_length_mm` measurements. However, much of this difference reflects species rather than island: in this data set, Gentoo penguins were recorded only on Biscoe island and Chinstrap penguins only on Dream island, whereas Adelie penguins were recorded on all three islands.

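From the plot, it also appears that the Adelie penguins living on Torgersen island are somewhat smaller overall than Adelie penguins living on the other islands, although a visual impression like this should be checked against numerical summaries before drawing a firm conclusion.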
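
Visual impressions from a scatter plot can be checked numerically. As a minimal sketch, using only base R and the `penguins` data, the code below computes the mean body mass and mean bill length for each combination of species and island. The object name `island_species_means` is made up for this illustration.

```r
library(palmerpenguins)

# Mean body mass and bill length for each species/island combination.
# With the formula interface, aggregate() drops rows containing missing
# values in the variables used (the default na.action is na.omit).
island_species_means <- aggregate(cbind(body_mass_g, bill_length_mm) ~ species + island,
                                  data = penguins, FUN = mean)
island_species_means
```

Comparing the three Adelie rows of this table shows how large the differences between islands really are within a single species.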

<br>

#### That's everything covered. #### {-}

<br>

# References {- #Ref}
<div id="refs"></div>

<br>

<font color = "grey">
These notes have been prepared by Rupert Kuveke. The copyright for the material in these notes resides with the author named above, with the Department of Mathematical and Physical Sciences and with La Trobe University. Copyright in this work is vested in La Trobe University including all La Trobe University branding and naming. Unless otherwise stated, material within this work is licensed under a Creative Commons Attribution-Non Commercial-Non Derivatives License 
<a href = "https://creativecommons.org/licenses/by-nc-nd/4.0/CC" target="_blank"> BY-NC-ND. </a>
</font>