Data Science Module

Topic 1B: Data Visualisation I


Welcome to the first Data Science computer lab for STM1001!

Throughout the semester, we will use the R software environment for all our work. R is widely used for statistical computing and data visualisation, and indeed the first four computer labs of the STM1001 Data Science module focus on data visualisation in R.

In this first lab, we will explore some of the more light-hearted options available to R users, and then introduce an excellent package, plotly (Sievert 2020), which allows us to create interactive data visualisations.

By the end of this lab, you should feel comfortable loading and using packages in R, and be able to create a simple interactive histogram using plotly.


1 Making memes in R

Base R contains many functions, and is perfectly sufficient for a number of data analysis methods. However, one of the great benefits of R is that anyone can create packages (bundles of code, data and functions) which can be uploaded to global repositories (such as CRAN or Bioconductor), and made available for anyone around the world to download and use in their version of R.

Often, these packages are extremely helpful. They may contain useful data sets for a specific field of research, address a shortcoming with the base R suite of functions, allow users to perform specialised analyses, and/or offer users some additional functionalities.

On the other hand, sometimes these packages are more light-hearted, such as the meme package (Yu 2021), which allows users to create simple memes within R. Let’s take a look at this package now.

1.1

Because the meme package is not part of base R, we need to download and install it before we can use it. We can use the install.packages() R function to do this, as shown in the code below.

install.packages("meme")

Open up RStudio and run this code now.

1.2

Once the meme package is downloaded and installed, we need to load it in our current R session. Run the following code to load the meme package.

library(meme)

1.3

Great, now we can start to make some simple memes! We really only need two lines of code for this.

Firstly, we need to find an appropriate image. For this example, we will use an image of Hagrid, from the Harry Potter series. We have located this image online, and copied the url. In R, we assign this url to the object hagrid, as shown below:

hagrid <- "https://i.imgflip.com/13wb2t.jpg"

Note that the url needs to be contained within quotation marks.

Make sure to run this code before moving on to the next step.

1.4

Next, we use the meme function to add some words to this image. Try running the code below, and see what happens.

meme(hagrid, "Yer a wizard", "with coding", font = "sans")

Note: Some warnings may appear in your R Console as this code is executing. Don’t worry; these warnings are safe to ignore.

1.5

If you would like to save the meme you have made, it is helpful to assign the output of the meme function to an object. In the code below, we make a new meme, and assign it to the object success. Try running this code now.

success_kid <- "http://i0.kym-cdn.com/entries/icons/mobile/000/000/745/success.jpg"
success <- meme(success_kid, "Using R", "to make memes", font = "sans")
success

Hint: Notice that we need to include the final line of code, calling the object success, in order for the image to be shown.

1.6

Now we can save our meme, using the function meme_save. Take a look at the code below.

meme_save(success, file="c:/STM1001/Data Science/success_kid_R_meme.png") 

Here, we are saving our success meme, to the file location c:\STM1001\Data Science\, with the name success_kid_R_meme.png.

Note that although the file path on our computer includes backslashes (\), in R code these need to be changed to forward slashes (/).

1.7

Now it’s time to try making your own meme.

  1. Find an appropriate image of your choice online (please ensure you pick content suitable for university and work).

  2. Copy the url.

  3. Assign this url to an object in R.

  4. Use the meme function to add words to your image.

  5. Save your meme using the meme_save function.

Hint: If you are not quite sure how to begin, click the Code button to the right below.

# First, we need to find an image, and assign it to an object 
# (here we use the generic object name 'image_name')
# Just replace the ...s with the url of your image
image_name <- "..."
# Next, we need to use the meme function, to add some words (just replace the ...s)
my_meme <- meme(image_name, "...", "...", font = "sans")
# Note that you need to include the `, font = "sans"` part to ensure R knows which font to use.
# Now all that's left is to save your meme - just refer to the code above.

Congratulations! You were probably not expecting to make a meme in your first data science computer lab, and this probably won’t be on the final exam, but hopefully you are starting to realise that R is very versatile.

2 Customising GIFs in R

R is not limited to working with static images - we can also modify and create gifs and animations. In this section, we will use another fun package, the magick package (Ooms 2021), to customise a gif.

Run the following code to download, install and load the magick package in your current R session.

install.packages("magick")
library(magick)

2.1

Just as we used urls to online images of Hagrid and Success Kid, we can also use urls to gifs and animations. For this example, we have used the url to a rotating Earth gif.

We use the image_read function to read this gif into R, and assign it to the object Earth.

Earth <- image_read("https://i.giphy.com/media/mf8UbIDew7e8g/giphy.gif")
Earth

Make sure to run this code before moving on to the next step (don’t worry if it takes a few seconds). The gif should appear in the Viewer section of RStudio.

2.2

Using the magick package, we can easily make some changes to this gif.

Take a look at the code below. You will notice here that:

  • We have reversed the gif, using the rev function
  • We have flipped the gif, using the image_flip function, and
  • We have added text to this gif using the image_annotate function

rev(Earth) %>% 
           image_flip() %>% 
           image_annotate("        Meanwhile, in Australia", size = 40, color = "white")

Try running this code now.

2.3

This is really just scratching the surface of the magick package. However, our intention for this first computer lab is to give you a taste of some of the different possibilities available in R, so for the moment, let’s move on.

3 Drawing a fish in R

Instead of using a pre-existing image or gif, let’s now try to create one from scratch. Specifically, let’s draw a fish. To do this, we can use the appropriately named rfishdraw package (Ding 2021).

3.1

Let’s download and install the rfishdraw package now. In order to use this package, we will also need to download and install some additional packages, upon which the rfishdraw package depends. Such packages are known as dependencies, and it is common for more sophisticated R packages to have multiple dependencies.

Note that these dependencies are packages in their own right.

Run this code in R now.

install.packages("rfishdraw")
install.packages("patchwork")
install.packages("ggplot2")
library("rfishdraw")
library("patchwork")
library("ggplot2")

3.2

If you now run the code below, a detailed drawing of a fish should appear in a new window!

get_polylines(path = "inst/fishdraw.js",
              
              format = "smil",
              
              output = "animated.svg",
              
              draw_type = "random")

windows() # On a Mac, replace windows() with quartz(); on Linux, use x11()

fish_draw()

3.3

Suppose we would like to change the colour of our fish. We can do this, by including the argument col = "..." within the function fish_draw. For example, if we would like our fish to be blue, we can write

fish_draw(col = "blue")

Try changing this colour to a different colour, and then run the code.

4 Palmer Penguins Data Set

Now that we have had a taste of some of the more light-hearted R packages out there, let’s consider a package which contains some useful data.

The palmerpenguins R package (Horst, Hill, and Gorman 2020) contains data, collected over the course of several years, on three species of penguin living on different islands in the Palmer Archipelago, off the coast of Antarctica. For more details, you can refer to Section 2 of the Data Visualisation in R supplement.

4.1

Just like the previous packages, we will need to download and load the palmerpenguins package before we can begin working with this penguin data.

Run the code below to install and load the palmerpenguins package in R.

install.packages("palmerpenguins")
library(palmerpenguins)

4.2

We can use the summary function to obtain a quick overview of the data contained within the penguins data set.

# This code summarises the `penguins` data set from the `palmerpenguins` package.
summary(penguins)

Don’t worry too much about the values shown in the summary table - the main things to note at this stage are the different variables, namely species, island, bill_length_mm, bill_depth_mm, flipper_length_mm, body_mass_g, sex and year.

5 Interactive Histograms

Suppose that we would like to produce histograms showing the distribution of the penguins’ body_mass_g values (their body mass in grams). We could create a simple histogram using the base R hist function via the following code:

hist(penguins$body_mass_g, breaks = 19)

However, this histogram has some shortcomings. Firstly, it is static. We can’t interact with the image, and we can’t manipulate it in real time to display different details.

For example, perhaps we would like to see the distribution of the penguins’ body_mass_g values, but only for the penguins on a specific island. We would need to do some more coding to produce such a histogram in base R. Even then, if we would like to have similar histograms for the other two islands, this would mean further coding.

Alternatively, we could use the plotly package to create an interactive, responsive histogram. Let’s take a look at how to do this now.

5.1

To begin, just as for the previous packages, we will need to download and load the plotly package in R, before we can use any plotly functions.

Run the code below to install and load the plotly package in R.

install.packages("plotly")
library(plotly)

5.2

To create plotly plots, we use the function plot_ly(). We won’t worry too much about the composition of this function just yet - we’ll cover this in more detail next week. For the moment, take a look at the code below, and see if you can get a general idea of what’s going on.

penguin_hist_base <- plot_ly(data = penguins, 
                             x = ~body_mass_g, 
                             type = "histogram")

penguin_hist_base <- penguin_hist_base %>% layout(yaxis = list(title = 'count'))

Before you move on to the next question, run this code in R.

Note: Once you have taken some time to consider the code above, if you would like more details or would like to check the accuracy of your interpretation, click the Code button below for a brief explanation.

# Here, we are creating a plotly object called "penguin_hist_base"
penguin_hist_base <- plot_ly(data = penguins, # We are using the penguins data
                             x = ~body_mass_g, # and plotting the body_mass_g values
                             type = "histogram") # in a histogram format

# The code below is used to modify the layout of the histogram
# to include a label for the y-axis
penguin_hist_base <- penguin_hist_base %>% layout(yaxis = list(title = 'count'))

5.3

To produce this plotly histogram, run the R code below. Your histogram should appear in the Viewer section of RStudio.

penguin_hist_base

5.4

As we noted earlier, plotly graphs, unlike base R graphs, are interactive!

Notice that if you hover over the data in the histogram in 5.3, you can see the specific details (note that the graph in this document is also interactive!). If you left-click and drag your cursor over a section to create a box, you can also zoom in on a particular section of the plot. Just double left-click to zoom back out.

5.5

Perhaps you are not impressed with plotly yet. After all, our histogram doesn’t look that different to the base R version, so what is all the fuss about?

Well, it is very easy to modify our penguin_hist_base plot_ly graph to show extra detail. For example, we can easily produce separate histograms for the penguins on each island. Take a look at the R code below, which builds upon what we used in penguin_hist_base.

penguin_hist <- plot_ly(data = penguins, 
                        x = ~body_mass_g, 
                        color = ~island, 
                        type = "histogram", alpha = 0.6)

penguin_hist <- penguin_hist %>% layout(yaxis = list(title = 'count'), 
                                        barmode ="overlay")

Before you move on to the next question, run this code in R.

Note: Once you have taken some time to consider the code above, if you would like more details or would like to check the accuracy of your interpretation, click the Code button below for a brief explanation.

# Here, we are creating a plotly object called "penguin_hist"
penguin_hist <- plot_ly(data = penguins, # We are using the penguins data
                        x = ~body_mass_g, # and plotting the body_mass_g values
                        color = ~island, type = "histogram", alpha = 0.6)
# We are producing a histogram for this data, with bars coloured differently, 
# depending on the island on which the penguin is located

# The code below is used to modify the layout of the histogram
# This includes adding a label to the y-axis
# and setting the histograms to be layered over each other
# (hence the alpha = 0.6 above to change the opacity)
penguin_hist <- penguin_hist %>% layout(yaxis = list(title = 'count'), 
                                        barmode ="overlay")

5.6

To produce this new plotly histogram, run the R code below. Your histogram should appear in the Viewer section of RStudio.

penguin_hist

This is looking better than our previous histogram! Because we have told our plot_ly function to assign different colours to the different islands, we now have three histograms, rather than one with all the data clumped together.

Even better, these are all presented within the one plot, which also includes a handy legend. Hopefully you are now beginning to appreciate the increased functionality offered by plotly over base R plots.

5.7

Finally, and perhaps most importantly for this specific example, we can dynamically filter out observations to focus on data from a specific island. Simply click on one of the entries in the legend at the top right of our histogram in 5.6 to hide that island’s data (note that the axes dynamically adjust too).

Try focusing just on the Dream island penguins.

Hint: To bring the hidden data back, simply click once more on the relevant entry in the legend.


That’s the end of the first data science computer lab!

Hopefully you have enjoyed this first computer lab, and now have a better idea of just how versatile R can be. Don’t worry if some of the code seems difficult at the moment - this is only the first week after all! Next week, we will continue working with plotly and the palmerpenguins data set, to produce even more detailed interactive plots.

Before you finish up, if you have been writing and running your code in RStudio, make sure to save your script file somewhere safe - it might come in handy later on.


References

Ding, Liuyong. 2021. rfishdraw: Automatically Generated Fish Drawings via JavaScript. https://github.com/Otoliths/rfishdraw.
Horst, Allison Marie, Alison Presmanes Hill, and Kristen B Gorman. 2020. palmerpenguins: Palmer Archipelago (Antarctica) Penguin Data. https://doi.org/10.5281/zenodo.3960218.
Ooms, Jeroen. 2021. magick: Advanced Graphics and Image-Processing in R. https://docs.ropensci.org/magick/.
Sievert, Carson. 2020. Interactive Web-Based Data Visualization with R, plotly, and shiny. Chapman and Hall/CRC. https://plotly-r.com.
Yu, Guangchuang. 2021. meme: Create Memes in R. https://github.com/GuangchuangYu/meme/.


These notes have been prepared by Rupert Kuveke. The copyright for the material in these notes resides with the author named above, with the Department of Mathematics and Statistics and with La Trobe University. Copyright in this work is vested in La Trobe University including all La Trobe University branding and naming. Unless otherwise stated, material within this work is licensed under a Creative Commons Attribution-Non Commercial-Non Derivatives License BY-NC-ND.

---
title: "STM1001: Computer Lab 1B"
output:
  bookdown::html_document2: 
    toc: true
    toc_float: true
    code_download: true
    theme: readable
    code_folding: show
bibliography: STM1001_DS_CL_references.bib 
link-citations: yes
---

<style>
#TOC {
  background: url("https://www.latrobe.edu.au/_media/la-trobe-api/v5/img/logo.svg");
  background-size: contain;
  padding-top: 80px !important;
  background-repeat: no-repeat;
}
</style>

### Data Science Module {-}

### Topic 1B: Data Visualisation I {-}

<br>

Welcome to the first Data Science computer lab for STM1001!

Throughout the semester, we will use the R software environment for all our work.
R is widely used for statistical computing and data visualisation, and indeed the first four computer labs of the STM1001 Data Science module focus on data visualisation in R. 

In this first lab, we will explore some of the more light-hearted options available to R users, and then introduce an excellent package, `plotly` [@plotly], which allows us to create interactive data visualisations.

By the end of this lab, you should feel comfortable loading and using packages in R, and be able to create a simple interactive histogram using `plotly`.

<br>

# Making memes in R

Base R contains many functions, and is perfectly sufficient for a number of data analysis methods.
However, one of the great benefits of R is that anyone can create **packages** (bundles of code, data and functions) which can be uploaded to global repositories (such as CRAN or Bioconductor), and made available for anyone around the world to download and use in their version of R. 

Often, these packages are extremely helpful. They may contain useful data sets for a specific field of research, address a shortcoming with the base R suite of functions, allow users to perform specialised analyses, and/or offer users some additional functionalities.  

On the other hand, sometimes these packages are more light-hearted, such as the `meme` package [@memes], which allows users to create simple memes within R. Let's take a look at this package now.

## 

Because the `meme` package is not part of base R, we need to download and install it before we can use it.
We can use the `install.packages()` R function to do this, as shown in the code below.

```{r class.source = "fold-show", eval = F, echo = T, include = F}
install.packages("meme", repos = "http://cran.us.r-project.org")
```

```{r class.source = "fold-show", eval = F, echo = T}
install.packages("meme")
```

Open up RStudio and run this code now.
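
Running `install.packages()` every time a script is run downloads and installs the package again, even if it is already on your computer. One possible way around this, sketched below, is to check first with `requireNamespace()`. The helper name `install_if_missing` is our own invention for illustration; it is not a function from base R or from the `meme` package.

```r
# Hypothetical helper: install a package only if R cannot already find it
install_if_missing <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # Report (invisibly) whether the package is now available
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# "stats" ships with R, so nothing is downloaded here
install_if_missing("stats")
```

Replacing `"stats"` with `"meme"` should install the `meme` package only the first time the code is run.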

##

Once the `meme` package is downloaded and installed, we need to load it in our current R session.
Run the following code to load the `meme` package.

```{r class.source = "fold-show", eval = T, echo = F, include = F}
library(meme)
```

```{r class.source = "fold-show", eval = F, echo = T}
library(meme)
```
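
Installing and loading are separate steps: a package is installed once per computer, whereas `library()` needs to be run in every new R session. As a small illustration that does not depend on `meme`, the sketch below uses `tools`, a package that is distributed with R but is not normally loaded at start-up, and checks R's search path with `search()`. The object names `before` and `after` are ours.

```r
# Is the tools package attached before we load it?
before <- "package:tools" %in% search()
before

# Attach the package for the current session
library(tools)

# After library(), the package appears on the search path
after <- "package:tools" %in% search()
after
```

If you restart R, the package will still be installed, but it will no longer be attached until `library()` is run again.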

##

Great, now we can start to make some simple memes!
We really only need two lines of code for this.

Firstly, we need to find an appropriate image. For this example, we will use an image of Hagrid, from the Harry Potter series. We have located this image online, and copied the url. In R, we assign this url to the object `hagrid`, as shown below:

```{r class.source = "fold-show", eval = T, echo = T}
hagrid <- "https://i.imgflip.com/13wb2t.jpg"
```

*Note that the url needs to be contained within quotation marks.*

Make sure to run this code before moving on to the next step.

##

Next, we use the `meme` function to add some words to this image. Try running the code below, and see what happens.

```{r class.source = "fold-show", eval = F, echo = T}
meme(hagrid, "Yer a wizard", "with coding", font = "sans")
```

*Note: Some warnings may appear in your R Console as this code is executing. Don't worry; these warnings are safe to ignore.*
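
A warning is different from an error: R reports a possible problem but still finishes running the code. The base R sketch below (unrelated to the `meme` package) shows a warning being raised while a result is still returned, and how `suppressWarnings()` can hide the message once you have decided it is harmless.

```r
# Converting text that is not a number gives NA, together with a warning
value <- as.numeric("abc")
is.na(value)         # TRUE: the code still ran and returned a result

# The same call, with the warning message hidden
quiet_value <- suppressWarnings(as.numeric("abc"))
is.na(quiet_value)   # TRUE
```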

##

If you would like to save the meme you have made, it is helpful to assign the output of the `meme` function to an object.
In the code below, we make a new meme, and assign it to the object `success`. Try running this code now.

```{r class.source = "fold-show", eval = F, echo = T}
success_kid <- "http://i0.kym-cdn.com/entries/icons/mobile/000/000/745/success.jpg"
success <- meme(success_kid, "Using R", "to make memes", font = "sans")
success
```

*Hint: Notice that we need to include the final line of code, calling the object `success`, in order for the image to be shown.*
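
This behaviour is not specific to memes. In R, assigning a value to an object with `<-` does not display anything; the value is only shown when the object's name is run on its own. A minimal base R sketch, using object names of our own choosing:

```r
total <- 2 + 3            # assignment only: nothing is displayed
total                     # running the object's name displays its value

# Wrapping an assignment in parentheses assigns and displays in one step
(total_again <- 2 + 3)
```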

## 

Now we can save our meme, using the function `meme_save`. Take a look at the code below.

```{r class.source = "fold-show", eval = F, echo = T}
meme_save(success, file="c:/STM1001/Data Science/success_kid_R_meme.png") 
```

Here, we are saving our `success` meme, to the file location `c:\STM1001\Data Science\`, with the name `success_kid_R_meme.png`. 

Note that although the file path on our computer includes backslashes (`\`), in R code these need to be changed to forward slashes (`/`).
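
The reason is that, inside quotation marks, R treats a single backslash as the start of an escape sequence. The sketch below shows two ways of working with this in base R: converting a Windows-style path, and building a path with `file.path()`. It uses `tempdir()` as a stand-in folder so that it runs on any computer; the object names are ours.

```r
# A doubled backslash inside quotation marks stands for one backslash
windows_path <- "c:\\STM1001\\Data Science"

# Swap every backslash for a forward slash
r_path <- gsub("\\", "/", windows_path, fixed = TRUE)
r_path   # "c:/STM1001/Data Science"

# file.path() joins folder names with forward slashes.
# tempdir() is used here only so that the sketch runs anywhere.
out_dir  <- file.path(tempdir(), "STM1001", "Data Science")
out_file <- file.path(out_dir, "success_kid_R_meme.png")

# Create the folder if it does not exist yet
if (!dir.exists(out_dir)) {
  dir.create(out_dir, recursive = TRUE)
}
dir.exists(out_dir)
```

Checking that the destination folder exists before saving can help avoid errors about files that cannot be opened.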

##

Now it's time to try making your own meme. 

a. Find an appropriate image of your choice online (please ensure you pick  content suitable for university and work). 
b. Copy the url. 

c. Assign this url to an object in R.
d. Use the `meme` function to add words to your image.
e. Save your meme using the `meme_save` function.

*Hint: If you are not quite sure how to begin, click the `Code` button to the right below.*

```{r class.source = "fold-hide", eval = F, echo = T}
# First, we need to find an image, and assign it to an object 
# (here we use the generic object name 'image_name')
# Just replace the ...s with the url of your image
image_name <- "..."
# Next, we need to use the meme function, to add some words (just replace the ...s)
my_meme <- meme(image_name, "...", "...", font = "sans")
# Note that you need to include the `, font = "sans"` part to ensure R knows which font to use.
# Now all that's left is to save your meme - just refer to the code above.
```

Congratulations! You were probably not expecting to make a meme in your first data science computer lab, and this probably won't be on the final exam, but hopefully you are starting to realise that R is very versatile.

# Customising GIFs in R

R is not limited to working with static images - we can also modify and create gifs and animations.
In this section, we will use another fun package, the `magick` package [@magick], to customise a gif.

Run the following code to download, install and load the `magick` package in your current R session.

```{r class.source = "fold-show", eval = F, echo = T, include = F}
install.packages("magick", repos = "http://cran.us.r-project.org")
```

```{r class.source = "fold-show", eval = F, echo = T}
install.packages("magick")
```

```{r class.source = "fold-show", eval = T, echo = F, include = F}
library(magick)
```

```{r class.source = "fold-show", eval = F, echo = T}
library(magick)
```

##

Just as we used urls to online images of Hagrid and Success Kid, we can also use urls to gifs and animations.
For this example, we have used the url to a rotating Earth gif.

We use the `image_read` function to read this gif into R, and assign it to the object `Earth`.

```{r class.source = "fold-show", eval = F, echo = T, fig.align = "center"}
Earth <- image_read("https://i.giphy.com/media/mf8UbIDew7e8g/giphy.gif")
Earth
```

Make sure to run this code before moving on to the next step (don't worry if it takes a few seconds). The gif should appear in the `Viewer` section of RStudio.

##

Using the `magick` package, we can easily make some changes to this gif.

Take a look at the code below. You will notice here that: 

* We have reversed the gif, using the `rev` function
* We have flipped the gif, using the `image_flip` function, and
* We have added text to this gif using the `image_annotate` function

```{r class.source = "fold-show", eval = F, echo = T, fig.align = "center"}
rev(Earth) %>% 
           image_flip() %>% 
           image_annotate("        Meanwhile, in Australia", size = 40, color = "white")
```

Try running this code now.
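
The `%>%` symbol in the code above is a pipe: it passes the result on its left to the function on its right as the first argument, so the steps read in the order in which they happen. The sketch below illustrates the same idea using only base R, with a character vector standing in for the frames of a gif. It assumes R version 4.1 or later, where the native pipe `|>` is available, and the names `frames`, `nested` and `piped` are ours.

```r
# A stand-in for the frames of a gif
frames <- c("frame 1", "frame 2", "frame 3")

# Nested calls are read from the inside out
nested <- toupper(rev(frames))

# The piped version is read from left to right: reverse, then convert to upper case
piped <- rev(frames) |> toupper()

identical(nested, piped)   # TRUE
piped
```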

##

This is really just scratching the surface of the `magick` package. However, our intention for this first computer lab is to give you a taste of some of the different possibilities available in R, so for the moment, let's move on.

# Drawing a fish in R

Instead of using a pre-existing image or gif, let's now try to create one from scratch. Specifically, let's draw a fish.
To do this, we can use the appropriately named `rfishdraw` package [@rfishdraw].

##

Let's download and install the `rfishdraw` package now. In order to use this package, we will also need to download and install some additional packages, upon which the `rfishdraw` package depends. Such packages are known as **dependencies**, and it is common for more sophisticated R packages to have multiple dependencies.

*Note that these dependencies are packages in their own right.*

Run this code in R now.

```{r class.source = "fold-show", eval = F, echo = T, include = F}
install.packages("rfishdraw", repos = "http://cran.us.r-project.org")
install.packages("patchwork", repos = "http://cran.us.r-project.org")
install.packages("ggplot2", repos = "http://cran.us.r-project.org")
```

```{r class.source = "fold-show", eval = F, echo = T}
install.packages("rfishdraw")
install.packages("patchwork")
install.packages("ggplot2")
```

```{r class.source = "fold-show", eval = T, echo = F, include = F}
library("rfishdraw")
library("patchwork")
library("ggplot2")
```

```{r class.source = "fold-show", eval = F, echo = T}
library("rfishdraw")
library("patchwork")
library("ggplot2")
```
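
Each installed package records its dependencies in a file called DESCRIPTION, and by default `install.packages()` also installs the packages listed there under `Depends`, `Imports` and `LinkingTo`. As a rough sketch, the code below reads these fields for `stats`, which ships with R; once `rfishdraw` is installed, you could swap its name in to see what it declares. The object name `desc` is ours.

```r
# Read the DESCRIPTION information of an installed package
desc <- packageDescription("stats")

# Packages that are attached alongside it (NULL if none are listed)
desc$Depends

# Packages whose functions it uses internally (NULL if none are listed)
desc$Imports
```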

##

If you now run the code below, a detailed drawing of a fish should appear in a new window!

```{r class.source = "fold-show", eval = F, echo = T}
get_polylines(path = "inst/fishdraw.js",
              
              format = "smil",
              
              output = "animated.svg",
              
              draw_type = "random")

windows() # On a Mac, replace windows() with quartz(); on Linux, use x11()

fish_draw()
```

##

Suppose we would like to change the colour of our fish. We can do this, by including the argument `col = "..."` within the function `fish_draw`. For example, if we would like our fish to be blue, we can write

```{r class.source = "fold-show", eval = F, echo = T}
fish_draw(col = "blue")
```

Try changing this colour to a different colour, and then run the code.
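
If you are not sure which colour names are available, base R can list them. The sketch below assumes that `fish_draw` accepts the same colour names as R's standard graphics functions, which we have not verified for every name; the object name `all_colours` is ours.

```r
# All colour names recognised by base R graphics
all_colours <- colors()
length(all_colours)
head(all_colours)

# Check whether a particular name is available
"blue" %in% all_colours   # TRUE

# Find all names containing "green"
grep("green", all_colours, value = TRUE)

# The red, green and blue components of a named colour
col2rgb("blue")
```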

# Palmer Penguins Data Set {#penguins}

Now that we have had a taste of some of the more light-hearted R packages out there, let's consider a package which contains some useful data.

The `palmerpenguins` R package [@penguins] contains data, collected over the course of several years, on three species of penguin living on different islands in the Palmer Archipelago, off the coast of Antarctica.
For more details, you can refer to [Section 2 of the Data Visualisation in R supplement](https://bookdown.org/rehk/stm1001_dsm_data_visualisation_in_r/penguins.html).

##

Just like the previous packages, we will need to download and load the `palmerpenguins` package before we can begin working with this penguin data.

Run the code below to install and load the `palmerpenguins` package in R.

```{r class.source = "fold-show", eval = F, echo = F, include = F}
install.packages("palmerpenguins", repos = "http://cran.us.r-project.org")
```

```{r class.source = "fold-show", eval = F, echo = T}
install.packages("palmerpenguins")
```

```{r class.source = "fold-show", eval = T, echo = T, warning=F}
library(palmerpenguins)
```

##

We can use the `summary` function to obtain a quick overview of the data contained within the `penguins` data set.

```{r class.source = "fold-show", eval = F, echo = T}
# This code summarises the `penguins` data set from the `palmerpenguins` package.
summary(penguins)
```

Don't worry too much about the values shown in the summary table - the main things to note at this stage are the different variables, namely `species`, `island`, `bill_length_mm`, `bill_depth_mm`, `flipper_length_mm`, `body_mass_g`, `sex` and `year`.
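
Besides `summary`, a few other base R functions give a quick feel for a data set. The sketch below is one possible selection, assuming the `palmerpenguins` package has been installed as described above.

```r
library(palmerpenguins)

names(penguins)            # the eight variable names
dim(penguins)              # number of rows and number of columns
str(penguins)              # the type and first few values of each variable
colSums(is.na(penguins))   # number of missing values in each variable
```

The last line is a useful habit: knowing which variables contain missing values helps explain why some summaries and plots use fewer observations than there are rows.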

# Interactive Histograms

Suppose that we would like to produce histograms showing the distribution of the penguins' `body_mass_g` values (their body mass in grams). We could create a simple histogram using the base R `hist` function via the following code:

```{r class.source = "fold-show", eval = T, echo = T}
hist(penguins$body_mass_g, breaks = 19)
```

However, this histogram has some shortcomings. Firstly, it is static. We can't interact with the image, and we can't manipulate it in real time to display different details. 

For example, perhaps we would like to see the distribution of the penguins' `body_mass_g` values, but only for the penguins on a specific island. We would need to do some more coding to produce such a histogram in base R. Even then, if we would like to have similar histograms for the other two islands, this would mean further coding.
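
To give a sense of the extra coding involved, here is one possible base R approach, offered as a sketch rather than the only way to do it. It loops over the islands and draws a separate static histogram for each; the names `islands`, `isl`, `masses` and `old_par` are ours.

```r
library(palmerpenguins)

# island is stored as a factor, so levels() lists the island names
islands <- levels(penguins$island)

# Arrange the plotting area as one row with one panel per island
old_par <- par(mfrow = c(1, length(islands)))

for (isl in islands) {
  # Keep only the body masses of penguins on this island
  masses <- penguins$body_mass_g[penguins$island == isl]
  hist(masses, main = isl, xlab = "Body mass (g)")
}

# Restore the previous plotting layout
par(old_par)
```

Even with this extra code, each panel is still a static image.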

Alternatively, we could use the `plotly` package to create an interactive, responsive histogram. Let's take a look at how to do this now.

##

To begin, just as for the previous packages, we will need to download and load the `plotly` package in R, before we can use any `plotly` functions.

Run the code below to install and load the `plotly` package in R.

```{r class.source = "fold-show", eval = F, echo = F, include = F}
install.packages("plotly", repos = "http://cran.us.r-project.org")
```

```{r class.source = "fold-show", eval = F, echo = T}
install.packages("plotly")
```

```{r class.source = "fold-show", eval = T, echo = T, message = F, warning = F}
library(plotly)
```

##

To create `plotly` plots, we  use the function `plot_ly()`. We won't worry too much about the composition of this function just yet - we'll cover this in more detail next week. For the moment, take a look at the code below, and see if you can get a general idea of what's going on.

```{r class.source = "fold-show", eval = T, echo = T, warning = F, message = F}
penguin_hist_base <- plot_ly(data = penguins, 
                             x = ~body_mass_g, 
                             type = "histogram")

penguin_hist_base <- penguin_hist_base %>% layout(yaxis = list(title = 'count'))
```

Before you move on to the next question, run this code in R.

*Note: Once you have taken some time to consider the code above, if you would like more details or would like to check the accuracy of your interpretation, click the `Code` button below for a brief explanation.*

```{r class.source = "fold-hide", eval = F, echo = T}
# Here, we are creating a plotly object called "penguin_hist_base"
penguin_hist_base <- plot_ly(data = penguins, # We are using the penguins data
                             x = ~body_mass_g, # and plotting the body_mass_g values
                             type = "histogram") # in a histogram format

# The code below is used to modify the layout of the histogram
# to include a label for the y-axis
penguin_hist_base <- penguin_hist_base %>% layout(yaxis = list(title = 'count'))
```
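
If the `%>%` symbol (the "pipe") is new to you, the optional sketch below may help. The pipe passes the object on its left to the function on its right as that function's first argument, so the two versions below should produce the same histogram. The object names `base_plot`, `with_pipe` and `without_pipe` are made up for this illustration, and we assume the `palmerpenguins` package is already installed from earlier in this lab.

```{r class.source = "fold-show", eval = F, echo = T}
library(plotly)
library(palmerpenguins) # assumed to be installed earlier in this lab

# A plain histogram with no axis label changes (illustrative object name)
base_plot <- plot_ly(data = penguins,
                     x = ~body_mass_g,
                     type = "histogram")

# Version 1: use the pipe to send base_plot into layout()
with_pipe <- base_plot %>% layout(yaxis = list(title = "count"))

# Version 2: the same call written without the pipe
without_pipe <- layout(base_plot, yaxis = list(title = "count"))
```

Printing either `with_pipe` or `without_pipe` should display the histogram with the y-axis labelled "count".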

## {#basehist}

To produce this `plotly` histogram, run the R code below. Your histogram should appear in the `Viewer` section of RStudio.

```{r class.source = "fold-show", eval = T, echo = T, warning = F, message = F, fig.align = "center"}
penguin_hist_base
```

##

As we noted earlier, `plotly` graphs, unlike base R graphs, are interactive!

Notice that if you hover over the data in the histogram in \@ref(basehist), you can see the specific details (note that the graph in this document is also interactive!). 
If you left-click and drag your cursor over a section to create a box, you can also zoom in on a particular section of the plot. Just double left-click to zoom back out.
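
Zooming with the mouse only changes what you see on screen. If you would like a histogram that opens already zoomed in on a section, one option is to set the x-axis range in `layout()`, as in the optional sketch below. The limits of 3000 and 4500 grams are arbitrary values chosen for illustration, and `penguin_hist_zoomed` is a made-up object name. We again assume `palmerpenguins` is installed from earlier in this lab.

```{r class.source = "fold-show", eval = F, echo = T}
library(plotly)
library(palmerpenguins) # assumed to be installed earlier in this lab

# Same histogram as before, but with the x-axis limited to an
# illustrative range of 3000 g to 4500 g
penguin_hist_zoomed <- plot_ly(data = penguins,
                               x = ~body_mass_g,
                               type = "histogram") %>%
  layout(xaxis = list(range = c(3000, 4500)),
         yaxis = list(title = "count"))

penguin_hist_zoomed
```

The plot is still interactive, so double left-clicking should reset the view as before.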

##

Perhaps you are not impressed with `plotly` yet. After all, our histogram doesn't look that different to the base R version, so what is all the fuss about?

Well, it is very easy to modify our `penguin_hist_base` `plotly` graph to show extra detail. For example, we can easily produce separate histograms for the penguins on each island. Take a look at the R code below, which builds upon what we used in `penguin_hist_base`.

```{r class.source = "fold-show", eval = T, echo = T, warning = F, message = F}
penguin_hist <- plot_ly(data = penguins, 
                        x = ~body_mass_g, 
                        color = ~island, 
                        type = "histogram", alpha = 0.6)

penguin_hist <- penguin_hist %>% layout(yaxis = list(title = 'count'), 
                                        barmode ="overlay")
```

Before you move on to the next question, run this code in R.

*Note: Once you have taken some time to consider the code above, if you would like more details or would like to check the accuracy of your interpretation, click the `Code` button below for a brief explanation.*

```{r class.source = "fold-hide", eval = F, echo = T}
# Here, we are creating a plotly object called "penguin_hist"
penguin_hist <- plot_ly(data = penguins, # We are using the penguins data
                        x = ~body_mass_g, # and plotting the body_mass_g variable
                        color = ~island, # with bars coloured by island
                        type = "histogram", alpha = 0.6) # as semi-transparent histograms

# The code below is used to modify the layout of the histogram
# This includes adding a label to the y-axis
# and setting the histograms to be layered over each other
# (hence the alpha = 0.6 above to change the opacity)
penguin_hist <- penguin_hist %>% layout(yaxis = list(title = 'count'), 
                                        barmode ="overlay")
```
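
The `barmode` setting controls how the three histograms are arranged. As an optional comparison, the sketch below uses `barmode = "stack"` instead of `"overlay"`, so the bars for each island sit on top of one another rather than overlapping. The object name `penguin_hist_stacked` is made up for this illustration, and we assume `palmerpenguins` is installed from earlier in this lab.

```{r class.source = "fold-show", eval = F, echo = T}
library(plotly)
library(palmerpenguins) # assumed to be installed earlier in this lab

# Histograms coloured by island, stacked rather than overlaid
# (no alpha needed here, since stacked bars do not overlap)
penguin_hist_stacked <- plot_ly(data = penguins,
                                x = ~body_mass_g,
                                color = ~island,
                                type = "histogram") %>%
  layout(yaxis = list(title = "count"),
         barmode = "stack")

penguin_hist_stacked
```

In a stacked histogram the total height of each bar shows the count across all islands, which can make it harder to compare the shapes of the individual island distributions.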

## {#islandshist}

To produce this new `plotly` histogram, run the R code below. Your histogram should appear in the `Viewer` section of RStudio.

```{r class.source = "fold-show", eval = T, echo = T, warning = F, message = F, fig.align = "center"}
penguin_hist
```

This is looking better than our previous histogram! Because we have told our `plot_ly` function to assign different colours to the different islands, we now have three histograms, rather than one with all the data clumped together.

Even better, these are all presented within the one plot, which also includes a handy legend. Hopefully you are now beginning to appreciate the increased functionality offered by `plotly` over base R plots.

##

Finally, and perhaps most importantly for this example, we can dynamically filter out observations to focus on data from a specific island. Simply click on one of the entries in the legend at the top right of our histogram in \@ref(islandshist) to hide that island's data (note that the axes dynamically adjust too).

Try focusing just on the Dream island penguins.

*Hint: To bring the hidden data back, simply click once more on the relevant entry in the legend.*
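
Clicking on the legend only hides data in the plot itself. If you wanted to keep just the Dream island penguins in your R code, one possible approach is to filter the data before plotting, as in the optional sketch below. The object names `dream_penguins` and `dream_hist` are made up for this illustration, and we assume `palmerpenguins` is installed from earlier in this lab.

```{r class.source = "fold-show", eval = F, echo = T}
library(plotly)
library(palmerpenguins) # assumed to be installed earlier in this lab

# Keep only the rows for penguins recorded on Dream island
dream_penguins <- subset(penguins, island == "Dream")

# Histogram of body mass for the Dream island penguins only
dream_hist <- plot_ly(data = dream_penguins,
                      x = ~body_mass_g,
                      type = "histogram") %>%
  layout(yaxis = list(title = "count"))

dream_hist
```

Unlike the legend approach, the other islands cannot be brought back with a click here, because their data was never passed to `plot_ly()`.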

<br>

#### That's the end of the first data science computer lab! #### {-}

Hopefully you have enjoyed this first computer lab, and now have a better idea of just how versatile R can be. Don't worry if some of the code seems difficult at the moment - this is only the first week after all! Next week, we will continue working with `plotly` and the `palmerpenguins` data set, to produce even more detailed interactive plots.

Before you finish up, if you have been writing and running your code in RStudio, make sure to save your script file somewhere safe - it might come in handy later on. 

<br>

# References {- #Ref}
<div id="refs"></div>

<br>

<font color = "grey">
These notes have been prepared by Rupert Kuveke. The copyright for the material in these notes resides with the author named above, with the Department of Mathematics and Statistics and with La Trobe University. Copyright in this work is vested in La Trobe University including all La Trobe University branding and naming. Unless otherwise stated, material within this work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives License 
<a href = "https://creativecommons.org/licenses/by-nc-nd/4.0/" target="_blank"> CC BY-NC-ND. </a>
</font>