Week 4 Workbook

##Data Analysis - Reproducing graphics from the tutorial

#Basics

First I loaded the relevant libraries and set the chunk's output option to false so that the output from loading them does not appear in my rendered Quarto document.

library(tidyverse)
library(modeldata)
library(ggplot2)

I then produced a simple graphic, specifying the variables to plot on each axis and the type of graphic (a scatter plot).

ggplot(crickets, aes(x = temp, y = rate)) + geom_point()

To change the axis labels I added a further command, labs(), to my code.

ggplot(crickets, aes(x = temp, y = rate)) + geom_point() + labs(x = "Temperature", y = "Chirp rate")

I revised the code further to include a title and source caption.

ggplot(crickets, aes(x = temp, y = rate)) + geom_point() + labs(x = "Temperature", y = "Chirp rate", title = "Cricket chirps", caption = "Source: McDonald (2009)")

By revising the aes() call I mapped colour to species.

ggplot(crickets, aes(x = temp, y = rate, color = species)) + geom_point() + labs(x = "Temperature", y = "Chirp rate", title = "Cricket chirps", caption = "Source: McDonald (2009)")

The legend produced by the above code was titled “species” in lower case. Adding color = "Species" to the labs() call changed the legend title to title case.

ggplot(crickets, aes(x = temp, y = rate, color = species)) + geom_point() + labs(x = "Temperature", y = "Chirp rate", color = "Species", title = "Cricket chirps", caption = "Source: McDonald (2009)")

Adding the following command to my code changed the colours to a colourblind-friendly palette.

ggplot(crickets, aes(x = temp, y = rate, color = species)) + geom_point() + labs(x = "Temperature", y = "Chirp rate", color = "Species", title = "Cricket chirps", caption = "Source: McDonald (2009)") + scale_color_brewer(palette = "Dark2")
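Because each revision above only adds one more piece to the same call, the plot can also be stored in an object and extended step by step. The following is a minimal sketch of that pattern; the object names base_plot, labelled_plot and final_plot are illustrative and not part of the tutorial.

```r
library(ggplot2)
library(modeldata)  # provides the crickets data set

# Start with the data, the aesthetic mappings and a single geom.
base_plot <- ggplot(crickets, aes(x = temp, y = rate, color = species)) +
  geom_point()

# Add labels, including a title-case legend title for the colour key.
labelled_plot <- base_plot +
  labs(
    x = "Temperature",
    y = "Chirp rate",
    color = "Species",
    title = "Cricket chirps",
    caption = "Source: McDonald (2009)"
  )

# Swap the default colours for a colourblind-friendly ColorBrewer palette.
final_plot <- labelled_plot + scale_color_brewer(palette = "Dark2")

final_plot  # printing the object draws the plot
```

Storing the intermediate steps makes it easy to go back to an earlier version of the chart without retyping the whole command.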

#Modifying the basic properties of the plot

To set an aesthetic to a fixed value of my choosing rather than mapping it to a variable, the argument is given inside the geom instead of aes(). A value set this way overrides any mapping of the same aesthetic, so the species colour mapping and palette are left out here. E.g. the below code changes the colour of all the points to red.

ggplot(crickets, aes(x = temp, y = rate)) + geom_point(color = "red") + labs(x = "Temperature", y = "Chirp rate", title = "Cricket chirps", caption = "Source: McDonald (2009)")

The points can be further modified by size...

ggplot(crickets, aes(x = temp, y = rate)) + geom_point(color = "red", size = 2) + labs(x = "Temperature", y = "Chirp rate", title = "Cricket chirps", caption = "Source: McDonald (2009)")

...by opacity (alpha = .3 gives an opacity of 30%)...

ggplot(crickets, aes(x = temp, y = rate)) + geom_point(color = "red", size = 2, alpha = .3) + labs(x = "Temperature", y = "Chirp rate", title = "Cricket chirps", caption = "Source: McDonald (2009)")

...and by shape.

ggplot(crickets, aes(x = temp, y = rate)) + geom_point(color = "red", size = 2, alpha = .3, shape = "square") + labs(x = "Temperature", y = "Chirp rate", title = "Cricket chirps", caption = "Source: McDonald (2009)")
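The distinction at work in this section is between mapping an aesthetic to a variable inside aes() and setting it to a fixed value inside the geom. The sketch below compares the two by inspecting the values ggplot2 computes for each layer with layer_data(); the object names mapped and fixed are illustrative.

```r
library(ggplot2)
library(modeldata)  # provides the crickets data set

# Mapped: colour depends on the species variable, so a legend is drawn.
mapped <- ggplot(crickets, aes(x = temp, y = rate, color = species)) +
  geom_point()

# Set: every point gets the same fixed colour, size, opacity and shape.
fixed <- ggplot(crickets, aes(x = temp, y = rate)) +
  geom_point(color = "red", size = 2, alpha = .3, shape = "square")

# layer_data() returns the values actually used to draw the first layer.
unique(layer_data(mapped)$colour)  # one colour per species
unique(layer_data(fixed)$colour)   # a single fixed colour
```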

#Learning more about options for the Geom

Help files can be accessed for all geoms by the command ?geom…

E.g. the following command opens the help file for geom_point(), the geom used for scatter plots.

?geom_point
starting httpd help server ... done

#Adding another layer to the geom

A second geom can be added to the graph. E.g. below, a smoothed line appears on top of the scatter plot.

ggplot(crickets, aes(x = temp, y = rate)) + geom_point() + geom_smooth() + labs(x = "Temperature", y = "Chirp rate", color = "Species", title = "Cricket chirps", caption = "Source: McDonald (2009)")
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'

The features of this new geom can also be amended. The following command fits a straight line using a linear model (method = "lm") in place of the default loess smoother.

ggplot(crickets, aes(x = temp, y = rate)) + geom_point() + geom_smooth(method = "lm") + labs(x = "Temperature", y = "Chirp rate", color = "Species", title = "Cricket chirps", caption = "Source: McDonald (2009)")
`geom_smooth()` using formula = 'y ~ x'

The grey band around the line in the above chart is the confidence interval for the fitted line (95% by default). It shows the uncertainty in the fit rather than an error in the data or the code. If the band makes the graphic harder to read, it can be hidden by setting se = FALSE in the geom (se refers to the standard error ribbon).

ggplot(crickets, aes(x = temp, y = rate)) + geom_point() + geom_smooth(method = "lm", se = FALSE) + labs(x = "Temperature", y = "Chirp rate", color = "Species", title = "Cricket chirps", caption = "Source: McDonald (2009)")
`geom_smooth()` using formula = 'y ~ x'

The below example applies the same principle to the earlier chart that maps color = species in aes(), so a separate line is fitted for each species.

ggplot(crickets, aes(x = temp, y = rate, color = species)) + geom_point() + geom_smooth(method = "lm", se = FALSE) + labs(x = "Temperature", y = "Chirp rate", color = "Species", title = "Cricket chirps", caption = "Source: McDonald (2009)") + scale_color_brewer(palette = "Dark2")
`geom_smooth()` using formula = 'y ~ x'
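With method = "lm" and the formula y ~ x reported in the messages above, the straight lines should correspond to ordinary least-squares fits of rate on temp. The sketch below fits the same kind of model directly with lm() to see the numbers behind the lines; the names overall_fit and species_fits are illustrative.

```r
library(modeldata)  # provides the crickets data set

# One straight line through all the points, as in the chart without species.
overall_fit <- lm(rate ~ temp, data = crickets)
coef(overall_fit)  # intercept and slope of the fitted line

# One line per species, as in the chart that maps color = species.
species_fits <- lapply(
  split(crickets, crickets$species),
  function(d) coef(lm(rate ~ temp, data = d))
)
species_fits
```

The slope for temp is the estimated change in chirp rate for each one-unit rise in temperature.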

#Other plots

The following command produces a histogram (a single quantitative variable, with frequencies generally on the vertical axis).

ggplot(crickets, aes(x = rate)) + geom_histogram()
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

The number of bins can easily be changed by revising the command as follows: -

ggplot(crickets, aes(x = rate)) + geom_histogram(bins = 15)

Changing the geom to a frequency polygon gives another perspective on the data.

ggplot(crickets, aes(x = rate)) + geom_freqpoly(bins = 15)
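geom_histogram() accepts either a number of bins (bins) or a bin width in the units of the variable (binwidth). The short sketch below contrasts the two; the width of 5 is an arbitrary value chosen for illustration, and the object names are hypothetical.

```r
library(ggplot2)
library(modeldata)  # provides the crickets data set

# Fix the number of bins; ggplot2 works out the width.
by_count <- ggplot(crickets, aes(x = rate)) + geom_histogram(bins = 15)

# Fix the width of each bin; ggplot2 works out how many are needed.
by_width <- ggplot(crickets, aes(x = rate)) + geom_histogram(binwidth = 5)

# Either way, every observation falls into exactly one bin,
# so the bin counts add up to the number of rows in the data.
sum(layer_data(by_count)$count) == nrow(crickets)
sum(layer_data(by_width)$count) == nrow(crickets)
```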

The below code produces a bar chart.

ggplot(crickets, aes(x = species)) + geom_bar()

The aesthetics of the chart can be changed by preference as per the below code.

ggplot(crickets, aes(x = species)) + geom_bar(color = "black", fill = "lightblue")

They can also be changed by variables. The below example changes the colour of the chart dependent on the variable ‘species’. In addition a colour blind friendly palette has been applied.

ggplot(crickets, aes(x = species, fill = species)) + geom_bar() + scale_fill_brewer(palette = "Dark2")

A legend appears automatically when an aesthetic is mapped to a variable. As the species are already labelled on the horizontal axis, the legend duplicates information and clutters the graphic. It can be removed by adding show.legend = FALSE to the geom as follows: -

ggplot(crickets, aes(x = species, fill = species)) + geom_bar(show.legend = FALSE) + scale_fill_brewer(palette = "Dark2")
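By default geom_bar() counts the rows in each category, so the bar heights can be checked against a simple table of counts. The following is a minimal sketch using dplyr::count(); the names species_counts and bar_plot are illustrative.

```r
library(dplyr)
library(ggplot2)
library(modeldata)  # provides the crickets data set

# Number of observations per species: these are the heights of the bars.
species_counts <- count(crickets, species)
species_counts

# The same counts as computed by the bar chart layer.
bar_plot <- ggplot(crickets, aes(x = species)) + geom_bar()
layer_data(bar_plot)$count
```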

A good method of displaying one quantitative variable against one categorical variable is with a boxplot.

ggplot(crickets, aes(x = species, y = rate)) + geom_boxplot()

The aesthetics can be amended as previously. The below example shows the different colours for the variable ‘species’ using a colour blind friendly palette and removes the legend from the graphic.

ggplot(crickets, aes(x = species, y = rate, color = species)) + geom_boxplot(show.legend = FALSE) + scale_color_brewer(palette = "Dark2")
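A boxplot summarises each group by its median and quartiles. As a rough check of what the boxes show, the same summaries can be computed per species. This is only a sketch, and the column names q1, median and q3 are illustrative.

```r
library(dplyr)
library(modeldata)  # provides the crickets data set

# Lower quartile, median and upper quartile of chirp rate for each species.
rate_summary <- crickets %>%
  group_by(species) %>%
  summarise(
    q1 = quantile(rate, 0.25, names = FALSE),
    median = median(rate),
    q3 = quantile(rate, 0.75, names = FALSE)
  )

rate_summary
```

The q1 and q3 columns should line up with the bottom and top edges of each box, and the median with the line inside it.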

The background of the chart can be amended, as shown in the example below, which uses the minimal theme.

ggplot(crickets, aes(x = species, y = rate, color = species)) + geom_boxplot(show.legend = FALSE) + scale_color_brewer(palette = "Dark2") + theme_minimal()

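Other modifications can be made to themes. The following commands provide more detail: ?theme_minimal documents the built-in complete themes and ?theme documents the individual elements that can be changed.

?theme_minimal
?theme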
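Beyond switching to a complete theme such as theme_minimal(), individual elements can be adjusted with theme(). The sketch below makes two arbitrary adjustments chosen purely for illustration (a bold title and no minor grid lines); the title text and the object name themed_plot are also assumptions.

```r
library(ggplot2)
library(modeldata)  # provides the crickets data set

themed_plot <- ggplot(crickets, aes(x = species, y = rate, color = species)) +
  geom_boxplot(show.legend = FALSE) +
  scale_color_brewer(palette = "Dark2") +
  labs(title = "Cricket chirps") +
  theme_minimal() +
  # theme() goes after the complete theme so its settings are not overwritten.
  theme(
    plot.title = element_text(face = "bold"),
    panel.grid.minor = element_blank()
  )

themed_plot
```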

##Faceting

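Some graphics are difficult to interpret when several groups share one set of axes. In the following histogram the bars for the two species are stacked on top of each other, which makes it hard to compare their distributions: -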

ggplot(crickets, aes(x = rate, fill = species)) + geom_histogram(bins = 15) + scale_fill_brewer(palette = "Dark2")

The facet_wrap() function can be used to present the data more clearly. The following example separates the species into two charts.

ggplot(crickets, aes(x = rate)) + geom_histogram(bins = 15) + facet_wrap(~species)

Again, the graphic can be changed aesthetically. The below example fills the bars by the variable ‘species’ using a colour blind friendly palette and removes the legend.

ggplot(crickets, aes(x = rate, fill = species)) + geom_histogram(bins = 15, show.legend = FALSE) + facet_wrap(~species) + scale_fill_brewer(palette = "Dark2")

The following amendment to code changes the layout of the charts to be displayed in a vertical column.

ggplot(crickets, aes(x = rate, fill = species)) + geom_histogram(bins = 15, show.legend = FALSE) + facet_wrap(~species, ncol = 1) + scale_fill_brewer(palette = "Dark2")
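facet_wrap() has further arguments that control the panels. The sketch below uses scales = "free_y", which lets each panel have its own vertical axis range. Whether that helps depends on the data, since free scales make the panels harder to compare directly; the object name faceted is illustrative.

```r
library(ggplot2)
library(modeldata)  # provides the crickets data set

faceted <- ggplot(crickets, aes(x = rate, fill = species)) +
  geom_histogram(bins = 15, show.legend = FALSE) +
  # One panel per species, stacked vertically, each with its own y-axis range.
  facet_wrap(~species, ncol = 1, scales = "free_y") +
  scale_fill_brewer(palette = "Dark2")

faceted
```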

##Research Methods - What is a good research hypothesis?

A good research hypothesis is a prerequisite to investigation. It aims to test a prediction through controlled scientific methods, i.e. experiments, close observation or evaluation of statistical data, in order to prove or disprove a theory. The hypothesis should aim to address a shortfall in information, ideally related to current and relevant issues. It should clearly and concisely outline the theory to be tested and the methods that will be used, so that sufficient information is collated for data analysis and reasoning to reach a conclusion. That conclusion either proves the theory or illustrates deviations from the expected results, providing the scope and basis for further scientific research and investigation.