Previous labs had you create different types of visuals to help illustrate your data. We have focused on the built-in graphing abilities in R, but there are many other options for creating high-level, publication-quality figures. The most common graphing tool is ggplot2.

Here are some helpful links for understanding the structure of coding in ggplot:

https://ggplot2.tidyverse.org/articles/ggplot2.html

https://r-graph-gallery.com/ggplot2-package.html

PART 1: installing and loading ggplot2

library(ggplot2)
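
The line above only loads the package; if ggplot2 has never been installed on your computer, library(ggplot2) will fail. Here is a minimal sketch of installing it first, assuming you have an internet connection (the CRAN mirror URL is my choice for illustration, not something this lab requires):

```r
# Install ggplot2 only if it is not already available on this machine.
# The repos value is an assumed CRAN mirror; any mirror works.
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2", repos = "https://cloud.r-project.org")
}

# Load the package for this R session (needed every time you restart R).
library(ggplot2)
```

You only need to install a package once per computer, but you need to load it with library() in every new R session.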

PART 2: Practicing with barnacle_data_2.csv, boxplots

In R Lab 5, you worked with the barnacle dataset to look at how tide height influenced size and how two size measurements related to one another. I had you graph these with the basic visualization tools, but let’s try to make them a bit fancier using ggplot2.

Please perform the tasks below:

1. Load the “barnacle_data_2.csv” file and call it “barnacle”.

2. Run the code chunk below to create a boxplot:

barnacle <- read.table('barnacle_data_2.csv', sep=',', header=TRUE)
barnacle
##     Tide.Height OperculumAug BasalAug  WidthChange
## 1             H        5.258    7.098  68.01916033
## 2             H        5.615    9.256  18.65816768
## 3             H        5.483    8.289   0.97719870
## 4             H        5.065   11.578   6.36552081
## 5             H        5.229    9.563  12.21374046
## 6             H        5.645   11.291   7.41298379
## 7             H        6.367    8.750   1.06285714
## 8             H        5.740   11.798   6.32310561
## 9             H        5.714    9.890   8.24064712
## 10            H        3.821    6.072  45.25691700
## 11            H        3.642    6.293   1.43016050
## 12            H        3.971    8.611   3.79746835
## 13            H        5.323    8.826  26.50124632
## 14            H        4.708    7.784  16.00719424
## 15            H        3.472    6.029  22.64057058
## 16            H        3.235    5.521  42.25683753
## 17            H        3.881    5.820  30.20618557
## 18            H        1.848    3.368  89.60807601
## 19            H        4.865    6.001  40.19330112
## 20            H        5.073    5.813  12.98813005
## 21            H        2.857    3.483   7.69451622
## 22            H        4.771    8.944  12.94722719
## 23            H        4.422    6.262  16.33663366
## 24            H        4.384    7.715   3.30524951
## 25            H        4.898    6.371   3.32757809
## 26            H        4.722    6.874   7.53564155
## 27            H        4.261    7.039  14.98792442
## 28            H        4.604    7.817   9.49213253
## 29            H        4.558    8.892  16.64417454
## 30            H        5.303    8.467   1.29916145
## 31            H        4.723    9.030   6.36766334
## 32            H        3.758    9.147   8.54925112
## 33            H        4.908    8.703  32.64391589
## 34            H        4.361    9.779  19.20441763
## 35            H        4.348   10.218  15.02250930
## 36            H        5.249   10.126  15.40588584
## 37            H        3.956    6.552  11.75213675
## 38            H        4.457    8.149  18.33353786
## 39            H        4.164    8.804  15.54975011
## 40            H        4.088    9.363   0.58741856
## 41            H        3.656    8.790   5.77929465
## 42            H        3.755    8.400   5.83333333
## 43            H        4.026    8.218   8.42054028
## 44            H        3.663    5.287  94.15547570
## 45            H        4.284    6.053  30.81116802
## 46            H        4.672    9.284   7.71219302
## 47            H        3.911    4.594  30.40922943
## 48            L        3.470   10.315   3.46097916
## 49            L        3.215    6.568  21.26979294
## 50            L        3.957    6.979  34.46052443
## 51            L        3.936    7.290  21.38545953
## 52            L        3.585    7.513  32.31731665
## 53            L        3.778    7.208  37.19478357
## 54            L        4.316   10.376  13.97455667
## 55            L        2.288    5.088  55.09040881
## 56            L        2.874    6.361  20.40559660
## 57            L        4.518    9.353  23.69293275
## 58            L        4.056    6.387  97.96461563
## 59            L        3.256    4.375  32.96000000
## 60            L        4.436    7.321  36.64799891
## 61            L        3.643    7.522  68.67854294
## 62            L        4.566    6.641  68.31802439
## 63            L        4.213    9.063  14.17852808
## 64            L        4.324    7.733  19.28100349
## 65            L        3.185    5.127  58.31870490
## 66            L        3.725    6.541  44.10640575
## 67            L        4.116    9.121  25.60026313
## 68            L        3.921    6.626  70.55538787
## 69            L        4.979    6.601  10.37721557
## 70            L        2.364    6.376   6.80677541
## 71            L        4.557    9.280  19.11637931
## 72            L        4.576    7.057  42.39761939
## 73            L        3.583    7.816  26.30501535
## 74            L        3.693    6.758  42.14264575
## 75            L        3.823    7.797  19.72553546
## 76            L        4.175    6.908  46.56919514
## 77            L        3.328    6.613  34.59851807
## 78            L        3.654    5.344  55.03368263
## 79            L        3.686    7.859   8.57615473
## 80            L        3.533    5.623  37.86235106
## 81            L        3.686    9.099   9.42960765
## 82            L        4.994    6.844  66.17475161
## 83            L        3.100    7.648  26.15062762
## 84            L        3.083    5.081  26.51052942
## 85            L        2.828    5.793  47.00500604
## 86            L        3.687    6.926  38.75252671
## 87            L        4.464    8.429  12.43326611
## 88            L        4.864    8.745  24.64265294
## 89            L        2.037    3.750 108.72000000
## 90            L        5.148    6.868  24.24286546
## 91            L        4.257    5.239  87.57396450
## 92            L        3.977    5.342  53.80007488
## 93            L        4.934    7.455  73.72233400
## 94            L        4.966    5.677  19.88726440
## 95            L        4.376    5.645  22.90522586
## 96            L        4.022    6.343   7.11020022
## 97            H        2.384    6.582   5.49984807
## 98            H        3.128    5.800  18.65517241
## 99            H        3.553    6.453   9.29800093
## 100           H        2.941    7.115   3.17638791
## 101           H        3.146    6.465   5.35189482
## 102           H        2.529    6.094   9.92779783
## 103           H        2.915    4.429   4.01896591
## 104           H        3.541    6.797   2.16271885
## 105           H        3.425    3.791  18.93959377
## 106           H        4.016    7.637  12.89773471
## 107           H        3.388    7.166   2.73513815
## 108           H        3.244    6.179  11.02120084
## 109           H        4.471    8.479   4.03349452
## 110           H        2.721    5.915   4.64919696
## 111           H        3.434    6.937  12.90183076
## 112           H        3.390    5.634   1.77493788
## 113           H        3.811    8.301   0.69871100
## 114           H        3.058    6.829   6.97027383
## 115           H        2.996    7.893   5.38451793
## 116           H        4.033   10.004   0.58976409
## 117           H        3.271    7.576   0.19799366
## 118           H        2.459    5.039  10.00198452
## 119           H        2.848    5.411   7.54019590
## 120           H        2.355    4.924   9.44354184
## 121           H        3.226    4.617   4.80831709
## 122           H        2.908    5.836   0.97669637
## 123           H        3.438    6.758   8.34566440
## 124           H        3.102    7.255   3.37698139
## 125           H        2.905    5.495   6.31483166
## 126           H        2.899    4.488  32.10784314
## 127           H        3.664    4.989   6.41411104
## 128           H        3.953    5.095  10.08832188
## 129           H        3.598    4.718   9.11403137
## 130           H        3.712    5.000   7.10000000
## 131           H        3.676    5.525   8.30769231
## 132           H        3.814    4.937   6.94753899
## 133           H        4.335    6.064   8.04749340
## 134           H        2.721    5.204   2.01767871
## 135           H        3.482    6.168  10.68417639
## 136           H        3.303    7.058   1.50184188
## 137           H        3.404    7.095   2.42424242
## 138           H        2.901    5.130   4.69785575
## 139           H        3.687    5.945  13.45668629
## 140           H        3.338    6.304  14.87944162
## 141           H        3.770    5.416   4.33899557
## 142           H        2.937    6.022   1.77681833
## 143           H        2.711    6.399   1.40646976
## 144           L        2.469    5.757  22.14695154
## 145           L        2.647    5.063   3.00217263
## 146           L        2.806    4.918  12.99308662
## 147           L        2.790    4.488  28.14171123
## 148           L        2.245    5.398  11.98592071
## 149           L        2.118    4.392  32.74134791
## 150           L        2.735    5.837  12.78053795
## 151           L        1.824    4.437  41.04124408
## 152           L        3.062    5.167   1.43216567
## 153           L        2.146    4.936  14.12074554
## 154           L        1.572    3.365  53.87815750
## 155           L        1.839    3.950  23.69620253
## 156           L        0.533    0.815 447.36196320
## 157           L        2.175    4.860  26.37860082
## 158           L        1.954    4.473  22.11044042
## 159           L        1.974    5.062  20.20940340
## 160           L        1.667    4.271  23.46054788
## 161           L        1.476    4.098  26.32991703
## 162           L        1.771    5.867  22.75438896
## 163           L        1.698    4.231  17.08815883
## 164           L        1.763    5.341   3.83823254
## 165           L        1.527    2.157  56.00370885
## 166           L        1.982    2.923  36.09305508
## 167           L        1.875    4.359   0.06882312
## 168           L        1.890    6.340   0.25236593
## 169           L        2.277    6.037   1.07669372
## 170           L        1.648    5.127   1.28730252
## 171           L        2.453    5.178  18.38547702
## 172           L        4.374    7.607   4.78506639
## 173           L        3.775    6.709   6.48382769
## 174           L        2.963    6.629   7.82923518
## 175           L        2.642    4.595  24.06964091
## 176           L        2.385    5.413   6.96471458
## 177           L        3.030    4.794   5.67375886
## 178           L        2.977    6.953  11.79347044
## 179           L        3.971    6.916  25.80971660
## 180           L        3.162    6.733   6.00029704
## 181           L        2.911    6.126   8.60267711
## 182           L        3.107    5.147  10.80240917
## 183           L        3.691    5.390  33.06122449
## 184           L        2.939    5.633  14.27303391
## 185           L        3.300    7.230  18.25726141
## 186           L        2.886    5.893  34.29492618
## 187           L        3.375    6.912  13.94675926
## 188           L        2.870    4.840  10.59917355
## 189           L        3.403    6.239  15.98012502
## 190           L        3.013    7.053  13.05827308
## 191           L        2.502    6.042  22.04568024
ggplot(barnacle, aes(x = Tide.Height, y = WidthChange)) + geom_boxplot()

3. Notice that the chart has no color and that the axis labels are just the raw column names. We can add to our initial code to change both. Run the code chunk below:

ggplot(barnacle, aes(x = Tide.Height, y = WidthChange, fill = Tide.Height)) + geom_boxplot() + labs( x = "Tide Height", y = "Change in Width (mm)")

4. Notice that the figure now has proper axis titles and the boxes are colored. There is now also a figure legend. The legend title still shows the raw column name (we don’t want it to say Tide.Height!), and do we really need a legend at all if the x-axis already contains the info we need? Run the two code chunks below to see how we can address this: the first removes the legend, and the second keeps it but renames its title.

ggplot(barnacle, aes(x = Tide.Height, y = WidthChange, fill = Tide.Height)) +
  geom_boxplot() +
  labs(x = "Tide Height", y = "Change in Width (mm)") +
  theme(legend.position = "none")

ggplot(barnacle, aes(x = Tide.Height, y = WidthChange, fill = Tide.Height)) + geom_boxplot() + 
  labs( x = "Tide Height", y = "Change in Width (mm)", fill = "Tide Height") 

5. We’ve now seen ways to change the legend information. But what if we wanted different colors? What if we didn’t like the gray background? CHALLENGE: recreate the plot above, but change the colors of the boxes to ANYTHING YOU WANT (as long as there are 2 different colors), and give the plot a white background instead of a gray one!

ggplot(barnacle, aes(x = Tide.Height, y = WidthChange, fill = Tide.Height)) + geom_boxplot(alpha=0.3) + scale_fill_brewer(palette="Dark2") + theme_bw() +
  labs( x = "Tide Height", y = "Change in Width (mm)", fill = "Tide Height")

Using code from these two links will be helpful in accomplishing this!

https://r-graph-gallery.com/264-control-ggplot2-boxplot-colors.html

http://www.sthda.com/english/wiki/ggplot2-themes-and-background-colors-the-3-elements
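
The solution above uses a built-in palette and theme_bw(), but that is only one way to do it. Here is a sketch of another option that picks the two colors by hand with scale_fill_manual() and uses theme_classic() for a white background. The "toy" data frame is a stand-in I built from six rows of the barnacle printout so the example runs on its own, and the color names are arbitrary choices:

```r
library(ggplot2)

# Toy stand-in: rows 1-3 (H) and 48-50 (L) copied from the barnacle printout.
toy <- data.frame(
  Tide.Height = c("H", "H", "H", "L", "L", "L"),
  WidthChange = c(68.01916033, 18.65816768, 0.97719870,
                  3.46097916, 21.26979294, 34.46052443)
)

# Named vector: each tide height gets an explicitly chosen fill color.
p <- ggplot(toy, aes(x = Tide.Height, y = WidthChange, fill = Tide.Height)) +
  geom_boxplot() +
  scale_fill_manual(values = c(H = "tomato", L = "steelblue")) +
  theme_classic() +
  labs(x = "Tide Height", y = "Change in Width (mm)", fill = "Tide Height")

print(p)
```

To use it with the real data, swap toy for barnacle in the ggplot() call.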

PART 3: Practicing with barnacle_data_2.csv, scatter plots and lines of best fit

In R Lab 5, you all plotted the relationship between two size variables. Let’s see how we can use ggplot2 to make the scatter plot a bit nicer!

  1. With the “barnacle” data still in R, run the following code:
ggplot(barnacle, aes(x=OperculumAug, y=BasalAug)) + 
    geom_point()

  2. The scatter plot created above does a nice job of showing the data, but let’s look at some ways to make it nicer. Run the code below:
ggplot(barnacle, aes(x=OperculumAug, y=BasalAug)) + 
    geom_point(color = "darkblue", shape = 3, size = 12 )

3. The scatter plot above included code to change the shape, size, and color of the points! CHALLENGE: recreate the code above, but change the color to anything you want, the size to anything you want, and the shape to a filled-in square. Use the links below to look at color and shape options!

http://www.sthda.com/english/wiki/ggplot2-point-shapes

https://sape.inf.usi.ch/quick-reference/ggplot2/colour

ggplot(barnacle, aes(x=OperculumAug, y=BasalAug)) + 
    geom_point(color = "skyblue", shape = 15, size = 8 )

  4. When we created the scatter plot in R Lab 5, we wanted to add our least squares regression line to help describe the fit. We can add this using ggplot2 without ever needing to run the regression ourselves! Run the two code chunks below to see how this works:
ggplot(barnacle, aes(x=OperculumAug, y=BasalAug)) + 
    geom_point() +
  geom_smooth(method=lm , color="red", se=FALSE) 
## `geom_smooth()` using formula = 'y ~ x'

ggplot(barnacle, aes(x=OperculumAug, y=BasalAug)) + 
    geom_point() +
  geom_smooth(method=lm , color="red", fill="#69b3a2", se=TRUE) 
## `geom_smooth()` using formula = 'y ~ x'

  5. The plots above show that we can add the line, change its color, and even include a shaded confidence band around the line (se=TRUE) to visualize the standard error of the fit!

  6. CHALLENGE: create a scatter plot of OperculumAug vs. BasalAug, but change the background to white and change the axis labels so they just say “Operculum (mm)” and “Basal Width (mm)”. The plot should also include the line of best fit and SE.

ggplot(barnacle, aes(x=OperculumAug, y=BasalAug)) + theme_bw() +
    geom_point() + labs( x = "Operculum (mm)", y = "Basal Width (mm)") +
  geom_smooth(method=lm , color="red", fill="#69b3a2", se=TRUE) 
## `geom_smooth()` using formula = 'y ~ x'
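
The message `geom_smooth()` using formula = 'y ~ x' tells us which model is being fit behind the scenes. If you ever want the actual slope and intercept of that line, you can fit the same model yourself with lm(). Here is a small sketch; the "toy" data frame is a stand-in made from six rows of the barnacle printout so the code runs on its own, so its numbers will not match the full dataset:

```r
# Toy stand-in: rows 1-3 and 48-50 copied from the barnacle printout.
toy <- data.frame(
  OperculumAug = c(5.258, 5.615, 5.483, 3.470, 3.215, 3.957),
  BasalAug     = c(7.098, 9.256, 8.289, 10.315, 6.568, 6.979)
)

# Least squares regression of y (BasalAug) on x (OperculumAug),
# the same 'y ~ x' formula that geom_smooth(method = lm) reports.
fit <- lm(BasalAug ~ OperculumAug, data = toy)

# Intercept and slope of the fitted line.
print(coef(fit))
```

With the real data, replace toy with barnacle in the lm() call.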

PART 4: Practicing with barnacle_data_2.csv, bar graphs

What you may have noticed about the boxplot you created is that it is a little tough to see patterns and make comparisons, as the boxes get flattened by some of the outliers. To create a better visual, you may want to consider plotting the mean values as bars and adding error bars to help visualize variation.

Creating a barplot in ggplot2 (and in R) is a little bit trickier, as it involves a little data manipulation. NOTE: there are other ways to create a barplot with means; I am showing you one way.

  1. We need to calculate the mean and some measure of variation to plot! We used a function called aggregate() in the past to calculate summary statistics and store them in a data table; let’s do the same with the barnacle data. The code below will create two separate data tables, one for the mean and one for the standard deviation of WidthChange at each tide height:
mean_basal <- aggregate(barnacle$WidthChange, by = list(barnacle$Tide.Height), mean)
sd_basal <- aggregate(barnacle$WidthChange, by = list(barnacle$Tide.Height), sd)

  2. We’ve calculated the mean and SD for these data, but they exist in separate tables! It tends to be easier to work with data when they are part of the same data table, so the code below merges the two tables into a new data table. NOTE: if you recall, the column titles from the aggregate code show up as “Group.1” and “x”; I need to use “Group.1” as the shared column in the merge function below.

total_basal <- merge(mean_basal, sd_basal, by = "Group.1")
  3. The “total_basal” data table now has the combined data; the only issue is that the column names are not informative. You can certainly use the output from above without changing anything, but it may be easier to change the names before creating plots. The code below rewrites the column names:
colnames(total_basal) <- c("Tide.Height", "Mean", "SD")
  4. Let’s use this new dataset to create our barplot with error bars:
ggplot(total_basal) +
  geom_bar( aes(x=Tide.Height, y=Mean), stat="identity", fill="skyblue", alpha=0.5) +
  geom_errorbar( aes(x=Tide.Height, ymin=Mean-SD, ymax=Mean+SD), width=0.4, colour="orange", alpha=0.9, linewidth=1.5)

  5. With the graph above, it is much easier to see patterns/differences in the data. CHALLENGE: create a bar graph for these data that has thinner error bars, proper axis labels, a white background, and different colored bars!
ggplot(total_basal) +
  geom_bar( aes(x=Tide.Height, y=Mean, fill=Tide.Height), stat="identity", alpha=0.5) +
  scale_fill_manual(values = c("red", "green")) +
  theme_bw() +
  geom_errorbar( aes(x=Tide.Height, ymin=Mean-SD, ymax=Mean+SD), width=0.1, colour="black", alpha=0.9, linewidth=0.5) +
  labs( x = "Tide Height", y = "Mean Change in Width (mm)")
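
If the aggregate, merge, and rename steps in this part felt like a black box, it can help to run them on a tiny dataset and print the column names along the way. Here is a sketch using a "toy" data frame I built from four rows of the barnacle printout (the toy_ object names are just for illustration). Because both aggregate tables have a column called “x”, merge() adds suffixes to tell them apart, which is why renaming by position afterwards works:

```r
# Toy stand-in: rows 1, 2, 48, and 49 copied from the barnacle printout.
toy <- data.frame(
  Tide.Height = c("H", "H", "L", "L"),
  WidthChange = c(68.01916033, 18.65816768, 3.46097916, 21.26979294)
)

# Same calls as in the steps above, applied to the toy data.
toy_mean <- aggregate(toy$WidthChange, by = list(toy$Tide.Height), mean)
toy_sd   <- aggregate(toy$WidthChange, by = list(toy$Tide.Height), sd)
toy_total <- merge(toy_mean, toy_sd, by = "Group.1")

# The shared key keeps its name; the two clashing "x" columns get suffixes.
merged_names <- colnames(toy_total)
print(merged_names)

# Rename by position: key first, then mean (from toy_mean), then SD (from toy_sd).
colnames(toy_total) <- c("Tide.Height", "Mean", "SD")
print(toy_total)
```

The order of the renamed columns depends on the order of the tables in merge(): the table listed first supplies the first “x” column.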