Here are some helpful links for understanding the structure of coding in ggplot:
library(ggplot2)
In R Lab 5, you worked with the barnacle dataset to look at how tide height influenced size and how two size measurements related to one another. I had you graph these things with the basic visualization tools, but now let’s make them a bit fancier using ggplot2.
Please perform the tasks below:
1. Load the “barnacle_data_2.csv” file and call it “barnacle”.
2. Run the code chunk below. This will create a boxplot:
barnacle <- read.table('barnacle_data_2.csv', sep = ',', header = TRUE)
barnacle
## Tide.Height OperculumAug BasalAug WidthChange
## 1 H 5.258 7.098 68.01916033
## 2 H 5.615 9.256 18.65816768
## 3 H 5.483 8.289 0.97719870
## 4 H 5.065 11.578 6.36552081
## 5 H 5.229 9.563 12.21374046
## 6 H 5.645 11.291 7.41298379
## 7 H 6.367 8.750 1.06285714
## 8 H 5.740 11.798 6.32310561
## 9 H 5.714 9.890 8.24064712
## 10 H 3.821 6.072 45.25691700
## 11 H 3.642 6.293 1.43016050
## 12 H 3.971 8.611 3.79746835
## 13 H 5.323 8.826 26.50124632
## 14 H 4.708 7.784 16.00719424
## 15 H 3.472 6.029 22.64057058
## 16 H 3.235 5.521 42.25683753
## 17 H 3.881 5.820 30.20618557
## 18 H 1.848 3.368 89.60807601
## 19 H 4.865 6.001 40.19330112
## 20 H 5.073 5.813 12.98813005
## 21 H 2.857 3.483 7.69451622
## 22 H 4.771 8.944 12.94722719
## 23 H 4.422 6.262 16.33663366
## 24 H 4.384 7.715 3.30524951
## 25 H 4.898 6.371 3.32757809
## 26 H 4.722 6.874 7.53564155
## 27 H 4.261 7.039 14.98792442
## 28 H 4.604 7.817 9.49213253
## 29 H 4.558 8.892 16.64417454
## 30 H 5.303 8.467 1.29916145
## 31 H 4.723 9.030 6.36766334
## 32 H 3.758 9.147 8.54925112
## 33 H 4.908 8.703 32.64391589
## 34 H 4.361 9.779 19.20441763
## 35 H 4.348 10.218 15.02250930
## 36 H 5.249 10.126 15.40588584
## 37 H 3.956 6.552 11.75213675
## 38 H 4.457 8.149 18.33353786
## 39 H 4.164 8.804 15.54975011
## 40 H 4.088 9.363 0.58741856
## 41 H 3.656 8.790 5.77929465
## 42 H 3.755 8.400 5.83333333
## 43 H 4.026 8.218 8.42054028
## 44 H 3.663 5.287 94.15547570
## 45 H 4.284 6.053 30.81116802
## 46 H 4.672 9.284 7.71219302
## 47 H 3.911 4.594 30.40922943
## 48 L 3.470 10.315 3.46097916
## 49 L 3.215 6.568 21.26979294
## 50 L 3.957 6.979 34.46052443
## 51 L 3.936 7.290 21.38545953
## 52 L 3.585 7.513 32.31731665
## 53 L 3.778 7.208 37.19478357
## 54 L 4.316 10.376 13.97455667
## 55 L 2.288 5.088 55.09040881
## 56 L 2.874 6.361 20.40559660
## 57 L 4.518 9.353 23.69293275
## 58 L 4.056 6.387 97.96461563
## 59 L 3.256 4.375 32.96000000
## 60 L 4.436 7.321 36.64799891
## 61 L 3.643 7.522 68.67854294
## 62 L 4.566 6.641 68.31802439
## 63 L 4.213 9.063 14.17852808
## 64 L 4.324 7.733 19.28100349
## 65 L 3.185 5.127 58.31870490
## 66 L 3.725 6.541 44.10640575
## 67 L 4.116 9.121 25.60026313
## 68 L 3.921 6.626 70.55538787
## 69 L 4.979 6.601 10.37721557
## 70 L 2.364 6.376 6.80677541
## 71 L 4.557 9.280 19.11637931
## 72 L 4.576 7.057 42.39761939
## 73 L 3.583 7.816 26.30501535
## 74 L 3.693 6.758 42.14264575
## 75 L 3.823 7.797 19.72553546
## 76 L 4.175 6.908 46.56919514
## 77 L 3.328 6.613 34.59851807
## 78 L 3.654 5.344 55.03368263
## 79 L 3.686 7.859 8.57615473
## 80 L 3.533 5.623 37.86235106
## 81 L 3.686 9.099 9.42960765
## 82 L 4.994 6.844 66.17475161
## 83 L 3.100 7.648 26.15062762
## 84 L 3.083 5.081 26.51052942
## 85 L 2.828 5.793 47.00500604
## 86 L 3.687 6.926 38.75252671
## 87 L 4.464 8.429 12.43326611
## 88 L 4.864 8.745 24.64265294
## 89 L 2.037 3.750 108.72000000
## 90 L 5.148 6.868 24.24286546
## 91 L 4.257 5.239 87.57396450
## 92 L 3.977 5.342 53.80007488
## 93 L 4.934 7.455 73.72233400
## 94 L 4.966 5.677 19.88726440
## 95 L 4.376 5.645 22.90522586
## 96 L 4.022 6.343 7.11020022
## 97 H 2.384 6.582 5.49984807
## 98 H 3.128 5.800 18.65517241
## 99 H 3.553 6.453 9.29800093
## 100 H 2.941 7.115 3.17638791
## 101 H 3.146 6.465 5.35189482
## 102 H 2.529 6.094 9.92779783
## 103 H 2.915 4.429 4.01896591
## 104 H 3.541 6.797 2.16271885
## 105 H 3.425 3.791 18.93959377
## 106 H 4.016 7.637 12.89773471
## 107 H 3.388 7.166 2.73513815
## 108 H 3.244 6.179 11.02120084
## 109 H 4.471 8.479 4.03349452
## 110 H 2.721 5.915 4.64919696
## 111 H 3.434 6.937 12.90183076
## 112 H 3.390 5.634 1.77493788
## 113 H 3.811 8.301 0.69871100
## 114 H 3.058 6.829 6.97027383
## 115 H 2.996 7.893 5.38451793
## 116 H 4.033 10.004 0.58976409
## 117 H 3.271 7.576 0.19799366
## 118 H 2.459 5.039 10.00198452
## 119 H 2.848 5.411 7.54019590
## 120 H 2.355 4.924 9.44354184
## 121 H 3.226 4.617 4.80831709
## 122 H 2.908 5.836 0.97669637
## 123 H 3.438 6.758 8.34566440
## 124 H 3.102 7.255 3.37698139
## 125 H 2.905 5.495 6.31483166
## 126 H 2.899 4.488 32.10784314
## 127 H 3.664 4.989 6.41411104
## 128 H 3.953 5.095 10.08832188
## 129 H 3.598 4.718 9.11403137
## 130 H 3.712 5.000 7.10000000
## 131 H 3.676 5.525 8.30769231
## 132 H 3.814 4.937 6.94753899
## 133 H 4.335 6.064 8.04749340
## 134 H 2.721 5.204 2.01767871
## 135 H 3.482 6.168 10.68417639
## 136 H 3.303 7.058 1.50184188
## 137 H 3.404 7.095 2.42424242
## 138 H 2.901 5.130 4.69785575
## 139 H 3.687 5.945 13.45668629
## 140 H 3.338 6.304 14.87944162
## 141 H 3.770 5.416 4.33899557
## 142 H 2.937 6.022 1.77681833
## 143 H 2.711 6.399 1.40646976
## 144 L 2.469 5.757 22.14695154
## 145 L 2.647 5.063 3.00217263
## 146 L 2.806 4.918 12.99308662
## 147 L 2.790 4.488 28.14171123
## 148 L 2.245 5.398 11.98592071
## 149 L 2.118 4.392 32.74134791
## 150 L 2.735 5.837 12.78053795
## 151 L 1.824 4.437 41.04124408
## 152 L 3.062 5.167 1.43216567
## 153 L 2.146 4.936 14.12074554
## 154 L 1.572 3.365 53.87815750
## 155 L 1.839 3.950 23.69620253
## 156 L 0.533 0.815 447.36196320
## 157 L 2.175 4.860 26.37860082
## 158 L 1.954 4.473 22.11044042
## 159 L 1.974 5.062 20.20940340
## 160 L 1.667 4.271 23.46054788
## 161 L 1.476 4.098 26.32991703
## 162 L 1.771 5.867 22.75438896
## 163 L 1.698 4.231 17.08815883
## 164 L 1.763 5.341 3.83823254
## 165 L 1.527 2.157 56.00370885
## 166 L 1.982 2.923 36.09305508
## 167 L 1.875 4.359 0.06882312
## 168 L 1.890 6.340 0.25236593
## 169 L 2.277 6.037 1.07669372
## 170 L 1.648 5.127 1.28730252
## 171 L 2.453 5.178 18.38547702
## 172 L 4.374 7.607 4.78506639
## 173 L 3.775 6.709 6.48382769
## 174 L 2.963 6.629 7.82923518
## 175 L 2.642 4.595 24.06964091
## 176 L 2.385 5.413 6.96471458
## 177 L 3.030 4.794 5.67375886
## 178 L 2.977 6.953 11.79347044
## 179 L 3.971 6.916 25.80971660
## 180 L 3.162 6.733 6.00029704
## 181 L 2.911 6.126 8.60267711
## 182 L 3.107 5.147 10.80240917
## 183 L 3.691 5.390 33.06122449
## 184 L 2.939 5.633 14.27303391
## 185 L 3.300 7.230 18.25726141
## 186 L 2.886 5.893 34.29492618
## 187 L 3.375 6.912 13.94675926
## 188 L 2.870 4.840 10.59917355
## 189 L 3.403 6.239 15.98012502
## 190 L 3.013 7.053 13.05827308
## 191 L 2.502 6.042 22.04568024
ggplot(barnacle, aes(x = Tide.Height, y = WidthChange)) + geom_boxplot()
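Before moving on, it may help to see how a ggplot call is put together piece by piece. The sketch below is only an illustration: `toy` is a hypothetical stand-in built from six rows of the printout above (rows 1-3 and 48-50, with WidthChange rounded to three decimals), and the object names `p`, `p_box`, and `p_pts` are made up for this example.

```r
library(ggplot2)

# Hypothetical stand-in: six rows copied from the printed barnacle data
# (rows 1-3 and 48-50), WidthChange rounded to three decimals
toy <- data.frame(
  Tide.Height = c("H", "H", "H", "L", "L", "L"),
  WidthChange = c(68.019, 18.658, 0.977, 3.461, 21.270, 34.461)
)

# Piece 1: the data plus the aesthetic mapping (which column goes on which axis)
p <- ggplot(toy, aes(x = Tide.Height, y = WidthChange))

# Piece 2: a geom, which decides how the mapped data get drawn
p_box <- p + geom_boxplot()

# The same base object can take a different geom without repeating the mapping
p_pts <- p + geom_point()

p_box
```

Everything after the first `+` is a layer or setting added on top of the base plot, which is why the later steps can keep adding `labs()`, `theme()`, and so on to the same line of code.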
3. Notice that the chart has no color and that the axis labels are just
the raw column names. We can add to our initial code to change these
things. Run the code chunk below:
ggplot(barnacle, aes(x = Tide.Height, y = WidthChange, fill = Tide.Height)) + geom_boxplot() + labs( x = "Tide Height", y = "Change in Width (mm)")
4. Notice how the figure now has proper titles for the axes and the
boxes are colored. One thing to note: there is now a figure legend. The
legend title still shows the raw column name (we don’t want it to say
Tide.Height!). Also, do we really need a figure legend if the x-axis
already contains the info we need? Run the two code chunks below to see
how we can address this issue. The first removes the legend; the second
keeps it but gives it a proper title.
ggplot(barnacle, aes(x = Tide.Height, y = WidthChange, fill = Tide.Height)) + geom_boxplot() +
labs( x = "Tide Height", y = "Change in Width (mm)") +
theme(legend.position="none")
ggplot(barnacle, aes(x = Tide.Height, y = WidthChange, fill = Tide.Height)) + geom_boxplot() +
labs( x = "Tide Height", y = "Change in Width (mm)", fill = "Tide Height")
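For reference, `theme(legend.position="none")` is not the only way to drop a legend. The sketch below shows two other options that exist in ggplot2: `guides(fill = "none")` and the `show.legend` argument of a geom. The `toy` data frame is a hypothetical stand-in built from rows 1-3 and 48-50 of the printout above (WidthChange rounded), and `base`, `p1`, and `p2` are names made up for this example.

```r
library(ggplot2)

# Hypothetical stand-in: six rows copied from the printed barnacle data
toy <- data.frame(
  Tide.Height = c("H", "H", "H", "L", "L", "L"),
  WidthChange = c(68.019, 18.658, 0.977, 3.461, 21.270, 34.461)
)

base <- ggplot(toy, aes(x = Tide.Height, y = WidthChange, fill = Tide.Height)) +
  labs(x = "Tide Height", y = "Change in Width (mm)")

# Option 1: switch off the legend for the fill aesthetic only
p1 <- base + geom_boxplot() + guides(fill = "none")

# Option 2: tell this one layer not to contribute to any legend
p2 <- base + geom_boxplot(show.legend = FALSE)

p1
```

Option 1 is handy when a plot has several legends and you only want to remove one of them; `theme(legend.position="none")` removes them all.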
5. We’ve now seen ways in which we can change the legend information.
But what if we wanted different colors? What if we didn’t like having
the gray in the background? CHALLENGE: recreate the plot above but
change the color of the boxes to ANYTHING YOU WANT (as long as there are
2 different colors). Also, create the plot with a white background, not
a gray one!
ggplot(barnacle, aes(x = Tide.Height, y = WidthChange, fill = Tide.Height)) + geom_boxplot(alpha=0.3) + scale_fill_brewer(palette="Dark2") + theme_bw() +
labs( x = "Tide Height", y = "Change in Width (mm)", fill = "Tide Height")
Using code from these two links will be helpful in accomplishing this! https://r-graph-gallery.com/264-control-ggplot2-boxplot-colors.html http://www.sthda.com/english/wiki/ggplot2-themes-and-background-colors-the-3-elements
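The answer above uses a ready-made palette. If you would rather pick the two colors yourself, one possible approach is `scale_fill_manual()`, sketched below. The colors "orchid" and "goldenrod" are arbitrary choices, `theme_classic()` is just one of several white-background themes, and `toy` is a hypothetical stand-in built from rows 1-3 and 48-50 of the printout above.

```r
library(ggplot2)

# Hypothetical stand-in: six rows copied from the printed barnacle data
toy <- data.frame(
  Tide.Height = c("H", "H", "H", "L", "L", "L"),
  WidthChange = c(68.019, 18.658, 0.977, 3.461, 21.270, 34.461)
)

p <- ggplot(toy, aes(x = Tide.Height, y = WidthChange, fill = Tide.Height)) +
  geom_boxplot() +
  # Naming the vector ties each color to a specific group, whatever the order
  scale_fill_manual(values = c(H = "orchid", L = "goldenrod")) +
  # White background with axis lines and no grid
  theme_classic() +
  labs(x = "Tide Height", y = "Change in Width (mm)", fill = "Tide Height")

p
```

Running `colors()` in the console lists every color name that R recognizes.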
In R Lab 5, you all plotted the relationship between two size variables. Let’s see how we can use ggplot2 to make the scatter plot a bit nicer!
ggplot(barnacle, aes(x=OperculumAug, y=BasalAug)) +
geom_point()
ggplot(barnacle, aes(x=OperculumAug, y=BasalAug)) +
geom_point(color = "darkblue", shape = 3, size = 12 )
3. The scatter plot above included code to change the shape of the
points, the size of the points, and the color of the points! Challenge:
recreate the code above but change the color to anything you want, the
size to anything you want, and the shape to a filled-in square. Use the
links below to look at color and shape options!
http://www.sthda.com/english/wiki/ggplot2-point-shapes
https://sape.inf.usi.ch/quick-reference/ggplot2/colour
ggplot(barnacle, aes(x=OperculumAug, y=BasalAug)) +
geom_point(color = "skyblue", shape = 15, size = 8 )
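One detail about point shapes that is easy to miss: shapes 0-20 only respond to `color`, while shapes 21-25 have both an outline (`color`) and an inside (`fill`). The sketch below uses shape 22, a square with a separate border and fill. The `toy` data frame is a hypothetical stand-in built from the first six rows of the printout above, and the color choices are arbitrary.

```r
library(ggplot2)

# Hypothetical stand-in: first six rows of the printed barnacle data
toy <- data.frame(
  OperculumAug = c(5.258, 5.615, 5.483, 5.065, 5.229, 5.645),
  BasalAug     = c(7.098, 9.256, 8.289, 11.578, 9.563, 11.291)
)

p <- ggplot(toy, aes(x = OperculumAug, y = BasalAug)) +
  # shape 22 = square with an outline (color) and an inside (fill);
  # stroke sets the thickness of the outline
  geom_point(shape = 22, color = "black", fill = "tomato", size = 4, stroke = 1)

p
```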
ggplot(barnacle, aes(x=OperculumAug, y=BasalAug)) +
geom_point() +
geom_smooth(method=lm , color="red", se=FALSE)
## `geom_smooth()` using formula = 'y ~ x'
ggplot(barnacle, aes(x=OperculumAug, y=BasalAug)) +
geom_point() +
geom_smooth(method=lm , color="red", fill="#69b3a2", se=TRUE)
## `geom_smooth()` using formula = 'y ~ x'
The plots above show that we can add a line of best fit, change its color, and even have the option of including a standard error band (the shaded region around the line)!
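If you are curious what `geom_smooth(method=lm)` is doing behind the scenes, the message `using formula = 'y ~ x'` is the hint: it fits an ordinary linear model of y on x. The sketch below does roughly the same thing by hand with base R. It assumes the default settings of `geom_smooth()`, where the shaded band drawn by `se=TRUE` is a 95% confidence interval around the fitted line. The `toy` data frame is a hypothetical stand-in built from the first six rows of the printout above, and `new_x` holds two arbitrary x values chosen for illustration.

```r
# Hypothetical stand-in: first six rows of the printed barnacle data
toy <- data.frame(
  OperculumAug = c(5.258, 5.615, 5.483, 5.065, 5.229, 5.645),
  BasalAug     = c(7.098, 9.256, 8.289, 11.578, 9.563, 11.291)
)

# Same model form that geom_smooth(method = lm) reports: y ~ x
fit <- lm(BasalAug ~ OperculumAug, data = toy)

# Intercept and slope of the line of best fit
coef(fit)

# Fitted values with lower/upper confidence limits at two arbitrary x values
new_x <- data.frame(OperculumAug = c(5.2, 5.6))
band <- predict(fit, newdata = new_x, interval = "confidence", level = 0.95)
band
```

With only six rows the band will be very wide; on the full barnacle data it is much tighter.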
CHALLENGE: create a scatter plot for OperculumAug vs. BasalAug, but change the background to white and change the axis labels so they just say “Operculum (mm)” and “Basal Width (mm)”. The plot should also include the line of best fit and SE.
ggplot(barnacle, aes(x=OperculumAug, y=BasalAug)) + theme_bw() +
geom_point() + labs( x = "Operculum (mm)", y = "Basal Width (mm)") +
geom_smooth(method=lm , color="red", fill="#69b3a2", se=TRUE)
## `geom_smooth()` using formula = 'y ~ x'
You may have noticed that in the boxplot you created it is a little tough to see patterns and make comparisons, because a few outliers stretch the y-axis and flatten the boxes. To create a better visual, you may want to consider plotting the mean values as bars and adding error bars to help visualize variation.
Creating a barplot of means in ggplot (and in R) is a little bit trickier, as it involves some data manipulation first. NOTE: there are other ways to create a barplot with means; I am showing you one way.
mean_basal <- aggregate(barnacle$WidthChange, by = list(barnacle$Tide.Height), mean)
sd_basal <- aggregate(barnacle$WidthChange, by = list(barnacle$Tide.Height), sd)
2. We’ve calculated the mean and SD for these data, but they exist in separate tables! It tends to be easier to work with data when they are part of the same data table, so the code below merges the two tables into a new one. NOTE: if you recall, the column titles from the aggregate code show up as “Group.1” and “x”. I use “Group.1” as the key in the merge function below and then rename all three columns.
total_basal <- merge(mean_basal, sd_basal, by = "Group.1")
colnames(total_basal) <- c("Tide.Height", "Mean", "SD")
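To see why the rename step is needed, here is a small sketch of what the column names look like at each stage. The `toy` data frame is a hypothetical stand-in built from rows 1-3 and 48-50 of the printout above (WidthChange rounded), and `mean_tbl`, `sd_tbl`, `merged`, and `mean_named` are names made up for this example.

```r
# Hypothetical stand-in: six rows copied from the printed barnacle data
toy <- data.frame(
  Tide.Height = c("H", "H", "H", "L", "L", "L"),
  WidthChange = c(68.019, 18.658, 0.977, 3.461, 21.270, 34.461)
)

mean_tbl <- aggregate(toy$WidthChange, by = list(toy$Tide.Height), mean)
sd_tbl   <- aggregate(toy$WidthChange, by = list(toy$Tide.Height), sd)

# aggregate() gives generic names when the inputs are unnamed
names(mean_tbl)   # "Group.1" "x"

# Both tables have a column called "x", so merge() adds suffixes to tell them apart
merged <- merge(mean_tbl, sd_tbl, by = "Group.1")
names(merged)     # "Group.1" "x.x" "x.y"

# Naming the grouping list keeps a readable group column from the start
mean_named <- aggregate(toy$WidthChange,
                        by = list(Tide.Height = toy$Tide.Height), mean)
names(mean_named) # "Tide.Height" "x"
```

The names “x.x” and “x.y” are not very informative, which is why the `colnames()` line replaces them with “Mean” and “SD”.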
ggplot(total_basal) +
geom_bar( aes(x=Tide.Height, y=Mean), stat="identity", fill="skyblue", alpha=0.5) +
geom_errorbar( aes(x=Tide.Height, ymin=Mean-SD, ymax=Mean+SD), width=0.4, colour="orange", alpha=0.9, linewidth=1.5)
ggplot(total_basal) +
geom_bar( aes(x=Tide.Height, y=Mean, fill=Tide.Height), stat="identity", alpha=0.5) + scale_fill_manual(values = c("red", "green")) + theme_bw() +
geom_errorbar( aes(x=Tide.Height, ymin=Mean-SD, ymax=Mean+SD), width=0.1, colour="black", alpha=0.9, linewidth=1.5) +
labs( x = "Tide Height", y = "Mean Change in Width (mm)")
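As mentioned earlier, this is only one way to build a barplot of means. Another possible route, sketched below, lets ggplot compute the summaries itself with `stat_summary()`, so no separate mean and SD tables are needed. This is an alternative technique, not the method used in the steps above. The `toy` data frame is a hypothetical stand-in built from rows 1-3 and 48-50 of the printout above (WidthChange rounded), and the colors and widths are arbitrary.

```r
library(ggplot2)

# Hypothetical stand-in: six rows copied from the printed barnacle data
toy <- data.frame(
  Tide.Height = c("H", "H", "H", "L", "L", "L"),
  WidthChange = c(68.019, 18.658, 0.977, 3.461, 21.270, 34.461)
)

p <- ggplot(toy, aes(x = Tide.Height, y = WidthChange)) +
  # Bars: height is the mean of WidthChange within each tide height
  stat_summary(fun = mean, geom = "bar", fill = "skyblue", alpha = 0.5) +
  # Error bars: mean minus SD up to mean plus SD, computed per group
  stat_summary(fun = mean,
               fun.min = function(x) mean(x) - sd(x),
               fun.max = function(x) mean(x) + sd(x),
               geom = "errorbar", width = 0.2) +
  theme_bw() +
  labs(x = "Tide Height", y = "Mean Change in Width (mm)")

p
```

The trade-off is that the means and SDs never appear as a table you can inspect, so the aggregate-and-merge approach is still useful when you want to check the numbers.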