BIOL 204L Spring 2026

Lab 2 - Bar Charts in ggplot2

There is an R script associated with these instructions in your section’s shared Posit Cloud Workspace. Please open this R script and complete the exercises as you work through the following material. The background information and examples include code that you can copy into your own R script. Make sure to #comment all of your code.

Uploading .csv Files as Data Frames

For your homework, you used a data frame that was built into R. All of the data that we collect in lab will need to be uploaded into R. Most often we will be compiling data into a shared Google Sheet. This sheet will need to be downloaded as a .csv file and then uploaded into R using the read.csv() function. As with object names in R, your file names should not contain spaces or special characters.

Installing and Loading Packages for Today: Step 1

You have already installed ggplot2 in your Posit Cloud packages library. Today we will be using another package called dplyr. We will need to install and load this package before we begin.

install.packages("dplyr") #install dplyr package
library(dplyr) #load dplyr
library(ggplot2) #load ggplot2
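
Installation normally only needs to be done once, whereas library() has to be run in every new R session. As an optional sketch (not part of the lab script), the check below installs dplyr only when it is missing. The helper name needs_install is made up for this illustration.

```r
# Hypothetical helper (the name is an assumption for this example):
# returns TRUE when a package is not yet installed
needs_install <- function(pkg) {
  !requireNamespace(pkg, quietly = TRUE)
}

# Install dplyr only if it is missing, so re-running the script is quick
if (needs_install("dplyr")) {
  install.packages("dplyr")
}
```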

dplyr is a widely used package designed to make manipulating and transforming tabular data faster, easier, and more intuitive. Today we will be using it to calculate the mean and standard error of our data in a format that is easy to plot in ggplot2. More information about this package, along with helpful resources such as a cheat sheet for its functions, can be found in the dplyr documentation.

Uploading Your Own Data: Step 2

For today, in the Files tab, where you have been accessing these assignments, there is a file called algae_absorbance_data.csv. To import it into R as a data frame, you can either click on the file and import it, or use the read.csv() function as shown here:

algae_absorbance_data <- read.csv("algae_absorbance_data.csv") #import the .csv file as a data frame

Now this data frame should be stored in your environment. Click on it to see how it is structured.

# View the data frame
algae_absorbance_data
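
Besides clicking on the data frame, a few base R functions give a quick summary of what was imported. The sketch below is self-contained: it writes a tiny made-up file (the file and the absorbance values are invented for illustration and are not lab data) and then inspects it. The column names Species and Absorbance are the ones used later in this lab.

```r
# Made-up example values, NOT the lab data
example_data <- data.frame(
  Species = c("Chlorella", "Chlorella", "Chlamydomonas", "Chlamydomonas"),
  Absorbance = c(0.80, 0.90, 0.50, 0.60)
)

# Write the example to a temporary .csv file, then read it back in
example_file <- tempfile(fileext = ".csv")
write.csv(example_data, example_file, row.names = FALSE)
imported <- read.csv(example_file)

head(imported)   # show the first rows
str(imported)    # show each column's name and data type
nrow(imported)   # count the rows (observations)
```

With your own data, you would run the last three lines on algae_absorbance_data instead of imported.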

Calculate the Mean and Standard Error to Plot: Step 3

For all of the bar plots that we create this semester, we will be adding error bars that represent the standard error (SE) of the mean. In order to add this data to our plots, we first need to calculate it and store it in a separate data frame that is formatted for easy plotting.
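
As a reminder of the calculation, the standard error is the standard deviation divided by the square root of the sample size. A minimal sketch with three made-up absorbance readings (not lab data):

```r
# Three made-up absorbance readings, for illustration only
readings <- c(0.4, 0.5, 0.6)

n <- length(readings)               # sample size
reading_mean <- mean(readings)      # mean of the readings
reading_sd <- sd(readings)          # standard deviation
reading_se <- reading_sd / sqrt(n)  # standard error of the mean

reading_mean
reading_se
```

For these readings the mean is 0.5 and the standard error is about 0.058.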

The group_by() function in dplyr can be very useful for calculating statistics for groups or categories of data, in this case species.

Think of group_by() as creating separate groups of your data. When we use group_by(Species), R separates all the Chlorella rows into one pile and all the Chlamydomonas rows into another pile. Then, when we use summarise(), R performs the calculations separately on each group and gives us one result per group. So we get one mean and one SE for Chlorella, and one mean and one SE for Chlamydomonas.
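
The "piles" idea can be sketched in base R with split(), which divides a column into one vector per group. This is only an illustration of the concept using made-up values. It is not how dplyr works internally, and the lab script uses group_by() and summarise() instead.

```r
# Made-up example values, NOT the lab data
example_data <- data.frame(
  Species = c("Chlorella", "Chlorella", "Chlorella",
              "Chlamydomonas", "Chlamydomonas", "Chlamydomonas"),
  Absorbance = c(0.7, 0.8, 0.9, 0.4, 0.5, 0.6)
)

# Separate the Absorbance values into one pile per species
piles <- split(example_data$Absorbance, example_data$Species)
piles

# Run the same calculation on each pile: one result per species
pile_means <- sapply(piles, mean)
pile_se <- sapply(piles, function(x) sd(x) / sqrt(length(x)))

pile_means
pile_se
```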

We are storing this data in a new data frame called summary_stats.

# Calculate means and SE for each species
summary_stats <- algae_absorbance_data %>%
  group_by(Species) %>%
  summarise(
    mean = mean(Absorbance),
    se = sd(Absorbance) / sqrt(length(Absorbance))
  )

Notice how we used %>% in the code above. This operator is called a pipe; it comes from the magrittr package and becomes available when you load dplyr. The pipe (%>%) means “and then.” It takes the result from the left side and feeds it into the function on the right side. It helps us write code that reads like a sentence.
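
To see that the pipe only changes how the code is written, not what it calculates, the sketch below runs the same calculation twice on made-up numbers: once with nested functions and once with pipes. It assumes dplyr is installed, as in Step 1.

```r
library(dplyr)  # loading dplyr makes the %>% pipe available

# Made-up example values, NOT the lab data
readings <- c(0.4, 0.5, 0.6)

# Nested version: read from the inside out
nested_result <- round(mean(readings), 1)

# Piped version: read left to right as "take readings, and then
# take the mean, and then round to 1 decimal place"
piped_result <- readings %>%
  mean() %>%
  round(1)

identical(nested_result, piped_result)  # both versions give the same answer
```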

Click on the newly created summary_stats data frame to see its contents.

# View the summary
summary_stats
# A tibble: 2 × 3
  Species        mean     se
  <chr>         <dbl>  <dbl>
1 Chlamydomonas 0.513 0.0808
2 Chlorella     0.853 0.0985

The numbers in this data frame should match the numbers you calculated last week.

Create a Bar Plot with Error Bars: Step 4

Now we can create a bar plot in ggplot2 using this data. Rather than using the geom_point() function, we will be using the geom_col() function. ggplot2 has a built-in function to create error bars: geom_errorbar(). We want our error bars to extend both above and below the mean, so we set the ymin and ymax arguments one standard error away from the mean (mean +/- se).

# Create bar plot with error bars
ggplot(summary_stats, aes(x = Species, y = mean, fill = Species)) +
  geom_col() +
  geom_errorbar(aes(ymin = mean - se, ymax = mean + se), 
                width = 0.2) +
  labs(y = "Absorbance", x = "Species") +
  theme_classic()
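
To check where the error bars should land, the limits can be calculated by hand. The sketch below retypes the means and standard errors from the summary_stats output shown in Step 3 into a plain data frame and computes mean - se and mean + se, the same values passed to ymin and ymax. The name check_stats and the columns lower and upper are made up for this illustration.

```r
# Values copied from the summary_stats output in Step 3
check_stats <- data.frame(
  Species = c("Chlamydomonas", "Chlorella"),
  mean = c(0.513, 0.853),
  se = c(0.0808, 0.0985)
)

# Bottom and top of each error bar
check_stats$lower <- check_stats$mean - check_stats$se  # ymin
check_stats$upper <- check_stats$mean + check_stats$se  # ymax

check_stats
```

On the plot, the Chlamydomonas error bar should run from about 0.43 to 0.59 and the Chlorella error bar from about 0.75 to 0.95.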

We used a lot of new code today, so don’t worry if it feels like a lot to absorb at once! Next week’s homework will be focused on code interpretation to help reinforce what each part does. For now, focus on getting the code to run and understanding the overall goal: calculating standard error and creating professional-looking plots with error bars.