Today, you’ll learn how to get started with R Projects - a powerful tool for keeping your work organized and reproducible. As your projects grow, with multiple inputs, scripts, and outputs, managing files can become overwhelming.
R Projects create a structured, self-contained working directory with consistent file paths, environments, and settings. Starting with R Projects from the outset helps you stay organized and work efficiently as your research progresses.
👎 What’s wrong with this?
🤷🏻♀️ Any positives?
Put simply, an R Project is just a folder (directory) on your
machine that organizes all your files for a specific project.
In RStudio…
This will open a new R session and set your working directory to the
folder that you just created. You can use the getwd()
function to double check that it’s set correctly.
getwd()
## [1] "/Users/ceciliacerrilla/My Drive/Employment/Freelance ecology/2025/UCT/Data clinic/R Projects/Intro-to-R-Projects"
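As an optional extra check, you can confirm that the working directory really is your project folder by looking for the .Rproj file in it. This is a minimal sketch, assuming the .Rproj file sits at the top level of the project folder; the messages are just illustrative.

```r
# List any .Rproj files in the current working directory.
rproj_files <- list.files(path = getwd(), pattern = "\\.Rproj$")

if (length(rproj_files) > 0) {
  message("Working directory contains project file: ", rproj_files[1])
} else {
  message("No .Rproj file found in ", getwd(),
          "; you may not be inside an R Project.")
}
```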
Now it’s time to set up your file management system outside
of R to keep your workflow neat and tidy moving forward.
How you choose to organize your files is entirely up to you, but
I’ll show you how I do it. You may find it useful to start with this
setup as a template and adjust it to suit your needs.
Within my R project folder, I have the following subfolders:
And the following file:
R_project_name.Rproj
^ This is automatically generated when you create an R project. To open an R project, simply double-click this file.
You can create these folders just like you would any other folders on your machine (you do not need to do this within R).
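If you would rather create the folders from R, something like the sketch below should work. The helper name create_project_folders() is hypothetical, and the folder names are assumptions based on the input, scripts, and output folders described in this section; rename them to match your own setup.

```r
# Hypothetical helper: create the project subfolders under `root`.
# The default folder names are assumptions; change them to suit your project.
create_project_folders <- function(root = ".",
                                   folders = c("input", "scripts", "output")) {
  paths <- file.path(root, folders)
  for (p in paths) {
    # showWarnings = FALSE keeps this quiet if the folder already exists.
    dir.create(p, recursive = TRUE, showWarnings = FALSE)
  }
  # Report whether each folder now exists, named by folder.
  stats::setNames(dir.exists(paths), folders)
}

# Try it in a throwaway directory first.
# Inside a real R Project you would call create_project_folders(".").
demo_root <- file.path(tempdir(), "demo_project")
create_project_folders(demo_root)
```

Because existing folders are left untouched, it is safe to run this more than once.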
This is where all of my data goes. In my case, it’s Excel files. When you want to load a particular dataset in RStudio, you will know the exact file path to call, because all of your data will be stored in the input folder within your R project folder (directory).
This is how you would access data from your input folder:
# df_name <- read_excel("input/data_file_name.xlsx")
Note: this assumes the readxl package is installed and loaded.
If it isn’t, install it, then run this code (without the #):
# library(readxl)
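The checks described above can also be done in code before you try to read anything. This is a rough sketch rather than a required step: data_file_name.xlsx is the same placeholder file name used above, and the messages are only suggestions.

```r
# Check whether readxl is installed, without loading it.
has_readxl <- requireNamespace("readxl", quietly = TRUE)
if (!has_readxl) {
  message("readxl is not installed; run install.packages(\"readxl\") first.")
}

# Build the path relative to the project folder.
# data_file_name.xlsx is a placeholder, not a real file.
data_path <- file.path("input", "data_file_name.xlsx")

if (has_readxl && file.exists(data_path)) {
  df_name <- readxl::read_excel(data_path)
} else {
  message("Nothing read: check that readxl is installed and that ",
          data_path, " exists relative to ", getwd())
}
```

Using file.path() instead of typing the slashes yourself is optional here; both give the same relative path.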
This is where all of your R scripts are stored.
This is where your exported output files will be stored. In my case, my only output files were figures (PDF and JPG files). However, you might be exporting manipulated datasets, tables, or other types of output files, and you can house them here as well.
Example: PhD Chapter 5 R Project
Now that we know how to create and manage R Projects, let’s dive
into working with R scripts.
Whenever I start a new R script, I include a script header with important information about the script. Below is an example of what a header on one of my scripts looks like. Feel free to use this as a template and adjust it as needed.
## TITLE: 05_Returns_Data-prep
## PURPOSE: Prepare PIT tag antenna data for analysis & produce summary stats
## AUTHOR: Cecilia Cerrilla (cecilia.cerrilla@gmail.com)
## DATE STARTED: 6-Dec-2022
## DATE LAST UPDATED: 19-Sep-2024
## Libraries
# library(tidyverse)
# library(readxl)
# library(ggpubr)
# library(janitor)
# library(lubridate)
# library(ggplot2)
# library(openxlsx)
## Master data files
# releases <- read_excel("input/PIT_antenna_hits.xlsx")
## ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Note: the read_excel() function used under ## Master data files works because the readxl package is loaded in the ## Libraries section, with the library(readxl) call.
Let’s finish by looking at how to export files to the correct folders. You’ve already seen how to access files in your input folder by specifying the correct file path (refer to the ## Master data files part of the script header above). Now, let’s do the reverse: export files to the appropriate directories.
By saving outputs (such as cleaned datasets, figures, or reports) in designated folders, you maintain an organized workflow and ensure your files are easy to locate.
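For example, exporting a cleaned dataset might look something like the sketch below. It uses base R's write.csv() rather than a Tidyverse function so that it runs without extra packages. The output/data subfolder, the cleaned_cars name, and the file name are all hypothetical, and the built-in mtcars data frame stands in for your own data.

```r
# Hypothetical subfolder for exported datasets; create it if it is missing.
data_out_dir <- file.path("output", "data")
dir.create(data_out_dir, recursive = TRUE, showWarnings = FALSE)

# Stand-in for a cleaned dataset: only the 6-cylinder cars from mtcars.
cleaned_cars <- mtcars[mtcars$cyl == 6, ]

# Write the file into the output folder, keeping the car names (row names).
out_file <- file.path(data_out_dir, "cleaned_cars.csv")
write.csv(cleaned_cars, file = out_file, row.names = TRUE)

# Confirm that the file was written.
file.exists(out_file)
```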
Sidenote: It’s up to you which packages you use to
export files, but I like to do most of my coding within the
Tidyverse. The Tidyverse is a collection of R packages
designed for data science, emphasizing a consistent and streamlined
approach to data manipulation, visualization, and analysis. It provides
a coherent set of tools that follow a shared philosophy, making it
easier to work with data in an intuitive and readable way. At its core,
the Tidyverse is built around the idea of tidy data, where each
variable is in its own column, each observation is in its own row, and
each value has its own cell. This structure makes data easier to
manipulate and visualize. When you load the Tidyverse with
library(tidyverse), you get access to several useful packages.
My favourites include:
ggplot2 – for data visualization
dplyr – for data manipulation (e.g., filtering, summarizing, grouping)
Next, I’ll show you how to export figures using the ggsave()
function, which is part of the ggplot2 package (a key component
of the Tidyverse).
Load the ggplot2 package if you haven’t already done so in your script header:
library(ggplot2)
Let’s use the mtcars built-in data frame from R for our example. No need to load it as it’s built-in. First, let’s take a look at what the data look like:
head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
Now, let’s create a scatter plot to visualize the relationship between fuel efficiency (mpg) and horsepower (hp). Name this plot “cars_scatterplot”, and view the plot.
cars_scatterplot <- ggplot(mtcars, aes(x = mpg, y = hp)) +
  geom_point()
cars_scatterplot # View the plot
Finally, save the plot using the ggsave() function.
#ggsave(plot = cars_scatterplot,
#filename = "01_cars_mpg_hp_scatterplot.pdf",
#path = "output/figures/cars",
#width = 30, height = 22, units = "cm")
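Depending on your ggplot2 version, ggsave() may not create a missing folder for you, so it is safest to make sure the folder in path exists before saving. The sketch below creates output/figures/cars if needed and then saves the same plot. The fig_dir variable is my own naming, and the ggplot2 part is skipped if the package is not installed.

```r
# Folder for the figure, matching the path used in the ggsave() call above.
fig_dir <- file.path("output", "figures", "cars")

# recursive = TRUE also creates any missing parent folders
# (output and output/figures).
dir.create(fig_dir, recursive = TRUE, showWarnings = FALSE)

# Only attempt the save if ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  cars_scatterplot <- ggplot(mtcars, aes(x = mpg, y = hp)) +
    geom_point()

  ggsave(plot = cars_scatterplot,
         filename = "01_cars_mpg_hp_scatterplot.pdf",
         path = fig_dir,
         width = 30, height = 22, units = "cm")
}
```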
Understanding the ggsave() function and its arguments:
The plot argument specifies which plot to save.
The filename argument sets the name of the saved file. The file type (e.g., .pdf, .jpg, .png) is determined by the suffix you choose.
The path argument specifies where your figure will be saved. Make sure it’s directed to the correct folder (or subfolder).
The width, height, and units arguments control the dimensions of your plot. Feel free to play around with these until you end up with your desired size.