Collaborative Coding and Version Control in R with GitHub: A Zero to Hero Guide

Author

George K Agyen

Published

August 1, 2025

Modified

September 19, 2025

Introduction

Why Version Control Matters in R projects

Imagine working on an R script, analysing some very complex data. You spend days cleaning the data, building models and creating elegant visualisations. Then one small change breaks everything, and you cannot remember what the code looked like two days ago.

Now imagine another scenario where three other people are working on the same project, each making changes simultaneously. Without coordination, chaos ensues.

🔁 What is Version Control?

Version control is like a “time machine” for your code. It tracks every change you make, lets you revert to any previous version, and allows multiple people to collaborate without overwriting each other’s work.

Think of it as the version history of a Word document on OneDrive, but for code and far more powerful.

🛠️ Git + GitHub: The Dynamic Duo

Git: A version control system installed on your computer. It tracks changes to your files locally.

GitHub: A cloud platform that hosts your Git repositories (repos), enabling collaboration, backup and sharing.

When combined with RStudio, the most popular IDE for R, Git and GitHub become seamless tools for collaborative data science.

💡 Why use Git & GitHub with R

✅Track changes in scripts, data, and reports
✅Collaborate with team-mates without file conflicts
✅Reproduce results by revisiting earlier versions
✅Share packages or research openly

In this tutorial, you will go from zero experience to confidently managing team-based R projects using Git and GitHub, all within RStudio.

Prerequisites & Setup

Before diving into version control, we need to set up our environment.

Note

“🧰 You’ll need:”

  • A GitHub account (free at github.com )
  • Administrator access to install software
  • Internet connection

1️⃣ Step 1: Install Git

Git must be installed so RStudio can communicate with GitHub.

🔽Download and Install Git

  1. Go to https://git-scm.com

  2. Download Git for your OS (Windows, Mac or Linux)

  3. Run the installer:

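    • On Windows: Choose the option that adds Git to PATH (older installers label it ‘Use Git from the Windows Command Prompt’; recent ones label it ‘Git from the command line and also from 3rd-party software’)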
    • On macOS: Install Command Line Tools via Terminal (xcode-select --install) or download GUI installer

🔐Verify Installation

Open Terminal (macOS/Linux) or Command Prompt/PowerShell (Windows) and run

git --version

✅Expected output: git version 2.x.x

❌If not found: Reinstall Git and ensure ‘Add to PATH’ is selected.

2️⃣ Step 2: Install R & RStudio

  1. Download R from https://cran.r-project.org
  2. Download and install RStudio Desktop from https://posit.co/download/rstudio-desktop/

Install them in that order: R first, then RStudio.

🔍Verify RStudio Git Integration

Open RStudio, then go to Tools > Global Options > Git/SVN.

✅You should see:

  • “Git executable” pointing to a path (e.g., /usr/bin/git or C:\Program Files\Git\bin\git.exe)
  • A ticked check box for “Enable version control interface…”

If either is missing, reinstall Git or set the path to the Git executable manually.
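
If you prefer to check from the R console, here is a minimal sketch using only base R. It assumes Git is discoverable on your PATH, which is not necessarily the same path RStudio is configured to use.

```r
# Locate the git executable on the PATH; an empty string means it was not found
git_path <- unname(Sys.which("git"))

if (nzchar(git_path)) {
  # Ask Git to report its version, e.g. "git version 2.x.x"
  git_version <- system2(git_path, "--version", stdout = TRUE)
  message("Found ", git_version, " at ", git_path)
} else {
  message("Git was not found on the PATH; reinstall it or add it to the PATH.")
}
```

If this reports a path, you can paste it into the “Git executable” box when RStudio has not detected Git on its own.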

3️⃣ Step 3: Install Essential R Packages

We shall use modern R tools to simplify Git/GitHub setup.

In RStudio console:

# install required packages

install.packages(c("usethis", "devtools", "gert", "credentials", "gitcreds"))

These packages help automate setting up Git/GitHub version control:

  • usethis: Automates common project tasks
  • gert: Lightweight git interface in R
  • credentials & gitcreds: Handles authentication securely
  • devtools: For advanced package workflows

Tip

“Restart R session after installing packages”
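
After restarting, you may want to confirm that everything installed correctly. The following is a small base R sketch, not part of any of the packages above; the object names are only illustrative.

```r
# Packages this tutorial relies on
pkgs <- c("usethis", "devtools", "gert", "credentials", "gitcreds")

# TRUE for each package whose namespace can be loaded
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

if (length(missing_pkgs) > 0) {
  message("Still missing: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All required packages are available.")
}
```

Any package listed as missing can be installed again with install.packages().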

Initial Git & GitHub Configuration

Now that you have installed and verified Git, R and RStudio, let’s configure Git globally and authenticate with GitHub.

👨🏽‍💼 Set Your Identity

Git needs your name and email to label commits. Use usethis to set these globally:

library(usethis)

use_git_config(
  user.name = "Your Name",
  user.email = "your.email@example.com"
)

Replace these with your real name and the email linked to your GitHub account.

Tip

“🔗Pro Tip: Use the same email as GitHub to ensure commits are properly attributed.”

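Verify with:

# report your Git identity (user name and email) along with GitHub details
git_sitrep()

# get advice on your GitHub token setup (this does not show your name or email)
gh_token_help()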
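
If you only want to see the two identity values, one option is to read the global Git configuration with gert. This is a sketch that assumes gert is installed; the result is empty if no identity has been set yet.

```r
library(gert)

# Read the global Git configuration as a data frame (columns include name and value)
cfg <- git_config_global()

# Keep only the identity entries set by use_git_config()
git_identity <- cfg[cfg$name %in% c("user.name", "user.email"), c("name", "value")]
git_identity
```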

🔐Generate SSH Key for Secure Access (I do not recommend this)

Instead of typing your password every time you connect to GitHub, we will use an SSH key, a secure digital key pair. Run this:

library(credentials)

ssh_setup_github()

This will:

  1. Check if you already have an SSH key
  2. Create one if not. The key pair is saved in your ~/.ssh folder (for example ~/.ssh/id_ecdsa and ~/.ssh/id_ecdsa.pub).
  3. Display the public key so you can copy it

Add SSH Key to GitHub

  1. Go to https://github.com/settings/keys
  2. Under SSH keys, click "New SSH key"
  3. Paste the public key into the Key box (it starts with ecdsa-sha2 ...)
  4. Give it a title, e.g., “My Laptop for RStudio”
  5. Click "Add SSH key"

Warning

“Never share your private key (id_ecdsa) – only the public one (id_ecdsa.pub) goes online.”
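
To see which public keys already exist on your machine, you can list them from R. This sketch assumes your keys live in the folder reported by credentials::ssh_home(); it only lists files ending in .pub and never reads a private key.

```r
# Folder where the credentials package looks for SSH keys
ssh_dir <- credentials::ssh_home()

# Public keys end in .pub; private keys (no extension) are deliberately not listed
pub_keys <- list.files(ssh_dir, pattern = "\\.pub$", full.names = TRUE)

if (length(pub_keys) == 0) {
  message("No public keys found in ", ssh_dir)
} else {
  message("Public keys found: ", paste(basename(pub_keys), collapse = ", "))
}
```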

Creating & Cloning Repositories

Now we will create a new project and connect it to GitHub.

🆕 Option 1: Create a New R Project with Git

# choose a directory
proj_path <- file.path("~", "my_first_git_project") # ~ is your home directory

# create the project (Git is initialised in the next step)
create_project(proj_path) # creates and opens a new R project

If you see a prompt that begins “This directory is not a Git repository”, select Yes to create one.

Otherwise, to turn your newly created project into a Git repo, run this in the RStudio console of the new project "my_first_git_project":

usethis::use_git()

This will initialise your project as a git repository and prompt you to commit any file/folder found in the project directory.

Select the option that lets you commit. You have to do this in order to push to GitHub later. Then restart RStudio to activate the Git pane.

You now have:

  • A local Git repo
  • An .Rproj file
  • RStudio’s Git pane visible (a tab alongside the Environment pane)

📤 Option 2: Clone an Existing Repository

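You can also clone a GitHub repository into your local environment. Example: clone a public repo.

repo_url <- "https://github.com/rstudio/learnr"
local_path <- file.path(tempdir(), "learnr-demo") # temporary folder, removed when the R session ends

# destdir is the parent folder and must already exist; the repo lands in a subfolder named after it
dir.create(local_path)
usethis::create_from_github(repo_url, destdir = local_path)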

This clones the repo, opens it as an R project and then sets up Git remotes.
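
If you want to see what cloning does without touching the network, here is a sketch using gert::git_clone() instead of usethis. The "remote" is a throwaway local repository standing in for GitHub, and all paths and the demo identity are hypothetical. The same call should work with a GitHub URL in place of source_repo.

```r
library(gert)

# Stand-in "remote": a throwaway local repository (hypothetical, not on GitHub)
source_repo <- tempfile("source-repo-")
dir.create(source_repo)
git_init(source_repo)
writeLines("# Demo project", file.path(source_repo, "README.md"))
git_add("README.md", repo = source_repo)

# An explicit signature keeps the sketch independent of your global Git identity
demo_user <- git_signature("Demo User", "demo@example.com")
git_commit("Add README", author = demo_user, committer = demo_user, repo = source_repo)

# Clone it into a new folder
clone_path <- tempfile("clone-")
git_clone(source_repo, path = clone_path, verbose = FALSE)

git_remote_list(repo = clone_path)  # a remote named "origin" points back at the source
list.files(clone_path)              # the cloned files
```

The remote named origin is what later lets pull and push know where to send and fetch commits.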

Exercise: Try cloning your own GitHub repo or a friend’s. Make a tiny edit and try to push

Basic Git Workflow in RStudio

Time to make your first collaborative change!

🔄️ The Git Commit Cycle: Stage > Commit > Push

Let’s simulate a typical workflow.

📝Make a Change

In the newly created project directory

  1. Open the .Rproj file

  2. Create a new R script: analysis.R

  3. Add these lines of code to the script:

    # First analysis script
    data(mtcars)
    summary(mtcars)
    plot(mtcars$wt, mtcars$mpg)
  4. Save the file.

🟨Stage & Commit in RStudio

Look at the Git Pane in RStudio. You will see analysis.R listed with status ‘untracked’ (hover mouse over yellow question mark symbol).

Tick the checkbox next to it to stage the file (it is now ready to commit). Click Commit. In the commit message box, type:

Add initial mtcars analysis script

Click Commit again (analysis.R disappears from the Git pane) and close the dialogue box.

The above process can also be done programmatically with the gert package.

library(gert)

# check overview of staged and unstaged files
git_status()

This shows all files with changes that are unstaged or not yet tracked by Git. You should see the analysis.R file. Stage the file like this:

# stage a specific file
git_add("analysis.R")

# stage all files
git_add(".")

# check status again after staging
git_status() # analysis.R now shows staged = TRUE

Now we can commit the staged changes with a message:

git_commit("Add initial mtcars analysis script")

Important

“Best Practice: Write clear, concise messages like “Fix typo” or “Add regression model” when committing.”

Tip

“Use gert::git_reset_mixed() to unstage files.”
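
To try the stage and commit steps without touching your real project, you can run them in a throwaway repository. This sketch assumes gert is installed; the temporary path and the demo identity are hypothetical.

```r
library(gert)

# Throwaway repository in a temporary folder (safe to delete afterwards)
repo <- tempfile("commit-demo-")
dir.create(repo)
git_init(repo)

# Create a small script to track
writeLines("summary(mtcars)", file.path(repo, "analysis.R"))

before <- git_status(repo = repo)  # analysis.R is listed with staged = FALSE
git_add("analysis.R", repo = repo)
staged <- git_status(repo = repo)  # analysis.R is listed with staged = TRUE

# An explicit signature keeps the sketch independent of your global Git identity
demo_user <- git_signature("Demo User", "demo@example.com")
git_commit("Add initial mtcars analysis script",
           author = demo_user, committer = demo_user, repo = repo)

after <- git_status(repo = repo)   # no rows: nothing left to commit
git_log(repo = repo)[, c("commit", "author", "message")]
```

Comparing before, staged and after shows how a file moves through the cycle: untracked, staged, then committed.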

🚀Push to GitHub

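Once the new analysis.R file has been committed, push it so that it appears in the GitHub repo. Click Push (up arrow icon) in the Git pane. RStudio sends your commit to GitHub. If the project has no GitHub repository yet (for example, it was created locally in Option 1), run usethis::use_github() first to create one and connect it; this requires a GitHub personal access token.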

Go to your repo online - you will see analysis.R!

🔍Check Status & History

Use gert to explore from R

library(gert)
# check overview of staged and unstaged files
git_status()

# view commit history
git_log()


Note

“Analogy: Each commit is a snapshot in a photo album. Pushing uploads the album to the cloud.”

🖊️Exercise: Try It Yourself

  1. Create a new file called explore.R.
  2. Use only gert commands to:
    • Add the file to staging
    • Commit it with the message "Add exploration script"

Branching and Collaboration

Now imagine two team mates: You want to test a new model, while your colleague updates the report. How do you avoid stepping on each other’s toes?

🌿Enter: Git Branches

Branches are like parallel universes for your code. You can experiment safely without breaking the main version.

➕ Create a New Branch

We can create and switch to a new branch using the git_branch_create() function from gert. By default it branches from your current commit, but you can also branch from a specific commit by passing its hash to the ref argument.

# create and switch to a new branch
git_branch_create("feature/new-model")

# confirm which branch you are on
git_branch()

Now we can make changes freely and they will not affect the main branch. Let us edit the analysis.R file by adding these lines of code

# Try Linear model
model <- lm(mpg ~ wt + hp, data = mtcars)
summary(model)

Save the file, then stage and commit it with the message “Add multiple regression model”. Do not push yet.

Tip

# create a branch from a commit hash
commits <- git_log()
hash1 <- commits$commit[2] # hash of the second most recent commit

# create (and switch to) a new branch starting at that commit
git_branch_create("fix-bug", ref = hash1)
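
To convince yourself that work on a branch really leaves the main version untouched, here is a sketch in a throwaway repository. It assumes gert is installed; the path and demo identity are hypothetical, and the default branch may be called main or master depending on your Git settings.

```r
library(gert)

# Throwaway repository with one commit on the default branch
repo <- tempfile("branch-demo-")
dir.create(repo)
git_init(repo)
script <- file.path(repo, "analysis.R")
demo_user <- git_signature("Demo User", "demo@example.com")

writeLines("summary(mtcars)", script)
git_add("analysis.R", repo = repo)
git_commit("Add initial mtcars analysis script",
           author = demo_user, committer = demo_user, repo = repo)
default_branch <- git_branch(repo = repo)  # "main" or "master"

# Create the feature branch and commit a change on it
git_branch_create("feature/new-model", repo = repo)
cat("model <- lm(mpg ~ wt + hp, data = mtcars)\n", file = script, append = TRUE)
git_add("analysis.R", repo = repo)
git_commit("Add multiple regression model",
           author = demo_user, committer = demo_user, repo = repo)

# Back on the default branch, the file has only its original line
git_branch_checkout(default_branch, repo = repo)
readLines(script)
```

Checking out feature/new-model again brings the model line back, because each branch keeps its own history.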

🔄️Sync with Remote

Once we have created a local Git branch, we can tell Git to create that same branch on GitHub using git_push(). Think of git_push() as the command that sends your work from your computer to the remote server (in this case, GitHub). When you push a branch that doesn’t exist on GitHub yet, Git creates it for you automatically.

git_push(set_upstream = TRUE)

Setting set_upstream = TRUE tells Git: “Push this branch, and from now on, remember that the local branch feature/new-model is connected to the remote branch of the same name.” You only need to do this the first time you push a new branch.
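
To see the effect of set_upstream without a GitHub account, the sketch below pushes to a bare local repository that stands in for GitHub. That stand-in is an assumption made purely for illustration, as are the paths and demo identity; with a real project the remote would be your GitHub repo.

```r
library(gert)

# Stand-in for GitHub: a bare repository in a temporary folder (hypothetical)
remote_path <- tempfile("remote-")
dir.create(remote_path)
git_init(remote_path, bare = TRUE)

# A working repository with one commit, then a new feature branch
work <- tempfile("work-")
dir.create(work)
git_init(work)
writeLines("summary(mtcars)", file.path(work, "analysis.R"))
git_add("analysis.R", repo = work)
demo_user <- git_signature("Demo User", "demo@example.com")
git_commit("Add analysis script", author = demo_user, committer = demo_user, repo = work)
git_branch_create("feature/new-model", repo = work)

# Connect the remote, then push and record the upstream in one step
git_remote_add(remote_path, name = "origin", repo = work)
git_push(remote = "origin", set_upstream = TRUE, verbose = FALSE, repo = work)

git_info(repo = work)$upstream       # now set: the local branch tracks the remote one
git_branch_list(repo = remote_path)  # the branch now exists on the "remote" too
```

Once the upstream is recorded, later calls to git_push() and git_pull() on this branch need no extra arguments.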


Now your team mates can see and review your work

🤝Teammate Makes Changes on Main

Meanwhile, another person from the team updated README.md on the main branch and pushed

To get those changes, you can pull from the remote (GitHub).

# pull recent changes on GitHub
git_pull()

But wait! We are on the "feature/new-model" branch. Won’t that create a conflict? No! Everything will work out just fine 😁.

Pulling updates only your current branch with the latest commits from its remote counterpart, so your team mate’s change on main is not brought in yet.

To update your local main branch, first switch back to it (it is usually called main, or sometimes master) using the git_branch_checkout() function.

This command tells Git to switch your working directory to the state of that branch.

# switch to the main branch (use "main" if that is what your default branch is called)
git_branch_checkout("master")

# update local main branch
git_pull()

Important

“Before you switch branches, make sure you have committed any work you want to save on your current branch. If you have uncommitted changes, Git will usually prevent you from switching branches to avoid losing your work.”

🔄️Merge Back When Ready

Once your model is approved by all members of the team, you can merge the branch to the main/master branch

The git_merge() function is what you use to combine the changes from one branch into another. The key thing to remember is that you always merge into the branch you are currently on.

Before you merge, it’s crucial to follow these steps to avoid issues.

  1. Switch to the main branch: You must be on the main branch because this is the branch you are merging into.
  2. Pull the latest changes from GitHub: This ensures your local main branch is up to date with the remote main branch. This is an important step to prevent conflicts.
  3. Run the git_merge() command: Now you can run the merge command with the name of the branch you want to merge, e.g., "feature/new-model".

# merge the feature branch into the branch you are on (main/master)
git_merge("feature/new-model")

# push merged changes
git_push()

After running this, your main branch will have all the changes from the feature/new-model branch 🎉. If main has no new commits of its own, Git simply fast-forwards it to the tip of the feature branch; otherwise it creates a merge commit. Either way, you will see the updates in your files.
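
The merge itself can be rehearsed in a throwaway repository before you try it on shared work. This sketch assumes gert is installed and uses a hypothetical path and demo identity. Because the default branch gains no commits of its own here, the merge is a fast-forward and no merge commit is needed.

```r
library(gert)

# Throwaway repository with one commit on the default branch
repo <- tempfile("merge-demo-")
dir.create(repo)
git_init(repo)
script <- file.path(repo, "analysis.R")
demo_user <- git_signature("Demo User", "demo@example.com")

writeLines("summary(mtcars)", script)
git_add("analysis.R", repo = repo)
git_commit("Add initial mtcars analysis script",
           author = demo_user, committer = demo_user, repo = repo)
default_branch <- git_branch(repo = repo)  # "main" or "master"

# Commit the new model on a feature branch
git_branch_create("feature/new-model", repo = repo)
cat("model <- lm(mpg ~ wt + hp, data = mtcars)\n", file = script, append = TRUE)
git_add("analysis.R", repo = repo)
git_commit("Add multiple regression model",
           author = demo_user, committer = demo_user, repo = repo)

# Switch to the branch you are merging into, then merge
git_branch_checkout(default_branch, repo = repo)
git_merge("feature/new-model", repo = repo)

readLines(script)                                    # both lines are now on the default branch
git_log(repo = repo)[, c("commit", "message")]       # both commits are in its history
```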

❌Handling Merge Conflicts

Merge conflicts often happen when team members edit the same line of code within a script file. For instance, in the analysis.R file, team mate 1 edits the line of code containing the linear model and team mate 2 edits the same line. Git then marks the conflicting region in the file like this:

<<<<<<< HEAD
summary(lm(mpg ~ wt, data = mtcars))
=======
summary(lm(mpg ~ wt + hp, data = mtcars))
>>>>>>> feature/new-model

You have to resolve this manually, which you can do in RStudio by editing the file:

  • Open the file
  • Choose which version to keep (or combine them)
  • Delete the <<<<<<<, ======= and >>>>>>> marker lines
  • Save > Stage > Commit

git_add(".")
git_commit("Resolve merge conflict")

Tip

“Analogy: Two people editing a Google Doc at once. The editor highlights conflicts – you decide the final text.”
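
Before staging a resolved file, it can help to check that no marker lines were left behind. The helper below is a hypothetical base R sketch, not a function from gert or usethis; it reports the line numbers that still start with a conflict marker.

```r
# Hypothetical helper: line numbers that still begin with a conflict marker
find_conflict_markers <- function(path) {
  lines <- readLines(path, warn = FALSE)
  grep("^(<<<<<<<|=======|>>>>>>>)", lines)
}

# Recreate the conflicted file from the example above in a temporary location
conflicted <- tempfile(fileext = ".R")
writeLines(c(
  "<<<<<<< HEAD",
  "summary(lm(mpg ~ wt, data = mtcars))",
  "=======",
  "summary(lm(mpg ~ wt + hp, data = mtcars))",
  ">>>>>>> feature/new-model"
), conflicted)

markers_before <- find_conflict_markers(conflicted)  # lines 1, 3 and 5 need attention

# Resolve by keeping the feature branch version only
writeLines("summary(lm(mpg ~ wt + hp, data = mtcars))", conflicted)
markers_after <- find_conflict_markers(conflicted)   # empty: safe to stage and commit
```

An empty result only tells you the markers are gone; you still need to read the code and confirm the version you kept is the one the team wants.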