Estimated Time to Complete: 60 minutes
GitHub caps file sizes at 100 MB, which makes it less ideal for storing certain kinds of files (e.g., geospatial datasets, images, or zipped archives). Most project files will be smaller than this, but we still want to be able to:
Keep version control on large files
Make them easily shareable
Ensure they are accessible through the local repository clones
To achieve this, we’ll use Box LFS. This is an R package that works similarly to Git LFS. It links large files stored on Box with GitHub so that Git maintains versioning, while the actual files live outside GitHub’s storage limits.
How it works:
Large files are stored on Box, inside a box-lfs folder within your project.
GitHub only stores .boxtracker pointer files, which record the file’s location and version history.
When working in a GitHub repository, Box LFS makes sure your local copy has the correct large files.
With Box LFS:
If Box Drive is installed:
Otherwise you will still manually download and upload files to Box.
In the next lesson we’ll go over some tricks to avoid needing to use Box LFS.
We’ll use the example WWS-TEST-Box-LFS. It’s similar to the WWS-TEST-example-repo, but now includes a large file tracked with Box LFS.
Open the repo and compare its structure to WWS-TEST-example-repo.
Notice the new box-lfs folder and the files inside.
Notice that the original file names appear only inside the .boxtracker files; the file names you see in the folder are strings of letters and numbers called hashes.
Open a .boxtracker file to see how GitHub stores “pointers” to large files.
Confirm that the large file itself is not in GitHub. It lives in Box instead.
Check the README: it tells you Box LFS is in use.
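To get a feel for what a hash is, here is a minimal base R sketch. It uses MD5 via tools::md5sum purely as a stand-in; the hashing scheme Box LFS actually uses is not described in this lesson, so treat this as an illustration of the concept only.

```r
# Illustration only: MD5 is an assumption here, not necessarily what Box LFS uses.
tmp <- tempfile(fileext = ".txt")

# Hash the first version of the file
writeLines("version 1 of a large file", tmp)
hash_before <- unname(tools::md5sum(tmp))

# Change the contents and hash again
writeLines("version 2 of a large file", tmp)
hash_after <- unname(tools::md5sum(tmp))

# A hash is a fixed-length string of letters and numbers
print(hash_before)
nchar(hash_before)  # MD5 hashes are 32 characters long

# Different contents give a different hash
identical(hash_before, hash_after)

unlink(tmp)
```

Because the hash is a compact fingerprint, a tool can use it to refer to a file without relying on the original file name.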
First, install the Box-LFS R package:
remotes::install_github("wildfire-water-security/WWS-box-lfs")
Then clone the repo as usual in GitHub Desktop:
File → Clone Repository (CTRL + SHIFT + O)
Choose WWS-TEST-Box-LFS
Set the local path:
C:\Users\your username\Documents
Click Clone
Look in the data folder for the large file. Is it there?
Now use Box-LFS to fetch the large files:
library(blfs)
dir <- file.path(fs::path_home(), "Documents", "WWS-TEST-Box-LFS")
clone_repo_blfs(dir = dir, download = NULL)
You’ll get a message that the large files have been fetched from Box and put in your repository.
Check the data folder. Notice the new large-file1.docx. Box-LFS has grabbed the file from Box and restored it into the correct location.
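If you prefer to confirm this from R rather than the file explorer, a small check like the one below could work. check_files is a hypothetical helper written for this lesson, not a function from blfs; it only uses base R.

```r
# Hypothetical helper (not part of blfs): report which expected files exist
# inside a project folder. Returns a named logical vector.
check_files <- function(dir, files) {
  paths <- file.path(dir, files)
  stats::setNames(file.exists(paths), files)
}

# Usage with the lesson's clone (uncomment and adjust to your setup):
# dir <- file.path(fs::path_home(), "Documents", "WWS-TEST-Box-LFS")
# check_files(dir, "data/large-file1.docx")
```

A TRUE next to the file name means the large file is present in your local clone.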
Now let’s practice collaborative editing with Box LFS.
Partner 1
Create a new branch with your names.
Open example-csv.csv and make a change, but don’t commit or push yet.
Run Box LFS push_repo_blfs to check for updates:
dir <- file.path(fs::path_home(), "Documents", "WWS-TEST-Box-LFS")
push_repo_blfs(dir = dir)
Now push your changes in GitHub Desktop.
Partner 2
Pull Partner 1’s changes in GitHub Desktop.
Run Box LFS pull_repo_blfs to check for updates:
dir <- file.path(fs::path_home(), "Documents", "WWS-TEST-Box-LFS")
pull_repo_blfs(dir = dir, download = NULL)
Make your own changes:
Edit large-file1.docx.
Save and close.
Go back to GitHub Desktop. Did it recognize that this file was changed?
Run Box LFS push check to see if the file changed:
dir <- file.path(fs::path_home(), "Documents", "WWS-TEST-Box-LFS")
push_repo_blfs(dir = dir)
You’ll get a message that large files have been synced with Box.
In GitHub Desktop you’ll see the .boxtracker file was updated.
Commit and push the changes.
.boxtracker in the commit message.
Note: Box LFS doesn’t support branches yet. Uploading overwrites the file in Box for all branches, but older versions are still stored as history.
Up till now we’ve been working with pre-made Git repositories. Let’s create a new one and use Box LFS to start tracking any large files in the repo. This can be done in two ways:
From the start, before you have any files
From an existing folder
Work with a partner to try both ways.
Method 1: Brand New Project
Go to the WWS GitHub organization.
Click the green New repository button (top right).
Name your repo using the format: WWS-TEST-New-yournames
Add a brief description (what the project is about).
Choose Public or Private. (Private = only visible within WWS organization.)
Check “Initialize with a README.” This will be your repo’s welcome page with links, notes, and contact info.
Click the green Create repository button.
Clone the repo to your documents folder
Make it an R project. In RStudio, run:
# install.packages("usethis") # only needed the first time
library(usethis)
# the location of your project folder
names <- "katie" # replace with your names
dir <- file.path(fs::path_home(), paste0("Documents/WWS-TEST-New-", names))
# create a project from the directory
usethis::create_project(dir)
Copy the files from WWS-TEST-Box-LFS/data into the repository.
Start tracking large files with Box-LFS:
library(blfs) # needed if this is a new R session
names <- "katie" # replace with your names
# new
dir <- file.path(fs::path_home(), paste0("Documents/WWS-TEST-New-", names))
new_repo_blfs(dir = dir, size = 10) # size = 10 means files larger than 10 MB will be tracked
You’ll be prompted for the Box path to the project folder. Find this by navigating to where you want the large files stored.
Wildfire_Water_Security/02_Nodes/01_Empirical/06_Projects/data-management/data-management-workshop/WWS-Test-Existing-yournames
Method 2: Existing Project
Create a folder in C:/Users/your username/Documents called WWS-TEST-Existing-yournames
Copy the files from WWS-TEST-Box-LFS/data into the folder
Make it into an R project:
# install.packages("usethis") # only needed the first time
library(usethis)
# the location of your project folder
names <- "katie" # replace with your names
dir <- file.path(fs::path_home(), paste0("Documents/WWS-TEST-Existing-", names))
# create a project from the directory
usethis::create_project(dir)
In the new RStudio window that opens, initialize Git in the project.
usethis::use_git()
When prompted, do not create the initial commit; we’ll do that manually in a bit.
Add to GitHub Desktop
In GitHub Desktop: File → Add local repository (CTRL + O)
Select your project folder and click Add Repository.
You’ll now see your files listed in GitHub Desktop.
Publish the repo to GitHub
In GitHub Desktop, click Publish repository (top right).
Name your repo using the format: WWS-TEST-Existing-yournames
Click Publish repository (blue button).
Initialize Box LFS by running the following code after updating the folder name:
library(blfs) # needed if this is a new R session
names <- "katie" # replace with your names
# existing
dir <- file.path(fs::path_home(), paste0("Documents/WWS-TEST-Existing-", names))
new_repo_blfs(dir = dir, size = 10) # size = 10 means files larger than 10 MB will be tracked
You’ll be prompted for the Box path to the project folder. Find this by navigating to where you want the large files stored.
Wildfire_Water_Security/02_Nodes/01_Empirical/06_Projects/data-management/data-management-workshop/WWS-Test-Existing-yournames
After running:
GitHub stops tracking your large files
box-lfs folder is created with:
.boxtracker files
upload folder with real large files (hashed names)
path-hash.csv
Your large file should now be stored on Box in the specified project folder.
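Before running new_repo_blfs on your own projects, it can help to preview which files a given size threshold would pick up. The sketch below is one way to do that in base R. find_large_files is a hypothetical helper, not part of blfs, and it assumes 1 MB = 1024^2 bytes; blfs may define its cutoff slightly differently.

```r
# Hypothetical helper (not part of blfs): list files above a size threshold.
# Assumption: 1 MB = 1024^2 bytes; blfs may define the cutoff differently.
find_large_files <- function(dir, size_mb = 10) {
  # All files in the folder and its subfolders
  files <- list.files(dir, recursive = TRUE, full.names = TRUE)
  sizes_mb <- file.size(files) / 1024^2

  # Keep only files strictly larger than the threshold
  keep <- !is.na(sizes_mb) & sizes_mb > size_mb
  data.frame(
    file = files[keep],
    size_mb = round(sizes_mb[keep], 1),
    stringsAsFactors = FALSE
  )
}

# Usage (uncomment and adjust to your setup):
# dir <- file.path(fs::path_home(), "Documents", "WWS-TEST-Existing-katie")
# find_large_files(dir, size_mb = 10)
```

An empty result suggests nothing in the folder is above the threshold, so Box LFS may not be needed for that project.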
Pull first, push often → prevents merge conflicts with .boxtracker files
Box LFS doesn’t store file diffs — each update is a full new file
If two people change the same large file without pulling first, you’ll get a merge conflict
Box LFS is a new project and may still have bugs. If you run into issues or have suggestions for new features, please email Katie or create an issue on GitHub.