Learning Goals

  1. Know how to maintain version history of large project files.
  2. Understand how to use Box-LFS.

GitHub caps file sizes at 100 MB, which makes it a poor fit for storing certain kinds of files (e.g., geospatial datasets, images, or zipped archives). Most project files will be smaller than this, but we still want to be able to keep a version history for the ones that are not.

To achieve this, we’ll use Box LFS. This is an R package (with plans to eventually make it a standalone executable) that works similarly to Git LFS. It links large files stored on Box with GitHub so that Git maintains versioning, while the actual files live outside GitHub’s storage limits.

How it works:

With Box LFS:


Activity 1: Explore the structure of a Box LFS tracked project

We’ll use the example repo WWS-TEST-Box-LFS. It’s similar to the WWS-TEST-example-repo, but now includes a large file tracked with Box LFS.

  1. Open the repo and compare its structure to WWS-TEST-example-repo.

    • Notice the new box-lfs folder and the files inside.

    • Open a .boxtracker file to see how GitHub stores “pointers” to large files.

    • Check the README — it tells you Box LFS is in use.

    • Confirm that the large file itself is not in GitHub. It lives in Box instead.
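Once the repo is cloned locally, the same exploration can be done from R. The sketch below is only an illustration: the helper names `find_trackers()` and `show_trackers()` are made up for this example (they are not part of blfs), and it assumes pointer files end in `.boxtracker`. It prints each pointer file as raw text and makes no assumption about what is inside.

```r
# List Box LFS pointer files in a local copy of the repo.
# Assumption: pointer files end in ".boxtracker".
find_trackers <- function(repo_dir) {
  list.files(
    repo_dir,
    pattern = "\\.boxtracker$",
    recursive = TRUE,
    full.names = TRUE,
    all.files = TRUE  # also catch files whose whole name is ".boxtracker"
  )
}

# Print the raw text of each pointer file, without interpreting it.
show_trackers <- function(repo_dir) {
  trackers <- find_trackers(repo_dir)
  for (path in trackers) {
    cat("----", path, "----\n")
    cat(readLines(path, warn = FALSE), sep = "\n")
    cat("\n")
  }
  invisible(trackers)
}

# Example (hypothetical local path):
# show_trackers(file.path(fs::path_home(), "Documents", "WWS-TEST-Box-LFS"))
```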


Activity 2: Clone a Box LFS tracked project

  1. First, install the Box-LFS R package:

    # install.packages("remotes") # only needed if remotes is not installed yet
    remotes::install_github("wildfire-water-security/WWS-box-lfs", subdir = "blfs")
  2. Then clone the repo as usual in GitHub Desktop:

    • File → Clone Repository (CTRL + SHIFT + O)

    • Choose WWS-TEST-Box-LFS

    • Set the local path: C:\Users\your username\Documents

    • Click Clone

  3. Now use Box-LFS to fetch the large files:

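    library(blfs)
    # path to the cloned repo in your Documents folder
    dir <- file.path(fs::path_home(), "Documents", "WWS-TEST-Box-LFS")
    # fetch the large files from Box into the repo
    clone_repo_blfs(dir = dir, download = NULL)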
  4. You’ll get a message that the large files have been fetched from Box and put in your repository.

  5. Check the data folder. Notice the new large-file1.docx. Box-LFS has grabbed the file from Box and restored it into the correct location.
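The check in the last step can also be scripted. This is a minimal sketch in base R; `check_restored()` is a hypothetical helper written for this example, not a blfs function. It reports whether each expected file exists and how large it is.

```r
# Report whether expected large files are present in the repo.
# `expected` holds paths relative to the repo root.
check_restored <- function(repo_dir, expected) {
  paths <- file.path(repo_dir, expected)
  data.frame(
    file = expected,
    present = file.exists(paths),
    size_mb = round(file.size(paths) / 1024^2, 2),  # NA when the file is missing
    stringsAsFactors = FALSE
  )
}

# Example, using the same `dir` as in step 3:
# check_restored(dir, "data/large-file1.docx")
```

If `present` is `FALSE`, the file was not restored to that location and the fetch step is worth re-running.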


Activity 3: Push and Pull Changes in a Box LFS tracked project

Now let’s practice collaborative editing with Box LFS.

Partner 1

  1. Create a new branch with your names.

  2. Open example-csv.csv, make a change, and commit it. Don’t push yet.

  3. Run Box LFS push_repo_blfs to check for updates:

    dir <- file.path(fs::path_home(), "Documents", "WWS-TEST-Box-LFS")
    push_repo_blfs(dir = dir)
    • If it runs successfully you’ll get a message that large files have been synced with Box.
  4. Now push your changes in GitHub Desktop.

Partner 2

  1. Pull Partner 1’s changes in GitHub Desktop.

  2. Run Box LFS pull_repo_blfs to check for updates:

    dir <- file.path(fs::path_home(), "Documents", "WWS-TEST-Box-LFS")
    pull_repo_blfs(dir = dir, download = NULL)
  3. Make your own changes:

    • Edit large-file1.docx.

    • Save and close.

  4. Run Box LFS push check:

    dir <- file.path(fs::path_home(), "Documents", "WWS-TEST-Box-LFS")
    push_repo_blfs(dir = dir)
    • You’ll get a message that large files have been synced with Box.

    • In GitHub Desktop you’ll see the .boxtracker file was updated.

    • Commit and push the changes.

      • In the commit message, describe the changes to the large file itself, not to the .boxtracker file.

Note: Box LFS doesn’t support branches yet. Uploading overwrites the file in Box for all branches, but older versions are still stored as history.
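Because an upload replaces the copy in Box for everyone, it can be worth confirming that a large file really changed before running the push. The sketch below uses `tools::md5sum()` from base R. The helpers `file_checksum()` and `has_changed()` are hypothetical names for this example. How Box LFS itself detects changes is not described here, so treat this as an independent manual check.

```r
# Compute an MD5 checksum for a file so a later edit can be detected.
file_checksum <- function(path) {
  unname(tools::md5sum(path))
}

# TRUE when the file's current checksum differs from an earlier one.
has_changed <- function(path, previous_checksum) {
  !identical(file_checksum(path), previous_checksum)
}

# Example, using the same `dir` as above:
# doc <- file.path(dir, "data", "large-file1.docx")
# before <- file_checksum(doc)
# ... edit, save, and close the document ...
# has_changed(doc, before)
```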


Activity 4: Initialize Box LFS in a Repository

Until now we’ve been working with pre-made Git repositories. Let’s create a new one and use Box LFS to start tracking any large files in the repo. This can be done in two ways:

  1. From the start, before you have any files

  2. From an existing folder

Work with a partner to try both ways.

Method 1: Brand New Project

  1. Go to the WWS GitHub organization.

  2. Click the green New repository button (top right).

  3. Name your repo using the format: WWS-TEST-New-yournames

  4. Add a brief description (what the project is about).

  5. Choose Public or Private. (Private = only visible within WWS organization.)

  6. Check “Initialize with a README.” This will be your repo’s welcome page with links, notes, and contact info.

  7. Click the green Create repository button.

  8. Clone the repo to your Documents folder.

  9. Make it an R project. In RStudio, run:

    # install.packages("usethis") # only need to run the first time
    library(usethis)
    
    # the location of your project folder
    names <- "katie" # replace with your names
    dir <- file.path(fs::path_home(), paste0("Documents/WWS-TEST-New-", names))
    
    # create a project from the directory
    usethis::create_project(dir)
  10. Copy the files from WWS-TEST-Box-LFS/data into the repository

  11. Start tracking large files with Box-LFS:

    names <- "katie" # replace with your names
    
    # new project
    dir <- file.path(fs::path_home(), paste0("Documents/WWS-TEST-New-", names))
    new_repo_blfs(dir = dir, size = 10) # size = 10: files larger than 10 MB will be tracked

    You’ll be prompted for the Box path to the project folder. Find this by navigating to where you want the large files stored.

    • In this case: Wildfire_Water_Security/02_Nodes/01_Empirical/06_Projects/data-management/data-management-workshop/WWS-Test-New-yournames
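To get a feel for which files a size threshold such as `size = 10` would pick up, the project folder can be scanned with base R first. This is a rough sketch: `preview_large_files()` is a hypothetical helper, and it treats 1 MB as 1024^2 bytes, which may not match exactly how Box LFS measures size.

```r
# List files under `dir` that are larger than `size_mb` megabytes,
# largest first. Assumes 1 MB = 1024^2 bytes.
preview_large_files <- function(dir, size_mb = 10) {
  files <- list.files(dir, recursive = TRUE, full.names = TRUE)
  sizes_mb <- file.size(files) / 1024^2
  keep <- !is.na(sizes_mb) & sizes_mb > size_mb
  out <- data.frame(
    file = files[keep],
    size_mb = round(sizes_mb[keep], 1),
    stringsAsFactors = FALSE
  )
  out[order(-out$size_mb), , drop = FALSE]
}

# Example, using the same `dir` as in the step above:
# preview_large_files(dir, size_mb = 10)
```

An empty result suggests that no file in the folder is above the threshold.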

Method 2: Existing Project

  1. Create a folder in C:/Users/your username/Documents called WWS-TEST-Existing-yournames

  2. Copy the files from WWS-TEST-Box-LFS/data into the folder

  3. Make it into an R project:

    # install.packages("usethis") # only need to run the first time
    library(usethis)
    
    # the location of your project folder
    names <- "katie" # replace with your names
    dir <- file.path(fs::path_home(), paste0("Documents/WWS-TEST-Existing-", names))
    
    # create a project from the directory
    usethis::create_project(dir)
  4. In the new RStudio window that opens, initialize Git in the project.

    usethis::use_git()

    When prompted, do not create the initial commit; we’ll do that manually in a bit.

  5. Add to GitHub Desktop

    • In GitHub Desktop: File → Add local repository (CTRL + O)

    • Select your project folder and click Add Repository.

    • You’ll now see your files listed in GitHub Desktop.

  6. Publish the repo to GitHub

    • In GitHub Desktop, click Publish repository (top right).

    • Name your repo using the format: WWS-TEST-Existing-yournames

    • Click Publish repository (blue button).

  7. Initialize Box LFS by running the following code after updating the folder name:

    names <- "katie" # replace with your names
    
    # existing project
    dir <- file.path(fs::path_home(), paste0("Documents/WWS-TEST-Existing-", names))
    new_repo_blfs(dir = dir, size = 10) # size = 10: files larger than 10 MB will be tracked

    You’ll be prompted for the Box path to the project folder. Find this by navigating to where you want the large files stored.

    • In this case: Wildfire_Water_Security/02_Nodes/01_Empirical/06_Projects/data-management/data-management-workshop/WWS-Test-Existing-yournames

After running:


Best practices for Box LFS:

  • Pull first, push often → prevents merge conflicts with .boxtracker files

  • Box LFS doesn’t store file diffs — each update is a full new file

  • If two people change the same large file without pulling first, you’ll get a merge conflict
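If a merge conflict does reach a pointer file, Git writes its usual conflict markers (`<<<<<<<` and `>>>>>>>`) into it. The sketch below looks for those markers before you commit. `conflicted_trackers()` is a hypothetical helper, and it assumes pointer files are plain text ending in `.boxtracker`.

```r
# Return the pointer files that contain Git merge-conflict markers.
# Assumption: pointer files are plain text and end in ".boxtracker".
conflicted_trackers <- function(repo_dir) {
  trackers <- list.files(
    repo_dir,
    pattern = "\\.boxtracker$",
    recursive = TRUE,
    full.names = TRUE,
    all.files = TRUE
  )
  has_markers <- vapply(trackers, function(path) {
    lines <- readLines(path, warn = FALSE)
    # Git marks the two sides of a conflict with these line prefixes
    any(startsWith(lines, "<<<<<<< ") | startsWith(lines, ">>>>>>> "))
  }, logical(1))
  unname(trackers[has_markers])
}

# Example (hypothetical local path):
# conflicted_trackers(file.path(fs::path_home(), "Documents", "WWS-TEST-Box-LFS"))
```

An empty result means no pointer file shows conflict markers. Any file that is listed should be resolved before pushing.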

Box LFS is a new project and may still have bugs. If you run into issues or have suggestions for new features, please email Katie or create an issue on GitHub.