Data Storage Workflows

In the Wildfire Water Security project, we primarily use two tools to store and share files. Each has its own benefits and limitations:

Box

Git/GitHub

Both of these tools are preferable to an organization-specific network drive because they:

  • allow easy collaboration across organizations

  • automatically back up work and save version history

I have a file. Where does it go?

Generally, the following rules apply:

  • Box: Large files, collaborative Office files, files not related to a specific research project

  • GitHub: Project files, code, manuscript files (besides the document itself)
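The rules above can be sketched as a small decision function. This is only an illustration: the 50 MB cutoff for "large" and the list of Office suffixes are assumptions (the guide does not define them), and sending the manuscript document itself to Box is one reading of the GitHub rule.

```python
from pathlib import Path

# Hypothetical cutoff for a "large" file; tune this for your team.
LARGE_FILE_BYTES = 50 * 1024 * 1024
# Assumed set of collaborative Office formats.
OFFICE_SUFFIXES = {".docx", ".xlsx", ".pptx"}


def storage_for(name: str, size_bytes: int, project_specific: bool = True) -> str:
    """Return "Box" or "GitHub" following the rules of thumb above."""
    if size_bytes >= LARGE_FILE_BYTES:
        return "Box"  # large files
    if Path(name).suffix.lower() in OFFICE_SUFFIXES:
        return "Box"  # collaborative Office files
    if not project_specific:
        return "Box"  # not tied to a specific research project
    return "GitHub"  # project files, code, manuscript support files


if __name__ == "__main__":
    print(storage_for("turbidity-model.py", 12_000))
    print(storage_for("budget.xlsx", 40_000))
```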

If you’re unsure where to put a file, follow the flow chart below:

Where to NOT put files:

  • Network drive: even keeping shared GitHub repositories on the network drive can cause issues. Keep your own copies of files on a local drive (C:).
  • OneDrive
  • SharePoint
  • Google Drive
  • Dropbox

Data belongs to the project, not individuals. Store it where the team can find and access it.


Folder and File Naming Conventions

Keeping your files and folders organized makes it easier for everyone on the team to find what they need and avoids confusion down the road.

Folder Organization

  1. Git Repo First
    Your top-level project folder (root directory) should be a GitHub repository. This ensures you have version control and backups from the start.

  2. Limit the Number of Folders in the Root
    Aim for fewer than 10 top-level folders, for example: data/, code/, figures/, methods/

  3. Use Nested Folders for Subcategories
    For example, inside data/:

    • raw-data/
    • processed-data/
    • metadata/
  4. Avoid Spaces & Special Characters
    Use - or _ instead of spaces. Avoid characters like .:*?"<>|[]&$.

  5. Descriptive Names
    Name folders so someone unfamiliar with your project can still guess what’s inside.

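  6. Organize by Date (if needed)
    Use the YYYY-MM-DD format so folders sort chronologically, e.g. 2025-03-14_sampling/.

  7. Force Folder Order with Numbers
    Prefix folder names with zero-padded numbers to control their order, e.g. 01_data/, 02_code/.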
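As a rough sketch, the folder layout described above could be created in one step. The folder names come from the examples in this section; the function name and the idea of scripting the setup are illustrative assumptions, not a required workflow.

```python
from pathlib import Path

# Folder names taken from the examples above; adjust to your project.
SKELETON = [
    "data/raw-data",
    "data/processed-data",
    "data/metadata",
    "code",
    "figures",
    "methods",
]


def create_skeleton(root: Path) -> list[Path]:
    """Create the example folders under root and return their paths."""
    created = []
    for relative in SKELETON:
        folder = root / relative
        folder.mkdir(parents=True, exist_ok=True)  # safe to re-run
        created.append(folder)
    return created


if __name__ == "__main__":
    for folder in create_skeleton(Path("example-project")):
        print(folder)
```

Note that Git does not track empty folders, so a folder only shows up on GitHub once it contains at least one file.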

File Organization

  1. Avoid Spaces & Special Characters
    Use - or _ instead of spaces. Avoid characters like .:*?"<>|[]&$ (the dot before the file extension is the exception).

  2. Be Concise but Descriptive

    • DON’T: use words like “the” or “and”

    • DO: use standard abbreviations and keywords

  3. Self-Contained File Names: name files so they still make sense outside the folder.

  4. Let Git (or Box) Handle Versions: don’t add dates, initials, or “final_v2” to file names.

    Note: Do use dates and initials when emailing files or working outside Git and Box.

  5. No Duplicate Files: edit the original and commit often instead of making copies.
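The naming rules above can be checked automatically. The following is a minimal sketch, assuming a hypothetical helper called name_problems; the exact checks (for example, treating "final" and "v2" as version markers) are one interpretation of the rules, not an official tool.

```python
import re
from pathlib import Path

# Characters listed in the rules above, plus the space.
FORBIDDEN = set('.:*?"<>|[]&$ ')


def name_problems(filename: str) -> list[str]:
    """Return a list of naming problems; an empty list means the name passes."""
    stem = Path(filename).stem  # ignore the dot before the extension
    problems = []
    bad = sorted(ch for ch in set(stem) if ch in FORBIDDEN)
    if bad:
        problems.append(f"forbidden characters: {bad}")
    words = [w for w in re.split(r"[-_ .]+", stem.lower()) if w]
    if {"the", "and"} & set(words):
        problems.append("filler words: drop 'the' and 'and'")
    if any(re.fullmatch(r"final|v\d+", w) for w in words):
        problems.append("version marker: let Git or Box track versions")
    return problems


if __name__ == "__main__":
    for name in ["streamflow_raw.csv", "the report final v2.docx"]:
        print(name, name_problems(name))
```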


Working with Git & GitHub

Cloning a Repository

A clone makes a complete copy of a repository on your computer so you can work locally. It keeps your changes separate from the main online repository (repo) until you commit and push.

To clone with GitHub Desktop:

  1. Open GitHub Desktop.

  2. Go to File → Clone repository.

  3. Choose the repository from the list or paste its URL.

  4. Select the folder where you want it saved.

    Store your repos in a local folder (C: drive) to avoid overwriting others’ work in shared directories.

  5. Click Clone.
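For anyone working outside GitHub Desktop, the same steps roughly correspond to a single git clone command. The sketch below only builds that command; the repository URL and the local "repos" folder are hypothetical placeholders.

```python
from pathlib import Path


def clone_command(repo_url: str, local_root: Path) -> list[str]:
    """Build the `git clone` command that places the repo under local_root."""
    # Derive the folder name from the URL, e.g. ".../example-repo.git" -> "example-repo".
    repo_name = repo_url.rstrip("/").rsplit("/", 1)[-1].removesuffix(".git")
    return ["git", "clone", repo_url, str(local_root / repo_name)]


if __name__ == "__main__":
    # Hypothetical URL and a local (not network) destination folder.
    url = "https://github.com/example-org/example-repo.git"
    print(" ".join(clone_command(url, Path.home() / "repos")))
```

To actually run the command, pass the returned list to subprocess.run(command, check=True).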

Pushing and Pulling

  • Pull:

    • Right before you start working → get the latest version.

    • Right before committing → make sure you’re not overwriting someone else’s changes.

  • Push:

    • Right after you commit → share your changes with others.

Committing Changes

  • Commit Often – one issue or logical set of changes per commit.

  • Write Good Commit Messages:

    • Short header = what you changed

    • Body (optional) = why you changed it
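The pull, commit, and push routine above maps onto a short sequence of Git commands. This is a sketch of the command-line equivalent, not what GitHub Desktop runs internally; the function names and the example commit message are made up for illustration.

```python
import subprocess
from pathlib import Path


def sync_commands(header: str, body: str = "") -> list[list[str]]:
    """Build the pull -> commit -> push sequence described above."""
    commit = ["git", "commit", "-m", header]  # short header: what you changed
    if body:
        commit += ["-m", body]  # a second -m becomes a separate paragraph: why
    return [
        ["git", "pull"],          # get the latest version before committing
        ["git", "add", "--all"],  # stage every change in the working folder
        commit,
        ["git", "push"],          # share the commit with others
    ]


def run_all(commands: list[list[str]], repo_dir: Path) -> None:
    """Run each command inside repo_dir and stop at the first failure."""
    for command in commands:
        subprocess.run(command, cwd=repo_dir, check=True)


if __name__ == "__main__":
    for command in sync_commands("Fix unit conversion", "Flow was in cfs, not cms."):
        print(" ".join(command))
```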

Special Files

.gitignore

  • Tells Git which files/folders not to track.

  • Good defaults: OS-specific files, temporary files

  • Generate from gitignore.io.

  • In GitHub Desktop: right-click a file → Ignore file/folder.
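As an illustration of the "good defaults" above, a starter .gitignore could be written like this. The pattern list is an assumed starting set of common OS-specific and temporary files, not a complete or project-approved list.

```python
from pathlib import Path

# Illustrative starting set of OS-specific and temporary files.
DEFAULT_PATTERNS = [
    ".DS_Store",    # macOS folder metadata
    "Thumbs.db",    # Windows thumbnail cache
    "desktop.ini",  # Windows folder settings
    "*.tmp",        # generic temporary files
    "~$*",          # Microsoft Office lock files
]


def write_gitignore(repo_dir: Path, extra: tuple[str, ...] = ()) -> Path:
    """Write a .gitignore in repo_dir (overwrites any existing one)."""
    target = repo_dir / ".gitignore"
    lines = [*DEFAULT_PATTERNS, *extra]
    target.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return target


if __name__ == "__main__":
    path = write_gitignore(Path("."), extra=("scratch/",))
    print(path.read_text(encoding="utf-8"))
```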

README.md

Your repo’s “front page” on GitHub.
Include:

  • Project summary

  • Contact info

  • Links to important files/folders (especially those stored outside GitHub)
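A README with those three parts can be generated from a small template. This is a sketch only; the function name, section headings, and all values in the example are placeholders to replace with your project's details.

```python
def readme_text(title: str, summary: str, contact: str, links: dict[str, str]) -> str:
    """Assemble a minimal README.md with summary, contact, and links sections."""
    lines = [
        f"# {title}",
        "",
        summary,
        "",
        "## Contact",
        "",
        contact,
        "",
        "## Important links",
        "",
    ]
    # One Markdown bullet per link, e.g. for files stored outside GitHub.
    lines += [f"- [{label}]({url})" for label, url in links.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(readme_text(
        title="Example Project",
        summary="One-paragraph description of the project.",
        contact="Name <name@example.org>",
        links={"Raw data on Box": "https://example.org/box-folder"},
    ))
```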

Branches

A branch is like a parallel version of your repository. Each branch can have different versions of files without affecting the main (default) branch. You can easily switch between branches in GitHub Desktop, and when you do, the files on your computer will change to match that branch.

Important: If you switch branches and it looks like your work is missing, don’t panic! It’s still there, just on a different branch.

When to Use Branches

Branches are especially helpful when:

  • You’re doing experimental work and don’t want to affect other collaborators’ files.

  • You’re working on a manuscript connected to a project and want to keep manuscript-related changes separate from the main data or analysis.

  • You want to test a new analysis approach without risking your main project files.

Creating a New Branch

  1. Open GitHub Desktop.

  2. Click the Current Branch dropdown → New Branch.

  3. Give your branch a clear, descriptive name.

    • For manuscript branches, use the format: FirstAuthor-Year (e.g., Smith-2025) or keywords-Year to describe the manuscript.
  4. Click Create Branch.
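The manuscript branch naming format can be checked with a short pattern, and the branch created from the command line. This sketch assumes names use only letters and hyphens before a four-digit year, and that Git 2.23 or newer is installed (for git switch); both are assumptions, not project rules.

```python
import re

# FirstAuthor-Year or keywords-Year: letters and hyphens, then a 4-digit year.
MANUSCRIPT_BRANCH = re.compile(r"[A-Za-z][A-Za-z-]*-\d{4}")


def is_manuscript_branch(name: str) -> bool:
    """Return True if name follows the FirstAuthor-Year / keywords-Year format."""
    return MANUSCRIPT_BRANCH.fullmatch(name) is not None


def new_branch_command(name: str) -> list[str]:
    """Build the command that creates a branch and switches to it."""
    return ["git", "switch", "-c", name]


if __name__ == "__main__":
    for name in ["Smith-2025", "post-fire-turbidity-2025", "smith2025"]:
        print(name, is_manuscript_branch(name))
```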

Switching Between Branches

  1. Open GitHub Desktop.

  2. Click the Current Branch dropdown.

  3. Select the branch you want to switch to.

Merge Commits

When you’ve finished work on a branch and are ready to add those changes back into the main branch, you can do what’s called a merge.

A merge takes all the changes from your branch and copies them into the main branch so everyone can access the updated work.

Before You Merge

  • Double-check your work: Make sure your branch is complete and won’t break anything in the main branch.

  • Pull recent changes: Update your branch by pulling any new commits from the main branch first; this reduces the risk of conflicts.

  • Commit all changes: Save (commit) everything on your branch before starting the merge.

How to Merge a Branch into Main in GitHub Desktop

  1. Open GitHub Desktop.

  2. In the top bar, check that you are currently on the main branch.

    • If not, click Current Branch → select main.
  3. Click Current Branch again.

  4. At the bottom of the dropdown, click Choose a branch to merge into main.

  5. Select the branch you want to merge.

  6. Click the blue Create a merge commit button.

  7. Push the changes to GitHub so the updated main branch is available online.

After Merging

  • You can delete the branch if you no longer need it (optional but helps keep things tidy).
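The merge steps above have a rough command-line counterpart, sketched below. Using --no-ff to force a merge commit is an assumption chosen to mirror the "Create a merge commit" button; it is not confirmed to be what GitHub Desktop runs, and the function names are illustrative.

```python
def merge_commands(branch: str, main: str = "main") -> list[list[str]]:
    """Build the sequence that merges branch into main and publishes it."""
    return [
        ["git", "switch", main],              # make sure you are on main
        ["git", "pull"],                      # pick up recent changes first
        ["git", "merge", "--no-ff", branch],  # always record a merge commit
        ["git", "push"],                      # make the updated main available online
    ]


def cleanup_command(branch: str) -> list[str]:
    """Build the optional command that deletes the merged local branch."""
    # -d refuses to delete a branch that has not been fully merged.
    return ["git", "branch", "-d", branch]


if __name__ == "__main__":
    for command in merge_commands("Smith-2025") + [cleanup_command("Smith-2025")]:
        print(" ".join(command))
```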

Quick Reference

| Action | When to Do It | What It Does |
|--------|---------------|--------------|
| Clone | At the start of working on a repo | Makes a local copy you can work in |
| Pull | Before editing & before committing | Updates your local copy with the latest changes |
| Commit | After making a set of changes | Saves those changes locally in Git history |
| Push | After committing | Sends your changes to GitHub so others can see them |
| Merge | When you have a stable branch you want to share | Brings changes from a branch into the main branch |

Further Reading