git is a version control system designed by the developers of the Linux kernel. It is free to use and install and offers a whole bunch of ways to help with coding work, particularly when this is collaborative. This includes:
The whole system works through repositories: a set of files dedicated to a given project. This is not restricted to R files, and can be used with most computer languages, documents and files.
The examples used here are based on the information available at happygitwithr. This website has much more extensive instructions on the use of git and R (and git on its own), and is an excellent place to follow up.
GitHub is one of several online hosting sites, that provide cloud-based storage for your git repositories. This (and git) now integrate quite seamlessly with R, and particularly with RStudio, so in this lab, we’ll go through the steps of setting up R and GitHub to play nicely together.
The overall setup scheme is as follows:
We’ll tackle the first 4 of these in this document.
Go to this site: https://github.com and follow the instructions. Choose your username wisely - remember that this is way that potential employers can see your work!
git is the underlying software that manages the repositories, keeps files current and copies and sends files between computers, so we need to install this first so that R can use it.
The easiest way to install git is using Git for windows. This is a self contained installer; download it and run it and follow all the usual prompts.
The easiest way to install git on a Mac is by installing the XCode command line tools. Open a terminal and enter the following command to do so:
xcode-select --install
This can take a little time, but when finished will provide you with a wide range of useful tools for programming.
Use your favorite package manager to install git directly.
Ubuntu or Debian Linux:
sudo apt-get install git
Fedora or RedHat Linux:
sudo yum install git
You can also get the latest version of git here.
Once the installation is complete, open a terminal to check that git is working. There are several options to access a terminal, but an easy one is to use the terminal in RStudio. Open RStudio, then go to [Tools] > [Terminal] > [New Terminal], and you should get a new tab in the console panel. Go to this, and type the following to check that git is working:
which git
and
git --version
If this doesn’t work, open [Tools] > [General options]. Select the Terminal from the left hand side, and look at the drop down list titled [Open new terminal with..]. If there is an open for git bash, select this and try again.
On Mac or Linux, you can also use a regular terminal, as long as it is running bash. On Windows, you can access Git Bash from the Start menu.
If this works, the next steps are to tell git your name and email address. The email address should be the address that you registered with GitHub, but the name can be your real name. Type the following in your terminal:
git config --global user.name 'Jane Doe'
git config --global user.email 'jane@example.com'
To check that this has been correctly entered:
git config --global --list
In this section, we will walk through the steps of connecting to GitHub for the first time, and both pulling from and pushing to GitHub from your computer.
Go to https://github.com and make sure you are logged in to your account. Click on [Repositories], then click the green [New] button.
This will open a form with several options. Enter the following:
myrepo (or whatever name you will easily remember).Click button [Create repository.]
Once this is created, we will now link to this from your computer. Click the button [Clone or download], and copy the HTTPS URL (if you click on the little clipboard icon this will get copied to your clipboard).
Go back to the terminal on your computer. Change your working directory until you are somewhere where you are happy to be making a copy of your repository. In a terminal:
pwd prints the current directorycd changes the directory
cd will go to your home directorycd .. will go up one levelcd /path/to/my/files will change to whatever path you have listedTo make a clone of your repository, type the following into the terminal, once you have changed directory:
git clone https://github.com/YOUR-USERNAME/YOUR-REPOSITORY.git
Replace the https:// address with the URL copied from GitHub.
To check that this has worked, we will change directory to the newly created clone repository. and take a look at its contents.
cd myrepo
ls
head README.md
git remote show origin
The last of these commands will provide information about where this would cloned from, information about where the master repository is held, and whether or not the local clone copy is up-to-date (it is for now). jenny@2015-mbp myrepo $ git remote show origin
We will make a simple change to the READ.md file, and then check to see if we can push this to GitHub. You can open this file with a text editor, or run the following command line to insert some new text.
echo "A line I wrote on my local computer" >> README.md
Now if you look at the status of git, it will tell you that one file (README.md) has been modified.
modified: README.md
Now let’s push this change to git. This is done in three parts. First we ‘stage’ the changes (basically creating a list of all the modifications we wish to push. Then we create a ‘commit’, the package of changes along with a brief description of what the modifications are. Finally, we push this to git. You should at this point be asked to enter your username and password for GitHub, so do so, and the changes will be pushed to GitHub.
git add -A
git commit -m "A commit from my local computer"
git push
The description of changes is a key part of the way git works, and you will not be allowed to push changes without this. The goal here is to make the description brief but informative, and in turn the commits you make should be relatively small increments each time. Remember that the goal of these comments is to allow collaborators and other users to understand what is going on and how this is affecting the code.
If this has worked, you should see some output similar to this:
Counting objects: 3, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 311 bytes | 0 bytes/s, done.
Total 3 (delta 0), reused 0 (delta 0)
To https://github.com/geogrtest/myrepo.git
b4112c5..de669ba master -> master
Now let’s check to see if the local change propagated to the GitHub repository. Go back to your GitHub page and refresh the repository page. You should now see the new line in the README file.
If you click on [commits], at the top of the page, you will get a list of the past changes (there will only be one so far). At the far right hand side of this page is a small icon that looks like this <> for each commit. Clicking this will show you the files in the repository before that commit was made, which allows you to track the changes made by yourself (and others).
If you have got to this point, you now have a successfully working git installation. We will next look at how to integrate this properly with R/RStudio.