Creating and Committing a GitHub Repository

This tutorial walks through the steps needed to commit your working directory to a GitHub repository

1. First, sign in to your GitHub account and create a new repository

2. Next, name your repository and give it a short description

Choose whether the repository is ‘Public’ or ‘Private’
Check the box beside ‘Initialize this repository with a README’
Then click the green ‘Create repository’ button

3. Now, install Git Bash from https://gitforwindows.org/ (accept all defaults), then open the Git Bash shell

For some common Git Bash shell commands, see https://smbc-nzp.github.io/dataSci/git_bash.html
Use the ‘pwd’ (print working directory) command to see which directory Git Bash is currently in
Use ‘cd ..’ (change directory) to move up one level
Then use ‘cd’ followed by a path to navigate to your directory of interest
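
The navigation commands above can be tried safely with a minimal sketch like the one below. The folder names (projects, dataRequests) are hypothetical and are created in a temporary location, so nothing in your real directories is touched.

```shell
# Hypothetical folder layout, built in a temporary location for practice.
base=$(mktemp -d)
mkdir -p "$base/projects/dataRequests"

cd "$base/projects/dataRequests"   # navigate to the directory of interest
pwd                                # print the current working directory

cd ..                              # move up one level, into projects/
here=$(pwd)
echo "Now in: $here"
```

In everyday use you would replace the temporary path with the folder that holds your own scripts and data.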

4. Now our working directory is the one that holds our R directory and data files

We can check which files are located in this directory with ‘ls’
Next, create a nested sub-directory (folder) called gits with ‘mkdir gits’, and move into it with ‘cd gits’
We can now clone our repository into the gits directory using ‘git clone https://github.com/FWeco/dataRequests.git’ (the URL comes from our newly created GitHub repository)
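
As a rough sketch, step 4 might look like this in Git Bash. To keep the example runnable offline, a local bare repository stands in for the GitHub repository; that stand-in and the project folder name are assumptions made for illustration. With a real repository you would pass the https URL to ‘git clone’ instead.

```shell
# Stand-in for the GitHub repository: a local bare repository (an assumption
# made so the sketch runs without a network connection).
work=$(mktemp -d)
git init --quiet --bare "$work/dataRequests.git"

mkdir -p "$work/project"             # hypothetical project folder
cd "$work/project"
ls                                   # list the files in the current directory
mkdir gits                           # create the gits sub-directory
cd gits

# With a real repository: git clone https://github.com/FWeco/dataRequests.git
# Git warns that this stand-in is empty; a repository created with a README is not.
git clone --quiet "$work/dataRequests.git"
ls                                   # the clone appears as a dataRequests folder
```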

5. Next, we’ll navigate to the directory where we cloned our repository with ‘cd dataRequests’, and check its status with ‘git status’

We’re good to go!
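
To see what the status check reports, here is a small sketch run against a throwaway repository. The repository location and the file example.csv are hypothetical; in the tutorial you would run ‘git status’ inside the cloned dataRequests folder.

```shell
repo="$(mktemp -d)/dataRequests"   # hypothetical stand-in for the cloned folder
git init --quiet "$repo"
cd "$repo"

git status                         # full report: branch, staged and unstaged changes

touch example.csv                  # hypothetical new file dropped into the folder
status=$(git status --short)       # compact form; "??" marks an untracked file
echo "$status"
```

A clean clone reports nothing to commit. Once files are copied in, they show up as untracked until they are added.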

6. Now we need to add the files/documents that we want to push to our repository

Files can be copied into the repository folder from Git Bash, but I think it’s easier to just copy and paste
- So, copy all the files/docs you want to include in your GitHub repository and paste them into the cloned folder nested under gits/
- Then, use ‘git add’ followed by the file names (or ‘git add .’ for everything) to specify which files you’d like to commit in the next step

7. Now we’re ready to commit our directory to our repository on GitHub!

Use the ‘git commit -m’ command in Git Bash, followed by a message in quotation marks
The last step is to push all the commits to the GitHub repository using ‘git push origin master’ (if your repository’s default branch is named ‘main’, as on newer GitHub repositories, use ‘git push origin main’)

That’s it! The files/docs appear in your GitHub repository! If the repository is public, anyone can clone/download it!
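
Steps 6 and 7 can be sketched end to end as follows. This is only an illustration under stated assumptions: a local bare repository stands in for GitHub, and the file name, commit message, and user identity are hypothetical. The ‘git add’, ‘git commit -m’, and ‘git push origin master’ commands are the same ones you would run against the real repository.

```shell
work=$(mktemp -d)
git init --quiet --bare "$work/remote.git"            # stand-in for GitHub
git clone --quiet "$work/remote.git" "$work/dataRequests"
cd "$work/dataRequests"
git symbolic-ref HEAD refs/heads/master   # make sure the branch is named master

# Identity for this throwaway repository only (hypothetical values).
git config user.name "Example User"
git config user.email "user@example.com"

echo "STR_STATION_ID,tag" > example.csv   # hypothetical file pasted into the folder
git add example.csv                       # stage one file; 'git add .' stages everything
git commit --quiet -m "Add example dataset"
git push --quiet origin master            # send the commit to the remote

# Count the commits that arrived on the remote's master branch.
pushed=$(git --git-dir="$work/remote.git" rev-list --count master)
echo "Commits on the remote: $pushed"
```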

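One extremely useful feature is sharing data via GitHub, because we can use a file’s URL to import datasets directly from GitHub

In the repository, click on the .csv file that we pushed from our directory (called ‘2020-04-03_SRBC.csv’), then click on the ‘Raw’ button

The address shown in the browser is a direct link to our data, which can be accessed in R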

After copying the link, we save everything up to the file name to the ‘gitUrl’ object
- Then we can use the paste0() function to complete the URL, which is useful when many datasets are saved in the same repository
  - All we have to do is change the file name (the base URL stays the same)

gitUrl <-
  'https://raw.githubusercontent.com/FWeco/dataRequests/master/'


paste0(
  gitUrl,
  '2020-04-03_SRBC.csv')

## [1] "https://raw.githubusercontent.com/FWeco/dataRequests/master/2020-04-03_SRBC.csv"

Now we simply use the `read_csv` function and the objects saved above to import our data into R!

# Read in the data (read_csv() comes from the readr package):

library(readr)

read_csv(
  paste0(
    gitUrl,
    '2020-04-03_SRBC.csv'))

## Parsed with column specification:
## cols(
##   STR_STATION_ID = col_character(),
##   tag = col_character()
## )

## # A tibble: 100 x 2
##    STR_STATION_ID            tag     
##    <chr>                     <chr>   
##  1 20150709-1514-ablascovich Impaired
##  2 20150818-0959-jeremmille  Impaired
##  3 20150901-1557-ablascovic  Impaired
##  4 20150902-0836-ablascovic  Impaired
##  5 20150902-0844-ablascovic  Impaired
##  6 20150902-1310-ablascovic  Impaired
##  7 20150902-1330-ablascovic  Impaired
##  8 20150902-1351-ablascovic  Impaired
##  9 20150902-1502-ablascovic  Impaired
## 10 20150902-1507-jeremmille  Impaired
## # ... with 90 more rows

So, what happens if you send someone a dataset as an email attachment, then two hours later realize you have more data, outliers need to be removed, and so on?

You end up sending another email, with another dataset named ‘fishData2.csv’. Then another change is made. For complex analyses, before long you’re working with ‘fishData6.csv’.

The alternative with GitHub is to change the dataset in your local directory and push it to your repository. This way, colleagues who access your data are always working with an up-to-date dataset.
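
The update cycle described above can be sketched with two clones of the same repository, one for the author and one for a colleague. Everything here is illustrative: a local bare repository stands in for GitHub, and the folder names, identity, and the contents of fishData.csv are made up.

```shell
work=$(mktemp -d)
git init --quiet --bare "$work/remote.git"        # stand-in for GitHub

# The author publishes the first version of the dataset.
git clone --quiet "$work/remote.git" "$work/author"
cd "$work/author"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"             # hypothetical identity
git config user.email "author@example.com"
printf 'id,tag\n1,Impaired\n' > fishData.csv      # made-up data
git add fishData.csv
git commit --quiet -m "Add fishData"
git push --quiet origin master

# A colleague clones the repository.
git clone --quiet --branch master "$work/remote.git" "$work/colleague"

# The author later adds a record and pushes again; the file name never changes.
printf 'id,tag\n1,Impaired\n2,Impaired\n' > fishData.csv
git commit --quiet -am "Add a second record"
git push --quiet origin master

# The colleague pulls and now has the same file as the author.
cd "$work/colleague"
git pull --quiet origin master
cat fishData.csv
```

Colleagues who read the file through its raw URL rather than a clone get the new version the same way, without pulling anything.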

This logic can be extended to entire directories and RStudio projects using the steps above! You can give others access to your R scripts (which are constantly being updated). Staff in different regions/agencies can access up-to-date datasets and scripts and run your analysis exactly as you have, which makes collaboration easier.

Matt Shank

4/6/2020
