Source file ⇒ 2017-lec20.Rmd
If you had lab already this week then you already did this. I am reposting the instructions for your convenience.
We can’t upload files to the server directly. Instead, we need to use a tool called scp (secure copy). Here’s how it works; you will have an opportunity to practice this in lab this week.
The instructions here are different for Mac and PC users:
MAC USERS
Open a terminal window by typing "terminal" in the Spotlight search in the top right of your screen.
In the terminal, use the following command to upload your file to the ‘radagast’ Berkeley statistics server:
scp pathToFileOnYourComputer/file.extension username@server:/PathToCopyFileInto
PC USERS
You will need to download and install WinSCP:
Here is a link to download the latest version of WinSCP:
Here is a YouTube video on how to use WinSCP:
Find your file in the left side of WinSCP and drag it to the Documents directory on the right side of WinSCP.
Here are some useful command line shortcuts:
wget
- download a file from the web
egrep
- print lines matching a pattern (regex)
cut
- extract columns of data from a field-delimited file
Here is a tab-delimited data set about potatoes:
wget -O potatoes.txt http://s3.amazonaws.com/assets.datacamp.com/course/importing_data_into_r/potatoes.txt
head potatoes.txt
## --2017-04-01 11:04:53-- http://s3.amazonaws.com/assets.datacamp.com/course/importing_data_into_r/potatoes.txt
## Resolving s3.amazonaws.com... 52.216.227.75
## Connecting to s3.amazonaws.com|52.216.227.75|:80... connected.
## HTTP request sent, awaiting response... 200 OK
## Length: 3575 (3.5K) [text/plain]
## Saving to: 'potatoes.txt'
##
## 0K ... 100% 136M=0s
##
## 2017-04-01 11:04:54 (136 MB/s) - 'potatoes.txt' saved [3575/3575]
##
## area temp size storage method texture flavor moistness
## 1 1 1 1 1 2.9 3.2 3.0
## 1 1 1 1 2 2.3 2.5 2.6
## 1 1 1 1 3 2.5 2.8 2.8
## 1 1 1 1 4 2.1 2.9 2.4
## 1 1 1 1 5 1.9 2.8 2.2
## 1 1 1 2 1 1.8 3.0 1.7
## 1 1 1 2 2 2.6 3.1 2.4
## 1 1 1 2 3 3.0 3.0 2.9
## 1 1 1 2 4 2.2 3.2 2.5
Let's cut out the first, second, and third columns and save them to a file called small_potatoes:
cat potatoes.txt | cut -f 1-3 > small_potatoes
head small_potatoes
## area temp size
## 1 1 1
## 1 1 1
## 1 1 1
## 1 1 1
## 1 1 1
## 1 1 1
## 1 1 1
## 1 1 1
## 1 1 1
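Let's keep only those small potatoes with temp (the second column) equal to 2. The pattern below matches one character, then whitespace, then a 2, so despite the output file name it filters on temp rather than size: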
cat potatoes.txt | cut -f 1-3 | egrep "^.[[:space:]]2" > size2_small_potatoes
head size2_small_potatoes
## 1 2 1
## 1 2 1
## 1 2 1
## 1 2 1
## 1 2 1
## 1 2 1
## 1 2 1
## 1 2 1
## 1 2 1
## 1 2 1
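Note that the pattern above matches a 2 in the second field (temp), as the output shows. As a minimal sketch of how the same idea could target the third field (size) instead, the example below builds a tiny tab-delimited sample. The file name demo_potatoes and its rows are made up for illustration, and `grep -E` is used as the equivalent of egrep.

```shell
# Hypothetical sample with the same layout as small_potatoes (tab-delimited)
printf 'area\ttemp\tsize\n1\t1\t1\n1\t2\t1\n1\t1\t2\n2\t2\t2\n' > demo_potatoes

# Second column equal to 2: any one character, whitespace, then a 2
grep -E '^.[[:space:]]2' demo_potatoes > temp2_demo

# Third column equal to 2: skip two one-character fields, then a 2 at end of line
grep -E '^.[[:space:]].[[:space:]]2$' demo_potatoes > size2_demo

cat temp2_demo
cat size2_demo
```

Both patterns assume that the earlier columns hold single-character values, as they do in this data set; wider values would need a pattern such as `[^[:space:]]+` in place of the dot.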
For example, to upload a file with scp and then filter it on the server:
scp ~/Desktop/swimming_pools.csv alucas@radagast.berkeley.edu:~/.
cat swimming_pools.csv | egrep Centre
cat swimming_pools.csv | egrep Centre | cut -d "," -f 1-2
cat swimming_pools.csv | egrep Centre | cut -d "," -f 1-2 > small_pools
Then, if you have a Mac, type the following in your own computer’s terminal to copy the result back:
scp alucas@radagast.berkeley.edu:~/small_pools ~/Desktop/.
# unix command
cat swimming_pools.csv | egrep Centre | cut -d "," -f 1-2 > center_pools.csv
scp pathToFileOnYourComputer/file.extension username@server:/PathToCopyFileInto
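To see what each stage of the egrep and cut pipeline contributes without access to the server, here is a small self-contained sketch. The file demo_pools.csv and its rows are invented for illustration and are not the contents of swimming_pools.csv; `grep -E` stands in for egrep.

```shell
# Hypothetical comma-separated sample (not the real swimming_pools.csv)
printf 'Name,Address,Lanes\n' > demo_pools.csv
printf 'North Aquatic Centre,12 Example Road,8\n' >> demo_pools.csv
printf 'Riverside Pool,5 Sample Street,6\n' >> demo_pools.csv
printf 'Hilltop Leisure Centre,40 Placeholder Avenue,10\n' >> demo_pools.csv

# Stage 1: keep only the lines that contain "Centre"
# Stage 2: split on commas (-d ",") and keep fields 1 through 2
cat demo_pools.csv | grep -E Centre | cut -d "," -f 1-2 > demo_centres

cat demo_centres
## North Aquatic Centre,12 Example Road
## Hilltop Leisure Centre,40 Placeholder Avenue
```

The -d option is needed here because cut splits on tabs by default, which is why the potato examples could omit it.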
Do example 1a in Star wars.
https://scf.berkeley.edu:3838/shiny/alucas/Lecture-20-collection/
Sed has many uses, but we will focus on using sed for substitution.
Syntax: sed s/regex/replacement/FLAG file
OR
cat file | sed s/regex/replacement/FLAG
FLAGS can be any of the following:
EXAMPLE:
echo one two three, three two one, one one hundred > file
cat file | sed s/one/ONE/g
## ONE two three, three two ONE, ONE ONE hundred
EXAMPLE:
echo day sunday | sed s/day/night/
## night sunday
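The two examples differ only in the flag. As a small sketch of that difference, the commands below run the same substitution with no flag, with g, and with a number, which selects just that occurrence. The input string is made up for illustration.

```shell
# No flag: only the first match on each line is replaced
echo "one one one" | sed 's/one/ONE/'
## ONE one one

# g flag: every match on the line is replaced
echo "one one one" | sed 's/one/ONE/g'
## ONE ONE ONE

# Numeric flag: only the 2nd match on the line is replaced
echo "one one one" | sed 's/one/ONE/2'
## one ONE one
```

Quoting the sed expression is optional for simple cases like these, but it becomes necessary once the regex or replacement contains spaces or characters the shell treats specially.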
Earlier we used the following commands to make a file called small_potatoes:
wget -O potatoes.txt http://s3.amazonaws.com/assets.datacamp.com/course/importing_data_into_r/potatoes.txt
cat potatoes.txt | cut -f 1-2 > small_potatoes
head -5 small_potatoes
Suppose we would like to actually make this into a script that we can reuse.
Steps:
wget -O potatoes.txt http://s3.amazonaws.com/assets.datacamp.com/course/importing_data_into_r$
cat potatoes.txt | cut -f $1-$2 > small_potatoes
head -$3 small_potatoes
Files and directories in Unix may have three types of permissions: read (r), write (w), and execute (x). Each permission may be on or off for each of three categories of users: the file or directory owner; other people in the same group as the owner; and all others.
In the terminal, add permission to execute the script (chmod u+x potatoes.sh).
Run the script with parameters (./potatoes.sh 1 2 5); the three arguments fill in $1, $2, and $3 in order.
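Putting these steps together, a minimal sketch of the whole workflow might look like the following. To keep it runnable offline, it uses a stand-in script named demo_potatoes.sh that reads a small locally generated file instead of calling wget; the script name, the file names, and the sample rows are all hypothetical.

```shell
# Hypothetical tab-delimited input standing in for potatoes.txt
printf 'area\ttemp\tsize\tstorage\n1\t1\t1\t1\n1\t2\t1\t2\n2\t1\t2\t3\n' > demo_input.txt

# Write the script; the quoted 'EOF' keeps $1, $2, $3 literal inside the file
cat > demo_potatoes.sh <<'EOF'
#!/bin/sh
# $1 = first column to keep, $2 = last column to keep, $3 = number of lines to show
cat demo_input.txt | cut -f "$1-$2" > demo_small
head -n "$3" demo_small
EOF

# Add execute permission for the file owner (u), then inspect the permissions
chmod u+x demo_potatoes.sh
ls -l demo_potatoes.sh

# Run with columns 1 through 2 and show the first 2 lines
./demo_potatoes.sh 1 2 2
```

In the `ls -l` listing, the first group of three permission characters belongs to the owner, so an x should appear there after the chmod step.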
Do examples 2a and 2b in sed and scripts.
https://scf.berkeley.edu:3838/shiny/alucas/Lecture-20-collection/