Linux Tutorial

Introduction (Why use Linux)

Most of the information in this tutorial is taken from Ryan's Linux Tutorial.

The Linux command line may seem daunting, complex and scary. It is actually quite simple and intuitive once you understand what is going on and start using it in your everyday life.

The keys to getting along with Linux are:

Laziness

You might sometimes (though often not) spend more time on a single task than you would on Windows, but you will save time and effort when you need to repeat the same task many times.

Suppose you want to create 100 files called lazyfiles1, lazyfiles2, … One way to do it is to open a text editor and use “Save as…” 100 times. This is probably not a very good way to go…

 ls lazyfiles*
 ## ls: cannot access lazyfiles*: No such file or directory
for i in $(seq 1 100); do
  touch "lazyfiles$i"
done
 ls lazyfiles*
 ## lazyfiles1
 ## lazyfiles10
 ## lazyfiles100
 ## lazyfiles11
 ## lazyfiles12
 ## lazyfiles13
 ## lazyfiles14
 ## lazyfiles15
 ## lazyfiles16
 ## lazyfiles17
 ## lazyfiles18
 ## lazyfiles19
 ## lazyfiles2
 ## lazyfiles20
 ## lazyfiles21
 ## lazyfiles22
 ## lazyfiles23
 ## lazyfiles24
 ## lazyfiles25
 ## lazyfiles26
 ## lazyfiles27
 ## lazyfiles28
 ## lazyfiles29
 ## lazyfiles3
 ## lazyfiles30
 ## lazyfiles31
 ## lazyfiles32
 ## lazyfiles33
 ## lazyfiles34
 ## lazyfiles35
 ## lazyfiles36
 ## lazyfiles37
 ## lazyfiles38
 ## lazyfiles39
 ## lazyfiles4
 ## lazyfiles40
 ## lazyfiles41
 ## lazyfiles42
 ## lazyfiles43
 ## lazyfiles44
 ## lazyfiles45
 ## lazyfiles46
 ## lazyfiles47
 ## lazyfiles48
 ## lazyfiles49
 ## lazyfiles5
 ## lazyfiles50
 ## lazyfiles51
 ## lazyfiles52
 ## lazyfiles53
 ## lazyfiles54
 ## lazyfiles55
 ## lazyfiles56
 ## lazyfiles57
 ## lazyfiles58
 ## lazyfiles59
 ## lazyfiles6
 ## lazyfiles60
 ## lazyfiles61
 ## lazyfiles62
 ## lazyfiles63
 ## lazyfiles64
 ## lazyfiles65
 ## lazyfiles66
 ## lazyfiles67
 ## lazyfiles68
 ## lazyfiles69
 ## lazyfiles7
 ## lazyfiles70
 ## lazyfiles71
 ## lazyfiles72
 ## lazyfiles73
 ## lazyfiles74
 ## lazyfiles75
 ## lazyfiles76
 ## lazyfiles77
 ## lazyfiles78
 ## lazyfiles79
 ## lazyfiles8
 ## lazyfiles80
 ## lazyfiles81
 ## lazyfiles82
 ## lazyfiles83
 ## lazyfiles84
 ## lazyfiles85
 ## lazyfiles86
 ## lazyfiles87
 ## lazyfiles88
 ## lazyfiles89
 ## lazyfiles9
 ## lazyfiles90
 ## lazyfiles91
 ## lazyfiles92
 ## lazyfiles93
 ## lazyfiles94
 ## lazyfiles95
 ## lazyfiles96
 ## lazyfiles97
 ## lazyfiles98
 ## lazyfiles99

Now the files are there. Sometimes it is good to be lazy (or at least to save your energy for something creative).
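The listing above is sorted alphabetically, which is why lazyfiles10 appears before lazyfiles2. Here is a minimal sketch of one way around this, assuming you are free to choose the file names: pad the numbers with zeros so that alphabetical and numerical order agree. The directory name lazy_demo and the prefix padded are made up for this illustration.

```shell
# Work in a scratch directory so nothing else is touched (hypothetical name)
mkdir -p lazy_demo
cd lazy_demo

# printf '%03d' pads each number to three digits: 001, 002, ..., 100
for i in $(seq 1 100); do
  touch "padded$(printf '%03d' "$i")"
done

# Show the first five names; they now appear in numerical order
ls padded* | head -n 5
```

With GNU seq, `seq -w 1 100` produces the padded numbers directly, so the printf call could be dropped.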

# Let's delete all of them
rm lazyfiles*

\( \color{red}{\text{rm will not ask you if you really want to delete the files.}} \)
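Because rm deletes without asking, it can help to check what a pattern matches before you delete anything. The sketch below assumes a scratch directory (rm_demo is a made-up name). It previews the expanded pattern with echo and then uses the -i option, which makes rm ask about every file. To keep the example non-interactive, the answers ("n" for no) are fed in with printf. At the keyboard you would just type `rm -i lazyfiles*` and answer y or n yourself.

```shell
# Scratch directory with three test files (hypothetical names)
mkdir -p rm_demo
cd rm_demo
touch lazyfiles1 lazyfiles2 keepme

# Preview: echo prints the command with the pattern expanded, without running it
echo rm lazyfiles*

# -i asks before every removal; here both questions are answered with "n"
printf 'n\nn\n' | rm -i lazyfiles*

# All three files are still there, because every answer was "n"
ls
```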

Reproducibility

Reproducibility means that somebody else can perform the same analysis that you have performed, or a similar one.

Regarding reproducibility, Linux allows you to:

Connect to a server

We will connect to the server (the IP address, username and password will be given in class).

If you have a Windows machine, we will use

\( \color{red}{\text{Task 1}} \)

Navigation

View the contents of a directory

ls
## analysis.out
## analysis10.out
## analysis20.out
## linux_tutorial.Rmd
## linux_tutorial.html
## linux_tutorial.md
## molphe2018
## myAnalysis.out
## sayings.txt
## sayings.txt~
## seq.fa
# view as a list
ls -l
## total 120
## -rw-r--r-- 1 pavlos users  3714 Jun  6 22:12 analysis.out
## -rw-r--r-- 1 pavlos users   696 Jun  6 23:16 analysis10.out
## -rw-r--r-- 1 pavlos users   696 Jun  6 23:16 analysis20.out
## -rw-r--r-- 1 pavlos users  7594 Jun  7 15:12 linux_tutorial.Rmd
## -rw-r--r-- 1 pavlos users 29163 Jun  6 23:16 linux_tutorial.html
## -rw-r--r-- 1 pavlos users 23970 Jun  6 23:16 linux_tutorial.md
## drwxr-xr-x 2 pavlos users  4096 Jun  6 20:34 molphe2018
## -rw-r--r-- 1 pavlos users   696 Jun  6 23:16 myAnalysis.out
## -rw-r--r-- 1 pavlos users 15636 Jun  6 23:18 sayings.txt
## -rw-r--r-- 1 pavlos users 15594 Jun  6 23:10 sayings.txt~
## -rw-r--r-- 1 pavlos users  4080 Jun  6 22:50 seq.fa
# view as a list sorted by modification time, oldest first (-t sorts by time, -r reverses the order); -h prints sizes in human-readable format
ls -rtlh
## total 120K
## drwxr-xr-x 2 pavlos users 4.0K Jun  6 20:34 molphe2018
## -rw-r--r-- 1 pavlos users 3.7K Jun  6 22:12 analysis.out
## -rw-r--r-- 1 pavlos users 4.0K Jun  6 22:50 seq.fa
## -rw-r--r-- 1 pavlos users  16K Jun  6 23:10 sayings.txt~
## -rw-r--r-- 1 pavlos users  696 Jun  6 23:16 analysis10.out
## -rw-r--r-- 1 pavlos users  696 Jun  6 23:16 analysis20.out
## -rw-r--r-- 1 pavlos users  696 Jun  6 23:16 myAnalysis.out
## -rw-r--r-- 1 pavlos users  24K Jun  6 23:16 linux_tutorial.md
## -rw-r--r-- 1 pavlos users  29K Jun  6 23:16 linux_tutorial.html
## -rw-r--r-- 1 pavlos users  16K Jun  6 23:18 sayings.txt
## -rw-r--r-- 1 pavlos users 7.5K Jun  7 15:12 linux_tutorial.Rmd
# the same listing for a specific directory: replace <DIR> with the name of the directory
# (typed literally, <DIR> is read by the shell as a redirection and gives the error below)
ls -rtlh <DIR>
## bash: -c: line 2: syntax error near unexpected token `newline'
## bash: -c: line 2: `ls -rtlh <DIR>'

View the available space

# 'd'isk 'f'ree
df 
## Filesystem                                  1K-blocks        Used   Available Use% Mounted on
## udev                                         49468620          16    49468604   1% /dev
## tmpfs                                         9896036        1452     9894584   1% /run
## /dev/dm-1                                    77679592    66445024     7265544  91% /
## none                                                4           0           4   0% /sys/fs/cgroup
## none                                             5120           0        5120   0% /run/lock
## none                                         49480172          16    49480156   1% /run/shm
## none                                           102400          24      102376   1% /run/user
## /dev/mapper/cerberus--vg-home              1551700964  1003969836   468886200  69% /home
## //139.91.162.46/pavlos_temp/              41115215464 27594248340 13520967124  68% /home/kutsukos/smbyiota
## //139.91.162.46/pavlos_temp/              41115215464 27594248340 13520967124  68% /home/mariav/smbyiota
## //139.91.162.59/evolab/mariav             33714898208 19508970428 14205927780  58% /home/mariav/synology
## //139.91.162.59/evolab/                   33714898208 19508970428 14205927780  58% /home/kutsukos/synology
## //139.91.162.59/evolab/                   33714898208 19508970428 14205927780  58% /home/anna/synology
## //139.91.162.59/evolab/                   33714898208 19508970428 14205927780  58% /home/nefelyp/synology
## //139.91.162.59/evolab/maria_grigoriou    33714898208 19508970428 14205927780  58% /home/maria/synology
## //139.91.162.59/evolab/                   33714898208 19508970428 14205927780  58% /home/pavlos/synology
## //139.91.162.59/evolab/                   33714898208 19508970428 14205927780  58% /home/ioanna/synology
## //139.91.162.59/evolab/                   33714898208 19508970428 14205927780  58% /home/joanna/synology
## pavlos@139.91.162.90:/home/cluster/pavlos 15380542988 11500201580  3099047636  79% /home/pavlos/data
# human readable
df -h
## Filesystem                                 Size  Used Avail Use% Mounted on
## udev                                        48G   16K   48G   1% /dev
## tmpfs                                      9.5G  1.5M  9.5G   1% /run
## /dev/dm-1                                   75G   64G  7.0G  91% /
## none                                       4.0K     0  4.0K   0% /sys/fs/cgroup
## none                                       5.0M     0  5.0M   0% /run/lock
## none                                        48G   16K   48G   1% /run/shm
## none                                       100M   24K  100M   1% /run/user
## /dev/mapper/cerberus--vg-home              1.5T  958G  448G  69% /home
## //139.91.162.46/pavlos_temp/                39T   26T   13T  68% /home/kutsukos/smbyiota
## //139.91.162.46/pavlos_temp/                39T   26T   13T  68% /home/mariav/smbyiota
## //139.91.162.59/evolab/mariav               32T   19T   14T  58% /home/mariav/synology
## //139.91.162.59/evolab/                     32T   19T   14T  58% /home/kutsukos/synology
## //139.91.162.59/evolab/                     32T   19T   14T  58% /home/anna/synology
## //139.91.162.59/evolab/                     32T   19T   14T  58% /home/nefelyp/synology
## //139.91.162.59/evolab/maria_grigoriou      32T   19T   14T  58% /home/maria/synology
## //139.91.162.59/evolab/                     32T   19T   14T  58% /home/pavlos/synology
## //139.91.162.59/evolab/                     32T   19T   14T  58% /home/ioanna/synology
## //139.91.162.59/evolab/                     32T   19T   14T  58% /home/joanna/synology
## pavlos@139.91.162.90:/home/cluster/pavlos   15T   11T  2.9T  79% /home/pavlos/data

View how much space your current directory uses

du 
## 4    ./molphe2018
## 124  .
# Report only the current directory and its immediate sub-directories
du --max-depth=1 
# du -h prints the sizes in human-readable format
## 4    ./molphe2018
## 124  .
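To see at a glance which sub-directory takes the most space, du can be combined with sort. This is a sketch that assumes the GNU versions of du and sort (the usual ones on Linux). The directory du_demo and the files in it are made up for the example.

```shell
# Scratch directory with two sub-directories of different size (hypothetical names)
mkdir -p du_demo/small du_demo/big
cd du_demo
head -c 1000 /dev/zero > small/a.dat       # about 1 KB of zero bytes
head -c 1000000 /dev/zero > big/b.dat      # about 1 MB of zero bytes

# -s prints a single summary line for the directory, -h uses human-readable units
du -sh .

# One line per sub-directory plus the total, sorted by size
# (sort -h understands units such as K and M)
du -h --max-depth=1 . | sort -h
```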

It's good practice to organize your work in different folders.

Create directories

# avoid spaces and 'strange' characters in names
# use only letters and numbers; prefer lowercase letters
mkdir molphe2018
## mkdir: cannot create directory ‘molphe2018’: File exists
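The error above appears because the directory already exists. A small sketch of the -p option, which avoids this error and also creates any missing parent directories in one step. The sub-directories day1 and results are made-up names.

```shell
# -p creates the whole chain of directories if needed
mkdir -p molphe2018/day1/results

# Running it on an existing directory is not an error with -p
mkdir -p molphe2018

# List every directory that now exists under molphe2018
find molphe2018 -type d
```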

Navigate into directories

# cd is from 'c'hange 'd'irectory
cd molphe2018

Navigate to the parent directory

# cd to the parent directory
cd ../

Create and open a file to write something in it

nano <FILENAME>

Create an empty file (not so useful)

#create an empty file
touch myfile

Delete files

# delete files
rm myfile

Paths

An absolute path is the 'route' from the root of the file system (/) to a file or directory, for example to the directory you are in now.

# This is an 'absolute' path
pwd
## /home/pavlos/teaching/petnica/linux

A relative path is the location of a file or directory in relation to where you are now.
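The difference is easiest to see by moving around. The sketch below builds a small directory tree (paths_demo, project and data are made-up names) and reaches the same place once with relative paths and once with an absolute path.

```shell
# Build a small directory tree to walk around in (hypothetical names)
mkdir -p paths_demo/project/data
cd paths_demo/project
start=$(pwd)                 # absolute path of the 'project' directory

cd data                      # relative path: 'data' inside the current directory
cd ..                        # relative path: the parent directory
after_relative=$(pwd)

cd "$start/data"             # absolute path: works from anywhere on the machine
cd "$start"
after_absolute=$(pwd)

# Both lines print the same directory
echo "$after_relative"
echo "$after_absolute"
```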

\( \color{red}{\text{Task 2}} \)

Transferring files from your machine to the server

Use WinSCP. Download and install it.

The IP, username and the port will be provided during the lecture.

\( \color{red}{\text{Task 3}} \)

File manipulation

Often, you will have to do something with files. For example:

View the contents of a file

# View all the contents of the file
cat analysis.out
## Position Likelihood  Alpha
## 20.0000  0.000000e+00    1.200000e+03
## 1025.4545    4.393404e-08    7.272726e-02
## 2030.9091    2.106529e-01    3.003142e-02
## 3036.3636    2.428282e-08    1.250000e-01
## 4041.8181    1.658971e-08    6.315789e-01
## 5047.2726    1.658958e-08    1.791044e-01
## 6052.7272    4.393404e-08    1.935484e-01
## 7058.1817    1.779838e-03    2.880991e-02
## 8063.6362    0.000000e+00    9.230769e-01
## 9069.0908    0.000000e+00    6.315790e-01
## 10074.5453   6.456803e-09    1.621621e-01
## 11079.9998   3.343408e-01    3.883636e+00
## 12085.4543   0.000000e+00    2.400001e+00
## 13090.9089   4.393404e-08    6.000000e-01
## 14096.3634   0.000000e+00    1.276596e-01
## 15101.8179   5.156527e-02    2.495351e-02
## 16107.2725   1.123356e-01    2.233667e-02
## 17112.7270   4.227156e-08    2.857143e-01
## 18118.1815   0.000000e+00    4.651165e-02
## 19123.6360   3.634563e-08    1.714285e+00
## 20129.0906   0.000000e+00    1.188119e-01
## 21134.5451   2.428283e-08    1.250000e-01
## 22139.9996   0.000000e+00    5.454546e-02
## 23145.4542   1.041301e-08    1.263158e-01
## 24150.9087   2.884595e-01    2.963999e+02
## 25156.3632   4.069906e-08    4.999999e-01
## 26161.8177   0.000000e+00    3.870968e-01
## 27167.2723   3.634560e-08    7.058822e-01
## 28172.7268   1.187716e-07    7.361965e-02
## 29178.1813   3.015034e-02    1.169391e-01
## 30183.6359   0.000000e+00    1.445784e-01
## 31189.0904   0.000000e+00    5.714287e-01
## 32194.5449   0.000000e+00    1.818182e-01
## 33199.9995   0.000000e+00    3.000000e-01
## 34205.4540   1.125698e+00    3.303387e-02
## 35210.9085   2.054864e-04    5.254086e-02
## 36216.3630   1.041234e-08    6.185565e-02
## 37221.8176   4.069905e-08    6.629833e-02
## 38227.2721   0.000000e+00    4.000001e+00
## 39232.7266   4.069902e-08    2.307691e-01
## 40238.1812   0.000000e+00    2.500001e-01
## 41243.6357   1.125718e+00    6.056210e-01
## 42249.0902   0.000000e+00    1.333334e+00
## 43254.5447   0.000000e+00    1.818182e-01
## 44259.9993   4.069907e-08    2.000000e-01
## 45265.4538   0.000000e+00    5.106384e-02
## 46270.9083   3.819798e-01    3.948344e-02
## 47276.3629   1.658953e-08    3.333333e-01
## 48281.8174   0.000000e+00    1.739131e-01
## 49287.2719   0.000000e+00    1.165049e-01
## 50292.7264   2.428284e-08    1.111111e-01
## 51298.1810   0.000000e+00    1.016949e-01
## 52303.6355   1.536771e-02    2.325384e-02
## 53309.0900   3.634563e-08    5.714285e-01
## 54314.5446   0.000000e+00    2.608696e-01
## 55319.9991   0.000000e+00    8.000000e-02
## 56325.4536   2.433211e-03    2.130534e-01
## 57330.9081   0.000000e+00    2.000001e-01
## 58336.3627   0.000000e+00    8.823533e-02
## 59341.8172   0.000000e+00    8.510639e-02
## 60347.2717   4.069906e-08    5.381165e-02
## 61352.7263   0.000000e+00    4.411765e-02
## 62358.1808   3.764147e-01    4.009462e-02
## 63363.6353   8.027959e-08    1.791044e-01
## 64369.0898   0.000000e+00    2.400001e-01
## 65374.5444   0.000000e+00    3.529412e-01
## 66379.9989   0.000000e+00    4.000000e-01
## 67385.4534   4.069900e-08    1.599999e-01
## 68390.9080   8.412437e-08    4.800000e-02
## 69396.3625   0.000000e+00    5.000001e-01
## 70401.8170   6.457014e-09    8.053688e-02
## 71407.2715   2.428300e-08    2.790697e-01
## 72412.7261   6.457213e-09    1.304347e-01
## 73418.1806   0.000000e+00    1.666667e-01
## 74423.6351   4.069904e-08    1.714285e+00
## 75429.0897   2.248192e-03    2.280246e-02
## 76434.5442   0.000000e+00    7.500001e-01
## 77439.9987   0.000000e+00    1.200000e+00
## 78445.4532   0.000000e+00    2.400000e+00
## 79450.9078   4.069905e-08    9.230767e-02
## 80456.3623   0.000000e+00    3.000000e+00
## 81461.8168   0.000000e+00    1.200000e+01
## 82467.2714   0.000000e+00    9.230772e-01
## 83472.7259   0.000000e+00    1.500000e+00
## 84478.1804   1.036417e-01    4.174892e-02
## 85483.6349   4.001489e-03    1.188064e-02
## 86489.0895   0.000000e+00    5.714287e-01
## 87494.5440   2.721137e-01    1.590676e-02
## 88499.9985   4.069901e-08    1.714285e-01
## 89505.4531   4.805540e-01    2.156102e-02
## 90510.9076   0.000000e+00    4.800002e-02
## 91516.3621   4.227154e-08    7.228914e-02
## 92521.8167   0.000000e+00    1.518988e-01
## 93527.2712   1.076509e-01    7.103568e-02
## 94532.7257   0.000000e+00    2.307693e-01
## 95538.1802   0.000000e+00    4.580153e-02
## 96543.6348   4.069906e-08    7.643311e-02
## 97549.0893   4.151799e-01    2.635617e-02
## 98554.5438   0.000000e+00    2.000000e+00
## 99560.0000   0.000000e+00    1.200000e+03

View the contents page-by-page

# View the contents one page at a time (press space for the next page, q to quit)
# 'less' also has some cool options for more advanced things
less analysis.out

Get a specific column from the file

cut -f1 analysis.out # -f2 the second column, -f3 the third column
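cut can also return several columns at once, and it can split on characters other than the tab. A short sketch on a made-up table in the same layout as analysis.out (the file name cut_demo.tsv and the values are invented for the example):

```shell
# A small tab-separated table (made-up values)
printf 'Position\tLikelihood\tAlpha\n'  > cut_demo.tsv
printf '20.0\t0.5\t1.2\n'              >> cut_demo.tsv
printf '1025.4\t0.1\t0.07\n'           >> cut_demo.tsv

# Columns 1 and 3 together
cut -f1,3 cut_demo.tsv

# For a different separator, name it with -d (here a comma); this prints b
printf 'a,b,c\n' | cut -d',' -f2
```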

Get the first 10 lines from the file

head -10 analysis.out
## Position Likelihood  Alpha
## 20.0000  0.000000e+00    1.200000e+03
## 1025.4545    4.393404e-08    7.272726e-02
## 2030.9091    2.106529e-01    3.003142e-02
## 3036.3636    2.428282e-08    1.250000e-01
## 4041.8181    1.658971e-08    6.315789e-01
## 5047.2726    1.658958e-08    1.791044e-01
## 6052.7272    4.393404e-08    1.935484e-01
## 7058.1817    1.779838e-03    2.880991e-02
## 8063.6362    0.000000e+00    9.230769e-01

Get the last 10 lines from the file

tail -10 analysis.out
## 90510.9076   0.000000e+00    4.800002e-02
## 91516.3621   4.227154e-08    7.228914e-02
## 92521.8167   0.000000e+00    1.518988e-01
## 93527.2712   1.076509e-01    7.103568e-02
## 94532.7257   0.000000e+00    2.307693e-01
## 95538.1802   0.000000e+00    4.580153e-02
## 96543.6348   4.069906e-08    7.643311e-02
## 97549.0893   4.151799e-01    2.635617e-02
## 98554.5438   0.000000e+00    2.000000e+00
## 99560.0000   0.000000e+00    1.200000e+03
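head and tail can be combined to pick out any block of lines, and tail can start at a given line, which is handy for skipping a header. A sketch on a made-up table (the file headtail_demo.tsv and its values are invented for the example):

```shell
# A small table with a header line (made-up values)
printf 'Position\tLikelihood\n10\t0.1\n20\t0.2\n30\t0.3\n40\t0.4\n' > headtail_demo.tsv

# tail -n +2 starts printing at line 2, so the header is dropped
tail -n +2 headtail_demo.tsv

# Lines 2 and 3 only: keep the first 3 lines, then the last 2 of those
head -n 3 headtail_demo.tsv | tail -n 2
```

`head -n 10` is the standard spelling of `head -10`; both forms work with the GNU tools.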

Find strings in the file

# find sequence names in the file
grep '>' seq.fa
## >ENST00000448914.1 cds chromosome:GRCh38:14:22449113:22449125:1 gene:ENSG00000228985.1 gene_biotype:TR_D_gene transcript_biotype:TR_D_gene gene_symbol:TRDD3 description:T cell receptor delta diversity 3 [Source:HGNC Symbol;Acc:HGNC:12256]
## >ENST00000631435.1 cds chromosome:GRCh38:CHR_HSCHR7_2_CTG6:142847306:142847317:1 gene:ENSG00000282253.1 gene_biotype:TR_D_gene transcript_biotype:TR_D_gene gene_symbol:TRBD1 description:T cell receptor beta diversity 1 [Source:HGNC Symbol;Acc:HGNC:12158]
## >ENST00000632684.1 cds chromosome:GRCh38:7:142786213:142786224:1 gene:ENSG00000282431.1 gene_biotype:TR_D_gene transcript_biotype:TR_D_gene gene_symbol:TRBD1 description:T cell receptor beta diversity 1 [Source:HGNC Symbol;Acc:HGNC:12158]
## >ENST00000434970.2 cds chromosome:GRCh38:14:22439007:22439015:1 gene:ENSG00000237235.2 gene_biotype:TR_D_gene transcript_biotype:TR_D_gene gene_symbol:TRDD2 description:T cell receptor delta diversity 2 [Source:HGNC Symbol;Acc:HGNC:12255]
## >ENST00000415118.1 cds chromosome:GRCh38:14:22438547:22438554:1 gene:ENSG00000223997.1 gene_biotype:TR_D_gene transcript_biotype:TR_D_gene gene_symbol:TRDD1 description:T cell receptor delta diversity 1 [Source:HGNC Symbol;Acc:HGNC:12254]
## >ENST00000633010.1 cds chromosome:GRCh38:CHR_HSCHR14_3_CTG1:105895279:105895294:-1 gene:ENSG00000282274.1 gene_biotype:IG_D_gene transcript_biotype:IG_D_gene gene_symbol:IGHD4-17 description:immunoglobulin heavy diversity 4-17 [Source:HGNC Symbol;Acc:HGNC:5503]
## >ENST00000632968.1 cds chromosome:GRCh38:CHR_HSCHR14_3_CTG1:105891962:105891978:-1 gene:ENSG00000282592.1 gene_biotype:IG_D_gene transcript_biotype:IG_D_gene gene_symbol:IGHD1-20 description:immunoglobulin heavy diversity 1-20 [Source:HGNC Symbol;Acc:HGNC:5484]
## >ENST00000603693.1 cds chromosome:GRCh38:15:21011451:21011469:-1 gene:ENSG00000270451.1 gene_biotype:IG_D_gene transcript_biotype:IG_D_gene gene_symbol:IGHD4OR15-4B description:immunoglobulin heavy diversity 4/OR15-4B (non-functional) [Source:HGNC Symbol;Acc:HGNC:5507]
## >ENST00000452198.1 cds chromosome:GRCh38:14:105881539:105881556:-1 gene:ENSG00000225825.1 gene_biotype:IG_D_gene transcript_biotype:IG_D_gene gene_symbol:IGHD6-25 description:immunoglobulin heavy diversity 6-25 [Source:HGNC Symbol;Acc:HGNC:5516]
## >ENST00000632609.1 cds chromosome:GRCh38:CHR_HSCHR14_3_CTG1:105905268:105905298:-1 gene:ENSG00000282373.1 gene_biotype:IG_D_gene transcript_biotype:IG_D_gene gene_symbol:IGHD3-10 description:immunoglobulin heavy diversity 3-10 [Source:HGNC Symbol;Acc:HGNC:5495]
## >ENST00000439842.1 cds chromosome:GRCh38:14:105865551:105865561:-1 gene:ENSG00000236597.1 gene_biotype:IG_D_gene transcript_biotype:IG_D_gene gene_symbol:IGHD7-27 description:immunoglobulin heavy diversity 7-27 [Source:HGNC Symbol;Acc:HGNC:5518]
## >ENST00000632911.1 cds chromosome:GRCh38:CHR_HSCHR14_3_CTG1:105905452:105905482:-1 gene:ENSG00000281939.1 gene_biotype:IG_D_gene transcript_biotype:IG_D_gene gene_symbol:IGHD3-9 description:immunoglobulin heavy diversity 3-9 [Source:HGNC Symbol;Acc:HGNC:5499]
## >ENST00000633504.1 cds chromosome:GRCh38:CHR_HSCHR14_3_CTG1:105907982:105908012:-1 gene:ENSG00000282132.1 gene_biotype:IG_D_gene transcript_biotype:IG_D_gene gene_symbol:IGHD2-8 description:immunoglobulin heavy diversity 2-8 [Source:HGNC Symbol;Acc:HGNC:5492]
## >ENST00000632304.1 cds chromosome:GRCh38:CHR_HSCHR14_3_CTG1:105910678:105910694:-1 gene:ENSG00000282495.1 gene_biotype:IG_D_gene transcript_biotype:IG_D_gene gene_symbol:IGHD1-7 description:immunoglobulin heavy diversity 1-7 [Source:HGNC Symbol;Acc:HGNC:5486]
## >ENST00000633159.1 cds chromosome:GRCh38:CHR_HSCHR14_3_CTG1:105892470:105892490:-1 gene:ENSG00000282487.1 gene_biotype:IG_D_gene transcript_biotype:IG_D_gene gene_symbol:IGHD6-19 description:immunoglobulin heavy diversity 6-19 [Source:HGNC Symbol;Acc:HGNC:5515]

Count strings in the file

# count the lines that contain a sequence name
grep -c '>' seq.fa
## 15
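Two more grep options are often useful with sequence files: -n shows the line number of each match, and -v inverts the search. A sketch on a tiny made-up FASTA file (grep_demo.fa and its contents are invented for the example):

```shell
# A tiny FASTA file: two header lines and two sequence lines (made-up content)
printf '>seq1 gene_symbol:AAA\nACGT\n>seq2 gene_symbol:BBB\nGGCC\n' > grep_demo.fa

# -n prefixes every matching line with its line number
grep -n '>' grep_demo.fa

# -v prints the lines that do NOT match, here the sequences themselves
grep -v '>' grep_demo.fa

# Options can be combined: count the lines that are not headers
grep -vc '>' grep_demo.fa
```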

Count the number of lines, words and characters

wc analysis.out
##  101  303 3714 analysis.out

Count only the number of lines

wc -l analysis.out
## 101 analysis.out

Direct output to another file

# take the first 10 lines and save them into another file
head -10 analysis.out > analysis10.out
cat analysis10.out
## Position Likelihood  Alpha
## 20.0000  0.000000e+00    1.200000e+03
## 1025.4545    4.393404e-08    7.272726e-02
## 2030.9091    2.106529e-01    3.003142e-02
## 3036.3636    2.428282e-08    1.250000e-01
## 4041.8181    1.658971e-08    6.315789e-01
## 5047.2726    1.658958e-08    1.791044e-01
## 6052.7272    4.393404e-08    1.935484e-01
## 7058.1817    1.779838e-03    2.880991e-02
## 8063.6362    0.000000e+00    9.230769e-01

Direct output to another file – append

# take the first 10 lines and append them to the end of the existing file
head -10 analysis.out >> analysis10.out
cat analysis10.out
## Position Likelihood  Alpha
## 20.0000  0.000000e+00    1.200000e+03
## 1025.4545    4.393404e-08    7.272726e-02
## 2030.9091    2.106529e-01    3.003142e-02
## 3036.3636    2.428282e-08    1.250000e-01
## 4041.8181    1.658971e-08    6.315789e-01
## 5047.2726    1.658958e-08    1.791044e-01
## 6052.7272    4.393404e-08    1.935484e-01
## 7058.1817    1.779838e-03    2.880991e-02
## 8063.6362    0.000000e+00    9.230769e-01
## Position Likelihood  Alpha
## 20.0000  0.000000e+00    1.200000e+03
## 1025.4545    4.393404e-08    7.272726e-02
## 2030.9091    2.106529e-01    3.003142e-02
## 3036.3636    2.428282e-08    1.250000e-01
## 4041.8181    1.658971e-08    6.315789e-01
## 5047.2726    1.658958e-08    1.791044e-01
## 6052.7272    4.393404e-08    1.935484e-01
## 7058.1817    1.779838e-03    2.880991e-02
## 8063.6362    0.000000e+00    9.230769e-01
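
The practical difference between > and >> is that > always starts from an empty file, while >> keeps what is already there. A minimal sketch with a made-up file name (redirect_demo.txt):

```shell
echo "first"  > redirect_demo.txt    # creates the file (or empties an existing one)
echo "second" >> redirect_demo.txt   # adds a line at the end
wc -l redirect_demo.txt              # the file has two lines at this point

echo "third"  > redirect_demo.txt    # > again: the two earlier lines are gone
cat redirect_demo.txt
```

Like rm, the > redirection does not ask before it replaces the contents of an existing file.
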
# copy and move files
cp ./analysis10.out ./analysis20.out # copy to another file in the same directory
cp ./analysis20.out ../ # copy to the parent directory, keeping the same name
cp ./analysis20.out ../myAnalysis.out # copy to the parent directory under a new name
mv ../myAnalysis.out ./ # move the file from the parent directory into the current directory
# mv myAnalysis.out analysis20.out would rename the file (AND OVERWRITE analysis20.out, which already exists here)
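
Since mv and cp overwrite an existing target without warning, one cautious habit is to check whether the target exists before renaming. The sketch below does this in a scratch directory (mv_demo is a made-up name, and the file contents are invented). At the keyboard, `mv -i` and `cp -i` offer a similar safety net by asking before they overwrite.

```shell
# Scratch directory with two files of the same names as above (made-up contents)
mkdir -p mv_demo
cd mv_demo
echo "important results" > analysis20.out
echo "scratch copy"      > myAnalysis.out

# [ -e FILE ] is true when FILE exists; only rename when the target is free
if [ -e analysis20.out ]; then
  echo "analysis20.out already exists, not overwriting"
else
  mv myAnalysis.out analysis20.out
fi

# The original content is intact and both files are still there
cat analysis20.out
ls
```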

Combine commands

We use '|' (the pipe) to send the output of one command to the next command.

#grep and count
# how many sequences exist?
grep '>' seq.fa | wc -l
## 15
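
Pipes can chain more than two commands. The sketch below uses a tiny made-up FASTA file (pipe_demo.fa) with a simplified header layout. In the real seq.fa the gene symbol sits in a later field, so the field number given to cut would have to be adjusted.

```shell
# Three sequences, two of them with the same gene symbol (made-up content)
printf '>id1 gene_symbol:TRBD1 x\nACGT\n>id2 gene_symbol:TRBD1 y\nGGCC\n>id3 gene_symbol:TRDD3 z\nTTAA\n' > pipe_demo.fa

# Header lines only, then keep the first space-separated word (the identifier)
grep '>' pipe_demo.fa | cut -d' ' -f1

# How often does each gene symbol occur?
# sort groups identical lines together, uniq -c counts each group
grep '>' pipe_demo.fa | cut -d' ' -f2 | sort | uniq -c
```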

\( \color{red}{\text{Task 4}} \)

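# grep twice: find the sayings that mention both God and man (-i ignores upper/lower case)
# then count them, and finally look for sayings that mention both God and woman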
grep -i God sayings.txt | grep -i man
grep -i God sayings.txt | grep -i man | wc -l
grep -i God sayings.txt | grep -i woman 
## When God saw how faulty was man He tried again and made woman.  As to
## God made machine language; all the rest is the work of man.
## 2
## When God saw how faulty was man He tried again and made woman.  As to

Running applications (programs) in Linux

We will now run RAxML, a phylogenetics program, to estimate a phylogeny and its bootstrap values on Linux. We will use primate DNA sequences from the D-loop of the mitochondrion. A short description can be found here: “This is a 14-species data set with mitochondrial sequences. It was selected by Masami Hasegawa from a data set of sequences of 896 nucleotides collected by a number of researchers in Japan. Hasegawa selected sites from the D-loop noncoding region and third positions of adjacent coding sequences, to achieve a set of sites that was as close as possible to having no rate variation. It is used in Figure 19.1, Table 19.1, and Figure 19.2.”