Worksheet 21: reading Excel files and some ant data.

Task 1

CSVcompleteGeneticData <- read.csv("CompleteGeneticData.csv")

This reads the .csv file of the complete genetic data into a data frame.

Task 2

Task 3

  1. Yes, R seems to interpret the data differently depending on which function is used to open the file.

  2. The file read using read.xlsx() seems to have every column read as a factor. The file read using read.csv(), on the other hand, seems to give each column an appropriate data type. For example, a column that contains only numbers is read as a numeric vector.
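One way to check this kind of difference is to list the class of every column. The sketch below uses a small made-up CSV (the file contents and column names are hypothetical, not the worksheet's data) and simulates an all-factor import rather than calling read.xlsx(), so it runs without any extra packages.

```r
# Hypothetical toy file standing in for the worksheet's data.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("colony,length,free_slave", "A,101,1", "B,103,0"), csv_path)

csv_data <- read.csv(csv_path, stringsAsFactors = FALSE)

# Simulate an import that turned every column into a factor.
factor_data <- as.data.frame(lapply(csv_data, factor))

# One class per column, for each version of the data.
csv_classes <- sapply(csv_data, class)
factor_classes <- sapply(factor_data, class)

# Recover numbers from a factor: convert to character first, because
# as.numeric() applied directly to a factor returns the level codes.
recovered_length <- as.numeric(as.character(factor_data$length))
```

Comparing csv_classes with factor_classes shows at a glance which columns were imported differently.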

Task 4

CGTTCAATAAGCAAAAATCCATAGTTTTAGGAATGTGGGCT GCTTGGTGTGATGTAGAAGGCGCCAATGCATCTCGACGTAT GCGTATACGGGTTACCCCCTTTGCAATCAGTGCACACACAC ACACACACACACACACACACACACACACACAGTGCCAAGCA AAAATAACGCCAAGCAGAACGAAGACGTTCTCGAGAACACC AGAAGTTCGTGCTGTCGGGGCATGCGGCGAGTAAAGGGGAT

Task 5

WV.Rdatasubset <- subset(CSVcompleteGeneticData, Plot == "WV.R")
plot(WV.Rdatasubset$transect.x, WV.Rdatasubset$transect.y, xlab = "transect.x", ylab = "transect.y", main = "transect.x vs transect.y")
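The row selection used here can be tried on a small made-up data frame. The values below are hypothetical; only the column names Plot, transect.x and transect.y are taken from the worksheet.

```r
# Hypothetical toy data; the worksheet itself uses CSVcompleteGeneticData.
toy <- data.frame(
  Plot = c("WV.R", "WV.R", "NY.A", "WV.R"),
  transect.x = c(1.0, 2.5, 4.0, 3.2),
  transect.y = c(0.5, 1.5, 2.0, 2.8),
  stringsAsFactors = FALSE
)

# Inside subset(), column names can be used directly in the condition.
wv <- subset(toy, Plot == "WV.R")

# Bracket indexing that selects the same rows; which() skips rows where
# Plot is NA, as subset() does.
wv_bracket <- toy[which(toy$Plot == "WV.R"), ]
```

Both forms keep the three WV.R rows and drop the NY.A row.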

Task 6

install.packages("plotrix",repos="http://cran.rstudio.com/")
# The repos argument sets the CRAN mirror explicitly, so that the chunk runs smoothly when the .Rmd file is knitted.
library(plotrix)
## Warning: package 'plotrix' was built under R version 3.2.5
multhist(CSVcompleteGeneticData[,9:20], xlab="microsatellite lengths", ylab="frequency", main="Histogram of microsatellite length")

Task 7

length(which(CSVcompleteGeneticData$free_slave==1))
## [1] 683

Task 8

colony_free_slaves <- unique(CSVcompleteGeneticData[, c(2, 4)])
length(which(colony_free_slaves$free_slave==1))
## [1] 37
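Tasks 7 and 8 differ in that the first counts rows while the second counts distinct colonies. The sketch below shows this on made-up data. The column name colony is an assumption (the worksheet selects columns 2 and 4 by position), and the values are hypothetical.

```r
# Hypothetical toy data: one row per ant, several ants per colony.
ants <- data.frame(
  colony = c("c1", "c1", "c1", "c2", "c2", "c3"),
  free_slave = c(1, 1, 1, 0, 0, 1)
)

# Number of rows (individual ants) with free_slave equal to 1.
n_ants <- length(which(ants$free_slave == 1))

# Keep one row per distinct (colony, free_slave) pair, then count again.
colony_status <- unique(ants[, c("colony", "free_slave")])
n_colonies <- length(which(colony_status$free_slave == 1))
```

Here n_ants is 4 and n_colonies is 2. Selecting the columns by name instead of by position keeps the code working if the column order of the file changes.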

Task 9