Task 1
I used the following command to read the .csv file of the complete genetic data:
read.csv("CompleteGeneticData.csv") -> CSVcompleteGeneticData
Task 2
I used install.packages("xlsx") to install the xlsx package on my computer. Then I used library(xlsx) to load the package into R.
The CompleteGeneticData.xlsx file was read and stored as a variable using read.xlsx2("CompleteGeneticData.xlsx", sheetIndex = 1) -> XLSXcompleteGeneticData
Task 3
Yes, R seems to interpret the data differently depending on how the file is read.
The file read using read.xlsx2() seems to have every column stored as a factor. The file read using read.csv(), on the other hand, seems to give each column the right data type. For example, if a column is a set of numbers, it is read as a numeric vector.
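One way to check this difference is to look at the class of every column. The sketch below does not use the real data: csv_like and xlsx_like are made-up data frames that only imitate the two import results, and all names in it are hypothetical. It shows the check and how a factor column of numbers could be converted back.

```r
# Hypothetical toy data standing in for the two imported data frames.
csv_like <- data.frame(
  id = c(1, 2, 3),
  length = c(101.5, 99, 100)
)
xlsx_like <- data.frame(
  id = factor(c("1", "2", "3")),
  length = factor(c("101.5", "99", "100"))
)

# class() applied to every column returns a named character vector.
csv_classes <- sapply(csv_like, class)
xlsx_classes <- sapply(xlsx_like, class)
print(csv_classes)
print(xlsx_classes)

# To recover the numbers from a factor, convert to character first.
# as.numeric() applied directly to a factor returns the internal level codes.
recovered <- as.numeric(as.character(xlsx_like$length))
print(recovered)
```

On the real objects the same check would be sapply(CSVcompleteGeneticData, class) and sapply(XLSXcompleteGeneticData, class).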
Task 4
CGTTCAATAAGCAAAAATCCATAGTTTTAGGAATGTGGGCT GCTTGGTGTGATGTAGAAGGCGCCAATGCATCTCGACGTAT GCGTATACGGGTTACCCCCTTTGCAATCAGTGCACACACAC ACACACACACACACACACACACACACACACAGTGCCAAGCA AAAATAACGCCAAGCAGAACGAAGACGTTCTCGAGAACACC AGAAGTTCGTGCTGTCGGGGCATGCGGCGAGTAAAGGGGAT
Task 5
subset(CSVcompleteGeneticData, CSVcompleteGeneticData$Plot == "WV.R") -> WV.Rdatasubset
plot(WV.Rdatasubset$transect.x, WV.Rdatasubset$transect.y, xlab = "transect.x", ylab = "transect.y", main = "transect.x vs transect.y")
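The subsetting step can be tried on a small example. The data frame toy below is made up (only the column names Plot, transect.x and transect.y are taken from the commands above), so this is just a sketch of how subset() compares with logical indexing, not the real result.

```r
# Hypothetical toy data; the values are invented for illustration.
toy <- data.frame(
  Plot = c("WV.R", "NY.A", "WV.R", "OH.B"),
  transect.x = c(1.0, 2.5, 3.0, 4.2),
  transect.y = c(0.5, 1.5, 2.0, 3.1),
  stringsAsFactors = FALSE
)

# subset() evaluates the condition inside the data frame,
# so the column can be named without the toy$ prefix.
by_subset <- subset(toy, Plot == "WV.R")

# The same rows selected with logical indexing.
by_index <- toy[toy$Plot == "WV.R", ]

print(by_subset)
print(all.equal(by_subset, by_index))
```

Both forms keep all columns, so either result could be passed to plot() in the same way as WV.Rdatasubset.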
Task 6
To draw the histogram I used multhist(), which is part of the package “plotrix”. Therefore, I first needed to install and load the package.
install.packages("plotrix", repos = "http://cran.rstudio.com/")
# The argument repos = "http://cran.rstudio.com/" is set so that the chunk runs smoothly when the Rmd file is knitted.
library(plotrix)
## Warning: package 'plotrix' was built under R version 3.2.5
I then used multhist() to make our histogram.
multhist(CSVcompleteGeneticData[, 9:20], xlab = "microsatellite lengths", ylab = "frequency", main = "Histogram of microsatellite length")
Task 7
In CSVcompleteGeneticData$free_slave, every value equal to 0 means that the ant is free and every value equal to 1 means that it is a slave ant. Therefore, to count how many ants are slaves, I used the following command:
length(which(CSVcompleteGeneticData$free_slave == 1))
## [1] 683
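The same count can be obtained in a few other ways. The sketch below uses a short made-up vector (free_slave here is hypothetical, not the real column) with one missing value, to show that the approaches agree as long as NA is handled.

```r
# Hypothetical toy vector: 0 = free, 1 = slave, NA = missing.
free_slave <- c(0, 1, 1, NA, 0, 1)

# which() returns only the positions where the test is TRUE, so NA is skipped.
count_which <- length(which(free_slave == 1))

# sum() counts TRUE values; na.rm = TRUE is needed or the result would be NA.
count_sum <- sum(free_slave == 1, na.rm = TRUE)

# table() counts every distinct value at once and leaves out NA by default.
count_table <- table(free_slave)

print(count_which)
print(count_sum)
print(count_table)
```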
Task 8
I used unique() to determine which colonies had slaves and which did not. The following chunk shows what I did:
unique(CSVcompleteGeneticData[, c(2, 4)]) -> colony_free_slaves
length(which(colony_free_slaves$free_slave==1))
## [1] 37
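To see why unique() on the two columns works, here is a small sketch with made-up data. The data frame ants and the column name colony are hypothetical (the real colony column is only referred to by position above). A colony that has both free and slave ants appears twice after unique(), but only once with free_slave equal to 1.

```r
# Hypothetical toy data: colony B contains both free and slave ants.
ants <- data.frame(
  colony = c("A", "A", "B", "B", "C", "C"),
  free_slave = c(0, 0, 0, 1, 1, 1),
  stringsAsFactors = FALSE
)

# One row per distinct (colony, free_slave) combination.
colony_status <- unique(ants[, c("colony", "free_slave")])
print(colony_status)

# Number of colonies with at least one slave ant.
colonies_with_slaves <- length(which(colony_status$free_slave == 1))

# Cross-check: distinct colony names among the slave ants only.
same_count <- length(unique(ants$colony[ants$free_slave == 1]))

print(colonies_with_slaves)
print(same_count)
```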
Task 9