CIND 123 Fall 2019 - Assignment #1

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

Use RStudio for this assignment. Edit the file assignment-1.Rmd and insert your R code wherever you see the string “#INSERT YOUR ANSWER HERE”.

When you click the Knit button, a document will be generated that includes both the content and the output of any embedded R code chunks within the document.

Sample Question and Solution

Use seq() to create the vector \((1,2,3,\ldots,10)\).

seq(1,10)

##  [1]  1  2  3  4  5  6  7  8  9 10

Question 1

Use the seq() function to create the vector \((1, 7, 13, \ldots, 61)\). Note that each term in this sequence is of the form \(1 + 6n\) where \(n = 0, \ldots, 10\).

seq(1,61,by=6)

##  [1]  1  7 13 19 25 31 37 43 49 55 61
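As a quick check of the note in the question, the same vector can be built directly from the formula \(1 + 6n\). This is a minimal sketch; the names `by_step` and `by_formula` are illustrative and not part of the assignment.

```r
# Build the sequence two ways and confirm that they agree.
by_step <- seq(1, 61, by = 6)     # start, end, step size
by_formula <- 1 + 6 * (0:10)      # 1 + 6n for n = 0, ..., 10

all(by_step == by_formula)        # TRUE when both constructions match
length(by_step)                   # 11 terms, one per value of n
```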

Use seq() and c() to create the vector \((1, 2, 3, \ldots, 10, 9, 8, \ldots, 3, 2, 1)\).

c(seq(1,10),seq(9,1))

##  [1]  1  2  3  4  5  6  7  8  9 10  9  8  7  6  5  4  3  2  1
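Because seq() counts downward automatically when the start is larger than the end, the vector can be assembled from one ascending and one descending piece. The sketch below also checks the result by symmetry; the name `up_down` is an illustrative assumption.

```r
# Ascending part 1..10 followed by descending part 9..1.
up_down <- c(seq(1, 10), seq(9, 1))

# A vector that rises and then falls symmetrically equals its own reverse.
identical(up_down, rev(up_down))
length(up_down)   # 10 ascending values plus 9 descending values
```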

Use rep() to create the vector \((2,3,4,\dots,2,3,4)\) in which the sequence \((2,3,4)\) is repeated 5 times.

rep(2:4,5)

##  [1] 2 3 4 2 3 4 2 3 4 2 3 4 2 3 4

Use rep() to create the vector \((1,1,\ldots,1,2,2,\ldots,2,3,3,\ldots,3)\) where each number is repeated 10 times.

rep(1:3,each = 10)

##  [1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3
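The two rep() answers differ only in which argument is used. The following sketch, with illustrative variable names, puts `times` and `each` side by side and adds a per-element `times` vector as a third variant.

```r
# times repeats the whole vector; each repeats every element in place.
whole_repeated <- rep(2:4, times = 5)      # 2 3 4 2 3 4 ...
each_repeated <- rep(1:3, each = 10)       # 1 1 ... 2 2 ... 3 3 ...

# A vector passed to times gives a separate count for every element.
per_element <- rep(1:3, times = c(1, 2, 3))
per_element
```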

Question 2

Compute: \[\sum_{n=10}^{100} n^3\]

sum((10:100)^3)

## [1] 25500475
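The result can be cross-checked against the closed form \(\sum_{n=1}^{N} n^3 = \left(\frac{N(N+1)}{2}\right)^2\). This is a sketch only; the helper `cubes_upto()` is a hypothetical name introduced for the check.

```r
# Closed form for the sum of the first N cubes.
cubes_upto <- function(N) (N * (N + 1) / 2)^2

# Sum from 10 to 100 = (sum from 1 to 100) - (sum from 1 to 9).
closed_form <- cubes_upto(100) - cubes_upto(9)
direct <- sum((10:100)^3)

closed_form == direct
```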

Compute: \[\sum_{n=1}^{10}\left(\frac{2^{n}}{n^2} + \frac{n^{4}}{4^{n}}\right)\]

n <- 1:10
sum(((2^n)/(n^2))+((n^4)/(4^n)))

## [1] 35.80589

Compute: \[\sum_{n=0}^{10} \frac{1}{(n+1)!}\] Hint: Use factorial(n) to compute \(n!\)

n <- 0:10
sum(1/factorial(n+1))

## [1] 1.718282
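This sum is a partial sum of the series \(\sum_{k=1}^{\infty} \frac{1}{k!} = e - 1\), which explains the familiar digits in the result. A small sketch comparing the two; the names `partial` and `limit` are illustrative.

```r
# Re-index the sum with k = n + 1, so k runs from 1 to 11.
partial <- sum(1 / factorial(1:11))
limit <- exp(1) - 1

# The gap is the tail of the series, which is already very small.
abs(partial - limit)
```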

Compute: \[\prod_{n=3}^{33} \left(3n + \frac{3}{\sqrt[3]{n}}\right)\]

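n <- 3:33
# The exponent needs parentheses: n^1/3 parses as (n^1)/3, i.e. n/3, not the cube root.
# (The unparenthesized form is what evaluates to 7.427886e+51.)
prod((3*n)+(3/(n^(1/3))))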
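Operator precedence matters when writing a cube root in R: `^` binds more tightly than `/`, so a fractional exponent has to be parenthesized. A minimal sketch using the value 8, chosen only for illustration:

```r
# Without parentheses the division happens after the power.
no_parens <- 8^1/3        # parsed as (8^1)/3
with_parens <- 8^(1/3)    # the cube root of 8

no_parens
with_parens
```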

Question 3

Create an empty list mylist.

mylist <- list()
mylist

## list()
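In R, c() with no arguments returns NULL, whereas list() returns a list of length zero. The sketch below shows the difference and how a named component is added with `$`; the names `empty_vec` and `empty_list` are illustrative.

```r
empty_vec <- c()        # NULL, not a list
empty_list <- list()    # a list with no components

is.null(empty_vec)
is.list(empty_list)
length(empty_list)

# Assigning through $ adds a named component.
empty_list$firstVar <- 1:10
names(empty_list)
```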

Add a component named firstVar whose value is the numeric vector \((1,2,\ldots,10)\).

firstVar <- seq(1,10)
mylist$firstVar <- firstVar   # store the vector as a named component of mylist
mylist$firstVar

##  [1]  1  2  3  4  5  6  7  8  9 10

Add a component named secondVar which is a 4x5 matrix whose elements are \((1,2,\ldots,20)\) in row-wise order.

secondVar <- matrix(seq(1,20),nrow = 4,ncol = 5,byrow = TRUE)
mylist$secondVar <- secondVar   # store the matrix as a named component of mylist
mylist$secondVar

##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    2    3    4    5
## [2,]    6    7    8    9   10
## [3,]   11   12   13   14   15
## [4,]   16   17   18   19   20
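matrix() fills column by column unless `byrow = TRUE` is given, which is why the argument is needed for row-wise order. A short sketch contrasting the two fill orders; `row_wise` and `col_wise` are illustrative names.

```r
row_wise <- matrix(1:20, nrow = 4, ncol = 5, byrow = TRUE)
col_wise <- matrix(1:20, nrow = 4, ncol = 5)   # default: fill down the columns

row_wise[1, ]   # first row holds 1 to 5
col_wise[1, ]   # first row holds every fourth value

# Filling a 5x4 matrix by column and transposing gives the row-wise layout.
identical(row_wise, t(matrix(1:20, nrow = 5, ncol = 4)))
```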

Add a component named thirdVar which is the output of multiplying each element of secondVar by the average of firstVar.

thirdVar <- secondVar * mean(firstVar)
mylist$thirdVar <- thirdVar   # store the product as a named component of mylist
mylist$thirdVar

##      [,1] [,2] [,3]  [,4]  [,5]
## [1,]  5.5 11.0 16.5  22.0  27.5
## [2,] 33.0 38.5 44.0  49.5  55.0
## [3,] 60.5 66.0 71.5  77.0  82.5
## [4,] 88.0 93.5 99.0 104.5 110.0

Display mylist on the screen, after rounding the elements of ‘thirdVar’ to the nearest integer.

mylist <- list(firstVar = firstVar, secondVar = secondVar, thirdVar = round(thirdVar,0))
print(mylist)

## $firstVar
##  [1]  1  2  3  4  5  6  7  8  9 10
## 
## $secondVar
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    2    3    4    5
## [2,]    6    7    8    9   10
## [3,]   11   12   13   14   15
## [4,]   16   17   18   19   20
## 
## $thirdVar
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    6   11   16   22   28
## [2,]   33   38   44   50   55
## [3,]   60   66   72   77   82
## [4,]   88   94   99  104  110
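In the rounded matrix, 16.5 becomes 16 while 27.5 becomes 28. This follows from round() using the IEC 60559 convention of rounding a trailing 5 to the even digit. The sketch below reproduces the effect on a few of the values; the vector `halves` and the half-up alternative are illustrative assumptions, not part of the assignment.

```r
halves <- c(5.5, 16.5, 27.5, 38.5)

# round() sends exact halves to the nearest even integer.
rounded <- round(halves)
rounded

# If "always round half up" were wanted instead, one option is floor(x + 0.5).
half_up <- floor(halves + 0.5)
half_up
```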

Question 4

The iris data set gives the measurements in centimeters of the variables sepal length, sepal width, petal length and petal width, respectively, for 50 flowers from each of 3 species of iris. The species are Iris setosa, versicolor, and virginica.

The iris data set is part of the datasets package, which ships with base R, so it normally does not need to be installed (the command install.packages("datasets") is kept below as a comment). Load the datasets package into your session using the following command.

library(datasets)
#install.packages("datasets")

Display the first 6 rows of the iris data set.

head(iris)

##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa

Compute the average of the first four variables (Sepal.Length, Sepal.Width, Petal.Length and Petal.Width) using the sapply() function.

Hint: You might need to consider removing the NA values, otherwise the average will not be computed.

sapply(iris[, -5], mean, na.rm = TRUE)

## Sepal.Length  Sepal.Width Petal.Length  Petal.Width 
##     5.843333     3.057333     3.758000     1.199333
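The hint about NA values refers to the `na.rm` argument of mean(). The sketch below uses a small made-up vector `x` to show its effect, and also shows that an argument mean() does not recognise (here a hypothetical `rm = NA`) is absorbed silently without removing anything.

```r
x <- c(5.1, NA, 4.7)   # illustrative values with one missing entry

with_na <- mean(x)                    # NA: the missing value propagates
without_na <- mean(x, na.rm = TRUE)   # mean of the two observed values
ignored_arg <- mean(x, rm = NA)       # still NA: rm is not an argument of mean()

with_na
without_na
ignored_arg
```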

Show how to use R to replace the missing values in this dataset with plausible ones.

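# Replace missing values in each numeric column with that column's mean (0 is not a plausible measurement)
iris[1:4] <- lapply(iris[1:4], function(x) replace(x, is.na(x), mean(x, na.rm = TRUE)))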
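The built-in iris data has no missing values, so the effect of an imputation step is easier to see on a copy with a few values blanked out. This is a sketch under that assumption; `iris_copy`, the chosen rows, and the use of the column mean as the "plausible" value are all illustrative choices.

```r
# Work on a copy and introduce two artificial missing values.
iris_copy <- iris
iris_copy$Sepal.Length[c(3, 10)] <- NA
anyNA(iris_copy)

# Impute with the mean of the observed values in the same column.
col_mean <- mean(iris_copy$Sepal.Length, na.rm = TRUE)
iris_copy$Sepal.Length[is.na(iris_copy$Sepal.Length)] <- col_mean

anyNA(iris_copy)
```

A median or a per-species mean would be equally reasonable replacements, depending on how the values are used afterwards.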

Compute the standard deviation for only the first and the third variables (Sepal.Length and Petal.Length).

sd(iris[,1])

## [1] 0.8280661

sd(iris[,3])

## [1] 1.765298
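Both standard deviations can also be obtained in one call by selecting the two columns by name and applying sd() to each. A minimal sketch; the name `sds` is illustrative.

```r
# Select the two columns by name, then apply sd() to each one.
sds <- sapply(iris[, c("Sepal.Length", "Petal.Length")], sd)
sds
```

Selecting by name rather than by position keeps the code correct even if the column order changes.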

Construct a boxplot for the Sepal.Width variable, then display all the outliers.

library(ggplot2)
ggplot(iris, aes(x = "", y=Sepal.Width)) +
  geom_boxplot(outlier.colour="red", 
             outlier.shape=16,
             outlier.size=2, notch=FALSE)
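The plot marks the outliers in red, but the question also asks for them to be displayed. One way to list their values, sketched here with base R's boxplot.stats() rather than ggplot2, is shown below; the names `stats` and `outliers` are illustrative.

```r
# boxplot.stats() applies the usual rule: points more than
# 1.5 times the box height beyond the hinges are outliers.
stats <- boxplot.stats(iris$Sepal.Width)
outliers <- stats$out
outliers

# The rows of the data set that hold those values.
iris[iris$Sepal.Width %in% outliers, c("Sepal.Width", "Species")]
```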

Compute the lower and the upper quartiles of the Sepal.Width variable.

dplyr::summarize(iris,  "lower Quartile" = quantile(Sepal.Width, .25),
                 "upper Quartile" = quantile(Sepal.Width, .75))

##   lower Quartile upper Quartile
## 1            2.8            3.3
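The quartiles of Sepal.Width can also be computed without dplyr by passing both probabilities to quantile() at once. A base-R sketch; the names `q` and `box_height` are illustrative.

```r
# Lower (25%) and upper (75%) quartiles of Sepal.Width in one call.
q <- quantile(iris$Sepal.Width, probs = c(0.25, 0.75))
q

# The interquartile range is the distance between the two quartiles.
box_height <- IQR(iris$Sepal.Width)
box_height
```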

Construct a pie chart to describe the species with ‘Sepal.Length’ greater than 5 centimeters.

# Pie chart from the data frame, with sample sizes appended to the labels
sepalLengthOver5 <- subset(iris, Sepal.Length > 5)
mytable <- table(sepalLengthOver5$Species)
lbls <- paste(names(mytable), "\n", mytable, sep="")
pie(mytable, labels = lbls,
   main="Pie Chart of Species\n and Sepal length > 5")

tail(sepalLengthOver5)

##     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
## 145          6.7         3.3          5.7         2.5 virginica
## 146          6.7         3.0          5.2         2.3 virginica
## 147          6.3         2.5          5.0         1.9 virginica
## 148          6.5         3.0          5.2         2.0 virginica
## 149          6.2         3.4          5.4         2.3 virginica
## 150          5.9         3.0          5.1         1.8 virginica
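The slice sizes in the pie chart come from the species counts in the filtered data. The sketch below tabulates those counts and converts them to percentages; the names `over5`, `counts`, and `shares` are illustrative.

```r
# Keep only flowers with Sepal.Length above 5 cm, then count per species.
over5 <- subset(iris, Sepal.Length > 5)
counts <- table(over5$Species)
counts

# Share of each species among the selected flowers, in percent.
shares <- round(100 * prop.table(counts), 1)
shares
```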