HW 1

Question #1

Describe the data set

The Iris data set records measurements taken on iris flowers. Sepal length and width describe the sepals, the green, leaf-like parts that normally enclose the flower. Petal length and width describe the petals of each flower. Species identifies the kind of iris that each set of measurements came from. The differing sizes of sepals and petals can be used to distinguish the species.
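The description above can be checked directly against the data. The following is a minimal sketch that assumes `iris_OD` (the name that appears in the output later in this report) is simply a copy of R's built-in `iris` data frame; that assumption is not stated in the report itself.

```r
# Assumption: iris_OD is a copy of R's built-in iris data frame.
iris_OD <- datasets::iris

str(iris_OD)              # four numeric measurements plus the Species factor
table(iris_OD$Species)    # number of flowers recorded per species

# Mean of each measurement by species, to see how sizes separate the species
species_means <- aggregate(. ~ Species, data = iris_OD, FUN = mean)
print(species_means)
```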

Question #2

Shapiro-Wilk Test

## 
##  Shapiro-Wilk normality test
## 
## data:  Z
## W = 0.97935, p-value = 0.02342

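Null: {y1, y2, y3, y4} follow a multivariate normal distribution (the data are normal).

Alternate: {y1, y2, y3, y4} do not follow a multivariate normal distribution (the data are not normal).

Decision rule: reject the null if p < .05; otherwise fail to reject it.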

Conclusion: Because the p-value (.023) is less than .05, the null hypothesis is rejected; the data do not appear to be normally distributed.
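The decision can be written out as a short sketch. The `data:  Z` label in the output is consistent with `mvnormtest::mshapiro.test`, but which function actually produced it is an assumption here, as is the use of the built-in `iris` data. The package call is skipped if `mvnormtest` is not installed.

```r
# Decision rule: reject the null hypothesis when p < alpha.
alpha <- 0.05
p_reported <- 0.02342            # p-value printed in the output above
reject_null <- p_reported < alpha
print(reject_null)

# Assumption: the test was a multivariate Shapiro-Wilk test from the
# mvnormtest package, which expects variables in rows (hence the transpose).
if (requireNamespace("mvnormtest", quietly = TRUE)) {
  U <- t(as.matrix(datasets::iris[, 1:4]))
  print(mvnormtest::mshapiro.test(U))
}
```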

Question #3

Jarque-Bera Test

## 
##  Jarque-Bera test for normality
## 
## data:  iris_OD
## JB = 1.4978, p-value < 2.2e-16

Null: The data are normally distributed.

Alternate: The data are not normally distributed.

Decision rule: reject the null if p < .05; otherwise fail to reject it.

Conclusion: Because the reported p-value (< 2.2e-16) is less than .05, the null hypothesis is rejected; the data do not appear to be normally distributed.
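For reference, the Jarque-Bera statistic combines sample skewness and kurtosis. The sketch below is the textbook version for a single numeric vector with a chi-squared reference distribution; the function name `jarque_bera` is hypothetical, and because the report does not show which package or settings produced the output above, this sketch is not expected to reproduce those printed values.

```r
# Textbook Jarque-Bera statistic for one numeric vector.
jarque_bera <- function(x) {
  n <- length(x)
  d <- x - mean(x)
  m2 <- mean(d^2)                      # divide-by-n central moments
  skew <- mean(d^3) / m2^1.5
  kurt <- mean(d^4) / m2^2
  jb <- n / 6 * (skew^2 + (kurt - 3)^2 / 4)
  # Asymptotic reference: chi-squared with 2 degrees of freedom
  p <- pchisq(jb, df = 2, lower.tail = FALSE)
  list(statistic = jb, skewness = skew, kurtosis = kurt, p.value = p)
}

# Assumption: the built-in iris data stands in for iris_OD.
res <- jarque_bera(datasets::iris$Petal.Length)
print(res)
```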

Question #4

Anderson-Darling Tests

## 
##  Anderson-Darling normality test
## 
## data:  iris_OD$Sepal.Length
## A = 0.8892, p-value = 0.02251

## 
##  Anderson-Darling normality test
## 
## data:  iris_OD$Sepal.Width
## A = 0.90796, p-value = 0.02023

## 
##  Anderson-Darling normality test
## 
## data:  iris_OD$Petal.Length
## A = 7.6785, p-value < 2.2e-16

## 
##  Anderson-Darling normality test
## 
## data:  iris_OD$Petal.Width
## A = 5.1057, p-value = 1.125e-12

Cramer-von Mises Tests

## 
##  Cramer-von Mises normality test
## 
## data:  iris_OD$Sepal.Length
## W = 0.1274, p-value = 0.04706

## 
##  Cramer-von Mises normality test
## 
## data:  iris_OD$Sepal.Width
## W = 0.18065, p-value = 0.009336

## Warning in cvm.test(iris_OD$Petal.Length): p-value is smaller than 7.37e-10,
## cannot be computed more accurately

## 
##  Cramer-von Mises normality test
## 
## data:  iris_OD$Petal.Length
## W = 1.2223, p-value = 7.37e-10

## 
##  Cramer-von Mises normality test
## 
## data:  iris_OD$Petal.Width
## W = 0.72156, p-value = 4.338e-08

Lilliefors (Kolmogorov-Smirnov) Tests

## 
##  Lilliefors (Kolmogorov-Smirnov) normality test
## 
## data:  iris_OD$Sepal.Length
## D = 0.088654, p-value = 0.005788

## 
##  Lilliefors (Kolmogorov-Smirnov) normality test
## 
## data:  iris_OD$Sepal.Width
## D = 0.10566, p-value = 0.0003142

## 
##  Lilliefors (Kolmogorov-Smirnov) normality test
## 
## data:  iris_OD$Petal.Length
## D = 0.19815, p-value = 7.901e-16

## 
##  Lilliefors (Kolmogorov-Smirnov) normality test
## 
## data:  iris_OD$Petal.Width
## D = 0.17283, p-value = 7.33e-12

Pearson Chi-Squared Tests

## 
##  Pearson chi-square normality test
## 
## data:  iris_OD$Sepal.Length
## P = 17.4, p-value = 0.1352

## 
##  Pearson chi-square normality test
## 
## data:  iris_OD$Sepal.Width
## P = 46.2, p-value = 6.409e-06

## 
##  Pearson chi-square normality test
## 
## data:  iris_OD$Petal.Length
## P = 192.8, p-value < 2.2e-16

## 
##  Pearson chi-square normality test
## 
## data:  iris_OD$Petal.Width
## P = 155.6, p-value < 2.2e-16

Shapiro-Francia Tests

## 
##  Shapiro-Francia normality test
## 
## data:  iris_OD$Sepal.Length
## W = 0.97961, p-value = 0.02621

## 
##  Shapiro-Francia normality test
## 
## data:  iris_OD$Sepal.Width
## W = 0.98483, p-value = 0.0902

## 
##  Shapiro-Francia normality test
## 
## data:  iris_OD$Petal.Length
## W = 0.88186, p-value = 1.844e-08

## 
##  Shapiro-Francia normality test
## 
## data:  iris_OD$Petal.Width
## W = 0.90831, p-value = 3.006e-07
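The five families of tests above can be collected into one table of p-values, which makes it easier to compare variables and tests side by side. This is a sketch that assumes the output came from the `nortest` package (its function names match the test titles printed above) and that `iris_OD` is the built-in `iris` data; the block does nothing if `nortest` is not installed.

```r
iris_OD <- datasets::iris        # assumption: iris_OD is the built-in iris data
vars <- names(iris_OD)[1:4]

if (requireNamespace("nortest", quietly = TRUE)) {
  tests <- list(
    AndersonDarling = nortest::ad.test,
    CramerVonMises  = nortest::cvm.test,
    Lilliefors      = nortest::lillie.test,
    PearsonChiSq    = nortest::pearson.test,
    ShapiroFrancia  = nortest::sf.test
  )
  # Rows are variables, columns are tests, entries are p-values
  p_table <- sapply(tests, function(test) {
    sapply(vars, function(v) suppressWarnings(test(iris_OD[[v]]))$p.value)
  })
  print(signif(p_table, 3))
  print(p_table < 0.05)          # TRUE where normality is rejected at the 5% level
}
```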

Question #5

Histogram of Sepal Length

## $breaks
## [1] 4.1 4.6 5.1 5.6 6.1 6.6 7.1 7.6 8.1
## 
## $counts
## [1]  9 32 24 30 27 17  6  5
## 
## $density
## [1] 0.12000000 0.42666667 0.32000000 0.40000000 0.36000000 0.22666667 0.08000000
## [8] 0.06666667
## 
## $mids
## [1] 4.35 4.85 5.35 5.85 6.35 6.85 7.35 7.85
## 
## $xname
## [1] "iris_OD$Sepal.Length"
## 
## $equidist
## [1] TRUE
## 
## attr(,"class")
## [1] "histogram"

## [1] 5.843333

## [1] 0.8280661

## [1] 4.3

## [1] 7.9

The histogram, although somewhat irregular, shows a roughly normal distribution of sepal lengths among the flowers.

The average sepal length is 5.84 with a standard deviation of 0.83. The histogram appears slightly skewed to the right. About 87% of the observations (130 of 150) fall in the bins between 4.6 and 7.1.
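A share of about 87% corresponds to the histogram bins spanning 4.6 to 7.1 in the output above. The sketch below recomputes that share from the printed breaks and counts; `bin_share` is a hypothetical helper, not something used in the original analysis.

```r
# Breaks and counts copied from the hist() output printed above
breaks <- c(4.1, 4.6, 5.1, 5.6, 6.1, 6.6, 7.1, 7.6, 8.1)
counts <- c(9, 32, 24, 30, 27, 17, 6, 5)

# Share of observations in the bins that lie entirely within [lower, upper]
bin_share <- function(breaks, counts, lower, upper) {
  left  <- head(breaks, -1)      # left edge of each bin
  right <- tail(breaks, -1)      # right edge of each bin
  inside <- left >= lower & right <= upper
  sum(counts[inside]) / sum(counts)
}

share <- bin_share(breaks, counts, lower = 4.6, upper = 7.1)
print(round(share, 3))           # 130 of 150 observations
```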

Histogram of Sepal Width

## $breaks
##  [1] 2.0 2.2 2.4 2.6 2.8 3.0 3.2 3.4 3.6 3.8 4.0 4.2 4.4
## 
## $counts
##  [1]  4  7 13 23 36 24 18 10  9  3  2  1
## 
## $density
##  [1] 0.13333333 0.23333333 0.43333333 0.76666667 1.20000000 0.80000000
##  [7] 0.60000000 0.33333333 0.30000000 0.10000000 0.06666667 0.03333333
## 
## $mids
##  [1] 2.1 2.3 2.5 2.7 2.9 3.1 3.3 3.5 3.7 3.9 4.1 4.3
## 
## $xname
## [1] "iris_OD$Sepal.Width"
## 
## $equidist
## [1] TRUE
## 
## attr(,"class")
## [1] "histogram"

## [1] 3.057333

## [1] 0.4358663

## [1] 2

## [1] 4.4

The sepal width histogram shows a slight skew to the right: most of the data sit left of the center of the histogram, with a longer tail toward larger widths.

The average sepal width is 3.06 with a standard deviation of 0.44. About 83% of the observations (124 of 150) fall in the bins between 2.4 and 3.6.
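A share of about 83% matches the histogram bins from 2.4 to 3.6. As a rough cross-check, the same interval can be evaluated under a normal curve with the mean and standard deviation printed above. This is only a sketch: `normal_prob` is a hypothetical helper, and the normality tests earlier in the report suggest the normal model is an approximation at best.

```r
mean_sw <- 3.057333              # mean printed above
sd_sw   <- 0.4358663             # standard deviation printed above

# Probability of a value in [lower, upper] under Normal(mean, sd)
normal_prob <- function(lower, upper, mean, sd) {
  pnorm(upper, mean, sd) - pnorm(lower, mean, sd)
}

p_narrow <- normal_prob(2.5, 3.5, mean_sw, sd_sw)
p_bins   <- normal_prob(2.4, 3.6, mean_sw, sd_sw)
print(round(c(narrow = p_narrow, bins = p_bins), 3))

# Empirical share in the histogram bins from 2.4 to 3.6 (counts printed above)
counts <- c(4, 7, 13, 23, 36, 24, 18, 10, 9, 3, 2, 1)
empirical <- sum(counts[3:8]) / sum(counts)
print(round(empirical, 3))       # 124 of 150 observations
```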

Histogram of Petal Length

## $breaks
##  [1] 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0
## 
## $counts
##  [1] 37 13  0  1  4 11 21 21 17 16  5  4
## 
## $density
##  [1] 0.49333333 0.17333333 0.00000000 0.01333333 0.05333333 0.14666667
##  [7] 0.28000000 0.28000000 0.22666667 0.21333333 0.06666667 0.05333333
## 
## $mids
##  [1] 1.25 1.75 2.25 2.75 3.25 3.75 4.25 4.75 5.25 5.75 6.25 6.75
## 
## $xname
## [1] "iris_OD$Petal.Length"
## 
## $equidist
## [1] TRUE
## 
## attr(,"class")
## [1] "histogram"

## [1] 3.758

## [1] 1.765298

## [1] 1

## [1] 6.9

This histogram shows an unusual distribution compared with sepal length and width: the largest concentration of observations sits at the far left of the histogram. However, petal lengths from 3 through 7 appear roughly normally distributed.

The average petal length is 3.76 with a standard deviation of 1.77. The mean sits below the median, which points to a left skew rather than a right skew, although with two separate clusters a single skew direction describes the shape poorly. About 57% of the observations (86 of 150) fall in the bins between 3.5 and 6.0, and another 33% (50 of 150) fall between 1.0 and 2.0.
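One way to probe the two clusters in this histogram is to compare the pooled mean and median and then look at the range of petal length within each species. The sketch assumes `iris_OD` is the built-in `iris` data.

```r
iris_OD <- datasets::iris        # assumption: iris_OD is the built-in iris data

# Pooled mean vs. median: a gap between them hints at asymmetry
pooled <- c(mean   = mean(iris_OD$Petal.Length),
            median = median(iris_OD$Petal.Length))
print(pooled)

# Minimum and maximum petal length within each species
by_species <- tapply(iris_OD$Petal.Length, iris_OD$Species, range)
print(by_species)
```

If one species occupies a range that does not overlap the others, the gap in the pooled histogram reflects a mixture of groups rather than the shape of a single distribution.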

Histogram of Petal Width

## $breaks
##  [1] 0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6 1.8 2.0 2.2 2.4 2.6
## 
## $counts
##  [1] 34 14  2  0  7  8 21 16 14 11  9 11  3
## 
## $density
##  [1] 1.13333333 0.46666667 0.06666667 0.00000000 0.23333333 0.26666667
##  [7] 0.70000000 0.53333333 0.46666667 0.36666667 0.30000000 0.36666667
## [13] 0.10000000
## 
## $mids
##  [1] 0.1 0.3 0.5 0.7 0.9 1.1 1.3 1.5 1.7 1.9 2.1 2.3 2.5
## 
## $xname
## [1] "iris_OD$Petal.Width"
## 
## $equidist
## [1] TRUE
## 
## attr(,"class")
## [1] "histogram"

## [1] 1.199333

## [1] 0.7622377

## [1] 0.1

## [1] 2.5

At first glance this histogram appears strongly right skewed, with the largest number of observations in the lowest bin (petal widths of 0.0 to 0.2).

The mean for petal width is 1.20 with a standard deviation of 0.76. The mean and the median are close, which makes the direction of skewness difficult to determine. About 60% of the observations (90 of 150) sit between 1.0 and 2.4, and 32% (48 of 150) sit between 0.0 and 0.4.
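When the histogram and the mean-median comparison give mixed signals, a moment-based skewness coefficient is one way to quantify the direction. The helper `sample_skewness` below is hypothetical and uses the simple divide-by-n definition; the sketch assumes `iris_OD` is the built-in `iris` data.

```r
# Moment-based sample skewness:
# negative = longer left tail, positive = longer right tail
sample_skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

iris_OD <- datasets::iris        # assumption: iris_OD is the built-in iris data
skews <- sapply(iris_OD[, 1:4], sample_skewness)
print(round(skews, 3))
```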

Box Plot for Iris Data

The boxplot shows a few outliers within sepal width, but a relatively narrow spread. Petal length has the largest box of the group, but it is clearly skewed by its large number of observations at the lower end of the data set.
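The points a boxplot draws as outliers, and the height of each box, can be listed numerically. This sketch assumes `iris_OD` is the built-in `iris` data and relies on the default boxplot rule (points more than 1.5 times the interquartile range beyond the box).

```r
iris_OD <- datasets::iris        # assumption: iris_OD is the built-in iris data

# Values flagged as outliers by the default 1.5 * IQR rule
outliers <- lapply(iris_OD[, 1:4], function(x) boxplot.stats(x)$out)
print(outliers)

# Interquartile range of each variable (the height of each box)
iqrs <- sapply(iris_OD[, 1:4], IQR)
print(iqrs)
```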

Question #6

Variance-Covariance

##              Sepal.Length Sepal.Width Petal.Length Petal.Width
## Sepal.Length    0.6856935  -0.0424340    1.2743154   0.5162707
## Sepal.Width    -0.0424340   0.1899794   -0.3296564  -0.1216394
## Petal.Length    1.2743154  -0.3296564    3.1162779   1.2956094
## Petal.Width     0.5162707  -0.1216394    1.2956094   0.5810063

Determinant of the variance-covariance matrix (generalized variance):

## [1] 0.00191273

Correlation Matrix

##              Sepal.Length Sepal.Width Petal.Length Petal.Width
## Sepal.Length    1.0000000  -0.1175698    0.8717538   0.8179411
## Sepal.Width    -0.1175698   1.0000000   -0.4284401  -0.3661259
## Petal.Length    0.8717538  -0.4284401    1.0000000   0.9628654
## Petal.Width     0.8179411  -0.3661259    0.9628654   1.0000000

Determinant of the correlation matrix:

## [1] 0.008109611
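The two matrices and their determinants are linked: the determinant of the covariance matrix equals the determinant of the correlation matrix multiplied by the product of the variances. A short sketch of that check, assuming the built-in `iris` data stands in for `iris_OD`:

```r
X <- datasets::iris[, 1:4]       # assumption: same data as iris_OD

S <- cov(X)                      # variance-covariance matrix
R <- cor(X)                      # correlation matrix

det_S <- det(S)
det_R <- det(R)
print(c(det_S = det_S, det_R = det_R))

# det(S) = det(R) * product of the variances on the diagonal of S
print(all.equal(det_S, det_R * prod(diag(S))))
```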

Question #7

Box M Test

## $Chisq
## [1] 140.943
## 
## $df
## [1] 20
## 
## $p.value
## [1] 3.352034e-20
## 
## $Test
## [1] "BoxM"
## 
## attr(,"class")
## [1] "MVTests" "list"

Null: The covariance matrices of all three groups are equal.

Alternate: At least two of the covariance matrices are not equal.

Conclusion: The p-value (3.35e-20) is far below .05, so the null hypothesis is rejected; the covariance matrices are not all equal.
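For readers who want to see what goes into the statistic, here is a base-R sketch of Box's M test with the usual chi-squared approximation. The class label in the output suggests the `MVTests` package was used; the function `box_m` below is a hypothetical reimplementation for illustration, not that package's code, and it assumes the groups are the three species in the built-in `iris` data.

```r
# Box's M test with the chi-squared approximation (illustrative sketch)
box_m <- function(X, group) {
  X <- as.matrix(X)
  group <- factor(group)
  p <- ncol(X)                                   # number of variables
  k <- nlevels(group)                            # number of groups
  dfs <- as.vector(table(group)) - 1             # n_i - 1 for each group
  covs <- lapply(levels(group),
                 function(g) cov(X[group == g, , drop = FALSE]))
  # Pooled covariance: group covariances weighted by their degrees of freedom
  pooled <- Reduce(`+`, Map(function(S, d) S * d, covs, dfs)) / sum(dfs)
  logdets <- sapply(covs, function(S) log(det(S)))
  M <- sum(dfs) * log(det(pooled)) - sum(dfs * logdets)
  correction <- (sum(1 / dfs) - 1 / sum(dfs)) *
    (2 * p^2 + 3 * p - 1) / (6 * (p + 1) * (k - 1))
  chisq <- M * (1 - correction)
  df <- p * (p + 1) * (k - 1) / 2
  list(Chisq = chisq, df = df,
       p.value = pchisq(chisq, df, lower.tail = FALSE))
}

res <- box_m(datasets::iris[, 1:4], datasets::iris$Species)
print(res)
```

With four variables and three groups the degrees of freedom are 4 * 5 * 2 / 2 = 20, which agrees with the `$df` value in the output above.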