## Data Analysis #2

## 'data.frame':    1036 obs. of  8 variables:
##  $ SEX   : Factor w/ 3 levels "F","I","M": 2 2 2 2 2 2 2 2 2 2 ...
##  $ LENGTH: num  5.57 3.67 10.08 4.09 6.93 ...
##  $ DIAM  : num  4.09 2.62 7.35 3.15 4.83 ...
##  $ HEIGHT: num  1.26 0.84 2.205 0.945 1.785 ...
##  $ WHOLE : num  11.5 3.5 79.38 4.69 21.19 ...
##  $ SHUCK : num  4.31 1.19 44 2.25 9.88 ...
##  $ RINGS : int  6 4 6 3 6 6 5 6 5 6 ...
##  $ CLASS : Factor w/ 5 levels "A1","A2","A3",..: 1 1 1 1 1 1 1 1 1 1 ...

Test items start here. There are 10 sections, for a total of 75 points.

#### Section 1: (5 points) ####

(1)(a) Form a histogram and QQ plot using RATIO. Calculate skewness and kurtosis using ‘rockchalk’. Be aware that ‘rockchalk’ subtracts 3.0 from the kurtosis value, which differs from the ‘moments’ package.

## Skewness: 0.6573096
## Adjusted Kurtosis: 0.3296656
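
As a rough illustration of what step (1)(a) computes, the sketch below builds a histogram and QQ plot and evaluates moment-based skewness and excess kurtosis in base R. The data are simulated (a hypothetical right-skewed stand-in for RATIO, not the assignment's data), and the helper names `skew` and `excess_kurt` are made up for this example. The ‘rockchalk’ functions may apply a small-sample correction, so their values need not match these plain moment formulas exactly.

```r
# Simulated, right-skewed stand-in for RATIO (hypothetical data, not the real file).
set.seed(123)
ratio <- rlnorm(1036, meanlog = -2, sdlog = 0.3)

# Moment-based skewness: third central moment divided by variance^(3/2).
skew <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Excess ("adjusted") kurtosis: fourth central moment over variance^2, minus 3,
# so that a normal distribution scores 0.
excess_kurt <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2 - 3
}

# Histogram and normal QQ plot side by side.
par(mfrow = c(1, 2))
hist(ratio, main = "Histogram of RATIO (simulated)", xlab = "RATIO")
qqnorm(ratio, main = "QQ plot of RATIO (simulated)")
qqline(ratio)
par(mfrow = c(1, 1))

cat("Skewness:", skew(ratio), "\n")
cat("Adjusted Kurtosis:", excess_kurt(ratio), "\n")
```

Values near zero on both measures are what a normal sample would be expected to produce; a positive skewness corresponds to a longer right tail.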

(1)(b) Transform RATIO using log10() to create L_RATIO (Kabacoff Section 8.5.2, p. 199-200). Form a histogram and QQ plot using L_RATIO. Calculate the skewness and kurtosis. Create a boxplot of L_RATIO differentiated by CLASS.

## Skewness of L_RATIO: 0.3528897
## Adjusted Kurtosis of L_RATIO: -0.1956505
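
The transformation in step (1)(b) can be sketched as follows. This is a minimal example on simulated data: the data frame `df`, the group labels `G1` to `G5`, and the helper `skew` are assumptions made for illustration and do not come from the assignment's data set.

```r
set.seed(123)

# Hypothetical stand-in data: five groups and a right-skewed positive ratio.
df <- data.frame(
  CLASS = factor(rep(paste0("G", 1:5), each = 200)),
  RATIO = rlnorm(1000, meanlog = -2, sdlog = 0.3)
)

# Log-transform with base 10, as the step describes.
df$L_RATIO <- log10(df$RATIO)

# Moment-based skewness (no small-sample correction).
skew <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Histogram, QQ plot, and boxplot by class in one row.
par(mfrow = c(1, 3))
hist(df$L_RATIO, main = "Histogram of L_RATIO", xlab = "L_RATIO")
qqnorm(df$L_RATIO, main = "QQ plot of L_RATIO")
qqline(df$L_RATIO)
boxplot(L_RATIO ~ CLASS, data = df, main = "L_RATIO by CLASS",
        xlab = "CLASS", ylab = "L_RATIO")
par(mfrow = c(1, 1))

cat("Skewness of RATIO:  ", skew(df$RATIO), "\n")
cat("Skewness of L_RATIO:", skew(df$L_RATIO), "\n")
```

On right-skewed positive data such as this, the log transform compresses the long right tail, which is why the skewness of the transformed variable tends to move toward zero.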

(1)(c) Test the homogeneity of variance across classes using bartlett.test() (Kabacoff Section 9.2.2, p. 222).

## 
##  Bartlett test of homogeneity of variances
## 
## data:  L_RATIO by CLASS
## Bartlett's K-squared = 33.434, df = 4, p-value = 9.734e-07
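
To make the reading of this output concrete, here is a small sketch of `bartlett.test()` on simulated data where the group standard deviations are deliberately unequal. The variables `grp`, `sds`, and `l_ratio` and the 0.05 threshold are assumptions for the example, not values taken from the assignment.

```r
set.seed(42)

# Hypothetical groups with deliberately different spreads.
grp <- factor(rep(paste0("G", 1:5), each = 200))
sds <- rep(c(0.05, 0.08, 0.10, 0.12, 0.15), each = 200)
l_ratio <- rnorm(1000, mean = -0.9, sd = sds)

# Null hypothesis: all groups share the same variance.
bt <- bartlett.test(l_ratio ~ grp)
print(bt)

# Compare the p-value against a conventional significance level.
alpha <- 0.05
if (bt$p.value < alpha) {
  cat("Reject the null hypothesis: variances differ across groups.\n")
} else {
  cat("Fail to reject the null hypothesis: no evidence of unequal variances.\n")
}
```

The degrees of freedom equal the number of groups minus one, so five classes give df = 4, matching the output above. A small p-value counts as evidence against homogeneous variances, not in favor of them.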

Essay Question: Based on steps 1.a, 1.b and 1.c, which variable, RATIO or L_RATIO, exhibits better conformance to a normal distribution with homogeneous variances across age classes? Why?

Answer: Based on steps 1.a, 1.b, and 1.c, L_RATIO (the log10-transformed RATIO) conforms better to a normal distribution. Its skewness (0.353) and adjusted kurtosis (-0.196) are both closer to zero than the values for RATIO (0.657 and 0.330), and the histogram and QQ plot for L_RATIO are more symmetric. Homogeneity of variance is less clear-cut: Bartlett’s test on L_RATIO by CLASS gives K-squared = 33.434 with df = 4 and a p-value of 9.734e-07, which rejects the null hypothesis of equal variances at the 0.05 level, so the variances across classes cannot be treated as homogeneous on the basis of this test. Even so, L_RATIO remains the better choice of the two for analyses that assume normality, with the caveat that the equal-variance assumption is not supported by the test result shown here.