Practice Worksheet - 3

Question 4

data <- c(-0.6, 3.1, 25.3, -16.8, -7.1, -6.2, 16.1, 25.2, 22.6, 26.0)

# Q-Q plot
library(car, quietly=TRUE)
qqPlot(data, pch=19)
[1]  4 10

From the Q-Q plot, the data appear to be approximately normally distributed.

# A test of normality
shapiro.test(data)
    Shapiro-Wilk normality test

data:  data
W = 0.87947, p-value = 0.1287

As p-value = 0.1287 > 0.10, we cannot reject normality even at the 10% level, so the normality test corroborates the Q-Q plot.
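
The straightness of a Q-Q plot can also be summarized by the correlation between the ordered observations and the corresponding standard normal quantiles. The following is a minimal sketch, not part of the original worksheet: the names `x`, `q` and `r_q` are illustrative, and the probability levels (j - 1/2)/n are one common convention among several. A value close to 1 supports normality; the formal critical values come from published tables and are not reproduced here.

```r
# Question 4 data, copied from the worksheet
x <- c(-0.6, 3.1, 25.3, -16.8, -7.1, -6.2, 16.1, 25.2, 22.6, 26.0)
n <- length(x)

# Standard normal quantiles at probability levels (j - 1/2)/n
# (assumed convention; other plotting positions exist)
q <- qnorm((seq_len(n) - 0.5) / n)

# Correlation between the ordered observations and the normal quantiles
r_q <- cor(sort(x), q)
r_q
```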

Question 5

setwd("C:/Users/asank/Desktop/My Personal/01-JobWork/UOM Docs/MultivariateMethods/Intake 20/RLabs")
data <- read.table("datasets/T3-2.dat")

# Scatterplot of X1 and X2
plot(data$V1, data$V2, pch=19, xlab="X1", ylab="X2")

There do not appear to be any outliers, with the possible exception of observation #21.

# Test of marginal normality of X1
shapiro.test(data$V1)
    Shapiro-Wilk normality test

data:  data$V1
W = 0.91701, p-value = 0.04382

As p-value = 0.04382 < 0.05, we reject normality of X1 at the 5% level. Let's therefore use the Box-Cox power transformation to find the _lambda_ value that makes the X1 data approximately normal.

library(MASS, quietly=TRUE)
b <- boxcox(lm(data$V1 ~ 1))
lambda <- b$x[which.max(b$y)]
lambda
[1] 0.06060606

The estimated _lambda_ = 0.06 is close to 0, so we take _lambda_ = 0, i.e., the logarithm, as the power transformation that makes the X1 observations more nearly normal.
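
To see why _lambda_ = 0 corresponds to the logarithm, it may help to write out the Box-Cox family explicitly. This is a minimal sketch: the helper `box_cox` and the toy vector `x_demo` are hypothetical names, not functions or data from the worksheet or from MASS.

```r
# Box-Cox power transformation for positive data:
#   (x^lambda - 1) / lambda   if lambda != 0
#   log(x)                    if lambda == 0
box_cox <- function(x, lambda) {
  stopifnot(all(x > 0))
  if (lambda == 0) log(x) else (x^lambda - 1) / lambda
}

x_demo <- c(1, 2, 4)   # toy values, not the worksheet data

box_cox(x_demo, 0)      # identical to log(x_demo)
box_cox(x_demo, 0.06)   # close to log(x_demo), since lambda is near 0
```

As _lambda_ approaches 0, (x^lambda - 1)/lambda converges to log(x), which is why an estimate such as 0.06 is usually rounded to the log transformation.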

# Q-Q plot of the transformed observations
library(car, quietly=TRUE)
V1log <- log(data$V1)
qqPlot(V1log, pch=19)
[1] 19  5

From the Q-Q plot, the log-transformed X1 data appear to be approximately normally distributed.

# Test of marginal normality of X2
shapiro.test(data$V2)
    Shapiro-Wilk normality test

data:  data$V2
W = 0.88247, p-value = 0.007773

As p-value = 0.007773 < 0.05, we reject normality of X2 at the 5% level. Let's therefore use the Box-Cox power transformation to find the _lambda_ value that makes the X2 data approximately normal.

library(MASS, quietly=TRUE)
b <- boxcox(lm(data$V2 ~ 1))
lambda <- b$x[which.max(b$y)]
lambda
[1] -0.7070707

The estimated power transformation is _lambda_ = -0.7. Let's use the more convenient value _lambda_ = -0.5, i.e., the reciprocal of the square root, and see whether it makes the X2 observations more nearly normal.
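
One way to judge whether a rounded value such as -0.5 is a reasonable substitute for the estimate is to check that it falls inside an approximate 95% confidence interval for _lambda_, taken from the profile log-likelihood that `boxcox` returns. The sketch below is an illustration under stated assumptions, not the worksheet's own method: it uses simulated data (`sim`) because the T3-2.dat file is not reproduced here, and the names `fit`, `bc`, `cutoff` and `ci` are illustrative.

```r
library(MASS)

# Simulated positive, right-skewed data (stand-in for the worksheet data)
set.seed(42)
sim <- rlnorm(25, meanlog = 1, sdlog = 0.5)

# Profile log-likelihood over a fine grid of lambda values, without plotting
fit <- lm(sim ~ 1, y = TRUE, qr = TRUE)
bc <- boxcox(fit, lambda = seq(-2, 2, by = 0.01), plotit = FALSE)

# Maximum-likelihood estimate of lambda on the grid
lambda_hat <- bc$x[which.max(bc$y)]

# Approximate 95% interval: lambdas whose log-likelihood lies within
# qchisq(0.95, 1) / 2 of the maximum
cutoff <- max(bc$y) - qchisq(0.95, df = 1) / 2
ci <- range(bc$x[bc$y >= cutoff])

lambda_hat
ci
```

If a convenient value (0, 0.5, -0.5, -1, ...) lies inside `ci`, using it in place of `lambda_hat` is usually defensible.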

# Q-Q plot of the transformed observations
library(car, quietly=TRUE)
V2resqrt <- 1/sqrt(data$V2)
qqPlot(V2resqrt, pch=19)
[1] 21 24

From the Q-Q plot, the transformed X2 data appear to be approximately normally distributed.

# Test of marginal normality of transformed X2
shapiro.test(V2resqrt)
    Shapiro-Wilk normality test

data:  V2resqrt
W = 0.97313, p-value = 0.7247

As p-value = 0.7247 > 0.05, we cannot reject normality of the transformed X2 data.

# Testing multivariate normality of the transformed observations
library(QuantPsyc, quietly=TRUE, warn.conflicts=FALSE)
# Perform Multivariate normality test
mult.norm(cbind(V1log, V2resqrt))$mult.test
         Beta-hat     kappa     p-val
Skewness 1.478165  6.159022 0.1875830
Kurtosis 7.833410 -0.104119 0.9170749

Both the skewness test (p-value = 0.1876) and the kurtosis test (p-value = 0.9171) have p-values well above 0.05, so we cannot reject bivariate normality of the transformed observations.
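
The skewness and kurtosis rows above have the form of Mardia's multivariate normality tests. To make the quantities concrete, here is a minimal base-R sketch of those statistics. It is an illustration, not the QuantPsyc implementation: the function name `mardia_sketch` is hypothetical, the data are simulated, and the covariance matrix uses divisor n (an assumption; a version based on `var()` with divisor n - 1 gives slightly different numbers).

```r
# Mardia's multivariate skewness and kurtosis tests (illustrative sketch)
mardia_sketch <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)
  p <- ncol(x)

  centered <- sweep(x, 2, colMeans(x))
  S <- crossprod(centered) / n                 # covariance, divisor n (assumption)
  D <- centered %*% solve(S) %*% t(centered)   # n x n matrix of d_ij

  b1 <- sum(D^3) / n^2       # multivariate skewness
  b2 <- sum(diag(D)^2) / n   # multivariate kurtosis

  # Skewness: n * b1 / 6 is approximately chi-square with p(p+1)(p+2)/6 df
  skew_stat <- n * b1 / 6
  skew_df <- p * (p + 1) * (p + 2) / 6

  # Kurtosis: standardized b2 is approximately standard normal
  kurt_stat <- (b2 - p * (p + 2)) / sqrt(8 * p * (p + 2) / n)

  data.frame(
    estimate = c(b1, b2),
    test_stat = c(skew_stat, kurt_stat),
    p_value = c(pchisq(skew_stat, skew_df, lower.tail = FALSE),
                2 * pnorm(abs(kurt_stat), lower.tail = FALSE)),
    row.names = c("Skewness", "Kurtosis")
  )
}

# Simulated bivariate normal data (not the worksheet data)
set.seed(1)
sim_xy <- cbind(rnorm(25), rnorm(25))
result <- mardia_sketch(sim_xy)
result
```

As a rough consistency check, plugging the reported Beta-hat values into these formulas with p = 2 and n = 25 (the sample size is inferred here, not stated in the worksheet) gives 25 × 1.478165 / 6 ≈ 6.159 and (7.833410 - 8) / sqrt(64 / 25) ≈ -0.104, in line with the kappa column.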