MATH 376 Homework 3

Part 1: Independent or Dependent Variables?

1)

Scatter and contour plots for X and Y. It seems that X and Y are independent since both the scatter plot and contour plot have no particular shape or pattern.

library(MASS)  # provides kde2d()

X = runif(1000)
Y = runif(1000)

# Joint kernel density estimate of (X, Y) on a 50 x 50 grid
xy.kde = kde2d(X, Y, n = 50)

contour(xy.kde)

# Scatter plot of Y against X
plot(X, Y)

Scatter and contour plots for U and V. The points fall inside a diamond-shaped region and are most concentrated near its center. The range of V therefore depends on the value of U (when |U| is close to 1, V must be close to 1), which implies that U and V are not independent.

U = X-Y
V = X+Y

# Joint kernel density estimate of (U, V) on a 50 x 50 grid
uv.kde = kde2d(U, V, n = 50)

contour(uv.kde)

# Scatter plot of V against U
plot(U, V)
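
The dependence between U and V in the uniform case can also be checked numerically. The sketch below is an illustration rather than part of the assigned code: the seed, the sample size n, and the thresholds 0.5 and 1.5 are arbitrary choices. It shows that U and V are nearly uncorrelated, yet the event {|U| > 0.5 and V > 1.5} never occurs, even though each part has positive probability on its own.

```r
set.seed(376)   # arbitrary seed, for reproducibility only
n <- 10000      # assumed sample size (larger than the 1000 used above)
X <- runif(n)
Y <- runif(n)
U <- X - Y
V <- X + Y

# Correlation is close to 0 because Cov(U, V) = Var(X) - Var(Y) = 0
cor_uv <- cor(U, V)

# Under independence the joint probability would equal the product of marginals
p_joint <- mean(abs(U) > 0.5 & V > 1.5)        # impossible region for X, Y in [0, 1]
p_prod  <- mean(abs(U) > 0.5) * mean(V > 1.5)  # product of the marginal proportions

print(c(cor = cor_uv, joint = p_joint, product = p_prod))
```

A joint proportion of exactly 0 against a clearly positive product is what rules out independence here, even though the correlation is near zero.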

2)

a) It seems that Z1 and Z2 are independent: the scatter plot shows no trend and the contours are roughly circular.

Z1 = rnorm(1000)
Z2 = rnorm(1000)

# Joint kernel density estimate of (Z1, Z2) on a 50 x 50 grid
zz.kde = kde2d(Z1, Z2, n = 50)

contour(zz.kde)

# Scatter plot of Z2 against Z1
plot(Z1, Z2)

b) U and V both concentrate around 0, but that alone says nothing about dependence. The scatter plot of V against U shows no trend and the contours are roughly circular, so U and V appear independent. This agrees with theory: U and V are jointly normal and uncorrelated, and uncorrelated jointly normal variables are independent.

U = Z1-Z2
V = Z1+Z2

# Joint kernel density estimate of (U, V) on a 50 x 50 grid
uv.kde = kde2d(U, V, n = 50)

contour(uv.kde)

# Scatter plot of V against U
plot(U, V)
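
For reference, the covariance of U and V in the normal case can be worked out directly. This assumes only that Z1 and Z2 are independent standard normals, as generated above:

```latex
\begin{aligned}
\operatorname{Cov}(U, V) &= \operatorname{Cov}(Z_1 - Z_2,\; Z_1 + Z_2) \\
&= \operatorname{Var}(Z_1) + \operatorname{Cov}(Z_1, Z_2) - \operatorname{Cov}(Z_2, Z_1) - \operatorname{Var}(Z_2) \\
&= 1 - 1 = 0 .
\end{aligned}
```

Because (U, V) is a linear transformation of a jointly normal vector, it is itself bivariate normal, and for a bivariate normal pair zero covariance implies independence. The same calculation also gives zero covariance for the uniform case in 1), but there (U, V) is not jointly normal, so zero covariance does not imply independence.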

Part 2: Exploring Distributions

1)

Chi-Squared

x <- rchisq(1000, 1)
plot(density(x))
hist(x, freq=FALSE, add=TRUE)

mean(x)

## [1] 0.9766505

var(x)

## [1] 1.877138

x <- rchisq(1000, 10)
plot(density(x))
hist(x, freq=FALSE, add=TRUE)

mean(x)

## [1] 9.945407

var(x)

## [1] 19.2963
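
The sample means and variances above can be compared with the theoretical values: a chi-squared distribution with k degrees of freedom has mean k and variance 2k. The following sketch is an illustration only. The helper name chisq_moments, the seed, and the sample size are assumptions, not part of the assignment.

```r
set.seed(376)   # arbitrary seed, for reproducibility only

# Hypothetical helper: sample vs. theoretical moments for chi-squared(df)
chisq_moments <- function(df, n = 100000) {
  x <- rchisq(n, df)
  c(df = df,
    sample_mean = mean(x), theory_mean = df,
    sample_var  = var(x),  theory_var  = 2 * df)
}

# One row per value of df, one column per quantity
moments <- t(sapply(c(1, 10), chisq_moments))
print(round(moments, 3))
```

The values reported above (mean 0.98 and variance 1.88 for df = 1, mean 9.95 and variance 19.30 for df = 10) are in line with these theoretical targets of 1, 2, 10, and 20.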

2)

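The graphs for df = 1 look much skinnier than those for df = 30. With 1 degree of freedom the t distribution is the Cauchy distribution, whose heavy tails produce a few extreme values that stretch the x-axis and squeeze the bulk of the density into a narrow spike. With 30 degrees of freedom the distribution is roughly normal and centered at 0. In both cases t.ratio = z/sqrt(v/df) follows the same t distribution as the rt() sample, so any difference in apparent skew between the two plots reflects sampling variability rather than a real difference.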

t1 <- rt(1000, df=1)
plot(density(t1))
hist(t1, freq=FALSE, add=TRUE)

z <- rnorm(1000)
v <- rchisq(1000, 1)

t.ratio <- z/sqrt(v/1)
plot(density(t.ratio))
hist(t.ratio, freq=FALSE, add=TRUE)

t1 <- rt(1000, df=30)
plot(density(t1))
hist(t1, freq=FALSE, add=TRUE)

z <- rnorm(1000)
v <- rchisq(1000, 30)

t.ratio <- z/sqrt(v/30)
plot(density(t.ratio))
hist(t.ratio, freq=FALSE, add=TRUE)
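
The construction t.ratio = z/sqrt(v/df) can be checked against the built-in t distribution by comparing sample quantiles with qt(). This is a sketch under assumed settings: the seed, the sample size n, and the probabilities in probs are arbitrary. It uses df = 30 because the df = 1 case has tails too heavy for stable extreme quantiles.

```r
set.seed(376)   # arbitrary seed, for reproducibility only
n  <- 100000    # assumed sample size
df <- 30

z <- rnorm(n)
v <- rchisq(n, df)
t.ratio <- z / sqrt(v / df)   # standard normal over sqrt(chi-squared / df)

probs <- c(0.05, 0.25, 0.5, 0.75, 0.95)
comparison <- rbind(sample = quantile(t.ratio, probs, names = FALSE),
                    theory = qt(probs, df))
colnames(comparison) <- probs
print(round(comparison, 3))
```

If the two rows agree to within sampling error, that supports the claim that the ratio and rt() draw from the same distribution.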

c) The 95th percentile is the value below which 95% of the distribution lies. More generally, the p-quantile is the cutoff that has probability p below it and probability 1 - p above it.

qnorm(0.95)

## [1] 1.644854

qt(.95, 1)

## [1] 6.313752

qt(.95, 2)

## [1] 2.919986

qt(.95, 3)

## [1] 2.353363

qt(.95, 10)

## [1] 1.812461

qt(.95, 20)

## [1] 1.724718

qt(.95, 30)

## [1] 1.697261
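
The quantiles above shrink toward the normal value as the degrees of freedom grow. A compact way to see this is to tabulate the gap between qt(0.95, df) and qnorm(0.95). The sketch below extends the df values used above with 100 and 1000, which are added here purely for illustration.

```r
# Degrees of freedom from the text, plus two larger illustrative values
dfs <- c(1, 2, 3, 10, 20, 30, 100, 1000)

q_t <- qt(0.95, dfs)   # 95th percentile of t for each df
q_z <- qnorm(0.95)     # 95th percentile of the standard normal

print(data.frame(df = dfs, qt95 = round(q_t, 4), gap = round(q_t - q_z, 4)))
```

The gap stays positive because the t distribution has heavier tails than the normal, and it decreases steadily with df.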

3)

u <- rf(1000, 3, 7)
plot(density(u))
hist(u, freq=FALSE, add=TRUE)

v <- rf(1000, 3, 27)
plot(density(v))
hist(v, freq=FALSE, add=TRUE)

qf(0.95, 3, 7)

## [1] 4.346831

qf(0.95, 7, 3)

## [1] 8.886743

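Both distributions are right-skewed. The one with fewer denominator degrees of freedom, F(3, 7), has the heavier right tail and is more spread out than F(3, 27).
The two quantiles show that the order of df1 and df2 matters: qf(0.95, 3, 7) is about 4.35, while qf(0.95, 7, 3) is about 8.89. A small denominator df makes the denominator of the F ratio more variable, which fattens the right tail and raises the 0.95 quantile.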
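
The effect of swapping df1 and df2 can be tied to the reciprocal property of the F distribution: if W follows F(d1, d2), then 1/W follows F(d2, d1), so an upper quantile of one equals the reciprocal of the matching lower quantile of the other. The sketch below checks this for the two quantiles computed above; the variable names are illustrative.

```r
q_37 <- qf(0.95, 3, 7)
q_73 <- qf(0.95, 7, 3)

# Reciprocal property: qf(p, d1, d2) = 1 / qf(1 - p, d2, d1)
recip_37 <- 1 / qf(0.05, 7, 3)
recip_73 <- 1 / qf(0.05, 3, 7)

print(c(q_37 = q_37, recip_37 = recip_37, q_73 = q_73, recip_73 = recip_73))
```

Each quantile should match its reciprocal counterpart up to numerical precision.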

Olivia Chu

2/11/2021
