APG Structured Assignment, Session 1

I. Structured Exercise Problems

Exercise 1.1

Consider the seven pairs of measurements \((x_{1}, x_{2})\) plotted in Figure 1.1:

##    [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## x1    3  4.0    2    6    8    2  5.0
## x2    5  5.5    4    7   10    5  7.5

Calculate the sample means \(\bar{x}_{1}\) and \(\bar{x}_{2}\), the sample variances \(s_{11}\) and \(s_{22}\), and the sample covariance \(s_{12}\).

Exercise 1.4

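The 10 largest U.S. industrial corporations yield the following data (the values of \(x_{1}\), sales, and \(x_{2}\), profits, are listed with the answer to Exercise 1.4 in Section II):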

(a.) Plot the scatter diagram and marginal dot diagrams for variables \(x_{1}\) and \(x_{2}\). Comment on the appearance of the diagrams.

(b.) Compute \(\bar{x}_{1}\), \(\bar{x}_{2}\), \(s_{11}\), \(s_{22}\), \(s_{12}\) and \(r_{12}\). Interpret \(r_{12}\).

Source of the exercises: Johnson, R. A. and Wichern, D. W. 2002. Applied Multivariate Statistical Analysis, 5th edition. New Jersey: Prentice Hall, pp. 38-39.

II. Answers

Exercise 1.1

Calculate the sample means \(\bar{x}_{1}\) and \(\bar{x}_{2}\), the sample variances \(s_{11}\) and \(s_{22}\), and the sample covariance \(s_{12}\).
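For reference, the quantities requested can be written out as follows. This is a sketch that assumes the divisor-\(n\) convention used by the `varian.kovarian` function in this answer, with \(x_{jk}\) denoting observation \(j\) of variable \(k\) and \(n = 7\); with divisor \(n - 1\) the variances and covariance would be larger by the factor \(n/(n-1)\).

```latex
\[
\bar{x}_{k} = \frac{1}{n}\sum_{j=1}^{n} x_{jk},
\qquad
s_{ik} = \frac{1}{n}\sum_{j=1}^{n} \left(x_{ji} - \bar{x}_{i}\right)\left(x_{jk} - \bar{x}_{k}\right),
\qquad i, k = 1, 2 .
\]
```

Setting \(i = k\) gives the sample variances \(s_{11}\) and \(s_{22}\); \(i = 1, k = 2\) gives the sample covariance \(s_{12}\).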

x1 <- c(3, 4, 2, 6, 8, 2, 5)        # vector x1
x2 <- c(5, 5.5, 4, 7, 10, 5, 7.5)   # vector x2
data1.1 <- data.frame(x1, x2)
xbar1.1 <- colMeans(data1.1)        # mean of each column of data1.1
print("Vector Means : ")
print(xbar1.1)
varian.kovarian <- function(x) {    # variance-covariance matrix with divisor n (biased)
    A <- as.matrix(x)
    n <- nrow(A)
    satu <- rep(1, n)               # vector of ones
    matriks1 <- satu %*% t(satu)    # n x n matrix with every element equal to 1
    a <- A - matriks1 %*% A / n     # subtract the column means from each column
    ata <- t(a) %*% a               # sums of squares and cross products
    Sn <- ata / n                   # divide by n; replace n with n - 1 for the
    # unbiased estimate, which is what cov(...) returns
    return(Sn)
}
matriks1.1 <- varian.kovarian(data1.1)  # compute the variance-covariance matrix
print("Variance-covariance matrix : ")
print(matriks1.1)

## [1] "Vector Means : "
##       x1       x2 
## 4.285714 6.285714 
## [1] "Variance-covariance matrix : "
##          x1       x2
## x1 4.204082 3.704082
## x2 3.704082 3.561224

Results (variances and covariance computed with divisor \(n\)):

The mean of \(x_{1}\), \(\bar{x}_{1}\), is 4.2857.
The mean of \(x_{2}\), \(\bar{x}_{2}\), is 6.2857.
The sample variance of \(x_{1}\), \(s_{11}\), is 4.2041.
The sample variance of \(x_{2}\), \(s_{22}\), is 3.5612.
The sample covariance between \(x_{1}\) and \(x_{2}\), \(s_{12}\), is 3.7041.
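The relation between the divisor-\(n\) matrix and R's built-in `cov()` (divisor \(n - 1\)) can be checked directly. The following is a minimal sketch, not part of the original solution; the names `X`, `S_unbiased` and `S_n` are chosen here for illustration only.

```r
# Exercise 1.1 data
x1 <- c(3, 4, 2, 6, 8, 2, 5)
x2 <- c(5, 5.5, 4, 7, 10, 5, 7.5)
X <- cbind(x1, x2)
n <- nrow(X)

S_unbiased <- cov(X)               # built-in estimate, divisor n - 1
S_n <- S_unbiased * (n - 1) / n    # rescaled to divisor n

print(round(S_n, 4))
```

Rescaling `cov(X)` by \((n-1)/n\) should reproduce the matrix reported above, which is a quick way to confirm that a hand-written function of this kind is implemented correctly.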

Exercise 1.4

(a.) Plot the scatter diagram and marginal dot diagrams for variables \(x_{1}\) and \(x_{2}\). Comment on the appearance of the diagrams.

sales <- c(126974, 96933, 86656, 63438, 55264, 50976, 39069, 36156, 35209, 32416)  # x1
profits <- c(4224, 3835, 3510, 3758, 3939, 1809, 2946, 359, 2480, 2413)            # x2
data1.4 <- data.frame(sales, profits)
## Plot script adapted from:
## https://www.stat.ncsu.edu/people/bloomfield/courses/st731/dotplots
## Define a layout. Here we divide the plotting area into four subareas.
## For details, read the help file for function "layout".
nf <- layout(matrix(c(3, 1, 0, 2), 2, 2, byrow = TRUE), c(1, 5), c(5, 1), TRUE)
x1 <- data1.4$sales
x2 <- data1.4$profits
## Plot the scatter plot first.
## par(mar = c()) is used to leave proper margins around the plots.
par(mar = c(5, 4, 2, 2))
plot(x1, x2, xlim = c(33000, 127000), ylim = c(350, 4300),
     xlab = "X1, Sales, millions of dollars",
     ylab = "X2, Profits, millions of dollars",
     main = "Scatter Diagram and Marginal Dot Diagrams")

## Then plot the dot diagram of x1 under the scatter plot, and the dot diagram
## of x2 to the left of it.
## The axis settings are only there to make the plots look better.
par(mar = c(3, 4, 2, 2))
plot(x1, rep(1, 10), xlim = c(33000, 127000), ylim = c(1, 1), xlab = "", ylab = "", axes = FALSE)
axis(side = 3, at = seq(from = 33000, to = 127000, by = 3000))
par(mar = c(5, 3, 2, 2))
plot(rep(1, 10), x2, xlim = c(1, 1), ylim = c(350, 4300), xlab = "", ylab = "", axes = FALSE)
axis(side = 4, at = seq(from = 350, to = 4300, by = 300))

Data for \(x_{1}\) (sales) and \(x_{2}\) (profits), in millions of dollars:

##      [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] [,10]
## x1 126974 96933 86656 63438 55264 50976 39069 36156 35209 32416
## x2   4224  3835  3510  3758  3939  1809  2946   359  2480  2413

A marginal density diagram makes the distributions of \(x_{1}\) and \(x_{2}\) easier to see.

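library(ggplot2)  # ggplot(), geom_point(), theme(), ggtitle()
library(ggExtra)  # ggMarginal()
padat <- data.frame(x1, x2)
# Point size is a fixed setting, so it goes outside aes()
p <- ggplot(padat, aes(x = x1, y = x2)) +
      geom_point(size = 2) +
      theme(legend.position = "none") +
      ggtitle("Scatter diagram and marginal density diagrams")

# Add marginal density diagrams along both axes
p1 <- ggMarginal(p, type = "density", fill = "LightSalmon")
p1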

Interpretation of the scatter diagram and marginal dot diagrams:
The scatter diagram shows a positive relationship between sales (\(x_{1}\)) and profits (\(x_{2}\)): corporations with larger sales tend to have larger profits. The marginal dot diagram for \(x_{1}\) shows a right-skewed distribution (positive skewness), with most values clustered at the low end and a few very large values. The marginal dot diagram for \(x_{2}\) shows a left-skewed distribution (negative skewness), with most values at the high end and a few much smaller ones. The marginal density diagrams were added to make the shapes of both distributions clearer.
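The skewness reading above comes from visual inspection. As a rough numerical cross-check, one can compute the moment-based skewness coefficient \(g_{1} = m_{3} / m_{2}^{3/2}\), where \(m_{2}\) and \(m_{3}\) are the second and third central moments with divisor \(n\). The sketch below is not part of the original solution, and `skewness_g1` is a hypothetical helper defined here, not a base R function.

```r
# Moment-based sample skewness g1 = m3 / m2^(3/2), using divisor-n moments.
# skewness_g1 is a hypothetical helper written for this sketch.
skewness_g1 <- function(x) {
  d <- x - mean(x)     # deviations from the mean
  m2 <- mean(d^2)      # second central moment
  m3 <- mean(d^3)      # third central moment
  m3 / m2^(3 / 2)
}

sales <- c(126974, 96933, 86656, 63438, 55264, 50976, 39069, 36156, 35209, 32416)
profits <- c(4224, 3835, 3510, 3758, 3939, 1809, 2946, 359, 2480, 2413)

skew_sales <- skewness_g1(sales)
skew_profits <- skewness_g1(profits)
print(c(sales = skew_sales, profits = skew_profits))
```

A positive coefficient indicates right skew and a negative one left skew, so the signs for sales and profits should match the interpretation given above. With only 10 observations the magnitudes are imprecise and should be read as indicative only.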

(b.) Compute \(\bar{x}_{1}\), \(\bar{x}_{2}\), \(s_{11}\), \(s_{22}\), \(s_{12}\) and \(r_{12}\). Interpret \(r_{12}\).

sales <- c(126974,96933,86656,63438,55264,50976,39069,36156,35209,32416)
profits <- c(4224,3835,3510,3758,3939,1809,2946,359,2480,2413)
data1.4 <- data.frame(sales,profits)
xbar1.4 <- colMeans(data1.4)
xbar1.4 # Vector means

##   sales profits 
## 62309.1  2927.3

matriks1.4 <- varian.kovarian(data1.4)
matriks1.4 # Variance-covariance matrix (divisor n)

##             sales  profits
## sales   900458202 23018040
## profits  23018040  1287018

korelasi <- cov2cor(matriks1.4)
korelasi # Correlation matrix

##             sales   profits
## sales   1.0000000 0.6761519
## profits 0.6761519 1.0000000

Results (variances and covariance computed with divisor \(n\)):

The mean of sales, \(\bar{x}_{1}\), is 62309.1.
The mean of profits, \(\bar{x}_{2}\), is 2927.3.
The sample variance of sales, \(s_{11}\), is 900458202.
The sample variance of profits, \(s_{22}\), is 1287018.
The sample covariance between sales and profits, \(s_{12}\), is 23018040.
The sample correlation between sales and profits, \(r_{12}\), is 0.6762.

Interpretation of \(r_{12}\):
The correlation between sales and profits is 0.6762, which indicates a moderately strong, positive linear relationship between the two variables.
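The correlation coefficient can also be computed from its definition, \(r_{12} = s_{12} / \sqrt{s_{11} s_{22}}\), and compared with R's built-in `cor()`. The sketch below is not part of the original solution and the variable names are chosen here for illustration. Because the divisor cancels in the ratio, \(r_{12}\) is the same whether the variances and covariance use \(n\) or \(n - 1\).

```r
sales <- c(126974, 96933, 86656, 63438, 55264, 50976, 39069, 36156, 35209, 32416)
profits <- c(4224, 3835, 3510, 3758, 3939, 1809, 2946, 359, 2480, 2413)

# Divisor-n variances and covariance, written as means of (cross) products
d1 <- sales - mean(sales)
d2 <- profits - mean(profits)
s11 <- mean(d1^2)
s22 <- mean(d2^2)
s12 <- mean(d1 * d2)

r12 <- s12 / sqrt(s11 * s22)         # correlation from the definition
r12_builtin <- cor(sales, profits)   # built-in Pearson correlation

print(c(from_definition = r12, builtin = r12_builtin))
```

Both values should agree with the off-diagonal entry of the correlation matrix printed above.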

By: Fayadh Abiyyi