Lesson 1: Measures of Central Tendency, Dispersion and Association
- Example 1-5: Women’s Health Survey (Descriptive Statistics)
| Variable | n | Mean | StdDev | Minimum | Maximum |
|---|---|---|---|---|---|
| calcium | 737 | 624.04925 | 397.277540 | 7.44 | 2866.440 |
| iron | 737 | 11.12990 | 5.984191 | 0.00 | 58.668 |
| protein | 737 | 65.80344 | 30.575756 | 0.00 | 251.012 |
| vitamin A | 737 | 839.63535 | 1633.539828 | 0.00 | 34434.270 |
| vitamin C | 737 | 78.92845 | 73.595272 | 0.00 | 433.339 |
- The variance-covariance matrix:

| | calcium | iron | protein | vitamin A | vitamin C |
|---|---|---|---|---|---|
| calcium | 157829.4439 | 940.08944 | 6075.8163 | 102411.127 | 6701.6160 |
| iron | 940.0894 | 35.81054 | 114.0580 | 2383.153 | 137.6720 |
| protein | 6075.8163 | 114.05803 | 934.8769 | 7330.052 | 477.1998 |
| vitamin A | 102411.1266 | 2383.15341 | 7330.0515 | 2668452.371 | 22063.2486 |
| vitamin C | 6701.6160 | 137.67199 | 477.1998 | 22063.249 | 5416.2641 |
- Sample correlations
| | calcium | iron | protein | vitamin A | vitamin C |
|---|---|---|---|---|---|
| calcium | 1.0000000 | 0.3954301 | 0.5001882 | 0.1578060 | 0.2292111 |
| iron | 0.3954301 | 1.0000000 | 0.6233662 | 0.2437905 | 0.3126009 |
| protein | 0.5001882 | 0.6233662 | 1.0000000 | 0.1467574 | 0.2120670 |
| vitamin A | 0.1578060 | 0.2437905 | 0.1467574 | 1.0000000 | 0.1835227 |
| vitamin C | 0.2292111 | 0.3126009 | 0.2120670 | 0.1835227 | 1.0000000 |
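
As a small check on how the two tables relate, each sample correlation is the covariance divided by the product of the two standard deviations. The sketch below uses only values copied from the variance-covariance matrix; the function name `correlation` is illustrative and not part of the original analysis.

```python
import math

def correlation(s_ij, s_ii, s_jj):
    """Sample correlation from a covariance and the two variances."""
    return s_ij / math.sqrt(s_ii * s_jj)

# Covariance of calcium and iron, then the two variances,
# copied from the variance-covariance matrix.
r_calcium_iron = correlation(940.08944, 157829.4439, 35.81054)
print(round(r_calcium_iron, 4))
```

The result should agree with the calcium/iron entry of the correlation table up to rounding of the inputs.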
- Example 1-6: Women’s Health Survey (Variance)

| Variable | Variance |
|---|---|
| Calcium | 157829.443870502 |
| Iron | 35.810535582296 |
| Protein | 934.87688135283 |
| Vitamin A | 2668452.37064258 |
| Vitamin C | 5416.26407815779 |
| Total | 2832668.76600817 |
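
The "Total" row is the total variance, that is, the sum of the individual variances (the trace of the variance-covariance matrix). A minimal sketch that reproduces it from the table values:

```python
# Variances copied from the table above.
variances = {
    "Calcium": 157829.443870502,
    "Iron": 35.810535582296,
    "Protein": 934.87688135283,
    "Vitamin A": 2668452.37064258,
    "Vitamin C": 5416.26407815779,
}

# Total variance = sum of the diagonal of the covariance matrix.
total_variance = sum(variances.values())
print(total_variance)
```

Vitamin A alone accounts for most of this total, which is worth keeping in mind because the total variance depends on the measurement units of each variable.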
- Example 1-7: Women’s Health Survey (Generalized Variance)
Example: Nutrient Intake Data - Generalized variance

| | calcium | iron | protein | vitamin A | vitamin C | generalized variance |
|---|---|---|---|---|---|---|
| calcium | 157829.4439 | 940.08944 | 6075.8163 | 102411.127 | 6701.6160 | 2.831042e+19 |
| iron | 940.0894 | 35.81054 | 114.0580 | 2383.153 | 137.6720 | 2.831042e+19 |
| protein | 6075.8163 | 114.05803 | 934.8769 | 7330.052 | 477.1998 | 2.831042e+19 |
| vitamin A | 102411.1266 | 2383.15341 | 7330.0515 | 2668452.371 | 22063.2486 | 2.831042e+19 |
| vitamin C | 6701.6160 | 137.67199 | 477.1998 | 22063.249 | 5416.2641 | 2.831042e+19 |
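
The generalized variance is the determinant of the variance-covariance matrix. The sketch below computes it with plain Gaussian elimination so that it needs no external library; the helper `determinant` is illustrative, and the matrix entries are the rounded values printed above (symmetrized), so the result agrees with 2.831042e+19 only up to rounding.

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]  # work on a copy
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

# Order: calcium, iron, protein, vitamin A, vitamin C.
S = [
    [157829.4439, 940.08944, 6075.8163, 102411.127, 6701.6160],
    [940.08944, 35.81054, 114.0580, 2383.153, 137.6720],
    [6075.8163, 114.0580, 934.8769, 7330.052, 477.1998],
    [102411.127, 2383.153, 7330.052, 2668452.371, 22063.2486],
    [6701.6160, 137.6720, 477.1998, 22063.2486, 5416.2641],
]

generalized_variance = determinant(S)
print(f"{generalized_variance:.4e}")
```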
Lesson 2: Linear Combinations of Random Variables
- 2-3: Women’s Health Survey (Population Mean)
| Variable | Mean |
|---|---|
| Calcium | 624.04925 |
| Iron | 11.12990 |
| Protein | 65.80344 |
| Vitamin A | 839.63535 |
| Vitamin C | 78.92845 |
[1] "Estimated mean intake of the two vitamins:"
[1] 79.76808
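
The printed estimate is the mean of a linear combination of the nutrient means. It is reproduced by Y = 0.001 × vitamin A + vitamin C; note that the weight 0.001 is an inference from the output (it would put the two vitamins on a common unit) and is not stated in this section. A minimal sketch under that assumption:

```python
# Sample means in the order calcium, iron, protein, vitamin A, vitamin C.
means = [624.04925, 11.12990, 65.80344, 839.63535, 78.92845]

# ASSUMPTION: coefficients inferred from the printed result, not given in the text.
coefficients = [0.0, 0.0, 0.0, 0.001, 1.0]

# Mean of a linear combination: sum of coefficient times mean.
y_mean = sum(c * m for c, m in zip(coefficients, means))
print(round(y_mean, 5))
```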
- 2-4: Women’s Health Survey (Population Variance)
| | calcium | iron | protein | vitamin A | vitamin C |
|---|---|---|---|---|---|
| calcium | 157829.4439 | 940.08944 | 6075.8163 | 102411.127 | 6701.6160 |
| iron | 940.0894 | 35.81054 | 114.0580 | 2383.153 | 137.6720 |
| protein | 6075.8163 | 114.05803 | 934.8769 | 7330.052 | 477.1998 |
| vitamin A | 102411.1266 | 2383.15341 | 7330.0515 | 2668452.371 | 22063.2486 |
| vitamin C | 6701.6160 | 137.67199 | 477.1998 | 22063.249 | 5416.2641 |
[1] "The sample variance of Y is:"
[1] 5463.059
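
The sample variance of a linear combination Y = c'X is the quadratic form c'Sc. The sketch below assumes the same coefficients as the mean example, Y = 0.001 × vitamin A + vitamin C; the 0.001 weight is inferred from the printed output rather than stated here, and `quadratic_form` is an illustrative helper.

```python
def quadratic_form(c, S):
    """Return c' S c for a coefficient vector c and square matrix S."""
    n = len(c)
    return sum(c[i] * S[i][j] * c[j] for i in range(n) for j in range(n))

# Order: calcium, iron, protein, vitamin A, vitamin C (rounded table values).
S = [
    [157829.4439, 940.08944, 6075.8163, 102411.127, 6701.6160],
    [940.08944, 35.81054, 114.0580, 2383.153, 137.6720],
    [6075.8163, 114.0580, 934.8769, 7330.052, 477.1998],
    [102411.127, 2383.153, 7330.052, 2668452.371, 22063.2486],
    [6701.6160, 137.6720, 477.1998, 22063.2486, 5416.2641],
]

# ASSUMPTION: coefficients inferred from the printed result.
c = [0.0, 0.0, 0.0, 0.001, 1.0]

var_y = quadratic_form(c, S)
print(round(var_y, 3))
```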
- 2-5: Women’s Health Survey (Pop. Covariance and Correlation)

| | calcium | iron | protein | vitamin A | vitamin C |
|---|---|---|---|---|---|
| calcium | 157829.4439 | 940.08944 | 6075.8163 | 102411.127 | 6701.6160 |
| iron | 940.0894 | 35.81054 | 114.0580 | 2383.153 | 137.6720 |
| protein | 6075.8163 | 114.05803 | 934.8769 | 7330.052 | 477.1998 |
| vitamin A | 102411.1266 | 2383.15341 | 7330.0515 | 2668452.371 | 22063.2486 |
| vitamin C | 6701.6160 | 137.67199 | 477.1998 | 22063.249 | 5416.2641 |
Note that \(Y_1\) is the total intake of vitamins A and C and \(Y_2\) is the total intake of calcium and iron.
[1] "The sample covariance between Y_1 and Y_2:"
[1] 6944.082
[1] "The sample variance of Y_2 is:"
[1] 159745.4
[1] "The sample correlation between Y_1 and Y_2 is:"
[1] 0.2350621
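
The covariance between two linear combinations Y_1 = c'X and Y_2 = d'X is the bilinear form c'Sd, and the correlation follows by dividing by the two standard deviations. The sketch below assumes Y_1 = 0.001 × vitamin A + vitamin C and Y_2 = calcium + iron; these coefficients are inferred from the printed results and are not spelled out in the text.

```python
import math

def bilinear_form(c, S, d):
    """Return c' S d, the covariance between c'X and d'X."""
    n = len(c)
    return sum(c[i] * S[i][j] * d[j] for i in range(n) for j in range(n))

# Order: calcium, iron, protein, vitamin A, vitamin C (rounded table values).
S = [
    [157829.4439, 940.08944, 6075.8163, 102411.127, 6701.6160],
    [940.08944, 35.81054, 114.0580, 2383.153, 137.6720],
    [6075.8163, 114.0580, 934.8769, 7330.052, 477.1998],
    [102411.127, 2383.153, 7330.052, 2668452.371, 22063.2486],
    [6701.6160, 137.6720, 477.1998, 22063.2486, 5416.2641],
]

# ASSUMPTION: coefficient vectors inferred from the printed output.
c = [0.0, 0.0, 0.0, 0.001, 1.0]  # Y_1: vitamins A and C
d = [1.0, 1.0, 0.0, 0.0, 0.0]    # Y_2: calcium and iron

cov_y1_y2 = bilinear_form(c, S, d)
var_y1 = bilinear_form(c, S, c)
var_y2 = bilinear_form(d, S, d)
corr_y1_y2 = cov_y1_y2 / math.sqrt(var_y1 * var_y2)
print(round(cov_y1_y2, 3), round(var_y2, 1), round(corr_y1_y2, 5))
```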
Lesson 3: Graphical Display of Multivariate Data
Lesson 4: Multivariate Normal Distribution
- Bivariate Normal Distribution
- 4.3 Calculating Mahalanobis Distance
| dist2 |
|---|
| 0.1546640 |
| 8.9860868 |
| 3.8413545 |
| 2.8801184 |
| 0.6785061 |
| 0.7314023 |
| 0.6988023 |
| 1.1029161 |
| 37.6264510 |
| 1.0880096 |
| 0.6447859 |
| 0.8679835 |
| 0.2064665 |
| 0.0794967 |
| 0.2787355 |
| 4.6633784 |
| 12.1952311 |
| 6.0454561 |
| 0.2917265 |
| 1.0271777 |
| 4.6434462 |
| 0.8539685 |
| 2.1987106 |
| 1.5010624 |
| 1.2807514 |
| 2.8544397 |
| 1.3753355 |
| 0.9072836 |
| 10.5129471 |
| 5.7833057 |
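
Each `dist2` value is a squared Mahalanobis distance, d² = (x − x̄)' S⁻¹ (x − x̄). The data behind the table are not included in this section, so the sketch below uses hypothetical bivariate numbers purely to illustrate the calculation; the function name and all inputs are assumptions.

```python
def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance for a bivariate observation."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # Closed-form inverse of a 2 x 2 matrix.
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx = [x[0] - mean[0], x[1] - mean[1]]
    return sum(dx[i] * inv[i][j] * dx[j] for i in range(2) for j in range(2))

# HYPOTHETICAL values, not taken from the lesson's data set.
mean = [10.0, 20.0]
cov = [[4.0, 1.0], [1.0, 9.0]]
d2 = mahalanobis_sq([12.0, 17.0], mean, cov)
print(d2)
```

With an identity covariance matrix the same function reduces to the squared Euclidean distance, which is a quick way to sanity-check an implementation.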

- 4.7 - Example: Wechsler Adult Intelligence Scale
| Variable | Mean |
|---|---|
| Information | 12.567568 |
| Similarities | 9.567568 |
| Arithmetic | 11.486486 |
| Picture Completion | 7.972973 |

| | info | sim | arith | pict |
|---|---|---|---|---|
| info | 11.474474 | 9.0855856 | 6.382883 | 2.0713213 |
| sim | 9.085586 | 12.0855856 | 5.938438 | 0.5435435 |
| arith | 6.382883 | 5.9384384 | 11.090090 | 1.7912913 |
| pict | 2.071321 | 0.5435435 | 1.791291 | 3.6936937 |
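The eigenvalues and corresponding eigenvectors are given below. In the second table each row is one eigenvector, listed in the same order as the eigenvalues, and the columns X1 to X4 are its coefficients on info, sim, arith and pict.

| Eigenvalue |
|---|
| 26.245278 |
| 6.255366 |
| 3.931553 |
| 1.911647 |

| | X1 | X2 | X3 | X4 |
|---|---|---|---|---|
| Eigenvector 1 | -0.6057467 | -0.6047618 | -0.5051337 | -0.1103360 |
| Eigenvector 2 | 0.2176473 | 0.4958117 | -0.7946452 | -0.2744802 |
| Eigenvector 3 | 0.4605028 | -0.3196759 | -0.3349263 | 0.7573433 |
| Eigenvector 4 | 0.6112591 | -0.5350152 | 0.0346888 | -0.5821664 |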
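
Two quick consistency checks can be run on this output without an eigen-solver: an eigenpair must satisfy S v = λ v, and the eigenvalues must sum to the trace of S (the total variance). The sketch below pairs the largest eigenvalue with the vector (-0.6057467, -0.6047618, -0.5051337, -0.1103360) from the eigenvector output; all numbers are copied from the tables and the variable names are illustrative.

```python
# Variance-covariance matrix in the order info, sim, arith, pict.
S = [
    [11.474474, 9.0855856, 6.382883, 2.0713213],
    [9.0855856, 12.0855856, 5.938438, 0.5435435],
    [6.382883, 5.938438, 11.090090, 1.7912913],
    [2.0713213, 0.5435435, 1.7912913, 3.6936937],
]
eigenvalues = [26.245278, 6.255366, 3.931553, 1.911647]
v1 = [-0.6057467, -0.6047618, -0.5051337, -0.1103360]

# Check S v = lambda v for the first eigenpair.
Sv = [sum(S[i][j] * v1[j] for j in range(4)) for i in range(4)]
residual = max(abs(Sv[i] - eigenvalues[0] * v1[i]) for i in range(4))

# Check that the eigenvalues add up to the total variance.
trace = sum(S[i][i] for i in range(4))
print(residual, trace, sum(eigenvalues))
```

The first eigenvalue accounts for roughly 68% of the total variance (26.245278 out of about 38.34).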
Lesson 6: Multivariate Conditional Distribution and Partial Correlation
- 6-2: Wechsler Adult Intelligence Scale
Example 6-2: Wechsler Adult Intelligence Data
| | info | sim | arith | pict |
|---|---|---|---|---|
| info | 1.0000000 | 0.7118787 | 0.2267109 | 0.3463270 |
| sim | 0.7118787 | 1.0000000 | 0.1890167 | -0.2962762 |
| arith | 0.2267109 | 0.1890167 | 1.0000000 | 0.1758075 |
| pict | 0.3463270 | -0.2962762 | 0.1758075 | 1.0000000 |
The partial correlation between Information and Similarities, given Arithmetic and Picture Completion, is \(r = 0.7118787\).
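
One way to obtain this value is from the conditional covariance matrix S11 − S12 S22⁻¹ S21, where the first block holds Information and Similarities and the second block holds Arithmetic and Picture Completion. The sketch below is an illustration under that approach, using the rounded WAIS variance-covariance matrix from Lesson 4; the function name is an assumption and this is not necessarily how the original output was produced.

```python
import math

def partial_corr_given_two(S, i, j, k, l):
    """Partial correlation of variables i and j given variables k and l."""
    det = S[k][k] * S[l][l] - S[k][l] * S[l][k]
    # Inverse of the 2 x 2 block for the conditioning variables.
    inv = [[S[l][l] / det, -S[k][l] / det],
           [-S[l][k] / det, S[k][k] / det]]
    cond = [k, l]

    def cond_cov(a, b):
        adjustment = sum(S[a][cond[p]] * inv[p][q] * S[cond[q]][b]
                         for p in range(2) for q in range(2))
        return S[a][b] - adjustment

    return cond_cov(i, j) / math.sqrt(cond_cov(i, i) * cond_cov(j, j))

# Order: info, sim, arith, pict.
S = [
    [11.474474, 9.0855856, 6.382883, 2.0713213],
    [9.0855856, 12.0855856, 5.938438, 0.5435435],
    [6.382883, 5.938438, 11.090090, 1.7912913],
    [2.0713213, 0.5435435, 1.7912913, 3.6936937],
]

r_partial = partial_corr_given_two(S, 0, 1, 2, 3)
print(round(r_partial, 5))
```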
| | info | sim | arith | pict |
|---|---|---|---|---|
| info | Inf | 1.0240963 | 0.6413618 | 0.3296027 |
| sim | 1.0240963 | Inf | 0.5667183 | 0.0815325 |
| arith | 0.6413618 | 0.5667183 | Inf | 0.2875493 |
| pict | 0.3296027 | 0.0815325 | 0.2875493 | Inf |
- Example 6-3: Wechsler Adult Intelligence Data
| | |
|---|---|
| T-test Statistic (t) | 5.822893 |
| Fisher Transform (z) | 0.890982564683951 |
| 95% CI for z | ( 0.5445 , 1.2375 ) |
| 95% CI for r | ( 0.4964 , 0.8447 ) |
We are 95% confident that the interval (0.4964, 0.8447) contains the partial correlation between Information and Similarities scores, given the scores on Arithmetic and Picture Completion.
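
The statistics above can be reproduced from the partial correlation r = 0.7118787 with the usual t statistic and Fisher's z transform. The sample size is not stated in this section; the sketch assumes n = 37 with k = 2 conditioning variables because those values reproduce the reported numbers, and it uses the normal critical value 1.959964 for the 95% interval.

```python
import math

r = 0.7118787   # partial correlation from the text
n = 37          # ASSUMPTION: sample size, chosen to match the reported output
k = 2           # number of conditioning variables (arith, pict)

# t statistic for testing that the partial correlation is zero.
t_stat = r * math.sqrt((n - 2 - k) / (1 - r ** 2))

# Fisher transform and its approximate standard error.
z = math.atanh(r)
half_width = 1.959964 / math.sqrt(n - 3 - k)
ci_z = (z - half_width, z + half_width)

# Back-transform the interval end points to the correlation scale.
ci_r = (math.tanh(ci_z[0]), math.tanh(ci_z[1]))
print(round(t_stat, 4), round(z, 4), ci_z, ci_r)
```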
Lesson 7: Inferences Regarding Multivariate Population Mean
- 7-1: Women’s Survey Data (Hotelling’s \(T^2\) Test)
The sample variance-covariance matrix:
| | calcium | iron | protein | vitamin A | vitamin C |
|---|---|---|---|---|---|
| calcium | 157829.4439 | 940.08944 | 6075.8163 | 102411.127 | 6701.6160 |
| iron | 940.0894 | 35.81054 | 114.0580 | 2383.153 | 137.6720 |
| protein | 6075.8163 | 114.05803 | 934.8769 | 7330.052 | 477.1998 |
| vitamin A | 102411.1266 | 2383.15341 | 7330.0515 | 2668452.371 | 22063.2486 |
| vitamin C | 6701.6160 | 137.67199 | 477.1998 | 22063.249 | 5416.2641 |

| Statistic | Value |
|---|---|
| T^2 | 1758.5413 |
| F | 349.7968 |
| df1 | 5.0000 |
| df2 | 732.0000 |
| p | 0.0000 |
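
The F statistic and its degrees of freedom follow from the one-sample Hotelling's T² through F = (n − p) / (p (n − 1)) · T², with p and n − p degrees of freedom. A short sketch with n = 737 women and p = 5 nutrients; the function name is illustrative.

```python
def hotelling_one_sample_f(t2, n, p):
    """Convert a one-sample Hotelling T^2 into (F, df1, df2)."""
    f_stat = (n - p) / (p * (n - 1)) * t2
    return f_stat, p, n - p

f_stat, df1, df2 = hotelling_one_sample_f(1758.5413, n=737, p=5)
print(round(f_stat, 4), df1, df2)
```

The same conversion applies to paired-difference tests, with n the number of pairs and p the number of difference variables.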
- 7-2 to 7-4: Confidence Intervals
The one-at-a-time confidence intervals are given by the columns “loone” and “upone”, the Bonferroni intervals by “lobon” and “upbon”, and the simultaneous confidence intervals by “losim” and “upsim”.
| Variable | n | means | var | t1 | tb | f | loone | upone | lobon | upbon | losim | upsim |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| calcium | 737 | 624.04925 | 1.578294e+05 | 1.963192 | 2.582526 | 2.22634 | 595.32008 | 652.77843 | 586.25681 | 661.84169 | 575.09118 | 673.00733 |
| iron | 737 | 11.12990 | 3.581054e+01 | 1.963192 | 2.582526 | 2.22634 | 10.69715 | 11.56265 | 10.56063 | 11.69917 | 10.39244 | 11.86735 |
| protein | 737 | 65.80344 | 9.348769e+02 | 1.963192 | 2.582526 | 2.22634 | 63.59235 | 68.01453 | 62.89481 | 68.71207 | 62.03547 | 69.57141 |
| vitamin A | 737 | 839.63535 | 2.668452e+06 | 1.963192 | 2.582526 | 2.22634 | 721.50572 | 957.76498 | 684.23906 | 995.03163 | 638.32780 | 1040.94289 |
| vitamin C | 737 | 78.92845 | 5.416264e+03 | 1.963192 | 2.582526 | 2.22634 | 73.60640 | 84.25050 | 71.92743 | 85.92946 | 69.85901 | 87.99788 |
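
All three interval types share the form mean ± multiplier × sqrt(var / n) and differ only in the multiplier: t1 for one-at-a-time, tb for Bonferroni, and sqrt(p (n − 1) / (n − p) · f) for simultaneous intervals. The sketch below reproduces the calcium row using the critical values t1, tb and f printed in the table; the function name is illustrative.

```python
import math

def mean_intervals(mean, var, n, p, t1, tb, f):
    """One-at-a-time, Bonferroni and simultaneous intervals for one mean."""
    se = math.sqrt(var / n)
    multipliers = {
        "one": t1,
        "bon": tb,
        "sim": math.sqrt(p * (n - 1) / (n - p) * f),
    }
    return {name: (mean - m * se, mean + m * se)
            for name, m in multipliers.items()}

# Calcium row; critical values are copied from the table.
calcium = mean_intervals(mean=624.04925, var=157829.4439, n=737, p=5,
                         t1=1.963192, tb=2.582526, f=2.22634)
for name, (lower, upper) in calcium.items():
    print(name, round(lower, 3), round(upper, 3))
```

The ordering of the widths (one-at-a-time narrowest, simultaneous widest) is visible directly in the multipliers: about 1.96, 2.58 and 3.35.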
d1 d2 d3 d4
d1 0.82298851 0.07816092 -0.0137931 -0.05977011
d2 0.07816092 0.80919540 -0.2137931 -0.15632184
d3 -0.01379310 -0.21379310 0.5620690 0.51034483
d4 -0.05977011 -0.15632184 0.5103448 0.60229885
Test Statistics
| Test | Value |
|---|---|
| T^2 | 13.1278403 |
| F | 2.9424470 |
| df1 | 4.0000000 |
| df2 | 26.0000000 |
| p | 0.0393691 |
- Example 7-10: Spouse Data (Bonferroni CI)
| | Simultaneous_L | Simultaneous_U | Bonferroni_L | Bonferroni_U |
|---|---|---|---|---|
| d1 | -0.5127078 | 0.6460411 | -0.3744357 | 0.5077690 |
| d2 | -0.7078322 | 0.4411655 | -0.5707236 | 0.3040570 |
| d3 | -0.7788034 | 0.1788034 | -0.6645333 | 0.0645333 |
| d4 | -0.6289758 | 0.3623091 | -0.5106869 | 0.2440202 |

- Example 7-11: Spouse Data (Question 2)
Test Statistics
| Test | Value |
|---|---|
| T^2 | 6.4266211 |
| F | 1.4404495 |
| df1 | 4.0000000 |
| df2 | 26.0000000 |
| p | 0.2488886 |
- Example 7-13: Swiss Banknotes (Two-Sample Hotelling’s)
Mean values for real and fake groups
| Group | Length | Left | Right | Bottom | Top | Diag |
|---|---|---|---|---|---|---|
| Genuine | 214.969 | 129.943 | 129.720 | 8.305 | 10.168 | 141.517 |
| Counterfeit | 214.823 | 130.300 | 130.193 | 10.530 | 11.133 | 139.450 |
Variance-covariance Matrix for Genuine Notes
| | length | left | right | bottom | top | diag |
|---|---|---|---|---|---|---|
| length | 0.1502414 | 0.0580131 | 0.0572929 | 0.0571263 | 0.0144525 | 0.0054818 |
| left | 0.0580131 | 0.1325768 | 0.0858990 | 0.0566515 | 0.0490667 | -0.0430616 |
| right | 0.0572929 | 0.0858990 | 0.1262626 | 0.0581818 | 0.0306465 | -0.0237778 |
| bottom | 0.0571263 | 0.0566515 | 0.0581818 | 0.4132071 | -0.2634747 | -0.0001869 |
| top | 0.0144525 | 0.0490667 | 0.0306465 | -0.2634747 | 0.4211879 | -0.0753091 |
| diag | 0.0054818 | -0.0430616 | -0.0237778 | -0.0001869 | -0.0753091 | 0.1998091 |
Variance-covariance Matrix for Counterfeit Notes
| | length | left | right | bottom | top | diag |
|---|---|---|---|---|---|---|
| length | 0.1240111 | 0.0315152 | 0.0240010 | -0.1005960 | 0.0194354 | 0.0115657 |
| left | 0.0315152 | 0.0650505 | 0.0467677 | -0.0240404 | -0.0119192 | -0.0050505 |
| right | 0.0240010 | 0.0467677 | 0.0889404 | -0.0185758 | 0.0001323 | 0.0341919 |
| bottom | -0.1005960 | -0.0240404 | -0.0185758 | 1.2813131 | -0.4901919 | 0.2384848 |
| top | 0.0194354 | -0.0119192 | 0.0001323 | -0.4901919 | 0.4044556 | -0.0220707 |
| diag | 0.0115657 | -0.0050505 | 0.0341919 | 0.2384848 | -0.0220707 | 0.3112121 |
Pooled Variance-Covariance Matrix
| | length | left | right | bottom | top | diag |
|---|---|---|---|---|---|---|
| length | 0.1371263 | 0.0447641 | 0.0406470 | -0.0217348 | 0.0169439 | 0.0085237 |
| left | 0.0447641 | 0.0988136 | 0.0663333 | 0.0163056 | 0.0185737 | -0.0240561 |
| right | 0.0406470 | 0.0663333 | 0.1076015 | 0.0198030 | 0.0153894 | 0.0052071 |
| bottom | -0.0217348 | 0.0163056 | 0.0198030 | 0.8472601 | -0.3768333 | 0.1191490 |
| top | 0.0169439 | 0.0185737 | 0.0153894 | -0.3768333 | 0.4128217 | -0.0486899 |
| diag | 0.0085237 | -0.0240561 | 0.0052071 | 0.1191490 | -0.0486899 | 0.2555106 |
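
Each entry of the pooled matrix is the degrees-of-freedom weighted average of the corresponding entries of the two group matrices, S_p = ((n1 − 1) S1 + (n2 − 1) S2) / (n1 + n2 − 2). With 100 genuine and 100 counterfeit notes this reduces to a simple average. A minimal element-wise sketch, with an illustrative function name:

```python
def pooled(s1, s2, n1, n2):
    """Pooled estimate of one variance or covariance entry."""
    return ((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)

# Variance of "length" and of "bottom" in the genuine and counterfeit groups.
pooled_length = pooled(0.1502414, 0.1240111, n1=100, n2=100)
pooled_bottom = pooled(0.4132071, 1.2813131, n1=100, n2=100)
print(round(pooled_length, 7), round(pooled_bottom, 7))
```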
- Example 7-14: Women’s Health Survey
[,1]
diff1 0.117944052
diff2 0.354730710
diff3 -0.047179834
diff4 0.002835103
diff1 diff2 diff3 diff4
diff1 0.19164212 -0.07101776 0.04511467 -0.03756199
diff2 -0.07101776 0.16538367 -0.17884359 0.02955601
diff3 0.04511467 -0.17884359 4.12372604 -3.75507101
diff4 -0.03756199 0.02955601 -3.75507101 4.39690660
| Statistic | Value |
|---|---|
| T^2 | 1030.795 |
| F | 256.648 |
| df1 | 4.000 |
| df2 | 733.000 |
| p | 0.000 |
- Example 7-15: Swiss Banknotes
Confidence Intervals - Swiss Bank Notes
| variable | n1 | xbar1 | s21 | n2 | xbar2 | s22 | sp | losim | upsim | lobon | upbon |
|---|---|---|---|---|---|---|---|---|---|---|---|
| bottom | 100 | 8.305 | 0.4132071 | 100 | 10.530 | 1.2813131 | 0.8472601 | -2.6980943 | -1.7519057 | -2.5719164 | -1.8780836 |
| diag | 100 | 141.517 | 0.1998091 | 100 | 139.450 | 0.3112121 | 0.2555106 | 1.8071972 | 2.3268028 | 1.8764886 | 2.2575114 |
| left | 100 | 129.943 | 0.1325768 | 100 | 130.300 | 0.0650505 | 0.0988136 | -0.5185652 | -0.1954348 | -0.4754745 | -0.2385255 |
| length | 100 | 214.969 | 0.1502414 | 100 | 214.823 | 0.1240111 | 0.1371263 | -0.0443267 | 0.3363267 | 0.0064349 | 0.2855651 |
| right | 100 | 129.720 | 0.1262626 | 100 | 130.193 | 0.0889404 | 0.1076015 | -0.6415965 | -0.3044035 | -0.5966305 | -0.3493695 |
| top | 100 | 10.168 | 0.4211879 | 100 | 11.133 | 0.4044556 | 0.4128217 | -1.2952331 | -0.6347669 | -1.2071574 | -0.7228426 |
Simultaneous 95% Confidence Intervals - Swiss Bank Notes
| variable | losim | upsim |
|---|---|---|
| bottom | -2.6980943 | -1.7519057 |
| diag | 1.8071972 | 2.3268028 |
| left | -0.5185652 | -0.1954348 |
| length | -0.0443267 | 0.3363267 |
| right | -0.6415965 | -0.3044035 |
| top | -1.2952331 | -0.6347669 |
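
For two samples with a pooled variance, the simultaneous interval for a difference in means is (x̄1 − x̄2) ± sqrt(p (n1 + n2 − 2) / (n1 + n2 − p − 1) · F) · sqrt((1/n1 + 1/n2) · sp). The critical value is not printed in this section; the sketch below assumes F ≈ 2.1458 for the upper 5% point of F with 6 and 193 degrees of freedom, a value chosen to be consistent with the printed intervals, so treat it as an approximation.

```python
import math

def two_sample_simultaneous(xbar1, xbar2, sp, n1, n2, p, f_crit):
    """Simultaneous interval for one mean difference (pooled variance sp)."""
    diff = xbar1 - xbar2
    multiplier = math.sqrt(p * (n1 + n2 - 2) / (n1 + n2 - p - 1) * f_crit)
    half_width = multiplier * math.sqrt((1 / n1 + 1 / n2) * sp)
    return diff - half_width, diff + half_width

# ASSUMPTION: approximate 5% critical value of F(6, 193).
F_CRIT = 2.1458

# "bottom" row: genuine mean, counterfeit mean and pooled variance.
lower, upper = two_sample_simultaneous(8.305, 10.530, 0.8472601,
                                       n1=100, n2=100, p=6, f_crit=F_CRIT)
print(round(lower, 4), round(upper, 4))
```

Every interval except the one for length excludes zero, which matches the sign pattern in the table.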
- 7.2.4 - Bonferroni Corrected (1 - α) x 100% Confidence Intervals
95% Simultaneous Confidence Intervals (Bonferroni corrected) - Swiss Bank Notes
| variable | losim_adj | upsim_adj |
|---|---|---|
| bottom | -2.5719890 | -1.8780110 |
| diag | 1.8764488 | 2.2575512 |
| left | -0.4754993 | -0.2385007 |
| length | 0.0064057 | 0.2855943 |
| right | -0.5966564 | -0.3493436 |
| top | -1.2072080 | -0.7227920 |
- Profile Plots

- 7.2.6 - Example (Box’s Test )
Box's M-test for Homogeneity of Covariance Matrices
data: swiss[, -1]
Chi-Sq (approx.) = 121.9, df = 21, p-value = 3.198e-16
| Statistic | Value |
|---|---|
| T^2 | 2412.451 |
| F | 391.922 |
| df1 | 6.000 |
| df2 | 193.000 |
| p | 0.000 |
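
For the two-sample test, the F statistic is obtained from T² as F = (n1 + n2 − p − 1) / (p (n1 + n2 − 2)) · T², with p and n1 + n2 − p − 1 degrees of freedom. A short sketch with 100 notes per group and p = 6 measurements; the function name is illustrative.

```python
def hotelling_two_sample_f(t2, n1, n2, p):
    """Convert a two-sample Hotelling T^2 into (F, df1, df2)."""
    df2 = n1 + n2 - p - 1
    f_stat = df2 / (p * (n1 + n2 - 2)) * t2
    return f_stat, p, df2

f_stat, df1, df2 = hotelling_two_sample_f(2412.451, n1=100, n2=100, p=6)
print(round(f_stat, 3), df1, df2)
```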
- Example 7-16: Swiss Bank Notes
Simultaneous (1 - α) x 100% Confidence Intervals
| variable | losim_adj | upsim_adj |
|---|---|---|
| bottom | -2.6868875 | -1.7631125 |
| diag | 1.8133515 | 2.3206485 |
| left | -0.5147380 | -0.1992620 |
| length | -0.0398182 | 0.3318182 |
| right | -0.6376027 | -0.3083973 |
| top | -1.2874105 | -0.6425895 |
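
These intervals are numerically consistent with the large-sample form that does not pool the variances: (x̄1 − x̄2) ± sqrt(χ²) · sqrt(s1²/n1 + s2²/n2), where χ² is the upper 5% point of the chi-square distribution with p = 6 degrees of freedom (about 12.5916). That reading is an inference from the printed numbers rather than something stated in this section, so the sketch below should be taken as an illustration under that assumption.

```python
import math

# Upper 5% point of chi-square with 6 degrees of freedom.
CHI2_CRIT = 12.5916

def large_sample_interval(xbar1, xbar2, s1, s2, n1, n2, chi2_crit):
    """Simultaneous interval for a mean difference with unpooled variances."""
    diff = xbar1 - xbar2
    half_width = math.sqrt(chi2_crit) * math.sqrt(s1 / n1 + s2 / n2)
    return diff - half_width, diff + half_width

# "bottom" row: group means and group variances from the earlier tables.
lower, upper = large_sample_interval(8.305, 10.530, 0.4132071, 1.2813131,
                                     n1=100, n2=100, chi2_crit=CHI2_CRIT)
print(round(lower, 4), round(upper, 4))
```

Because the two groups have equal sizes, the standard error here equals the pooled one; the intervals differ from the pooled-variance version only through the critical value.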