## 'data.frame': 1036 obs. of 8 variables:
## $ SEX : Factor w/ 3 levels "F","I","M": 2 2 2 2 2 2 2 2 2 2 ...
## $ LENGTH: num 5.57 3.67 10.08 4.09 6.93 ...
## $ DIAM : num 4.09 2.62 7.35 3.15 4.83 ...
## $ HEIGHT: num 1.26 0.84 2.205 0.945 1.785 ...
## $ WHOLE : num 11.5 3.5 79.38 4.69 21.19 ...
## $ SHUCK : num 4.31 1.19 44 2.25 9.88 ...
## $ RINGS : int 6 4 6 3 6 6 5 6 5 6 ...
## $ CLASS : Factor w/ 5 levels "A1","A2","A3",..: 1 1 1 1 1 1 1 1 1 1 ...
(1)(a) (1 point) Use summary() to obtain and present descriptive statistics from mydata. Use table() to present a frequency table using CLASS and RINGS. There should be 115 cells in the table you present.
## 'data.frame': 1036 obs. of 10 variables:
## $ SEX : Factor w/ 3 levels "F","I","M": 2 2 2 2 2 2 2 2 2 2 ...
## $ LENGTH: num 5.57 3.67 10.08 4.09 6.93 ...
## $ DIAM : num 4.09 2.62 7.35 3.15 4.83 ...
## $ HEIGHT: num 1.26 0.84 2.205 0.945 1.785 ...
## $ WHOLE : num 11.5 3.5 79.38 4.69 21.19 ...
## $ SHUCK : num 4.31 1.19 44 2.25 9.88 ...
## $ RINGS : int 6 4 6 3 6 6 5 6 5 6 ...
## $ CLASS : Factor w/ 5 levels "A1","A2","A3",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ VOLUME: num 28.7 8.1 163.4 12.2 59.7 ...
## $ RATIO : num 0.15 0.147 0.269 0.185 0.165 ...
## SEX LENGTH DIAM HEIGHT WHOLE
## F:326 Min. : 2.73 Min. : 1.995 Min. :0.525 Min. : 1.625
## I:329 1st Qu.: 9.45 1st Qu.: 7.350 1st Qu.:2.415 1st Qu.: 56.484
## M:381 Median :11.45 Median : 8.925 Median :2.940 Median :101.344
## Mean :11.08 Mean : 8.622 Mean :2.947 Mean :105.832
## 3rd Qu.:13.02 3rd Qu.:10.185 3rd Qu.:3.570 3rd Qu.:150.319
## Max. :16.80 Max. :13.230 Max. :4.935 Max. :315.750
## SHUCK RINGS CLASS VOLUME
## Min. : 0.5625 Min. : 3.000 A1:108 Min. : 3.612
## 1st Qu.: 23.3006 1st Qu.: 8.000 A2:236 1st Qu.:163.545
## Median : 42.5700 Median : 9.000 A3:329 Median :307.363
## Mean : 45.4396 Mean : 9.993 A4:188 Mean :326.804
## 3rd Qu.: 64.2897 3rd Qu.:11.000 A5:175 3rd Qu.:463.264
## Max. :157.0800 Max. :25.000 Max. :995.673
## RATIO
## Min. :0.06734
## 1st Qu.:0.12241
## Median :0.13914
## Mean :0.14205
## 3rd Qu.:0.15911
## Max. :0.31176
##
## 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
## A1 9 8 24 67 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## A2 0 0 0 0 91 145 0 0 0 0 0 0 0 0 0 0 0 0
## A3 0 0 0 0 0 0 182 147 0 0 0 0 0 0 0 0 0 0
## A4 0 0 0 0 0 0 0 0 125 63 0 0 0 0 0 0 0 0
## A5 0 0 0 0 0 0 0 0 0 0 48 35 27 15 13 8 8 6
##
## 21 22 23 24 25
## A1 0 0 0 0 0
## A2 0 0 0 0 0
## A3 0 0 0 0 0
## A4 0 0 0 0 0
## A5 4 1 7 2 1
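The code that produced the output above is not shown in this report. A minimal sketch of how the summary and the CLASS-by-RINGS table could be generated follows. The simulated data frame is a hypothetical stand-in for the real abalone file, and the class cut points are inferred from the table above (A1: 3-6 rings, A2: 7-8, A3: 9-10, A4: 11-12, A5: 13-25).

```r
# Hypothetical stand-in for the abalone data; values are simulated.
set.seed(1)
n <- 115
rings <- rep(3:25, length.out = n)  # every ring count from 3 to 25 appears
mydata <- data.frame(
  SEX    = factor(sample(c("F", "I", "M"), n, replace = TRUE)),
  LENGTH = runif(n, 3, 16),
  DIAM   = runif(n, 2, 13),
  HEIGHT = runif(n, 0.5, 5),
  SHUCK  = runif(n, 1, 150),
  RINGS  = rings,
  # Cut points inferred from the frequency table in the report
  CLASS  = cut(rings, breaks = c(0, 6, 8, 10, 12, 25),
               labels = c("A1", "A2", "A3", "A4", "A5"))
)

# Derived variables as described in the answer below
mydata$VOLUME <- mydata$LENGTH * mydata$DIAM * mydata$HEIGHT
mydata$RATIO  <- mydata$SHUCK / mydata$VOLUME

print(summary(mydata))

# Frequency table: 5 classes by 23 ring counts = 115 cells
class_by_rings <- table(mydata$CLASS, mydata$RINGS)
print(class_by_rings)
```

Because each class covers a distinct band of ring counts, every column of the table has exactly one non-zero cell, which matches the block-diagonal pattern in the output above.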
Question (1 point): Briefly discuss the variable types and distributional implications such as potential skewness and outliers.
Answer: There are 10 variables.

Sex (“SEX”) is a categorical variable that records the sex of each abalone. There are 3 levels: M for male, F for female, and I for infant. From the table above, the female (326) and infant (329) counts are nearly equal, while males (381) are about 16% more numerous than either, so the three levels are roughly, but not exactly, evenly distributed.

Length (“LENGTH”) is a numerical variable that records the longest shell length, in centimeters (cm). From the table above, length ranges from 2.73 to 16.80. The mean (11.08) is below the median (11.45), which suggests a mild left skew, with most values toward the larger end and a tail toward the smaller values. Diameter (“DIAM”) is a numerical variable that records the shell measurement perpendicular to the length, in centimeters (cm). From the table above, diameter ranges from 1.995 to 13.230 and the middle half of the data falls between about 7.4 and 10.2 cm. Height (“HEIGHT”) is a numerical variable that records the measurement perpendicular to both length and diameter, in centimeters (cm). From the table above, height ranges from 0.525 to 4.935 and the middle half of the data spans 2.4 to 3.6. The extremes lie well outside that range, so there might be outliers.

Whole weight (“WHOLE”) is a numerical variable that records the whole weight of the abalone, in grams (g). From the table above, whole weight ranges from 1.625 to 315.750 and the middle half of the data spans about 55 to 150. With a third quartile of 150.319 and a maximum of 315.750, I highly suspect there is at least one outlier. Shuck weight (“SHUCK”) is a numerical variable that records the shucked weight of the meat, in grams (g). From the table above, shuck weight ranges from 0.5625 to 157.0800. The mean (45.44) is above the median (42.57) and the maximum is far above the third quartile (64.29), so the data are concentrated at the smaller values with a long right tail (right skew).

Rings (“RINGS”) is a numerical (integer) variable that records the number of growth rings on the abalone’s shell. To determine the age of the abalone, add 1.5 to the number of rings. From the table above, ring counts range from 3 to 25 and the middle half of the data falls between 8 and 11, with a long tail of older abalones. Class (“CLASS”) is a categorical variable that records the age classification of the abalone and is determined by the number of growth rings. There are 5 levels: A1 is the youngest and A5 is the oldest. From the table above, A2 and A3 are the two most common classes, with A3 (329) having about three times as many abalones as A1 (108) and almost twice as many as A5 (175).

Volume (“VOLUME”) is a numerical variable calculated by multiplying length, diameter, and height, in centimeters cubed (cm^3). From the table above, volume ranges from 3.612 to 995.673. With a third quartile of 463.264 and a maximum of 995.673, I highly suspect there is at least one outlier. Ratio (“RATIO”) is a numerical variable calculated by dividing shuck weight by volume, in grams per centimeter cubed (g/cm^3). From the table above, ratio ranges from 0.06734 to 0.31176 and the middle half of the data falls between about 0.12 and 0.16 g/cm^3.
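The suspicion of outliers can be checked against the usual 1.5 x IQR boxplot rule using only the quartiles printed in the summary above. This is a rough sketch: the helper names `lower_fence` and `upper_fence` are hypothetical, and `summary()` quartiles can differ slightly from the hinges that `boxplot()` uses, so the fences are approximate.

```r
# Fences for the 1.5 * IQR rule (helper names are illustrative only)
lower_fence <- function(q1, q3, k = 1.5) q1 - k * (q3 - q1)
upper_fence <- function(q1, q3, k = 1.5) q3 + k * (q3 - q1)

# Min, quartiles, and max copied from the summary() output above
check <- data.frame(
  variable = c("HEIGHT", "WHOLE", "VOLUME"),
  min      = c(0.525, 1.625, 3.612),
  q1       = c(2.415, 56.484, 163.545),
  q3       = c(3.570, 150.319, 463.264),
  max      = c(4.935, 315.750, 995.673)
)

check$lower <- lower_fence(check$q1, check$q3)
check$upper <- upper_fence(check$q1, check$q3)

# TRUE means at least one observation falls beyond that fence
check$low_outlier  <- check$min < check$lower
check$high_outlier <- check$max > check$upper

print(check)
```

Under these assumptions the WHOLE and VOLUME maxima exceed their upper fences, while for HEIGHT it is the minimum, not the maximum, that falls beyond a fence.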
(1)(b) (1 point) Generate a table of counts using SEX and CLASS. Add margins to this table (Hint: There should be 15 cells in this table plus the marginal totals. Apply table() first, then pass the table object to addmargins() (Kabacoff Section 7.2 pages 144-147)). Lastly, present a barplot of these data; ignoring the marginal totals.
##
## A1 A2 A3 A4 A5 Sum
## F 5 41 121 82 77 326
## I 91 133 65 21 19 329
## M 12 62 143 85 79 381
## Sum 108 236 329 188 175 1036
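As a sketch of the steps in (1)(b), the counts printed above can be re-entered by hand, passed through `addmargins()`, and plotted with `barplot()`. With the real data the table would come from `table(mydata$SEX, mydata$CLASS)`; the matrix is typed in here only so the example runs on its own.

```r
# Counts copied from the table above (rows: SEX, columns: CLASS)
counts <- as.table(matrix(
  c( 5,  41, 121, 82, 77,
    91, 133,  65, 21, 19,
    12,  62, 143, 85, 79),
  nrow = 3, byrow = TRUE,
  dimnames = list(SEX = c("F", "I", "M"),
                  CLASS = c("A1", "A2", "A3", "A4", "A5"))
))

# Add row and column totals
with_margins <- addmargins(counts)
print(with_margins)

# Grouped barplot of the 15 cells, without the marginal totals
barplot(counts, beside = TRUE, legend.text = rownames(counts),
        xlab = "CLASS", ylab = "Frequency",
        main = "Abalone counts by SEX and CLASS")

# Share of each sex within a class, useful for the discussion below
print(round(prop.table(counts, margin = 2), 2))
```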
Essay Question (2 points): Discuss the sex distribution of abalones. What stands out about the distribution of abalones by CLASS?
Answer: In the graph above, I can see that most infant abalones fell into classes A1, A2, and A3. This makes sense: since they are young, they should be low in age classification. I can see that most male abalones fell into class A3 and most female abalones also fell into class A3. This also makes sense: since they are adults, they should not be low in age classification. I also understand why there are fewer males and females in classes A4 and A5 than in A3, since it might be an indication of survival after developing reproductive organs. These trends are also evident in the classes. Class A1 is made up mostly of infants. Classes A4 and A5 are made up mostly of males and females. However, there are some things that stand out to me. First, I see that, from class A2 to A3, there is a sharp decrease in infants and a sharp increase in males and females. I wonder if this is an indication of the age at which abalones generally mature and develop their reproductive organs. I also see that there are still infants in classes A4 and A5. Age classification is determined by the number of growth rings. So it seems to me that either there are some ‘old’ infants that are about to develop into males or females, or there is a group of abalones that do not develop reproductive organs.
(1)(c) (1 point) Select a simple random sample of 200 observations from “mydata” and identify this sample as “work.” Use set.seed(123) prior to drawing this sample. Do not change the number 123. Note that sample() “takes a sample of the specified size from the elements of x.” We cannot sample directly from “mydata.” Instead, we need to sample from the integers, 1 to 1036, representing the rows of “mydata.” Then, select those rows from the data frame (Kabacoff Section 4.10.5 page 87).
Using “work”, construct a scatterplot matrix of variables 2-6 with plot(work[, 2:6]) (these are the continuous variables excluding VOLUME and RATIO). The sample “work” will not be used in the remainder of the assignment.
## SEX LENGTH DIAM HEIGHT WHOLE SHUCK RINGS CLASS VOLUME
## 415 F 11.0250 9.03000 2.83500 105.437500 54.603125 9 A3 282.240551
## 463 F 11.7600 9.24000 2.83500 100.312500 44.187500 9 A3 308.057904
## 179 I 8.1900 6.30000 2.10000 33.312500 13.812500 7 A2 108.353700
## 526 F 13.5450 10.71000 4.20000 199.856250 79.953750 12 A4 609.281190
## 195 I 9.9750 7.56000 2.62500 61.312500 25.625000 8 A2 197.953875
## 938 M 13.3350 10.50000 3.46500 162.307500 80.053750 12 A4 485.160638
## 665 M 5.6700 4.09500 1.68000 12.500000 4.812500 6 A1 39.007332
## 602 F 12.3900 9.34500 2.73000 141.562500 48.768750 13 A5 316.091822
## 709 M 10.9200 8.71500 3.67500 94.125000 31.927500 8 A2 349.741665
## 1011 M 11.1300 8.71500 2.73000 105.312500 34.375000 20 A5 264.804403
## 953 M 13.1250 10.29000 3.46500 142.353750 59.963750 11 A4 467.969906
## 348 F 11.5500 9.03000 3.15000 105.000000 49.375000 8 A2 328.533975
## 1017 M 12.9150 10.08000 3.99000 170.000000 66.312500 18 A5 519.430968
## 649 F 13.0200 10.92000 4.72500 147.937500 48.195000 23 A5 671.792940
## 989 M 13.4400 11.02500 3.88500 213.375000 94.421250 13 A5 575.663760
## 355 F 12.9150 10.08000 3.36000 156.562500 73.125000 8 A2 437.415552
## 840 M 12.9150 9.97500 3.57000 141.125000 59.338125 10 A3 459.912836
## 26 I 3.4650 2.52000 0.63000 2.687500 0.875000 3 A1 5.501034
## 519 F 13.7550 10.60500 4.09500 183.663750 88.580000 11 A4 597.344919
## 426 F 13.1250 10.08000 3.57000 169.062500 78.716875 10 A3 472.311000
## 1023 M 12.1800 9.66000 3.46500 153.437500 59.125000 16 A5 407.687742
## 766 M 12.3900 9.97500 3.46500 134.625000 56.244375 9 A3 428.240216
## 211 I 9.5550 7.03500 2.20500 50.687500 21.875000 8 A2 148.218832
## 932 M 13.1250 10.39500 3.88500 176.396250 87.036250 11 A4 530.047547
## 590 F 12.1800 9.55500 3.57000 113.437500 47.685000 13 A5 415.476243
## 593 F 13.0200 10.71000 4.30500 168.437500 60.881250 14 A5 600.307281
## 555 F 11.3400 8.82000 2.94000 102.637500 47.508750 11 A4 294.055272
## 871 M 9.6600 7.35000 2.52000 64.375000 27.720000 10 A3 178.922520
## 373 F 12.1800 9.13500 3.15000 104.250000 53.500000 8 A2 350.482545
## 844 M 12.6000 9.66000 3.15000 155.875000 66.020625 9 A3 383.405400
## 143 I 11.1300 8.82000 2.52000 74.562500 31.937500 7 A2 247.379832
## 544 F 9.2400 7.45500 2.41500 52.912500 20.406875 11 A4 166.355343
## 490 F 14.4900 12.18000 4.09500 207.250000 89.385000 10 A3 722.719179
## 621 F 13.4400 11.02500 4.51500 222.375000 57.821250 22 A5 669.014640
## 775 M 9.6600 7.66500 2.62500 58.375000 23.450625 10 A3 194.365238
## 905 M 14.2800 11.34000 3.99000 206.932500 87.771250 12 A4 646.121448
## 937 M 14.2800 11.44500 3.88500 213.180000 86.668750 11 A4 634.943421
## 842 M 12.3900 10.18500 3.25500 134.812500 56.120625 9 A3 410.755448
## 23 I 7.1400 5.35500 1.57500 22.500000 9.312500 6 A1 60.219653
## 923 M 12.8100 10.18500 3.67500 158.673750 66.640000 12 A4 479.476699
## 956 M 11.7600 9.66000 4.93500 107.036250 40.731250 12 A4 560.623896
## 309 I 11.8650 9.03000 2.83500 106.812500 37.737563 11 A4 303.744593
## 135 I 7.9800 5.98500 1.99500 30.375000 11.187500 7 A2 95.281799
## 821 M 12.2850 10.08000 3.88500 130.000000 53.707500 10 A3 481.090428
## 997 M 12.2850 8.50500 3.15000 157.062500 54.375000 15 A5 329.124364
## 224 I 10.7100 8.29500 2.20500 69.062500 29.250000 8 A2 195.890987
## 166 I 8.9250 6.51000 1.99500 43.812500 20.562500 8 A2 115.912991
## 217 I 9.5550 9.13500 2.31000 53.312500 24.375000 8 A2 201.628177
## 290 I 10.9200 8.61000 3.04500 80.750000 36.321250 9 A3 286.294554
## 581 F 13.2300 9.97500 3.67500 177.875000 52.976250 14 A5 484.986994
## 72 I 3.3600 2.31000 0.52500 2.250000 0.812500 3 A1 4.074840
## 588 F 15.4350 11.86500 4.72500 254.625000 110.925000 13 A5 865.318899
## 575 F 14.0700 11.55000 3.99000 177.288750 69.846875 12 A4 648.408915
## 141 I 5.5650 4.09500 1.15500 10.500000 4.562500 7 A2 26.320920
## 722 M 11.4450 9.45000 3.15000 97.562500 46.963125 8 A2 340.689037
## 865 M 11.1300 8.71500 2.52000 88.250000 41.518125 9 A3 244.434834
## 859 M 12.6000 10.08000 3.46500 114.562500 51.170625 9 A3 440.082720
## 153 I 10.1850 7.87500 2.94000 65.125000 25.000000 8 A2 235.808213
## 294 I 8.7150 6.82500 2.41500 41.062500 16.517531 12 A4 143.643898
## 277 I 11.7600 8.92500 2.83500 102.562500 45.508750 9 A3 297.555930
## 1035 M 12.3900 10.50000 4.20000 148.375000 51.500000 16 A5 546.399000
## 41 I 6.3000 4.62000 1.36500 15.437500 7.375000 5 A1 39.729690
## 431 F 13.6500 10.29000 3.25500 140.250000 68.806250 9 A3 457.192417
## 90 I 4.5150 3.57000 1.15500 7.562500 2.562500 6 A1 18.616925
## 316 I 11.0292 8.48400 3.07545 82.500000 31.072125 13 A5 287.775186
## 223 I 7.8750 5.98500 1.89000 32.125000 13.062500 7 A2 89.079244
## 528 F 12.6000 9.76500 3.36000 144.457500 59.997500 11 A4 413.411040
## 116 I 10.9200 8.61000 2.52000 74.375000 29.812500 8 A2 236.933424
## 606 F 11.3400 9.13500 3.67500 111.500000 41.055000 13 A5 380.696557
## 774 M 7.2450 5.67000 1.99500 24.625000 8.229375 9 A3 81.952904
## 747 M 13.5450 10.71000 4.09500 153.250000 72.826875 10 A3 594.049160
## 456 F 12.0750 9.97500 3.36000 111.875000 45.513125 9 A3 404.705700
## 598 F 9.7650 7.98000 2.83500 72.375000 26.520000 14 A5 220.916525
## 854 M 12.8100 9.87000 3.46500 131.500000 61.627500 9 A3 438.096235
## 39 I 5.7750 4.62000 1.68000 17.062500 7.062500 6 A1 44.823240
## 159 I 11.0250 8.40000 2.73000 80.687500 40.625000 8 A2 252.825300
## 752 M 11.7600 9.76500 3.04500 110.937500 41.394375 9 A3 349.676838
## 209 I 8.7150 6.82500 2.10000 41.687500 18.062500 7 A2 124.907738
## 374 F 10.6050 8.19000 2.41500 82.500000 38.062500 8 A2 209.754704
## 818 M 13.6500 10.92000 3.25500 171.000000 76.539375 9 A3 485.183790
## 34 I 7.6650 5.67000 1.78500 24.625000 10.187500 6 A1 77.577082
## 516 F 14.2800 10.81500 3.67500 206.358750 65.984375 12 A4 567.560385
## 13 I 4.5150 3.25500 1.26000 6.759375 2.625000 5 A1 18.517369
## 69 I 6.3000 4.83000 1.57500 15.875000 6.500000 6 A1 47.925675
## 895 M 10.1850 8.08500 2.62500 60.881250 24.500000 12 A4 216.157528
## 755 M 7.8750 5.88000 1.99500 27.812500 10.828125 10 A3 92.378475
## 409 F 10.7100 8.40000 2.52000 87.562500 43.808750 10 A3 226.709280
## 308 I 11.8650 9.24000 3.67500 109.187500 48.670875 11 A4 402.899805
## 278 I 10.5000 7.98000 2.83500 74.250000 36.076250 9 A3 237.544650
## 89 I 3.3600 2.31000 0.52500 2.437500 0.937500 4 A1 4.074840
## 928 M 15.5400 12.49500 3.99000 296.246250 140.813750 11 A4 774.747477
## 537 F 13.5450 10.60500 3.46500 168.045000 70.812500 11 A4 497.728972
## 291 I 11.5500 9.34500 3.04500 97.875000 35.797781 11 A4 328.661314
## 424 F 12.4950 9.76500 3.15000 134.562500 61.988750 9 A3 384.343076
## 880 M 13.0200 9.76500 3.99000 171.041250 69.886250 11 A4 507.289797
## 286 I 12.3900 9.34500 2.83500 96.437500 40.180000 9 A3 328.249199
## 908 M 13.4400 10.60500 3.78000 165.367500 72.275000 11 A4 538.767936
## 671 M 8.8200 7.14000 2.41500 52.687500 21.656250 8 A2 152.084142
## 121 I 8.1900 5.88000 1.89000 26.875000 10.562500 8 A2 91.017108
## 110 I 10.7100 8.08500 3.04500 95.812500 49.812500 8 A2 263.667616
## 158 I 7.2450 5.67000 2.31000 26.687500 10.250000 7 A2 94.892837
## 64 I 8.8200 6.61500 2.10000 42.937500 19.625000 6 A1 122.523030
## 483 F 13.0200 9.97500 3.36000 165.562500 86.670625 9 A3 436.378320
## 910 M 15.7500 11.55000 3.78000 241.357500 115.395000 11 A4 687.629250
## 477 F 13.6500 10.50000 3.99000 183.000000 80.989375 9 A3 571.866750
## 480 F 12.7050 9.55500 3.04500 122.187500 59.085000 9 A3 369.651657
## 711 M 8.8200 7.03500 2.41500 46.125000 21.161250 8 A2 149.847611
## 67 I 5.0400 3.67500 0.94500 9.656250 3.937500 5 A1 17.503290
## 663 M 10.5000 8.19000 2.83500 82.437500 39.312500 6 A1 243.795825
## 890 M 11.8650 9.24000 2.41500 117.108750 49.490000 11 A4 264.762729
## 847 M 11.4450 8.61000 2.94000 92.125000 43.188750 9 A3 289.711863
## 85 I 6.1950 4.83000 1.68000 20.312500 8.125000 5 A1 50.268708
## 165 I 10.5000 8.08500 2.52000 70.000000 35.437500 8 A2 213.929100
## 648 F 13.0200 9.87000 4.72500 139.375000 48.195000 15 A5 607.197465
## 51 I 7.6650 5.67000 1.78500 23.437500 10.125000 6 A1 77.577082
## 74 I 7.9800 5.88000 1.78500 34.187500 14.375000 6 A1 83.756484
## 178 I 10.1850 8.08500 2.73000 74.996875 31.312500 7 A2 224.803829
## 362 F 12.8100 9.87000 3.36000 134.312500 61.562500 8 A2 424.820592
## 236 I 11.0250 8.40000 3.04500 76.187500 30.380000 9 A3 281.997450
## 610 F 9.1350 7.35000 2.31000 48.000000 18.232500 13 A5 155.098597
## 330 F 9.7650 7.35000 2.62500 60.250000 28.750000 6 A1 188.403469
## 726 M 8.8200 7.24500 2.20500 53.750000 21.656250 7 A2 140.901485
## 127 I 9.0300 6.61500 2.41500 48.000000 23.562500 8 A2 144.256282
## 212 I 7.4550 5.46000 1.89000 24.812500 8.937500 7 A2 76.931127
## 686 M 12.0750 9.45000 3.46500 120.687500 61.627500 8 A2 395.386819
## 785 M 11.9700 9.45000 3.25500 149.375000 69.609375 10 A3 368.194208
## 814 M 13.1250 10.39500 3.25500 128.125000 56.925000 9 A3 444.093891
## 310 I 13.0200 9.87000 3.25500 120.750000 52.550438 11 A4 418.291587
## 744 M 11.7600 9.03000 3.04500 112.437500 57.420000 9 A3 323.357076
## 878 M 15.1200 11.76000 3.78000 202.278750 84.647500 11 A4 672.126336
## 243 I 8.5050 6.61500 2.20500 43.375000 19.661250 9 A3 124.054568
## 862 M 8.9250 6.82500 2.52000 46.937500 17.572500 9 A3 153.501075
## 926 M 9.0300 7.24500 2.41500 38.823750 11.331250 11 A4 157.994975
## 792 M 10.0800 7.87500 3.04500 97.125000 26.730000 9 A3 241.712100
## 113 I 12.0750 9.03000 2.73000 92.812500 36.187500 8 A2 297.671692
## 619 F 13.9650 11.23500 3.99000 187.000000 73.631250 17 A5 626.018132
## 1013 M 11.4450 8.61000 3.04500 109.125000 37.937500 18 A5 300.058715
## 151 I 9.1350 6.61500 2.31000 46.062500 20.187500 7 A2 139.588738
## 666 M 9.4500 7.03500 2.62500 58.259375 27.062500 6 A1 174.511969
## 614 F 14.1750 11.65500 4.30500 240.625000 90.907500 13 A5 711.227436
## 767 M 12.2850 9.97500 3.15000 133.125000 65.773125 10 A3 386.010056
## 160 I 8.9250 6.61500 1.99500 45.937500 23.312500 7 A2 117.782556
## 391 F 11.1300 8.61000 3.04500 106.572812 47.343750 9 A3 291.800219
## 155 I 11.3400 8.19000 2.62500 78.187500 31.562500 8 A2 243.795825
## 1024 M 14.1750 11.65500 4.20000 179.812500 68.125000 21 A5 693.880425
## 5 I 6.9300 4.83000 1.78500 21.187500 9.875000 6 A1 59.747341
## 326 I 11.0292 8.59005 2.96940 72.187500 23.765000 15 A5 281.325052
## 784 M 13.1250 9.55500 3.57000 135.250000 61.318125 9 A3 447.711469
## 280 I 12.8100 9.76500 3.15000 120.062500 55.063750 9 A3 394.032398
## 800 M 9.8700 7.56000 2.83500 62.625000 20.604375 10 A3 211.539762
## 789 M 13.5450 10.50000 4.09500 175.125000 76.291875 10 A3 582.401138
## 567 F 13.3350 10.18500 3.46500 161.861250 72.550625 11 A4 470.605818
## 843 M 7.9800 6.09000 2.52000 35.375000 14.540625 9 A3 122.467464
## 238 I 10.8150 8.29500 2.62500 72.562500 28.971250 9 A3 235.489866
## 764 M 12.0750 9.34500 3.36000 104.875000 49.561875 9 A3 379.145340
## 339 F 13.3350 10.50000 3.99000 161.250000 74.125000 8 A2 558.669825
## 962 M 15.7500 11.65500 4.51500 275.125000 131.360625 13 A5 828.801619
## 822 M 12.4950 9.87000 3.25500 150.187500 60.885000 10 A3 401.424991
## 137 I 8.2950 6.51000 1.78500 39.625000 19.125000 7 A2 96.390803
## 455 F 12.7050 9.55500 3.04500 107.750000 42.167500 9 A3 369.651657
## 738 M 11.0250 8.50500 2.83500 94.687500 40.899375 10 A3 265.831217
## 560 F 14.8050 11.76000 3.57000 185.831250 78.151250 11 A4 621.561276
## 589 F 10.7100 8.19000 2.20500 76.500000 23.842500 13 A5 193.411355
## 83 I 9.8700 7.35000 2.62500 53.937500 23.750000 6 A1 190.429312
## 696 M 9.5550 7.35000 2.31000 57.250000 24.750000 8 A2 162.229568
## 942 M 13.7550 10.92000 3.78000 190.230000 88.016250 11 A4 567.773388
## 196 I 9.4500 7.35000 2.73000 68.375000 30.625000 8 A2 189.618975
## 769 M 7.1400 5.56500 1.78500 24.205000 9.528750 10 A3 70.925368
## 680 M 11.2350 8.50500 2.94000 91.437500 41.580000 7 A2 280.927804
## 941 M 13.7550 10.71000 4.51500 227.396250 108.841250 11 A4 665.131966
## 968 M 13.0200 10.18500 3.25500 128.687500 52.593750 13 A5 431.641318
## 500 F 13.8600 10.50000 3.46500 151.788750 59.031875 12 A4 504.261450
## 889 M 13.6500 11.02500 3.99000 174.483750 73.193750 11 A4 600.460088
## 344 F 11.5500 9.24000 2.83500 105.437500 54.250000 8 A2 302.556870
## 909 M 8.0850 6.51000 2.10000 36.273750 13.046250 11 A4 110.530035
## 459 F 10.2900 7.66500 3.04500 79.312500 25.186875 10 A3 240.167828
## 20 I 8.4000 6.09000 2.10000 33.437500 15.062500 5 A1 107.427600
## 1032 M 12.7050 9.87000 2.41500 139.250000 49.062500 15 A5 302.837015
## 164 I 9.5550 7.35000 2.83500 67.062500 35.687500 7 A2 199.099924
## 52 I 9.8700 7.45500 2.52000 46.062500 15.750000 6 A1 185.423742
## 534 F 15.2250 11.97000 4.30500 206.486250 95.790000 11 A4 784.557191
## 177 I 10.5000 8.40000 2.52000 77.000000 32.625000 8 A2 222.264000
## 554 F 15.1200 12.07500 4.51500 267.750000 110.274375 12 A4 824.321610
## 827 M 12.4950 9.97500 2.94000 128.812500 60.946875 10 A3 366.434617
## 84 I 9.8700 7.98000 2.62500 60.562500 26.375000 6 A1 206.751825
## 523 F 14.1750 11.34000 4.41000 203.107500 88.322500 11 A4 708.883245
## 633 F 13.8600 10.92000 4.20000 209.500000 85.807500 17 A5 635.675040
## 392 F 13.2300 10.60500 4.09500 163.250000 65.145000 9 A3 574.545494
## 302 I 12.0750 9.45000 2.83500 103.062500 39.677344 11 A4 323.498306
## 597 F 14.8050 11.44500 3.78000 192.437500 77.456250 13 A5 640.495390
## 706 M 9.2400 6.82500 1.68000 51.625000 17.820000 8 A2 105.945840
## 901 M 13.9650 11.02500 3.78000 182.197500 82.258750 12 A4 581.984392
## 874 M 9.8700 7.87500 2.52000 70.953750 27.685000 12 A4 195.870150
## 430 F 12.0750 10.08000 3.46500 134.750000 64.513750 9 A3 421.745940
## 710 M 8.7150 6.61500 2.52000 50.187500 24.626250 8 A2 145.277307
## 761 M 14.4900 11.13000 3.99000 199.437500 83.902500 10 A3 643.482063
## 712 M 7.3500 5.56500 1.89000 28.187500 12.313125 7 A2 77.306197
## 428 F 12.7050 10.29000 3.15000 141.812500 66.470625 9 A3 411.813517
## 672 M 6.5100 4.72500 1.68000 16.812500 6.682500 7 A2 51.676380
## 250 I 13.2300 9.97500 3.04500 132.562500 63.271250 10 A3 401.846366
## RATIO
## 415 0.19346308
## 463 0.14343894
## 179 0.12747603
## 526 0.13122636
## 195 0.12944935
## 938 0.16500463
## 665 0.12337424
## 602 0.15428666
## 709 0.09128881
## 1011 0.12981280
## 953 0.12813591
## 348 0.15028887
## 1017 0.12766374
## 649 0.07174086
## 989 0.16402153
## 355 0.16717513
## 840 0.12902037
## 26 0.15906101
## 519 0.14828953
## 426 0.16666323
## 1023 0.14502521
## 766 0.13133838
## 211 0.14758583
## 932 0.16420461
## 590 0.11477191
## 593 0.10141681
## 555 0.16156401
## 871 0.15492740
## 373 0.15264669
## 844 0.17219534
## 143 0.12910309
## 544 0.12267039
## 490 0.12367874
## 621 0.08642748
## 775 0.12065236
## 905 0.13584327
## 937 0.13649838
## 842 0.13662783
## 23 0.15464221
## 923 0.13898486
## 956 0.07265343
## 309 0.12424110
## 135 0.11741487
## 821 0.11163702
## 997 0.16521111
## 224 0.14931774
## 166 0.17739599
## 217 0.12089084
## 290 0.12686672
## 581 0.10923231
## 72 0.19939433
## 588 0.12818973
## 575 0.10772041
## 141 0.17334121
## 722 0.13784748
## 865 0.16985355
## 859 0.11627502
## 153 0.10601836
## 294 0.11498944
## 277 0.15294184
## 1035 0.09425347
## 41 0.18562944
## 431 0.15049736
## 90 0.13764357
## 316 0.10797361
## 223 0.14663910
## 528 0.14512796
## 116 0.12582649
## 606 0.10784179
## 774 0.10041590
## 747 0.12259402
## 456 0.11245981
## 598 0.12004534
## 854 0.14067115
## 39 0.15756335
## 159 0.16068408
## 752 0.11837894
## 209 0.14460673
## 374 0.18146196
## 818 0.15775336
## 34 0.13132100
## 516 0.11625966
## 13 0.14175880
## 69 0.13562668
## 895 0.11334327
## 755 0.11721481
## 409 0.19323757
## 308 0.12080144
## 278 0.15187145
## 89 0.23007038
## 928 0.18175438
## 537 0.14227120
## 291 0.10891997
## 424 0.16128494
## 880 0.13776396
## 286 0.12240700
## 908 0.13414867
## 671 0.14239650
## 121 0.11604961
## 110 0.18892157
## 158 0.10801658
## 64 0.16017397
## 483 0.19861350
## 910 0.16781572
## 477 0.14162281
## 480 0.15983967
## 711 0.14121847
## 67 0.22495771
## 663 0.16125174
## 890 0.18692208
## 847 0.14907484
## 85 0.16163137
## 165 0.16565068
## 648 0.07937286
## 51 0.13051535
## 74 0.17162850
## 178 0.13928811
## 362 0.14491411
## 236 0.10773147
## 610 0.11755425
## 330 0.15259804
## 726 0.15369781
## 127 0.16333777
## 212 0.11617534
## 686 0.15586635
## 785 0.18905614
## 814 0.12818235
## 310 0.12563111
## 744 0.17757459
## 878 0.12593986
## 243 0.15848872
## 862 0.11447803
## 926 0.07171905
## 792 0.11058611
## 113 0.12156850
## 619 0.11761840
## 1013 0.12643359
## 151 0.14462127
## 666 0.15507532
## 614 0.12781776
## 767 0.17039226
## 160 0.19792829
## 391 0.16224714
## 155 0.12946284
## 1024 0.09817974
## 5 0.16527932
## 326 0.08447524
## 784 0.13695902
## 280 0.13974422
## 800 0.09740190
## 789 0.13099541
## 567 0.15416432
## 843 0.11873051
## 238 0.12302546
## 764 0.13071999
## 339 0.13268123
## 962 0.15849465
## 822 0.15167217
## 137 0.19841104
## 455 0.11407361
## 738 0.15385467
## 560 0.12573378
## 589 0.12327353
## 83 0.12471819
## 696 0.15256159
## 942 0.15502003
## 196 0.16150810
## 769 0.13434897
## 680 0.14800956
## 941 0.16363858
## 968 0.12184596
## 500 0.11706601
## 889 0.12189611
## 344 0.17930513
## 909 0.11803353
## 459 0.10487198
## 20 0.14021071
## 1032 0.16200959
## 164 0.17924417
## 52 0.08494058
## 534 0.12209435
## 177 0.14678490
## 554 0.13377591
## 827 0.16632401
## 84 0.12756840
## 523 0.12459386
## 633 0.13498642
## 392 0.11338528
## 302 0.12265085
## 597 0.12093178
## 706 0.16819915
## 901 0.14134185
## 874 0.14134364
## 430 0.15296828
## 710 0.16951202
## 761 0.13038825
## 712 0.15927733
## 428 0.16140953
## 672 0.12931440
## 250 0.15745134
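The sampling step in (1)(c) can be sketched as follows. The data frame here is a simulated, hypothetical stand-in with the same number of rows and the same first six column names as “mydata”, so the sampled rows will not contain the values printed above.

```r
# Hypothetical stand-in for mydata (1036 rows, simulated values)
set.seed(42)  # seed for the stand-in data only
n <- 1036
mydata <- data.frame(
  SEX    = factor(sample(c("F", "I", "M"), n, replace = TRUE)),
  LENGTH = runif(n, 3, 16),
  DIAM   = runif(n, 2, 13),
  HEIGHT = runif(n, 0.5, 5),
  WHOLE  = runif(n, 2, 300),
  SHUCK  = runif(n, 1, 150)
)

# Sample row indices, not the data frame itself
set.seed(123)
rows <- sample(seq_len(nrow(mydata)), 200)  # without replacement by default
work <- mydata[rows, ]

# Scatterplot matrix of the continuous variables in columns 2 to 6
plot(work[, 2:6])
```

Sampling from `seq_len(nrow(mydata))` keeps the original row names on “work”, which is why the printed sample above is labeled with row numbers such as 415 and 463.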
(2)(a) (1 point) Use “mydata” to plot WHOLE versus VOLUME. Color code data points by CLASS.
(2)(b) (2 points) Use “mydata” to plot SHUCK versus WHOLE with WHOLE on the horizontal axis. Color code data points by CLASS. As an aid to interpretation, determine the maximum value of the ratio of SHUCK to WHOLE. Add to the chart a straight line with zero intercept using this maximum value as the slope of the line. If you are using the ‘base R’ plot() function, you may use abline() to add this line to the plot. Use help(abline) in R to determine the coding for the slope and intercept arguments in the functions. If you are using ggplot2 for visualizations, geom_abline() should be used.
Essay Question (2 points): How does the variability in this plot differ from the plot in (a)? Compare the two displays. Keep in mind that SHUCK is a part of WHOLE. Consider the location of the different age classes.
Answer: When comparing the two displays, the data points show an almost identical trend: as the x-axis variable increases, the y-axis variable increases. However, each graph compares a different pair of variables. In the first graph, volume is being compared to whole weight (cm^3 vs. g). The older classes, A4 and A5, identified in darker and bolder colors, show volumes that start at about 200 cm^3. Most of the younger classes, with smaller whole weights, show smaller volumes. This demonstrates that the younger abalones are smaller in volume and, therefore, smaller in whole weight. The older abalones are larger in volume and, therefore, larger in whole weight. In the second graph, shuck weight is being compared to whole weight (g vs. g). The older classes, A4 and A5, identified in darker and bolder colors, show larger whole weights with smaller shuck weights relative to those whole weights, so their points sit further below the maximum-ratio line. Most of the younger classes, with smaller whole weights, show larger shuck weights relative to their whole weights. This demonstrates that, although the younger abalones generally weigh less than the older ones, you get proportionally more meat (shucked weight) out of them compared to the older abalones.
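The plot described in (2)(b) can be sketched in base R as follows. The data are simulated and hypothetical; only the column names and the class cut points (inferred from the CLASS-by-RINGS table earlier in the report) come from the source.

```r
# Hypothetical stand-in for mydata (simulated values)
set.seed(1)
n <- 115
rings <- rep(3:25, length.out = n)
mydata <- data.frame(
  WHOLE = runif(n, 2, 300),
  CLASS = cut(rings, breaks = c(0, 6, 8, 10, 12, 25),
              labels = c("A1", "A2", "A3", "A4", "A5"))
)
mydata$SHUCK <- mydata$WHOLE * runif(n, 0.25, 0.55)  # SHUCK is part of WHOLE

# Largest observed SHUCK-to-WHOLE ratio, used as the slope of the line
max_ratio <- max(mydata$SHUCK / mydata$WHOLE)

plot(mydata$WHOLE, mydata$SHUCK,
     col = as.integer(mydata$CLASS), pch = 16,
     xlab = "WHOLE (g)", ylab = "SHUCK (g)",
     main = "Shuck weight vs. whole weight")

# Straight line through the origin: a = intercept, b = slope
abline(a = 0, b = max_ratio, lty = 2)

legend("topleft", legend = levels(mydata$CLASS),
       col = seq_along(levels(mydata$CLASS)), pch = 16, title = "CLASS")
```

By construction no point can lie above the dashed line, so the vertical distance below it shows how far each abalone falls short of the best observed meat yield.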
(3)(a) (2 points) Use “mydata” to create a multi-figured plot with histograms, boxplots and Q-Q plots of RATIO differentiated by sex. This can be done using par(mfrow = c(3,3)) and base R or grid.arrange() and ggplot2. The first row would show the histograms, the second row the boxplots and the third row the Q-Q plots. Be sure these displays are legible.
Essay Question (2 points): Compare the displays. How do the distributions compare to normality? Take into account the criteria discussed in the sync sessions to evaluate non-normality.
Answer: When looking at the histograms, both female and infant seem to have a relatively normal distribution. Male, however, seems to have a slightly skewed result with a spike in the data around 0.135 g/cm^3. When looking at the boxplots, it’s evident that there are some outliers. Both male and infant have outliers in the upper ratio values, but female has outliers at both the upper and lower ends of the data. This is also confirmed in the qqplots. In each qqplot, there are several points at the far right that do not fall on the qqline. This is also evident at the far left of the female qqplot. The data in the qqplots for both male and infant also deviate from the qqline towards the far right, indicating that there might be some right skewness.
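The 3 x 3 display described in (3)(a) can be sketched in base R as follows. The data are simulated and hypothetical, so the shapes will not match the figures discussed above.

```r
# Hypothetical stand-in for mydata (simulated values)
set.seed(1)
n <- 300
mydata <- data.frame(
  SEX   = factor(sample(c("F", "I", "M"), n, replace = TRUE),
                 levels = c("F", "I", "M")),
  RATIO = rnorm(n, mean = 0.14, sd = 0.03)
)

sexes <- levels(mydata$SEX)
old_par <- par(mfrow = c(3, 3))  # 3 rows by 3 columns, filled row by row

# Row 1: histograms
for (s in sexes) {
  hist(mydata$RATIO[mydata$SEX == s],
       main = paste("RATIO,", s), xlab = "RATIO (g/cm^3)")
}

# Row 2: boxplots
for (s in sexes) {
  boxplot(mydata$RATIO[mydata$SEX == s],
          main = paste("RATIO,", s), ylab = "RATIO (g/cm^3)")
}

# Row 3: normal Q-Q plots with reference lines
for (s in sexes) {
  x <- mydata$RATIO[mydata$SEX == s]
  qqnorm(x, main = paste("Q-Q plot,", s))
  qqline(x)
}

par(old_par)  # restore the previous layout
```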
(3)(b) (2 points) Use the boxplots to identify RATIO outliers (mild and extreme both) for each sex. Present the abalones with these outlying RATIO values along with their associated variables in “mydata” (Hint: display the observations by passing a data frame to the kable() function).
| | SEX | LENGTH | DIAM | HEIGHT | WHOLE | SHUCK | RINGS | CLASS | VOLUME | RATIO |
|---|---|---|---|---|---|---|---|---|---|---|
| 350 | F | 7.980 | 6.720 | 2.415 | 80.93750 | 40.37500 | 7 | A2 | 129.505824 | 0.3117620 |
| 379 | F | 15.330 | 11.970 | 3.465 | 252.06250 | 134.89812 | 10 | A3 | 635.827846 | 0.2121614 |
| 420 | F | 11.550 | 7.980 | 3.465 | 150.62500 | 68.55375 | 10 | A3 | 319.365585 | 0.2146560 |
| 421 | F | 13.125 | 10.290 | 2.310 | 142.00000 | 66.47062 | 9 | A3 | 311.979938 | 0.2130606 |
| 458 | F | 11.445 | 8.085 | 3.150 | 139.81250 | 68.49062 | 9 | A3 | 291.478399 | 0.2349767 |
| 586 | F | 12.180 | 9.450 | 4.935 | 133.87500 | 38.25000 | 14 | A5 | 568.023435 | 0.0673388 |
| 3 | I | 10.080 | 7.350 | 2.205 | 79.37500 | 44.00000 | 6 | A1 | 163.364040 | 0.2693371 |
| 37 | I | 4.305 | 3.255 | 0.945 | 6.18750 | 2.93750 | 3 | A1 | 13.242072 | 0.2218308 |
| 42 | I | 2.835 | 2.730 | 0.840 | 3.62500 | 1.56250 | 4 | A1 | 6.501222 | 0.2403394 |
| 58 | I | 6.720 | 4.305 | 1.680 | 22.62500 | 11.00000 | 5 | A1 | 48.601728 | 0.2263294 |
| 67 | I | 5.040 | 3.675 | 0.945 | 9.65625 | 3.93750 | 5 | A1 | 17.503290 | 0.2249577 |
| 89 | I | 3.360 | 2.310 | 0.525 | 2.43750 | 0.93750 | 4 | A1 | 4.074840 | 0.2300704 |
| 105 | I | 6.930 | 4.725 | 1.575 | 23.37500 | 11.81250 | 7 | A2 | 51.572194 | 0.2290478 |
| 200 | I | 9.135 | 6.300 | 2.520 | 74.56250 | 32.37500 | 8 | A2 | 145.027260 | 0.2232339 |
| 746 | M | 13.440 | 10.815 | 1.680 | 130.25000 | 63.73125 | 10 | A3 | 244.194048 | 0.2609861 |
| 754 | M | 10.500 | 7.770 | 3.150 | 132.68750 | 61.13250 | 9 | A3 | 256.992750 | 0.2378764 |
| 803 | M | 10.710 | 8.610 | 3.255 | 160.31250 | 70.41375 | 9 | A3 | 300.153640 | 0.2345924 |
| 810 | M | 12.285 | 9.870 | 3.465 | 176.12500 | 99.00000 | 10 | A3 | 420.141472 | 0.2356349 |
| 852 | M | 11.550 | 8.820 | 3.360 | 167.56250 | 78.27187 | 10 | A3 | 342.286560 | 0.2286735 |
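
One way to pull the outlying rows for (3)(b) is `boxplot.stats()`, applied separately to each sex. This is a sketch on simulated, hypothetical data with one outlier planted in row 1. It uses the default `coef = 1.5`, which returns mild and extreme outliers together, and `coef = 3` would return only the extreme ones.

```r
# Hypothetical stand-in for mydata (simulated values)
set.seed(1)
n <- 300
mydata <- data.frame(
  SEX   = factor(sample(c("F", "I", "M"), n, replace = TRUE)),
  RATIO = rnorm(n, mean = 0.14, sd = 0.02)
)
mydata$RATIO[1] <- 0.50  # planted outlier so the example has a known hit

# For each sex, find the row indices whose RATIO is flagged by the boxplot rule
outlier_rows <- unlist(lapply(
  split(seq_len(nrow(mydata)), mydata$SEX),
  function(idx) {
    out_vals <- boxplot.stats(mydata$RATIO[idx])$out
    idx[mydata$RATIO[idx] %in% out_vals]
  }
))

outliers <- mydata[outlier_rows, ]

# kable() gives the formatted table if knitr is installed
if (requireNamespace("knitr", quietly = TRUE)) {
  print(knitr::kable(outliers))
} else {
  print(outliers)
}
```

Splitting by sex first matters because the fences are computed within each group, so a value can be an outlier for one sex and unremarkable for another.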
Essay Question (2 points): What are your observations regarding the results in (3)(b)?
Answer: From looking at the table above, numerical variables like length, diameter, height, whole weight, shuck weight, rings, and volume all seem to be spread across the range of each variable. However, there were two trends that I noticed. First, infants were the largest group of outliers (8 of 19, compared with 6 females and 5 males). Second, nearly all outliers were from younger age classifications: all but one fall in classes A1 to A3. In addition, every outlier except one (a female in class A5) has an unusually high ratio. This leads me to believe that, since these outliers are based on ratio, which was calculated as shuck weight divided by volume, younger abalones and infant abalones sometimes provide an unusually large amount of meat when shucked, relative to their volume. This could be attributed to their maturity.
(4)(a) (3 points) With “mydata,” display side-by-side boxplots for VOLUME and WHOLE, each differentiated by CLASS. There should be five boxes for VOLUME and five for WHOLE. Also, display side-by-side scatterplots: VOLUME and WHOLE versus RINGS. Present these four figures in one graphic: the boxplots in one row and the scatterplots in a second row. Base R or ggplot2 may be used.
Essay Question (5 points) How well do you think these variables would perform as predictors of age? Explain.
Answer: Overall, these variables would not perform well as predictors of age. In the box plots, the boxes themselves are small, indicating that most of the volume and whole weight data for each age class sits around a certain value. However, classes A3, A4, and A5 all have their boxes around the same volume (around 350-550 cm^3) and around the same weight (around 100-150 g). Each of those classes also has long ‘whiskers’ that sometimes extend across nearly the entire range of values, which indicates that a fair amount of data also lies far from the median. So if you found an abalone that was 450 cm^3 in volume and 125 g in whole weight, these box plots would tell you only that it could belong to A3, A4, or A5. The scatterplots show the same thing: on the right side of each graph the data is widely dispersed. Classes A1 and, to a lesser extent, A2 show better prospects. In the box plots, each has a small box, where the data is mostly gathered around a single value, and ‘whiskers’ that don’t reach as far as those of the other three classes. The data is more condensed, which means that each of these classes generally has a common whole weight and volume. The scatterplots agree: the left side of each graph comes almost to a point. This point, if extrapolated, would form a line, indicating a trend. So overall, these variables would not perform well as predictors of age. But if the abalone is small in volume and small in whole weight, these variables would perform well.
(5)(a) (2 points) Use aggregate() with “mydata” to compute the mean values of VOLUME, SHUCK and RATIO for each combination of SEX and CLASS. Then, using matrix(), create matrices of the mean values. Using the “dimnames” argument within matrix() or the rownames() and colnames() functions on the matrices, label the rows by SEX and columns by CLASS. Present the three matrices (Kabacoff Section 5.6.2, p. 110-111). The kable() function is useful for this purpose. You do not need to be concerned with the number of digits presented.
## [1] "Volume"
## Class A1 Class A2 Class A3 Class A4 Class A5
## Female 255.29938 276.8573 412.6079 498.0489 486.1525
## Infant 66.51618 160.3200 270.7406 316.4129 318.6930
## Male 103.72320 245.3857 358.1181 442.6155 440.2074
## [1] "Shuck Weight"
## Class A1 Class A2 Class A3 Class A4 Class A5
## Female 38.90000 42.50305 59.69121 69.05161 59.17076
## Infant 10.11332 23.41024 37.17969 39.85369 36.47047
## Male 16.39583 38.33855 52.96933 61.42726 55.02762
## [1] "Ratio"
## Class A1 Class A2 Class A3 Class A4 Class A5
## Female 0.1546644 0.1554605 0.1450304 0.1379609 0.1233605
## Infant 0.1569554 0.1475600 0.1372256 0.1244413 0.1167649
## Male 0.1512698 0.1564017 0.1462123 0.1364881 0.1262089
Volume

| | Class A1 | Class A2 | Class A3 | Class A4 | Class A5 |
|---|---|---|---|---|---|
| Female | 255.2993762 | 276.8573127 | 412.6079448 | 498.0488860 | 486.1525267 |
| Infant | 66.5161784 | 160.3199911 | 270.7406333 | 316.4129246 | 318.6929873 |
| Male | 103.7232000 | 245.3857109 | 358.1181100 | 442.6155218 | 440.2073625 |

Shuck Weight

| | Class A1 | Class A2 | Class A3 | Class A4 | Class A5 |
|---|---|---|---|---|---|
| Female | 38.9000000 | 42.5030488 | 59.6912087 | 69.0516082 | 59.1707630 |
| Infant | 10.1133242 | 23.4102444 | 37.1796923 | 39.8536875 | 36.4704743 |
| Male | 16.3958333 | 38.3385484 | 52.9693269 | 61.4272647 | 55.0276187 |

Ratio

| | Class A1 | Class A2 | Class A3 | Class A4 | Class A5 |
|---|---|---|---|---|---|
| Female | 0.1546644 | 0.1554605 | 0.1450304 | 0.1379609 | 0.1233605 |
| Infant | 0.1569554 | 0.1475600 | 0.1372256 | 0.1244413 | 0.1167649 |
| Male | 0.1512698 | 0.1564017 | 0.1462123 | 0.1364881 | 0.1262089 |

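Matrices like the ones above can be built from aggregate() output roughly as follows. This is a sketch on simulated data, not the code used for this report: `toy` is a hypothetical stand-in for mydata, and only VOLUME is shown (SHUCK and RATIO would follow the same pattern).

```r
# Hypothetical stand-in for mydata: simulated values, every SEX x CLASS cell filled.
set.seed(2)
toy <- expand.grid(SEX = c("F", "I", "M"), CLASS = paste0("A", 1:5), rep = 1:4)
toy$VOLUME <- runif(nrow(toy), 50, 500)

# Mean VOLUME for each SEX/CLASS combination.
# The result is ordered with SEX (the first grouping variable) varying fastest.
agg <- aggregate(VOLUME ~ SEX + CLASS, data = toy, FUN = mean)

# Fill a 3 x 5 matrix column by column, which matches that ordering.
vol_means <- matrix(agg$VOLUME, nrow = 3, byrow = FALSE,
                    dimnames = list(c("Female", "Infant", "Male"),
                                    paste("Class", paste0("A", 1:5))))
print(vol_means)
```

If the knitr package is available, `knitr::kable(vol_means)` would render the matrix as a table.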
(5)(b) (3 points) Present three graphs. Each graph should include three lines, one for each sex. The first should show mean RATIO versus CLASS; the second, mean VOLUME versus CLASS; the third, mean SHUCK versus CLASS. This may be done with the ‘base R’ interaction.plot() function or with ggplot2 using grid.arrange().
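As an illustration of the interaction.plot() route, the sketch below draws the mean RATIO lines directly from the Ratio matrix reported in (5)(a). The object names `ratio_means` and `long` are illustrative. The VOLUME and SHUCK graphs would be built the same way from their matrices.

```r
# Mean RATIO by sex (rows) and class (columns), copied from the Ratio matrix in (5)(a).
ratio_means <- matrix(
  c(0.1546644, 0.1554605, 0.1450304, 0.1379609, 0.1233605,
    0.1569554, 0.1475600, 0.1372256, 0.1244413, 0.1167649,
    0.1512698, 0.1564017, 0.1462123, 0.1364881, 0.1262089),
  nrow = 3, byrow = TRUE,
  dimnames = list(SEX = c("Female", "Infant", "Male"),
                  CLASS = paste0("A", 1:5)))

# Long format: one row per SEX/CLASS cell.
long <- as.data.frame(as.table(ratio_means), responseName = "RATIO")

# One line per sex. Each cell holds a single value, so fun = mean returns it unchanged.
interaction.plot(x.factor = long$CLASS, trace.factor = long$SEX,
                 response = long$RATIO, fun = mean, type = "b",
                 xlab = "Class", ylab = "Mean ratio", trace.label = "Sex")
```

Starting from the matrix of means keeps the example small; with the full data, the raw RATIO column could be passed to interaction.plot() directly and the function would compute the means.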
Essay Question (2 points): What questions do these plots raise? Consider aging and sex differences.
Answer: The first graph shows that as abalones mature, the average ratio of shuck weight to volume falls. This raises the question of why shuck weight per unit volume decreases as the abalone ages. It also shows infant abalones having the lowest ratio in almost all classes, which could be an indication of maturity in the animal. It raises the questions of when the abalone matures, whether it is worth harvesting older abalones if they yield less meat for their size, and what happens to the infants that have still not matured by A5. The second graph shows that as abalones mature, they grow in average volume. This holds for all sexes, at least through class A4. However, it looks like females start out with higher volumes than the rest and infants start out with lower volumes than the rest. This would probably indicate some sort of sexual dimorphism in the species. Females in some species are generally larger because of the burden of bearing offspring. Infants being smaller in volume when they are young makes sense; babies in most species are small. But it raises the question of why they stop growing. Certain species of goldfish can grow to fit their environment. Judging by this second graph, perhaps abalones stop growing around class A4. The third graph shows that as abalones mature, they grow in average shuck weight. This is understandable to me, as other livestock animals (pigs, chickens, and cows, for example) do the same as they grow. However, as in the first graph, the oldest abalones have less shuck weight on average than those in class A4. This raises the question of what happens to abalones in their lifecycle around classes A4 and A5. Overall, these three graphs show some interesting trends and raise some interesting questions.
5(c) (3 points) Present four boxplots using par(mfrow = c(2, 2)) or grid.arrange(). The first line should show VOLUME by RINGS for the infants and, separately, for the adults (factor levels “M” and “F” combined). The second line should show WHOLE by RINGS for the infants and, separately, for the adults. Since the data are sparse beyond 15 rings, limit the displays to less than 16 rings. One way to accomplish this is to generate a new data set using subset() to select RINGS < 16. Use ylim = c(0, 1100) for VOLUME and ylim = c(0, 400) for WHOLE. If you wish to reorder the displays for presentation purposes or use ggplot2 go ahead.
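The subset-then-plot approach described in the prompt might look like the sketch below. Here `toy` is a simulated, hypothetical stand-in for mydata, and the names `under16`, `infants`, and `adults` are illustrative.

```r
# Hypothetical stand-in for mydata (simulated values only).
set.seed(3)
n <- 300
toy <- data.frame(SEX = factor(sample(c("F", "I", "M"), n, replace = TRUE)),
                  RINGS = sample(3:25, n, replace = TRUE))
toy$VOLUME <- pmin(1000, pmax(5, 45 * toy$RINGS + rnorm(n, sd = 100)))
toy$WHOLE  <- pmin(390, pmax(1, 0.3 * toy$VOLUME + rnorm(n, sd = 15)))

under16 <- subset(toy, RINGS < 16)        # drop the sparse tail beyond 15 rings
infants <- subset(under16, SEX == "I")
adults  <- subset(under16, SEX != "I")    # "M" and "F" combined

# First row: VOLUME by RINGS; second row: WHOLE by RINGS.
old_par <- par(mfrow = c(2, 2))
boxplot(VOLUME ~ RINGS, data = infants, ylim = c(0, 1100), main = "Infant volume by rings")
boxplot(VOLUME ~ RINGS, data = adults,  ylim = c(0, 1100), main = "Adult volume by rings")
boxplot(WHOLE ~ RINGS,  data = infants, ylim = c(0, 400),  main = "Infant whole weight by rings")
boxplot(WHOLE ~ RINGS,  data = adults,  ylim = c(0, 400),  main = "Adult whole weight by rings")
par(old_par)  # restore the previous layout
```

Using the same ylim in each row keeps the infant and adult panels directly comparable.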
Essay Question (2 points): What do these displays suggest about abalone growth? Also, compare the infant and adult displays. What differences stand out?
Answer: These displays suggest that as an abalone adds growth rings, it also grows in whole weight (g) and in volume (cm^3). However, when comparing infants to adults, infants have much less variety in their whole weights and volumes. Infants stay below 200 g of whole weight their whole lives, while adults can exceed 200 g. Infants stay below 600 cm^3 in volume their whole lives, while adults can exceed 600 cm^3. Those differences stand out to me the most, but I am curious about the drop-off in each graph around ring 15. Abalones with 15 rings tend to have the same whole weight and volume as abalones with 8 or 9 rings. I wonder if that is an indication of abalone ‘retirement’.
Conclusions
Essay Question 1) (5 points) Based solely on these data, what are plausible statistical reasons that explain the failure of the original study? Consider to what extent physical measurements may be used for age prediction.
Answer: The original study wanted to predict the age of abalone from physical measurements to avoid the necessity of counting growth rings for aging. However, the study was not successful. The study organizers stated that more information and more variables of abalone data are needed in order to draw a more solid conclusion. I agree with the study organizers that this was the main problem of the study. Rings were the original indicator of age. Yet there were outliers, and there were confusing indications that infants and adults can share the same ring counts. I think that more information is needed to clarify the present variables. Statistically, with the outliers and sometimes skewed data, it’s easy to understand why solid conclusions couldn’t be made. Also, in an observational study there really is no tinkering with the environment or testing. All data comes from observation, which means that the study relies heavily on the quality of that information. I think that the variables chosen were great, but more could have been added. The study organizers mentioned looking into the food that is readily available to the abalones and tracking weather patterns and locations. Apart from those, I think that characterizing the water near abalones would also help indicate whether the water is oxygenated enough. Perhaps some tend to place themselves near undersea vents of some kind. Tracking any natural predators would be interesting as well, as would looking into whether abalones have any symbiotic relationship with neighboring species or are themselves invasive. Overall, I can see where this study went wrong. But I have hope for future abalone studies!
Essay Question 2) (3 points) Do not refer to the abalone data or study. If you were presented with an overall histogram and summary statistics from a sample of some population or phenomenon and no other information, what questions might you ask before accepting them as representative of the sampled population or phenomenon?
Answer: Before diving into the histogram and the summary statistics, I would want to know how the data was gathered and what methods were used to pick the sample. I would want to be confident in the origin of the data and the accuracy of the sample selection. Then I would take a look at the histogram. A histogram may be a good visualization to check the shape of the data for any skewness. I would take it into account but would want to see a scatterplot. As great as a histogram is, it does not show all trends in the data. A scatterplot would show me more. Then I would move on to the summary statistics. I would look carefully at the summary statistics to confirm any trends I saw in the histogram and scatterplot. Finally, as great as summary statistics are, they only provide a bird’s-eye view of a dataset in numerical form. If I were presented with just summary statistics, I would also want to see a box plot as a visual representation of them. Numbers may indicate some things, but a visualization might illuminate something they hide. I would want to see the data both summarized, like the statistics, and detailed, like the scatterplot, to make sure that any trends can be found.
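The point that summary statistics alone can hide the shape of the data can be illustrated with Anscombe's quartet, which ships with base R as the `anscombe` data set. The sketch below is an illustration only and has nothing to do with the abalone data. The name `summary_stats` is illustrative.

```r
# anscombe is in the datasets package: four x/y pairs, x1..x4 and y1..y4.
data(anscombe)

# Means, standard deviations, and correlations are nearly identical across the four pairs.
summary_stats <- sapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  c(mean_x = mean(x), mean_y = mean(y), sd_y = sd(y), cor_xy = cor(x, y))
})
print(round(summary_stats, 2))

# The scatterplots, however, look completely different.
old_par <- par(mfrow = c(2, 2))
for (i in 1:4) {
  plot(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]],
       xlab = paste0("x", i), ylab = paste0("y", i))
}
par(old_par)  # restore the previous layout
```

Four data sets with essentially the same summary table would be indistinguishable without the plots, which is why both views are worth asking for.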
Essay Question 3) (3 points) Do not refer to the abalone data or study. What do you see as difficulties analyzing data derived from observational studies? Can causality be determined? What might be learned from such studies?
Answer: In observational studies, the observer or researcher necessarily gives up control. They are a necessary method for studying many animals, since ethical concerns or logistical constraints sometimes rule out experiments. There is no action being caused to study a reaction. The action and reaction are already happening, possibly continuously, consecutively, or sequentially. The observer must make sense of the environment without intrusion. Nothing can be tested or adjusted. Because of this, it’s really hard for the observer to truly be sure that one thing is causing the other. As a result, most theories that come out of observational studies can only be tested and supported by conducting more observational studies. I think that from this, we can learn that data must have a clear indication of causality through trials and testing to generate substantial results.
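A small simulation can make the causality problem concrete. In the sketch below, two variables are associated only because both depend on a third, unobserved variable. Everything here is simulated under that assumption, and the variable names are hypothetical.

```r
set.seed(42)
n <- 5000
z <- rnorm(n)        # unobserved common cause
x <- z + rnorm(n)    # x depends on z, not on y
y <- z + rnorm(n)    # y depends on z, not on x

cor_xy   <- cor(x, y)                    # association between x and y
naive    <- coef(lm(y ~ x))["x"]         # slope of y on x when z is ignored
adjusted <- coef(lm(y ~ x + z))["x"]     # slope of y on x once z is included

print(c(correlation = cor_xy, naive = unname(naive), adjusted = unname(adjusted)))
```

An observer who records only x and y sees a clear association even though neither causes the other. Adjusting for z removes it, but that is only possible if z was measured in the first place.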