wine <- read.csv(file = "winequality-red.csv", sep = ";", header = TRUE) # Load the dataset
knitr::kable(head(wine,100), caption = "Red Wine Dataset")
| fixed.acidity | volatile.acidity | citric.acid | residual.sugar | chlorides | free.sulfur.dioxide | total.sulfur.dioxide | density | pH | sulphates | alcohol | quality |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 7.4 | 0.700 | 0.00 | 1.90 | 0.076 | 11 | 34 | 0.9978 | 3.51 | 0.56 | 9.4 | 5 |
| 7.8 | 0.880 | 0.00 | 2.60 | 0.098 | 25 | 67 | 0.9968 | 3.20 | 0.68 | 9.8 | 5 |
| 7.8 | 0.760 | 0.04 | 2.30 | 0.092 | 15 | 54 | 0.9970 | 3.26 | 0.65 | 9.8 | 5 |
| 11.2 | 0.280 | 0.56 | 1.90 | 0.075 | 17 | 60 | 0.9980 | 3.16 | 0.58 | 9.8 | 6 |
| 7.4 | 0.700 | 0.00 | 1.90 | 0.076 | 11 | 34 | 0.9978 | 3.51 | 0.56 | 9.4 | 5 |
| 7.4 | 0.660 | 0.00 | 1.80 | 0.075 | 13 | 40 | 0.9978 | 3.51 | 0.56 | 9.4 | 5 |
| 7.9 | 0.600 | 0.06 | 1.60 | 0.069 | 15 | 59 | 0.9964 | 3.30 | 0.46 | 9.4 | 5 |
| 7.3 | 0.650 | 0.00 | 1.20 | 0.065 | 15 | 21 | 0.9946 | 3.39 | 0.47 | 10.0 | 7 |
| 7.8 | 0.580 | 0.02 | 2.00 | 0.073 | 9 | 18 | 0.9968 | 3.36 | 0.57 | 9.5 | 7 |
| 7.5 | 0.500 | 0.36 | 6.10 | 0.071 | 17 | 102 | 0.9978 | 3.35 | 0.80 | 10.5 | 5 |
| 6.7 | 0.580 | 0.08 | 1.80 | 0.097 | 15 | 65 | 0.9959 | 3.28 | 0.54 | 9.2 | 5 |
| 7.5 | 0.500 | 0.36 | 6.10 | 0.071 | 17 | 102 | 0.9978 | 3.35 | 0.80 | 10.5 | 5 |
| 5.6 | 0.615 | 0.00 | 1.60 | 0.089 | 16 | 59 | 0.9943 | 3.58 | 0.52 | 9.9 | 5 |
| 7.8 | 0.610 | 0.29 | 1.60 | 0.114 | 9 | 29 | 0.9974 | 3.26 | 1.56 | 9.1 | 5 |
| 8.9 | 0.620 | 0.18 | 3.80 | 0.176 | 52 | 145 | 0.9986 | 3.16 | 0.88 | 9.2 | 5 |
| 8.9 | 0.620 | 0.19 | 3.90 | 0.170 | 51 | 148 | 0.9986 | 3.17 | 0.93 | 9.2 | 5 |
| 8.5 | 0.280 | 0.56 | 1.80 | 0.092 | 35 | 103 | 0.9969 | 3.30 | 0.75 | 10.5 | 7 |
| 8.1 | 0.560 | 0.28 | 1.70 | 0.368 | 16 | 56 | 0.9968 | 3.11 | 1.28 | 9.3 | 5 |
| 7.4 | 0.590 | 0.08 | 4.40 | 0.086 | 6 | 29 | 0.9974 | 3.38 | 0.50 | 9.0 | 4 |
| 7.9 | 0.320 | 0.51 | 1.80 | 0.341 | 17 | 56 | 0.9969 | 3.04 | 1.08 | 9.2 | 6 |
| 8.9 | 0.220 | 0.48 | 1.80 | 0.077 | 29 | 60 | 0.9968 | 3.39 | 0.53 | 9.4 | 6 |
| 7.6 | 0.390 | 0.31 | 2.30 | 0.082 | 23 | 71 | 0.9982 | 3.52 | 0.65 | 9.7 | 5 |
| 7.9 | 0.430 | 0.21 | 1.60 | 0.106 | 10 | 37 | 0.9966 | 3.17 | 0.91 | 9.5 | 5 |
| 8.5 | 0.490 | 0.11 | 2.30 | 0.084 | 9 | 67 | 0.9968 | 3.17 | 0.53 | 9.4 | 5 |
| 6.9 | 0.400 | 0.14 | 2.40 | 0.085 | 21 | 40 | 0.9968 | 3.43 | 0.63 | 9.7 | 6 |
| 6.3 | 0.390 | 0.16 | 1.40 | 0.080 | 11 | 23 | 0.9955 | 3.34 | 0.56 | 9.3 | 5 |
| 7.6 | 0.410 | 0.24 | 1.80 | 0.080 | 4 | 11 | 0.9962 | 3.28 | 0.59 | 9.5 | 5 |
| 7.9 | 0.430 | 0.21 | 1.60 | 0.106 | 10 | 37 | 0.9966 | 3.17 | 0.91 | 9.5 | 5 |
| 7.1 | 0.710 | 0.00 | 1.90 | 0.080 | 14 | 35 | 0.9972 | 3.47 | 0.55 | 9.4 | 5 |
| 7.8 | 0.645 | 0.00 | 2.00 | 0.082 | 8 | 16 | 0.9964 | 3.38 | 0.59 | 9.8 | 6 |
| 6.7 | 0.675 | 0.07 | 2.40 | 0.089 | 17 | 82 | 0.9958 | 3.35 | 0.54 | 10.1 | 5 |
| 6.9 | 0.685 | 0.00 | 2.50 | 0.105 | 22 | 37 | 0.9966 | 3.46 | 0.57 | 10.6 | 6 |
| 8.3 | 0.655 | 0.12 | 2.30 | 0.083 | 15 | 113 | 0.9966 | 3.17 | 0.66 | 9.8 | 5 |
| 6.9 | 0.605 | 0.12 | 10.70 | 0.073 | 40 | 83 | 0.9993 | 3.45 | 0.52 | 9.4 | 6 |
| 5.2 | 0.320 | 0.25 | 1.80 | 0.103 | 13 | 50 | 0.9957 | 3.38 | 0.55 | 9.2 | 5 |
| 7.8 | 0.645 | 0.00 | 5.50 | 0.086 | 5 | 18 | 0.9986 | 3.40 | 0.55 | 9.6 | 6 |
| 7.8 | 0.600 | 0.14 | 2.40 | 0.086 | 3 | 15 | 0.9975 | 3.42 | 0.60 | 10.8 | 6 |
| 8.1 | 0.380 | 0.28 | 2.10 | 0.066 | 13 | 30 | 0.9968 | 3.23 | 0.73 | 9.7 | 7 |
| 5.7 | 1.130 | 0.09 | 1.50 | 0.172 | 7 | 19 | 0.9940 | 3.50 | 0.48 | 9.8 | 4 |
| 7.3 | 0.450 | 0.36 | 5.90 | 0.074 | 12 | 87 | 0.9978 | 3.33 | 0.83 | 10.5 | 5 |
| 7.3 | 0.450 | 0.36 | 5.90 | 0.074 | 12 | 87 | 0.9978 | 3.33 | 0.83 | 10.5 | 5 |
| 8.8 | 0.610 | 0.30 | 2.80 | 0.088 | 17 | 46 | 0.9976 | 3.26 | 0.51 | 9.3 | 4 |
| 7.5 | 0.490 | 0.20 | 2.60 | 0.332 | 8 | 14 | 0.9968 | 3.21 | 0.90 | 10.5 | 6 |
| 8.1 | 0.660 | 0.22 | 2.20 | 0.069 | 9 | 23 | 0.9968 | 3.30 | 1.20 | 10.3 | 5 |
| 6.8 | 0.670 | 0.02 | 1.80 | 0.050 | 5 | 11 | 0.9962 | 3.48 | 0.52 | 9.5 | 5 |
| 4.6 | 0.520 | 0.15 | 2.10 | 0.054 | 8 | 65 | 0.9934 | 3.90 | 0.56 | 13.1 | 4 |
| 7.7 | 0.935 | 0.43 | 2.20 | 0.114 | 22 | 114 | 0.9970 | 3.25 | 0.73 | 9.2 | 5 |
| 8.7 | 0.290 | 0.52 | 1.60 | 0.113 | 12 | 37 | 0.9969 | 3.25 | 0.58 | 9.5 | 5 |
| 6.4 | 0.400 | 0.23 | 1.60 | 0.066 | 5 | 12 | 0.9958 | 3.34 | 0.56 | 9.2 | 5 |
| 5.6 | 0.310 | 0.37 | 1.40 | 0.074 | 12 | 96 | 0.9954 | 3.32 | 0.58 | 9.2 | 5 |
| 8.8 | 0.660 | 0.26 | 1.70 | 0.074 | 4 | 23 | 0.9971 | 3.15 | 0.74 | 9.2 | 5 |
| 6.6 | 0.520 | 0.04 | 2.20 | 0.069 | 8 | 15 | 0.9956 | 3.40 | 0.63 | 9.4 | 6 |
| 6.6 | 0.500 | 0.04 | 2.10 | 0.068 | 6 | 14 | 0.9955 | 3.39 | 0.64 | 9.4 | 6 |
| 8.6 | 0.380 | 0.36 | 3.00 | 0.081 | 30 | 119 | 0.9970 | 3.20 | 0.56 | 9.4 | 5 |
| 7.6 | 0.510 | 0.15 | 2.80 | 0.110 | 33 | 73 | 0.9955 | 3.17 | 0.63 | 10.2 | 6 |
| 7.7 | 0.620 | 0.04 | 3.80 | 0.084 | 25 | 45 | 0.9978 | 3.34 | 0.53 | 9.5 | 5 |
| 10.2 | 0.420 | 0.57 | 3.40 | 0.070 | 4 | 10 | 0.9971 | 3.04 | 0.63 | 9.6 | 5 |
| 7.5 | 0.630 | 0.12 | 5.10 | 0.111 | 50 | 110 | 0.9983 | 3.26 | 0.77 | 9.4 | 5 |
| 7.8 | 0.590 | 0.18 | 2.30 | 0.076 | 17 | 54 | 0.9975 | 3.43 | 0.59 | 10.0 | 5 |
| 7.3 | 0.390 | 0.31 | 2.40 | 0.074 | 9 | 46 | 0.9962 | 3.41 | 0.54 | 9.4 | 6 |
| 8.8 | 0.400 | 0.40 | 2.20 | 0.079 | 19 | 52 | 0.9980 | 3.44 | 0.64 | 9.2 | 5 |
| 7.7 | 0.690 | 0.49 | 1.80 | 0.115 | 20 | 112 | 0.9968 | 3.21 | 0.71 | 9.3 | 5 |
| 7.5 | 0.520 | 0.16 | 1.90 | 0.085 | 12 | 35 | 0.9968 | 3.38 | 0.62 | 9.5 | 7 |
| 7.0 | 0.735 | 0.05 | 2.00 | 0.081 | 13 | 54 | 0.9966 | 3.39 | 0.57 | 9.8 | 5 |
| 7.2 | 0.725 | 0.05 | 4.65 | 0.086 | 4 | 11 | 0.9962 | 3.41 | 0.39 | 10.9 | 5 |
| 7.2 | 0.725 | 0.05 | 4.65 | 0.086 | 4 | 11 | 0.9962 | 3.41 | 0.39 | 10.9 | 5 |
| 7.5 | 0.520 | 0.11 | 1.50 | 0.079 | 11 | 39 | 0.9968 | 3.42 | 0.58 | 9.6 | 5 |
| 6.6 | 0.705 | 0.07 | 1.60 | 0.076 | 6 | 15 | 0.9962 | 3.44 | 0.58 | 10.7 | 5 |
| 9.3 | 0.320 | 0.57 | 2.00 | 0.074 | 27 | 65 | 0.9969 | 3.28 | 0.79 | 10.7 | 5 |
| 8.0 | 0.705 | 0.05 | 1.90 | 0.074 | 8 | 19 | 0.9962 | 3.34 | 0.95 | 10.5 | 6 |
| 7.7 | 0.630 | 0.08 | 1.90 | 0.076 | 15 | 27 | 0.9967 | 3.32 | 0.54 | 9.5 | 6 |
| 7.7 | 0.670 | 0.23 | 2.10 | 0.088 | 17 | 96 | 0.9962 | 3.32 | 0.48 | 9.5 | 5 |
| 7.7 | 0.690 | 0.22 | 1.90 | 0.084 | 18 | 94 | 0.9961 | 3.31 | 0.48 | 9.5 | 5 |
| 8.3 | 0.675 | 0.26 | 2.10 | 0.084 | 11 | 43 | 0.9976 | 3.31 | 0.53 | 9.2 | 4 |
| 9.7 | 0.320 | 0.54 | 2.50 | 0.094 | 28 | 83 | 0.9984 | 3.28 | 0.82 | 9.6 | 5 |
| 8.8 | 0.410 | 0.64 | 2.20 | 0.093 | 9 | 42 | 0.9986 | 3.54 | 0.66 | 10.5 | 5 |
| 8.8 | 0.410 | 0.64 | 2.20 | 0.093 | 9 | 42 | 0.9986 | 3.54 | 0.66 | 10.5 | 5 |
| 6.8 | 0.785 | 0.00 | 2.40 | 0.104 | 14 | 30 | 0.9966 | 3.52 | 0.55 | 10.7 | 6 |
| 6.7 | 0.750 | 0.12 | 2.00 | 0.086 | 12 | 80 | 0.9958 | 3.38 | 0.52 | 10.1 | 5 |
| 8.3 | 0.625 | 0.20 | 1.50 | 0.080 | 27 | 119 | 0.9972 | 3.16 | 1.12 | 9.1 | 4 |
| 6.2 | 0.450 | 0.20 | 1.60 | 0.069 | 3 | 15 | 0.9958 | 3.41 | 0.56 | 9.2 | 5 |
| 7.8 | 0.430 | 0.70 | 1.90 | 0.464 | 22 | 67 | 0.9974 | 3.13 | 1.28 | 9.4 | 5 |
| 7.4 | 0.500 | 0.47 | 2.00 | 0.086 | 21 | 73 | 0.9970 | 3.36 | 0.57 | 9.1 | 5 |
| 7.3 | 0.670 | 0.26 | 1.80 | 0.401 | 16 | 51 | 0.9969 | 3.16 | 1.14 | 9.4 | 5 |
| 6.3 | 0.300 | 0.48 | 1.80 | 0.069 | 18 | 61 | 0.9959 | 3.44 | 0.78 | 10.3 | 6 |
| 6.9 | 0.550 | 0.15 | 2.20 | 0.076 | 19 | 40 | 0.9961 | 3.41 | 0.59 | 10.1 | 5 |
| 8.6 | 0.490 | 0.28 | 1.90 | 0.110 | 20 | 136 | 0.9972 | 2.93 | 1.95 | 9.9 | 6 |
| 7.7 | 0.490 | 0.26 | 1.90 | 0.062 | 9 | 31 | 0.9966 | 3.39 | 0.64 | 9.6 | 5 |
| 9.3 | 0.390 | 0.44 | 2.10 | 0.107 | 34 | 125 | 0.9978 | 3.14 | 1.22 | 9.5 | 5 |
| 7.0 | 0.620 | 0.08 | 1.80 | 0.076 | 8 | 24 | 0.9978 | 3.48 | 0.53 | 9.0 | 5 |
| 7.9 | 0.520 | 0.26 | 1.90 | 0.079 | 42 | 140 | 0.9964 | 3.23 | 0.54 | 9.5 | 5 |
| 8.6 | 0.490 | 0.28 | 1.90 | 0.110 | 20 | 136 | 0.9972 | 2.93 | 1.95 | 9.9 | 6 |
| 8.6 | 0.490 | 0.29 | 2.00 | 0.110 | 19 | 133 | 0.9972 | 2.93 | 1.98 | 9.8 | 5 |
| 7.7 | 0.490 | 0.26 | 1.90 | 0.062 | 9 | 31 | 0.9966 | 3.39 | 0.64 | 9.6 | 5 |
| 5.0 | 1.020 | 0.04 | 1.40 | 0.045 | 41 | 85 | 0.9938 | 3.75 | 0.48 | 10.5 | 4 |
| 4.7 | 0.600 | 0.17 | 2.30 | 0.058 | 17 | 106 | 0.9932 | 3.85 | 0.60 | 12.9 | 6 |
| 6.8 | 0.775 | 0.00 | 3.00 | 0.102 | 8 | 23 | 0.9965 | 3.45 | 0.56 | 10.7 | 5 |
| 7.0 | 0.500 | 0.25 | 2.00 | 0.070 | 3 | 22 | 0.9963 | 3.25 | 0.63 | 9.2 | 5 |
| 7.6 | 0.900 | 0.06 | 2.50 | 0.079 | 5 | 10 | 0.9967 | 3.39 | 0.56 | 9.8 | 5 |
| 8.1 | 0.545 | 0.18 | 1.90 | 0.080 | 13 | 35 | 0.9972 | 3.30 | 0.59 | 9.0 | 6 |
Pre-requisites to answering the questions – provide summary statistics of “residual.sugar” and use the median to divide the data into two groups, A and B
# Summary Statistics
summary(wine$residual.sugar)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.900 1.900 2.200 2.539 2.600 15.500
# Use the median (2.2) to divide the data into two groups A and B
rs.group <- ifelse(wine$residual.sugar < 2.2, "A", "B")
split_df <- data.frame(wine$residual.sugar, rs.group, wine$density)
# show the data after division (top 100 rows)
print(head(split_df, 100))
## wine.residual.sugar rs.group wine.density
## 1 1.90 A 0.9978
## 2 2.60 B 0.9968
## 3 2.30 B 0.9970
## 4 1.90 A 0.9980
## 5 1.90 A 0.9978
## 6 1.80 A 0.9978
## 7 1.60 A 0.9964
## 8 1.20 A 0.9946
## 9 2.00 A 0.9968
## 10 6.10 B 0.9978
## 11 1.80 A 0.9959
## 12 6.10 B 0.9978
## 13 1.60 A 0.9943
## 14 1.60 A 0.9974
## 15 3.80 B 0.9986
## 16 3.90 B 0.9986
## 17 1.80 A 0.9969
## 18 1.70 A 0.9968
## 19 4.40 B 0.9974
## 20 1.80 A 0.9969
## 21 1.80 A 0.9968
## 22 2.30 B 0.9982
## 23 1.60 A 0.9966
## 24 2.30 B 0.9968
## 25 2.40 B 0.9968
## 26 1.40 A 0.9955
## 27 1.80 A 0.9962
## 28 1.60 A 0.9966
## 29 1.90 A 0.9972
## 30 2.00 A 0.9964
## 31 2.40 B 0.9958
## 32 2.50 B 0.9966
## 33 2.30 B 0.9966
## 34 10.70 B 0.9993
## 35 1.80 A 0.9957
## 36 5.50 B 0.9986
## 37 2.40 B 0.9975
## 38 2.10 A 0.9968
## 39 1.50 A 0.9940
## 40 5.90 B 0.9978
## 41 5.90 B 0.9978
## 42 2.80 B 0.9976
## 43 2.60 B 0.9968
## 44 2.20 B 0.9968
## 45 1.80 A 0.9962
## 46 2.10 A 0.9934
## 47 2.20 B 0.9970
## 48 1.60 A 0.9969
## 49 1.60 A 0.9958
## 50 1.40 A 0.9954
## 51 1.70 A 0.9971
## 52 2.20 B 0.9956
## 53 2.10 A 0.9955
## 54 3.00 B 0.9970
## 55 2.80 B 0.9955
## 56 3.80 B 0.9978
## 57 3.40 B 0.9971
## 58 5.10 B 0.9983
## 59 2.30 B 0.9975
## 60 2.40 B 0.9962
## 61 2.20 B 0.9980
## 62 1.80 A 0.9968
## 63 1.90 A 0.9968
## 64 2.00 A 0.9966
## 65 4.65 B 0.9962
## 66 4.65 B 0.9962
## 67 1.50 A 0.9968
## 68 1.60 A 0.9962
## 69 2.00 A 0.9969
## 70 1.90 A 0.9962
## 71 1.90 A 0.9967
## 72 2.10 A 0.9962
## 73 1.90 A 0.9961
## 74 2.10 A 0.9976
## 75 2.50 B 0.9984
## 76 2.20 B 0.9986
## 77 2.20 B 0.9986
## 78 2.40 B 0.9966
## 79 2.00 A 0.9958
## 80 1.50 A 0.9972
## 81 1.60 A 0.9958
## 82 1.90 A 0.9974
## 83 2.00 A 0.9970
## 84 1.80 A 0.9969
## 85 1.80 A 0.9959
## 86 2.20 B 0.9961
## 87 1.90 A 0.9972
## 88 1.90 A 0.9966
## 89 2.10 A 0.9978
## 90 1.80 A 0.9978
## 91 1.90 A 0.9964
## 92 1.90 A 0.9972
## 93 2.00 A 0.9972
## 94 1.90 A 0.9966
## 95 1.40 A 0.9938
## 96 2.30 B 0.9932
## 97 3.00 B 0.9965
## 98 2.00 A 0.9963
## 99 2.50 B 0.9967
## 100 1.90 A 0.9972
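The split above types in the median reported by `summary()`. A minimal sketch of the same split with the cutoff computed from the data follows; the vector `sugar` is a small hypothetical sample chosen to have median 2.2, not the wine data. With a strict `<`, observations exactly equal to the median fall in group B, as rows 44 and 47 above do.

```r
# Hypothetical sample chosen so that its median is 2.2 (not the wine data)
sugar <- c(1.9, 2.6, 2.3, 1.9, 2.2, 1.8, 6.1)

# Compute the cutoff instead of typing it in by hand
cutoff <- median(sugar)

# Strict "<" sends values equal to the median to group B
grp <- ifelse(sugar < cutoff, "A", "B")

print(data.frame(sugar, grp))
print(table(grp))
```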
a. State the null hypothesis
b. Use visualization tools to inspect the hypothesis. Do you think the hypothesis is right or not?
boxplot(wine$density ~ rs.group)
c. What test are you going to use?
d. What is the p-value?
t.test(wine$density ~ rs.group)
##
## Welch Two Sample t-test
##
## data: wine$density by rs.group
## t = -14.955, df = 1571.9, p-value < 2.2e-16
## alternative hypothesis: true difference in means between group A and group B is not equal to 0
## 95 percent confidence interval:
## -0.001479826 -0.001136653
## sample estimates:
## mean in group A mean in group B
## 0.9960537 0.9973619
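The non-integer degrees of freedom in the output (df = 1571.9) come from the Welch-Satterthwaite approximation, which `t.test()` uses by default because it does not assume equal variances. As a sketch of what is being computed, the statistic and degrees of freedom can be rebuilt by hand. The data below are simulated (`a` and `b` are hypothetical stand-ins for the two density groups), so the numbers will not match the output above.

```r
set.seed(1)  # simulated data, for illustration only
a <- rnorm(40, mean = 0.9960, sd = 0.0015)
b <- rnorm(60, mean = 0.9974, sd = 0.0020)

# Welch statistic: difference in means over the unpooled standard error
va <- var(a) / length(a)
vb <- var(b) / length(b)
t_manual <- (mean(a) - mean(b)) / sqrt(va + vb)

# Welch-Satterthwaite approximation to the degrees of freedom
df_manual <- (va + vb)^2 / (va^2 / (length(a) - 1) + vb^2 / (length(b) - 1))

res <- t.test(a, b)  # var.equal = FALSE is the default, i.e. Welch
print(c(t = t_manual, df = df_manual))
print(res$p.value)   # the p-value is stored in the returned object
```

Reading `res$p.value` directly is useful when the printout only shows a bound such as `< 2.2e-16`.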
e. What is your conclusion?
f. Does your conclusion imply that there is an association between “density” and “residual.sugar”?
Pre-requisites to answering the questions (provide summary statistics of “residual.sugar” and use 1st, 2nd, and 3rd quartiles to divide the data into four groups A, B, C, and D)
# Summary Statistics
summary(wine$residual.sugar)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.900 1.900 2.200 2.539 2.600 15.500
# split the groups up using the 1st, 2nd, and 3rd quartiles
rs.group2 <- character(nrow(wine)) # preallocate instead of growing from NULL
for (i in seq_along(wine$residual.sugar)) {
  if (wine$residual.sugar[i] <= 1.9) rs.group2[i] <- "A"
  else if (wine$residual.sugar[i] <= 2.2) rs.group2[i] <- "B"
  else if (wine$residual.sugar[i] <= 2.6) rs.group2[i] <- "C"
  else rs.group2[i] <- "D"
}
# preview counts in each group
table(rs.group2)
## rs.group2
## A B C D
## 464 419 361 355
# put relevant columns into dataframe and preview top 100 rows
head(data.frame(wine$residual.sugar, rs.group2, wine$density), 100)
## wine.residual.sugar rs.group2 wine.density
## 1 1.90 A 0.9978
## 2 2.60 C 0.9968
## 3 2.30 C 0.9970
## 4 1.90 A 0.9980
## 5 1.90 A 0.9978
## 6 1.80 A 0.9978
## 7 1.60 A 0.9964
## 8 1.20 A 0.9946
## 9 2.00 B 0.9968
## 10 6.10 D 0.9978
## 11 1.80 A 0.9959
## 12 6.10 D 0.9978
## 13 1.60 A 0.9943
## 14 1.60 A 0.9974
## 15 3.80 D 0.9986
## 16 3.90 D 0.9986
## 17 1.80 A 0.9969
## 18 1.70 A 0.9968
## 19 4.40 D 0.9974
## 20 1.80 A 0.9969
## 21 1.80 A 0.9968
## 22 2.30 C 0.9982
## 23 1.60 A 0.9966
## 24 2.30 C 0.9968
## 25 2.40 C 0.9968
## 26 1.40 A 0.9955
## 27 1.80 A 0.9962
## 28 1.60 A 0.9966
## 29 1.90 A 0.9972
## 30 2.00 B 0.9964
## 31 2.40 C 0.9958
## 32 2.50 C 0.9966
## 33 2.30 C 0.9966
## 34 10.70 D 0.9993
## 35 1.80 A 0.9957
## 36 5.50 D 0.9986
## 37 2.40 C 0.9975
## 38 2.10 B 0.9968
## 39 1.50 A 0.9940
## 40 5.90 D 0.9978
## 41 5.90 D 0.9978
## 42 2.80 D 0.9976
## 43 2.60 C 0.9968
## 44 2.20 B 0.9968
## 45 1.80 A 0.9962
## 46 2.10 B 0.9934
## 47 2.20 B 0.9970
## 48 1.60 A 0.9969
## 49 1.60 A 0.9958
## 50 1.40 A 0.9954
## 51 1.70 A 0.9971
## 52 2.20 B 0.9956
## 53 2.10 B 0.9955
## 54 3.00 D 0.9970
## 55 2.80 D 0.9955
## 56 3.80 D 0.9978
## 57 3.40 D 0.9971
## 58 5.10 D 0.9983
## 59 2.30 C 0.9975
## 60 2.40 C 0.9962
## 61 2.20 B 0.9980
## 62 1.80 A 0.9968
## 63 1.90 A 0.9968
## 64 2.00 B 0.9966
## 65 4.65 D 0.9962
## 66 4.65 D 0.9962
## 67 1.50 A 0.9968
## 68 1.60 A 0.9962
## 69 2.00 B 0.9969
## 70 1.90 A 0.9962
## 71 1.90 A 0.9967
## 72 2.10 B 0.9962
## 73 1.90 A 0.9961
## 74 2.10 B 0.9976
## 75 2.50 C 0.9984
## 76 2.20 B 0.9986
## 77 2.20 B 0.9986
## 78 2.40 C 0.9966
## 79 2.00 B 0.9958
## 80 1.50 A 0.9972
## 81 1.60 A 0.9958
## 82 1.90 A 0.9974
## 83 2.00 B 0.9970
## 84 1.80 A 0.9969
## 85 1.80 A 0.9959
## 86 2.20 B 0.9961
## 87 1.90 A 0.9972
## 88 1.90 A 0.9966
## 89 2.10 B 0.9978
## 90 1.80 A 0.9978
## 91 1.90 A 0.9964
## 92 1.90 A 0.9972
## 93 2.00 B 0.9972
## 94 1.90 A 0.9966
## 95 1.40 A 0.9938
## 96 2.30 C 0.9932
## 97 3.00 D 0.9965
## 98 2.00 B 0.9963
## 99 2.50 C 0.9967
## 100 1.90 A 0.9972
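The same four-way split can also be written without a loop. The sketch below uses `cut()` with the quartile values from the summary as break points; its intervals are closed on the right by default, which matches the `<=` comparisons used above. The vector `sugar` holds a few values taken from the preview for illustration, not the full column.

```r
# A few residual.sugar values from the preview above (illustration only)
sugar <- c(1.9, 2.6, 2.3, 2.0, 6.1, 2.2, 2.8)

# Right-closed intervals: (-Inf, 1.9], (1.9, 2.2], (2.2, 2.6], (2.6, Inf)
grp <- cut(sugar,
           breaks = c(-Inf, 1.9, 2.2, 2.6, Inf),
           labels = c("A", "B", "C", "D"))

print(data.frame(sugar, grp))
```

Unlike the loop, `cut()` returns a factor, which `table()`, `boxplot()`, and `aov()` all accept.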
a. State the null hypothesis
b. Use visualization tools to inspect the hypothesis. Do you think the hypothesis is right or not?
boxplot(wine$density ~ rs.group2)
c. What test are you going to use?
d. What is the p-value?
summary(aov(wine$density ~ rs.group2))
## Df Sum Sq Mean Sq F value Pr(>F)
## rs.group2 3 0.000996 0.0003321 112.8 <2e-16 ***
## Residuals 1595 0.004696 0.0000029
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
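The F value in the table is the ratio of the two mean squares, each being a sum of squares divided by its degrees of freedom. As a quick check, the arithmetic can be redone from the printed sums of squares. Because those inputs are already rounded in the printout, the result should agree with the reported 112.8 only approximately.

```r
# Values copied from the ANOVA table above (already rounded in the printout)
ss_between <- 0.000996; df_between <- 3
ss_within  <- 0.004696; df_within  <- 1595

ms_between <- ss_between / df_between
ms_within  <- ss_within / df_within
f_value    <- ms_between / ms_within

# Upper-tail probability of the F distribution gives the p-value
p_value <- pf(f_value, df_between, df_within, lower.tail = FALSE)

print(f_value)
print(p_value < 2e-16)
```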
e. What is your conclusion?
f. Does your conclusion imply that there is an association between “density” and “residual.sugar”? Compare your result here with that in Question 1. Do you think increasing the number of groups helps identify the association? Would you consider dividing the data into 10 groups so as to help the discovery of the association? Why?
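For the 10-group question, one way to form the groups would be to cut at the deciles. The sketch below uses simulated continuous data (`x` and `y` are hypothetical, not the wine columns). On a coarsely measured variable such as `residual.sugar`, several deciles may coincide, in which case `cut()` stops with a "'breaks' are not unique" error, so the break points are checked first.

```r
set.seed(42)                       # simulated data, for illustration only
x <- rnorm(200)                    # stand-in for the grouping variable
y <- 0.5 * x + rnorm(200)          # stand-in for the response

# Decile break points: 0%, 10%, ..., 100%
breaks <- quantile(x, probs = seq(0, 1, by = 0.1))
stopifnot(!anyDuplicated(breaks))  # cut() needs unique breaks

# include.lowest = TRUE keeps the minimum in the first group
grp10 <- cut(x, breaks = breaks, labels = LETTERS[1:10], include.lowest = TRUE)

print(table(grp10))
print(summary(aov(y ~ grp10)))
```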
Re-create the “excellent” variable with “Yes” and “No” values
wine$excellent <- ifelse(wine$quality >= 7, "Yes", "No")
Create the contingency table
# create it
ctable <- table(data.frame(wine$excellent, rs.group2))
# preview it
print(ctable)
## rs.group2
## wine.excellent A B C D
## No 411 367 308 296
## Yes 53 52 53 59
a. Use the Chi-square test to test if these two factors are correlated or not
chisq.test(ctable)
##
## Pearson's Chi-squared test
##
## data: ctable
## X-squared = 5.5, df = 3, p-value = 0.1386
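To see where the statistic comes from, it can be rebuilt from the contingency table: the expected count in each cell under independence is row total times column total divided by the grand total, and X-squared sums the squared, scaled deviations. The sketch below copies the counts from the table above. The names `obs`, `expected`, `x2`, and `p` are illustrative.

```r
# Observed counts copied from the contingency table above
obs <- matrix(c(411, 367, 308, 296,
                 53,  52,  53,  59),
              nrow = 2, byrow = TRUE,
              dimnames = list(excellent = c("No", "Yes"),
                              group = c("A", "B", "C", "D")))

# Expected counts under independence: row total * column total / n
expected <- outer(rowSums(obs), colSums(obs)) / sum(obs)

# Pearson statistic and its upper-tail chi-squared probability
x2 <- sum((obs - expected)^2 / expected)
df <- (nrow(obs) - 1) * (ncol(obs) - 1)
p  <- pchisq(x2, df = df, lower.tail = FALSE)

print(round(expected, 1))
print(c(X.squared = x2, df = df, p.value = p))
```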
b. Use the permutation test to do the same and compare the result to that in (a)
chisq.test(ctable, simulate.p.value = TRUE)
##
## Pearson's Chi-squared test with simulated p-value (based on 2000
## replicates)
##
## data: ctable
## X-squared = 5.5, df = NA, p-value = 0.1439
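With `simulate.p.value = TRUE`, `chisq.test()` obtains the p-value by Monte Carlo sampling of tables that share the observed margins. A hand-rolled label-shuffling version, sketched below, keeps the margins fixed in the same way and should give a similar p-value, though not an identical one, since both are random. The labels are rebuilt from the counts in the table above; `chisq_stat`, `n_perm`, and the seed are illustrative choices.

```r
set.seed(123)  # any seed; results vary slightly between runs

# Rebuild one label pair per wine from the counts in the table above
counts <- matrix(c(411, 367, 308, 296,
                    53,  52,  53,  59), nrow = 2, byrow = TRUE)
excellent <- rep(rep(c("No", "Yes"), times = 4), times = as.vector(counts))
group     <- rep(rep(c("A", "B", "C", "D"), each = 2), times = as.vector(counts))

# Pearson chi-squared statistic for two label vectors
chisq_stat <- function(r, g) {
  tab <- table(r, g)
  e <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  sum((tab - e)^2 / e)
}

observed <- chisq_stat(excellent, group)

# Shuffle one set of labels to break any association, then recompute
n_perm <- 2000
perm <- replicate(n_perm, chisq_stat(sample(excellent), group))

# Share of shuffled statistics at least as large as the observed one
p_perm <- (1 + sum(perm >= observed)) / (n_perm + 1)
print(c(observed = observed, p.value = p_perm))
```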
c. Can you conclude that “residual.sugar” is a significant factor contributing to the excellence of wine? Why?