Estimating Wine Quality

In this activity I will use the red wine dataset, winequality-red.csv. Activity 7 uses the white wine dataset, whitewines.csv.

Exploring and preparing the data

wine <- read.csv("winequality-red.csv")

Examine the wine data

str(wine)
## 'data.frame':    1599 obs. of  12 variables:
##  $ fixed.acidity       : num  7.4 7.8 7.8 11.2 7.4 7.4 7.9 7.3 7.8 7.5 ...
##  $ volatile.acidity    : num  0.7 0.88 0.76 0.28 0.7 0.66 0.6 0.65 0.58 0.5 ...
##  $ citric.acid         : num  0 0 0.04 0.56 0 0 0.06 0 0.02 0.36 ...
##  $ residual.sugar      : num  1.9 2.6 2.3 1.9 1.9 1.8 1.6 1.2 2 6.1 ...
##  $ chlorides           : num  0.076 0.098 0.092 0.075 0.076 0.075 0.069 0.065 0.073 0.071 ...
##  $ free.sulfur.dioxide : num  11 25 15 17 11 13 15 15 9 17 ...
##  $ total.sulfur.dioxide: num  34 67 54 60 34 40 59 21 18 102 ...
##  $ density             : num  0.998 0.997 0.997 0.998 0.998 ...
##  $ pH                  : num  3.51 3.2 3.26 3.16 3.51 3.51 3.3 3.39 3.36 3.35 ...
##  $ sulphates           : num  0.56 0.68 0.65 0.58 0.56 0.56 0.46 0.47 0.57 0.8 ...
##  $ alcohol             : num  9.4 9.8 9.8 9.8 9.4 9.4 9.4 10 9.5 10.5 ...
##  $ quality             : int  5 5 5 6 5 5 5 7 7 5 ...

The distribution of quality ratings

hist(wine$quality)

Summary statistics of the wine data

summary(wine)
##  fixed.acidity   volatile.acidity  citric.acid    residual.sugar  
##  Min.   : 4.60   Min.   :0.1200   Min.   :0.000   Min.   : 0.900  
##  1st Qu.: 7.10   1st Qu.:0.3900   1st Qu.:0.090   1st Qu.: 1.900  
##  Median : 7.90   Median :0.5200   Median :0.260   Median : 2.200  
##  Mean   : 8.32   Mean   :0.5278   Mean   :0.271   Mean   : 2.539  
##  3rd Qu.: 9.20   3rd Qu.:0.6400   3rd Qu.:0.420   3rd Qu.: 2.600  
##  Max.   :15.90   Max.   :1.5800   Max.   :1.000   Max.   :15.500  
##    chlorides       free.sulfur.dioxide total.sulfur.dioxide    density      
##  Min.   :0.01200   Min.   : 1.00       Min.   :  6.00       Min.   :0.9901  
##  1st Qu.:0.07000   1st Qu.: 7.00       1st Qu.: 22.00       1st Qu.:0.9956  
##  Median :0.07900   Median :14.00       Median : 38.00       Median :0.9968  
##  Mean   :0.08747   Mean   :15.87       Mean   : 46.47       Mean   :0.9967  
##  3rd Qu.:0.09000   3rd Qu.:21.00       3rd Qu.: 62.00       3rd Qu.:0.9978  
##  Max.   :0.61100   Max.   :72.00       Max.   :289.00       Max.   :1.0037  
##        pH          sulphates         alcohol         quality     
##  Min.   :2.740   Min.   :0.3300   Min.   : 8.40   Min.   :3.000  
##  1st Qu.:3.210   1st Qu.:0.5500   1st Qu.: 9.50   1st Qu.:5.000  
##  Median :3.310   Median :0.6200   Median :10.20   Median :6.000  
##  Mean   :3.311   Mean   :0.6581   Mean   :10.42   Mean   :5.636  
##  3rd Qu.:3.400   3rd Qu.:0.7300   3rd Qu.:11.10   3rd Qu.:6.000  
##  Max.   :4.010   Max.   :2.0000   Max.   :14.90   Max.   :8.000

Split the dataset: 80% for training and 20% for testing

wine_train <- wine[1:1279, ]
wine_test <- wine[1280:1599, ]
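Taking the first 1279 rows for training assumes the rows of the file are in no meaningful order. If that is uncertain, a random split is a common alternative. The sketch below uses a hypothetical helper, split_train_test (not part of the original activity), and a toy data frame standing in for the wine data; the seed value is arbitrary.

```r
# Hypothetical helper (an assumption, not from the activity):
# shuffle the row indices, then split into training and testing sets.
split_train_test <- function(df, train_frac = 0.8, seed = 123) {
  set.seed(seed)                                   # reproducible shuffle
  n <- nrow(df)
  train_idx <- sample(n, size = floor(train_frac * n))
  list(train = df[train_idx, , drop = FALSE],
       test  = df[-train_idx, , drop = FALSE])
}

# Toy data frame standing in for the wine data
toy <- data.frame(x = 1:10, y = (1:10)^2)
parts <- split_train_test(toy)
nrow(parts$train)   # 8 rows for training
nrow(parts$test)    # 2 rows for testing
```

Applied to the wine data, split_train_test(wine) would give 1279 training rows and 320 testing rows, the same sizes as above, but the results reported in the rest of this activity come from the sequential split.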

Training a model on the data

Regression tree using rpart

Build a tree model to predict quality using all other variables

library(rpart)
m.rpart <- rpart(quality ~ ., data = wine_train)

Get basic information about the tree

m.rpart
## n= 1279 
## 
## node), split, n, deviance, yval
##       * denotes terminal node
## 
##  1) root 1279 843.43390 5.663800  
##    2) alcohol< 10.55 804 332.70020 5.386816  
##      4) sulphates< 0.575 303  93.07591 5.171617 *
##      5) sulphates>=0.575 501 217.10580 5.516966  
##       10) volatile.acidity>=0.405 372 141.51610 5.403226  
##         20) total.sulfur.dioxide>=46.5 171  48.42105 5.210526 *
##         21) total.sulfur.dioxide< 46.5 201  81.34328 5.567164 *
##       11) volatile.acidity< 0.405 129  56.89922 5.844961 *
##    3) alcohol>=10.55 475 344.64420 6.132632  
##      6) sulphates< 0.625 183 132.92900 5.743169  
##       12) volatile.acidity>=1.015 8   4.87500 4.125000 *
##       13) volatile.acidity< 1.015 175 106.14860 5.817143  
##         26) volatile.acidity>=0.385 123  70.26016 5.642276 *
##         27) volatile.acidity< 0.385 52  23.23077 6.230769 *
##      7) sulphates>=0.625 292 166.56160 6.376712  
##       14) alcohol< 11.55 161  82.26087 6.130435  
##         28) total.sulfur.dioxide>=85.5 8   2.00000 5.000000 *
##         29) total.sulfur.dioxide< 85.5 153  69.50327 6.189542  
##           58) volatile.acidity>=0.395 83  28.89157 5.963855 *
##           59) volatile.acidity< 0.395 70  31.37143 6.457143 *
##       15) alcohol>=11.55 131  62.53435 6.679389 *
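To read the printed tree, start at the root and follow the splits until reaching a terminal node (marked with *); the yval of that node is the predicted quality. As a rough illustration, the sketch below hand-codes only the alcohol < 10.55 branch (nodes 4, 11, 20 and 21) with the values printed above. The function name is hypothetical; in practice predict() does this work.

```r
# Hand-coded copy of the alcohol < 10.55 branch of the printed tree.
# Illustrative only: predict(m.rpart, ...) is the real way to get predictions.
predict_low_alcohol <- function(sulphates, volatile.acidity, total.sulfur.dioxide) {
  if (sulphates < 0.575) return(5.171617)          # node 4
  if (volatile.acidity < 0.405) return(5.844961)   # node 11
  if (total.sulfur.dioxide >= 46.5) {
    5.210526                                       # node 20
  } else {
    5.567164                                       # node 21
  }
}

# First wine in the data: alcohol 9.4, sulphates 0.56 -> node 4
first_wine <- predict_low_alcohol(0.56, 0.70, 34)
# Second wine: alcohol 9.8, sulphates 0.68, volatile.acidity 0.88,
# total.sulfur.dioxide 67 -> node 20
second_wine <- predict_low_alcohol(0.68, 0.88, 67)
c(first_wine, second_wine)
```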

Get more detailed information about the tree

summary(m.rpart)
## Call:
## rpart(formula = quality ~ ., data = wine_train)
##   n= 1279 
## 
##            CP nsplit rel error    xerror       xstd
## 1  0.19692055      0 1.0000000 1.0022141 0.04019729
## 2  0.05353544      1 0.8030795 0.8197999 0.03930843
## 3  0.02669866      2 0.7495440 0.8100993 0.03826911
## 4  0.02597167      3 0.7228454 0.8127205 0.03914630
## 5  0.02580691      4 0.6968737 0.8033330 0.03910486
## 6  0.02215993      5 0.6710668 0.7966474 0.03903822
## 7  0.01500727      6 0.6489068 0.7619556 0.03628075
## 8  0.01393327      7 0.6338996 0.7477414 0.03621780
## 9  0.01275453      8 0.6199663 0.7470788 0.03623467
## 10 0.01095554      9 0.6072118 0.7406798 0.03558996
## 11 0.01000000     10 0.5962562 0.7379698 0.03546135
## 
## Variable importance
##              alcohol     volatile.acidity              density 
##                   33                   16                   14 
##            sulphates total.sulfur.dioxide            chlorides 
##                   12                    8                    6 
##        fixed.acidity          citric.acid  free.sulfur.dioxide 
##                    5                    4                    1 
##       residual.sugar                   pH 
##                    1                    1 
## 
## Node number 1: 1279 observations,    complexity param=0.1969205
##   mean=5.6638, MSE=0.659448 
##   left son=2 (804 obs) right son=3 (475 obs)
##   Primary splits:
##       alcohol          < 10.55    to the left,  improve=0.19692050, (0 missing)
##       volatile.acidity < 0.425    to the right, improve=0.11756210, (0 missing)
##       sulphates        < 0.645    to the left,  improve=0.11517210, (0 missing)
##       density          < 0.995565 to the right, improve=0.08420511, (0 missing)
##       citric.acid      < 0.305    to the left,  improve=0.07136804, (0 missing)
##   Surrogate splits:
##       density              < 0.995585 to the right, agree=0.774, adj=0.392, (0 split)
##       chlorides            < 0.0685   to the right, agree=0.689, adj=0.162, (0 split)
##       volatile.acidity     < 0.3675   to the right, agree=0.673, adj=0.120, (0 split)
##       total.sulfur.dioxide < 12.5     to the right, agree=0.660, adj=0.084, (0 split)
##       fixed.acidity        < 6.55     to the right, agree=0.655, adj=0.072, (0 split)
## 
## Node number 2: 804 observations,    complexity param=0.02669866
##   mean=5.386816, MSE=0.4138063 
##   left son=4 (303 obs) right son=5 (501 obs)
##   Primary splits:
##       sulphates            < 0.575    to the left,  improve=0.06768421, (0 missing)
##       volatile.acidity     < 0.325    to the right, improve=0.06764124, (0 missing)
##       alcohol              < 9.85     to the left,  improve=0.06585239, (0 missing)
##       total.sulfur.dioxide < 83.5     to the right, improve=0.03973483, (0 missing)
##       fixed.acidity        < 10.85    to the left,  improve=0.03522151, (0 missing)
##   Surrogate splits:
##       density          < 0.996285 to the left,  agree=0.670, adj=0.125, (0 split)
##       volatile.acidity < 0.7975   to the right, agree=0.649, adj=0.069, (0 split)
##       fixed.acidity    < 6.15     to the left,  agree=0.633, adj=0.026, (0 split)
##       citric.acid      < 0.105    to the left,  agree=0.632, adj=0.023, (0 split)
##       chlorides        < 0.055    to the left,  agree=0.629, adj=0.017, (0 split)
## 
## Node number 3: 475 observations,    complexity param=0.05353544
##   mean=6.132632, MSE=0.7255668 
##   left son=6 (183 obs) right son=7 (292 obs)
##   Primary splits:
##       sulphates        < 0.625    to the left,  improve=0.13101510, (0 missing)
##       volatile.acidity < 0.87     to the right, improve=0.12433980, (0 missing)
##       alcohol          < 11.55    to the left,  improve=0.12135200, (0 missing)
##       citric.acid      < 0.295    to the left,  improve=0.10370610, (0 missing)
##       pH               < 3.355    to the right, improve=0.05891088, (0 missing)
##   Surrogate splits:
##       citric.acid          < 0.205    to the left,  agree=0.716, adj=0.262, (0 split)
##       fixed.acidity        < 8.15     to the left,  agree=0.695, adj=0.208, (0 split)
##       volatile.acidity     < 0.665    to the right, agree=0.691, adj=0.197, (0 split)
##       total.sulfur.dioxide < 14.5     to the left,  agree=0.665, adj=0.131, (0 split)
##       density              < 0.99493  to the left,  agree=0.644, adj=0.077, (0 split)
## 
## Node number 4: 303 observations
##   mean=5.171617, MSE=0.3071812 
## 
## Node number 5: 501 observations,    complexity param=0.02215993
##   mean=5.516966, MSE=0.4333449 
##   left son=10 (372 obs) right son=11 (129 obs)
##   Primary splits:
##       volatile.acidity     < 0.405    to the right, improve=0.08608907, (0 missing)
##       total.sulfur.dioxide < 50.5     to the right, improve=0.07568808, (0 missing)
##       fixed.acidity        < 10.95    to the left,  improve=0.06485526, (0 missing)
##       alcohol              < 9.85     to the left,  improve=0.06078331, (0 missing)
##       free.sulfur.dioxide  < 14.5     to the right, improve=0.03644560, (0 missing)
##   Surrogate splits:
##       fixed.acidity       < 10.45    to the left,  agree=0.780, adj=0.147, (0 split)
##       citric.acid         < 0.365    to the left,  agree=0.754, adj=0.047, (0 split)
##       chlorides           < 0.0595   to the right, agree=0.754, adj=0.047, (0 split)
##       free.sulfur.dioxide < 2.5      to the right, agree=0.750, adj=0.031, (0 split)
## 
## Node number 6: 183 observations,    complexity param=0.02597167
##   mean=5.743169, MSE=0.7263878 
##   left son=12 (8 obs) right son=13 (175 obs)
##   Primary splits:
##       volatile.acidity < 1.015    to the right, improve=0.16479020, (0 missing)
##       alcohol          < 11.65    to the left,  improve=0.13019140, (0 missing)
##       citric.acid      < 0.255    to the left,  improve=0.12880670, (0 missing)
##       pH               < 3.435    to the right, improve=0.11508110, (0 missing)
##       density          < 0.99548  to the right, improve=0.07174394, (0 missing)
## 
## Node number 7: 292 observations,    complexity param=0.02580691
##   mean=6.376712, MSE=0.5704166 
##   left son=14 (161 obs) right son=15 (131 obs)
##   Primary splits:
##       alcohol              < 11.55    to the left,  improve=0.13068090, (0 missing)
##       total.sulfur.dioxide < 96       to the right, improve=0.07207060, (0 missing)
##       volatile.acidity     < 0.335    to the right, improve=0.05598551, (0 missing)
##       chlorides            < 0.0785   to the right, improve=0.05341253, (0 missing)
##       density              < 0.99985  to the right, improve=0.05290635, (0 missing)
##   Surrogate splits:
##       density              < 0.995315 to the right, agree=0.695, adj=0.321, (0 split)
##       fixed.acidity        < 5.75     to the right, agree=0.610, adj=0.130, (0 split)
##       chlorides            < 0.053    to the right, agree=0.606, adj=0.122, (0 split)
##       residual.sugar       < 4.25     to the left,  agree=0.596, adj=0.099, (0 split)
##       total.sulfur.dioxide < 21.5     to the right, agree=0.596, adj=0.099, (0 split)
## 
## Node number 10: 372 observations,    complexity param=0.01393327
##   mean=5.403226, MSE=0.3804197 
##   left son=20 (171 obs) right son=21 (201 obs)
##   Primary splits:
##       total.sulfur.dioxide < 46.5     to the right, improve=0.08304207, (0 missing)
##       alcohol              < 9.85     to the left,  improve=0.05705613, (0 missing)
##       free.sulfur.dioxide  < 26.5     to the right, improve=0.04378900, (0 missing)
##       fixed.acidity        < 11       to the left,  improve=0.03353691, (0 missing)
##       chlorides            < 0.0975   to the right, improve=0.02770083, (0 missing)
##   Surrogate splits:
##       free.sulfur.dioxide < 14.5     to the right, agree=0.801, adj=0.567, (0 split)
##       residual.sugar      < 2.55     to the right, agree=0.637, adj=0.211, (0 split)
##       chlorides           < 0.0975   to the right, agree=0.610, adj=0.152, (0 split)
##       pH                  < 3.235    to the left,  agree=0.610, adj=0.152, (0 split)
##       citric.acid         < 0.255    to the right, agree=0.602, adj=0.135, (0 split)
## 
## Node number 11: 129 observations
##   mean=5.844961, MSE=0.4410793 
## 
## Node number 12: 8 observations
##   mean=4.125, MSE=0.609375 
## 
## Node number 13: 175 observations,    complexity param=0.01500727
##   mean=5.817143, MSE=0.6065633 
##   left son=26 (123 obs) right son=27 (52 obs)
##   Primary splits:
##       volatile.acidity < 0.385    to the right, improve=0.11924460, (0 missing)
##       alcohol          < 11.65    to the left,  improve=0.10922390, (0 missing)
##       citric.acid      < 0.255    to the left,  improve=0.10301000, (0 missing)
##       pH               < 3.265    to the right, improve=0.08144755, (0 missing)
##       density          < 0.99548  to the right, improve=0.07341612, (0 missing)
##   Surrogate splits:
##       citric.acid          < 0.255    to the left,  agree=0.800, adj=0.327, (0 split)
##       pH                   < 3.275    to the right, agree=0.777, adj=0.250, (0 split)
##       density              < 0.99156  to the right, agree=0.737, adj=0.115, (0 split)
##       free.sulfur.dioxide  < 35       to the left,  agree=0.731, adj=0.096, (0 split)
##       total.sulfur.dioxide < 8.5      to the right, agree=0.731, adj=0.096, (0 split)
## 
## Node number 14: 161 observations,    complexity param=0.01275453
##   mean=6.130435, MSE=0.5109371 
##   left son=28 (8 obs) right son=29 (153 obs)
##   Primary splits:
##       total.sulfur.dioxide < 85.5     to the right, improve=0.13077420, (0 missing)
##       volatile.acidity     < 0.395    to the right, improve=0.11559190, (0 missing)
##       citric.acid          < 0.335    to the left,  improve=0.06879310, (0 missing)
##       density              < 1.0009   to the right, improve=0.06347986, (0 missing)
##       chlorides            < 0.0945   to the right, improve=0.05919751, (0 missing)
##   Surrogate splits:
##       volatile.acidity < 0.8525   to the right, agree=0.957, adj=0.125, (0 split)
## 
## Node number 15: 131 observations
##   mean=6.679389, MSE=0.4773615 
## 
## Node number 20: 171 observations
##   mean=5.210526, MSE=0.2831641 
## 
## Node number 21: 201 observations
##   mean=5.567164, MSE=0.404693 
## 
## Node number 26: 123 observations
##   mean=5.642276, MSE=0.5712208 
## 
## Node number 27: 52 observations
##   mean=6.230769, MSE=0.4467456 
## 
## Node number 28: 8 observations
##   mean=5, MSE=0.25 
## 
## Node number 29: 153 observations,    complexity param=0.01095554
##   mean=6.189542, MSE=0.4542697 
##   left son=58 (83 obs) right son=59 (70 obs)
##   Primary splits:
##       volatile.acidity     < 0.395    to the right, improve=0.13294730, (0 missing)
##       chlorides            < 0.0945   to the right, improve=0.05520964, (0 missing)
##       citric.acid          < 0.525    to the left,  improve=0.05486624, (0 missing)
##       total.sulfur.dioxide < 49.5     to the right, improve=0.03973369, (0 missing)
##       sulphates            < 0.885    to the left,  improve=0.03706797, (0 missing)
##   Surrogate splits:
##       citric.acid    < 0.315    to the left,  agree=0.706, adj=0.357, (0 split)
##       residual.sugar < 1.85     to the right, agree=0.647, adj=0.229, (0 split)
##       chlorides      < 0.0775   to the right, agree=0.641, adj=0.214, (0 split)
##       sulphates      < 0.805    to the left,  agree=0.641, adj=0.214, (0 split)
##       alcohol        < 11.05    to the right, agree=0.641, adj=0.214, (0 split)
## 
## Node number 58: 83 observations
##   mean=5.963855, MSE=0.3480912 
## 
## Node number 59: 70 observations
##   mean=6.457143, MSE=0.4481633

Installing packages

install.packages("rpart.plot")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.5'
## (as 'lib' is unspecified)

Use the rpart.plot package to create a visualization

library(rpart.plot)

A basic decision tree diagram

rpart.plot(m.rpart, digits = 3)

A few adjustments to the diagram

rpart.plot(m.rpart, digits = 4, fallen.leaves = TRUE, type = 3, extra = 101)

Evaluating model performance

# generate predictions for the testing dataset
p.rpart <- predict(m.rpart, wine_test)

Compare the distribution of predicted values vs. actual values

summary(p.rpart)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   4.125   5.172   5.567   5.632   5.964   6.679
summary(wine_test$quality)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   3.000   5.000   6.000   5.525   6.000   8.000

Compare the correlation

cor(p.rpart, wine_test$quality)
## [1] 0.4901703

Function to calculate the mean absolute error

MAE <- function(actual, predicted) {
  mean(abs(actual - predicted))  
}
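A small worked example may make the measure concrete. With made-up ratings (not from the wine data), the absolute errors are 0.5, 0 and 1, so the mean absolute error is 0.5. Because the absolute value is taken, swapping the two arguments gives the same result.

```r
# Same definition as above, repeated so the example runs on its own
MAE <- function(actual, predicted) {
  mean(abs(actual - predicted))
}

actual    <- c(5, 6, 7)      # made-up true ratings
predicted <- c(5.5, 6, 6)    # made-up predictions

abs(actual - predicted)      # 0.5 0.0 1.0
MAE(actual, predicted)       # 0.5
MAE(predicted, actual)       # 0.5, the order does not matter
```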

Mean absolute error between predicted and actual values

MAE(wine_test$quality, p.rpart)
## [1] 0.5332276

Baseline: mean absolute error when every wine is predicted to have the mean training quality (rounded to 5.66)

mean(wine_train$quality) 
## [1] 5.6638
MAE(5.66, wine_test$quality)
## [1] 0.652125

Improving model performance

install.packages("plyr")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.5'
## (as 'lib' is unspecified)
install.packages("Cubist")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.5'
## (as 'lib' is unspecified)

Train a Cubist Model Tree

library(Cubist)
## Loading required package: lattice
m.cubist <- cubist(x = wine_train[-12], y = wine_train$quality)

Display basic information about the model tree

m.cubist
## 
## Call:
## cubist.default(x = wine_train[-12], y = wine_train$quality)
## 
## Number of samples: 1279 
## Number of predictors: 11 
## 
## Number of committees: 1 
## Number of rules: 13

Display the model's rules and their linear models

summary(m.cubist)
## 
## Call:
## cubist.default(x = wine_train[-12], y = wine_train$quality)
## 
## 
## Cubist [Release 2.07 GPL Edition]  Thu Feb  5 20:27:12 2026
## ---------------------------------
## 
##     Target attribute `outcome'
## 
## Read 1279 cases (12 attributes) from undefined.data
## 
## Model:
## 
##   Rule 1: [18 cases, mean 5.2, range 3 to 7, est err 0.7]
## 
##     if
##  volatile.acidity > 0.31
##  chlorides > 0.092
##  total.sulfur.dioxide > 32
##  density > 0.99824
##  sulphates > 0.63
##  alcohol > 9.8
##     then
##  outcome = -454.5 - 26.3 chlorides + 464 density - 2.64 volatile.acidity
##            + 0.012 alcohol - 0.0003 total.sulfur.dioxide + 0.05 sulphates
## 
##   Rule 2: [541 cases, mean 5.3, range 3 to 8, est err 0.4]
## 
##     if
##  alcohol <= 9.8
##     then
##  outcome = 5
## 
##   Rule 3: [502 cases, mean 5.3, range 3 to 8, est err 0.4]
## 
##     if
##  total.sulfur.dioxide <= 119
##  alcohol <= 9.8
##     then
##  outcome = 5.2 - 1.31 volatile.acidity - 0.69 citric.acid
##            + 0.47 sulphates + 0.07 alcohol + 0.039 fixed.acidity
##            - 0.0011 total.sulfur.dioxide - 0.0032 free.sulfur.dioxide
##            - 0.4 chlorides - 0.09 pH
## 
##   Rule 4: [44 cases, mean 5.4, range 4 to 7, est err 0.5]
## 
##     if
##  volatile.acidity > 0.31
##  pH <= 3.17
##  sulphates <= 0.63
##  alcohol > 9.8
##     then
##  outcome = 5.7 + 0.327 alcohol + 1.44 sulphates - 1.15 pH
##            - 0.0032 total.sulfur.dioxide - 0.06 fixed.acidity
##            + 0.0101 free.sulfur.dioxide - 0.41 volatile.acidity
##            - 0.04 residual.sugar
## 
##   Rule 5: [145 cases, mean 5.6, range 4 to 7, est err 0.5]
## 
##     if
##  volatile.acidity > 0.31
##  total.sulfur.dioxide > 25
##  pH > 3.17
##  sulphates <= 0.63
##  alcohol > 9.8
##     then
##  outcome = 6.6 + 3.02 sulphates + 0.267 alcohol - 1.55 pH
##            - 0.0061 total.sulfur.dioxide + 0.0042 free.sulfur.dioxide
## 
##   Rule 6: [95 cases, mean 5.6, range 3 to 7, est err 0.7]
## 
##     if
##  total.sulfur.dioxide <= 17
##  sulphates <= 0.63
##  alcohol > 9.8
##     then
##  outcome = 23.4 + 0.46 alcohol - 2.95 pH - 0.174 fixed.acidity
##            - 0.95 volatile.acidity - 0.106 residual.sugar
##            + 0.18 sulphates - 11 density - 0.0003 total.sulfur.dioxide
##            + 0.0009 free.sulfur.dioxide
## 
##   Rule 7: [34 cases, mean 5.7, range 4 to 7, est err 0.5]
## 
##     if
##  volatile.acidity > 0.31
##  total.sulfur.dioxide > 17
##  total.sulfur.dioxide <= 25
##  pH > 3.17
##  sulphates <= 0.63
##  alcohol > 9.8
##     then
##  outcome = 20 - 0.2439 total.sulfur.dioxide + 0.1262 free.sulfur.dioxide
##            - 2.42 volatile.acidity - 2.68 pH
## 
##   Rule 8: [27 cases, mean 5.9, range 5 to 7, est err 0.6]
## 
##     if
##  chlorides > 0.092
##  total.sulfur.dioxide > 32
##  density <= 0.99824
##  sulphates > 0.63
##  alcohol > 9.8
##     then
##  outcome = 7 - 6.2 chlorides - 0.59 volatile.acidity
## 
##   Rule 9: [135 cases, mean 6.1, range 5 to 8, est err 0.5]
## 
##     if
##  volatile.acidity > 0.31
##  total.sulfur.dioxide <= 32
##  pH > 3.17
##  sulphates > 0.63
##     then
##  outcome = 67.7 + 0.291 alcohol + 0.164 residual.sugar - 65 density
##            - 0.69 volatile.acidity
## 
##   Rule 10: [150 cases, mean 6.1, range 5 to 8, est err 0.5]
## 
##     if
##  volatile.acidity > 0.31
##  chlorides <= 0.092
##  total.sulfur.dioxide > 32
##  sulphates > 0.63
##  alcohol > 9.8
##     then
##  outcome = 1.8 - 0.0144 total.sulfur.dioxide + 0.359 alcohol
##            + 1.2 sulphates + 0.072 residual.sugar
## 
##   Rule 11: [35 cases, mean 6.3, range 5 to 8, est err 0.5]
## 
##     if
##  volatile.acidity > 0.31
##  total.sulfur.dioxide <= 32
##  pH <= 3.17
##  sulphates > 0.63
##  alcohol > 9.8
##     then
##  outcome = 151.3 + 4.11 pH + 11.2 chlorides - 159 density
##            + 0.014 residual.sugar - 0.09 volatile.acidity + 0.013 alcohol
## 
##   Rule 12: [56 cases, mean 6.3, range 5 to 8, est err 0.6]
## 
##     if
##  volatile.acidity <= 0.31
##  sulphates <= 0.73
##  alcohol > 9.8
##     then
##  outcome = 327.8 + 7.64 volatile.acidity + 3.95 sulphates - 315 density
##            - 3.07 pH - 0.229 alcohol - 0.0052 total.sulfur.dioxide
##            + 0.13 residual.sugar
## 
##   Rule 13: [75 cases, mean 6.5, range 5 to 8, est err 0.4]
## 
##     if
##  volatile.acidity <= 0.31
##  sulphates > 0.73
##     then
##  outcome = 1.7 + 0.407 alcohol + 1.48 citric.acid + 0.22 volatile.acidity
##            + 0.13 sulphates - 0.1 pH
## 
## 
## Evaluation on training data (1279 cases):
## 
##     Average  |error|                0.4
##     Relative |error|               0.57
##     Correlation coefficient        0.66
## 
## 
##  Attribute usage:
##    Conds  Model
## 
##     89%    68%    alcohol
##     61%    56%    total.sulfur.dioxide
##     44%    58%    sulphates
##     37%    55%    volatile.acidity
##     21%    53%    pH
##     11%    31%    chlorides
##      2%    18%    density
##            44%    free.sulfur.dioxide
##            35%    fixed.acidity
##            31%    citric.acid
##            28%    residual.sugar
## 
## 
## Time: 0.0 secs
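Each rule pairs a set of conditions with a linear model. As a rough check on how to read them, the sketch below evaluates Rule 3 by hand for the first wine in the data (values taken from the str() output), using the rounded coefficients printed above. That wine also satisfies Rule 2 (alcohol <= 9.8), and Cubist is described as averaging the rules that cover a case, so the average is shown too. The actual output of predict() may differ slightly because the printed coefficients are rounded.

```r
# First wine in the data, values as printed by str(wine)
w <- list(fixed.acidity = 7.4, volatile.acidity = 0.7, citric.acid = 0,
          chlorides = 0.076, free.sulfur.dioxide = 11,
          total.sulfur.dioxide = 34, pH = 3.51, sulphates = 0.56,
          alcohol = 9.4)

# Rule 3 applies: total.sulfur.dioxide <= 119 and alcohol <= 9.8
rule3 <- 5.2 - 1.31 * w$volatile.acidity - 0.69 * w$citric.acid +
  0.47 * w$sulphates + 0.07 * w$alcohol + 0.039 * w$fixed.acidity -
  0.0011 * w$total.sulfur.dioxide - 0.0032 * w$free.sulfur.dioxide -
  0.4 * w$chlorides - 0.09 * w$pH

rule2 <- 5                    # Rule 2 also applies: alcohol <= 9.8

rule3                         # about 5.07
mean(c(rule2, rule3))         # about 5.04, the average of the two matching rules
```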

Generate predictions for the model

p.cubist <- predict(m.cubist, wine_test)

Summary statistics about the predictions

summary(p.cubist)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   4.174   5.095   5.607   5.621   6.021   7.400

Correlation between the predicted and true values

cor(p.cubist, wine_test$quality)
## [1] 0.5310026

Mean absolute error of predicted and true values

# (uses a custom function defined above)
MAE(wine_test$quality, p.cubist) 
## [1] 0.5037106
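To put the results side by side, the figures reported above can be collected in one table. The sketch below only re-uses numbers already printed in this activity; the improvement column (an added illustration) is the reduction in mean absolute error relative to the mean baseline.

```r
# Values copied from the output above
results <- data.frame(
  model = c("mean baseline", "rpart tree", "Cubist model tree"),
  mae   = c(0.652125, 0.5332276, 0.5037106),
  cor   = c(NA, 0.4901703, 0.5310026)
)

# Fractional reduction in MAE relative to always predicting the mean
results$improvement <- 1 - results$mae / results$mae[1]
results
```

The regression tree reduces the baseline error by roughly 18% and the Cubist model tree by roughly 23%, so on this test set the model tree is the better of the two.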