Wine Modeling

Wine Data Model Building

This is an analysis of the UCI Machine Learning Repository dataset Wine Quality. I combine the Red and White wine data and create 3 different models to predict wine quality.

Loading libraries and Preparing the data combining the data from red and white

library(caret)

## Loading required package: lattice

## Loading required package: ggplot2

library(nnet)
library(corrplot)

red.df <- read.csv(file.path("/Users/robertslattery/Desktop/Data Analytics Applications/winequality/winequality-red.csv"), header = TRUE, sep = ";", stringsAsFactors = FALSE)
white.df <- read.csv(file.path("/Users/robertslattery/Desktop/Data Analytics Applications/winequality/winequality-white.csv"), header = TRUE, sep = ";", stringsAsFactors = FALSE)
sample.df <- rbind(red.df, white.df)

So First off lets look at the correlations between the data.

M <- cor(sample.df)
corrplot(M, method = "square")

Density seems to be strongly correlated to Alcohol percent and Residual Sugar, Total Sulfur Dioxide is also strongly corelated to several others, we can leave these out or include them in the analysis.

Create a Holdout

Create a train and test set of data in 80/20 ratio and slightly cheat for our confusion matrix later on.

set.seed(42)
holdout <- sample(nrow(sample.df), 0.8*nrow(sample.df), replace = F)
sample.train <- sample.df[holdout,]
sample.test <- sample.df[-holdout,]
sample.test[61,12] <- 9

Linear Regression Model

Time to make our first model, this is a simple linear model using 8 variables that were highly ranked on a Boruta Feature Importance analysis. We can keep quality as a continous variable in this first analysis.

#Create a linear model using 8 variables 
#set.seed(42)
linearModel <- lm(quality ~ alcohol + volatile.acidity + free.sulfur.dioxide + sulphates + 
                  + residual.sugar + citric.acid + pH + chlorides, data = sample.train)
linearPred <- data.frame(predict(linearModel,newdata=sample.test))
summary(linearModel)

## 
## Call:
## lm(formula = quality ~ alcohol + volatile.acidity + free.sulfur.dioxide + 
##     sulphates + +residual.sugar + citric.acid + pH + chlorides, 
##     data = sample.train)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.4696 -0.4786 -0.0308  0.4610  3.1132 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)          1.740618   0.259513   6.707 2.19e-11 ***
## alcohol              0.337723   0.009949  33.944  < 2e-16 ***
## volatile.acidity    -1.384264   0.078874 -17.550  < 2e-16 ***
## free.sulfur.dioxide  0.001780   0.000665   2.677  0.00745 ** 
## sulphates            0.729250   0.078017   9.347  < 2e-16 ***
## residual.sugar       0.016515   0.002583   6.393 1.77e-10 ***
## citric.acid         -0.136683   0.082619  -1.654  0.09811 .  
## pH                   0.157837   0.071634   2.203  0.02761 *  
## chlorides            0.105145   0.376323   0.279  0.77995    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7407 on 5188 degrees of freedom
## Multiple R-squared:  0.2724, Adjusted R-squared:  0.2713 
## F-statistic: 242.8 on 8 and 5188 DF,  p-value: < 2.2e-16

We can see that citric.acid and chlorides are not important to the model but performance actually drops when they are removed so this model is ok. Our r squared value is 0.27, not terribly great.

But lets look a little deeper, lets Calculate the Root Mean Square Error and round the decimal predictions to get a confusion matrix.

#Create function to calculate RMSE to check accuracy of model. 
rmse <- function(error)
{
  sqrt(mean(error^2))
}
error <- linearPred - sample.test$quality
predRMSE <- rmse(error)
predRMSE

## [1] 0.7495618

#set.seed(42)
linearPred <- predict(linearModel,newdata=sample.test)
linearPred <- round(linearPred)
confusionMatrix(data = linearPred, reference = sample.test$quality)

## Warning in levels(reference) != levels(data): longer object length is not a
## multiple of shorter object length

## Warning in confusionMatrix.default(data = linearPred, reference =
## sample.test$quality): Levels are not in the same order for reference and
## data. Refactoring data to match.

## Confusion Matrix and Statistics
## 
##           Reference
## Prediction   3   4   5   6   7   8   9
##          3   0   0   0   0   0   0   0
##          4   0   1   1   1   0   0   0
##          5   4  26 184  84   6   3   1
##          6   1  22 233 438 176  21   2
##          7   0   0   0  31  48  16   1
##          8   0   0   0   0   0   0   0
##          9   0   0   0   0   0   0   0
## 
## Overall Statistics
##                                           
##                Accuracy : 0.5162          
##                  95% CI : (0.4886, 0.5436)
##     No Information Rate : 0.4262          
##     P-Value [Acc > NIR] : 4.17e-11        
##                                           
##                   Kappa : 0.217           
##  Mcnemar's Test P-Value : NA              
## 
## Statistics by Class:
## 
##                      Class: 3  Class: 4 Class: 5 Class: 6 Class: 7
## Sensitivity          0.000000 0.0204082   0.4402   0.7906  0.20870
## Specificity          1.000000 0.9984013   0.8594   0.3901  0.95514
## Pos Pred Value            NaN 0.3333333   0.5974   0.4905  0.50000
## Neg Pred Value       0.996154 0.9629915   0.7641   0.7150  0.84884
## Prevalence           0.003846 0.0376923   0.3215   0.4262  0.17692
## Detection Rate       0.000000 0.0007692   0.1415   0.3369  0.03692
## Detection Prevalence 0.000000 0.0023077   0.2369   0.6869  0.07385
## Balanced Accuracy    0.500000 0.5094047   0.6498   0.5903  0.58192
##                      Class: 8 Class: 9
## Sensitivity           0.00000 0.000000
## Specificity           1.00000 1.000000
## Pos Pred Value            NaN      NaN
## Neg Pred Value        0.96923 0.996923
## Prevalence            0.03077 0.003077
## Detection Rate        0.00000 0.000000
## Detection Prevalence  0.00000 0.000000
## Balanced Accuracy     0.50000 0.500000

Lower is Better so this is marginal on the RMSE result. The confusion Matrix shows 51% accuracy which is similar to the results from Logistic and Ordinal Regression and hardly better than a coinflip. However the range of quality prediction is probably quite good.

Multinomial Logistic Regression

Lets try a Multinomial Logistic Regression using nnet.

#Set Quality to Categorical 
sample.train$quality <- factor(sample.train$quality)
sample.test$quality <- factor(sample.test$quality)

#Build a Multinomial Logistic Model
formula <- "quality ~ alcohol + volatile.acidity + free.sulfur.dioxide + sulphates + 
                  + residual.sugar + citric.acid + pH + chlorides"
mlogit <- nnet::multinom(formula, data = sample.train )

## # weights:  70 (54 variable)
## initial  value 10112.895045 
## iter  10 value 7032.256608
## iter  20 value 6319.054043
## iter  30 value 5761.978364
## iter  40 value 5605.467592
## iter  50 value 5590.720966
## iter  60 value 5587.201580
## iter  70 value 5586.414679
## iter  80 value 5586.152717
## iter  90 value 5586.092859
## iter 100 value 5586.043591
## final  value 5586.043591 
## stopped after 100 iterations

output <- summary(mlogit)
print(output)

## Call:
## nnet::multinom(formula = formula, data = sample.train)
## 
## Coefficients:
##   (Intercept)    alcohol volatile.acidity free.sulfur.dioxide sulphates
## 4   14.350244 -0.4004995        -2.041950         -0.07103896  2.524403
## 5   15.857224 -0.6183478        -3.228687         -0.02927586  4.797024
## 6    6.122122  0.2375441        -7.015575         -0.02861481  6.486124
## 7   -2.402020  0.8613607        -9.451825         -0.02897915  8.033250
## 8   -8.719364  1.1468820        -9.257779         -0.01376682  7.353041
## 9  -24.720285  1.0134455        -9.991008         -0.04540124 -1.124997
##   residual.sugar citric.acid         pH  chlorides
## 4    -0.03850832   -1.847471 -1.4149963 -23.082376
## 5    -0.01792816   -1.303123 -1.3451585  -9.150199
## 6     0.02260898   -2.107629 -0.8404690  -7.385859
## 7     0.04868897   -1.905383 -0.6297689 -13.210430
## 8     0.08558745   -1.999104 -0.2693408 -17.410263
## 9     0.11776476    2.335004  4.7983145 -15.206927
## 
## Std. Errors:
##   (Intercept)   alcohol volatile.acidity free.sulfur.dioxide sulphates
## 4   1.7023924 0.2132641        1.0367783         0.009622085 1.9871044
## 5   0.9108664 0.1997596        0.9590786         0.006717728 1.8697428
## 6   0.7991185 0.1983432        0.9771836         0.006591257 1.8687365
## 7   0.9953686 0.2008955        1.0433268         0.006902895 1.8802093
## 8   1.7426061 0.2122842        1.2910147         0.007033635 1.9549638
## 9   0.4246168 0.5865401        0.3579240         0.050554913 0.9116172
##   residual.sugar citric.acid        pH  chlorides
## 4     0.05230186    1.362778 0.8843354 2.70046535
## 5     0.04709909    1.236562 0.7564718 1.68136065
## 6     0.04698566    1.241277 0.7486893 1.66451872
## 7     0.04771358    1.274603 0.7660478 2.34952747
## 8     0.05019110    1.428220 0.8650459 0.93503383
## 9     0.09793323    2.467312 1.9685448 0.05780455
## 
## Residual Deviance: 11172.09 
## AIC: 11280.09

Neato

Lets look at the Z-Statistic and P-Values of each variable on the different quality levels.

#Calculate Statistical Paramaters for each Variable in Model
z <- output$coefficients/output$standard.errors
p <- (1 - pnorm(abs(z), 0, 1))*2

#View Tables of stats for each Quality Level
qual.3 <- rbind(output$coefficients[1,], output$standard.errors[1,], z[1,], p[1,])
rownames(qual.3) <- c("Coefficients", "Std. Errors", "Z Stat", "P-Value")
knitr::kable(qual.3)

	(Intercept)	alcohol	volatile.acidity	free.sulfur.dioxide	sulphates	residual.sugar	citric.acid	pH	chlorides
Coefficients	14.350244	-0.4004995	-2.0419495	-0.0710390	2.5244026	-0.0385083	-1.8474709	-1.4149963	-23.082376
Std. Errors	1.702392	0.2132641	1.0367783	0.0096221	1.9871044	0.0523019	1.3627777	0.8843354	2.700465
Z Stat	8.429457	-1.8779503	-1.9695141	-7.3829075	1.2703925	-0.7362707	-1.3556656	-1.6000674	-8.547555
P-Value	0.000000	0.0603880	0.0488941	0.0000000	0.2039449	0.4615660	0.1752056	0.1095836	0.000000

qual.4 <- rbind(output$coefficients[2,], output$standard.errors[2,], z[2,], p[2,])
rownames(qual.4) <- c("Coefficients", "Std. Errors", "Z Stat", "P-Value")
knitr::kable(qual.4)

	(Intercept)	alcohol	volatile.acidity	free.sulfur.dioxide	sulphates	residual.sugar	citric.acid	pH	chlorides
Coefficients	15.8572239	-0.6183478	-3.2286868	-0.0292759	4.7970244	-0.0179282	-1.303123	-1.3451585	-9.1501986
Std. Errors	0.9108664	0.1997596	0.9590786	0.0067177	1.8697428	0.0470991	1.236562	0.7564718	1.6813607
Z Stat	17.4089463	-3.0954603	-3.3664465	-4.3580003	2.5656066	-0.3806477	-1.053827	-1.7782005	-5.4421391
P-Value	0.0000000	0.0019651	0.0007614	0.0000131	0.0102996	0.7034647	0.291962	0.0753709	0.0000001

qual.5 <- rbind(output$coefficients[3,], output$standard.errors[3,], z[3,], p[3,])
rownames(qual.5) <- c("Coefficients", "Std. Errors", "Z Stat", "P-Value")
knitr::kable(qual.5)

	(Intercept)	alcohol	volatile.acidity	free.sulfur.dioxide	sulphates	residual.sugar	citric.acid	pH	chlorides
Coefficients	6.1221218	0.2375441	-7.0155747	-0.0286148	6.4861236	0.0226090	-2.1076294	-0.8404690	-7.3858590
Std. Errors	0.7991185	0.1983432	0.9771836	0.0065913	1.8687365	0.0469857	1.2412774	0.7486893	1.6645187
Z Stat	7.6610933	1.1976419	-7.1793826	-4.3413279	3.4708605	0.4811890	-1.6979519	-1.1225872	-4.4372340
P-Value	0.0000000	0.2310565	0.0000000	0.0000142	0.0005188	0.6303822	0.0895168	0.2616129	0.0000091

qual.6 <- rbind(output$coefficients[4,], output$standard.errors[4,], z[4,], p[4,])
rownames(qual.6) <- c("Coefficients", "Std. Errors", "Z Stat", "P-Value")
knitr::kable(qual.6)

	(Intercept)	alcohol	volatile.acidity	free.sulfur.dioxide	sulphates	residual.sugar	citric.acid	pH	chlorides
Coefficients	-2.4020196	0.8613607	-9.451825	-0.0289792	8.0332504	0.0486890	-1.9053831	-0.6297689	-13.210430
Std. Errors	0.9953686	0.2008955	1.043327	0.0069029	1.8802093	0.0477136	1.2746030	0.7660478	2.349528
Z Stat	-2.4131961	4.2876062	-9.059314	-4.1981157	4.2725299	1.0204426	-1.4948836	-0.8221013	-5.622590
P-Value	0.0158133	0.0000181	0.000000	0.0000269	0.0000193	0.3075186	0.1349448	0.4110192	0.000000

qual.7 <- rbind(output$coefficients[5,], output$standard.errors[5,], z[5,], p[5,])
rownames(qual.7) <- c("Coefficients", "Std. Errors", "Z Stat", "P-Value")
knitr::kable(qual.7)

	(Intercept)	alcohol	volatile.acidity	free.sulfur.dioxide	sulphates	residual.sugar	citric.acid	pH	chlorides
Coefficients	-8.7193637	1.1468820	-9.257779	-0.0137668	7.3530413	0.0855875	-1.9991039	-0.2693408	-17.4102629
Std. Errors	1.7426061	0.2122842	1.291015	0.0070336	1.9549638	0.0501911	1.4282195	0.8650459	0.9350338
Z Stat	-5.0036342	5.4025783	-7.170933	-1.9572838	3.7612161	1.7052316	-1.3997176	-0.3113601	-18.6199283
P-Value	0.0000006	0.0000001	0.000000	0.0503141	0.0001691	0.0881512	0.1615979	0.7555269	0.0000000

qual.8 <- rbind(output$coefficients[6,], output$standard.errors[6,], z[6,], p[6,])
rownames(qual.8) <- c("Coefficients", "Std. Errors", "Z Stat", "P-Value")
knitr::kable(qual.8)

	(Intercept)	alcohol	volatile.acidity	free.sulfur.dioxide	sulphates	residual.sugar	citric.acid	pH	chlorides
Coefficients	-24.7202851	1.0134455	-9.991008	-0.0454012	-1.1249967	0.1177648	2.3350039	4.7983145	-15.2069274
Std. Errors	0.4246168	0.5865401	0.357924	0.0505549	0.9116172	0.0979332	2.4673122	1.9685448	0.0578046
Z Stat	-58.2178670	1.7278365	-27.913772	-0.8980580	-1.2340670	1.2025006	0.9463755	2.4374932	-263.0749083
P-Value	0.0000000	0.0840175	0.000000	0.3691546	0.2171780	0.2291696	0.3439571	0.0147895	0.0000000

Kind of interesting but lets look at the predicted probabilites for each level.

#Check Accuracy of Predictions if column with highest probability is taken as the predicted quality
head(pp <- fitted(mlogit))

##                3           4          5         6          7            8
## 5944 0.015151363 0.005981380 0.67235453 0.2916282 0.01421369 0.0006560692
## 6088 0.009940440 0.028291233 0.08905514 0.4517590 0.32942848 0.0901994100
## 1859 0.004564583 0.145623310 0.23164831 0.4602252 0.14017292 0.0162664184
## 5393 0.004669775 0.005308907 0.33408266 0.5216070 0.10980013 0.0245007018
## 4167 0.002288883 0.012861662 0.11909610 0.5330841 0.27672917 0.0553680979
## 3370 0.006381129 0.025348612 0.27440152 0.5410762 0.12958106 0.0231034137
##                 9
## 5944 1.476938e-05
## 6088 1.326291e-03
## 1859 1.499243e-03
## 5393 3.079611e-05
## 4167 5.720224e-04
## 3370 1.080698e-04

Now that is interesting, few of the probabilites are over 50% so thats not very good but it seems like the cumulative probabilities for qualities + and - 1 from the predicted quality point are usually over 70%. So we can’t predict the specific quality value very well but we can predict the value within + or - 1 point very nicely.

Now if we wanted to be more concrete we could find out which quality level had the highest probability and assume that is our predicted value. That is done below.

q1 <- colnames(pp)[max.col(pp, ties.method = "first")]

Results from the Confusion Matrix

confusionMatrix(data = q1, reference = sample.train$quality)

## Warning in levels(reference) != levels(data): longer object length is not a
## multiple of shorter object length

## Warning in confusionMatrix.default(data = q1, reference = sample.train
## $quality): Levels are not in the same order for reference and data.
## Refactoring data to match.

## Confusion Matrix and Statistics
## 
##           Reference
## Prediction    3    4    5    6    7    8    9
##          3    2    0    0    0    0    0    0
##          4    0    1    0    0    0    0    0
##          5   10   94 1024  532   57   14    0
##          6   13   71  686 1636  646  102    1
##          7    0    1   10  113  146   37    1
##          8    0    0    0    0    0    0    0
##          9    0    0    0    0    0    0    0
## 
## Overall Statistics
##                                           
##                Accuracy : 0.5405          
##                  95% CI : (0.5268, 0.5541)
##     No Information Rate : 0.4389          
##     P-Value [Acc > NIR] : < 2.2e-16       
##                                           
##                   Kappa : 0.2512          
##  Mcnemar's Test P-Value : NA              
## 
## Statistics by Class:
## 
##                       Class: 3  Class: 4 Class: 5 Class: 6 Class: 7
## Sensitivity          0.0800000 0.0059880   0.5953   0.7172  0.17197
## Specificity          1.0000000 1.0000000   0.7967   0.4791  0.96274
## Pos Pred Value       1.0000000 1.0000000   0.5916   0.5185  0.47403
## Neg Pred Value       0.9955727 0.9680523   0.7992   0.6841  0.85621
## Prevalence           0.0048105 0.0321339   0.3310   0.4389  0.16336
## Detection Rate       0.0003848 0.0001924   0.1970   0.3148  0.02809
## Detection Prevalence 0.0003848 0.0001924   0.3331   0.6071  0.05926
## Balanced Accuracy    0.5400000 0.5029940   0.6960   0.5982  0.56735
##                      Class: 8  Class: 9
## Sensitivity           0.00000 0.0000000
## Specificity           1.00000 1.0000000
## Pos Pred Value            NaN       NaN
## Neg Pred Value        0.97056 0.9996152
## Prevalence            0.02944 0.0003848
## Detection Rate        0.00000 0.0000000
## Detection Prevalence  0.00000 0.0000000
## Balanced Accuracy     0.50000 0.5000000

54% Accuracy, slightly better than a coinflip. However as said earlier if we were interested in the range of quality and if we set a cutoff for Bad, Good and Great wines we could probably predict those ranges quite well with this model.

SVM Classification Model

Finally lets finish with a SVM using a Radial Basis Function in Caret. Rather than spend many hours waiting for this to finish running again I simply set the parameters Cost and Sigma to 12 and 2 respectively, these were found to be the best performing parameters using Caret’s Tune Grid search which tries each combination of given parameters and selects the best combination. This model is also cross validated using 5 folds repeated 5 times.

#Building a SVM Model with Caret
cv.ctrl <- trainControl(method = "repeatedcv", repeats = 5,number = 5)
train.grid <- expand.grid(.C = c(12), .sigma = c(2))
set.seed(42)
train.model <-train(quality ~.,
                    data=sample.train,
                    nrounds=10,
                    method="svmRadial",
                    metric = "Accuracy",
                    trControl=cv.ctrl,
                    tuneGrid=train.grid
)

## Loading required package: kernlab

## 
## Attaching package: 'kernlab'

## The following object is masked from 'package:ggplot2':
## 
##     alpha

print(train.model)

## Support Vector Machines with Radial Basis Function Kernel 
## 
## 5197 samples
##   11 predictor
##    7 classes: '3', '4', '5', '6', '7', '8', '9' 
## 
## No pre-processing
## Resampling: Cross-Validated (5 fold, repeated 5 times) 
## Summary of sample sizes: 4158, 4158, 4157, 4158, 4157, 4158, ... 
## Resampling results:
## 
##   Accuracy  Kappa    
##   0.627712  0.3988437
## 
## Tuning parameter 'sigma' was held constant at a value of 2
## 
## Tuning parameter 'C' was held constant at a value of 12
##

Using the training data we get an accuracy of 63%.

Let’s look at the results on our holdout.

#Make predictions on Test Data and check accuracy
svmpredict <- predict(train.model, sample.test, type = "raw")
confusionMatrix(data = svmpredict, reference = sample.test$quality)

## Confusion Matrix and Statistics
## 
##           Reference
## Prediction   3   4   5   6   7   8   9
##          3   0   0   0   0   0   0   0
##          4   0   4   2   0   0   0   0
##          5   0  10 247  70   7   0   0
##          6   5  35 166 464 105  23   3
##          7   0   0   3  19 117   3   1
##          8   0   0   0   1   1  14   0
##          9   0   0   0   0   0   0   0
## 
## Overall Statistics
##                                           
##                Accuracy : 0.6508          
##                  95% CI : (0.6242, 0.6767)
##     No Information Rate : 0.4262          
##     P-Value [Acc > NIR] : < 2.2e-16       
##                                           
##                   Kappa : 0.4499          
##  Mcnemar's Test P-Value : NA              
## 
## Statistics by Class:
## 
##                      Class: 3 Class: 4 Class: 5 Class: 6 Class: 7 Class: 8
## Sensitivity          0.000000 0.081633   0.5909   0.8375   0.5087  0.35000
## Specificity          1.000000 0.998401   0.9014   0.5483   0.9757  0.99841
## Pos Pred Value            NaN 0.666667   0.7395   0.5793   0.8182  0.87500
## Neg Pred Value       0.996154 0.965224   0.8230   0.8196   0.9023  0.97975
## Prevalence           0.003846 0.037692   0.3215   0.4262   0.1769  0.03077
## Detection Rate       0.000000 0.003077   0.1900   0.3569   0.0900  0.01077
## Detection Prevalence 0.000000 0.004615   0.2569   0.6162   0.1100  0.01231
## Balanced Accuracy    0.500000 0.540017   0.7461   0.6929   0.7422  0.67421
##                      Class: 9
## Sensitivity          0.000000
## Specificity          1.000000
## Pos Pred Value            NaN
## Neg Pred Value       0.996923
## Prevalence           0.003077
## Detection Rate       0.000000
## Detection Prevalence 0.000000
## Balanced Accuracy    0.500000

Accuracy of about 65% on the holdout data, not bad at all. With a larger CV, more expansive tune.grid, and more iterations we could probably get the accuracy above 70% but I enjoy having a non-melted CPU so this will have to do for now. Similar to the previous models, if we were predicting ranges of quality this would be a very good model. However predicting the exact quality of each wine seems to be very difficult with the data we have.

It’s also likely that transforming the data with the centering and scaling tools in Caret will improve the performance of the models since the data will be standardized and normal and this will improve the models.

Numer 4 Ordinal Regression

We can also treat quality as an Ordinal ranking and try to predict the rank of each wine. We can use Ordinal Regression to do multiclass classification when each has class has a specific ranking in the group.

Trying Ordinal Regression

library(MASS)
library(ordinal)

## 
## Attaching package: 'ordinal'

## The following object is masked from 'package:kernlab':
## 
##     convergence

library(erer)

## Loading required package: lmtest

## Loading required package: zoo

## 
## Attaching package: 'zoo'

## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric

sample.train$quality <- ordered(sample.train$quality, levels = c(1,2,3,4,5,6,7,8,9))
sample.test$quality <- ordered(sample.test$quality, levels = c(1,2,3,4,5,6,7,8,9))

om1 <- polr(quality ~ alcohol + volatile.acidity + free.sulfur.dioxide + sulphates + 
             + residual.sugar + citric.acid + pH + chlorides, data = sample.train, Hess = TRUE)
summary(om1)

## Call:
## polr(formula = quality ~ alcohol + volatile.acidity + free.sulfur.dioxide + 
##     sulphates + +residual.sugar + citric.acid + pH + chlorides, 
##     data = sample.train, Hess = TRUE)
## 
## Coefficients:
##                         Value Std. Error  t value
## alcohol              0.909147   0.028873  31.4883
## volatile.acidity    -3.758080   0.217134 -17.3076
## free.sulfur.dioxide  0.005056   0.001807   2.7974
## sulphates            2.001866   0.205798   9.7273
## residual.sugar       0.041353   0.006747   6.1295
## citric.acid         -0.351863   0.214352  -1.6415
## pH                   0.507271   0.188250   2.6947
## chlorides            0.309260   0.974792   0.3173
## 
## Intercepts:
##     Value    Std. Error t value 
## 1|2  -1.2583   1.4520    -0.8666
## 2|3  -0.4521   1.3749    -0.3288
## 3|4   5.2093   0.7198     7.2376
## 4|5   7.3295   0.6967    10.5201
## 5|6  10.5080   0.6993    15.0271
## 6|7  13.0736   0.7094    18.4299
## 7|8  15.3828   0.7187    21.4025
## 8|9  19.8207   1.0080    19.6639
## 
## Residual Deviance: 11397.02 
## AIC: 11429.02

OrdinalPred <- predict(om1,newdata=sample.test)
confusionMatrix(data = OrdinalPred, reference = sample.test$quality)

## Confusion Matrix and Statistics
## 
##           Reference
## Prediction   1   2   3   4   5   6   7   8   9
##          1   0   0   0   0   0   0   0   0   0
##          2   0   0   0   0   0   0   0   0   0
##          3   0   0   0   0   0   0   0   0   0
##          4   0   0   0   0   1   0   0   0   0
##          5   0   0   4  29 230 124  10   3   1
##          6   0   0   1  20 187 402 175  23   2
##          7   0   0   0   0   0  28  45  14   1
##          8   0   0   0   0   0   0   0   0   0
##          9   0   0   0   0   0   0   0   0   0
## 
## Overall Statistics
##                                           
##                Accuracy : 0.5208          
##                  95% CI : (0.4932, 0.5482)
##     No Information Rate : 0.4262          
##     P-Value [Acc > NIR] : 4.318e-12       
##                                           
##                   Kappa : 0.2311          
##  Mcnemar's Test P-Value : NA              
## 
## Statistics by Class:
## 
##                      Class: 1 Class: 2 Class: 3  Class: 4 Class: 5
## Sensitivity                NA       NA 0.000000 0.0000000   0.5502
## Specificity                 1        1 1.000000 0.9992006   0.8061
## Pos Pred Value             NA       NA      NaN 0.0000000   0.5736
## Neg Pred Value             NA       NA 0.996154 0.9622787   0.7909
## Prevalence                  0        0 0.003846 0.0376923   0.3215
## Detection Rate              0        0 0.000000 0.0000000   0.1769
## Detection Prevalence        0        0 0.000000 0.0007692   0.3085
## Balanced Accuracy          NA       NA 0.500000 0.4996003   0.6782
##                      Class: 6 Class: 7 Class: 8 Class: 9
## Sensitivity            0.7256  0.19565  0.00000 0.000000
## Specificity            0.4531  0.95981  1.00000 1.000000
## Pos Pred Value         0.4963  0.51136      NaN      NaN
## Neg Pred Value         0.6898  0.84736  0.96923 0.996923
## Prevalence             0.4262  0.17692  0.03077 0.003077
## Detection Rate         0.3092  0.03462  0.00000 0.000000
## Detection Prevalence   0.6231  0.06769  0.00000 0.000000
## Balanced Accuracy      0.5894  0.57773  0.50000 0.500000

52% Accuracy which is on par with the Linear and Logistic results.