The question is whether there is a correlation between wine price and taste. I am no wine expert (just a consumer), so I was curious to see whether a wine's taste is related to its price.
My guess was that we (not experts, just consumers) rate the worse (cheaper) wines better and the better (costlier) wines worse.
The data was downloaded from the blog of Mr. P. Kudaras.
wine <- data.frame(w1=c(2,3,1,1,1,3,1,4,1,1),
                   w2=c(3,2,4,3,4,4,3,1,4,2),
                   w3=c(4,1,3,4,2,1,4,3,2,4),
                   w4=c(1,4,2,2,3,2,2,2,3,3))
answers <- data.frame(test=1:4, answers=c(1,4,2,3), mean=colMeans(wine))
# test - wine number in the test
# answers - real rating (1-best, 4-worst)
# mean - mean of the ratings given in the test
With both data sets prepared, I make a plot that combines them. To make the distribution of the ratings visible, I added some noise to the horizontal position of each point so that identical ratings do not overlap. This is only for the picture.
# libraries
library(ggplot2)
library(reshape2)
wine.melt.2 <- melt(wine)
## No id variables; using all as measure variables
colnames(wine.melt.2) <- c("Wine", "Rating")
wine.melt.2$x1 <- unclass(wine.melt.2$Wine) #just for plot
# adding some noise
set.seed(1)
spread <- rnorm(40)/10
wine.melt.2$x1 <- as.numeric(wine.melt.2$x1 + spread)
# plotting
# colours for lines and dots
cols <- c("mean"="#f04546","real"="#3591d1","points"="#62c76b")
fig <- ggplot(data=wine.melt.2, aes(x=x1)) # base
fig <- fig + geom_point(aes(y=Rating, colour="points")) # adding ratings
fig <- fig + geom_line(data=answers, aes(x=test, y=mean, group=1, colour="mean")) # adding mean line
fig <- fig + geom_line(data=answers, aes(x=test, y=answers, group=1, colour="real")) # adding real rating line
fig <- fig + geom_text(data=answers, aes(label=mean, x=test, y=mean), size=3, vjust=-1, colour="red") # labelling the means
fig <- fig + theme(legend.title=element_blank()) + xlab("Wine") + ylab("Rating") # legend and axis labels
fig
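The manual noise step can also be left to ggplot2. The following is only a sketch of that alternative, not the code used for the figure above: it uses base R `stack()` instead of `melt()`, and `geom_jitter()` draws uniform noise rather than the normal noise from `rnorm()`, so the points will not land in exactly the same places. The names `long` and `fig2` are made up for this example.

```r
library(ggplot2)

wine <- data.frame(w1=c(2,3,1,1,1,3,1,4,1,1),
                   w2=c(3,2,4,3,4,4,3,1,4,2),
                   w3=c(4,1,3,4,2,1,4,3,2,4),
                   w4=c(1,4,2,2,3,2,2,2,3,3))

# long format with base R: one row per single rating
long <- stack(wine)                  # columns: values, ind
names(long) <- c("Rating", "Wine")

# jitter only horizontally, so the ratings themselves stay untouched
fig2 <- ggplot(long, aes(x=Wine, y=Rating)) +
  geom_jitter(width=0.1, height=0, colour="#62c76b") +
  # mean rating per wine; the `fun` argument assumes ggplot2 >= 3.3.0
  stat_summary(fun=mean, geom="line", aes(group=1), colour="#f04546") +
  xlab("Wine") + ylab("Rating")
fig2
```

Setting `height=0` matters here: jittering vertically would change the ratings shown, not just separate the points.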
Now let's test whether the worse wines were rated better and the better wines worse. First, I calculate the correlation between the real rating and the mean test rating.
cor(answers[, 2:3])
## answers mean
## answers 1.0000 0.7807
## mean 0.7807 1.0000
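Since the real ratings are ranks and there are only four wines, it may be worth looking at a rank correlation and at a significance test as well. This is a sketch in base R, not part of the original analysis; the vector names `real` and `test.mean` are made up here, and the values are the ones from the `answers` data frame.

```r
real      <- c(1, 4, 2, 3)           # real rating (1-best, 4-worst)
test.mean <- c(1.8, 3, 2.8, 2.4)     # mean rating from the test, colMeans(wine)

# Pearson correlation, same value as in the matrix above
r.pearson <- cor(real, test.mean)

# Spearman correlation works on the ranks only
r.spearman <- cor(real, test.mean, method="spearman")

# test of the Pearson correlation; with n = 4 it has only 2 degrees of freedom
ct <- cor.test(real, test.mean)

round(c(pearson=r.pearson, spearman=r.spearman), 4)
ct$p.value
```

With only four pairs, even a fairly large coefficient is weak evidence, so the p-value from `cor.test()` is worth reading alongside the coefficient itself.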
As we can see, there is a strong positive correlation between the real rating and the mean test rating. Next, I check my hypothesis about the better and worse wines.
t.test(x=wine$w1, mu=answers$answers[1], alternative="greater") #1
##
## One Sample t-test
##
## data: wine$w1
## t = 2.228, df = 9, p-value = 0.02642
## alternative hypothesis: true mean is greater than 1
## 95 percent confidence interval:
## 1.142 Inf
## sample estimates:
## mean of x
## 1.8
t.test(x=wine$w2, mu=answers$answers[2], alternative="less") #4
##
## One Sample t-test
##
## data: wine$w2
## t = -3, df = 9, p-value = 0.007478
## alternative hypothesis: true mean is less than 4
## 95 percent confidence interval:
## -Inf 3.611
## sample estimates:
## mean of x
## 3
t.test(x=wine$w3, mu=answers$answers[3], alternative="greater") #2
##
## One Sample t-test
##
## data: wine$w3
## t = 2.058, df = 9, p-value = 0.03485
## alternative hypothesis: true mean is greater than 2
## 95 percent confidence interval:
## 2.087 Inf
## sample estimates:
## mean of x
## 2.8
t.test(x=wine$w4, mu=answers$answers[4], alternative="less") #3
##
## One Sample t-test
##
## data: wine$w4
## t = -2.25, df = 9, p-value = 0.0255
## alternative hypothesis: true mean is less than 3
## 95 percent confidence interval:
## -Inf 2.889
## sample estimates:
## mean of x
## 2.4
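Looking at these results (all four p-values are below 0.05), I can say that the better wines (1 and 3, real ratings 1 and 2) were rated worse than they deserve and the worse wines (2 and 4, real ratings 4 and 3) were rated better.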
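The four tests above can also be run in one go and collected in a small table. This is a sketch only: the rule for the alternative ("greater" for the two better wines, "less" for the two worse ones) is my reading of the four calls above, and the names `real`, `alt` and `results` are made up for this example.

```r
wine <- data.frame(w1=c(2,3,1,1,1,3,1,4,1,1),
                   w2=c(3,2,4,3,4,4,3,1,4,2),
                   w3=c(4,1,3,4,2,1,4,3,2,4),
                   w4=c(1,4,2,2,3,2,2,2,3,3))
real <- c(w1=1, w2=4, w3=2, w4=3)    # real rating (1-best, 4-worst)

# assumed rule: better wines (real rating 1 or 2) are tested for a higher
# mean (rated worse), worse wines (3 or 4) for a lower mean (rated better)
alt <- ifelse(real <= 2, "greater", "less")

# one one-sample t-test per wine, keeping only the p-value
p <- mapply(function(x, mu, a) t.test(x, mu=mu, alternative=a)$p.value,
            wine, real, alt)

results <- data.frame(real=real, mean=colMeans(wine),
                      alternative=alt, p.value=round(p, 5))
results
```

Because four tests are run on the same small panel, a correction such as `p.adjust(p, method="holm")` could be applied before reading too much into any single p-value.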
I think that, for more reliable results, we need:
1. to drink more wine.
2. to just drink (responsibly) and enjoy.