The dataset, “Spanish Wine Quality”, was taken from Kaggle. It is described as covering red variants of Spanish wine, although the type column also lists white, sparkling, and fortified styles such as Rioja White, Cava, and Sherry. It contains 516 wines from Spain and 11 features that describe their price, rating, and some flavor characteristics.
[1] 516 11
'data.frame': 516 obs. of 11 variables:
$ winery : chr "Teso La Monja" "Artadi" "Vega Sicilia" "Vega Sicilia" ...
$ wine : chr "Tinto" "Vina El Pison" "Unico" "Unico" ...
$ year : chr "2013" "2018" "2009" "1999" ...
$ rating : num 4.9 4.9 4.8 4.8 4.8 4.8 4.8 4.8 4.8 4.8 ...
$ num_reviews: int 58 31 1793 1705 1309 1209 1201 926 643 630 ...
$ country : chr "Espana" "Espana" "Espana" "Espana" ...
$ region : chr "Toro" "Vino de Espana" "Ribera del Duero" "Ribera del Duero" ...
$ price : num 995 314 325 693 778 ...
$ type : chr "Toro Red" "Tempranillo" "Ribera Del Duero Red" "Ribera Del Duero Red" ...
$ body : int 5 4 5 5 5 5 5 5 5 5 ...
$ acidity : int 3 2 3 3 3 3 3 3 3 3 ...
    winery              wine               year               rating
 Length:516         Length:516         Length:516         Min.   :4.500
 Class :character   Class :character   Class :character   1st Qu.:4.500
 Mode  :character   Mode  :character   Mode  :character   Median :4.600
                                                          Mean   :4.608
                                                          3rd Qu.:4.700
                                                          Max.   :4.900
  num_reviews        country             region              price
 Min.   :   25.0   Length:516         Length:516         Min.   :  12.00
 1st Qu.:   72.0   Class :character   Class :character   1st Qu.:  74.67
 Median :  144.5   Mode  :character   Mode  :character   Median : 148.12
 Mean   :  604.4                                         Mean   : 339.69
 3rd Qu.:  405.2                                         3rd Qu.: 356.94
 Max.   :12421.0                                         Max.   :3119.08
     type                body          acidity
 Length:516         Min.   :2.000   Min.   :1.000
 Class :character   1st Qu.:4.000   1st Qu.:3.000
 Mode  :character   Median :5.000   Median :3.000
                    Mean   :4.497   Mean   :2.893
                    3rd Qu.:5.000   3rd Qu.:3.000
                    Max.   :5.000   Max.   :3.000
                    NA's   :11      NA's   :11
[1] "Tinto"
[2] "Vina El Pison"
[3] "Unico"
[4] "Unico Reserva Especial Edicion"
[5] "El Anejon"
[6] "Don PX Convento Seleccion"
[7] "Cuesta de Las Liebres"
[8] "El Nido"
[9] "Toneles Moscatel"
[10] "Pingus"
[11] "Don PX Pedro Ximenez"
[12] "L'Ermita Velles Vinyes Priorat"
[13] "Vatan Arena Tinta de Toro"
[14] "Ribera Del Duero Gran Reserva 12 Anos"
[15] "Pesus Ribera del Duero"
[16] "Magico"
[17] "La Faraona Bierzo (Corullon)"
[18] "Gran Reserva 890"
[19] "Valbuena 5o"
[20] "Castillo Ygay Gran Reserva Especial Blanco"
[21] "La Nieta"
[22] "Malleolus de Valderramiro"
[23] "Malleolus de Sanchomartin"
[24] "Alabaster"
[25] "La Mula de la Quietud"
[26] "Terreus Paraje de Cueva Baja"
[27] "Contador Rioja"
[28] "Maria Remirez de Ganuza"
[29] "Cartago Paraje de Pozo"
[30] "Parcela El Picon Tinto"
[31] "Termanthia"
[32] "Clon De La Familia"
[33] "Aquilon Garnacha"
[34] "Quinon de Valmira"
[35] "1902 Centenary Carignan Priorat"
[36] "Tintilla de Rota"
[37] "Cirsion Rioja"
[38] "Cami Pesseroles"
[39] "Turo d'en Mota"
[40] "Priorat"
[41] "Reliquia Palo Cortado Sherry"
[42] "Anada Palo Cortado 1987"
[43] "Daphne Glorian Red"
[44] "El Regollar"
[45] "Abuelo Diego Palo Cortado"
[46] "La Bota 78 de Oloroso"
[47] "Touran"
[48] "Luthier Gran Reserva"
[49] "Sorte O Soro Val do Bibei"
[50] "Reserva Particular de Recaredo Brut Nature"
[51] "Usted"
[52] "Gran Reserva"
[53] "Regina Vides Ribera del Duero"
[54] "La Loma"
[55] "Recondita Armonia Monastrell Dulce"
[56] "Castillo Ygay Gran Reserva Especial Tinto"
[57] "Ribera del Duero"
[58] "PS (Pagos Seleccionados) Ribera del Duero"
[59] "Flor de Pingus"
[60] "Pago De Valtarrena"
[61] "El Titan del Bendito"
[62] "Dalmau Rioja"
[63] "Finca el Bosque"
[64] "Pago de Santa Cruz"
[65] "Clos Fonta Priorat"
[66] "Cuvee N Vinas Viejas"
[67] "Ribera del Duero Prestigio Pago de las Solanas"
[68] "Nit de Nin Mas d'en Cacador"
[69] "VS"
[70] "Blecua Somontano"
[71] "Frank Gehry Selection"
[72] "Amaya Arzuaga (Coleccion)"
[73] "La Creu Alta"
[74] "5o Ano Ribera del Duero Tinto"
[75] "Planots Priorat"
[76] "Capitel"
[77] "Gran Arzuaga Ribera del Duero"
[78] "Horcajo"
[79] "Aurus"
[80] "Dolc de L'Obac"
[81] "Trasnocho"
[82] "Doroteo Edicion Especial 25 Aniversario"
[83] "Tierra Alta de 2 Racimos Gran Reserva"
[84] "Reserva Especial Ribera del Duero"
[85] "Reserva Especial"
[86] "Gran Buig Priorat"
[87] "Finca Garbet"
[88] "Solera India Oloroso Rare Sherry"
[89] "Amancio"
[90] "Gines Liebana Pedro Ximenez"
[91] "Don PX Seleccion"
[92] "Reserva"
[93] "Finca Las Naves"
[94] "AAA"
[95] "Vina del Olivo"
[96] "Les Aubaguetes Priorat"
[97] "Cava Enoteca Finca La Plana Brut Nature"
[98] "Gran Reserva Penas Aladas"
[99] "Pena Lobera"
[100] "Adega do Moucho Treixadura"
[101] "Son Negre"
[102] "Casa Cisca Monastrell"
[103] "Moncerbal Bierzo (Corullon)"
[104] "Capricho"
[105] "Cerrado del Castillo Rioja"
[106] "Kalamity Rioja"
[107] "Conde de Aldama Amontillado"
[108] "Respeto"
[109] "El Garnacho Viejo de la Familia Acha"
[110] "Ribera del Duero Prestigio"
[111] "Finca Misenhora Edicion Limitada"
[112] "La Poza de Ballesteros"
[113] "Reserva Rioja (Finca Ygay)"
[114] "Idus"
[115] "Rosado de Larrainzar"
[116] "Valdafoz Bierzo (Corullon)"
[117] "Rioja Gran Reserva"
[118] "Valdegatiles Ribera del Duero"
[119] "Respublica Verdejo"
[120] "Les Tosses"
[121] "Torremilanos Coleccion Ribera del Duero"
[122] "El Carretil"
[123] "Taberner No. 1"
[124] "Vina Motulleri"
[125] "Saktih"
[126] "Cuvee Palomar"
[127] "Uno Tinto"
[128] "Macan"
[129] "1903 Centenary Grenache"
[130] "Galiano Seleccion Especial"
[131] "Molino Real"
[132] "Etern Vinyes Molt Velles"
[133] "40 Aniversario Gran Reserva"
[134] "Solera BC-200"
[135] "Corpinnat Enoteca Reserva Particular de Recaredo"
[136] "Toro"
[137] "Clio"
[138] "Garnacha"
[139] "Parcela El Nogal Tinto"
[140] "Belondrade y Lurton"
[141] "Rioja Reserva"
[142] "Prado Enea Gran Reserva"
[143] "Pedro Ximenez Tradicion 20 Years Old Vos"
[144] "Torre Muga"
[145] "Aro"
[146] "San Vicente Rioja"
[147] "Ribera del Duero TSM"
[148] "Ribera Del Duero Reserva Premium 6 Anos"
[149] "Clos de L'Obac"
[150] "Reserva Ribera del Duero"
[151] "Ribera Del Duero"
[152] "Bruto"
[153] "Culmen Reserva Rioja"
[154] "Vina Tondonia Gran Reserva"
[155] "Petit Verdot"
[156] "Clos Martinet"
[157] "Millenium Gran Reserva"
[158] "Finca Dofi"
[159] "Gran Reserva Ribera del Duero"
[160] "Seleccio Especial Vinyes Velles"
[161] "Calvario Rioja"
[162] "Ribera del Duero Una Cepa I"
[163] "Grans Muralles"
[164] "Gran Vino Albarino"
[165] "Cuatro Palmas Amontillado"
[166] "Pago Garduna"
[167] "La Cueva del Contador Rioja"
[168] "Solera 1830 Pedro Ximenez"
[169] "Que Bonito Cacareaba Blanco"
[170] "Baron de Chirel Rioja Reserva"
[171] "Emeritvs (Emeritus)"
[172] "Finca Los Hoyales Ribera del Duero"
[173] "Blanco"
[174] "La Basseta"
[175] "Territorio Luthier Reserva"
[176] "Vinas Viejas de Soria Ribera Del Duero"
[177] "Pago de Santa Cruz Gran Reserva Ribera del Duero"
[178] "As Sortes Val do Bibei Godello"
[179] "Seleccion Rioja"
[180] "Millenium Reserva"
[181] "Acediano"
[182] "Gaudium"
[183] "Pico de Luyas"
[184] "Grandes Anadas Rioja"
[185] "Mas Via Gran Reserva Brut"
[186] "Navarra Coleccion 125 Blanco"
[187] "Chardonnay Uno"
[188] "Don PX Vieja Cosecha"
[189] "Ribas de Cabrera"
[190] "Cenit"
[191] "Mas del Serral"
[192] "Estrats"
[1] "Teso La Monja"
[2] "Artadi"
[3] "Vega Sicilia"
[4] "Pago de Carraovejas"
[5] "Toro Albala"
[6] "Bodegas El Nido"
[7] "Valdespino"
[8] "Dominio de Pingus"
[9] "Alvaro Palacios"
[10] "Ordonez"
[11] "Bodegas Valduero"
[12] "Vina Sastre"
[13] "Sierra Cantabria"
[14] "Descendientes de J. Palacios"
[15] "La Rioja Alta"
[16] "Marques de Murrieta"
[17] "Vinedos de Paganos"
[18] "Emilio Moro"
[19] "Quinta de la Quietud"
[20] "Bodegas Mauro"
[21] "Bodega Contador (Benjamin Romeo)"
[22] "Remirez de Ganuza"
[23] "Bodegas San Roman"
[24] "Pago de Los Capellanes"
[25] "Bodega Numanthia"
[26] "Alto Moncayo"
[27] "Mas Doix"
[28] "Finca Moncloa"
[29] "Bodegas Roda"
[30] "Martinet"
[31] "Recaredo"
[32] "Clos Erasmus"
[33] "Barbadillo"
[34] "Gonzalez-Byass"
[35] "Bodegas Amaren"
[36] "Alvear"
[37] "Equipo Navazos"
[38] "Morca"
[39] "Territorio Luthier"
[40] "Rafael Palacios"
[41] "Terra Remota"
[42] "Dehesa de Los Canonigos"
[43] "Miguel Merino"
[44] "Gutierrez de la Vega"
[45] "Alion"
[46] "Aalto"
[47] "Carmelo Rodero"
[48] "Dominio del Bendito"
[49] "Mas d'en Gil"
[50] "Casa Castillo"
[51] "Matarromera"
[52] "Nin-Ortiz"
[53] "Vinas del Vero"
[54] "Marques de Riscal"
[55] "Arzuaga"
[56] "Bodegas Mas Alta"
[57] "Dominio de Calogia"
[58] "Tomas Postigo"
[59] "Cal Pla"
[60] "Ossian"
[61] "Cepa 21"
[62] "Bodegas Vilano"
[63] "Allende"
[64] "Costers del Siurana"
[65] "Hacienda Monasterio"
[66] "Castillo Perelada"
[67] "Osborne"
[68] "Ysios"
[69] "Marques de Grinon"
[70] "Contino"
[71] "Gramona"
[72] "Dominio del Aguila"
[73] "Hacienda Solano"
[74] "Francisco Garcia Perez"
[75] "Anima Negra"
[76] "Castano"
[77] "La Legua"
[78] "Castillo de Cuzcurrita"
[79] "Oxer Wines"
[80] "Bodegas Yuste"
[81] "Bodegas 6o Elemento - Vino Sexto Elemento"
[82] "Proyecto Garnachas de Espana"
[83] "Casal de Arman"
[84] "Vall Llach"
[85] "Pago de Larrainzar"
[86] "Ukan Winery"
[87] "Vina Real"
[88] "Dominio de Atauta"
[89] "Micro Bio (MicroBio)"
[90] "Terroir Al Limit Soc. Lda"
[91] "Finca Torremilanos"
[92] "Huerta de Albala"
[93] "Gomez Cruzado"
[94] "Castell d'Encus"
[95] "Abadia Retuerta"
[96] "Enate"
[97] "Benjamin de Rothschild - Vega Sicilia"
[98] "Bodegas Aragonesas"
[99] "Telmo Rodriguez"
[100] "Acustic Celler"
[101] "Vina Pedrosa"
[102] "Pintia"
[103] "Belondrade"
[104] "Muga"
[105] "Clos Mogador"
[106] "Bodegas Tradicion"
[107] "Senorio de San Vicente"
[108] "Francisco Barona"
[109] "Juan Gil"
[110] "Lan"
[111] "R. Lopez de Heredia"
[112] "Adama Wines"
[113] "Milsetentayseis"
[114] "Espectacle del Montsant"
[115] "Tinto Pesquera"
[116] "Ferrer Bobet"
[117] "Familia Torres"
[118] "Pazo Barrantes"
[119] "Tio Pepe"
[120] "Cruz de Alba"
[121] "Emilio Rojo"
[122] "Dominio de Es"
[123] "Jesus Madrazo"
[124] "Bodegas Naluar & Acediano"
[125] "Marques de Caceres"
[126] "Trus"
[127] "Mestres"
[128] "Chivite"
[129] "Bodega Ribas"
[130] "Vinas del Cenit"
[131] "Mas del Serral"
[132] "Cervoles"
[1] "Toro Red" "Tempranillo" "Ribera Del Duero Red"
[4] "Pedro Ximenez" "Red" "Sherry"
[7] "Priorat Red" "Rioja Red" "Rioja White"
[10] "Grenache" NA "Cava"
[13] "Verdejo" "Syrah" "Monastrell"
[16] "Mencia" "Sparkling" "Montsant Red"
[19] "Albarino" "Chardonnay"
This histogram shows the distribution of wine prices. It is asymmetrical and right-skewed: the peak sits at the low end of the price axis and a long tail extends toward expensive wines. The most common prices fall roughly between 50 and 100 euros.
This histogram shows the distribution of wine ratings. It is also asymmetrical and right-skewed, with the peak to the left of the middle of the observed range. Ratings vary across the many wine types; most sit near the median of 4.6, within an observed range of 4.5 to 4.9.
This histogram shows the distribution of the number of reviews per wine. It is also asymmetrical and right-skewed, with the peak at the low end of the axis. Most wines have relatively few reviews: the minimum is 25 and the median is 144.5, while a few wines have several thousand (maximum 12,421).
These histograms show the price distribution for each wine type, and they are also right-skewed. They plot price rather than rating, so they do not identify which type has the highest rating; the positive correlation between price and rating (about 0.41) also means that the cheapest wines are not the highest rated.
This histogram shows the body score of each wine type. The scale runs from 1 to 5, and the observed scores range from 2 to 5. Body, the richness and weight of the wine, is lowest in the Sparkling variety.
This boxplot shows the price of wine for each variety. Each box spans the interquartile range, the line inside it marks the median, and dots mark outliers. Priorat Red and Rioja White are the most expensive varieties and have the most outliers.
This boxplot shows the acidity score of each wine type. Almost all varieties score high on acidity (the median is 3, the highest value observed). Acidity is what makes your mouth water and invites another sip.
This scatterplot shows rating against price. The first two wines in the data, both rated at the maximum of 4.9, are priced at about 995 and 314 euros. Overall, rating tends to increase with price (correlation about 0.41), so the cheapest wines are not the highest rated.
This scatterplot shows the acidity score against price. Almost all wines score high on acidity regardless of price, so acidity does not appear to depend on price.
rating price
rating 1.000000 0.408869
price 0.408869 1.000000
Here I subset the data frame to the two numeric attributes I need, rating and price, and compute their correlation. The correlation is positive and moderate (about 0.41), which indicates that higher-rated wines tend to be priced higher.
num_reviews price
num_reviews 1.0000000 -0.1150885
price -0.1150885 1.0000000
Here I subset the data frame to the two numeric attributes I need, num_reviews and price, and compute their correlation. The correlation is negative and weak (about -0.12).
This ggplot uses geom_bar to show the count of wines at each price. The peak is at the low end of the price axis: the most common prices fall roughly between 50 and 100 euros, so the lowest-priced wines have the highest count.
This ggplot uses geom_point to show the number of reviews against price. Most wines have relatively few reviews at any price, and the weak negative correlation (about -0.12) suggests that the number of reviews drops only slightly as price rises.
---
title: "Analysis of Spanish Wine Quality"
output:
  flexdashboard::flex_dashboard:
    orientation: columns
    vertical_layout: fill
    social: menu
    theme: spacelab
    storyboard: TRUE
    source_code: embed
---
```{r setup, include=FALSE}
library(flexdashboard)
library(lattice)
library(corrplot)
library(ggplot2)
library(DT)
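# file.choose() opens a file dialog, so this chunk needs an interactive session
Spwine <- read.csv(file.choose(), header = TRUE)
# Open the data viewer only when running interactively (it is unavailable while knitting)
if (interactive()) View(Spwine)
# attach() lets later chunks refer to columns such as price and rating by name
attach(Spwine)
# Numeric column pairs used for the correlations and ggplots
Spwinedf <- Spwine[c("rating", "price")]
Spwinedf1 <- Spwine[c("num_reviews", "price")]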
```
# HOME
### Data Description {.tabset}
The dataset, "Spanish Wine Quality", was taken from Kaggle. It is described as covering red variants of Spanish wine, although the type column also lists white, sparkling, and fortified styles such as Rioja White, Cava, and Sherry. It contains 516 wines from Spain and 11 features that describe their price, rating, and some flavor characteristics.
### Attribute Information {.tabset}
1. winery: Winery name
2. wine: Name of the wine
3. year: Year in which the grapes were harvested
4. rating: Average rating given to the wine by the users [from 1-5]
5. num_reviews: Number of users that reviewed the wine
6. country: Country of origin [Spain]
7. region: Region of the wine
8. price: Price in euros [€]
9. type: Wine variety
10. body: Body score, defined as the richness and weight of the wine in your mouth [from 1-5]
11. acidity: Acidity score, defined as the wine's “pucker” or tartness; it is what makes a wine refreshing and makes your mouth water and want another sip [from 1-5]
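The structure output lists `year` as a character column, which happens when at least one entry is not a plain number. The sketch below shows one possible way to get a numeric year, assuming a non-numeric placeholder such as "N.V." (a hypothetical value; the actual entries are not shown in this document). The vector is made-up illustration data.

```r
# Made-up year strings; "N.V." is an assumed placeholder for a missing vintage.
toy_year <- c("2013", "2018", "N.V.", "1999")

# as.integer() turns anything that is not a number into NA (with a warning,
# which is silenced here because the NAs are expected).
year_num <- suppressWarnings(as.integer(toy_year))

print(year_num)
print(sum(is.na(year_num)))  # number of entries that could not be converted
```

On the real data the same idea would be applied to `Spwine$year`, after checking which strings fail to convert.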
# Exploring Data
## Exploring Data {.tabset}
### Dimension
```{r}
dim(Spwine)
```
### Structure
```{r}
str(Spwine)
```
### Summary
```{r}
summary(Spwine)
```
### Unique
```{r}
unique(Spwine$wine)
unique(Spwine$winery)
unique(Spwine$type)
```
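Printing every unique value gives long listings. As a complement, the sketch below counts the distinct values per column and tabulates one column, using a small made-up data frame (the names `W1`, `Red`, and so on are placeholders, not values from the dataset).

```r
# Made-up data frame standing in for the winery, wine and type columns.
toy <- data.frame(
  winery = c("W1", "W1", "W2", "W3"),
  wine   = c("A", "B", "C", "D"),
  type   = c("Red", "Red", "White", NA)
)

# Number of distinct values in each column (NA counts as its own value).
distinct_counts <- sapply(toy, function(col) length(unique(col)))

# Frequency of each type, most common first, keeping NA as a category.
type_freq <- sort(table(toy$type, useNA = "ifany"), decreasing = TRUE)

print(distinct_counts)
print(type_freq)
```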
# Histogram
## Histogram {.tabset}
### Price of wine
```{r}
hist(Spwine$price, breaks = 30, col = "blue", main = "Price of Wine")
```
This histogram shows the distribution of wine prices. It is asymmetrical and right-skewed: the peak sits at the low end of the price axis and a long tail extends toward expensive wines. The most common prices fall roughly between 50 and 100 euros.
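Right skew can also be checked numerically. The summary already hints at it, since the mean price (339.69) is well above the median (148.12). The sketch below computes a simple moment-based skewness coefficient on made-up prices; the helper `skewness()` is defined here for illustration and is not part of base R.

```r
# Moment-based skewness: third central moment divided by the
# second central moment raised to the power 1.5.
# A positive value indicates a longer right tail.
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / (mean((x - m)^2))^1.5
}

# Made-up prices in euros, for illustration only.
toy_price <- c(20, 35, 50, 60, 75, 90, 150, 400, 1200)

skew_value <- skewness(toy_price)
mean_above_median <- mean(toy_price) > median(toy_price)

print(skew_value)
print(mean_above_median)
```

With the real data, `skewness(Spwine$price)` would be the corresponding call.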
### Ratings of wine
```{r}
hist(Spwine$rating, breaks = 20, col = "pink", main = "Ratings of Wine")
```
This histogram shows the distribution of wine ratings. It is also asymmetrical and right-skewed, with the peak to the left of the middle of the observed range. Ratings vary across the many wine types; most sit near the median of 4.6, within an observed range of 4.5 to 4.9.
### Number of reviews of the wine
```{r}
hist(Spwine$num_reviews, breaks = 40, col = "orange", main = "Number of reviews of wine")
```
This histogram shows the distribution of the number of reviews per wine. It is also asymmetrical and right-skewed, with the peak at the low end of the axis. Most wines have relatively few reviews: the minimum is 25 and the median is 144.5, while a few wines have several thousand (maximum 12,421).
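### Price by wine type
```{r}
histogram(~price|type, data = Spwine, col = c(1,2), main = "Price of wine by type")
```
These histograms show the price distribution for each wine type, and they are also right-skewed. They plot price rather than rating, so they do not identify which type has the highest rating; the positive correlation between price and rating (about 0.41) also means that the cheapest wines are not the highest rated.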
### Body score of the wine
```{r}
histogram(~body|type, data = Spwine, col = c(4,2), main = "Minimum body score of wine")
```
This histogram shows the body score of each wine type. The scale runs from 1 to 5, and the observed scores range from 2 to 5. Body, the richness and weight of the wine, is lowest in the Sparkling variety.
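The summary reports 11 missing values for both body and acidity, and the type column also contains NA. A minimal sketch of how such gaps could be counted and filtered out before plotting, on a made-up data frame:

```r
# Made-up rows; NA marks a missing type, body score or acidity score.
toy <- data.frame(
  type    = c("Rioja Red", "Cava", NA, "Sherry"),
  body    = c(4L, NA, 5L, NA),
  acidity = c(3L, NA, 3L, NA)
)

# Missing values per column.
na_counts <- colSums(is.na(toy))

# Keep only the rows where both scores are present.
complete_rows <- toy[complete.cases(toy[c("body", "acidity")]), ]

print(na_counts)
print(nrow(complete_rows))
```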
# Boxplot
## Boxplot {.tabset}
### Price of wine
```{r}
boxplot(price~type, data = Spwine, xlab = "type of wine", ylab="price of wine",col = "red", main = "price of wine")
```
This boxplot shows the price of wine for each variety. Each box spans the interquartile range, the line inside it marks the median, and dots mark outliers. Priorat Red and Rioja White are the most expensive varieties and have the most outliers.
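The dots in a boxplot follow the default rule used by `boxplot()`: a point is drawn as an outlier when it lies more than 1.5 times the box height beyond the box. The sketch below counts outliers per type with `boxplot.stats()`, which applies the same rule; the data frame is made up for illustration.

```r
# Made-up prices for two placeholder types, "A" and "B".
toy <- data.frame(
  type  = c(rep("A", 6), rep("B", 6)),
  price = c(40, 55, 60, 70, 80, 900,
            30, 35, 40, 45, 50, 55)
)

# boxplot.stats()$out returns the points beyond 1.5 times the hinge spread.
outlier_counts <- tapply(
  toy$price, toy$type,
  function(p) length(boxplot.stats(p)$out)
)

print(outlier_counts)
```

On the real data the equivalent call would use `Spwine$price` and `Spwine$type`.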
### Score of Acidity
```{r}
boxplot(acidity~type, data = Spwine, xlab = "type of wine", ylab = "level of acidity", col = "orange", main = "Acidity content present in wine")
```
This boxplot shows the acidity score of each wine type. Almost all varieties score high on acidity (the median is 3, the highest value observed). Acidity is what makes your mouth water and invites another sip.
# Scatterplot
## Scatterplot {.tabset}
### Price and Ratings
```{r}
# Passing data = Spwine means the plot does not depend on attach()
plot(rating~price, data = Spwine, pch=19, col="skyblue")
```
This scatterplot shows rating against price. The first two wines in the data, both rated at the maximum of 4.9, are priced at about 995 and 314 euros. Overall, rating tends to increase with price (correlation about 0.41), so the cheapest wines are not the highest rated.
### Acidity versus price
```{r}
# Passing data = Spwine means the plot does not depend on attach()
plot(acidity~price, data = Spwine, pch=20, col="red")
```
This scatterplot shows the acidity score against price. Almost all wines score high on acidity regardless of price, so acidity does not appear to depend on price.
# Visualization
## Visualization {.tabset}
### Correlation1
```{r}
# Pearson correlation matrix of rating and price
cor(Spwinedf)
# tl.srt rotates the text labels by 45 degrees
corrplot(cor(Spwinedf), method="number", shade.col= NA, tl.col = "black", tl.srt=45)
```
Here I subset the data frame to the two numeric attributes I need, rating and price, and compute their correlation. The correlation is positive and moderate (about 0.41), which indicates that higher-rated wines tend to be priced higher.
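A correlation coefficient on its own does not say how reliable it is. As a possible follow-up, `cor.test()` from base R reports a p-value and a confidence interval for the same Pearson coefficient. The sketch below uses made-up rating and price values, not the dataset.

```r
# Made-up ratings and prices in euros, for illustration only.
toy <- data.frame(
  rating = c(4.5, 4.5, 4.6, 4.6, 4.7, 4.7, 4.8, 4.9),
  price  = c(30, 80, 60, 150, 120, 400, 350, 900)
)

# Plain Pearson coefficient, as returned by cor().
r <- cor(toy$rating, toy$price)

# Same coefficient with a significance test and 95% confidence interval.
test <- cor.test(toy$rating, toy$price)

print(r)
print(test$p.value)
print(test$conf.int)
```

For the real data the call would be `cor.test(Spwine$rating, Spwine$price)`.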
### Correlation2
```{r}
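# Pearson correlation matrix of num_reviews and price
cor(Spwinedf1)
# tl.srt rotates the text labels by 45 degrees
corrplot(cor(Spwinedf1), method="number", shade.col= NA, tl.col = "black", tl.srt=45)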
```
Here I subset the data frame to the two numeric attributes I need, num_reviews and price, and compute their correlation. The correlation is negative and weak (about -0.12).
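Both num_reviews and price are strongly right-skewed, and Pearson correlation is sensitive to a few extreme values. One option, shown here only as a sketch on made-up numbers, is to compare it with the rank-based Spearman correlation, which `cor()` supports through its `method` argument.

```r
# Made-up review counts and prices, for illustration only.
toy <- data.frame(
  num_reviews = c(25, 40, 70, 150, 400, 1200, 12000),
  price       = c(900, 300, 350, 120, 80, 60, 20)
)

# Pearson uses the raw values; Spearman uses their ranks.
pearson  <- cor(toy$num_reviews, toy$price, method = "pearson")
spearman <- cor(toy$num_reviews, toy$price, method = "spearman")

print(pearson)
print(spearman)
```

If the two coefficients differ a lot on the real data, the extreme review counts are likely driving the Pearson value.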
### GGplot1
```{r}
# Count of wines at each price; a fixed outline colour is set outside aes()
ggplot(Spwinedf)+geom_bar(mapping= aes(x=price), color="steelblue") +ggtitle(label= "Quantity of Wine")
```
This ggplot uses geom_bar to show the count of wines at each price. The peak is at the low end of the price axis: the most common prices fall roughly between 50 and 100 euros, so the lowest-priced wines have the highest count.
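Because price is continuous, counting wines per price band makes the most common range explicit. The sketch below bins made-up prices into 50-euro bands with `cut()` and picks the band with the highest count; the band width of 50 is an assumption chosen to match the range quoted above.

```r
# Made-up prices in euros, for illustration only.
toy_price <- c(20, 35, 55, 60, 75, 90, 95, 150, 400, 1200)

# Left-closed 50-euro bands: [0,50), [50,100), ...
bins <- cut(toy_price, breaks = seq(0, 1250, by = 50),
            right = FALSE, dig.lab = 4)

# Number of wines in each band, and the band with the most wines.
counts <- table(bins)
modal_bin <- names(counts)[which.max(counts)]

print(modal_bin)
print(max(counts))
```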
### GGplot2
```{r}
# Number of reviews against price; a fixed point colour is set outside aes()
ggplot(Spwinedf1)+geom_point(mapping= aes(x=price,y=num_reviews), color="steelblue") +ggtitle(label= "Number of reviews by price")
```
This ggplot uses geom_point to show the number of reviews against price. Most wines have relatively few reviews at any price, and the weak negative correlation (about -0.12) suggests that the number of reviews drops only slightly as price rises.
# Download {.tabset}
```{r}
datatable(Spwinedf,extensions='Buttons',options=list(dom="Bftrip",buttons=c('copy','print','csv','pdf')))
```
Column {data-width=650}
----------------------------------
```{r}
datatable(Spwinedf1,extensions='Buttons',options=list(dom="Bftrip",buttons=c('copy','print','csv','pdf')))
```