ECNM HW 3
Preparation
Clear Data
Loading Packages
Bringing in Data
Introduction
The wine industry is a dynamic and competitive market influenced by factors such as grape harvest quality, consumer preferences, and global trade trends. From boutique wineries to large-scale producers, understanding and predicting sales is critical for managing production, inventory, and distribution.
Seasonal demand, marketing campaigns, and pricing strategies all play significant roles in driving wine sales. By leveraging data-driven models to predict the number of wine cases sold, producers and distributors can optimize operations, reduce waste, and adapt to changing market conditions, ensuring both efficiency and profitability in a rapidly evolving industry.
Variables
Free Sulfur dioxide: Helps in the preservation of wine.
Sulfates: Chemical compounds that are present in wine as a natural byproduct of fermentation and as an added preservative.
Total Sulfur dioxide: The sum of free sulfur dioxide and bound sulfur dioxide. The bound portion is chemically attached to other molecules in the wine, such as sugars or pigments, and is not readily available for its preservative function.
Chloride: Chloride levels in wine contribute to saltiness but are generally low. The concentration of chloride can affect a wine's taste and quality, and therefore its market appeal.
Fixed Acidity: Amount of natural acids in wine that remain in the liquid when heated.
Citric acid: Not commonly found in wine, but found in small quantities in grapes, making up about 5% of the total acid content. Mainly used to increase acidity and to prevent ferric hazes, which can form in wine from metal compounds like iron and copper.
Volatile acids: A measure of the gaseous acids in wine, typically associated with the smell of vinegar and usually present only in small amounts.
Acidity levels: Wines with higher acidity are more likely to improve over time and develop deeper flavors and more complex aromas, but too much acidity can make the wine taste sour.
The pH scale: Measures how acidic a wine is, with lower pH numbers indicating more acidity.
1. Cleaning the Data
Understanding Missing Variables
In the Summary table above, variables such as ResidualSugar, Alcohol, and STARS have missing data. Before continuing, it is important to have a clean set of data. The graph and table below give a visual and numeric view of the key variables that need cleaning; the table shows the number of missing values per variable.
             INDEX             TARGET       FixedAcidity    VolatileAcidity
                 0                  0                  0                  0
        CitricAcid      ResidualSugar          Chlorides  FreeSulfurDioxide
                 0                616                638                647
TotalSulfurDioxide            Density                 pH          Sulphates
               682                  0                395               1210
           Alcohol        LabelAppeal          AcidIndex              STARS
               653                  0                  0               3359
Dropping variables
We will drop variables that are not essential to the overall model or that have a high number of missing rows.
These are FixedAcidity, VolatileAcidity, Chlorides, FreeSulfurDioxide, and Sulphates. Some of these are already covered by AcidIndex and TotalSulfurDioxide.
# List of columns to drop
columns_to_drop <- c("FixedAcidity", "VolatileAcidity", "Chlorides",
                     "FreeSulfurDioxide", "Sulphates")
# Drop the columns from the dataset
dt <- dt[, !(names(dt) %in% columns_to_drop)]
Cleaning the data
Instead of simply replacing NAs with the median or average of the respective variable, we use machine learning to predict the missing values from the relationships between all variables in the dataset.
- Negative numbers to positives: We'll convert negative values to positive values using absolute values. Wine ingredients cannot have a negative quantity, so turning them positive makes the most sense.
- Blanks for ingredients: We will impute these using missForest, which predicts each missing value from the other variables in the dataset.
- Blanks for star ratings: Will be replaced with 0s.
- Zeros for ingredients: We will leave zero ingredient quantities as zeros. A zero quantity could be a meaningful amount of that ingredient in the wine.
# Convert STARS to numeric
dt$STARS <- as.numeric(dt$STARS)
# Impute missing values in STARS with 0
dt$STARS[is.na(dt$STARS)] <- 0
dt$STARS <- as.factor(dt$STARS)
# Convert STARS to a factor, ensuring 1 is the baseline
dt$STARS <- factor(dt$STARS, levels = c(1, 2, 3, 4, 0)) # 1 as baseline, 0 last
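The absolute-value and imputation steps described in the list above are not shown in the report, so the following is only a minimal sketch of how they might look. The toy data frame, the choice of columns, and the seed are assumptions made for illustration. `missForest::missForest()` returns a list whose `ximp` element is the completed data frame; the call is guarded so the sketch still runs when the package is not installed.

```r
# Toy data standing in for the wine data (values are made up for illustration).
toy <- data.frame(
  CitricAcid    = c(0.3, -0.8, 0.5, NA, 0.0, 1.2),
  ResidualSugar = c(12.0, NA, -3.5, 40.1, 0.0, 7.7),
  Alcohol       = c(9.9, 11.2, NA, 10.4, 12.8, 9.1)
)

# Step 1: flip negative quantities to positive; NAs and zeros are left untouched.
# Which columns count as "ingredients" is an assumption here.
ingredient_cols <- c("CitricAcid", "ResidualSugar", "Alcohol")
toy[ingredient_cols] <- lapply(toy[ingredient_cols], abs)

# Step 2: impute the remaining NAs with missForest, if the package is available.
if (requireNamespace("missForest", quietly = TRUE)) {
  set.seed(123)                          # random forests are stochastic
  imputed   <- missForest::missForest(toy)
  toy_clean <- imputed$ximp              # completed data frame
} else {
  toy_clean <- toy                       # imputation skipped in this sketch
}
```

Taking absolute values before imputing means the random forest learns from plausible quantities rather than from sign errors.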
There are no longer any missing values
# Summary of missing values
colSums(is.na(dt2) | dt2 == "")
             INDEX             TARGET         CitricAcid      ResidualSugar
                 0                  0                  0                  0
TotalSulfurDioxide            Density                 pH            Alcohol
                 0                  0                  0                  0
       LabelAppeal          AcidIndex              STARS
                 0                  0                  0
2. Data Exploration
Summary table with changes
Below is a summary of the data covering key numerical statistics such as the median, mean, maximum, kurtosis, and standard deviation.
Variable | Mean | Median | Minimum | Maximum | Kurtosis | Skew | SD | NA Count |
---|---|---|---|---|---|---|---|---|
INDEX | 8,070 | 8,110 | 1 | 16,129 | -1.20 | 0.00 | 4,656.91 | 0 |
TARGET | 3 | 3 | 0 | 8 | -0.88 | -0.33 | 1.93 | 0 |
CitricAcid | 1 | 0 | 0 | 4 | 2.95 | 1.64 | 0.61 | 0 |
ResidualSugar | 23 | 14 | 0 | 141 | 2.46 | 1.49 | 24.36 | 0 |
TotalSulfurDioxide | 205 | 160 | 0 | 1,057 | 3.33 | 1.64 | 158.89 | 0 |
Density | 1 | 1 | 1 | 1 | 1.90 | -0.02 | 0.03 | 0 |
pH | 3 | 3 | 0 | 6 | 1.78 | 0.04 | 0.67 | 0 |
Alcohol | 11 | 10 | 0 | 26 | 1.25 | 0.19 | 3.54 | 0 |
AcidIndex | 8 | 8 | 4 | 17 | 5.19 | 1.65 | 1.32 | 0 |
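For readers who want to reproduce the Skew and Kurtosis columns, here is a minimal base R sketch using moment-based formulas. The report does not say which package or estimator produced the table, so the exact formulas are an assumption; the helper names and toy vectors are hypothetical. The kurtosis is reported as excess kurtosis (normal = 0), which is consistent with the INDEX row: an evenly spread identifier has excess kurtosis close to -1.2.

```r
# Moment-based skewness: third central moment scaled by sd^3.
skewness <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Excess kurtosis: fourth central moment scaled by sd^4, minus 3 (normal = 0).
excess_kurtosis <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2 - 3
}

index_like <- 1:16129            # evenly spread values, like an ID column
set.seed(3)
right_skewed <- rexp(5000)       # toy right-skewed variable

print(c(skew = skewness(index_like), kurtosis = excess_kurtosis(index_like)))
print(c(skew = skewness(right_skewed), kurtosis = excess_kurtosis(right_skewed)))
```

Positive skew, as seen for ResidualSugar and TotalSulfurDioxide in the table, indicates a long right tail.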
Graphs
The following graphs help us better understand outliers and distribution patterns, enabling more effective data analysis.
Distribution of Key Variables
Based on the distribution graphs below, variables such as volatile acidity, residual sugar, and chlorides are skewed and may benefit from a transformation.
Findings across wines:
The majority of acidity, pH, density, and alcohol levels are moderate.
Most wines have lower to moderate sugar levels.
People who buy cases of wine generally buy 2-6 cases, 4 being the most common.
Which variables drive cases sold?
Below is the average of each variable by number of cases purchased, which lets us observe how the ingredients used in a wine relate to how much customers buy.
There are outliers, driven by the low number of observations at some levels of cases sold (shown in the density graph above), mainly at 8 cases sold. The interpretation should therefore focus on the levels with more data points (1-7 cases sold).
Below are some interesting findings:
- A higher star rating translates well into a higher quantity of cases purchased.
- Customers tend to prefer wines with higher alcohol content.
- Label appeal intrigues customers but does not always translate to sales.
- Customers prefer less acidic wines.
- Sweeter wines (higher residual sugar) generally sell better, but there are exceptions.
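The per-level averages behind these findings can be computed with a grouped mean. This is a small sketch on made-up data; the toy values and the object name `avg_by_cases` are assumptions, and the report's own table may have been built differently.

```r
# Toy data: cases sold (TARGET) plus two predictors. Values are illustrative only.
toy <- data.frame(
  TARGET    = c(3, 3, 4, 4, 5, 5),
  Alcohol   = c(9, 11, 10, 12, 12, 14),
  AcidIndex = c(9, 9, 8, 8, 7, 7)
)

# Mean of each variable within each level of cases sold.
avg_by_cases <- aggregate(cbind(Alcohol, AcidIndex) ~ TARGET,
                          data = toy, FUN = mean)
print(avg_by_cases)
```

Adding a count per level (for example with `table(toy$TARGET)`) makes it easier to spot levels, such as 8 cases sold, where the average rests on very few observations.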
Trend deep dive
Star Rating
Higher star ratings drive a larger number of cases purchased
3-4 star wines primarily sell 4-6 cases, but 2-3 star wines sell the most in total quantity.
8 cases seem to be a rarity even at 4 star wines.
Label Appeal
Increasing label appeal doesn't necessarily drive the number of cases purchased.
Labels with a rating of 3 show a wide range of amounts sold. They have the highest proportion of 6-8 cases sold, but this doesn't translate into higher overall volume.
Acidity Index
- Generally lower acidity levels are preferred and sell in higher quantities.
Correlation Analysis
Star rating is the indicator most highly correlated with the number of cases sold: generally, the higher the star rating, the higher the chance of a larger number of cases sold. Acidity level is another variable that shows a meaningful level of correlation, where lower acidity levels generally drive sales.
All other variables are not particularly correlated with the number of cases sold.
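A ranking like the one described above can be produced from a correlation matrix. The sketch below uses simulated data in which star rating and acid index are built to matter and density is pure noise; the data-generating numbers are assumptions chosen only so the pattern is visible, and Pearson correlation is assumed since the report does not name the method.

```r
set.seed(1)
n <- 1000
stars      <- sample(1:4, n, replace = TRUE)
acid_index <- sample(6:10, n, replace = TRUE)

# Simulated outcome: rises with stars, falls with acid index (toy relationship).
toy <- data.frame(
  TARGET    = 3 + stars - 0.5 * acid_index + rnorm(n),
  STARS     = stars,
  AcidIndex = acid_index,
  Density   = rnorm(n, mean = 1, sd = 0.03)
)

# Correlation of every predictor with TARGET, ranked by absolute size.
cor_with_target <- cor(toy)[, "TARGET"]
cor_with_target <- cor_with_target[names(cor_with_target) != "TARGET"]
ranked <- cor_with_target[order(abs(cor_with_target), decreasing = TRUE)]
print(round(ranked, 2))
```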
3. Model Development
Training Data Rows: 10236
Testing Data Rows: 2559
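The row counts above correspond to an 80/20 split of 12,795 rows. The splitting code is not shown in the report, so this is only a sketch of one common approach with a random index; the seed and object names are assumptions.

```r
set.seed(42)                       # assumed seed, for reproducibility only
n_rows <- 12795                    # 10236 + 2559, from the counts above
toy <- data.frame(id = seq_len(n_rows))

# Randomly pick 80% of the row positions for training; the rest go to testing.
train_idx <- sample(seq_len(n_rows), size = round(0.8 * n_rows))
tr <- toy[train_idx, , drop = FALSE]
te <- toy[-train_idx, , drop = FALSE]

cat("Training Data Rows:", nrow(tr), "\n")
cat("Testing Data Rows:", nrow(te), "\n")
```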
Multiple Linear Regression Models
Model 1
The first linear regression model leverages all available variables from the dataset to predict the number of cases purchased. This model includes a range of predictors such as acidity levels, sweetness, label appeal, and star ratings. Incorporating them allows the model to capture as much information as possible about which variables contribute most to the number of cases purchased. Model 1 will use STARS and LabelAppeal as factors in order to keep 0 star ratings and have all ratings compared to 1 (the baseline).
Model 1 will serve as the baseline for the model development.
#tr_dt$STARS <- as.numeric(tr_dt$STARS)
# Having NAs become 0
tr_dt$STARS[is.na(tr_dt$STARS)] <- 0
tr_dt$STARS <- as.factor(tr_dt$STARS)
# Convert STARS to a factor, ensuring 1 is the baseline
tr_dt$STARS <- factor(tr_dt$STARS, levels = c(1, 2, 3, 4, 0)) # 1 as baseline, 0 last
tr_dt$LabelAppeal <- as.factor(tr_dt$LabelAppeal)
Call:
lm(formula = TARGET ~ TotalSulfurDioxide + CitricAcid + Density +
pH + Alcohol + LabelAppeal + AcidIndex + STARS, data = tr_dt)
Residuals:
Min 1Q Median 3Q Max
-3.9843 -0.9611 0.0707 0.9296 6.6723
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.863e+00 5.151e-01 9.442 <2e-16 ***
TotalSulfurDioxide 2.054e-04 8.516e-05 2.411 0.0159 *
CitricAcid 4.986e-02 2.223e-02 2.243 0.0249 *
Density -9.099e-01 5.070e-01 -1.795 0.0727 .
pH -3.665e-02 2.035e-02 -1.801 0.0718 .
Alcohol 7.115e-03 3.840e-03 1.853 0.0639 .
LabelAppeal2 -1.798e-02 2.820e-02 -0.638 0.5236
LabelAppeal3 8.196e-02 5.230e-02 1.567 0.1172
AcidIndex -1.822e-01 1.030e-02 -17.690 <2e-16 ***
STARS2 1.192e+00 3.781e-02 31.517 <2e-16 ***
STARS3 1.931e+00 4.273e-02 45.200 <2e-16 ***
STARS4 2.784e+00 6.748e-02 41.263 <2e-16 ***
STARS0 -1.324e+00 3.841e-02 -34.458 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.365 on 10223 degrees of freedom
Multiple R-squared: 0.4988, Adjusted R-squared: 0.4982
F-statistic: 847.8 on 12 and 10223 DF, p-value: < 2.2e-16
Model 1 Results
The Adjusted R-squared is 0.4982, indicating that the model explains about 49.8% of the variance in the number of cases of wine purchased, which suggests the predictors have moderate explanatory power for this outcome. The F statistic of 847.8 with a p-value below 2.2e-16 indicates that the predictors collectively have a statistically significant relationship with the dependent variable.
Star rating has the largest magnitude among the predictors, consistent with the trends observed in the graphs above. Wines rated 4 stars have the highest predicted increase in purchases (2.784 cases) compared to wines with 1 star (the reference category). Missing ratings, which were assigned a 0 star rating, carry a negative coefficient (-1.324) and are associated with fewer cases purchased. AcidIndex also has a notable effect: a 1-unit increase in AcidIndex is associated with a decrease of 0.182 cases purchased.
Both star rating and acid index are highly significant (p < 2e-16). Total sulfur dioxide and citric acid are significant at the 5% level but with small magnitudes, while alcohol, density, and pH are only marginally significant (p between 0.06 and 0.08), partly reflecting the units they are measured in. The label appeal levels are not statistically significant.
Directionally, the predictors all align with what is expected.
Checking Multicollinearity
Multicollinearity occurs when two or more predictor (independent) variables are correlated with each other. It does not appear to be a concern here: all values are close to 1.
GVIF Df GVIF^(1/(2*Df))
TotalSulfurDioxide 1.005509 1 1.002751
CitricAcid 1.002354 1 1.001176
Density 1.002148 1 1.001073
pH 1.005469 1 1.002731
Alcohol 1.008192 1 1.004088
LabelAppeal 1.008454 2 1.002107
AcidIndex 1.044207 1 1.021864
STARS 1.048251 4 1.005908
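To make the meaning of these values concrete, here is a base R sketch that computes a plain variance inflation factor by hand: each predictor is regressed on the others and VIF = 1 / (1 - R²). It covers numeric predictors only; the generalized VIF printed above for factor terms is a different calculation. The helper name `vif_manual` and the simulated data are assumptions for illustration.

```r
# VIF for each column: regress it on all other columns and use 1 / (1 - R^2).
vif_manual <- function(data) {
  sapply(names(data), function(v) {
    others <- setdiff(names(data), v)
    fit <- lm(reformulate(others, response = v), data = data)
    1 / (1 - summary(fit)$r.squared)
  })
}

set.seed(7)
x1 <- rnorm(500)
x2 <- rnorm(500)                    # independent of x1
x3 <- x1 + rnorm(500, sd = 0.1)     # almost a copy of x1

vif_indep     <- vif_manual(data.frame(x1, x2))      # close to 1
vif_collinear <- vif_manual(data.frame(x1, x2, x3))  # x1 and x3 inflate sharply
print(round(vif_indep, 3))
print(round(vif_collinear, 1))
```

Values near 1, as in the report, mean each predictor carries information that the others do not already explain.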
Model 2
Model 2 will similarly use all the variables but will apply logs to variables that show skewness. Instead of treating star rating and label appeal as factors, they will be converted to numeric. As a result, all observations with a star rating of 0 (N/A) will be removed so that the model does not treat 0 as a genuine rating on the same scale.
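The preparation for this model is not shown, so the following sketch is an assumption about how it could be done. Because some ingredient columns contain zeros, `log1p()` (that is, log(x + 1)) is used here to avoid `-Inf`; the report may have used a different transform. Note that a factor must go through `as.character()` before `as.numeric()`, otherwise the internal level codes are returned instead of the ratings.

```r
# Toy rows; STARS uses the same level order as the factor created earlier.
toy <- data.frame(
  CitricAcid = c(0, 0.4, 1.6, 3.9),
  AcidIndex  = c(6, 8, 9, 17),
  STARS      = factor(c(1, 0, 3, 4), levels = c(1, 2, 3, 4, 0))
)

# Log-transform skewed variables; log1p keeps zero quantities finite (assumed choice).
toy$log_CitricAcid <- log1p(toy$CitricAcid)
toy$log_AcidIndex  <- log1p(toy$AcidIndex)

# Drop the rows whose rating was missing (coded 0), then make STARS numeric.
toy_filtered <- toy[as.character(toy$STARS) != "0", ]
toy_filtered$STARS <- as.numeric(as.character(toy_filtered$STARS))
print(toy_filtered)
```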
Call:
lm(formula = TARGET ~ log_CitricAcid + Alcohol + LabelAppeal +
log_AcidIndex + log_TotalSulfurDioxide + log_ResidualSugar +
STARS, data = tr_dt2_filtered)
Residuals:
Min 1Q Median 3Q Max
-3.8029 -0.6898 0.2466 0.7705 3.9526
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.970601 0.281052 14.128 < 2e-16 ***
log_CitricAcid 0.128166 0.047919 2.675 0.007497 **
Alcohol 0.013892 0.004157 3.342 0.000837 ***
LabelAppeal 0.036643 0.023748 1.543 0.122879
log_AcidIndex -1.238379 0.114116 -10.852 < 2e-16 ***
log_TotalSulfurDioxide 0.028793 0.017666 1.630 0.103175
log_ResidualSugar 0.010172 0.012900 0.789 0.430388
STARS 0.946104 0.016325 57.956 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.276 on 7540 degrees of freedom
Multiple R-squared: 0.328, Adjusted R-squared: 0.3274
F-statistic: 525.8 on 7 and 7540 DF, p-value: < 2.2e-16
Model 2 Results
The Adjusted R-squared is 0.3274, indicating that the model explains about 32.7% of the variance in the number of cases of wine purchased, which suggests the predictors have moderate explanatory power for this outcome. The F statistic of 525.8 with a p-value below 2.2e-16 indicates that the predictors collectively have a statistically significant relationship with the dependent variable.
Compared to model 1, log Acid Index shows the highest magnitude at -1.24, indicating that a 1-unit increase in the log Acid Index is associated with a decrease of 1.24 wine cases purchased. Similarly, for every one-point increase in star rating, wine cases sold increase by 0.95. Other variables such as label appeal continue to show small magnitudes and no statistical significance. This would imply customers are more focused on the quality of the wine than on the appearance of the label.
While alcohol has a lower magnitude, it, along with log acid index, star rating, and log citric acid, is statistically significant.
Directionally, the predictors all align with what is expected.
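Because a 1-unit change in a logged predictor corresponds to multiplying the original variable by about 2.72, it can be easier to read the log Acid Index coefficient in terms of a percentage change. As a worked example, and assuming the transform is a plain natural log, a 10% increase in Acid Index changes the predicted number of cases by:

```latex
\[
\Delta \widehat{\text{TARGET}} = \beta_{\log\text{AcidIndex}} \cdot \ln(1.10)
\approx -1.238 \times 0.0953 \approx -0.118
\]
```

That is, roughly 0.12 fewer cases for a 10% higher Acid Index, holding the other predictors fixed.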
Checking Multicollinearity
Multicollinearity occurs when two or more predictor (independent) variables are correlated with each other. It does not appear to be a concern here: all values are close to 1.
vif_linear_model
log_CitricAcid 1.003618
Alcohol 1.009864
LabelAppeal 1.000672
log_AcidIndex 1.015602
log_TotalSulfurDioxide 1.006880
log_ResidualSugar 1.002388
STARS 1.011588
Poisson Regression Models
Model 1
The first Poisson regression model leverages the findings from the linear models and focuses on the most significant and impactful variables: citric acid, alcohol, acid index, and stars.
Call:
glm(formula = TARGET ~ CitricAcid + Alcohol + AcidIndex + STARS,
family = poisson, data = tr_dt2_filtered)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.044411 0.048428 21.566 < 2e-16 ***
CitricAcid 0.014349 0.009854 1.456 0.1453
Alcohol 0.003431 0.001698 2.021 0.0433 *
AcidIndex -0.041518 0.005310 -7.819 5.32e-15 ***
STARS 0.246004 0.006419 38.326 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for poisson family taken to be 1)
Null deviance: 6899.5 on 7547 degrees of freedom
Residual deviance: 5314.1 on 7543 degrees of freedom
AIC: 27789
Number of Fisher Scoring iterations: 5
Model 1 Results
The model has an AIC of 27,789. Coefficient magnitudes are lower than in the linear model because they are on the log scale. A one-unit increase in Stars increases the log count of cases sold by 0.2460. Acid index continues to have one of the largest impacts after star rating. The coefficients all look directionally in line, but only Acid index and Stars are highly significant; alcohol is significant at the 5% level but has a low magnitude. As in linear model 2, the data set is filtered to exclude missing stars, given that model performance improved with this change.
Checking for Overdispersion
Mean of TARGET: 3.684817
Variance of TARGET: 2.419926
No overdispersion detected.
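Comparing the raw mean and variance of TARGET is a quick marginal check. A complementary check, sketched below on simulated data, uses the fitted model: the sum of squared Pearson residuals divided by the residual degrees of freedom should be close to 1 for a well-specified Poisson model and well above 1 under overdispersion. The simulated data and object names are assumptions for illustration.

```r
# Marginal check using the figures reported above.
target_mean <- 3.684817
target_var  <- 2.419926
ratio <- target_var / target_mean   # below 1: variance does not exceed the mean
print(round(ratio, 3))

# Model-based check on simulated Poisson data (toy example).
set.seed(10)
x <- rnorm(2000)
y <- rpois(2000, lambda = exp(1 + 0.3 * x))
fit <- glm(y ~ x, family = poisson)

# Pearson dispersion statistic: about 1 when the Poisson assumption holds.
dispersion <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)
print(round(dispersion, 3))
```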
Checking Multicollinearity
Multicollinearity occurs when two or more predictor (independent) variables are correlated with each other. It does not appear to be a concern here: all values are close to 1.
vif_poisson_model
CitricAcid 1.000809
Alcohol 1.009386
AcidIndex 1.009406
STARS 1.011247
Model 2
Poisson model 2 is a zero-inflated Poisson model that makes all the variables available, which gives an indication of how well the zero-inflated approach works with these types of variables. Similar to past models, this data set is filtered to exclude missing stars, given that model performance improved with this change.
Call:
zeroinfl(formula = TARGET ~ TotalSulfurDioxide + CitricAcid + Density +
pH + Alcohol + LabelAppeal + AcidIndex + STARS | TotalSulfurDioxide +
CitricAcid + Density + ResidualSugar, data = tr_dt2_filtered, dist = "poisson")
Pearson residuals:
Min 1Q Median 3Q Max
-1.9443 -0.3305 0.1133 0.4335 2.4248
Count model coefficients (poisson with log link):
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.281e+00 2.366e-01 5.415 6.14e-08 ***
TotalSulfurDioxide -6.146e-05 3.817e-05 -1.610 0.1073
CitricAcid 9.240e-03 1.007e-02 0.918 0.3586
Density -2.353e-01 2.317e-01 -1.015 0.3099
pH -2.846e-03 9.177e-03 -0.310 0.7564
Alcohol 3.781e-03 1.714e-03 2.205 0.0274 *
LabelAppeal 9.194e-03 9.780e-03 0.940 0.3472
AcidIndex -3.500e-02 5.495e-03 -6.370 1.89e-10 ***
STARS 2.312e-01 6.744e-03 34.278 < 2e-16 ***
Zero-inflation model coefficients (binomial with logit link):
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.706050 5.423403 -0.315 0.7531
TotalSulfurDioxide -0.026020 0.004137 -6.289 3.2e-10 ***
CitricAcid -0.466060 0.338411 -1.377 0.1685
Density 0.787764 5.431048 0.145 0.8847
ResidualSugar -0.013240 0.007246 -1.827 0.0677 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Number of iterations in BFGS optimization: 35
Log-likelihood: -1.384e+04 on 14 Df
Model 2 Results
In order to understand the model performance better, we can convert the log-likelihood to AIC, where K is the number of estimated parameters (14 here):
\(\mathrm{AIC} = 2K - 2(\text{LogLikelihood})\)
\(\mathrm{AIC} = 2 \times 14 - 2(-13{,}840) = 27{,}708\)
Compared with the AIC of 27,789 from Model 1, the lower value indicates that the zero-inflated Poisson fits the data better.
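The same conversion can be written as a one-line helper. This sketch uses the unrounded log-likelihoods and parameter counts reported in this document (14 parameters for the zero-inflated model, 5 for the first Poisson model); the function name `aic_from_loglik` is hypothetical.

```r
# AIC = 2K - 2 * log-likelihood
aic_from_loglik <- function(loglik, k) 2 * k - 2 * loglik

zip_aic     <- aic_from_loglik(loglik = -13839.76, k = 14)  # zero-inflated Poisson
poisson_aic <- aic_from_loglik(loglik = -13889.69, k = 5)   # Poisson Model 1

print(c(ZIP = zip_aic, Poisson = poisson_aic))
print(zip_aic - poisson_aic)   # negative difference favors the ZIP model
```

The extra nine parameters cost 18 AIC points, so the gain in log-likelihood has to more than cover that penalty for the ZIP model to come out ahead.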
Significance levels in the count model are largely unchanged, and Total Sulfur Dioxide is statistically significant in the zero-inflation component. Magnitudes have shifted across the board, but Star rating and Acid index remain the most significant variables with the highest magnitudes. For every unit increase in star rating, the expected number of wine cases sold increases by approximately 26% (\(e^{0.2312} - 1 \approx 0.26\)).
For total sulfur dioxide, on the other hand, every unit increase decreases the expected number of wine cases sold by approximately 0.006% (\(e^{-0.00006146} - 1 \approx -0.00006\)). This is interesting because in linear model 1 this variable had a positive impact on cases sold, while here it is negative, though minimally so. In the zero-inflation component, the negative coefficient indicates that higher total sulfur dioxide reduces the likelihood that cases sold are 0.
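The percentage interpretations can be checked by exponentiating the coefficients from the output above. This is a minimal sketch: the coefficient values are copied from the printed count and zero-inflation tables, and the vector names are chosen here for illustration.

```r
# Count-model coefficients copied from the zero-inflated Poisson output.
count_coefs <- c(STARS = 0.2312,
                 AcidIndex = -0.0350,
                 TotalSulfurDioxide = -6.146e-05)

# Percent change in expected cases sold per one-unit increase in each predictor.
pct_change <- (exp(count_coefs) - 1) * 100
print(round(pct_change, 4))

# Zero-inflation part: odds ratio for being in the "always zero" group
# per one-unit increase in total sulfur dioxide (coefficient -0.026020).
zero_odds_ratio <- exp(-0.026020)
print(round(zero_odds_ratio, 4))
```

An odds ratio below 1 in the zero-inflation part means higher total sulfur dioxide lowers the odds of a structural zero.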
Negative Binomial regression Models
Model 1
Model 1 uses a subset of the original variables (citric acid, alcohol, acid index, and stars), excluding sugar as it has historically performed poorly. Similar to past models, this data set is filtered to exclude missing stars, given that model performance improved with this change.
Family: nbinom2 ( log )
Formula: TARGET ~ CitricAcid + Alcohol + AcidIndex + STARS
Data: tr_dt2_filtered
AIC BIC logLik deviance df.resid
NA NA NA NA 7542
Dispersion parameter for nbinom2 family (): 4.22e+07
Conditional model:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.044411 0.048427 21.57 < 2e-16 ***
CitricAcid 0.014349 0.009854 1.46 0.1453
Alcohol 0.003431 0.001697 2.02 0.0433 *
AcidIndex -0.041518 0.005310 -7.82 5.32e-15 ***
STARS 0.246004 0.006419 38.33 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Model 1 Results
The AIC for model 1 is 27,791, which is higher than the zero-inflated Poisson model's 27,708.
Model 1 shows similar trends to past models, with stars and acid index the most influential variables in determining cases of wine sold. In this model, a one-unit increase in Stars is associated with a 27.89% increase in the expected count of wine cases sold (\(e^{0.246} - 1 \approx 0.2789\)). Stars and Acid Index are both highly significant; alcohol is also significant but with a much lower magnitude. The directions of the variables align with prior models and are as expected.
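The coefficients above match Poisson Model 1 to the printed precision, and the very large dispersion parameter explains why. Assuming the output uses the quadratic (nbinom2) parameterization, with dispersion parameter written here as \(\theta\), the variance of the negative binomial is:

```latex
\[
\operatorname{Var}(Y) = \mu\left(1 + \frac{\mu}{\theta}\right),
\qquad
\theta = 4.22 \times 10^{7}
\;\Rightarrow\;
\frac{\mu}{\theta} \approx \frac{3.68}{4.22 \times 10^{7}} \approx 8.7 \times 10^{-8}
\]
```

With the extra term this close to zero, the variance is effectively equal to the mean, so the negative binomial collapses to the Poisson model. This is consistent with the earlier finding of no overdispersion.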
Model 2
Model 2 will leverage the variables from model 1 and apply a log to Acid Index and Citric Acid in an attempt to improve the model. Similar to past models, this data set is filtered to exclude missing stars, given that model performance improved with this change.
Family: nbinom2 ( log )
Formula: TARGET ~ log_CitricAcid + Alcohol + log_AcidIndex + STARS
Data: tr_dt2_filtered
AIC BIC logLik deviance df.resid
NA NA NA NA 7542
Dispersion parameter for nbinom2 family (): 2.99e+07
Conditional model:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.492141 0.107084 13.93 < 2e-16 ***
log_CitricAcid 0.035357 0.019428 1.82 0.0688 .
Alcohol 0.003431 0.001698 2.02 0.0433 *
log_AcidIndex -0.359546 0.047886 -7.51 5.99e-14 ***
STARS 0.246281 0.006417 38.38 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Model 2 Results
The AIC for model 2 is 27,796 which is in line with model 1.
The interpretation of model 2 is aligned with model 1 in significance, direction, and overall magnitude. A one-unit increase in Stars is associated with a 27.9% increase in the expected count of wine cases sold (\(e^{0.2463} - 1 \approx 0.279\)). What did shift slightly were the magnitudes of the two logged variables (acid index and citric acid).
4. Choosing the best Model
Comparing the results
Excluding the linear models, the table compares four statistical models (Poisson, Zero-Inflated Poisson (ZIP), and two Negative Binomial (NB) models) on four performance metrics: Akaike Information Criterion (AIC), Log-Likelihood, Root Mean Square Error (RMSE), and Mean Absolute Error (MAE).
The Zero-Inflated Poisson model demonstrates the best performance across all metrics, with the lowest AIC (27,708) and the highest Log-Likelihood (-13,840), indicating the best balance between model fit and complexity. Additionally, the ZIP model achieves the lowest RMSE (1.289423) and MAE (1.011987), reflecting more accurate predictions than the other models. These results highlight the importance of accounting for excess zeros in the data: the ZIP model addresses this characteristic directly, making it the most suitable choice for predicting the number of wine cases sold in this data set.
Aspect | RMSE | MAE |
---|---|---|
Weighting of Errors | Penalizes large errors more (squares them). | Treats all errors equally. |
Sensitivity | Sensitive to outliers. | Less sensitive to outliers. |
Use Case | When large errors are critical. | When all errors are equally important. |
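The difference between the two metrics is easiest to see on a small example. The sketch below defines both in base R and applies them to made-up predictions where one error is much larger than the rest; the function names and toy vectors are assumptions for illustration.

```r
# Root mean square error: squares each error, so large misses dominate.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Mean absolute error: every unit of error counts the same.
mae <- function(actual, predicted) mean(abs(actual - predicted))

actual    <- c(3, 4, 2, 5, 0)
predicted <- c(3, 4, 3, 4, 4)   # errors: 0, 0, -1, 1, -4

print(c(RMSE = rmse(actual, predicted), MAE = mae(actual, predicted)))
```

Here the single 4-case miss pulls RMSE well above MAE, which is the sensitivity to large errors described in the table.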
Model | AIC | Log-Likelihood | RMSE | MAE |
---|---|---|---|---|
Poisson Model | 27789.38 | -13889.69 | 1.291731 | 1.018105 |
Zero Inflated Poisson | 27708.00 | -13839.76 | 1.289423 | 1.011987 |
Negative Binomial | NA | NA | 1.291731 | 1.018105 |
Negative Binomial 2 | NA | NA | 1.292190 | 1.018447 |
Making predictions with the Zero-Inflated Poisson Model on the evaluation data
Overall, when comparing the predicted wine cases sold with the actual values in the testing data, the model performs very well in the 3-6 cases range but seems to overestimate in the 7-10 cases range.
Statistic | Predicted Value | Actual Value |
---|---|---|
Min | 3.00 | 0.00 |
1st Quartile | 4.00 | 2.00 |
Median | 4.00 | 3.00 |
Mean | 5.32 | 3.04 |
3rd Quartile | 8.00 | 4.00 |
Max | 10.00 | 8.00 |