data <- read.csv("C:\\Users\\gajaw\\OneDrive\\Desktop\\STATS\\vgsales.csv")
The code below converts Global_Sales into a binary variable: games with global sales above a threshold of 20 are labeled “High-Sales” (1) and all others “Low-Sales” (0).
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
data <- data %>%
  mutate(High_Sales = ifelse(Global_Sales > 20, 1, 0))
str(data)
## 'data.frame': 16598 obs. of 12 variables:
## $ Rank : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Name : chr "Wii Sports" "Super Mario Bros." "Mario Kart Wii" "Wii Sports Resort" ...
## $ Platform : chr "Wii" "NES" "Wii" "Wii" ...
## $ Year : chr "2006" "1985" "2008" "2009" ...
## $ Genre : chr "Sports" "Platform" "Racing" "Sports" ...
## $ Publisher : chr "Nintendo" "Nintendo" "Nintendo" "Nintendo" ...
## $ NA_Sales : num 41.5 29.1 15.8 15.8 11.3 ...
## $ EU_Sales : num 29.02 3.58 12.88 11.01 8.89 ...
## $ JP_Sales : num 3.77 6.81 3.79 3.28 10.22 ...
## $ Other_Sales : num 8.46 0.77 3.31 2.96 1 0.58 2.9 2.85 2.26 0.47 ...
## $ Global_Sales: num 82.7 40.2 35.8 33 31.4 ...
## $ High_Sales : num 1 1 1 1 1 1 1 1 1 1 ...
data$Platform <- as.factor(data$Platform)
data$Genre <- as.factor(data$Genre)
data$Publisher <- as.factor(data$Publisher)
# Logistic regression
model <- glm(High_Sales ~ Platform + Genre, data = data, family = "binomial")
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
# Summary
summary(model)
##
## Call:
## glm(formula = High_Sales ~ Platform + Genre, family = "binomial",
## data = data)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.394e+01 6.781e+03 -0.004 0.997
## Platform3DO 7.984e-01 4.380e+04 0.000 1.000
## Platform3DS 1.366e-01 7.595e+03 0.000 1.000
## PlatformDC 3.288e-01 1.223e+04 0.000 1.000
## PlatformDS 1.737e+01 6.781e+03 0.003 0.998
## PlatformGB 1.990e+01 6.781e+03 0.003 0.998
## PlatformGBA -2.008e-01 7.298e+03 0.000 1.000
## PlatformGC -1.793e-01 7.528e+03 0.000 1.000
## PlatformGEN -1.034e-01 1.600e+04 0.000 1.000
## PlatformGG -9.933e-01 7.975e+04 0.000 1.000
## PlatformN64 -2.129e-01 8.043e+03 0.000 1.000
## PlatformNES 1.930e+01 6.781e+03 0.003 0.998
## PlatformNG 2.479e+00 1.995e+04 0.000 1.000
## PlatformPC 3.236e-01 7.207e+03 0.000 1.000
## PlatformPCFX -1.885e-01 7.975e+04 0.000 1.000
## PlatformPS 2.629e-03 7.131e+03 0.000 1.000
## PlatformPS2 1.583e+01 6.781e+03 0.002 0.998
## PlatformPS3 1.636e+01 6.781e+03 0.002 0.998
## PlatformPS4 7.567e-02 7.987e+03 0.000 1.000
## PlatformPSP 2.679e-01 7.118e+03 0.000 1.000
## PlatformPSV 3.580e-01 7.736e+03 0.000 1.000
## PlatformSAT 4.657e-01 8.791e+03 0.000 1.000
## PlatformSCD -2.441e-01 3.215e+04 0.000 1.000
## PlatformSNES 1.795e+01 6.781e+03 0.003 0.998
## PlatformTG16 7.671e-01 5.191e+04 0.000 1.000
## PlatformWii 1.825e+01 6.781e+03 0.003 0.998
## PlatformWiiU 2.363e-03 9.385e+03 0.000 1.000
## PlatformWS 3.250e-01 3.147e+04 0.000 1.000
## PlatformX360 1.638e+01 6.781e+03 0.002 0.998
## PlatformXB -1.382e-01 7.297e+03 0.000 1.000
## PlatformXOne 6.461e-03 8.617e+03 0.000 1.000
## GenreAdventure -1.601e+01 1.977e+03 -0.008 0.994
## GenreFighting -1.568e+01 2.411e+03 -0.007 0.995
## GenreMisc 4.679e-01 9.208e-01 0.508 0.611
## GenrePlatform 1.370e+00 9.159e-01 1.496 0.135
## GenrePuzzle 5.030e-02 1.252e+00 0.040 0.968
## GenreRacing 1.101e+00 1.007e+00 1.094 0.274
## GenreRole-Playing 5.652e-01 1.033e+00 0.547 0.584
## GenreShooter 4.785e-01 1.235e+00 0.388 0.698
## GenreSimulation 1.483e-01 1.239e+00 0.120 0.905
## GenreSports 7.849e-01 8.743e-01 0.898 0.369
## GenreStrategy -1.584e+01 2.634e+03 -0.006 0.995
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 308.83 on 16597 degrees of freedom
## Residual deviance: 245.86 on 16556 degrees of freedom
## AIC: 329.86
##
## Number of Fisher Scoring iterations: 22
data$predicted_prob <- predict(model, type = "response")
head(data)
## Rank Name Platform Year Genre Publisher NA_Sales
## 1 1 Wii Sports Wii 2006 Sports Nintendo 41.49
## 2 2 Super Mario Bros. NES 1985 Platform Nintendo 29.08
## 3 3 Mario Kart Wii Wii 2008 Racing Nintendo 15.85
## 4 4 Wii Sports Resort Wii 2009 Sports Nintendo 15.75
## 5 5 Pokemon Red/Pokemon Blue GB 1996 Role-Playing Nintendo 11.27
## 6 6 Tetris GB 1989 Puzzle Nintendo 23.20
## EU_Sales JP_Sales Other_Sales Global_Sales High_Sales predicted_prob
## 1 29.02 3.77 8.46 82.74 1 0.007322416
## 2 3.58 6.81 0.77 40.24 1 0.036343972
## 3 12.88 3.79 3.31 35.82 1 0.010018136
## 4 11.01 3.28 2.96 33.00 1 0.007322416
## 5 8.89 10.22 1.00 31.37 1 0.029849393
## 6 2.26 4.22 0.58 30.26 1 0.018053043
A logistic regression model is fitted to predict High_Sales from Platform and Genre.
The predict function with type = "response" then returns the fitted probability of high sales for each game.
**Insight**: The predicted_prob column holds the model's estimated probability that a game is high-selling (High_Sales = 1). The first game, for example, has a predicted probability of 0.0073 even though it is the best-selling title in the data set. Because only Platform and Genre enter the model, every Wii sports game receives this same probability (rows 1 and 4 match exactly), so the value reflects how rare high sales are within that platform and genre rather than anything about the individual game.
**Significance**: Low probabilities for the actual top sellers show the model's limits. With a threshold of 20, high-sales games are very rare, so fitted probabilities stay close to zero for every platform and genre combination, and a classifier with the usual 0.5 cutoff would label none of the six games above as high-selling. Predictors such as marketing, brand popularity, or game ratings are not in the model and might help separate top sellers from the rest.
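One way to put small predicted probabilities in context is to compare them with the overall share of high-sales games. The sketch below uses a small hypothetical data frame (`toy`, not the vgsales file) and a derived `lift` column, both of which are illustrative assumptions; on the real data the same idea would start from mean(data$High_Sales).

```r
# Hypothetical toy data (not the vgsales file): 1 = high sales, 0 = low sales
toy <- data.frame(
  Genre = factor(rep(c("Sports", "Puzzle"), each = 10)),
  High_Sales = c(rep(c(1, 0), c(2, 8)),   # Sports: 2 of 10 are high-selling
                 rep(c(1, 0), c(1, 9)))   # Puzzle: 1 of 10 is high-selling
)

# Share of positives: what an intercept-only model would predict for every game
base_rate <- mean(toy$High_Sales)

# Fit the model and express each fitted probability relative to the base rate
toy_model <- glm(High_Sales ~ Genre, data = toy, family = "binomial")
toy$predicted_prob <- predict(toy_model, type = "response")
toy$lift <- toy$predicted_prob / base_rate

print(table(toy$High_Sales))
print(unique(toy[, c("Genre", "predicted_prob", "lift")]))
```

A lift above 1 means the model rates a game as more likely than average to be high-selling, even when the probability itself is small.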
**Further Questions**:
What additional variables could better explain high sales outcomes?
Would including regional sales proportions (e.g., sales distribution across regions) improve model accuracy?
exp_intercept <- exp(coef(model)[1])
cat("Exponentiated Intercept:", exp_intercept, "\n")
## Exponentiated Intercept: 3.997273e-11
The intercept is the log-odds of “High Sales” when Platform and Genre are both at their reference levels (the first level of each factor; Action for Genre, since it is the genre without a coefficient in the table).
Exponentiating gives odds of about 4 × 10^-11, which is effectively zero. Odds this small, together with a standard error of 6,781 on the intercept, suggest that the reference platform contains no high-sales games at all, so the estimate is drifting toward negative infinity rather than measuring a real baseline rate.
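As a small worked example of the step from log-odds to odds to probability, the sketch below takes the intercept as printed in the summary (rounded to -23.94, so the result differs slightly from the unrounded output) and converts it by hand and with plogis(), the inverse-logit function in base R. The variable names are illustrative.

```r
# Intercept (log-odds) as printed in the model summary, rounded
b0 <- -23.94

odds <- exp(b0)             # odds of high sales at the reference levels
prob <- odds / (1 + odds)   # convert odds to a probability

# plogis() is the inverse logit and performs the same conversion in one step
prob_direct <- plogis(b0)

cat("Odds:", odds, " Probability:", prob, " plogis():", prob_direct, "\n")
```

When the odds are this small, the odds and the probability are almost identical, so either can be read as the baseline chance of high sales.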
library(broom)
coefficients <- tidy(model)
print(coefficients)
## # A tibble: 42 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) -23.9 6781. -0.00353 0.997
## 2 Platform3DO 0.798 43804. 0.0000182 1.00
## 3 Platform3DS 0.137 7595. 0.0000180 1.00
## 4 PlatformDC 0.329 12234. 0.0000269 1.00
## 5 PlatformDS 17.4 6781. 0.00256 0.998
## 6 PlatformGB 19.9 6781. 0.00293 0.998
## 7 PlatformGBA -0.201 7298. -0.0000275 1.00
## 8 PlatformGC -0.179 7528. -0.0000238 1.00
## 9 PlatformGEN -0.103 15996. -0.00000647 1.00
## 10 PlatformGG -0.993 79751. -0.0000125 1.00
## # ℹ 32 more rows
platform_wii_coef <- coefficients %>% filter(term == "PlatformWii")
estimate <- platform_wii_coef$estimate
std_error <- platform_wii_coef$std.error
lower_bound <- estimate - 1.96 * std_error
upper_bound <- estimate + 1.96 * std_error
cat("95% CI for PlatformWii coefficient: [", lower_bound, ",", upper_bound, "]\n")
## 95% CI for PlatformWii coefficient: [ -13271.56 , 13308.06 ]
95% CI for PlatformWii coefficient: [ -13271.56 , 13308.06 ]
Because this interval includes zero, the Wald test gives no evidence that the PlatformWii coefficient differs from zero. The interval is so wide, however, that it is better read as a symptom of an estimation problem than as a statement about the Wii platform.
A standard error of 6,781 on an estimate of 18.25, together with the “fitted probabilities numerically 0 or 1” warning and the 22 Fisher scoring iterations, is typical of quasi-complete separation: some factor levels contain no high-sales games, so their coefficients diverge and the Wald intervals become uninformative.
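An interval this wide often points to separation, where some level of a predictor contains only one outcome. A simple check is to cross-tabulate the outcome against each factor. The sketch below does this on a hypothetical toy data frame (not the vgsales file); on the real data the equivalent call would be table(data$Platform, data$High_Sales).

```r
# Hypothetical toy data (not the vgsales file): platform "A" has no high-sales games
toy <- data.frame(
  Platform = factor(rep(c("A", "B", "C"), each = 6)),
  High_Sales = c(0, 0, 0, 0, 0, 0,
                 1, 0, 0, 0, 0, 0,
                 1, 1, 0, 0, 0, 0)
)

# Count outcomes within each level of the predictor
counts <- table(toy$Platform, toy$High_Sales)
print(counts)

# A level whose games all share one outcome cannot support a finite coefficient
no_events <- rownames(counts)[counts[, "1"] == 0]
all_events <- rownames(counts)[counts[, "0"] == 0]
cat("Levels with no high-sales games:", no_events, "\n")
cat("Levels with only high-sales games:", all_events, "\n")
```

Levels flagged this way are candidates for merging into an "Other" category or for a penalized fit, since ordinary maximum likelihood has no finite estimate for them.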
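Insignificance of Most Coefficients: Nearly all coefficients have high p-values, but under separation these Wald p-values are unreliable, so they do not by themselves show that platform and genre are unrelated to high sales. The platform coefficients near 16 to 20 (DS, GB, NES, PS2, PS3, SNES, Wii, X360) share the intercept's standard error of 6,781, which suggests they mainly mark the platforms that have at least one high-sales game.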
Model Fit: The residual deviance falls from 308.83 to 245.86, a drop of 62.97 for 41 estimated coefficients, and the AIC is 329.86. The model therefore captures some of the variation in the outcome, but Platform and Genre alone are not sufficient to build a robust predictive model for high sales.
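The deviance figures can be turned into an overall likelihood-ratio test of the model against the intercept-only model. The sketch below uses only the numbers printed in the summary output; the variable names are illustrative, and the chi-squared approximation should be treated with caution when the outcome is this rare.

```r
# Deviances and degrees of freedom as printed in the summary output
null_dev  <- 308.83; null_df  <- 16597
resid_dev <- 245.86; resid_df <- 16556

lr_stat <- null_dev - resid_dev   # likelihood-ratio statistic
lr_df   <- null_df - resid_df     # number of slope coefficients estimated

# Upper-tail probability of the chi-squared distribution
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

cat("LR statistic:", lr_stat, "on", lr_df, "df; p-value:", p_value, "\n")
```

Unlike the individual Wald tests, this comparison does not depend on the inflated standard errors, so it gives a more usable answer to whether Platform and Genre jointly carry any information.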
Further investigation could involve adding relevant variables, testing interaction effects, or using more flexible models (e.g., random forests) to improve predictive accuracy, at some cost to interpretability. Exploring additional data on market trends or customer demographics may also yield better insights.
exp_wii_coef <- exp(estimate)
cat("Exponentiated PlatformWii Coefficient:", exp_wii_coef, "\n")
## Exponentiated PlatformWii Coefficient: 84181270
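For PlatformWii, the exponentiated coefficient is an odds ratio of roughly 84 million relative to the reference platform. Taken literally, it says the odds of high sales are far higher on the Wii, but a ratio this large is not a believable effect size; it arises because the baseline odds on the reference platform are estimated as essentially zero.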
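To see how little the data constrain this odds ratio, the Wald interval can be moved to the odds-ratio scale by exponentiating its endpoints. The sketch below uses the rounded estimate and standard error printed in the summary; `est_wii` and `se_wii` are illustrative names, not objects from the analysis above.

```r
# Estimate and standard error for PlatformWii as printed in the summary (rounded)
est_wii <- 18.25
se_wii  <- 6781

z <- qnorm(0.975)  # about 1.96 for a 95% interval

# Exponentiate the log-odds interval to get an interval for the odds ratio
or_ci <- exp(est_wii + c(-1, 1) * z * se_wii)

cat("Odds ratio:", exp(est_wii), "\n")
cat("95% Wald CI for the odds ratio: [", or_ci[1], ",", or_ci[2], "]\n")
```

In double-precision arithmetic the lower endpoint underflows to 0 and the upper endpoint overflows to Inf, which is a compact way of saying the interval carries no information.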
library(ggplot2)
coef_data <- tidy(model, conf.int = TRUE)
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: algorithm did not converge
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
## collapsing to unique 'x' values
coef_data <- coef_data %>% filter(term != "(Intercept)")
ggplot(coef_data, aes(x = term, y = estimate)) +
  geom_point() +
  geom_errorbar(aes(ymin = conf.low, ymax = conf.high), width = 0.2) +
  theme_minimal() +
  coord_flip() +
  labs(
    title = "Confidence Intervals for Logistic Regression Coefficients",
    x = "Coefficients",
    y = "Estimate (Log-Odds)"
  ) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
The plot shows the range of uncertainty for each coefficient. An interval that crosses zero means the coefficient is not statistically significant at the 5% level (equivalently, the 95% confidence interval includes no effect). Some intervals here are extremely wide, which reflects the separation problem signaled by the warnings rather than ordinary sampling variability.
exp_coef <- exp(coefficients$estimate)
coefficients <- cbind(coefficients, exp_coef)
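Exponentiating the coefficients converts log-odds into odds ratios, which describe the multiplicative change in the odds of high sales relative to the reference level. For example, GenrePlatform has an estimate of 1.370, so its odds ratio is exp(1.370), or about 3.9 times the odds for the reference genre. An odds ratio is not a percentage change in probability, and the very large or very small ratios produced by the unstable coefficients above should not be interpreted at all.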
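The distinction between a change in odds and a change in probability can be made concrete with a short calculation. The sketch below takes the GenrePlatform coefficient as printed in the summary and applies it to a few hypothetical baseline probabilities (chosen only for illustration).

```r
# Log-odds coefficient for GenrePlatform as printed in the summary output
b <- 1.370

odds_ratio <- exp(b)                          # multiplicative change in odds
pct_change_in_odds <- (odds_ratio - 1) * 100  # the same effect as a percentage

# Hypothetical baseline probabilities, chosen only for illustration
baseline_prob <- c(0.001, 0.01, 0.10)
baseline_odds <- baseline_prob / (1 - baseline_prob)

# Apply the coefficient on the log-odds scale, then convert back to probability
new_prob <- plogis(log(baseline_odds) + b)

result <- data.frame(baseline_prob, new_prob, prob_ratio = new_prob / baseline_prob)
print(result)
cat("Odds ratio:", odds_ratio, " Percent change in odds:", pct_change_in_odds, "\n")
```

The ratio of probabilities is always smaller than the odds ratio and shrinks as the baseline probability grows, which is why an odds ratio should be reported as a change in odds and not as a change in the chance of high sales.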