Nick Climaco
2023-05-08
Do developed countries have a significantly different electricity demand than developing countries?
https://www.kaggle.com/datasets/pralabhpoudel/world-energy-consumption
https://www.theglobaleconomy.com/rankings/human_development/
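\(H_0\): \(\mu_{\text{developed}} = \mu_{\text{developing}}\) (mean electricity demand is the same in developed and developing countries)
\(H_1\): \(\mu_{\text{developed}} \neq \mu_{\text{developing}}\) (mean electricity demand differs between the two groups)
Significance level: \(\alpha = 0.05\)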
##
## Welch Two Sample t-test
##
## data: electricity_demand by status
## t = 8.247, df = 1666.6, p-value = 3.257e-16
## alternative hypothesis: true difference in means between group developed and group developing is not equal to 0
## 95 percent confidence interval:
## 121.9851 198.1150
## sample estimates:
## mean in group developed mean in group developing
## 237.5535 77.5035
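A minimal sketch of how a test like the one above can be run and compared against \(\alpha\). The formula `electricity_demand ~ status` is taken from the output; the data frame `demo` and its simulated values are assumptions made only so the example runs on its own, and they are not the report's data.

```r
# Synthetic stand-in data (hypothetical); the report's data frame is not shown.
set.seed(1)
demo <- data.frame(
  status = factor(rep(c("developed", "developing"), times = c(200, 600))),
  electricity_demand = c(rnorm(200, mean = 240, sd = 300),
                         rnorm(600, mean = 80, sd = 150))
)

# t.test() uses var.equal = FALSE by default, which gives Welch's test.
welch <- t.test(electricity_demand ~ status, data = demo)

# Decision rule: reject H0 when the p-value falls below alpha.
alpha <- 0.05
reject_h0 <- welch$p.value < alpha

print(welch)
print(reject_h0)
```

Applied to the output above, the p-value of 3.257e-16 is far below 0.05, so \(H_0\) is rejected.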
## Df Sum Sq Mean Sq F value Pr(>F)
## status 1 20574690 20574690 84.92 <2e-16 ***
## Residuals 3963 960110844 242269
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 1148 observations deleted due to missingness
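With only two groups, a one-way ANOVA is equivalent to a pooled-variance t-test: the F statistic equals the square of that t statistic. The sketch below illustrates this on simulated data (the data frame `demo` is hypothetical, not the report's data). The Welch statistic reported earlier does not pool variances, which is one reason its square (8.247² ≈ 68.0) need not match the F value of 84.92.

```r
# Synthetic stand-in data (hypothetical), two groups of unequal size.
set.seed(2)
demo <- data.frame(
  status = factor(rep(c("developed", "developing"), times = c(150, 450))),
  electricity_demand = c(rnorm(150, mean = 240, sd = 300),
                         rnorm(450, mean = 80, sd = 150))
)

# One-way ANOVA with status as the only factor.
fit <- aov(electricity_demand ~ status, data = demo)
f_value <- summary(fit)[[1]][["F value"]][1]

# Pooled-variance (Student) t-test on the same data.
pooled <- t.test(electricity_demand ~ status, data = demo, var.equal = TRUE)
t_squared <- unname(pooled$statistic)^2

# The two quantities agree up to floating-point error.
print(all.equal(f_value, t_squared))
```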
##
## Call:
## lm(formula = avg_electricity_demand ~ year + status, data = ml_data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -59.108 -12.762 -3.931 11.225 61.249
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -6737.4198 732.4478 -9.198 5.34e-13 ***
## year 3.4704 0.3651 9.505 1.66e-13 ***
## statusdeveloping -154.7933 6.5315 -23.700 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 25.71 on 59 degrees of freedom
## Multiple R-squared: 0.917, Adjusted R-squared: 0.9142
## F-statistic: 326 on 2 and 59 DF, p-value: < 2.2e-16
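To read the coefficients, it can help to plug them into the fitted equation by hand. The sketch below copies the estimates from the summary above; the helper `predict_demand()` and the year 2015 are illustrative assumptions, not part of the original analysis.

```r
# Coefficient estimates copied from the regression summary above.
coefs <- c(intercept = -6737.4198, year = 3.4704, statusdeveloping = -154.7933)

# Hypothetical helper: fitted average demand for a given year and status.
predict_demand <- function(year, status) {
  is_developing <- as.numeric(status == "developing")
  unname(coefs["intercept"] + coefs["year"] * year +
           coefs["statusdeveloping"] * is_developing)
}

developed_2015  <- predict_demand(2015, "developed")
developing_2015 <- predict_demand(2015, "developing")

# The gap between groups is the same in every year: minus the status coefficient.
gap <- developed_2015 - developing_2015
print(c(developed = developed_2015, developing = developing_2015, gap = gap))
```

Under this model, average demand rises by about 3.47 units per year in both groups, and the developing group sits about 154.8 units below the developed group in any given year.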
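library(caret)  # provides createDataPartition()
# 70/30 train/test split that preserves the proportions of status
trainIndex <- createDataPartition(class_df$status, p = 0.7, list = FALSE)
train_data <- class_df[trainIndex, ]
test_data <- class_df[-trainIndex, ]
## Confusion Matrix and Statistics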
##
## Actual
## Predictions developed developing
## developed 317 61
## developing 6 859
##
## Accuracy : 0.9461
## 95% CI : (0.932, 0.958)
## No Information Rate : 0.7401
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.8672
##
## Mcnemar's Test P-Value : 4.191e-11
##
## Sensitivity : 0.9814
## Specificity : 0.9337
## Pos Pred Value : 0.8386
## Neg Pred Value : 0.9931
## Prevalence : 0.2599
## Detection Rate : 0.2550
## Detection Prevalence : 0.3041
## Balanced Accuracy : 0.9576
##
## 'Positive' Class : developed
##
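The headline statistics can be recomputed directly from the four cell counts, which makes their definitions explicit. The sketch below uses base R only and the counts from the table above, with "developed" as the positive class; the variable names are chosen for illustration.

```r
# Cell counts from the confusion matrix above (rows: predictions, columns: actual).
cm <- matrix(c(317, 6, 61, 859), nrow = 2,
             dimnames = list(Predictions = c("developed", "developing"),
                             Actual = c("developed", "developing")))

tp <- cm["developed", "developed"]    # predicted developed, actually developed
fn <- cm["developing", "developed"]   # predicted developing, actually developed
fp <- cm["developed", "developing"]   # predicted developed, actually developing
tn <- cm["developing", "developing"]  # predicted developing, actually developing

sensitivity <- tp / (tp + fn)
specificity <- tn / (tn + fp)

metrics <- c(
  accuracy          = (tp + tn) / sum(cm),
  sensitivity       = sensitivity,
  specificity       = specificity,
  pos_pred_value    = tp / (tp + fp),
  neg_pred_value    = tn / (tn + fn),
  balanced_accuracy = (sensitivity + specificity) / 2
)
print(round(metrics, 4))
```

The rounded values match the reported Accuracy (0.9461), Sensitivity (0.9814), Specificity (0.9337), Pos Pred Value (0.8386), Neg Pred Value (0.9931), and Balanced Accuracy (0.9576).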
In this project, we analyzed electricity demand data for two groups of countries, developed and developing, over the last 30 years.
We explored several data visualization techniques, including bar plots, line plots, and violin plots, to understand the distribution of and trends in the data.
We used hypothesis testing, specifically a Welch t-test and a one-way ANOVA, to compare electricity demand between the two groups and draw inferences from the data.
Overall, we found that electricity demand increased over time and that developed countries had significantly higher mean demand than developing countries (237.55 vs. 77.50; Welch t-test, p < 0.001). These findings may offer useful insights for policymakers and energy companies seeking to understand global trends in electricity consumption and plan for the future.