library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.3     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.3     ✔ tibble    3.2.1
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggthemes)
library(ggrepel)
setwd("C:/Users/kaitl/OneDrive/Documents/590_Working")

#update data types of dataframe
energy <- read_delim("./590_FinalData1.csv", delim = ",", col_types = "nccnncnnnnnnnn")
## Warning: One or more parsing issues, call `problems()` on your data frame for details,
## e.g.:
##   dat <- vroom(...)
##   problems(dat)
energy1 <- energy
energy1[energy1 == '..'] <- NA
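
The parsing warning above may come from the '..' placeholders landing in columns typed as numbers. A minimal sketch of one way to handle this at read time, assuming the placeholder is always exactly `..`: pass it to the `na` argument of `read_delim()`. The three-row extract below is hypothetical, with made-up values; only the column names are borrowed from this analysis.

```r
library(readr)

# Hypothetical extract: column names come from the analysis,
# the values are made up for illustration.
demo_csv <- I("year,country_name,TFEC\n2000,Aruba,12.5\n2001,Aruba,..\n")

demo <- read_delim(
  demo_csv,
  delim = ",",
  col_types = "ncn",        # number, character, number
  na = c("", "NA", "..")    # treat ".." as missing while parsing
)

# The ".." entry is read as NA instead of triggering a parsing problem.
demo$TFEC
```

With this approach the separate replacement step on `energy1` would only be needed for placeholders the `na` argument does not cover.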

Hypotheses:

  1. The average proportion of total final energy consumption (TFEC) that comes from renewable energy is larger in each year than in the year before it.

    1. power level = (1 - beta error) = the chance of rejecting the null hypothesis when it is actually false (which supports, rather than proves, the hypothesis). I am going to use 0.8 for both hypotheses because we want the power level to be high.
    2. alpha level = I think 1 - 0.95 = 0.05 will work fine for this analysis because a type I error (concluding that renewable energy proportions increased when they actually did not) will not horrifically alter my results and ultimate decisions based on the data.

    3. minimum effect size = I am very unsure how to calculate this measure. (Mean of control - mean of tested group) / std dev

    4. Enough to perform a Neyman-Pearson hypothesis test? No

    5. Fisher test, fisher.test(select(ctr_table, renenergy, not_ren)):

      # ctr_table <- energy1 |>
      #   group_by(year, country_name) |>
      #   summarize(renenergy = sum(ren_energy_output),
      #             not_ren = sum(total_elec_output) - sum(ren_energy_output))
      # 
      # view(ctr_table)
      #fisher.test(select(ctr_table, renenergy, not_ren))

    The test suggests that there is a statistically significant difference between the two values.
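
Fisher's exact test is defined for contingency tables of counts, so one possible way to set up the test for this hypothesis is a 2x2 table of country counts rather than summed energy totals. The sketch below is only an illustration under assumptions that are not in the original analysis: the 50% cutoff is hypothetical and the counts are made up.

```r
# Made-up counts for illustration only: number of countries whose renewable
# share of TFEC is above vs. at or below a hypothetical 50% cutoff.
counts <- matrix(
  c(30, 70,   # earlier year: above, at or below
    38, 62),  # later year:   above, at or below
  nrow = 2, byrow = TRUE,
  dimnames = list(year  = c("earlier", "later"),
                  share = c("above_cutoff", "at_or_below"))
)

result <- fisher.test(counts)

result$p.value   # compare against the chosen alpha of 0.05
result$estimate  # conditional estimate of the odds ratio
```

A p-value below 0.05 would count as significant at the alpha level chosen above. The real counts would have to be derived from `energy1`.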

  2. Countries with full access to electricity have a higher proportion of renewable energy output/consumption than countries without full electricity access.

    1. power level = 0.8 (see the explanation under hypothesis 1).

    2. alpha level = 0.05 because a 5% probability of a type I error (concluding that countries with full electricity access have a higher renewable proportion when they actually do not) is acceptable.

    3. minimum effect size =

    4. Enough to perform a Neyman-Pearson hypothesis test? No

    5. Fisher test:

      # ctr_table <- energy1 |>
      #   group_by(year, country_name) |>
      #   summarize(total_consumption = sum(TFEC),
      #             not_ren = sum(TFEC) - sum(ren_energy_output))
      # 
      # ctr_table
      #fisher.test(select(ctr_table, total_consumption, not_ren))
      1. The test suggests that there is no statistically significant difference between the two values.
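
The minimum effect size entries above can be made concrete with a standardized mean difference (Cohen's d): the difference in group means divided by the pooled standard deviation. The sketch below uses made-up renewable-share values for two hypothetical groups, and the helper `cohens_d()` is defined here for illustration rather than taken from any package. It then uses `power.t.test()` from base R to estimate how many observations per group would be needed at alpha = 0.05 and power = 0.8, which is one way to approach the "enough data?" question.

```r
# Made-up renewable-share values (percent) for two hypothetical groups.
full_access    <- c(22, 35, 18, 40, 27, 31)
partial_access <- c(15, 28, 12, 30, 20, 21)

# Cohen's d: difference in means divided by the pooled standard deviation.
cohens_d <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  pooled_sd <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / pooled_sd
}

d <- cohens_d(full_access, partial_access)

# Sample size per group needed to detect an effect of size d with
# alpha = 0.05 and power = 0.8 in a one-sided two-sample t-test.
# Because d is already standardized, sd is set to 1.
plan <- power.t.test(delta = d, sd = 1, sig.level = 0.05, power = 0.8,
                     type = "two.sample", alternative = "one.sided")

d                 # standardized effect size
ceiling(plan$n)   # observations needed per group, rounded up
```

If the required number per group is larger than the number of available observations, the data are not enough to reach the chosen power for that effect size.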

Two visualizations: