Tilon Bobb, Mikhail Broomes

To begin our analysis of the “world_salary” dataset, we’ll start by loading the necessary R libraries for data manipulation and visualization:

knitr::opts_chunk$set(echo = TRUE)
library(RMySQL)
## Loading required package: DBI
library(yaml)
library(ggplot2)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ lubridate 1.9.2     ✔ tibble    3.2.1
## ✔ purrr     1.0.2     ✔ tidyr     1.3.0
## ✔ readr     2.1.4
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(infer)
library(rvest)
## 
## Attaching package: 'rvest'
## 
## The following object is masked from 'package:readr':
## 
##     guess_encoding

We are going to explore the monthly salary data from various countries all over the world, converted to USD.

Establishing a connection to the SQL database:

config <- yaml::read_yaml("config.yaml")
con <- dbConnect(
  RMySQL::MySQL(),
  dbname = config$dbname,
  host = config$host,
  port = config$port,
  user = config$user,
  password =  config$password
)
# Pull the full salary table into a data frame, then close the connection
query <- "SELECT * FROM project3.salary_data"
world_salary <- dbGetQuery(con, query)
dbDisconnect(con)

Let’s take a look at the dataset to get a preliminary understanding:

head(world_salary)
##     country_name continent_name wage_span median_salary average_salary
## 1    Afghanistan           Asia   Monthly        853.74        1001.15
## 2  Aland Islands         Europe   Monthly       3319.24        3858.35
## 3        Albania         Europe   Monthly        832.84         956.92
## 4        Algeria         Africa   Monthly       1148.84        1308.81
## 5 American Samoa        Oceania   Monthly       1390.00        1570.00
## 6        Andorra         Europe   Monthly       3668.08        4069.77
##   lowest_salary highest_salary
## 1        252.53        4460.97
## 2        972.52       17124.74
## 3        241.22        4258.49
## 4        330.11        5824.18
## 5        400.00        6980.00
## 6       1120.51       17653.28

This snippet provides an overview of the dataset by showing the first few rows, including column names and sample data.

Let’s examine the dataset’s structure to understand the data types and column names:

str(world_salary)
## 'data.frame':    221 obs. of  7 variables:
##  $ country_name  : chr  "Afghanistan" "Aland Islands" "Albania" "Algeria" ...
##  $ continent_name: chr  "Asia" "Europe" "Europe" "Africa" ...
##  $ wage_span     : chr  "Monthly" "Monthly" "Monthly" "Monthly" ...
##  $ median_salary : num  854 3319 833 1149 1390 ...
##  $ average_salary: num  1001 3858 957 1309 1570 ...
##  $ lowest_salary : num  253 973 241 330 400 ...
##  $ highest_salary: num  4461 17125 4258 5824 6980 ...

To get a more detailed description of the columns you can always go to the source here

Data Tidying

Now let’s check for missing values to see whether any cleanup is necessary before the analysis:

# Check for missing values in the entire dataset
missing_values <- is.na(world_salary)

# Summarize the number of missing values in each column
col_missing_count <- colSums(missing_values)

# Display columns with missing values
colnames(world_salary)[col_missing_count > 0]
## character(0)
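
The check above relies on is.na() returning a logical matrix and colSums() counting the TRUE values in each column. As a minimal sketch, here is the same pattern on a made-up data frame (the `toy` values are hypothetical and not taken from the salary table), showing what the result looks like when gaps do exist:

```r
# Hypothetical toy data with two missing entries (not from world_salary)
toy <- data.frame(
  country_name  = c("A", "B", "C"),
  median_salary = c(100, NA, 300),
  lowest_salary = c(10, 20, NA)
)

# is.na() gives a logical matrix; colSums() counts the TRUE values per column
toy_missing_count <- colSums(is.na(toy))

# Names of the columns that contain at least one NA
cols_with_na <- names(toy_missing_count)[toy_missing_count > 0]
print(cols_with_na)
```

An empty result, as in our dataset, means no column contains an NA.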

As we can see, there are no missing values in the dataset. There is, however, a misleading column name: continent_name holds regions rather than continents, since it includes places like the Caribbean and distinguishes between Northern America and North America. We will rename it to geographical_region. Since we know that all the salaries are monthly, we can also remove the wage_span column.

colnames(world_salary)[colnames(world_salary) == "continent_name"] <- "geographical_region"
world_salary <- world_salary %>% select(-wage_span)

Summary Statistics

We will calculate and display summary statistics for the numerical columns in our dataset, which are “median_salary,” “average_salary,” “lowest_salary,” and “highest_salary.” This provides an overview of central tendencies and data distribution:

summary(world_salary[, c("median_salary", "average_salary", "lowest_salary", "highest_salary")])
##  median_salary      average_salary      lowest_salary       highest_salary    
##  Min.   :   0.261   Min.   :    0.286   Min.   :   0.0721   Min.   :    1.27  
##  1st Qu.: 567.210   1st Qu.:  651.000   1st Qu.: 163.9300   1st Qu.: 2900.48  
##  Median :1227.460   Median : 1344.230   Median : 339.4500   Median : 5974.36  
##  Mean   :1762.632   Mean   : 1982.340   Mean   : 502.7832   Mean   : 8802.17  
##  3rd Qu.:2389.010   3rd Qu.: 2740.000   3rd Qu.: 690.0000   3rd Qu.:12050.74  
##  Max.   :9836.070   Max.   :11292.900   Max.   :2850.2700   Max.   :50363.93

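Based on this summary, the first thing I noticed is that the lowest salary in the dataset is about $0.07 a month (the minimum of lowest_salary is 0.0721), which is implausibly low and may point to a data-quality issue for that country. The mean of the average_salary column is $1,982 a month. Because this is an unweighted mean across countries, the world’s average salary may be somewhere around $1,982 a month, but the figure does not account for population size.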
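
A mean taken across countries treats every country equally, whatever its population. The sketch below uses three made-up countries (the salaries and populations are hypothetical, chosen only for illustration) to show how far an unweighted mean can sit from a population-weighted one:

```r
# Hypothetical figures for three made-up countries (illustration only)
avg_salary <- c(500, 1500, 6000)     # average monthly salary in USD
population <- c(100e6, 20e6, 5e6)    # number of residents

# Unweighted mean: every country counts equally, as in summary()
unweighted <- mean(avg_salary)

# Population-weighted mean: larger countries count for more
weighted <- weighted.mean(avg_salary, w = population)

cat("Unweighted mean:", unweighted, "\n")
cat("Weighted mean:  ", weighted, "\n")
```

Our dataset has no population column, so only the unweighted figure is available here.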

Hypothesis testing

I hypothesize that Northern America has a higher average salary than the rest of the world.

Sampling distribution

world_salary <- world_salary %>%
  mutate(Northern_America = ifelse(geographical_region == "Northern America", "Yes", "No"))
ggplot(world_salary, aes(x=average_salary, y=Northern_America)) + geom_boxplot() + theme_bw()

In the box plot, Northern America, which in this dataset includes Bermuda and Greenland alongside the United States and Canada, has a higher median of average salary than the rest of the world, along with a wider spread. This makes sense, since the United States and Canada are known for having diverse income distributions.

To solidify my findings, I will calculate a 95% confidence interval for the average salary in each group:

yes_group <- world_salary %>% filter(Northern_America == "Yes")
no_group <- world_salary %>% filter(Northern_America == "No")

# Run a one-sample t-test on each group; we only need its confidence interval
t_test_yes <- t.test(yes_group$average_salary)
t_test_no <- t.test(no_group$average_salary)

# Get the confidence intervals
conf_interval_yes <- t_test_yes$conf.int
conf_interval_no <- t_test_no$conf.int

# Print the confidence intervals
cat("95% Confidence Interval for 'Yes' (Northern America) Average Salary:", conf_interval_yes, "\n")
## 95% Confidence Interval for 'Yes' (Northern America) Average Salary: 497.8315 9945.388
cat("95% Confidence Interval for 'No' (Other Regions) Average Salary:", conf_interval_no, "\n")
## 95% Confidence Interval for 'No' (Other Regions) Average Salary: 1686.636 2158.624
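
The interval reported by t.test() is the sample mean plus or minus a t critical value times the standard error. The sketch below rebuilds it by hand on a small made-up sample (the values in `x` are hypothetical, not the real group) and compares it with the built-in result:

```r
# Hypothetical sample of average salaries (illustration only)
x <- c(1600, 4000, 7300, 7900)

n      <- length(x)
x_bar  <- mean(x)
se     <- sd(x) / sqrt(n)          # standard error of the mean
t_crit <- qt(0.975, df = n - 1)    # two-sided 95% critical value

manual_ci <- c(x_bar - t_crit * se, x_bar + t_crit * se)

# as.numeric() drops the "conf.level" attribute so the two can be compared
builtin_ci <- as.numeric(t.test(x)$conf.int)

print(manual_ci)
print(builtin_ci)
```

With only a few observations, the critical value is large and the standard error is wide, which is one reason a small group produces a very broad interval.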

Conclusion

Northern America (Yes): With 95% confidence, we estimate that the average salary in Northern America falls between approximately $497.83 and $9,945.39. The interval is very wide because the group contains very few countries.

Other Regions (No): Similarly, with 95% confidence, we estimate that the average salary in other regions (outside Northern America) falls between approximately $1,686.64 and $2,158.62.

Because the second interval lies entirely inside the first, these intervals alone do not show that Northern America’s average salary is higher than that of the rest of the world, even though the box plot points in that direction.
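
Comparing two separate intervals is only an indirect check of the hypothesis. One way to test it directly is a one-sided Welch two-sample t-test. This is an alternative technique, not something the analysis above ran, and the data frame below is hypothetical:

```r
# Hypothetical data standing in for world_salary (illustration only)
toy <- data.frame(
  average_salary   = c(1600, 4000, 7300, 7900, 900, 1200, 2500, 3100, 600, 1800),
  Northern_America = c(rep("Yes", 4), rep("No", 6))
)

yes <- toy$average_salary[toy$Northern_America == "Yes"]
no  <- toy$average_salary[toy$Northern_America == "No"]

# H0: mean(yes) <= mean(no);  H1: mean(yes) > mean(no)
# t.test() applies the Welch correction (unequal variances) by default
welch <- t.test(yes, no, alternative = "greater")

print(welch$method)
print(welch$p.value)
```

A p-value below 0.05 would support the hypothesis at the 5% level. With so few countries in the "Yes" group, the test has little power, so a large p-value should not be read as evidence of no difference.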

Each region box plot

ggplot(data = world_salary, aes(x = geographical_region, y = median_salary)) +
  geom_boxplot() +
  labs(title = "Distribution of Median Salaries by Continent")

Part 2

Checking whether the median salary is affected by the CPI (consumer price index)

Looking at the data to check the distinct values in the region column:

unique(world_salary$geographical_region)
## [1] "Asia"             "Europe"           "Africa"           "Oceania"         
## [5] "Caribbean"        "South America"    "North America"    "Central America" 
## [9] "Northern America"

Changing Northern America to North America so it doesn’t interfere with our analysis:

world_salary <- world_salary %>%
  mutate(geographical_region = ifelse(geographical_region == "Northern America", "North America", geographical_region))
continent_stats <- world_salary %>%
  group_by(geographical_region) %>%
  summarize(
    mean_salary = mean(median_salary),
    median_salary = median(median_salary)
  )
print(continent_stats)
## # A tibble: 8 × 3
##   geographical_region mean_salary median_salary
##   <chr>                     <dbl>         <dbl>
## 1 Africa                     680.          525.
## 2 Asia                      1537.          946.
## 3 Caribbean                 1436.         1125.
## 4 Central America           1685.         1576.
## 5 Europe                    3196.         3021.
## 6 North America             3247.         2733.
## 7 Oceania                   1725.         1330 
## 8 South America             1348.         1132.
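# Plot the mean of average_salary per region. stat = "summary" averages the
# rows in each region; stat = "identity" would stack them into a sum.
ggplot(world_salary, aes(x = reorder(geographical_region, -average_salary), y = average_salary)) +
  geom_bar(stat = "summary", fun = "mean") +
  labs(title = "Average Salaries by Continent",
       x = "Continent", y = "Average Salary") +
  theme_minimal() +
  coord_flip()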

Now adding the CPI to our data frame, using rvest to scrape it from an online source:

link <- "https://www.economy.com/indicators/consumer-price-index-cpi"
page <- read_html(link)

name <-  page %>% 
  html_nodes("#table-IALL a") %>% 
  html_text()

cpi <- page %>% 
  html_nodes("#table-IALL td:nth-child(3) .pull-right") %>% 
  html_text()

cpi_data <- data.frame(Name = name, CPI = cpi)
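
html_text() returns character strings, so the CPI column holds text rather than numbers. The sketch below uses made-up strings (hypothetical values; the thousands separator is an assumption about how the site might format large numbers) to show why text should be converted before any numeric comparison:

```r
# Hypothetical scraped strings (illustration only)
cpi_text <- c("138.01", "99.5", "1,204.30")

# Comparing text with a number coerces the number to text,
# so the comparison is alphabetical rather than numeric
text_compare <- cpi_text[2] > 130      # compares "99.5" with "130"

# Strip thousands separators, then convert to numeric
cpi_num <- as.numeric(gsub(",", "", cpi_text))
num_compare <- cpi_num[2] > 130        # compares 99.5 with 130

print(text_compare)
print(num_compare)
print(cpi_num)
```

The text comparison treats "99.5" as greater than 130 because "9" sorts after "1", while the numeric comparison gives the intended answer.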

Now joining the data to our world_salary data:

world_salary2 <- world_salary

world_salary2 <- world_salary2 %>%
  mutate(country_name = trimws(tolower(country_name)))  

# Clean the names in 'cpi_data' as well
cpi_data <- cpi_data %>%
  mutate(Name = trimws(tolower(Name)))  

# Merge the data frames
world_salary2 <- world_salary2 %>% 
  left_join(cpi_data, by = c("country_name" = "Name"))

# Print the first few rows of 'world_salary2'
head(world_salary2)
##     country_name geographical_region median_salary average_salary lowest_salary
## 1    afghanistan                Asia        853.74        1001.15        252.53
## 2  aland islands              Europe       3319.24        3858.35        972.52
## 3        albania              Europe        832.84         956.92        241.22
## 4        algeria              Africa       1148.84        1308.81        330.11
## 5 american samoa             Oceania       1390.00        1570.00        400.00
## 6        andorra              Europe       3668.08        4069.77       1120.51
##   highest_salary Northern_America    CPI
## 1        4460.97               No   <NA>
## 2       17124.74               No   <NA>
## 3        4258.49               No 138.01
## 4        5824.18               No   <NA>
## 5        6980.00               No   <NA>
## 6       17653.28               No   <NA>
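
Five of the six rows in this preview have no CPI. Some countries may simply be absent from the scraped table, while others may be spelled differently in the two sources. As a sketch of how to list the names that found no match, here is dplyr's anti_join() on two small made-up tables (`salaries` and `cpi_toy` are hypothetical stand-ins for world_salary2 and cpi_data):

```r
library(dplyr)

# Hypothetical stand-ins for world_salary2 and cpi_data (illustration only)
salaries <- data.frame(country_name = c("albania", "korea (south)", "algeria"))
cpi_toy  <- data.frame(
  Name = c("albania", "south korea"),
  CPI  = c("138.01", "111.2")
)

# Rows of `salaries` whose country_name has no matching Name in `cpi_toy`
unmatched <- anti_join(salaries, cpi_toy, by = c("country_name" = "Name"))
print(unmatched$country_name)
```

Reviewing such a list makes it possible to tell genuine gaps in the CPI source from spelling mismatches that could be fixed before joining.
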
# CPI was scraped as text, so convert it to a number before comparing
world_salary2 <- world_salary2 %>% 
  mutate(cpi_130 = ifelse(as.numeric(gsub(",", "", CPI)) > 130, "yes", "no"))
world_salary2
##                         country_name geographical_region median_salary
## 1                        afghanistan                Asia    853.740000
## 2                      aland islands              Europe   3319.240000
## 3                            albania              Europe    832.840000
## 4                            algeria              Africa   1148.840000
## 5                     american samoa             Oceania   1390.000000
## 6                            andorra              Europe   3668.080000
## 7                             angola              Africa    284.390000
## 8                antigua and barbuda           Caribbean   1548.150000
## 9                          argentina       South America    110.280000
## 10                           armenia                Asia   1700.250000
## 11                             aruba           Caribbean   1106.150000
## 12                         australia             Oceania   4306.450000
## 13                           austria              Europe   3572.940000
## 14                        azerbaijan                Asia   1558.820000
## 15                           bahamas       North America   3541.000000
## 16                           bahrain                Asia   3617.020000
## 17                        bangladesh                Asia    218.570000
## 18                          barbados           Caribbean   1395.000000
## 19                           belarus              Europe    849.500000
## 20                           belgium              Europe   5729.390000
## 21                            belize     Central America   1730.000000
## 22                             benin              Africa    488.250000
## 23                           bermuda       North America   1440.000000
## 24                            bhutan                Asia    407.260000
## 25                           bolivia       South America   1131.500000
## 26            bosnia and herzegovina              Europe   1151.350000
## 27                          botswana              Africa    729.390000
## 28                            brazil       South America   1490.040000
## 29    british indian ocean territory              Africa   2360.000000
## 30                            brunei                Asia   2110.290000
## 31                          bulgaria              Europe   1605.410000
## 32                      burkina faso              Africa    485.020000
## 33                           burundi              Africa    384.780000
## 34                          cambodia                Asia    745.190000
## 35                          cameroon              Africa    633.270000
## 36                            canada       North America   6311.030000
## 37                        cape verde              Africa   1706.290000
## 38                    cayman islands           Caribbean   3430.970000
## 39          central african republic              Africa    621.990000
## 40                              chad              Africa    707.390000
## 41                             chile       South America   1890.400000
## 42                             china                Asia   3684.930000
## 43                          colombia       South America    995.600000
## 44                           comoros              Africa    567.210000
## 45                             congo              Africa   1058.670000
## 46         congo democratic republic              Africa    170.390000
## 47                      cook islands             Oceania   2538.920000
## 48                        costa rica     Central America   4016.060000
## 49                      cote divoire              Africa    497.910000
## 50                           croatia              Europe   1935.480000
## 51                              cuba           Caribbean    783.330000
## 52                            cyprus              Europe   1976.740000
## 53                    czech republic              Europe   2310.030000
## 54                           denmark              Europe   5084.990000
## 55                          djibouti              Africa   1378.570000
## 56                          dominica           Caribbean    496.300000
## 57                dominican republic           Caribbean    318.890000
## 58                        east timor                Asia   2030.000000
## 59                           ecuador       South America   1260.000000
## 60                             egypt              Africa    254.530000
## 61                       el salvador     Central America   1470.000000
## 62                 equatorial guinea              Africa    671.940000
## 63                           eritrea              Africa    402.000000
## 64                           estonia              Europe   2579.280000
## 65                          ethiopia              Africa    143.880000
## 66                     faroe islands              Europe   3484.420000
## 67                              fiji             Oceania   1939.130000
## 68                           finland              Europe   4238.900000
## 69                            france              Europe   3769.560000
## 70                     french guiana       South America   2896.410000
## 71                  french polynesia             Oceania   1133.750000
## 72                             gabon              Africa    800.850000
## 73                            gambia              Africa    223.700000
## 74                           georgia                Asia   2309.700000
## 75                           germany              Europe   3731.500000
## 76                             ghana              Africa    373.820000
## 77                         gibraltar              Europe   3567.070000
## 78                            greece              Europe   2241.010000
## 79                         greenland       North America   3526.910000
## 80                           grenada           Caribbean   2066.670000
## 81                        guadeloupe           Caribbean   3985.200000
## 82                              guam             Oceania   1270.000000
## 83                         guatemala     Central America   1222.650000
## 84                          guernsey              Europe   8689.020000
## 85                            guinea              Africa    696.710000
## 86                     guinea-bissau              Africa    475.350000
## 87                            guyana       South America    731.320000
## 88                             haiti           Caribbean    444.950000
## 89                          honduras     Central America   1022.220000
## 90                         hong kong                Asia   4252.870000
## 91                           hungary              Europe   1227.460000
## 92                           iceland              Europe   4661.200000
## 93                             india                Asia    327.970000
## 94                         indonesia                Asia    678.910000
## 95                              iran                Asia    932.630000
## 96                              iraq                Asia   1382.380000
## 97                           ireland              Europe   3021.140000
## 98                             italy              Europe   3467.230000
## 99                           jamaica           Caribbean    565.780000
## 100                            japan                Asia   3158.670000
## 101                           jersey              Europe   5817.070000
## 102                           jordan                Asia   1946.400000
## 103                       kazakhstan                Asia    701.000000
## 104                            kenya              Africa    863.290000
## 105                         kiribati             Oceania   2206.450000
## 106                    korea (north)                Asia    192.225750
## 107                    korea (south)                Asia   2593.420000
## 108                       kyrgyzstan                Asia    201.870000
## 109                             laos                Asia    206.420000
## 110                           latvia              Europe   1654.072700
## 111                          lebanon                Asia    753.330000
## 112                          lesotho              Africa    540.540000
## 113                          liberia              Africa    334.390000
## 114                            libya              Africa    417.180000
## 115                    liechtenstein              Europe   5224.040000
## 116                        lithuania              Europe    903.874660
## 117                       luxembourg              Europe   4767.440000
## 118                            macao                Asia    856.260000
## 119                        macedonia              Europe    677.000000
## 120                       madagascar              Africa    250.870000
## 121                           malawi              Africa    132.380000
## 122                         malaysia                Asia   1236.170000
## 123                         maldives                Asia   1099.610000
## 124                             mali              Africa    478.580000
## 125                            malta              Europe   4439.750000
## 126                 marshall islands             Oceania   2070.000000
## 127                       martinique           Caribbean   2949.260000
## 128                       mauritania              Africa     44.623043
## 129                        mauritius              Africa    901.530000
## 130                          mayotte              Africa   2389.010000
## 131                           mexico     Central America   1681.970000
## 132                       micronesia             Oceania   1250.000000
## 133                          moldova              Europe   1318.680000
## 134                           monaco              Europe   4112.050000
## 135                         mongolia                Asia    515.760000
## 136                       montenegro              Europe   2621.560000
## 137                       montserrat           Caribbean    859.260000
## 138                          morocco              Africa   1634.240000
## 139                       mozambique              Africa    549.810000
## 140                          myanmar                Asia    228.460000
## 141                          namibia              Africa    821.410000
## 142                            nepal                Asia    551.130000
## 143                      netherlands              Europe   4756.870000
## 144             netherlands antilles       North America   2268.160000
## 145                    new caledonia             Oceania    664.300000
## 146                      new zealand             Oceania   4196.410000
## 147                        nicaragua     Central America    449.100000
## 148                            niger              Africa    480.190000
## 149                          nigeria              Africa    389.850000
## 150         northern mariana islands             Oceania   1820.000000
## 151                           norway              Europe   4420.020000
## 152                             oman                Asia   3932.290000
## 153                         pakistan                Asia    245.340000
## 154                            palau             Oceania   2380.000000
## 155                        palestine                Asia   1510.000000
## 156                           panama     Central America   1890.000000
## 157                 papua new guinea             Oceania   1010.960000
## 158                         paraguay       South America   1019.690000
## 159                             peru       South America   1825.860000
## 160                      philippines                Asia    728.780000
## 161                           poland              Europe   1496.570000
## 162                         portugal              Europe   2537.000000
## 163                      puerto rico           Caribbean   1483.000000
## 164                            qatar                Asia   3846.150000
## 165                          reunion              Africa   1966.170000
## 166                          romania              Europe   1739.870000
## 167                           russia              Europe    975.510000
## 168                           rwanda              Africa    525.390000
## 169            saint kitts and nevis           Caribbean   1092.590000
## 170                      saint lucia           Caribbean    922.220000
## 171                     saint martin       North America   3086.680000
## 172 saint vincent and the grenadines           Caribbean   1144.440000
## 173                            samoa             Oceania    808.660000
## 174                       san marino              Europe   4450.320000
## 175            sao tome and principe              Africa    254.881630
## 176                     saudi arabia                Asia   3840.000000
## 177                          senegal              Africa    493.080000
## 178                           serbia              Europe   1120.040000
## 179                       seychelles              Africa   1246.440000
## 180                     sierra leone              Africa    249.100000
## 181                        singapore                Asia   5647.060000
## 182                         slovakia              Europe   2114.160000
## 183                         slovenia              Europe   1934.460000
## 184                  solomon islands             Oceania    644.890000
## 185                          somalia              Africa    392.160000
## 186                     south africa              Africa   1441.440000
## 187                            spain              Europe   2579.280000
## 188                        sri lanka                Asia    249.680000
## 189                            sudan              Africa     67.610000
## 190                         suriname       South America    122.180000
## 191                        swaziland              Africa    204.560000
## 192                           sweden              Europe   3568.160000
## 193                      switzerland              Europe   9836.070000
## 194                            syria                Asia     10.120000
## 195                           taiwan                Asia   3571.430000
## 196                       tajikistan                Asia    959.780000
## 197                         tanzania              Africa    457.770000
## 198                         thailand                Asia   2432.280000
## 199                             togo              Africa    787.960000
## 200                            tonga             Oceania    668.090000
## 201              trinidad and tobago           Caribbean   1258.110000
## 202                          tunisia              Africa   1088.330000
## 203                           turkey                Asia    254.100000
## 204                     turkmenistan                Asia   1342.860000
## 205         turks and caicos islands       North America   1350.000000
## 206                           uganda              Africa    645.210000
## 207                          ukraine              Europe    530.730000
## 208             united arab emirates                Asia   3324.250000
## 209                   united kingdom              Europe   6300.000000
## 210                    united states       North America   6966.000000
## 211                          uruguay       South America    773.640000
## 212                       uzbekistan                Asia     97.250000
## 213                          vanuatu             Oceania    750.740000
## 214                        venezuela       South America   3282.020000
## 215                          vietnam                Asia    612.570000
## 216         virgin islands (british)       North America   1600.000000
## 217              virgin islands (us)       North America   2380.000000
## 218                   western sahara              Africa    908.560000
## 219                            yemen                Asia    120.980000
## 220                           zambia              Africa      0.261335
## 221                         zimbabwe              Africa    555.402040
##     average_salary lowest_salary highest_salary Northern_America
## 1     1.001150e+03  2.525300e+02    4460.970000               No
## 2     3.858350e+03  9.725200e+02   17124.740000               No
## 3     9.569200e+02  2.412200e+02    4258.490000               No
## 4     1.308810e+03  3.301100e+02    5824.180000               No
## 5     1.570000e+03  4.000000e+02    6980.000000               No
## 6     4.069770e+03  1.120510e+03   17653.280000               No
## 7     3.143900e+02  7.932000e+01    1403.960000               No
## 8     1.677780e+03  4.222200e+02    7444.440000               No
## 9     1.294200e+02  3.257000e+01     577.130000               No
## 10    1.974320e+03  4.973900e+02    8780.390000               No
## 11    1.268160e+03  3.184400e+02    5642.460000               No
## 12    4.903230e+03  1.236130e+03   21774.190000               No
## 13    4.016910e+03  1.014800e+03   17864.690000               No
## 14    1.741180e+03  4.411800e+02    7764.710000               No
## 15    3.908000e+03  9.830000e+02   17416.000000               No
## 16    3.936170e+03  9.840400e+02   17553.190000               No
## 17    2.367100e+02  5.968000e+01    1052.060000               No
## 18    1.635000e+03  4.100000e+02    7250.000000               No
## 19    9.832800e+02  2.474900e+02    4381.270000               No
## 20    6.522200e+03  1.997890e+03   27378.440000               No
## 21    1.965000e+03  4.950000e+02    8750.000000               No
## 22    5.494800e+02  1.385800e+02    2449.280000               No
## 23    1.600000e+03  4.000000e+02    7120.000000              Yes
## 24    4.493000e+02  1.131700e+02    1994.230000               No
## 25    1.236990e+03  3.121400e+02    5505.780000               No
## 26    1.248650e+03  3.135100e+02    5567.570000               No
## 27    8.606900e+02  2.166300e+02    3822.030000               No
## 28    1.711160e+03  4.322700e+02    7609.560000               No
## 29    2.690000e+03  6.800000e+02   12000.000000               No
## 30    2.375000e+03  5.955900e+02   10588.240000               No
## 31    1.794590e+03  4.540500e+02    7945.950000               No
## 32    5.349700e+02  1.350300e+02    2384.830000               No
## 33    4.200800e+02  1.055500e+02    1863.890000               No
## 34    8.083000e+02  2.034100e+02    3592.440000               No
## 35    7.444500e+02  1.869200e+02    3303.310000               No
## 36    7.352940e+03  1.850000e+03   32720.590000              Yes
## 37    1.965110e+03  4.955900e+02    8742.330000               No
## 38    3.901560e+03  9.843900e+02   17406.960000               No
## 39    6.993300e+02  1.756400e+02    3109.940000               No
## 40    7.879600e+02  1.982000e+02    3512.790000               No
## 41    2.090560e+03  5.259800e+02    9274.090000               No
## 42    4.027400e+03  1.015070e+03   17945.210000               No
## 43    1.157850e+03  2.925300e+02    5137.790000               No
## 44    6.510000e+02  1.639300e+02    2900.480000               No
## 45    1.205300e+03  3.029400e+02    5365.860000               No
## 46    1.917400e+02  4.834000e+01     853.950000               No
## 47    2.832340e+03  7.125700e+02   12574.850000               No
## 48    4.427010e+03  1.115160e+03   19613.340000               No
## 49    5.446400e+02  1.374500e+02    2417.050000               No
## 50    2.089760e+03  5.273500e+02    9298.740000               No
## 51    9.166700e+02  2.312500e+02    4079.170000               No
## 52    2.293870e+03  5.814000e+02   10211.420000               No
## 53    2.653060e+03  6.686900e+02   11810.680000               No
## 54    5.779040e+03  1.458920e+03   25637.390000               No
## 55    1.553000e+03  3.916300e+02    6921.000000               No
## 56    5.555600e+02  1.407400e+02    2470.370000               No
## 57    3.506000e+02  8.862000e+01    1562.720000               No
## 58    2.220000e+03  5.600000e+02    9860.000000               No
## 59    1.370000e+03  3.400000e+02    6080.000000               No
## 60    2.985100e+02  7.536000e+01    1329.240000               No
## 61    1.710000e+03  4.300000e+02    7610.000000               No
## 62    7.734600e+02  1.949800e+02    3432.220000               No
## 63    4.573300e+02  1.153300e+02    2033.330000               No
## 64    2.906980e+03  7.293900e+02   12896.410000               No
## 65    1.604100e+02  4.042000e+01     713.130000               No
## 66    3.810200e+03  9.589200e+02   16855.520000               No
## 67    2.100000e+03  5.304300e+02    9347.830000               No
## 68    4.978860e+03  1.257930e+03   22093.020000               No
## 69    4.377380e+03  1.100420e+03   19467.230000               No
## 70    3.319240e+03  8.351000e+02   14799.150000               No
## 71    1.293180e+03  3.259500e+02    5757.310000               No
## 72    8.943100e+02  2.255900e+02    3980.080000               No
## 73    2.468400e+02  6.202000e+01    1095.340000               No
## 74    2.526120e+03  6.380600e+02   11231.340000               No
## 75    4.048630e+03  1.014800e+03   17970.400000               No
## 76    4.384200e+02  1.102500e+02    1946.600000               No
## 77    4.135370e+03  1.046340e+03   18393.900000               No
## 78    2.579280e+03  6.553900e+02   11522.200000               No
## 79    4.008500e+03  1.011330e+03   17847.030000              Yes
## 80    2.329630e+03  5.888900e+02   10370.370000               No
## 81    4.439750e+03  1.120510e+03   19767.440000               No
## 82    1.400000e+03  3.500000e+02    6230.000000               No
## 83    1.335880e+03  3.371500e+02    5954.200000               No
## 84    9.409760e+03  2.367070e+03   41869.510000               No
## 85    8.178700e+02  2.062200e+02    3634.990000               No
## 86    5.510900e+02  1.390600e+02    2449.280000               No
## 87    8.412600e+02  2.117500e+02    3737.870000               No
## 88    5.062000e+02  1.276600e+02    2250.590000               No
## 89    1.139390e+03  2.872700e+02    5090.910000               No
## 90    4.687100e+03  1.182630e+03   20817.370000               No
## 91    1.344230e+03  3.394500e+02    5974.360000               No
## 92    5.049030e+03  1.273230e+03   22464.510000               No
## 93    3.844300e+02  9.707000e+01    1717.920000               No
## 94    7.888300e+02  1.985000e+02    3504.470000               No
## 95    1.070970e+03  2.695300e+02    4770.470000               No
## 96    1.573310e+03  3.956200e+02    6988.250000               No
## 97    3.399580e+03  8.562400e+02   15151.160000               No
## 98    3.868920e+03  9.725200e+02   17230.440000               No
## 99    6.245600e+02  1.575900e+02    2777.240000               No
## 100   3.453120e+03  8.699700e+02   15391.820000               No
## 101   6.304880e+03  1.585370e+03   28048.780000               No
## 102   2.284910e+03  5.782800e+02   10141.040000               No
## 103   8.139900e+02  2.048600e+02    3620.080000               No
## 104   9.914300e+02  2.502200e+02    4424.360000               No
## 105   2.509680e+03  6.322600e+02   11161.290000               No
## 106   2.166706e+02  5.455656e+01     963.350990               No
## 107   2.889810e+03  7.283800e+02   12893.000000               No
## 108   2.199200e+02  5.560000e+01     980.040000               No
## 109   2.235400e+02  5.625000e+01     992.990000               No
## 110   1.952104e+03  4.917514e+02    8657.804000               No
## 111   8.733300e+02  2.193300e+02    3873.330000               No
## 112   6.253300e+02  1.573900e+02    2776.890000               No
## 113   3.803500e+02  9.614000e+01    1690.440000               No
## 114   4.703500e+02  1.186100e+02    2085.890000               No
## 115   5.825140e+03  1.464480e+03   25901.640000               No
## 116   9.979019e+02  2.517503e+02    4428.379200               No
## 117   5.211420e+03  1.310780e+03   23150.110000               No
## 118   9.281300e+02  2.342000e+02    4126.390000               No
## 119   7.941400e+02  1.998300e+02    3531.440000               No
## 120   2.904800e+02  7.328000e+01    1291.740000               No
## 121   1.516900e+02  3.824000e+01     674.790000               No
## 122   1.406380e+03  3.553200e+02    6255.320000               No
## 123   1.241910e+03  3.130700e+02    5517.460000               No
## 124   5.333600e+02  1.345500e+02    2368.710000               No
## 125   4.904860e+03  1.236790e+03   21775.900000               No
## 126   2.260000e+03  5.700000e+02   10100.000000               No
## 127   3.192390e+03  8.033800e+02   14164.900000               No
## 128   5.249770e+01  1.320317e+01     232.827290               No
## 129   1.047660e+03  2.630400e+02    4653.780000               No
## 130   2.748410e+03  6.976700e+02   12262.160000               No
## 131   1.917340e+03  4.827800e+02    8495.980000               No
## 132   1.410000e+03  3.600000e+02    6270.000000               No
## 133   1.467030e+03  3.703300e+02    6538.460000               No
## 134   4.492600e+03  1.374210e+03   18921.780000               No
## 135   5.647400e+02  1.423400e+02    2509.620000               No
## 136   2.832980e+03  7.188200e+02   12579.280000               No
## 137   1.011110e+03  2.555600e+02    4481.480000               No
## 138   1.896890e+03  4.776300e+02    8433.850000               No
## 139   6.312700e+02  1.597700e+02    2803.880000               No
## 140   2.603500e+02  6.568000e+01    1156.590000               No
## 141   9.274000e+02  2.337000e+02    4128.250000               No
## 142   6.082000e+02  1.531800e+02    2703.110000               No
## 143   5.179700e+03  1.427060e+03   22515.860000               No
## 144   2.458100e+03  6.201100e+02   10893.850000               No
## 145   7.794500e+02  1.966300e+02    3463.240000               No
## 146   4.870060e+03  1.227540e+03   21656.290000               No
## 147   5.171500e+02  1.303800e+02    2299.950000               No
## 148   5.478700e+02  1.379300e+02    2433.170000               No
## 149   4.389100e+02  1.106300e+02    1949.270000               No
## 150   1.990000e+03  5.000000e+02    8870.000000               No
## 151   4.786340e+03  1.208230e+03   21281.570000               No
## 152   4.635420e+03  1.171880e+03   20572.920000               No
## 153   2.849000e+02  7.183000e+01    1266.610000               No
## 154   2.740000e+03  6.900000e+02   12200.000000               No
## 155   1.710000e+03  4.300000e+02    7620.000000               No
## 156   2.130000e+03  5.400000e+02    9460.000000               No
## 157   1.128770e+03  2.849300e+02    5013.700000               No
## 158   1.126170e+03  2.839300e+02    5009.740000               No
## 159   1.997360e+03  5.039600e+02    8891.820000               No
## 160   7.905400e+02  1.994000e+02    3511.560000               No
## 161   1.736840e+03  4.370700e+02    7734.550000               No
## 162   2.917550e+03  7.399600e+02   13002.110000               No
## 163   1.683000e+03  4.250000e+02    7491.000000               No
## 164   4.313190e+03  1.090660e+03   19230.770000               No
## 165   2.198730e+03  5.496800e+02    9767.440000               No
## 166   1.921110e+03  4.840100e+02    8550.110000               No
## 167   1.065680e+03  2.684700e+02    4744.340000               No
## 168   5.696500e+02  1.434400e+02    2532.680000               No
## 169   1.259260e+03  3.185200e+02    5592.590000               No
## 170   1.048150e+03  2.629600e+02    4666.670000               No
## 171   3.477800e+03  8.773800e+02   15433.400000               No
## 172   1.262960e+03  3.185200e+02    5629.630000               No
## 173   8.844800e+02  2.238300e+02    3935.020000               No
## 174   4.820300e+03  1.215640e+03   21458.770000               No
## 175   2.987826e+02  7.544155e+01    1329.817200               No
## 176   4.480000e+03  1.128000e+03   19893.330000               No
## 177   5.655900e+02  1.427700e+02    2513.740000               No
## 178   1.273600e+03  3.206600e+02    5654.410000               No
## 179   1.403130e+03  3.539900e+02    6246.440000               No
## 180   2.777500e+02  6.992000e+01    1238.230000               No
## 181   6.235290e+03  1.573530e+03   27720.590000               No
## 182   2.315010e+03  5.814000e+02   10295.980000               No
## 183   2.093020e+03  5.285400e+02    9302.330000               No
## 184   7.565300e+02  1.912100e+02    3361.050000               No
## 185   4.551900e+02  1.148500e+02    2030.850000               No
## 186   1.658720e+03  4.175900e+02    7366.190000               No
## 187   2.875260e+03  8.773800e+02   12050.740000               No
## 188   2.784100e+02  7.014000e+01    1239.110000               No
## 189   7.453000e+01  1.880000e+01     331.920000               No
## 190   1.335700e+02  3.365000e+01     592.800000               No
## 191   2.395300e+02  6.041000e+01    1065.180000               No
## 192   4.144560e+03  1.043000e+03   18389.750000               No
## 193   1.129290e+04  2.850270e+03   50363.930000               No
## 194   1.151000e+01  2.900000e+00      51.200000               No
## 195   4.037270e+03  1.015530e+03   17919.250000               No
## 196   1.069470e+03  2.696500e+02    4753.200000               No
## 197   5.055400e+02  1.269800e+02    2245.080000               No
## 198   2.662110e+03  6.703100e+02   11846.790000               No
## 199   8.524100e+02  2.143100e+02    3786.720000               No
## 200   7.787200e+02  1.957400e+02    3459.570000               No
## 201   1.445430e+03  3.643100e+02    6430.680000               No
## 202   1.239750e+03  3.123000e+02    5520.500000               No
## 203   2.861300e+02  7.208000e+01    1274.120000               No
## 204   1.500000e+03  3.771400e+02    6657.140000               No
## 205   1.490000e+03  3.800000e+02    6640.000000               No
## 206   6.983100e+02  1.760400e+02    3106.550000               No
## 207   6.228000e+02  1.573200e+02    2761.980000               No
## 208   3.896460e+03  7.084500e+02   18637.600000               No
## 209   7.235370e+03  1.829270e+03   32214.630000               No
## 210   7.925000e+03  2.000000e+03   35250.000000              Yes
## 211   8.622000e+02  2.172400e+02    3829.120000               No
## 212   1.069800e+02  2.707000e+01     477.350000               No
## 213   8.217700e+02  2.073000e+02    3650.480000               No
## 214   3.862910e+03  9.729900e+02   17165.260000               No
## 215   7.112400e+02  1.792500e+02    3161.520000               No
## 216   1.840000e+03  4.600000e+02    8180.000000               No
## 217   2.710000e+03  6.800000e+02   12000.000000               No
## 218   1.011670e+03  2.548600e+02    4503.890000               No
## 219   1.333600e+02  3.362000e+01     594.930000               No
## 220   2.855239e-01  7.209242e-02       1.271103               No
## 221   6.023764e+02  1.514230e+02    2674.772000               No
##                    CPI cpi_130
## 1                 <NA>    <NA>
## 2                 <NA>    <NA>
## 3               138.01     yes
## 4                 <NA>    <NA>
## 5                 <NA>    <NA>
## 6                 <NA>    <NA>
## 7               724.48     yes
## 8               110.42      no
## 9                2,496     yes
## 10              153.62     yes
## 11              112.23      no
## 12               135.3     yes
## 13               121.8      no
## 14                <NA>    <NA>
## 15              131.05     yes
## 16              120.44      no
## 17                <NA>    <NA>
## 18               172.8     yes
## 19                <NA>    <NA>
## 20              128.89      no
## 21              122.46      no
## 22              121.47      no
## 23                <NA>    <NA>
## 24              160.08     yes
## 25              158.33     yes
## 26              128.41      no
## 27              195.01     yes
## 28               6,716     yes
## 29                <NA>    <NA>
## 30                <NA>    <NA>
## 31               9,401     yes
## 32              133.06     yes
## 33               332.8     yes
## 34              150.29     yes
## 35              136.24     yes
## 36               158.5     yes
## 37                <NA>    <NA>
## 38                <NA>    <NA>
## 39              172.62     yes
## 40              143.52     yes
## 41              133.82     yes
## 42                99.8     yes
## 43              136.45     yes
## 44              103.62      no
## 45                <NA>    <NA>
## 46                <NA>    <NA>
## 47                <NA>    <NA>
## 48              109.73      no
## 49                <NA>    <NA>
## 50               128.2      no
## 51                <NA>    <NA>
## 52              117.97      no
## 53                <NA>    <NA>
## 54               117.4      no
## 55              130.53     yes
## 56              105.07      no
## 57              171.05     yes
## 58                <NA>    <NA>
## 59                <NA>    <NA>
## 60                <NA>    <NA>
## 61              126.83      no
## 62              150.99     yes
## 63                <NA>    <NA>
## 64              291.86     yes
## 65              849.71     yes
## 66                <NA>    <NA>
## 67              138.87     yes
## 68                <NA>    <NA>
## 69              118.45      no
## 70                <NA>    <NA>
## 71                <NA>    <NA>
## 72              135.43     yes
## 73              267.31     yes
## 74              177.91     yes
## 75               118.1      no
## 76              195.24     yes
## 77                <NA>    <NA>
## 78                 117      no
## 79                <NA>    <NA>
## 80              112.55      no
## 81                <NA>    <NA>
## 82               161.5     yes
## 83              178.63     yes
## 84                <NA>    <NA>
## 85              218.71     yes
## 86              136.11     yes
## 87              135.23     yes
## 88              534.75     yes
## 89              191.86     yes
## 90                <NA>    <NA>
## 91              235.99     yes
## 92              166.16     yes
## 93                <NA>    <NA>
## 94              116.08      no
## 95               2,268     yes
## 96                <NA>    <NA>
## 97               121.5      no
## 98               120.1      no
## 99              132.49     yes
## 100               <NA>    <NA>
## 101               <NA>    <NA>
## 102             135.99     yes
## 103              294.6     yes
## 104               <NA>    <NA>
## 105               <NA>    <NA>
## 106               <NA>    <NA>
## 107               <NA>    <NA>
## 108             229.78     yes
## 109             260.85     yes
## 110              143.2     yes
## 111                119      no
## 112             202.55     yes
## 113             180.58     yes
## 114             279.36     yes
## 115               <NA>    <NA>
## 116             152.97     yes
## 117               <NA>    <NA>
## 118               <NA>    <NA>
## 119               <NA>    <NA>
## 120             164.26     yes
## 121             740.05     yes
## 122              130.9     yes
## 123             142.51     yes
## 124             128.88      no
## 125             124.49      no
## 126               <NA>    <NA>
## 127               <NA>    <NA>
## 128             163.89     yes
## 129             131.52     yes
## 130               <NA>    <NA>
## 131             130.61     yes
## 132               <NA>    <NA>
## 133             267.07     yes
## 134               <NA>    <NA>
## 135             279.47     yes
## 136             150.29     yes
## 137               <NA>    <NA>
## 138             129.58      no
## 139             236.26     yes
## 140               <NA>    <NA>
## 141              191.3     yes
## 142             246.03     yes
## 143             127.73      no
## 144               <NA>    <NA>
## 145               <NA>    <NA>
## 146              1,253      no
## 147             212.95     yes
## 148             132.48     yes
## 149               <NA>    <NA>
## 150               <NA>    <NA>
## 151              130.7     yes
## 152               <NA>    <NA>
## 153               <NA>    <NA>
## 154               <NA>    <NA>
## 155               <NA>    <NA>
## 156             109.77      no
## 157             182.49     yes
## 158             177.12     yes
## 159              111.6      no
## 160              123.9      no
## 161              247.6     yes
## 162             119.45      no
## 163               <NA>    <NA>
## 164               <NA>    <NA>
## 165               <NA>    <NA>
## 166               <NA>    <NA>
## 167               <NA>    <NA>
## 168             244.21     yes
## 169               <NA>    <NA>
## 170             122.11      no
## 171               <NA>    <NA>
## 172             120.75      no
## 173             143.68     yes
## 174             113.09      no
## 175             171.58     yes
## 176               <NA>    <NA>
## 177             136.65     yes
## 178             194.43     yes
## 179              159.6     yes
## 180             614.98     yes
## 181             115.11      no
## 182               <NA>    <NA>
## 183             126.13      no
## 184             150.48     yes
## 185               <NA>    <NA>
## 186              112.8      no
## 187             113.68      no
## 188              203.6     yes
## 189             47,954     yes
## 190              1,435      no
## 191               <NA>    <NA>
## 192             409.07     yes
## 193             106.15      no
## 194               <NA>    <NA>
## 195               <NA>    <NA>
## 196              -0.84      no
## 197             112.18      no
## 198             107.72      no
## 199             134.97     yes
## 200             160.21     yes
## 201             163.07     yes
## 202             208.92     yes
## 203               <NA>    <NA>
## 204               <NA>    <NA>
## 205               <NA>    <NA>
## 206             212.51     yes
## 207              235.3     yes
## 208             118.81      no
## 209                132     yes
## 210             307.62     yes
## 211             104.66      no
## 212               <NA>    <NA>
## 213              143.7     yes
## 214 22,244,633,245,832     yes
## 215               <NA>    <NA>
## 216               <NA>    <NA>
## 217               <NA>    <NA>
## 218               <NA>    <NA>
## 219             206.54     yes
## 220             341.67     yes
## 221             23,599     yes
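The CPI column printed above is stored as text (several values carry thousands separators, such as 2,496), and a few flags do not match a numeric cutoff of 130: row 42 (99.8) is flagged yes, while rows 146 (1,253) and 190 (1,435) are flagged no. One possible explanation is that the flag came from comparing strings rather than numbers. That is an assumption, as is the "above 130" rule itself, since the code that created cpi_130 is not shown in this part of the report. A minimal base R sketch of a numeric version, using a small hypothetical vector rather than the real column:

```r
# Hypothetical CPI strings mirroring the formats seen in the printout.
cpi_raw <- c("138.01", "99.8", "1,253", "2,496", "-0.84", NA)

# Text is compared character by character, so "99.8" sorts after "130".
"99.8" > "130"

# Strip thousands separators, then convert to numeric before thresholding.
cpi_num <- as.numeric(gsub(",", "", cpi_raw, fixed = TRUE))

# Assumed rule: flag CPI values above 130; a missing CPI stays NA.
cpi_130_fixed <- ifelse(cpi_num > 130, "yes", "no")

data.frame(cpi_raw, cpi_num, cpi_130_fixed)
```

If the real flag turns out to be affected, the group means and the permutation test that follow would need to be rerun on the corrected column.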
boxplot(median_salary ~ cpi_130, data = world_salary2)

world_salary2 %>%
  filter(!is.na(cpi_130)) %>%
  group_by(cpi_130) %>%
  summarise(mean_median_salary = mean(median_salary, na.rm = TRUE))
## # A tibble: 2 × 2
##   cpi_130 mean_median_salary
##   <chr>                <dbl>
## 1 no                   2451.
## 2 yes                  1239.
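# Null distribution: shuffle the cpi_130 labels 1,000 times and record the
# difference in mean median salary (yes minus no) for each shuffle.
null_dist <- world_salary2 %>%
  drop_na(cpi_130) %>%
  specify(median_salary ~ cpi_130) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>%
  calculate(stat = "diff in means", order = c("yes", "no"))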

null_dist
## Response: median_salary (numeric)
## Explanatory: cpi_130 (factor)
## Null Hypothesis: independence
## # A tibble: 1,000 × 2
##    replicate    stat
##        <int>   <dbl>
##  1         1   74.7 
##  2         2    9.46
##  3         3 -409.  
##  4         4   59.5 
##  5         5  262.  
##  6         6   43.0 
##  7         7  235.  
##  8         8 -229.  
##  9         9 -269.  
## 10        10 -125.  
## # ℹ 990 more rows
ggplot(data = null_dist, aes(x = stat)) +
  geom_histogram(bins = 30)

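# Observed statistic: difference in the mean of median_salary (yes minus no).
# Despite the variable name, this is a difference in means, not in medians.
obs_diff_median <- world_salary2 %>%
  filter(!is.na(median_salary), !is.na(cpi_130)) %>%
  specify(median_salary ~ cpi_130) %>%
  calculate(stat = "diff in means", order = c("yes", "no"))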
obs_diff_median
## Response: median_salary (numeric)
## Explanatory: cpi_130 (factor)
## # A tibble: 1 × 1
##     stat
##    <dbl>
## 1 -1212.
null_dist %>%
  get_p_value(obs_stat = obs_diff_median, direction = "two_sided")
## Warning: Please be cautious in reporting a p-value of 0. This result is an
## approximation based on the number of `reps` chosen in the `generate()` step.
## See `?get_p_value()` for more information.
## # A tibble: 1 × 1
##   p_value
##     <dbl>
## 1       0

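None of the 1,000 permuted differences was as extreme as the observed difference of about -1,212 USD, so the reported p-value is 0. As the warning notes, this is an approximation that depends on the number of reps, so it is better reported as p < 0.001. At any conventional significance level we can reject the null hypothesis that median salary is independent of cpi_130: countries flagged yes have a lower mean median salary (about 1,239 USD) than countries flagged no (about 2,451 USD).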
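To see why a permutation p-value of exactly 0 is only an approximation, the test can be written out by hand. The sketch below uses base R and hypothetical simulated salaries (not the world_salary2 data), and it is not a reproduction of how infer computes its p-value internally. It compares the raw proportion of extreme permutations with a commonly used corrected estimate that adds one to both the count and the number of reps:

```r
set.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical data: two groups with clearly different means.
salary <- c(rnorm(40, mean = 2400, sd = 800), rnorm(60, mean = 1200, sd = 600))
group  <- rep(c("no", "yes"), times = c(40, 60))

# Difference in means, "yes" minus "no", matching order = c("yes", "no").
diff_means <- function(y, g) mean(y[g == "yes"]) - mean(y[g == "no"])

obs  <- diff_means(salary, group)
reps <- 1000

# Shuffle the labels to simulate the null hypothesis of independence.
null_stats <- replicate(reps, diff_means(salary, sample(group)))

# Count permuted statistics at least as extreme as the observed one.
n_extreme <- sum(abs(null_stats) >= abs(obs))

p_raw       <- n_extreme / reps              # can be exactly 0
p_corrected <- (n_extreme + 1) / (reps + 1)  # never below 1 / (reps + 1)

c(raw = p_raw, corrected = p_corrected)
```

With 1,000 reps the corrected estimate cannot fall below 1/1001, which is why a result like the one above is usually reported as p < 0.001 rather than p = 0.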