This analysis explores cost-of-living parameters by comparing them across countries. The variables include the rent index, groceries index, and rank.
Loading libraries
# Load the seven libraries used in this analysis
library(readr)
library(tidyverse)
library(formattable)
library(dplyr)
library(psych)
library(ggplot2)
library(reshape2)
Importing csv file
cost = read_csv("Cost_Country.csv")
head(cost)
## # A tibble: 6 × 8
## Rank Country `Cost of Living Index` `Rent Index` Cost of Living Plus Re…¹
## <dbl> <chr> <dbl> <dbl> <dbl>
## 1 1 Switzerland 101. 46.5 74.9
## 2 2 Bahamas 85 36.7 61.8
## 3 3 Iceland 83 39.2 62
## 4 4 Singapore 76.7 67.2 72.1
## 5 5 Barbados 76.6 19 48.9
## 6 6 Norway 76 26.2 52.1
## # ℹ abbreviated name: ¹`Cost of Living Plus Rent Index`
## # ℹ 3 more variables: `Groceries Index` <dbl>, `Restaurant Price Index` <dbl>,
## # `Local Purchasing Power Index` <dbl>
dim(cost)
## [1] 121 8
There are 121 observations (rows) and 8 variables (columns).
Checking for missing values
missing_values = sum(is.na(cost))
missing_values
## [1] 0
#summary of data cleaning
print(paste("There are", missing_values, "missing values in the data set, therefore, no further cleaning or omitting of null values is needed."))
## [1] "There are 0 missing values in the data set, therefore, no further cleaning or omitting of null values is needed."
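A single total hides which columns the gaps are in. As a minimal sketch, assuming a small made-up data frame `toy` (its columns and NA values are invented for illustration and are not part of `Cost_Country.csv`), the same check can be broken down per column and per row:

```r
# Hypothetical toy data standing in for `cost`; the NA values are invented.
toy <- data.frame(
  Country = c("A", "B", "C"),
  rent_index = c(46.5, NA, 39.2),
  groceries_index = c(NA, NA, 40.5)
)

# Count missing values in each column instead of across the whole table.
na_per_column <- colSums(is.na(toy))
print(na_per_column)

# Count rows that contain at least one missing value.
rows_with_na <- sum(!complete.cases(toy))
print(rows_with_na)
```

On the real data, `colSums(is.na(cost))` would be expected to return zero for every column, since the total above is 0.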
Summary statistics
summary(cost)
## Rank Country Cost of Living Index Rent Index
## Min. : 1 Length:121 Min. : 18.80 Min. : 2.40
## 1st Qu.: 31 Class :character 1st Qu.: 30.20 1st Qu.: 8.50
## Median : 61 Mode :character Median : 39.50 Median :12.40
## Mean : 61 Mean : 43.56 Mean :16.05
## 3rd Qu.: 91 3rd Qu.: 52.80 3rd Qu.:20.10
## Max. :121 Max. :101.10 Max. :67.20
## Cost of Living Plus Rent Index Groceries Index Restaurant Price Index
## Min. :11.10 Min. : 17.50 Min. :12.80
## 1st Qu.:19.80 1st Qu.: 31.60 1st Qu.:21.60
## Median :27.00 Median : 40.50 Median :33.10
## Mean :30.36 Mean : 44.23 Mean :36.47
## 3rd Qu.:37.00 3rd Qu.: 53.70 3rd Qu.:47.20
## Max. :74.90 Max. :109.10 Max. :97.00
## Local Purchasing Power Index
## Min. : 2.30
## 1st Qu.: 34.80
## Median : 50.60
## Mean : 65.09
## 3rd Qu.: 99.40
## Max. :182.50
describe(cost)
## vars n mean sd median trimmed mad min
## Rank 1 121 61.00 35.07 61.0 61.00 44.48 1.0
## Country* 2 121 61.00 35.07 61.0 61.00 44.48 1.0
## Cost of Living Index 3 121 43.56 16.15 39.5 42.05 15.12 18.8
## Rent Index 4 121 16.05 11.41 12.4 14.15 6.97 2.4
## Cost of Living Plus Rent Index 5 121 30.36 13.26 27.0 28.81 11.56 11.1
## Groceries Index 6 121 44.23 17.06 40.5 42.22 14.08 17.5
## Restaurant Price Index 7 121 36.47 18.26 33.1 34.51 18.09 12.8
## Local Purchasing Power Index 8 121 65.09 39.57 50.6 61.68 31.88 2.3
## max range skew kurtosis se
## Rank 121.0 120.0 0.00 -1.23 3.19
## Country* 121.0 120.0 0.00 -1.23 3.19
## Cost of Living Index 101.1 82.3 0.87 0.38 1.47
## Rent Index 67.2 64.8 1.82 3.97 1.04
## Cost of Living Plus Rent Index 74.9 63.8 1.08 0.80 1.21
## Groceries Index 109.1 91.6 1.09 0.99 1.55
## Restaurant Price Index 97.0 84.2 0.92 0.33 1.66
## Local Purchasing Power Index 182.5 180.2 0.77 -0.28 3.60
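In the `skew` column above, every index is positive and Rent Index is the largest at 1.82, meaning each distribution has a longer right tail. The sketch below shows one common definition of sample skewness in base R. The helper `sample_skew` is a hypothetical name, and `describe()` may scale the statistic differently, so treat it as an illustration rather than a reproduction of the table.

```r
# Sample skewness: mean cubed deviation divided by the cubed standard deviation.
# This is one common definition; packages can differ in the exact scaling.
sample_skew <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / sd(x)^3
}

# Made-up vectors for illustration only.
symmetric <- c(1, 2, 3, 4, 5)        # balanced around the mean
right_tailed <- c(1, 1, 2, 2, 20)    # one large value stretches the right tail

skew_symmetric <- sample_skew(symmetric)
skew_right <- sample_skew(right_tailed)
print(c(symmetric = skew_symmetric, right_tailed = skew_right))
```

A value near zero indicates a roughly symmetric distribution, while a clearly positive value, as for the rent index, indicates that a few countries sit far above the rest.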
Identifying cheapest country to live in
# Refer to the column directly inside filter(); the cost$ prefix is unnecessary
cheapest_country = cost %>%
  filter(`Cost of Living Index` == min(`Cost of Living Index`))
cheapest_country
## # A tibble: 1 × 8
## Rank Country `Cost of Living Index` `Rent Index` Cost of Living Plus Rent …¹
## <dbl> <chr> <dbl> <dbl> <dbl>
## 1 121 Pakistan 18.8 2.8 11.1
## # ℹ abbreviated name: ¹`Cost of Living Plus Rent Index`
## # ℹ 3 more variables: `Groceries Index` <dbl>, `Restaurant Price Index` <dbl>,
## # `Local Purchasing Power Index` <dbl>
According to the Cost of Living Index, the cheapest country to live in is Pakistan. It has a rent index of 2.8, a groceries index of 17.5, and a local purchasing power index of 29.1.
Identifying most expensive country to live in
# Refer to the column directly inside filter(); the cost$ prefix is unnecessary
exp_country = cost %>%
  filter(`Cost of Living Index` == max(`Cost of Living Index`))
exp_country
## # A tibble: 1 × 8
## Rank Country `Cost of Living Index` `Rent Index` Cost of Living Plus Re…¹
## <dbl> <chr> <dbl> <dbl> <dbl>
## 1 1 Switzerland 101. 46.5 74.9
## # ℹ abbreviated name: ¹`Cost of Living Plus Rent Index`
## # ℹ 3 more variables: `Groceries Index` <dbl>, `Restaurant Price Index` <dbl>,
## # `Local Purchasing Power Index` <dbl>
Switzerland is the most expensive country to live in according to the data set, which does not surprise me. In contrast to Pakistan, its rent index is 46.5, its groceries index is 109.1, and its local purchasing power index is 158.7.
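The two lookups can also be written in base R with `which.min()` and `which.max()`. The sketch below uses a hand-typed subset of the rows printed above. The data frame `cost_sample` and its snake_case column names are assumptions made for illustration, not the names in the CSV.

```r
# Small subset copied from the printed output; column names are hypothetical.
cost_sample <- data.frame(
  Country = c("Switzerland", "Bahamas", "Iceland", "Singapore", "Pakistan"),
  cost_of_living_index = c(101.1, 85, 83, 76.7, 18.8),
  rent_index = c(46.5, 36.7, 39.2, 67.2, 2.8)
)

# which.min()/which.max() return the position of the first extreme value.
cheapest <- cost_sample[which.min(cost_sample$cost_of_living_index), ]
priciest <- cost_sample[which.max(cost_sample$cost_of_living_index), ]

# Stack both rows to compare them side by side.
print(rbind(cheapest, priciest))
```

One design difference is worth noting: `which.min()` returns only the first match, whereas filtering on equality with `min()` keeps every tied row.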
Correlations between Indices
cor_matrix = cost %>%
  select(`Cost of Living Index`, `Rent Index`, `Groceries Index`, `Local Purchasing Power Index`) %>%
  cor(use = "complete.obs")
cor_matrix
## Cost of Living Index Rent Index Groceries Index
## Cost of Living Index 1.0000000 0.8208850 0.9584520
## Rent Index 0.8208850 1.0000000 0.7709442
## Groceries Index 0.9584520 0.7709442 1.0000000
## Local Purchasing Power Index 0.6926879 0.6839118 0.6406340
## Local Purchasing Power Index
## Cost of Living Index 0.6926879
## Rent Index 0.6839118
## Groceries Index 0.6406340
## Local Purchasing Power Index 1.0000000
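Each entry in the matrix is a Pearson correlation coefficient, which is the default method of `cor()`. As a rough sketch of what that number measures, the code below computes it by hand for the six rows shown by `head(cost)` and compares it with `cor()`. The helper `pearson_r` is a hypothetical name. With only six rows the result will not match the 0.82 reported for the full 121 countries.

```r
# Values copied from the first six rows printed by head(cost).
col_index  <- c(101.1, 85, 83, 76.7, 76.6, 76)     # Cost of Living Index
rent_index <- c(46.5, 36.7, 39.2, 67.2, 19, 26.2)  # Rent Index

# Pearson r: co-variation of the two variables scaled by their own variation.
pearson_r <- function(x, y) {
  dx <- x - mean(x)
  dy <- y - mean(y)
  sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))
}

r_manual <- pearson_r(col_index, rent_index)
r_builtin <- cor(col_index, rent_index)
print(c(manual = r_manual, builtin = r_builtin))
```

The coefficient always lies between -1 and 1, and its sign gives the direction of the relationship: positive when the two indices rise together, negative when one falls as the other rises.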
Visualizing the correlations
cor_melt = melt(cor_matrix)
ggplot(cor_melt, aes(Var1, Var2, fill = value)) +
  geom_tile() +
  geom_text(aes(label = round(value, 2)), color = "black") +
  scale_fill_gradient2(low = "red",
                       high = "maroon", mid = "white",
                       midpoint = 0) +
  theme_minimal() +
  labs(title = "Correlation Heatmap of the Indices above", x = "Index", y = "Index")
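The heat map needs the matrix in long format, with one row per pair of indices. As an illustration of that reshaping step, the base R sketch below produces the same three-column layout (`Var1`, `Var2`, `value`) from a 2 × 2 excerpt of the matrix, with values rounded to two decimals. The names `small_matrix` and `long_format` are hypothetical.

```r
# 2 x 2 excerpt of the correlation matrix, rounded to two decimals.
index_names <- c("Cost of Living Index", "Rent Index")
small_matrix <- matrix(
  c(1, 0.82,
    0.82, 1),
  nrow = 2,
  dimnames = list(index_names, index_names)
)

# as.table() + as.data.frame() yields one row per (row, column) pair;
# responseName renames the value column from the default "Freq".
long_format <- as.data.frame(as.table(small_matrix), responseName = "value")
print(long_format)
```

Each tile in the plot corresponds to one row of this long table, which is why the symmetric pairs appear twice, mirrored across the diagonal.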
Analysis: The heat map shows that every pair of indices is positively correlated, with coefficients ranging from 0.64 to 0.96. The strongest relationship is between the Cost of Living Index and the Groceries Index (0.96), so these two rise together almost in lockstep. “Local Purchasing Power Index” has the weakest correlations with the other variables (0.64 to 0.69), but they are still positive. This means that countries with higher rent, grocery, and living costs also tend to have higher local purchasing power, not lower. The data therefore does not show purchasing power falling as costs rise. One possible explanation is that higher-cost countries also tend to have higher incomes, but the data set contains no income variable, so this remains a hypothesis.