Our Problem Statement

  1. Over the years, as the cost of living in Singapore rises, word on the street is that Singaporeans have been complaininng about the rising healthcare cost. Is there a upward cost trend only for Singapore or is this a general trend across mulitiple countries? Our guess is healthcare cost is increasing, Singapore, a city state with an aging population, should be increasing too, but at what rate comparing to other similar countries in the region?

  2. Singapore governments has introduced a couple of related packages (Pioneer Generation, Merdeka Generation, CareShield, Medifund). Is the Singapore government spending significantly more on healthcare over the years?

  3. These packages however, are getting rolled out for specific age groups, and the financial burden for the elderly might fall upon their children as the government subsidies are largely focused on outpatient services provided in Polyclinics. Other subsidies like the Community Health Assistance Schemes (CHAS) are targeted at low income groups, and in patient services are supported by subsidies like Medifund after means testing occurs. For the average joe in Singapore (based on household expenditure and income percentile), is this government support sufficient to cushion the rise? Is the Singapore government spending enough on healthcare, or are the citizen bearing the brunt of the rise cost of healthcare? Who is paying more, you or the govenment?

For our data science project, we activated the following packages, using the Tidyverse approach.

library(directlabels)
library(tidyverse)
## ── Attaching packages ─────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.1.0     ✔ purrr   0.2.5
## ✔ tibble  1.4.2     ✔ dplyr   0.7.7
## ✔ tidyr   0.8.1     ✔ stringr 1.3.1
## ✔ readr   1.1.1     ✔ forcats 0.3.0
## ── Conflicts ────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(dplyr)
library(reshape2)
## 
## Attaching package: 'reshape2'
## The following object is masked from 'package:tidyr':
## 
##     smiths
library(ggplot2)
library(rvest)
## Loading required package: xml2
## 
## Attaching package: 'rvest'
## The following object is masked from 'package:purrr':
## 
##     pluck
## The following object is masked from 'package:readr':
## 
##     guess_encoding
library(ggthemes)

Import - Part 1 : Singapore government expenditure comparing to other countries

We would be looking into Singapore expenditure using Current Health Expenditure (CHE) 1 - per person in USD per capita 2 - as a percentage of Gross Domestic Product (GDP)

We will choose countries of the same income group (High Income) and same region for comparison across countries.

At the same time, we also want to see how efficient is Singapore healthcare compared to other countries. Healthcare efficiency is to rank the life expectancy of the countries in relative to the CHE per person in USD per capita. We will use a scatterplot to show relationship between healthcare expenditure and national life expectancy in order to gain perspective on how to efficiently increase the quality of health of that country

Now, we will set the environment and import our dataset.

ExpUSD <- read.csv("CHE_USD.csv")
ExpGDP <- read.csv("CHE_GDP.csv")
countryname <- read.csv("countrylist.csv")

url <- "https://en.wikipedia.org/wiki/List_of_countries_by_life_expectancy"
countryefficiency <- read_html(url)
countryefficiency
## {xml_document}
## <html class="client-nojs" lang="en" dir="ltr">
## [1] <head>\n<meta http-equiv="Content-Type" content="text/html; charset= ...
## [2] <body class="mediawiki ltr sitedir-ltr mw-hide-empty-elt ns-0 ns-sub ...
efficiency_table <- html_nodes(countryefficiency, "table")
eff <- html_table(efficiency_table, fill=TRUE)[[1]] #1st element

This section aims to show the difference between healthcare expenditure over the years of similar countries to Singapore. Specifically, we would be looking into Singapore expenditure using Current Health Expenditure (CHE) per capita and as a percentage of Gross Domestic Product (GDP) for comparison across country (high income country of the same region).

Current Health Expenditure: Total expenditure on health is the sum of general government health expenditure and private health expenditure in a given year, calculated in national currency units in current prices

Gross Domestic Product: Total value of goods produced and services provided in a country during one year.

In the CHE_USD.csv dataset, CHE per capita USD is average expenditure on health per person in US dollar for 191 countries. Source is http://apps.who.int/nha/database/Select/Indicators/en.

In CHE_GDP.csv dataset, CHE % of GDP is the share of spending on health in each country relative to the size of its economy for a specific year for 191 countries. In comparison to CHE per capita USD, it allows for a comparison on not just an absolute cost value. Source is http://apps.who.int/nha/database/Select/Indicators/en.

The data from the above 2 dataset spans from year 2000 to 2016.

In countrylist.csv, it lists all the countries and their current incomes and regions. This list allows us to join the data from other dataset and filter same income group (High Income Countries) and same region (East Asia & Pacific) source is from https://datahelpdesk.worldbank.org/knowledgebase/articles/906519

For efficiency of healthcare system, we use web scrapping technique to get the data from wikipedia website. We measure a country’s healthcare efficiency as the longevity of that country citizen and the absolute cost of healthcare. We took the life expectancy for all countries listed, use the “Both Sexes Life Expectancy” figure and compared with the CHE USD per capita and did a scatterplot - showing ONLY the top 20 countries.

glimpse(ExpUSD)
## Observations: 192
## Variables: 20
## $ Countries  <fct> , Algeria, Angola, Benin, Botswana, Burkina Faso, B...
## $ Indicators <fct> , Current Health Expenditure (CHE) per Capita in US...
## $ X          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X2000      <fct> Value, 61, 13, 16, 195, 8, 8, 26, 62, 12, 10, 46, 1...
## $ X2001      <fct> Value, 67, 29, 17, 206, 8, 9, 28, 66, 10, 12, 43, 2...
## $ X2002      <fct> Value, 66, 29, 17, 145, 9, 8, 31, 71, 11, 18, 48, 1...
## $ X2003      <fct> Value, 75, 35, 22, 241, 12, 7, 39, 87, 13, 18, 57, ...
## $ X2004      <fct> Value, 92, 50, 24, 305, 18, 10, 42, 96, 13, 28, 63,...
## $ X2005      <fct> Value, 100, 54, 24, 281, 18, 13, 43, 98, 15, 32, 62...
## $ X2006      <fct> Value, 116, 69, 25, 269, 21, 16, 44, 115, 16, 35, 6...
## $ X2007      <fct> Value, 150, 92, 28, 300, 28, 18, 51, 130, 19, 35, 6...
## $ X2008      <fct> Value, 206, 135, 30, 321, 30, 18, 61, 130, 18, 38, ...
## $ X2009      <fct> Value, 207, 120, 31, 350, 32, 22, 58, 133, 20, 36, ...
## $ X2010      <fct> Value, 228, 97, 31, 381, 34, 26, 59, 148, 17, 36, 6...
## $ X2011      <fct> Value, 286, 122, 35, 471, 35, 25, 49, 158, 19, 39, ...
## $ X2012      <fct> Value, 334, 122, 39, 499, 33, 22, 63, 172, 19, 35, ...
## $ X2013      <fct> Value, 330, 144, 36, 378, 43, 22, 67, 187, 16, 47, ...
## $ X2014      <fct> Value, 358, 132, 34, 398, 40, 16, 71, 187, 18, 47, ...
## $ X2015      <fct> Value, 291, 109, 31, 389, 34, 14, 64, 159, 22, 36, ...
## $ X2016      <fct> Value, 260, 95, 30, 380, 41, 18, 64, 159, 16, 32, 5...
glimpse(ExpGDP)
## Observations: 264
## Variables: 63
## $ Country.Name   <fct> Aruba, Afghanistan, Angola, Albania, Andorra, A...
## $ Country.Code   <fct> ABW, AFG, AGO, ALB, AND, ARB, ARE, ARG, ARM, AS...
## $ Indicator.Name <fct> Current health expenditure (% of GDP), Current ...
## $ Indicator.Code <fct> SH.XPD.CHEX.GD.ZS, SH.XPD.CHEX.GD.ZS, SH.XPD.CH...
## $ X1960          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1961          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1962          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1963          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1964          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1965          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1966          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1967          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1968          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1969          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1970          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1971          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1972          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1973          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1974          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1975          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1976          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1977          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1978          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1979          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1980          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1981          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1982          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1983          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1984          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1985          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1986          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1987          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1988          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1989          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1990          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1991          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1992          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1993          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1994          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1995          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1996          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1997          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1998          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X1999          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X2000          <dbl> NA, NA, 2.503650, 6.761323, 9.348101, 4.200040,...
## $ X2001          <dbl> NA, NA, 5.191235, 6.515849, 9.363032, 4.339081,...
## $ X2002          <dbl> NA, 9.443391, 4.241619, 6.434332, 9.425646, 4.3...
## $ X2003          <dbl> NA, 8.941259, 4.703964, 6.353730, 9.178672, 4.1...
## $ X2004          <dbl> NA, 9.808473, 5.008040, 6.559773, 9.192787, 3.8...
## $ X2005          <dbl> NA, 9.948289, 3.976992, 6.342830, 9.778755, 3.4...
## $ X2006          <dbl> NA, 10.622766, 3.600971, 5.932672, 9.713320, 3....
## $ X2007          <dbl> NA, 9.904674, 3.216291, 6.129000, 9.808338, 3.5...
## $ X2008          <dbl> NA, 10.256496, 3.499375, 5.582805, 10.432809, 3...
## $ X2009          <dbl> NA, 9.818487, 3.584522, 5.102306, 10.895869, 4....
## $ X2010          <dbl> NA, 8.569671, 2.741499, 5.034761, 11.566204, 3....
## $ X2011          <dbl> NA, 8.561907, 2.843139, 5.634350, 11.825641, 3....
## $ X2012          <dbl> NA, 7.897168, 2.664953, 5.860722, 12.090213, 4....
## $ X2013          <dbl> NA, 8.805964, 2.992699, 6.343861, 14.158376, 4....
## $ X2014          <dbl> NA, 9.652356, 2.799603, 6.956531, 12.030786, 4....
## $ X2015          <dbl> NA, 10.297595, 2.946978, 6.824439, 11.974776, 4...
## $ X2016          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X2017          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ X2018          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
glimpse(countryname)
## Observations: 263
## Variables: 5
## $ Country.Code <fct> ABW, AFG, AGO, ALB, AND, ARB, ARE, ARG, ARM, ASM,...
## $ Region       <fct> Latin America & Caribbean, South Asia, Sub-Sahara...
## $ IncomeGroup  <fct> High income, Low income, Lower middle income, Upp...
## $ SpecialNotes <fct> Mining is included in agriculture
## Electricty and ...
## $ TableName    <fct> Aruba, Afghanistan, Angola, Albania, Andorra, Ara...
glimpse(eff)
## Observations: 183
## Variables: 9
## $ `Country and regions`              <chr> "Japan", "Switzerland", "Si...
## $ `Both sexesrank`                   <int> 1, 2, 3, 4, 4, 6, 6, 8, 9, ...
## $ `Both sexes lifeexpectancy`        <dbl> 83.7, 83.4, 83.1, 82.8, 82....
## $ `Female rank`                      <int> 1, 6, 2, 7, 3, 10, 7, 9, 12...
## $ `Female life expectancy`           <dbl> 86.8, 85.3, 86.1, 84.8, 85....
## $ `Male rank`                        <int> 6, 1, 10, 3, 9, 2, 6, 5, 4,...
## $ `Male life expectancy`             <dbl> 80.5, 81.3, 80.0, 80.9, 80....
## $ `Both sexesrank (HALE)`            <int> 1, 4, 2, 15, 9, 7, 5, 5, 12...
## $ `Both sexes lifeexpectancy (HALE)` <dbl> 74.9, 73.1, 73.9, 71.9, 72....

Tidy & Transform

  1. filter out unnecessary data rows
  2. remove “,” in the value fields and change them to numeric fields
  3. remove all null values
  4. rename the field of both files for ease of reading and joining purpose
ExpUSD <- ExpUSD[2:192,]
FList <- c("X2000", "X2001", "X2002", "X2003", "X2004", "X2005", "X2006", "X2007", "X2008", "X2009", "X2010", "X2011", "X2012", "X2013", "X2014", "X2015", "X2016")
for ( i in FList ) {
  
  ExpUSD[,i] <- gsub(",","",ExpUSD[,i])
  ExpUSD[,i] <- as.numeric(ExpUSD[,i])

  }
ExpUSD[sapply(ExpUSD, function(x) all(is.na(x)))] <- NULL
ExpGDP[sapply(ExpGDP, function(x) all(is.na(x)))] <- NULL

names(ExpUSD) <- c("Country.Name", "Indicators", "2000", "2001", "2002", "2003", "2004", "2005", "2006", "2007", "2008", "2009", "2010", "2011", "2012", "2013", "2014", "2015", "2016")

names(ExpGDP) <- c("Country.Name", "Country.Code","Indicator.Name","Indicator.Code", "2000", "2001", "2002", "2003", "2004", "2005", "2006", "2007", "2008", "2009", "2010", "2011", "2012", "2013", "2014", "2015")

names(countryname) <- c("Country.Code", "Region", "IncomeGroup", "SpecialNotes", "Country.Name")

names(eff) <- c("Country.Name", "BothSexRank", "BothSexExp", "FRank", "FLifeExp","MRank","MLifeExp","BSexRankH","BSexExpH")
eff$Country.Name <- as.factor(eff$Country.Name)

Model

For the preparation of the model, we created 2 joined tables (by USD, by GDP), so that we can filter the relevant group of countries based on their income and region. Filter those high income and East Asia Region Countries for similar comparison to Singapore

Convert the current wide data to long format to prepare for modeling

allcountryUSD <- ExpUSD %>% left_join(countryname)
## Joining, by = "Country.Name"
## Warning: Column `Country.Name` joining factors with different levels,
## coercing to character vector
highAPcountryUSD <- allcountryUSD %>%
  filter(IncomeGroup == "High income" & Region == "East Asia & Pacific") %>%
  select(- Indicators, - Region, - IncomeGroup, - SpecialNotes)

highAPcountryLongUSD <- melt(highAPcountryUSD,
                          id.vars=c("Country.Name", "Country.Code"))
                          
allcountryGDP <- ExpGDP %>% left_join(countryname)
## Joining, by = c("Country.Name", "Country.Code")
## Warning: Column `Country.Name` joining factors with different levels,
## coercing to character vector
## Warning: Column `Country.Code` joining factors with different levels,
## coercing to character vector
allcountryGDP <- allcountryGDP %>%
  filter(complete.cases(allcountryGDP))
highAPcountryGDP <- allcountryGDP %>%
  filter(IncomeGroup == "High income" & Region == "East Asia & Pacific" & !is.na(2000)| Country.Name == "World") %>%
  select(- Indicator.Code, - Indicator.Name, - Region, - IncomeGroup, -SpecialNotes)
highAPcountryLongGDP <- melt(highAPcountryGDP,
                  # ID variables - all the variables to keep but not split apart on
                  id.vars=c("Country.Name", "Country.Code"))

allcountryeff <- ExpUSD %>% left_join(eff)
## Joining, by = "Country.Name"
## Warning: Column `Country.Name` joining factors with different levels,
## coercing to character vector
allcountryeff45 <- allcountryeff %>%
  filter(BothSexRank<=20)

Visualize Part 1 (Singapore government expenditure comparing to other countries)

Figure 1 : To visualize, we created line chart to see the per capita spending and their trend across the countries over the years. X axis - categorical data (Year), Y axis - numerical data (US$ spending per capita)

Figure 2 : To visualize, we created line chart to see how much % of their GDP is spent on healthcare for each country. X axis - categorical data (Year), Y axis - numerical data (%age of GDP spent on healthcare)

Figure 3 : To visualize we created a scatterplot chart to show the correlaton of 2 factors (Life expectancy vs US$ spending on healthcare). X axis - Life Expectancy (by Age), Y axis - %age of GDP spent on healthcare

ggplot(data=highAPcountryLongUSD, mapping=aes(x=variable, y=value, color=Country.Name)) +
  geom_line(aes(group = Country.Name)) +
  theme(legend.position="none") +
  geom_dl(aes(label = Country.Name), method = list(dl.combine( "last.points"), cex = 0.8)) +
  xlab("Year") +
  ylab("US$ per capita") +
  labs(title="Figure 1 : Health Care Expenditure - East Asia & Pacific Region", 
       subtitle="Current Health Expenditure (CHE) per Capita in US$") 

ggplot(data=highAPcountryLongGDP, mapping=aes(x=variable, y=value, color=Country.Name)) +
  geom_line(aes(group = Country.Name)) +
  theme(legend.position="none") +
  geom_dl(aes(label = Country.Name), method = list(dl.combine( "last.points"), cex = 0.8)) +
  xlab("Year") +
  ylab("Healthcare % of Total GDP") +
  labs(title="Figure 2 : Public Healthcare Expenditure as share of GDP", 
       subtitle="High Income - East Asia Pacific Region") 

ggplot(data=allcountryeff45) +
  geom_point(mapping=aes(x=BothSexExp, y=allcountryeff45$`2016`, color=Country.Name), 
                         size=6, alpha=4/10, show.legend = FALSE) +
  geom_text(aes(label=Country.Name,x=BothSexExp, y=allcountryeff45$`2016`)) +
  labs(title="Figure 3 : HealthCare Efficiency Chart - Life Expectancy vs CHE per capital", 
       subtitle="Top 20 Efficient Healthcare Countries") +
  xlab("Life Expectancy") +
  ylab("US$ per capita") +
  scale_x_reverse() +
  scale_y_reverse()

Interpretation of the Results - Part 1

Figure 1. From 2000 to 2016 all countries had an increase in their CHE apart from Brunei who had minimal increases and decreases over the years. Beginning in 2000, all countries faced a small dip in CHE. However, from 2003 to 2005 Singapore had a slight dip before a huge increase in comparison to Palau and Brunei increasing from below 1000 to approximately 2500.

Notably, New Zealand also had an increase in CHE that was much higher than that of Singapore surpassing that of Japan in 2014 only. From 2004 to 2005 Australia increased her expenditure to surpass Japan. Both Australia and New Zealand had largely similar growth rates possibly suggesting that geographical proximity influences the CHE per capita costs since Singapore, Palau and Brunei had relatively close expenditures in 2000 as compared to the other three countries in the East Asia Pacific Region before 2005.

Figure 2. From 2000- 2015, all countries showed overall increases in public healthcare expenditure as a share of GDP. Comparing the percentage of GDP expenditure of the East Asia & Pacific Region to the world, most countries are spending a smaller share as compared to the rest of the world except for Japan after 2010 and Palau after 2005. East Asia countries like South Korea and Japan saw larger increases over the years as compare to other countries. Singapore and Brunei interestingly has a similar healthcare expenditure trend, which could be explained by the proximity of countries and close governmental ties.

Figure 3. Typically, you will expect to experience improved health outcomes (e.g longevity) as a country spends more on their health expenditures. Singapore appears in this chart as we filter the top 20 countries with the highest life expectancy. Upon closer look, Singapore is ranked 3 top in terms of life expectancy. However it only spends >$2000 per captia, it managed to achieve a life expectancy of 84 years old. Japan, Switerland - similar around the same life expectancy, spends 1.5x and 3x more than Singapore. In conclusion, Singapore is ranked top in terms of healthcare efficiency.

Import - Part 2 : Trend of Singapore government Health Expenditure over the years

We would be looking how the Singapore government budget has changed over the years. Singapore government’s public expenditure is finance by government revenue. These revenue comes from collection of taxes, fees, charges and and returns from reserve investment. Singapore government can spend the public expenditure on many needs - Defence, Education, Economy, Healthcare, etc..

We seek to determine the change in public expenditure trend over the years 1997 to 2016, and any significance changes over the years

Then, we set the environment and imported our dataset.

govtexp <- read.csv("GovtExpOrig.csv")

This section aims to show the trend in Total Public Expenditure by ministries from Year 1997 to Year 2016. Source is https://www.tablebuilder.singstat.gov.sg/publicfacing/createDataTable.action?refId=10304

The Singapore government spends on 4 main & subgroups of expenditures Social Development, Education, Health, National Development, The Environment And Water Resources, Culture, Community And Youth, Social And Family Development, Communications And Information, Manpower (Financial Security), Security And External Relations, Defence, Home Affair, Foreign Affairs, Economic Development, Transport, Trade And Industry, Manpower (Excluding Financial Security), Communications And Information, Government Administration, Finance, Law, Organs Of State, and the Prime Minister’s Office.

In this comparsion, we will be looking at 4 more significant expenditure, mainly Defence, Education, Home Affairs and Health. We will compare the trend over the years.

glimpse(govtexp)
## Observations: 44
## Variables: 21
## $ Variables <fct>  Total Operating Expenditure ,      Social Developme...
## $ X1997     <fct> 14,079.60, 5,479.60, 3,347.80, 896, 441.1, 318.4,  n...
## $ X1998     <fct> 14,651.50, 5,433.30, 3,167.40, 992.4, 402, 314.2,  n...
## $ X1999     <fct> 14,867.50, 5,409.50, 3,256.90, 935.8, 357.6, 329.4, ...
## $ X2000     <fct> 18,414.90, 6,653.80, 4,276.90, 1,071.50, 324.5, 390....
## $ X2001     <fct> 18,536.20, 7,769.90, 4,766.60, 1,445.50, 397.5, 413....
## $ X2002     <fct> 19,358.90, 7,945.60, 4,824.40, 1,450.90, 409.5, 447....
## $ X2003     <fct> 19,990.70, 8,614.80, 4,996.80, 1,904.20, 413.8, 453....
## $ X2004     <fct> 20,355, 8,499.60, 4,974.70, 1,604, 376.5, 478.8,  na...
## $ X2005     <fct> 21,444.70, 8,777.60, 5,215.30, 1,680.40, 335.6, 408....
## $ X2006     <fct> 23,924.60, 10,519.90, 6,351.70, 1,839.50, 671.4, 413...
## $ X2007     <fct> 25,952.10, 11,474.60, 6,785.60, 2,019.50, 899.9, 453...
## $ X2008     <fct> 28,733.60, 13,200.20, 7,476.50, 2,378.50, 805.1, 605...
## $ X2009     <fct> 30,908.90, 14,714.20, 7,837.90, 2,920.20, 959.3, 656...
## $ X2010     <fct> 33,270.10, 16,458.90, 8,998.70, 3,258, 789.5, 720,  ...
## $ X2011     <fct> 35,150, 18,056.30, 9,697.80, 3,488.80, 1,428, 726.4,...
## $ X2012     <fct> 36,420.80, 18,495.60, 9,637.30, 4,066.10, 948.8, 812...
## $ X2013     <fct> 39,724.90, 20,943.20, 10,664.90, 5,043.90, 673.3, 95...
## $ X2014     <fct> 42,685.20, 22,611.80, 10,712.40, 5,872.40, 857.9, 1,...
## $ X2015     <fct> 48,090.40, 26,258.20, 11,235.70, 7,519.80, 1,331.60,...
## $ X2016     <fct> 52,128.90, 28,939.50, 11,812.20, 8,199.40, 2,284.10,...

Tidy & Transform

  1. Remove all the main total expenditure
  2. Field cleansing – remove , in values
  3. Field type – change to numeric, character
  4. Remove all NA data
  5. Field conversion – change value to % of total
  6. Filter – remove expenditures less than 8% (Top 4 expenditure)
govtexp <- govtexp[c(1:1,3:10, 12:14, 16:19, 21:24),]
FList <- c("X1997", "X1998", "X1999","X2000", "X2001", "X2002", "X2003", "X2004", "X2005", "X2006", "X2007", "X2008", "X2009", "X2010", "X2011", "X2012", "X2013", "X2014", "X2015", "X2016")

for ( i in FList ) 
{
  govtexp[,i] <- gsub(",","",govtexp[,i])
  govtexp[,i] <- suppressWarnings(as.numeric(govtexp[,i]))
}
govtexp$Variables <- as.character(govtexp$Variables)

govtexp <- govtexp %>%
  filter( trimws(Variables) != "Total Operating Expenditure")

govtexp <- govtexp %>%
  filter(! is.na(X1997))

# convert to percent
FList <- c("X1997", "X1998", "X1999","X2000", "X2001", "X2002", "X2003", "X2004", "X2005", "X2006", "X2007", "X2008", "X2009", "X2010", "X2011", "X2012", "X2013", "X2014", "X2015", "X2016")
for ( i in FList ) 
{
  govtexp[,i] <- govtexp[,i] / sum(govtexp[,i]) * 100
}

names(govtexp) <- c("Variables", "1997","1998","1999","2000", "2001", "2002", "2003", "2004", "2005", "2006", "2007", "2008", "2009", "2010", "2011", "2012", "2013", "2014", "2015","2016")

Model

Filter the those expenditure that is >8% of the total expenditure. Listed out the top 4 expenditures (Defence, Education, Home Affairs, Healthcare)

Convert the current wide data to long format to prepare for modeling

govtexp <- govtexp %>%
  filter(govtexp$`2016` > 8)

longgovtexp <- melt(govtexp,id.vars=c("Variables"))

Visualize Part 2 (Trend of Singapore government Health Expenditure over the years)

To visualize we create a line chart to see the trend of the spending of the 4 main expenditure.

 ggplot(data=longgovtexp, mapping=aes(x=variable, y=value, color=Variables)) + 
  geom_line(aes(group = Variables)) +
  theme(legend.position="none") +
  geom_dl(aes(label = Variables), method="smart.grid", cex = 0.5) +
  xlab("Year") +
  ylab("%age of Government Total Expenditure") +
  labs(title="Figure 4 : Percentage of Singapore Government Expenditure", 
       subtitle="Year 1997 to 2016") 

Interpretation of the Results - Part 2

Figure 4 : The top 4 expenditure for the Singapore government from 1997 to 2016 are Defence, Education, Home Affairs and Health.

The top 2 expenditures, education and healthcare have over the years diverged to around 25%. In 1997, Home Affairs and Health were around 6%, however after 2008, expenditure in healthcare saw a gradual rise to 10% by 2010 and a rapid increase to 16% by 2015.

This shows the reprioritization of the Singapore government’s efforts in social spending, specifically for healthcare, and less on education, domestic and international security.

Import - Part 3 : Trend for healthcare expenditure in Singapore household over the years

In Hh_TotExp.csv dataset : We will be looking at expenditure for the individual household level over the years. The Singapore Department of Statistics (DOS)undertakes the Household Expenditure Survey (HES) once in 5 years to collect detailed information from resident households in Singapore. The data spans across the 3 years of collection, Yr2002, Yr2007, Yr2012. Source is from https://www.tablebuilder.singstat.gov.sg

We will compare Average household expenditure by type of good and services and find the trend of percentage for healthcare expenditure growth over the years compare to other expenditures.

In Hh_PerExp.csv dataset : We will be looking at percentage spent on healthcare over the 5 different income quintile over 3 years of survey data (2002,2007,2012). Knowing the upward trend of healthcare expenditure as a whole, we want to find the impact over the different income groups. Source is from https://data.gov.sg/dataset/household-healthcare-expenditure

Then, we set the environment and imported our dataset.

householdExp <- read.csv("Hh_TotExp.csv")
householddata <- read.csv("Hh_PerExp.csv")
incomeavona <- read.csv("Hh_PerExp.csv")

Dataset 1 : The data imported includes Average Monthly Household Expenditure by types of Goods and Services ($).

The types include FOOD AND NON-ALCOHOLIC BEVERAGES ALCOHOLIC BEVERAGES AND TOBACCO CLOTHING AND FOOTWEAR HOUSING AND UTILITIES FURNISHINGS, HOUSEHOLD EQUIPMENT AND ROUTINE HOUSEHOLD MAINTENANCE HEALTH TRANSPORT COMMUNICATION RECREATION AND CULTURE EDUCATIONAL SERVICES FOOD SERVING SERVICES ACCOMMODATION SERVICES MISCELLANEOUS GOODS AND SERVICES

DataSet 2 : The data shows the Percentage expenditure on healthcare expressed as a total value of 100% (all expenditure) Income Quintile is Based on ranking of all resident households by their monthly household income from all sources (including employer CPF contributions) per household member.

glimpse(householdExp)
## Observations: 42
## Variables: 3
## $ Year  <int> 2002, 2002, 2002, 2002, 2002, 2002, 2002, 2002, 2002, 20...
## $ Item  <fct>  FOOD AND NON-ALCOHOLIC BEVERAGES ,  ALCOHOLIC BEVERAGES...
## $ Value <dbl> 332.2, 52.2, 127.1, 255.1, 217.2, 171.1, 614.6, 171.6, 4...
glimpse(householddata)
## Observations: 18
## Variables: 4
## $ period_start                         <int> 2002, 2002, 2002, 2002, 2...
## $ period_end                           <int> 2003, 2003, 2003, 2003, 2...
## $ income_quintile                      <fct> All Households, 1st quint...
## $ percentage_expenditure_on_healthcare <dbl> 4.7, 5.7, 4.8, 4.8, 4.8, ...
glimpse(incomeavona)
## Observations: 18
## Variables: 4
## $ period_start                         <int> 2002, 2002, 2002, 2002, 2...
## $ period_end                           <int> 2003, 2003, 2003, 2003, 2...
## $ income_quintile                      <fct> All Households, 1st quint...
## $ percentage_expenditure_on_healthcare <dbl> 4.7, 5.7, 4.8, 4.8, 4.8, ...

Tidy & Transform

  1. Change field name to consolidated group names
householdExp$Value <- gsub(",","",householdExp$Value)

householdExp$Item <- gsub(" FOOD AND NON-ALCOHOLIC BEVERAGES ","FOOD",householdExp$Item)
householdExp$Item <- gsub(" ALCOHOLIC BEVERAGES AND TOBACCO ","FOOD",householdExp$Item)
householdExp$Item <- gsub(" CLOTHING AND FOOTWEAR ","CLOTHING AND FOOTWEAR",householdExp$Item)
householdExp$Item <- gsub(" HOUSING AND UTILITIES ","HOUSING RELATED",householdExp$Item)
householdExp$Item <- gsub(" FURNISHINGS, HOUSEHOLD EQUIPMENT AND ROUTINE HOUSEHOLD MAINTENANCE ","HOUSING RELATED",householdExp$Item)
householdExp$Item <- gsub(" HEALTH ","HEALTH",householdExp$Item)
householdExp$Item <- gsub(" TRANSPORT ","TRANSPORT",householdExp$Item)
householdExp$Item <- gsub(" COMMUNICATION ","COMMUNICATION",householdExp$Item)
householdExp$Item <- gsub(" RECREATION AND CULTURE ","RECREATION AND CULTURE",householdExp$Item)
householdExp$Item <- gsub(" EDUCATIONAL SERVICES ","EDUCATION",householdExp$Item)
householdExp$Item <- gsub(" FOOD SERVING SERVICES ","FOOD",householdExp$Item)
householdExp$Item <- gsub(" ACCOMMODATION SERVICES ","HOUSING RELATED",householdExp$Item)
householdExp$Item <- gsub(" MISCELLANEOUS GOODS AND SERVICES ","OTHERS",householdExp$Item)
householdExp$Item <- gsub(" NON-ASSIGNABLE EXPENDITURE ","OTHERS",householdExp$Item)
householdExp$Value <- as.numeric(householdExp$Value)
householdExp$Year <- as.factor(householdExp$Year)
  
names(householddata) <- c("PStart", "PEnd", "Quintile", "Percentage")

smallincomeavona <- incomeavona %>%
  filter(income_quintile  != "All Households") 
smallincomeavona$income_quintile <- gsub("quintile",""                                        ,smallincomeavona$income_quintile)
obj_aov_x1 <- aov(percentage_expenditure_on_healthcare
                  ~ income_quintile, data = smallincomeavona)
summary(obj_aov_x1)
##                 Df Sum Sq Mean Sq F value Pr(>F)  
## income_quintile  4  3.304  0.8260   5.057 0.0172 *
## Residuals       10  1.633  0.1633                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Model

Dataset 1 : Sum the total for the same categories using Group by Functions, so as to present the total value for that category As the observations are independent and largely normal, the data for household expenditure is used for the purpose of this one-way analysis of variance. For the post-hoc comparison, we use Tukey’s Honest Significant Difference (HSD) method.

householdExpGrp <- householdExp %>%
  group_by(Year, Item) %>%
  summarize(TotValue = sum(Value))

TukeyHSD(obj_aov_x1, which = 'income_quintile', ordered = FALSE)
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = percentage_expenditure_on_healthcare ~ income_quintile, data = smallincomeavona)
## 
## $income_quintile
##                    diff        lwr        upr     p adj
## 2nd -1st  -8.666667e-01 -1.9526684  0.2193351 0.1381551
## 3rd -1st  -7.000000e-01 -1.7860017  0.3860017 0.2829865
## 4th -1st  -8.666667e-01 -1.9526684  0.2193351 0.1381551
## 5th -1st  -1.466667e+00 -2.5526684 -0.3806649 0.0085372
## 3rd -2nd   1.666667e-01 -0.9193351  1.2526684 0.9849584
## 4th -2nd  -8.881784e-16 -1.0860017  1.0860017 1.0000000
## 5th -2nd  -6.000000e-01 -1.6860017  0.4860017 0.4148524
## 4th -3rd  -1.666667e-01 -1.2526684  0.9193351 0.9849584
## 5th -3rd  -7.666667e-01 -1.8526684  0.3193351 0.2144317
## 5th -4th  -6.000000e-01 -1.6860017  0.4860017 0.4148524

Visualize Part 3 (Trend for healthcare expenditure in Singapore household over the years)

Figure 5 : To visualize, we created a bar chart to show the 3 years growth for each category; sorted by the highest to lowest.

Figure 6 : To visualize, we created a clustered group (By Year) chart to show the percentage each income quartile spend on healthcare for each of the 3 years.

Figure 7 : To visualise, the plot shows the confidence levels within the different income quintiles to ascertain statistical significance between the different income quintiles.

ggplot(householdExpGrp, aes(y=TotValue, x=reorder(Item, TotValue), fill=Year, level=TotValue)) + 
  geom_bar(aes(group = Year), stat="identity", position=position_dodge()) +
  ggtitle("Figure 5 : Singapore Average Monthly Household Expenditure") +
  ylab("Total Value - SGD") +
  xlab("Types of Good and Services By Year") +
  coord_flip()

ggplot(householddata, aes(y=Percentage, x=Quintile, fill=Quintile)) + 
  geom_bar(stat="identity", position=position_dodge()) +
  geom_text(aes(label=Percentage), vjust=0) +
  scale_x_discrete(name ="Quintile", labels=c("1st","2nd","3rd", "4th", "5th","All")) +
  ggtitle("Figure 6 : Singapore Household Expenditure On Healthcare By Percentile") +     
  theme_hc() + 
  facet_wrap(~PStart) 

plot(TukeyHSD(obj_aov_x1, which = 'income_quintile'), cex.axis = 0.7, cex.main = 0.1) +
  title("Figure7:", line=1.5, adj=0)

## integer(0)

Interpretation of the Results - Part 3

Figure 5 : Household expenditure on Healthcare saw an increase inline with the government spending from 180 in 2002 to 250 in 2012. However, rate of increase at a slower pace compared to other expenditure (Food, Housing & Transport, Education)

Possible reason of higher rate for other services 1. Lifestyle change. Singapore have more food choices from hawker center to restuarants, spending more budget eating out & also spending more on overseas travel holidays. 2. Housing/Transport. Higher cost of home ownership and car 3. Education. Singapore spending more on tuition and upgrading of skills

Increases to monthly household expenditure in general is inevitable due to inflation but large spikes suggest that there is a shift in priorities.

Figure 6 : Across the 3 measured years, there were minimal increases or decreases in healthcare expenditure across the different income quintiles. Expectedly the 1st quintile households spend the most percentage of their income while the 5th quintile spends the least. However it is interesting to note that the for all household (despite increasing healthcare cost), the percentage spent in 2012 is less than that of both 2002 and 2007.

It is possible that

  1. the government was able to reduce the individual burden on the average Singaporean via public insurance financing scheme

  2. the average Singaporean are well educated to cover using private insurance financing scheme

To ensure that income quitile truly affects the amount of cost a person has to pay, a one-way analysis of variance was used to determine whether there are any statistically significant differences between the means of three or more independent (unrelated) groups.

Figure7: ANOVA test tells you whether you have an overall difference between your groups, but it does not tell you which specific groups differed, however, post hoc tests do. As the p-value of less than 0.05 suggest a significance but we are unable to figure out which income quintiles are satistically different from the other. After the Tukey Honestly Significant test is performed, we can see that the 1st and 5th income quartile is statistically significant from each other unlike the rest. From the plot, the differences in the mean levels of the 1st and 5th quintile also does not cross 0.

Tidy & Transform

Using the R based package, we extract and combine the daily adjusted stock prices across the different stocks.

sg <- cbind(HawPar$H02.SI.Adjusted, RafflesMed$BSL.SI.Adjusted, Healthway$`5NG.SI.Adjusted`, ThomsonMed$A50.SI.Adjusted,
            QMDental$QC7.SI.Adjusted)
mnc <-cbind(SanofiSA$SAN.Adjusted,GSK$GSK.Adjusted,Merck$MRK.Adjusted,NovartisAG$NOVN.Adjusted,Scherig$SGP.Adjusted)

Visualize Part 4 (Trend for healthcare company stocks over the years)

The daily adjusted stock prices of the individual companies are shown in Figure 8 and Figure 9 below.

plot.xts(sg, main="Figure 8: Singapore Healthcare Companies")

addLegend("topleft", lty=rep(1,5), lwd= rep(1,5),legend.names =c( "HawPar", "Raffles Medical", "Healthway", "Thomson Medical", "Q&M Dental"), col=rainbow((10)))

plot.xts(mnc, main="Figure 9: MNC Healthcare Companies in Singapore")

addLegend("topleft", lty=rep(1,5), lwd= rep(1,5),legend.names =c( "SAN", "GSK", "Merck", "NovartisAG", "Scherig"), col=rainbow((10))) 

Stock prices in Singapore are largely the same, hovering below $2 except for Haw Par.

The Multi National companies situated in Singapore but listed in other stock exchanges however, have a much higher price. Although not directly influencing the Singapore market, the global trade that Singapore is part of is also benefits Singapore, and the exports are helpful in driving the trade in Singapore but not the domestic economy.

Conclusion

Overall, the health of Singapore’s healthcare is driven less so by the private sector but more so by the public sector.

Implications

Prof. Roh’s Note: “This is where you provide the significance of the findings. Unlike the other sections, where your goal is to describe the results that you found (what the data told you). This is where you chime in and proactively discuss the meaning of the results.”

Singapore, afterall does not seem to be as efficient an economy in certain aspects.

Limitations and Future Directions

1 - Total health expenditure generally includes preventive and curative health services, family planning, nutrition activities and emergency aid.

2 - # South East Asia does not equate South Asia does not equate East Asia East Asia specifically refers to China, Japan and South Korea while the Pacific refers to Australia and New Zealand. Although not as close in proximity to Singapore and not like a city state, data for comparison to Hong Kong is unavailable. The presence of metropolitan cities like that of Beijing and Shanghai in China, Tokyo and Osaka in Japan, Sydney and Melbourne in Australia, Auckland and Wellington in New Zealand provide similar proxies, but the data for such cities are unavailable and it would be unfair to compare specific cities within countries when CHE is used as the variable to represent government healthcare expenditure.