Introduction

This report analyzes regional disparities in healthcare access and financial burden in India. We focus on:
1. Health insurance coverage across regions.
2. Hospitalization costs in public and private facilities.
3. The relationship between health insurance coverage and hospitalization costs.

Setup and Data Loading

In this section, we load the necessary libraries and the dataset.

Loading the libraries

# Install necessary packages (if not already installed)
if(!require(readxl)) install.packages("readxl")
## Loading required package: readxl
if(!require(tidyverse)) install.packages("tidyverse")
## Loading required package: tidyverse
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
if(!require(ggplot2)) install.packages("ggplot2")

# Load libraries (ggplot2 is already attached by tidyverse; the explicit call is harmless)
library(readxl)
library(tidyverse)
library(ggplot2)

Loading the dataset

# Load the Excel file
data <- read_excel("LASI_India_State_UT_Factsheet-Final Checked 23.08.22.xlsx")

Preview the first few rows

head(data)
## # A tibble: 6 × 191
##   Indicators                `Mean Household Size1` 0-14 Years of Household pop…¹
##   <chr>                                      <dbl>                         <dbl>
## 1 INDIA                                        5.7                          26.7
## 2 Andaman & Nicobar Islands                    5.3                          22.7
## 3 Andhra Pradesh                               4.6                          22  
## 4 Arunachal Pradesh                            5.4                          32  
## 5 Assam                                        5.3                          27  
## 6 Bihar                                        6.6                          35.6
## # ℹ abbreviated name: ¹​`0-14 Years of Household population (%)`
## # ℹ 188 more variables: `15-44 Years of Household population (%)` <dbl>,
## #   `45-59 Years of Household population (%)` <dbl>,
## #   `60-69 Years of Household population (%)` <dbl>,
## #   `70-79 Years of Household population (%)` <dbl>,
## #   `80+ Years of Household population (%)` <dbl>,
## #   `60-74 Years of Household population (%)` <dbl>, …

List of variables in the dataset

names(data)
##   [1] "Indicators"                                                                                                                   
##   [2] "Mean Household Size1"                                                                                                         
##   [3] "0-14 Years of Household population (%)"                                                                                       
##   [4] "15-44 Years of Household population (%)"                                                                                      
##   [5] "45-59 Years of Household population (%)"                                                                                      
##   [6] "60-69 Years of Household population (%)"                                                                                      
##   [7] "70-79 Years of Household population (%)"                                                                                      
##   [8] "80+ Years of Household population (%)"                                                                                        
##   [9] "60-74 Years of Household population (%)"                                                                                      
##  [10] "75+ Years of Household population (%)"                                                                                        
##  [11] "Sex Ratio (Females per 1000 Males ) All ages"                                                                                 
##  [12] "Sex Ration (Females per 1000 Males) 60 + population"                                                                          
##  [13] "Death Rate ( per thousand population ) All ages2"                                                                             
##  [14] "Death Rate ( per thousand population ) 60 + population2"                                                                      
##  [15] "Households with improved sanitation (%) 4"                                                                                    
##  [16] "Households practicing open defecation (%)5"                                                                                   
##  [17] "Households with water facility inside dwelling/own yard (%)"                                                                  
##  [18] "Households with improved drinking water source (%)6"                                                                          
##  [19] "Households with electricity (%)"                                                                                              
##  [20] "Households using clean cooking fuel (%)7"                                                                                     
##  [21] "Households exposed to indoor pollution (%)8"                                                                                  
##  [22] "Households with pucca house (%)9"                                                                                             
##  [23] "Household Monthly Per Capita Consumption Expenditure (MPCE) in INR 10"                                                        
##  [24] "Household Per Capita Food Expenditure as a share of MPCE (%)"                                                                 
##  [25] "Household Per Capita Health Expenditure as a share of MPCE (%)"                                                               
##  [26] "Households owning current residence (%)"                                                                                      
##  [27] "Households owning television, refrigerator, mobile phone and any motorized vehicle (%) 11"                                    
##  [28] "Households who had taken any loan (%)"                                                                                        
##  [29] "Per Capita Annual Household Income (in INR) 12"                                                                               
##  [30] "Mean income from agricultural and allied activities by Source (in INR) 13"                                                    
##  [31] "Mean income from non-agricultural business or self-employed activities by Source (in INR)"                                    
##  [32] "Mean income from wages/salaries by Source (in INR) 14"                                                                        
##  [33] "Mean income from pension by Source (in INR)  15"                                                                              
##  [34] "Mean income from government/public transfers by Source (in INR) 16"                                                           
##  [35] "Households covered by any health insurance (%) 17"                                                                            
##  [36] "Households covered by Rashtriya Swasthya Bima Yojana (RSBY) & allied schemes (%) 18"                                          
##  [37] "Households covered by Central Government Health Scheme (CGHS)/Employee State Insurance Scheme (ESIS) (%)"                     
##  [38] "Households covered by medical reimbursement/health insurance from an employer (%)"                                            
##  [39] "Households covered by privately purchased commercial health insurance (%)"                                                    
##  [40] "Literate (%)"                                                                                                                 
##  [41] "No schooling (%)"                                                                                                             
##  [42] "Less than 5 years school complete (%)"                                                                                        
##  [43] "5-9 years school complete (%)"                                                                                                
##  [44] "10 or more years school complete (%)"                                                                                         
##  [45] "Currently married (%)"                                                                                                        
##  [46] "Widowed (%)"                                                                                                                  
##  [47] "Ever worked (%)19"                                                                                                            
##  [48] "Currently working (%)20"                                                                                                      
##  [49] "Agricultural and allied activities (%)21"                                                                                     
##  [50] "Non-agricultural business activities (%)22"                                                                                   
##  [51] "Wage and salary workers (%)23"                                                                                                
##  [52] "Agricultural and allied activities21"                                                                                         
##  [53] "Non-agricultural business activities22"                                                                                       
##  [54] "Wage and salary workers23"                                                                                                    
##  [55] "Mean income from all sources 25"                                                                                              
##  [56] "Persons seeking job (%)26"                                                                                                    
##  [57] "Covered under work related pension scheme (%)27"                                                                              
##  [58] "Covered under Provident Fund (%)27"                                                                                           
##  [59] "Officially retired from organized sector of employment (%)28"                                                                 
##  [60] "Currently receiving retirement pension (%)28"                                                                                 
##  [61] "Living alone (%)"                                                                                                             
##  [62] "Living with spouse and/or others (%)"                                                                                         
##  [63] "Living with spouse and children (%)"                                                                                          
##  [64] "Living with children and others (%)"                                                                                          
##  [65] "Living with others only (%)"                                                                                                  
##  [66] "Satisfied with current living arrangement (%)29"                                                                              
##  [67] "Shares most of Personal Matters with Spouse/Partner (for age 60 and above having spouse) (%)"                                 
##  [68] "Children/Grand children (%)"                                                                                                  
##  [69] "Received financial support from Family/Friends during Past 12 months (%)"                                                     
##  [70] "Provided financial support from Family/Friends during past 12 months (%)"                                                     
##  [71] "Having family members who are unable to carry out basic daily activities (%)31"                                               
##  [72] "Role in Decision Making in Marriage of son or daughter (%)"                                                                   
##  [73] "Role in Decision Making in Buying and selling of property (%)"                                                                
##  [74] "Role in Decision Making in Education of family member/s (%)"                                                                  
##  [75] "Experienced any ill-treatment during the Last One Year (for age 60 and above only)  (%)"                                      
##  [76] "Persons reporting satisfied with their own life (%)34"                                                                        
##  [77] "Awareness of Indira Gandhi National Old Age Pension Scheme (%)"                                                               
##  [78] "Awareness of Indira Gandhi Widow Pension Scheme (%)"                                                                          
##  [79] "Receiving Benefits from Indira Gandhi National Old Age Pension Scheme (%)"                                                    
##  [80] "Receiving Benefits from Indira Gandhi Widow Pension Scheme (%)"                                                               
##  [81] "Aware of any concession given by government to elderly (%)36"                                                                 
##  [82] "Received any concession or benefit (%)37"                                                                                     
##  [83] "Aware of “Maintenance and Welfare of Parents and Senior Citizens Act” (%)38"                                                  
##  [84] "Currently smoking (%)39"                                                                                                      
##  [85] "Currently consuming tobacco (%)40"                                                                                            
##  [86] "Prevalence of heavy episodic drinking (%)41"                                                                                  
##  [87] "Physically active (%)42"                                                                                                      
##  [88] "Yoga practice, meditation, asana and pranayama (%)43"                                                                         
##  [89] "Poor Self Rated Health (SRH) (%)45"                                                                                           
##  [90] "Cardiovascular diseases (CVDs) (%)46"                                                                                         
##  [91] "Hypertension or high blood pressure (%)"                                                                                      
##  [92] "Chronic heart diseases (%)"                                                                                                   
##  [93] "Stroke (%)"                                                                                                                   
##  [94] "Diabetes or high blood sugar (%)"                                                                                             
##  [95] "High Cholesterol (%)"                                                                                                         
##  [96] "Anaemia (%)"                                                                                                                  
##  [97] "Chronic lung diseases (%)47"                                                                                                  
##  [98] "Chronic Obstructive Pulmonary Disease (COPD) (%)"                                                                             
##  [99] "Asthma (%)"                                                                                                                   
## [100] "Bone/Joint diseases (%)48"                                                                                                    
## [101] "Arthritis (%)"                                                                                                                
## [102] "Osteoporosis (%)"                                                                                                             
## [103] "Neurological or psychiatric problems (%)49"                                                                                   
## [104] "Depression (%)"                                                                                                               
## [105] "Alzheimer’s disease and dementia (%)"                                                                                         
## [106] "Psychiatric problems (%)50"                                                                                                   
## [107] "Neurological problems (%)51"                                                                                                  
## [108] "Prevalence of diagnosed Cancer or Malignant Tumor (%) 52"                                                                     
## [109] "Prevalence of diseases or conditions related to urogenital systems (%) 53"                                                    
## [110] "Eye or vision related conditions or problems (%) 54"                                                                          
## [111] "Cataract (%)"                                                                                                                 
## [112] "Glaucoma (%)"                                                                                                                 
## [113] "Refractive error (%)55"                                                                                                       
## [114] "Hearing or ear-related problems (%)"                                                                                          
## [115] "Common oral health problems (%)56"                                                                                            
## [116] "Dental caries (%)"                                                                                                            
## [117] "Periodontal disease (%)57"                                                                                                    
## [118] "Partial edentulism (%)"                                                                                                       
## [119] "Complete edentulism (%)"                                                                                                      
## [120] "Injuries (%)58"                                                                                                               
## [121] "Fall (%)"                                                                                                                     
## [122] "Health problems due to natural and man-made disaster (%)60"                                                                   
## [123] "Permanent physical disability (%)"                                                                                            
## [124] "Psychological and mental health problems (%)"                                                                                 
## [125] "Chronic illness (%)"                                                                                                          
## [126] "Any endemic disease (%)61"                                                                                                    
## [127] "Any vector-borne disease (%)62"                                                                                               
## [128] "Malaria (%)"                                                                                                                  
## [129] "Dengue (%)"                                                                                                                   
## [130] "Chikungunya (%)"                                                                                                              
## [131] "Any water-borne disease (%)63"                                                                                                
## [132] "Diarrhoea/Gastroenteritis (%)"                                                                                                
## [133] "Typhoid (%)"                                                                                                                  
## [134] "Jaundice/Hepatitis (%)"                                                                                                       
## [135] "Tuberculosis (%)"                                                                                                             
## [136] "Urinary Tract Infection (%)"                                                                                                  
## [137] "Angina (symptom based) (%) 65"                                                                                                
## [138] "Sleep problems (%) 66"                                                                                                        
## [139] "Any reproductive health problem (%) 67"                                                                                       
## [140] "Undergone Hysterectomy (%)"                                                                                                   
## [141] "Undergone Pap Smear Test (%)68"                                                                                               
## [142] "Undergone Mammography (%) 68"                                                                                                 
## [143] "Prevalence of Chronic Conditions Hypertension among Family Members (%)"                                                       
## [144] "Prevalence of Chronic Conditions Diabetes among Family Members (%)"                                                           
## [145] "Prevalence of Chronic Conditions Heart disease among Family Members (%)"                                                      
## [146] "Prevalence of Chronic Conditions Stroke among Family Members (%)"                                                             
## [147] "Prevalence of Chronic Conditions Cancer among Family Members (%)"                                                             
## [148] "Any Activities of Daily Living (ADL) Limitations  (%)71"                                                                      
## [149] "Any Instrumental Activities of Daily Living (IADL) Limitations (%) 72"                                                        
## [150] "Persons who need helpers for ADL and IADL limitations (%)73"                                                                  
## [151] "Any aid or supportive device using (%)74"                                                                                     
## [152] "Hearing aid using (%)75"                                                                                                      
## [153] "Spectacles/Contact lenses using (%)75"                                                                                        
## [154] "Denture using  (%)75"                                                                                                         
## [155] "Walker/Walking stick using (%)75"                                                                                             
## [156] "Mean score for immediate word recall 76"                                                                                      
## [157] "Mean score of delayed word recall 77"                                                                                         
## [158] "Prevalence of depression based on CIDI-SF (%)78"                                                                              
## [159] "Hypertension (%)79"                                                                                                           
## [160] "Undiagnosed hypertension (%)80"                                                                                               
## [161] "Untreated hypertension (%)81"                                                                                                 
## [162] "Undertreated hypertension (%)82"                                                                                              
## [163] "Adequately treated hypertension (%)83"                                                                                        
## [164] "Low vision Measured Prevalence of Vision Test (%)84"                                                                          
## [165] "Low near vision Measured Prevalence of Vision Test (%)85"                                                                     
## [166] "Low distance vision Measured Prevalence of Vision Test (%)86"                                                                 
## [167] "Blindness Measured Prevalence of Vision Test (%) 87"                                                                          
## [168] "Underweight by Anthropometric Indicators   (%)88"                                                                             
## [169] "Overweight by Anthropometric Indicators  (%)88"                                                                               
## [170] "Obesity by Anthropometric Indicators   (%)88"                                                                                 
## [171] "High-risk waist circumference (%)89"                                                                                          
## [172] "Metabolic risk: Prevalence of high-risk waist-hip ratio (%)90"                                                                
## [173] "Mean grip strength in dominant hand (kg)91"                                                                                   
## [174] "Hospitalization in past 12 months (%)"                                                                                        
## [175] "Type of Facility Visited during the Last Hospitalization in the Past 12 Months by Public facility (%)92"                      
## [176] "Type of Facility Visited during the Last Hospitalization in the Past 12 Months by Private facility (%)93"                     
## [177] "Sought out-patient care in the past 12 months (%)"                                                                            
## [178] "Persons who consumed any medicine without consulting healthcare provider (%)94"                                               
## [179] "Type of Facility Visited for the Last Out-Patient Visit in the Past 12 Months by Public facility (%)"                         
## [180] "Type of Facility Visited for the Last Out-Patient Visit in the Past 12 Months by Private facility (%)"                        
## [181] "Mean expenditure on last hospitalization in the Past 12 Months by Type of Facilities Visited (in INR) 95"                     
## [182] "Mean expenditure on last hospitalization (Public Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95"   
## [183] "Mean expenditure on last hospitalization (Private Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95"  
## [184] "Sources of Finance for Health Care Services during the Last Hospitalization by Income (%) 96"                                 
## [185] "Sources of Finance for Health Care Services during the Last Hospitalization by Savings (%)"                                   
## [186] "Loans (banks/friends/relatives) /selling assets and properties (%)"                                                           
## [187] "Insurance coverage/reimbursement from employer (%)"                                                                           
## [188] "Mean expenditure on last out-patient visitin the Past 12 Months by Type of Facilities Visited (in INR) 97"                    
## [189] "Mean expenditure on last out-patient visit (Public Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 97" 
## [190] "Mean expenditure on last out-patient visit (Private Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 97"
## [191] "Health insurance coverage (%)"
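
With 191 columns, picking out the relevant ones by eye is error-prone. A minimal sketch of a keyword search over the column names with base R `grep()` follows. The `col_names` vector is a hand-copied subset of the names printed above and stands in for `names(data)`, so the example runs without the Excel file.

```r
# Hand-copied subset of the column names listed above (stand-in for names(data))
col_names <- c(
  "Indicators",
  "Households covered by any health insurance (%) 17",
  "Hospitalization in past 12 months (%)",
  "Mean expenditure on last hospitalization (Public Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95",
  "Mean expenditure on last hospitalization (Private Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95",
  "Health insurance coverage (%)"
)

# Case-insensitive search for names that mention insurance
insurance_cols <- grep("insurance", col_names, ignore.case = TRUE, value = TRUE)

# fixed = TRUE treats the pattern as a literal string rather than a regular expression
hosp_cost_cols <- grep("Mean expenditure on last hospitalization", col_names,
                       fixed = TRUE, value = TRUE)

print(insurance_cols)
print(hosp_cost_cols)
```

On the real data, replacing `col_names` with `names(data)` should return every matching column, including the insurance sub-categories in columns 35 to 39.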

Chosen variables of interest:

  1. Health insurance coverage (%).
  2. Mean expenditure on last hospitalization (Private Facility) in the Past 12 Months by Type of Facilities Visited (in INR).
  3. Mean expenditure on last hospitalization (Public Facility) in the Past 12 Months by Type of Facilities Visited (in INR).
  4. Indicators (the region name: a state or union territory, plus the national INDIA row).

Create a new dataset with only the selected variables

data_2 <- data %>%
  select(Indicators, `Health insurance coverage (%)`,
         `Mean expenditure on last hospitalization (Private Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95`,
         `Mean expenditure on last hospitalization (Public Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95`)

Preview the first few rows

head(data_2)
## # A tibble: 6 × 4
##   Indicators                Health insurance coverage (…¹ Mean expenditure on …²
##   <chr>                     <chr>                                          <dbl>
## 1 INDIA                     20.7                                           52022
## 2 Andaman & Nicobar Islands [0.1]                                         127099
## 3 Andhra Pradesh            37.799999999999997                             34054
## 4 Arunachal Pradesh         6.5                                            60415
## 5 Assam                     52.7                                           37131
## 6 Bihar                     1.2                                            32608
## # ℹ abbreviated names: ¹​`Health insurance coverage (%)`,
## #   ²​`Mean expenditure on last hospitalization (Private Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95`
## # ℹ 1 more variable:
## #   `Mean expenditure on last hospitalization (Public Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95` <dbl>
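
The back-quoted column names are long enough that the preview has to abbreviate them. One option, sketched here, is to rename while selecting, since `dplyr::select()` accepts `new_name = old_name` pairs. The short names (`region`, `insurance_pct`, `private_cost_inr`, `public_cost_inr`) are hypothetical labels chosen for illustration and are not used elsewhere in this report. The toy tibble holds three rows copied from the outputs shown in this report.

```r
library(dplyr)

# Toy stand-in for `data`: three regional rows copied from this report's printed output
toy <- tibble(
  Indicators = c("Andaman & Nicobar Islands", "Andhra Pradesh", "Arunachal Pradesh"),
  `Health insurance coverage (%)` = c("[0.1]", "37.799999999999997", "6.5"),
  `Mean expenditure on last hospitalization (Private Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95` = c(127099, 34054, 60415),
  `Mean expenditure on last hospitalization (Public Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95` = c(2105, 3914, 17131)
)

# select() renames and subsets in one step: new_name = `old name`
# (the short names are hypothetical, picked only for readability)
toy_short <- toy %>%
  select(
    region = Indicators,
    insurance_pct = `Health insurance coverage (%)`,
    private_cost_inr = `Mean expenditure on last hospitalization (Private Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95`,
    public_cost_inr = `Mean expenditure on last hospitalization (Public Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95`
  )

print(names(toy_short))
```

Short names keep later `filter()`, `mutate()`, and `ggplot()` calls readable, at the cost of having to restore the full labels in plot titles and table captions.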

Exclude the ‘INDIA’ row from the Indicators (region) variable

Note: The “INDIA” row holds national averages, not a region. It is excluded from the regional analyses so that the national figure is not counted as an extra region, which would distort regional summaries. The national data is used separately for comparison purposes.

# Exclude the 'INDIA' row from regional analysis
regional_data_2 <- data_2 %>% filter(Indicators != "INDIA")
head(regional_data_2)
## # A tibble: 6 × 4
##   Indicators                Health insurance coverage (…¹ Mean expenditure on …²
##   <chr>                     <chr>                                          <dbl>
## 1 Andaman & Nicobar Islands [0.1]                                         127099
## 2 Andhra Pradesh            37.799999999999997                             34054
## 3 Arunachal Pradesh         6.5                                            60415
## 4 Assam                     52.7                                           37131
## 5 Bihar                     1.2                                            32608
## 6 Chandigarh                13.1                                           13448
## # ℹ abbreviated names: ¹​`Health insurance coverage (%)`,
## #   ²​`Mean expenditure on last hospitalization (Private Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95`
## # ℹ 1 more variable:
## #   `Mean expenditure on last hospitalization (Public Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95` <dbl>
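
The note above says the national figures are kept for comparison. A minimal sketch of one way to do that is to store the “INDIA” row in its own object and subtract it from each region. The toy tibble uses private-facility costs copied from the previews above. The column name `private_cost_inr` and the object names `national`, `regional`, and `comparison` are hypothetical.

```r
library(dplyr)

# Toy data: private-facility hospitalization cost (INR), values copied from the previews above
toy <- tibble(
  Indicators = c("INDIA", "Andaman & Nicobar Islands", "Andhra Pradesh", "Arunachal Pradesh"),
  private_cost_inr = c(52022, 127099, 34054, 60415)  # hypothetical short column name
)

# Keep the national row on its own, and the regions without it
national <- toy %>% filter(Indicators == "INDIA")
regional <- toy %>% filter(Indicators != "INDIA")

# Difference between each region and the national mean
comparison <- regional %>%
  mutate(diff_from_national = private_cost_inr - national$private_cost_inr)

print(comparison)
```

A positive `diff_from_national` marks a region whose mean private-facility cost exceeds the national mean.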

Structure of the new dataset

str(regional_data_2)
## tibble [36 × 4] (S3: tbl_df/tbl/data.frame)
##  $ Indicators                                                                                                                 : chr [1:36] "Andaman & Nicobar Islands" "Andhra Pradesh" "Arunachal Pradesh" "Assam" ...
##  $ Health insurance coverage (%)                                                                                              : chr [1:36] "[0.1]" "37.799999999999997" "6.5" "52.7" ...
##  $ Mean expenditure on last hospitalization (Private Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95: num [1:36] 127099 34054 60415 37131 32608 ...
##  $ Mean expenditure on last hospitalization (Public Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95 : num [1:36] 2105 3914 17131 8606 7618 ...

Statistical summary of the new dataset

summary(regional_data_2)
##   Indicators        Health insurance coverage (%)
##  Length:36          Length:36                    
##  Class :character   Class :character             
##  Mode  :character   Mode  :character             
##                                                  
##                                                  
##                                                  
##  Mean expenditure on last hospitalization (Private Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95
##  Min.   : 13448                                                                                                             
##  1st Qu.: 26020                                                                                                             
##  Median : 34655                                                                                                             
##  Mean   : 43444                                                                                                             
##  3rd Qu.: 48675                                                                                                             
##  Max.   :127099                                                                                                             
##  Mean expenditure on last hospitalization (Public Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95
##  Min.   :  360                                                                                                             
##  1st Qu.: 5043                                                                                                             
##  Median :11119                                                                                                             
##  Mean   :13282                                                                                                             
##  3rd Qu.:15724                                                                                                             
##  Max.   :69347

Data Cleaning and Preparation

We clean the dataset by converting the health insurance coverage column, which was read in as text, to numeric.

# Convert 'Health insurance coverage (%)' to numeric and clean up any brackets
regional_data_2 <- regional_data_2 %>%
  mutate(`Health insurance coverage (%)` = as.numeric(gsub("[\\[\\]]", "", `Health insurance coverage (%)`)))
## Warning: There was 1 warning in `mutate()`.
## ℹ In argument: `Health insurance coverage (%) = as.numeric(gsub("[\\[\\]]", "",
##   `Health insurance coverage (%)`))`.
## Caused by warning:
## ! NAs introduced by coercion
# Preview summary statistics
summary(regional_data_2)
##   Indicators        Health insurance coverage (%)
##  Length:36          Min.   : 0.60                
##  Class :character   1st Qu.: 6.50                
##  Mode  :character   Median :20.10                
##                     Mean   :23.73                
##                     3rd Qu.:39.80                
##                     Max.   :65.10                
##                     NA's   :1                    
##  Mean expenditure on last hospitalization (Private Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95
##  Min.   : 13448                                                                                                             
##  1st Qu.: 26020                                                                                                             
##  Median : 34655                                                                                                             
##  Mean   : 43444                                                                                                             
##  3rd Qu.: 48675                                                                                                             
##  Max.   :127099                                                                                                             
##                                                                                                                             
##  Mean expenditure on last hospitalization (Public Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95
##  Min.   :  360                                                                                                             
##  1st Qu.: 5043                                                                                                             
##  Median :11119                                                                                                             
##  Mean   :13282                                                                                                             
##  3rd Qu.:15724                                                                                                             
##  Max.   :69347                                                                                                             
## 
view(regional_data_2)

Check the number of missing values in each column

# Check the number of missing values in each column
colSums(is.na(regional_data_2))
##                                                                                                                  Indicators 
##                                                                                                                           0 
##                                                                                               Health insurance coverage (%) 
##                                                                                                                           1 
## Mean expenditure on last hospitalization (Private Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95 
##                                                                                                                           0 
##  Mean expenditure on last hospitalization (Public Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95 
##                                                                                                                           0
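The single missing value appears to come from the bracketed entry "[0.1]" (Andaman & Nicobar Islands) shown in the earlier preview. Under R's default regex engine, a backslash inside a bracket expression is treated literally, so the pattern "[\\[\\]]" likely leaves the brackets in place and as.numeric() then returns NA. A minimal sketch of an alternative pattern follows, assuming the bracketed value should be kept as a number; whether bracketed factsheet values are reliable enough to keep is a separate judgment.

```r
# Sample values copied from the preview of the coverage column
raw <- c("[0.1]", "37.799999999999997", "6.5")

# Alternation matches a literal "[" or "]" under the default regex engine
cleaned <- gsub("\\[|\\]", "", raw)

# Conversion now succeeds for every element
coverage <- as.numeric(cleaned)
print(coverage)
```

Passing perl = TRUE to gsub() with the original pattern is another option, since PCRE honours the escaped brackets inside a character class.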

Remove Rows with Missing Values

Only one value is missing, and only in the health insurance coverage column, so we drop that row rather than impute it.

# Remove rows where a specific column has missing values
regional_data_2 <- regional_data_2 %>%
  filter(!is.na(`Health insurance coverage (%)`))

# Remove all rows with NA in any column
#data_2 <- na.omit(data_2)
view(regional_data_2)

Health Insurance Coverage Analysis

Health insurance coverage varies widely across regions, from 0.6% to 65.1% in the cleaned data (median 20.1%). Below, we identify regions with the highest and lowest coverage and visualize the distribution.

Health insurance coverage summary

insurance_summary <- regional_data_2 %>%
  arrange(desc(`Health insurance coverage (%)`))
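The sorted table is created but not printed. One way to list the highest- and lowest-coverage regions is with slice_max() and slice_min() from dplyr. The sketch below uses a made-up table (hypothetical region names and values) rather than regional_data_2:

```r
library(dplyr)

# Toy stand-in for the cleaned regional table; values are illustrative only
toy_coverage <- tibble(
  Indicators = c("Region A", "Region B", "Region C", "Region D"),
  `Health insurance coverage (%)` = c(5.2, 61.0, 23.4, 0.9)
)

# Two regions with the highest coverage, highest first
top_2 <- toy_coverage %>%
  slice_max(`Health insurance coverage (%)`, n = 2)

# Two regions with the lowest coverage, lowest first
bottom_2 <- toy_coverage %>%
  slice_min(`Health insurance coverage (%)`, n = 2)

print(top_2)
print(bottom_2)
```

With the real data, replacing toy_coverage with regional_data_2 and raising n would give, for example, the five best- and worst-covered regions.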

Visualize health insurance coverage

ggplot(regional_data_2, aes(x = reorder(Indicators, `Health insurance coverage (%)`), y = `Health insurance coverage (%)`)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  coord_flip() +
  labs(title = "Health Insurance Coverage by Region", x = "Region", y = "Coverage (%)")

Hospitalization Costs Analysis

Hospitalization costs differ between public and private facilities. Here, we analyze the cost distribution and compare public vs. private facility expenditures.

Summary statistics for public vs. private facilities

# Display title and summary for Public Facility expenditure
cat("Summary of Mean Expenditure on Last Hospitalization (Public Facility):\n\n")
## Summary of Mean Expenditure on Last Hospitalization (Public Facility):
summary(regional_data_2$`Mean expenditure on last hospitalization (Public Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95`)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     360    5856   12180   13601   16193   69347
# Add spacing
cat("\n\n") # Adds a few blank lines for spacing
# Display title and summary for Private Facility expenditure
cat("Summary of Mean Expenditure on Last Hospitalization (Private Facility):\n\n")
## Summary of Mean Expenditure on Last Hospitalization (Private Facility):
summary(regional_data_2$`Mean expenditure on last hospitalization (Private Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95`)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   13448   25801   34201   41054   45340  125825

Boxplots of hospitalization costs for public and private facilities

ggplot(regional_data_2, aes(x = "Public Facility", y = `Mean expenditure on last hospitalization (Public Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95`)) +
  geom_boxplot(fill = "lightgreen") +
  labs(title = "Hospitalization Costs in Public Facilities", x = "", y = "Expenditure (INR)")

ggplot(regional_data_2, aes(x = "Private Facility", y = `Mean expenditure on last hospitalization (Private Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95`)) +
  geom_boxplot(fill = "lightcoral") +
  labs(title = "Hospitalization Costs in Private Facilities", x = "", y = "Expenditure (INR)")

Side-by-side boxplots comparing public vs. private facilities

# Combine data into a new data frame for plotting

data_boxplot <- regional_data_2 %>%
  select(`Mean expenditure on last hospitalization (Public Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95`,
         `Mean expenditure on last hospitalization (Private Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95`) %>%
  pivot_longer(cols = everything(), 
               names_to = "Facility Type", 
               values_to = "Expenditure") %>%
  mutate(`Facility Type` = gsub("Mean expenditure on last hospitalization \\((.*?)\\) .*", "\\1", `Facility Type`))

# Plot boxplots side-by-side
ggplot(data_boxplot, aes(x = `Facility Type`, y = Expenditure, fill = `Facility Type`)) +
  geom_boxplot() +
  labs(title = "Comparison of Hospitalization Costs by Facility Type", x = "Facility Type", y = "Expenditure (INR)") +
  scale_fill_manual(values = c("Public Facility" = "lightgreen", "Private Facility" = "lightcoral")) +
  theme_minimal()
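The long-format table used for the boxplots also lends itself to a numeric summary by facility type. A minimal sketch, assuming a toy table with short hypothetical column names (`public`, `private`) and made-up values rather than the LASI columns:

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the two expenditure columns; values are illustrative only
toy_costs <- tibble(
  Indicators = c("Region A", "Region B", "Region C"),
  public     = c(4000, 12000, 20000),
  private    = c(20000, 30000, 70000)
)

# Reshape to one row per region and facility type
toy_long <- toy_costs %>%
  pivot_longer(cols = c(public, private),
               names_to = "facility_type",
               values_to = "expenditure")

# Median and interquartile range per facility type
cost_summary <- toy_long %>%
  group_by(facility_type) %>%
  summarise(median_cost = median(expenditure),
            iqr_cost    = IQR(expenditure),
            .groups     = "drop")

print(cost_summary)
```

The median and IQR are the quantities a boxplot draws, so this table gives the numbers behind the figure.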

Correlation Analysis

We now investigate whether higher health insurance coverage is associated with lower hospitalization costs. Correlation analysis measures the strength and direction of the linear relationship between the two, although it cannot by itself show that coverage causes lower costs.

# Correlation between insurance and hospitalization costs
correlation_public <- cor(regional_data_2$`Health insurance coverage (%)`,
                          regional_data_2$`Mean expenditure on last hospitalization (Public Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95`, 
                          use = "complete.obs")

correlation_private <- cor(regional_data_2$`Health insurance coverage (%)`,
                           regional_data_2$`Mean expenditure on last hospitalization (Private Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95`, 
                           use = "complete.obs")

cat("Correlation between insurance coverage and public facility costs: ", correlation_public, "\n")
## Correlation between insurance coverage and public facility costs:  -0.2574771
cat("Correlation between insurance coverage and private facility costs: ", correlation_private, "\n")
## Correlation between insurance coverage and private facility costs:  -0.08249782
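With only a few dozen regions, a coefficient close to zero may not be distinguishable from no association at all. The base R function cor.test() reports a p-value and a confidence interval alongside the estimate. The sketch below uses made-up vectors (hypothetical values), not the LASI columns:

```r
# Toy data; values are illustrative only
coverage <- c(2, 10, 18, 25, 33, 41, 50, 62)
cost     <- c(15000, 9000, 14000, 8000, 12000, 7000, 11000, 6000)

# Pearson correlation with a significance test
result <- cor.test(coverage, cost, method = "pearson")

print(result$estimate)  # correlation coefficient
print(result$p.value)   # p-value for the null hypothesis of zero correlation
print(result$conf.int)  # 95% confidence interval for the coefficient
```

A confidence interval that spans zero would indicate that the sign of the relationship is itself uncertain at this sample size.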

Scatter Plots for Correlation

Scatter plot for public facilities

ggplot(regional_data_2, aes(x = `Health insurance coverage (%)`,
                 y = `Mean expenditure on last hospitalization (Public Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95`)) +
  geom_point(color = "blue") +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  labs(title = "Correlation Between Insurance Coverage and Public Hospitalization Costs",
       x = "Health Insurance Coverage (%)", y = "Public Facility Cost (INR)")
## `geom_smooth()` using formula = 'y ~ x'

Scatter plot for private facilities

ggplot(regional_data_2, aes(x = `Health insurance coverage (%)`,
                 y = `Mean expenditure on last hospitalization (Private Facility) in the Past 12 Months by Type of Facilities Visited (in INR) 95`)) +
  geom_point(color = "purple") +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  labs(title = "Correlation Between Insurance Coverage and Private Hospitalization Costs",
       x = "Health Insurance Coverage (%)", y = "Private Facility Cost (INR)")
## `geom_smooth()` using formula = 'y ~ x'

Conclusion and Recommendations

The analysis highlights significant disparities in healthcare access and financial burden. Below are the key insights and actionable recommendations:

Key Insights

  • Health Insurance Coverage varies widely across states, with some regions showing alarmingly low coverage.
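  • Hospitalization Costs in private facilities are roughly three times those in public facilities on average (mean INR 41,054 vs. INR 13,601 across regions), placing a heavy financial burden on individuals.
  • Correlation: Insurance coverage shows a weak negative correlation with public facility costs (r ≈ -0.26) and almost none with private facility costs (r ≈ -0.08). This is consistent with partial financial protection in public facilities but does not by itself establish a causal effect.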

Recommendations

  1. Expand Insurance Coverage:
    • Target low-coverage regions with awareness campaigns and policy incentives.
  2. Strengthen Public Healthcare:
    • Invest in public hospitals to improve service quality and affordability.
  3. Regulate Private Healthcare Costs:
    • Introduce pricing regulations to reduce financial burdens from private healthcare.
  4. Promote Preventive Care:
    • Develop programs to prevent diseases, reducing hospitalization needs.

Future Directions

Further analysis can explore:

  • The impact of insurance on health outcomes.
  • Socioeconomic factors influencing healthcare access.
