ASA Analysis - Cakra Analitika

Data Import

The dataset used in this study captures a 24-year period of socio-economic and environmental development across the ten ASEAN countries, covering the years 2000 to 2023. In total, the dataset consists of 240 observations (10 countries × 24 years), where each entry represents a specific country in a particular year. Through this structure, the data not only reflects temporal changes within each nation but also highlights cross-country differences that characterize the ASEAN region as a whole.

Several key variables are included to provide a comprehensive overview of the determinants of population well-being. Life expectancy, measured in years, serves as the main indicator of public health outcomes and is used as the dependent variable in this study. To explain variations in life expectancy, four main explanatory factors are considered. GDP per capita represents the economic capacity of each country and acts as an indicator of prosperity. Health expenditure, expressed as a percentage of GDP, reflects the level of investment made by a country toward improving healthcare services and accessibility. Meanwhile, CO₂ damage, expressed as a percentage of gross national income, serves as an indicator of environmental stress and sustainability challenges. Lastly, urban population, measured as the percentage of people living in urban areas, captures the demographic and structural aspects of development that may influence living conditions and access to health facilities.

Together, these variables form a balanced panel dataset that allows for both cross-sectional and longitudinal analysis. The dataset thus provides a rich foundation to examine how economic growth, environmental degradation, public health investment, and urbanization collectively shape life expectancy patterns across the ASEAN region over time.
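A quick way to confirm that a country-year data frame really is a balanced panel is to count the rows per country and look for duplicated country-year pairs. The sketch below does this on a hypothetical toy data frame (`toy`, with made-up country codes) that only mimics the 10 x 24 layout; on the real data the same checks would be applied to `dfasa`.

```r
# Hypothetical toy panel: 10 units x 24 years (not the real ASEAN data)
toy <- expand.grid(Year = 2000:2023,
                   Country.Code = paste0("C", 1:10),
                   stringsAsFactors = FALSE)

# Rows per country; a balanced panel has the same count for every unit
obs_per_country <- as.vector(table(toy$Country.Code))

# No country-year pair may appear more than once
has_duplicates <- any(duplicated(toy[, c("Country.Code", "Year")]))

is_balanced <- length(unique(obs_per_country)) == 1 && !has_duplicates

nrow(toy)                  # total number of observations
length(unique(toy$Year))   # number of periods
is_balanced
```

If `is_balanced` is `FALSE` on real data, the usual next step is to inspect which country-year combinations are missing before fitting any panel model.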

dfasa <- read.csv("D:\\Documents\\dataset_ASEAN_interpolated1.csv")
head(dfasa)
##        Country.Name Country.Code Year CO2_damage Health_expenditure
## 1 Brunei Darussalam          BRN 2000   1.459899           2.547906
## 2 Brunei Darussalam          BRN 2001   1.631936           2.546511
## 3 Brunei Darussalam          BRN 2002   1.593368           2.534479
## 4 Brunei Darussalam          BRN 2003   1.768876           2.602606
## 5 Brunei Darussalam          BRN 2004   1.429289           2.551934
## 6 Brunei Darussalam          BRN 2005   1.218064           2.233862
##   GDP_per_capita Life_expectancy Urban_population
## 1       20130.26          74.017           71.164
## 2       18287.83          74.209           71.652
## 3       18621.29          74.365           72.046
## 4       20677.90          74.509           72.421
## 5       24423.09          74.603           72.794
## 6       29386.27          74.683           73.163

MULTICOLLINEARITY CHECK

library(dplyr)
## Warning: package 'dplyr' was built under R version 4.4.2
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(car)
## Warning: package 'car' was built under R version 4.4.3
## Loading required package: carData
## Warning: package 'carData' was built under R version 4.4.3
## 
## Attaching package: 'car'
## The following object is masked from 'package:dplyr':
## 
##     recode
# Fit one cross-sectional model per year, 2000-2023
# (each yearly fit uses only 10 observations, one per country)
models <- list()
for (yr in 2000:2023) {
  models[[as.character(yr)]] <- lm(Life_expectancy ~ GDP_per_capita + Health_expenditure +
                                     Urban_population + CO2_damage,
                                   data = dfasa %>% filter(Year == yr))
}

# Pooled model over all years
models[["2000-2023"]] <- lm(Life_expectancy ~ GDP_per_capita + Health_expenditure +
                              Urban_population + CO2_damage, data = dfasa)

# Compute the VIFs for each model
vif_list <- lapply(models, function(m) as.vector(vif(m)))

# Combine the VIF results into a single table
Multikol <- do.call(rbind, vif_list)

# Add row and column names ("Tahun" = "Year")
rownames(Multikol) <- c(paste("Tahun", 2000:2023), "Tahun 2000-2023")
colnames(Multikol) <- c("GDP_per_capita", "Health_expenditure", "Urban_population", "CO2_damage")

# Show the result
Multikol
##                 GDP_per_capita Health_expenditure Urban_population CO2_damage
## Tahun 2000            7.203887           1.733582         7.004671   1.906001
## Tahun 2001            7.317453           1.625378         7.291937   1.725218
## Tahun 2002            7.479406           1.515481         7.555937   1.539287
## Tahun 2003            6.435008           1.573591         6.331637   1.666421
## Tahun 2004            7.237533           2.231321         6.811370   2.435341
## Tahun 2005            7.129083           3.206209         5.356053   3.348552
## Tahun 2006            6.165364           3.243916         4.464650   3.022122
## Tahun 2007            8.975299           3.115019         6.055326   3.278438
## Tahun 2008            4.853444           1.782515         4.597213   1.773481
## Tahun 2009            5.832654           1.573185         6.343632   1.566299
## Tahun 2010            6.864312           1.374944         6.987727   1.600680
## Tahun 2011            5.937848           1.263848         5.802072   1.508507
## Tahun 2012            6.269482           1.371305         5.856654   1.661384
## Tahun 2013            7.647793           1.622898         6.890945   2.224761
## Tahun 2014            8.492856           1.505473         7.545215   2.269759
## Tahun 2015            8.747136           2.222964         6.923073   3.284422
## Tahun 2016            5.087380           2.735481         4.902905   2.899405
## Tahun 2017            4.374889           2.752192         5.134873   2.740985
## Tahun 2018            4.362871           3.490844         5.103778   3.606737
## Tahun 2019            4.444851           2.580416         5.080516   2.634783
## Tahun 2020            3.936497           2.414880         4.588157   2.645581
## Tahun 2021            3.402350           2.004583         5.009454   2.060641
## Tahun 2022            3.319331           1.613806         4.264898   1.778160
## Tahun 2023            3.261713           1.605007         4.155280   1.770310
## Tahun 2000-2023       3.394062           1.207256         3.381087   1.212925

Overall, the check suggests that there is no severe multicollinearity among the independent variables. Every VIF in the table is below the common rule-of-thumb threshold of 10; the largest is 8.98 (GDP per capita, 2007). GDP per capita and urban population do exceed the stricter threshold of 5 in many of the yearly models, but each of those regressions rests on only ten observations, so their VIFs are imprecise. In the pooled 2000-2023 model, which is closest to the panel regressions used below, all VIFs are below 3.4. The panel estimates are therefore unlikely to be seriously distorted by redundancy among the explanatory variables.
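For reference, the VIF of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on all the other predictors. The following sketch computes it by hand on simulated data; the variables `x1`, `x2` and `x3` are illustrative and not part of the ASEAN dataset. For a model with only numeric predictors, the result should agree with what `car::vif()` reports.

```r
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- 0.8 * x1 + rnorm(n, sd = 0.6)   # deliberately correlated with x1
x3 <- rnorm(n)                        # unrelated to the other two
X  <- data.frame(x1, x2, x3)

# Regress each predictor on the remaining ones and convert R^2 to a VIF
vif_manual <- sapply(names(X), function(v) {
  others <- setdiff(names(X), v)
  aux    <- lm(reformulate(others, response = v), data = X)
  1 / (1 - summary(aux)$r.squared)
})

vif_manual
```

A VIF of 1 means the predictor is uncorrelated with the others; the correlated pair `x1` and `x2` should show clearly larger values than `x3`.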

VARIABLES STANDARDIZATION

Before performing the regression and predictive modeling, all independent variables were standardized using the z-score transformation, while the dependent variable (life expectancy) was kept on its original scale (years). Standardization removes the unit differences among the explanatory variables, namely GDP per capita (in USD), CO₂ damage (as a percentage of GNI), health expenditure (as a percentage of GDP), and urban population (as a percentage of total population), so that their coefficients are directly comparable. The dependent variable was intentionally not standardized because it is directly interpretable in years. Keeping it in its natural unit means predicted values retain a real-world meaning: an increase of one unit in the predicted value corresponds to one additional year of life expectancy.

dfasaaa <- data.frame(
  Country.Name = dfasa$Country.Name,
  Year = dfasa$Year,
  Life_expectancy = dfasa$Life_expectancy,
  GDP_per_capita = scale(dfasa$GDP_per_capita),
  CO2_damage = scale(dfasa$CO2_damage),
  Health_expenditure = scale(dfasa$Health_expenditure),
  Urban_population = scale(dfasa$Urban_population)
)
head(dfasaaa)
##        Country.Name Year Life_expectancy GDP_per_capita  CO2_damage
## 1 Brunei Darussalam 2000          74.017      0.5584252 -0.32186290
## 2 Brunei Darussalam 2001          74.209      0.4519184 -0.16597804
## 3 Brunei Darussalam 2002          74.365      0.4711952 -0.20092483
## 4 Brunei Darussalam 2003          74.509      0.5900832 -0.04189456
## 5 Brunei Darussalam 2004          74.603      0.8065846 -0.34959916
## 6 Brunei Darussalam 2005          74.683      1.0934949 -0.54099369
##   Health_expenditure Urban_population
## 1         -0.7306505        0.8828937
## 2         -0.7314279        0.9029805
## 3         -0.7381293        0.9191982
## 4         -0.7001833        0.9346338
## 5         -0.7284073        0.9499870
## 6         -0.9055715        0.9651756
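As a small illustration of what `scale()` does, the sketch below standardizes the six GDP per capita values printed earlier by hand and compares the result with `scale()`. Because it uses only these six rows, its z-scores will not match the table above, which is standardized with the mean and standard deviation of all 240 observations.

```r
# First six GDP per capita values shown in head(dfasa)
x <- c(20130.26, 18287.83, 18621.29, 20677.90, 24423.09, 29386.27)

# z-score by hand: subtract the mean, divide by the sample standard deviation
z_manual <- (x - mean(x)) / sd(x)

# scale() returns a one-column matrix; flatten it for comparison
z_scale <- as.numeric(scale(x))

all.equal(z_manual, z_scale)   # the two approaches agree
mean(z_manual)                 # numerically zero
sd(z_manual)                   # one
```

With predictors on this scale, each regression coefficient reads as the change in life expectancy, in years, associated with a one-standard-deviation increase in that predictor.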

DETERMINING THE BEST TYPE

Common Effect Model

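The Common Effect Model (CEM), also known as pooled least squares, is the simplest panel data regression. It pools the time series and cross-sectional observations into a single sample and estimates one intercept and one set of slopes, ignoring any country-specific or year-specific effects.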

library(plm)
## Warning: package 'plm' was built under R version 4.4.3
## 
## Attaching package: 'plm'
## The following objects are masked from 'package:dplyr':
## 
##     between, lag, lead
cem <- plm(Life_expectancy ~ GDP_per_capita + Health_expenditure + Urban_population + CO2_damage,
           data=dfasaaa,
           model="pooling")

summary(cem)
## Pooling Model
## 
## Call:
## plm(formula = Life_expectancy ~ GDP_per_capita + Health_expenditure + 
##     Urban_population + CO2_damage, data = dfasaaa, model = "pooling")
## 
## Balanced Panel: n = 10, T = 24, N = 240
## 
## Residuals:
##     Min.  1st Qu.   Median  3rd Qu.     Max. 
## -7.28074 -1.48082 -0.36574  0.84318  6.86816 
## 
## Coefficients:
##                    Estimate Std. Error  t-value  Pr(>|t|)    
## (Intercept)        70.16647    0.18071 388.2843 < 2.2e-16 ***
## GDP_per_capita      0.92729    0.33362   2.7795  0.005885 ** 
## Health_expenditure  0.88007    0.19897   4.4232 1.488e-05 ***
## Urban_population    4.98226    0.33298  14.9627 < 2.2e-16 ***
## CO2_damage          1.14119    0.19944   5.7221 3.189e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    9041.6
## Residual Sum of Squares: 1841.8
## R-Squared:      0.7963
## Adj. R-Squared: 0.79283
## F-statistic: 229.663 on 4 and 235 DF, p-value: < 2.22e-16

All explanatory variables in the Pooling Model (GDP per capita, health expenditure, urban population, and CO₂ damage) have a significant effect on the response variable, life expectancy, at the 5% significance level. The R-Squared value of 0.7963 and Adjusted R-Squared of 0.7928 indicate that these four explanatory variables collectively explain approximately 79.28% of the variation in life expectancy across ASEAN countries during the 2000–2023 period, while the remaining variation is influenced by other factors outside the model.

Fixed Effect Model

The fixed effect approach assumes that each unit (here, each country) has its own intercept, capturing unobserved characteristics that do not change over time, while the regression slopes are the same for all units and periods. Variants of the model place these intercepts on individuals, on time periods, or on both.

# fem ind
fem.ind <- plm(Life_expectancy ~ GDP_per_capita + Health_expenditure + Urban_population + CO2_damage, data = dfasaaa, 
               model = "within",effect= "individual", index = c("Country.Name","Year"))

summary(fem.ind)
## Oneway (individual) effect Within Model
## 
## Call:
## plm(formula = Life_expectancy ~ GDP_per_capita + Health_expenditure + 
##     Urban_population + CO2_damage, data = dfasaaa, effect = "individual", 
##     model = "within", index = c("Country.Name", "Year"))
## 
## Balanced Panel: n = 10, T = 24, N = 240
## 
## Residuals:
##     Min.  1st Qu.   Median  3rd Qu.     Max. 
## -6.19495 -0.63789  0.29316  0.93060  2.62620 
## 
## Coefficients:
##                    Estimate Std. Error t-value  Pr(>|t|)    
## GDP_per_capita      1.11142    0.25117   4.425 1.500e-05 ***
## Health_expenditure  0.35590    0.15113   2.355   0.01938 *  
## Urban_population    7.98016    0.69376  11.503 < 2.2e-16 ***
## CO2_damage          0.70222    0.12978   5.411 1.595e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    1168.9
## Residual Sum of Squares: 505.02
## R-Squared:      0.56795
## Adj. R-Squared: 0.5431
## F-statistic: 74.2718 on 4 and 226 DF, p-value: < 2.22e-16

All explanatory variables in the fixed effect model significantly influence the response variable (life expectancy) at the 5% significance level. The model obtained an R-squared value of 0.5679 and an adjusted R-squared of 0.5431, indicating that the explanatory variables collectively explain 54.31% of the variation in life expectancy across ASEAN countries during the study period.

# fem time
fem.time <- plm(Life_expectancy ~ GDP_per_capita + Health_expenditure + Urban_population + CO2_damage, data = dfasaaa, model = "within", effect= "time", index = c("Country.Name","Year"))

summary(fem.time)
## Oneway (time) effect Within Model
## 
## Call:
## plm(formula = Life_expectancy ~ GDP_per_capita + Health_expenditure + 
##     Urban_population + CO2_damage, data = dfasaaa, effect = "time", 
##     model = "within", index = c("Country.Name", "Year"))
## 
## Balanced Panel: n = 10, T = 24, N = 240
## 
## Residuals:
##     Min.  1st Qu.   Median  3rd Qu.     Max. 
## -6.88296 -1.87393 -0.36326  0.90597  7.77869 
## 
## Coefficients:
##                    Estimate Std. Error t-value  Pr(>|t|)    
## GDP_per_capita      0.90584    0.33660  2.6912  0.007688 ** 
## Health_expenditure  0.97048    0.22880  4.2416 3.315e-05 ***
## Urban_population    5.00939    0.32498 15.4145 < 2.2e-16 ***
## CO2_damage          1.43201    0.24595  5.8223 2.128e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    8234.3
## Residual Sum of Squares: 1571.4
## R-Squared:      0.80916
## Adj. R-Squared: 0.78486
## F-statistic: 224.725 on 4 and 212 DF, p-value: < 2.22e-16

All explanatory variables in the time fixed effect model have a significant influence on the response variable (Life Expectancy) at the 5% significance level. The model yields an R-Squared value of 0.80916 and an Adjusted R-Squared of 0.78486, indicating that the explanatory variables collectively explain 78.49% of the variation in life expectancy across ASEAN countries over time.

fem.twoway <- plm(Life_expectancy ~ GDP_per_capita + Health_expenditure + Urban_population + CO2_damage, data = dfasaaa, model = "within", effect= "twoway", index = c("Country.Name","Year"))

summary(fem.twoway)
## Twoways effects Within Model
## 
## Call:
## plm(formula = Life_expectancy ~ GDP_per_capita + Health_expenditure + 
##     Urban_population + CO2_damage, data = dfasaaa, effect = "twoway", 
##     model = "within", index = c("Country.Name", "Year"))
## 
## Balanced Panel: n = 10, T = 24, N = 240
## 
## Residuals:
##      Min.   1st Qu.    Median   3rd Qu.      Max. 
## -5.528601 -0.556909 -0.052402  0.683020  2.858952 
## 
## Coefficients:
##                     Estimate Std. Error t-value  Pr(>|t|)    
## GDP_per_capita     -0.080596   0.232723 -0.3463   0.72946    
## Health_expenditure  0.227517   0.129484  1.7571   0.08041 .  
## Urban_population    0.517603   0.930890  0.5560   0.57880    
## CO2_damage          1.031358   0.118834  8.6790 1.303e-15 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    361.61
## Residual Sum of Squares: 259.52
## R-Squared:      0.28233
## Adj. R-Squared: 0.15506
## F-statistic: 19.9648 on 4 and 203 DF, p-value: 7.0558e-14

In the two-way fixed effects model, only CO₂ damage has a statistically significant effect on life expectancy at the 5% significance level. The model yields an R-squared value of 0.2823 and an adjusted R-squared of 0.1551, indicating that the explanatory variables collectively explain about 15.51% of the variation in life expectancy across ASEAN countries from 2000 to 2023.
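The gap between R-squared and adjusted R-squared in the models above follows from the degrees of freedom used up by the fixed effects. As a rough check, the sketch below recomputes the adjusted values from the reported R-squared, N = 240, and the residual degrees of freedom shown in each F-statistic line. The helper `adj_r2` is a hypothetical name introduced here, and the formula is inferred from the reported figures rather than taken from the plm documentation.

```r
# Adjusted R^2 from R^2, sample size and residual degrees of freedom
adj_r2 <- function(r2, n_obs, df_resid) {
  1 - (1 - r2) * (n_obs - 1) / df_resid
}

n_obs <- 240

adj_pooled <- adj_r2(0.7963,  n_obs, 235)  # pooled model
adj_ind    <- adj_r2(0.56795, n_obs, 226)  # individual fixed effects
adj_twoway <- adj_r2(0.28233, n_obs, 203)  # two-way fixed effects

round(c(pooled = adj_pooled, individual = adj_ind, twoway = adj_twoway), 4)
```

The two-way model estimates 9 country and 23 year intercepts in addition to the four slopes, which is why its adjusted R-squared falls so far below its R-squared.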

Random Effect Model

Random effect is an approach for estimating panel data where the residuals may be correlated across time and individuals. In a random effect model, parameters that differ across individuals and over time are included in the error term, which is why this model is also referred to as an error component model.

# rem individual
rem_ind <- plm(Life_expectancy ~ GDP_per_capita + Health_expenditure + Urban_population + CO2_damage, data = dfasaaa, index = c("Country.Name", "Year"), effect = "individual", model = "random", random.method = "amemiya")

summary(rem_ind)
## Oneway (individual) effect Random Effect Model 
##    (Amemiya's transformation)
## 
## Call:
## plm(formula = Life_expectancy ~ GDP_per_capita + Health_expenditure + 
##     Urban_population + CO2_damage, data = dfasaaa, effect = "individual", 
##     model = "random", random.method = "amemiya", index = c("Country.Name", 
##         "Year"))
## 
## Balanced Panel: n = 10, T = 24, N = 240
## 
## Effects:
##                  var std.dev share
## idiosyncratic  2.196   1.482 0.114
## individual    17.012   4.125 0.886
## theta: 0.9269
## 
## Residuals:
##     Min.  1st Qu.   Median  3rd Qu.     Max. 
## -6.07917 -0.73942  0.29385  0.89573  2.83389 
## 
## Coefficients:
##                    Estimate Std. Error z-value  Pr(>|z|)    
## (Intercept)        70.16647    1.31747 53.2583 < 2.2e-16 ***
## GDP_per_capita      1.07228    0.24958  4.2964 1.736e-05 ***
## Health_expenditure  0.41357    0.14858  2.7835  0.005378 ** 
## Urban_population    7.27416    0.61964 11.7393 < 2.2e-16 ***
## CO2_damage          0.74289    0.12827  5.7917 6.967e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    1211
## Residual Sum of Squares: 523.67
## R-Squared:      0.56757
## Adj. R-Squared: 0.56021
## Chisq: 308.441 on 4 DF, p-value: < 2.22e-16

All explanatory variables in the individual random effect model significantly affect the response variable (life expectancy) at the 5% significance level. The model yields an R-squared value of 0.5676 and an adjusted R-squared of 0.5602, indicating that the explanatory variables collectively explain about 56.02% of the variation in life expectancy across ASEAN countries.

# rem time
rem_time <- plm(Life_expectancy ~ GDP_per_capita + Health_expenditure + Urban_population + CO2_damage, data = dfasaaa, index = c("Country.Name", "Year"), effect = "time", model = "random", random.method = "amemiya")

summary(rem_time)
## Oneway (time) effect Random Effect Model 
##    (Amemiya's transformation)
## 
## Call:
## plm(formula = Life_expectancy ~ GDP_per_capita + Health_expenditure + 
##     Urban_population + CO2_damage, data = dfasaaa, effect = "time", 
##     model = "random", random.method = "amemiya", index = c("Country.Name", 
##         "Year"))
## 
## Balanced Panel: n = 10, T = 24, N = 240
## 
## Effects:
##                  var std.dev share
## idiosyncratic 7.2750  2.6972 0.938
## time          0.4801  0.6929 0.062
## theta: 0.2238
## 
## Residuals:
##     Min.  1st Qu.   Median  3rd Qu.     Max. 
## -7.16208 -1.53896 -0.30969  0.73604  6.64378 
## 
## Coefficients:
##                    Estimate Std. Error  z-value  Pr(>|z|)    
## (Intercept)        70.16647    0.22617 310.2445 < 2.2e-16 ***
## GDP_per_capita      0.90602    0.32765   2.7652  0.005688 ** 
## Health_expenditure  0.89542    0.20338   4.4028 1.069e-05 ***
## Urban_population    4.99076    0.32390  15.4085 < 2.2e-16 ***
## CO2_damage          1.22892    0.20892   5.8823 4.045e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    8720.7
## Residual Sum of Squares: 1738
## R-Squared:      0.8007
## Adj. R-Squared: 0.79731
## Chisq: 944.146 on 4 DF, p-value: < 2.22e-16

All explanatory variables in the time random effect model significantly influence the response variable, life expectancy, at the 5% significance level. The model yields an R-squared value of 0.8007 and an adjusted R-squared of 0.7973, indicating that the explanatory variables collectively explain about 79.73% of the variation in life expectancy across ASEAN countries from 2000 to 2023.
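The `theta` value in each random effects summary is the share of the group mean that is subtracted from every variable before estimation (quasi-demeaning). For a balanced one-way model it can be reconstructed from the printed variance components. The sketch below does so with inputs copied from the two summaries above; the helper `theta_oneway` is a hypothetical name introduced for illustration.

```r
# theta = 1 - sqrt(var_idiosyncratic / (var_idiosyncratic + m * var_effect)),
# where m is the number of observations per group
theta_oneway <- function(var_idios, var_effect, n_per_group) {
  1 - sqrt(var_idios / (var_idios + n_per_group * var_effect))
}

# Individual effect: each country is observed in T = 24 years
theta_ind <- theta_oneway(var_idios = 2.196, var_effect = 17.012, n_per_group = 24)

# Time effect: each year contains n = 10 countries
theta_time <- theta_oneway(var_idios = 7.2750, var_effect = 0.4801, n_per_group = 10)

round(c(individual = theta_ind, time = theta_time), 4)
```

A theta close to 1 makes the random effects estimator behave like the corresponding fixed effects estimator, while a theta close to 0 makes it behave like pooled OLS. This is consistent with the individual random effects coefficients resembling the individual fixed effects ones, and the time random effects coefficients resembling the pooled ones.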

# rem two ways
rem_twoway <- plm(Life_expectancy ~ GDP_per_capita + Health_expenditure + Urban_population + CO2_damage, data = dfasaaa, index = c("Country.Name", "Year"), effect = "twoway", model = "random", random.method = "amemiya")

summary(rem_twoway)
## Twoways effects Random Effect Model 
##    (Amemiya's transformation)
## 
## Call:
## plm(formula = Life_expectancy ~ GDP_per_capita + Health_expenditure + 
##     Urban_population + CO2_damage, data = dfasaaa, effect = "twoway", 
##     model = "random", random.method = "amemiya", index = c("Country.Name", 
##         "Year"))
## 
## Balanced Panel: n = 10, T = 24, N = 240
## 
## Effects:
##                  var std.dev share
## idiosyncratic  1.254   1.120 0.037
## individual    30.290   5.504 0.887
## time           2.587   1.609 0.076
## theta: 0.9585 (id) 0.785 (time) 0.7843 (total)
## 
## Residuals:
##      Min.   1st Qu.    Median   3rd Qu.      Max. 
## -5.832106 -0.577973  0.054211  0.799038  2.342464 
## 
## Coefficients:
##                    Estimate Std. Error z-value  Pr(>|z|)    
## (Intercept)        70.16647    1.76544 39.7446 < 2.2e-16 ***
## GDP_per_capita      0.23912    0.21540  1.1101 0.2669416    
## Health_expenditure  0.30849    0.12498  2.4684 0.0135730 *  
## Urban_population    2.54849    0.75695  3.3668 0.0007605 ***
## CO2_damage          1.04892    0.11522  9.1040 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    412.47
## Residual Sum of Squares: 292.25
## R-Squared:      0.29146
## Adj. R-Squared: 0.2794
## Chisq: 96.6688 on 4 DF, p-value: < 2.22e-16

All explanatory variables in the two-way random effect model significantly influence the response variable (life expectancy) at the 5% significance level, except for GDP per capita. The model yields an R-squared value of 0.2915 and an adjusted R-squared of 0.2794, indicating that the explanatory variables collectively explain approximately 27.94% of the variation in life expectancy across ASEAN countries.

Determining The Best Model

Chow Test

Hypothesis

H0 : Common Effect Model (CEM) is better to use

H1 : Fixed Effect Model (FEM) is better to use

# cem vs fem two-way
pooltest(cem, fem.twoway)
## 
##  F statistic
## 
## data:  Life_expectancy ~ GDP_per_capita + Health_expenditure + Urban_population +  ...
## F = 38.677, df1 = 32, df2 = 203, p-value < 2.2e-16
## alternative hypothesis: unstability

Because the p-value is below 0.05, H0 is rejected: the two-way Fixed Effect Model (FEM) is better to use than the CEM.

# cem vs fem time
pooltest(cem, fem.time)
## 
##  F statistic
## 
## data:  Life_expectancy ~ GDP_per_capita + Health_expenditure + Urban_population +  ...
## F = 1.586, df1 = 23, df2 = 212, p-value = 0.0487
## alternative hypothesis: unstability

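Because the p-value (0.0487) is below 0.05, although only marginally, H0 is rejected: the time Fixed Effect Model (FEM) is better to use than the CEM.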

# cem vs fem individual
pooltest(cem, fem.ind)
## 
##  F statistic
## 
## data:  Life_expectancy ~ GDP_per_capita + Health_expenditure + Urban_population +  ...
## F = 66.468, df1 = 9, df2 = 226, p-value < 2.2e-16
## alternative hypothesis: unstability

Because the p-value is below 0.05, H0 is rejected: the individual Fixed Effect Model (FEM) is better to use than the CEM.
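The three F statistics above can be reproduced from the residual sums of squares printed in the model summaries, which makes the logic of the test explicit: it asks whether adding the fixed effects reduces the residual sum of squares by more than chance would suggest. The sketch below is a minimal illustration with the reported figures typed in by hand; `chow_f` is a hypothetical helper name.

```r
# F = [(RSS_pooled - RSS_fe) / df1] / [RSS_fe / df2]
chow_f <- function(rss_pooled, rss_fe, df1, df2) {
  ((rss_pooled - rss_fe) / df1) / (rss_fe / df2)
}

rss_pooled <- 1841.8   # from summary(cem)

f_twoway <- chow_f(rss_pooled, 259.52, df1 = 32, df2 = 203)  # 9 country + 23 year effects
f_time   <- chow_f(rss_pooled, 1571.4, df1 = 23, df2 = 212)  # 23 year effects
f_ind    <- chow_f(rss_pooled, 505.02, df1 = 9,  df2 = 226)  # 9 country effects

# Upper-tail p-value for the marginal case (time effects)
p_time <- pf(f_time, df1 = 23, df2 = 212, lower.tail = FALSE)

round(c(twoway = f_twoway, time = f_time, individual = f_ind), 3)
p_time
```

Small differences from the `pooltest()` output in the last digit are expected because the sums of squares are entered in rounded form.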

Hausman Test

Hypothesis

H0 : Random Effect Model (REM) is better to use

H1 : Fixed Effect Model (FEM) is better to use

# fem two-way vs rem two-way
phtest(fem.twoway, rem_twoway)
## 
##  Hausman Test
## 
## data:  Life_expectancy ~ GDP_per_capita + Health_expenditure + Urban_population +  ...
## chisq = 15.155, df = 4, p-value = 0.004391
## alternative hypothesis: one model is inconsistent

Because the p-value (0.0044) is below 0.05, H0 is rejected: the two-way Fixed Effect Model (FEM) is better to use than the two-way REM.

# fem time vs rem time
phtest(fem.time, rem_time)
## 
##  Hausman Test
## 
## data:  Life_expectancy ~ GDP_per_capita + Health_expenditure + Urban_population +  ...
## chisq = 15.778, df = 4, p-value = 0.003332
## alternative hypothesis: one model is inconsistent

Because the p-value (0.0033) is below 0.05, H0 is rejected: the time Fixed Effect Model (FEM) is better to use than the time REM.

# fem individual vs rem individual
phtest(fem.ind,rem_ind)
## 
##  Hausman Test
## 
## data:  Life_expectancy ~ GDP_per_capita + Health_expenditure + Urban_population +  ...
## chisq = 5.2618, df = 4, p-value = 0.2615
## alternative hypothesis: one model is inconsistent

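Because the p-value (0.2615) exceeds 0.05, H0 is not rejected: the individual Random Effect Model (REM) is better to use than the individual FEM.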
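The decision rule applied in the three Hausman tests can be summarized in a few lines. The sketch below recomputes the p-values from the reported chi-square statistics (4 degrees of freedom each) and applies the 5% threshold; the data frame `hausman` is a hypothetical container built by hand from the outputs above.

```r
# Reported Hausman statistics, typed in from the phtest() outputs
hausman <- data.frame(
  comparison = c("two-way", "time", "individual"),
  chisq      = c(15.155, 15.778, 5.2618),
  df         = 4
)

# Upper-tail chi-square probability
hausman$p_value <- pchisq(hausman$chisq, df = hausman$df, lower.tail = FALSE)

# Reject H0 (REM) in favour of FEM when p < 0.05
hausman$preferred <- ifelse(hausman$p_value < 0.05, "FEM", "REM")

hausman
```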

Fixed Effect Model (FEM)

The Fixed Effect Model (FEM) incorporates both individual-specific and time-specific influences in the analysis. This model assumes that the intercepts differ across individuals and over time, while the slope coefficients remain constant for all observations. Parameter estimation in the fixed effect framework is commonly performed using the Least Squares Dummy Variable (LSDV) approach, as outlined by Baltagi (2005). When the model includes either an individual effect or a time effect, it is referred to as a one-way error component model. If both effects are considered simultaneously, it becomes a two-way error component model.

In the one-way specification, each individual or time period has its own intercept. These intercepts are separately estimated and then incorporated into the regression equation. The resulting model assumes that the effect of the explanatory variables (the slopes) is uniform across all individuals or time periods.

The two-way specification extends this idea by including both individual and time effects. The individual effects capture unobservable characteristics that differ across entities but remain constant over time, whereas the time effects capture shocks or events that influence all entities equally within a given period (Gujarati, 2004). By accounting for these unobserved effects, the model yields more reliable and less biased estimates of the relationship between the explanatory variables and the dependent variable.

  1. General Model Form \[ y_{it} = \beta_0 + \beta_1 X_{1it} + \beta_2 X_{2it} + \dots + \beta_j X_{jit} + \varepsilon_{it} \]

This equation represents the overall relationship between the dependent and independent variables, without distinguishing between individuals or time periods.

  2. Model with Unit-Specific Intercepts and Constant Slopes \[ y_{it} = (\beta_0 + \beta_{0i}) + \beta_1 X_{1it} + \beta_2 X_{2it} + \dots + \beta_j X_{jit} + \varepsilon_{it} \]

In this specification, the intercept varies across individuals, as indicated by \(\beta_{0i}\), while the slopes remain identical. This allows the model to account for individual heterogeneity that is constant over time.

  3. Model with Unit- and Time-Specific Intercepts \[ y_{it} = (\beta_0 + \beta_{0i} + \beta_{0t}) + \beta_1 X_{1it} + \beta_2 X_{2it} + \dots + \beta_j X_{jit} + \varepsilon_{it} \]

This version of the model considers both individual and time variations in the intercept term. The inclusion of \(\beta_{0i}\) and \(\beta_{0t}\) implies that unobserved effects may arise not only from differences between individuals but also from time-related factors affecting all individuals.

The estimation of this model is carried out using the Ordinary Least Squares (OLS) technique with the inclusion of dummy variables representing individuals and/or time periods, ensuring that unobserved heterogeneity is properly controlled within the regression framework.
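To make the dummy-variable idea concrete, the sketch below fits a one-way LSDV regression with base R's `lm()` on simulated data. Everything in it is hypothetical: the five units, their true intercepts in `alpha`, and the true slope of 2 are made up so that the recovered estimates can be compared with known values.

```r
set.seed(42)
n_id <- 5
n_t  <- 8

# Toy balanced panel with unit identifiers C1..C5
toy <- expand.grid(year = 1:n_t, id = factor(paste0("C", 1:n_id)))

alpha <- c(60, 65, 70, 75, 80)   # true unit-specific intercepts (made up)
toy$x <- rnorm(nrow(toy))
toy$y <- alpha[as.integer(toy$id)] + 2 * toy$x + rnorm(nrow(toy), sd = 0.1)

# "0 +" drops the common intercept so every unit gets its own dummy coefficient
lsdv <- lm(y ~ 0 + id + x, data = toy)

round(coef(lsdv), 2)
```

The coefficients labelled `idC1` to `idC5` estimate the unit intercepts, and the coefficient on `x` is the common slope. A time effect could be added in the same way by including `factor(year)` in the formula.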

Fixed Effect Model: Within Estimator

The Fixed Effects Model (FEM) is a panel data regression approach employed to assess how economic, environmental, and social factors influence life expectancy across ASEAN countries over time. This model controls for unobserved, time-invariant characteristics specific to each country, such as institutional quality, cultural norms, or long-standing health infrastructure, that might otherwise bias the estimates. In the FEM framework, these country-specific effects are allowed to differ across units but are assumed constant over time, while the slope coefficients of the explanatory variables remain uniform across all observations. This allows the analysis to focus on within-country variations across years, providing a clearer understanding of how changes in income, health expenditure, urbanization, and environmental factors contribute to shifts in life expectancy within each ASEAN nation.

# make sure the plm package is installed
library(plm)

# uses the standardized data frame dfasaaa created above,
# with columns: Country.Name, Year, Life_expectancy, GDP_per_capita, CO2_damage, Health_expenditure, Urban_population

# convert to a panel data frame
pdata <- pdata.frame(dfasaaa, index = c("Country.Name", "Year"))

# two-way fixed effects model
fem.twoway <- plm(Life_expectancy ~ GDP_per_capita + CO2_damage + 
                    Health_expenditure + Urban_population,
                  data = pdata,
                  model = "within",
                  effect = "twoways")

# summary of the results
summary(fem.twoway)
## Twoways effects Within Model
## 
## Call:
## plm(formula = Life_expectancy ~ GDP_per_capita + CO2_damage + 
##     Health_expenditure + Urban_population, data = pdata, effect = "twoways", 
##     model = "within")
## 
## Balanced Panel: n = 10, T = 24, N = 240
## 
## Residuals:
##      Min.   1st Qu.    Median   3rd Qu.      Max. 
## -5.528601 -0.556909 -0.052402  0.683020  2.858952 
## 
## Coefficients:
##                     Estimate Std. Error t-value  Pr(>|t|)    
## GDP_per_capita     -0.080596   0.232723 -0.3463   0.72946    
## CO2_damage          1.031358   0.118834  8.6790 1.303e-15 ***
## Health_expenditure  0.227517   0.129484  1.7571   0.08041 .  
## Urban_population    0.517603   0.930890  0.5560   0.57880    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    361.61
## Residual Sum of Squares: 259.52
## R-Squared:      0.28233
## Adj. R-Squared: 0.15506
## F-statistic: 19.9648 on 4 and 203 DF, p-value: 7.0558e-14

The two-way fixed effects model shows that the standardized predictors explain only a modest portion of the variation in life expectancy that remains after country and year effects are removed, with an R² of 0.2823 and an adjusted R² of 0.1551. Among the predictors, only CO₂ damage is statistically significant at the 5% level; its coefficient of 1.03 means that a one-standard-deviation increase in CO₂ damage is associated with about 1.03 additional years of predicted life expectancy. Health expenditure has a smaller positive coefficient (0.23) that is significant only at the 10% level, while GDP per capita and urban population are not statistically significant. Overall, the low R² values indicate that, once country and year effects are accounted for, these four variables alone have limited ability to predict changes in life expectancy.

Individual and Time Effect Check

# individual & time effect
plmtest(fem.twoway,type = "bp", effect="twoways" )
## 
##  Lagrange Multiplier Test - two-ways effects (Breusch-Pagan)
## 
## data:  Life_expectancy ~ GDP_per_capita + CO2_damage + Health_expenditure +  ...
## chisq = 1254.4, df = 2, p-value < 2.2e-16
## alternative hypothesis: significant effects

This test examines whether individual and time effects are present in the model. It applies the Breusch-Pagan Lagrange Multiplier (LM) test to the null hypothesis that there are neither individual nor time effects. A significant result (p-value < 0.05) rejects that null, indicating that cross-sectional and/or time variation plays an important role in explaining differences in the dependent variable. Here the null is rejected (p-value < 2.2e-16); together with the Chow tests above, in which both the individual and the time effects were significant, this suggests that a two-way effects model is appropriate.

Cross-Sectional Independence Test

This test examines whether there is cross-sectional dependence among the residuals of the panel model. If the result is significant, it means that the residuals across countries are correlated, implying that developments or shocks in one country may affect others.

H0: There is no dependency between individual units.

H1: There is dependency between individual units.

pcdtest(fem.twoway, test = "lm")
## 
##  Breusch-Pagan LM test for cross-sectional dependence in panels
## 
## data:  Life_expectancy ~ GDP_per_capita + CO2_damage + Health_expenditure +     Urban_population
## chisq = 332.86, df = 45, p-value < 2.2e-16
## alternative hypothesis: cross-sectional dependence

The Breusch-Pagan LM test indicates significant cross-sectional dependence (p < 0.001), suggesting that the panel units are interrelated and that common factors may influence life expectancy across countries.
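As an illustration of what this statistic measures, here is a minimal base-R sketch of the Breusch-Pagan LM statistic for cross-sectional dependence in a balanced panel. The residual matrix is simulated with a shared shock, and every name in it (`resid_mat`, `cd_lm`, `dof`) is hypothetical rather than taken from the analysis above.

```r
# Simulated residuals with a common factor (hypothetical, not model output)
set.seed(2)
n_units   <- 10
n_periods <- 24
common_shock <- rnorm(n_periods)

# Rows are periods, columns are units; each unit loads on the common shock
resid_mat <- sapply(seq_len(n_units), function(i) {
  0.7 * common_shock + rnorm(n_periods)
})

# Pairwise correlations between the residual series of different units
rho <- cor(resid_mat)
rho_ij <- rho[upper.tri(rho)]          # the N(N-1)/2 distinct pairs

# LM statistic: number of periods times the sum of squared correlations
cd_lm <- n_periods * sum(rho_ij^2)
dof <- n_units * (n_units - 1) / 2     # 45 when there are 10 units
p_value <- pchisq(cd_lm, df = dof, lower.tail = FALSE)
c(statistic = cd_lm, df = dof, p.value = p_value)
```

With 10 cross-sectional units there are 45 distinct country pairs, which matches the `df = 45` shown in the test output.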

FIXED EFFECT MODEL (WITH LSDV)

The results of the two-way effects test show that both individual and time effects are statistically significant. This finding indicates that the variation in life expectancy among ASEAN countries is not only shaped by differences across countries but also by changes that occur over time. In other words, there are country-specific characteristics—such as institutional quality, healthcare systems, and economic structures—as well as temporal factors—such as global crises, technological advances, or policy reforms—that jointly influence life expectancy dynamics. Because these influences are not directly captured by observable variables, it becomes essential to use a modeling approach that can account for both dimensions simultaneously.

For this reason, the Fixed Effects Model using the Least Squares Dummy Variable (LSDV) approach is chosen over the Within transformation model. The LSDV method explicitly includes dummy variables representing individual (country) and time effects, enabling the model to isolate unobserved heterogeneity that remains constant within each unit or time period. Unlike the Within estimator, which eliminates fixed effects through data transformation, the LSDV approach retains these effects in the model structure, allowing for a clearer interpretation of how each country and each period contribute to variations in life expectancy. This is especially beneficial in policy-oriented analyses, as it provides a more concrete understanding of which countries deviate significantly from the overall average trend.

Furthermore, the LSDV estimator remains consistent when there is correlation between explanatory variables and unobserved effects, a condition often present in socioeconomic panel data. For the slope coefficients it is numerically equivalent to the Within estimator; its advantage lies in reporting the country and year effects directly. It also facilitates statistical testing for the presence of individual and time-specific differences through F-tests on the dummy coefficients. In the context of this study, where the goal is predictive modeling rather than pure hypothesis testing, using LSDV ensures that the predictions incorporate both cross-sectional and temporal heterogeneity, which gives a more contextually meaningful model of life expectancy variation within the ASEAN region.
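The relationship between the two estimators can be checked with a small base-R sketch on simulated data. This is not the ASEAN dataset, and the names `d`, `demean2`, `lsdv`, and `fe_within` are hypothetical. It assumes a balanced panel, for which the two-way within transformation has the simple closed form used below.

```r
# Simulated balanced panel (hypothetical data)
set.seed(3)
n_units   <- 10
n_periods <- 24
unit_idx   <- rep(seq_len(n_units), each = n_periods)
period_idx <- rep(seq_len(n_periods), times = n_units)

alpha  <- rnorm(n_units, sd = 3)[unit_idx]    # unit effects
lambda <- rnorm(n_periods)[period_idx]        # period effects

d <- data.frame(id = factor(unit_idx), year = factor(period_idx))
d$x <- rnorm(nrow(d)) + 0.5 * alpha           # regressor correlated with effects
d$y <- 2 * d$x + alpha + lambda + rnorm(nrow(d))

# LSDV: dummies for every unit and period enter the regression explicitly
lsdv <- lm(y ~ x + id + year, data = d)

# Two-way within transformation for a balanced panel:
# subtract unit means and period means, add back the grand mean
demean2 <- function(v, g1, g2) v - ave(v, g1) - ave(v, g2) + mean(v)
d$x_w <- demean2(d$x, d$id, d$year)
d$y_w <- demean2(d$y, d$id, d$year)
fe_within <- lm(y_w ~ x_w - 1, data = d)

slopes <- c(lsdv   = unname(coef(lsdv)["x"]),
            within = unname(coef(fe_within)["x_w"]))
r2 <- c(lsdv   = summary(lsdv)$r.squared,
        within = summary(fe_within)$r.squared)
slopes
r2
```

In this sketch the two slope estimates coincide, while the LSDV R² is higher because it also counts the variation absorbed by the dummies. The choice between the two forms is therefore mainly about what gets reported, not about the slope estimates themselves.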

# LSDV: two-way fixed effects estimated with explicit year and country dummies
fem.lsdv.std <- lm(Life_expectancy ~ GDP_per_capita + CO2_damage + 
                     Health_expenditure + Urban_population +
                     factor(Year) + factor(Country.Name), 
                   data = dfasaaa)
summary(fem.lsdv.std)
## 
## Call:
## lm(formula = Life_expectancy ~ GDP_per_capita + CO2_damage + 
##     Health_expenditure + Urban_population + factor(Year) + factor(Country.Name), 
##     data = dfasaaa)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5.5286 -0.5569 -0.0524  0.6830  2.8590 
## 
## Coefficients:
##                                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                      71.24426    0.96835  73.573  < 2e-16 ***
## GDP_per_capita                   -0.08060    0.23272  -0.346 0.729462    
## CO2_damage                        1.03136    0.11883   8.679 1.30e-15 ***
## Health_expenditure                0.22752    0.12948   1.757 0.080407 .  
## Urban_population                  0.51760    0.93089   0.556 0.578803    
## factor(Year)2001                  0.20856    0.50638   0.412 0.680867    
## factor(Year)2002                  0.60266    0.50740   1.188 0.236319    
## factor(Year)2003                  0.95007    0.50991   1.863 0.063878 .  
## factor(Year)2004                  1.20453    0.51294   2.348 0.019821 *  
## factor(Year)2005                  1.85960    0.51793   3.590 0.000414 ***
## factor(Year)2006                  2.27289    0.52340   4.343 2.22e-05 ***
## factor(Year)2007                  2.80353    0.52976   5.292 3.12e-07 ***
## factor(Year)2008                  2.61104    0.53966   4.838 2.59e-06 ***
## factor(Year)2009                  3.42415    0.54424   6.292 1.90e-09 ***
## factor(Year)2010                  3.92059    0.55410   7.076 2.37e-11 ***
## factor(Year)2011                  4.34700    0.56891   7.641 8.38e-13 ***
## factor(Year)2012                  4.58261    0.57777   7.932 1.43e-13 ***
## factor(Year)2013                  4.71867    0.58740   8.033 7.63e-14 ***
## factor(Year)2014                  4.84079    0.59382   8.152 3.65e-14 ***
## factor(Year)2015                  4.75653    0.59912   7.939 1.36e-13 ***
## factor(Year)2016                  4.67746    0.60726   7.703 5.77e-13 ***
## factor(Year)2017                  4.78111    0.61981   7.714 5.39e-13 ***
## factor(Year)2018                  4.91105    0.63349   7.752 4.27e-13 ***
## factor(Year)2019                  5.03252    0.64367   7.819 2.85e-13 ***
## factor(Year)2020                  4.67799    0.66673   7.016 3.33e-11 ***
## factor(Year)2021                  3.24822    0.71066   4.571 8.44e-06 ***
## factor(Year)2022                  4.03059    0.72567   5.554 8.67e-08 ***
## factor(Year)2023                  4.65980    0.73071   6.377 1.19e-09 ***
## factor(Country.Name)Cambodia     -7.21957    2.25557  -3.201 0.001591 ** 
## factor(Country.Name)Indonesia    -6.47294    1.19321  -5.425 1.64e-07 ***
## factor(Country.Name)Lao PDR     -10.63232    1.92166  -5.533 9.65e-08 ***
## factor(Country.Name)Malaysia     -0.82468    0.51626  -1.597 0.111729    
## factor(Country.Name)Myanmar     -10.57869    1.97717  -5.350 2.36e-07 ***
## factor(Country.Name)Philippines  -5.19223    1.34620  -3.857 0.000154 ***
## factor(Country.Name)Singapore     7.10220    1.13401   6.263 2.22e-09 ***
## factor(Country.Name)Thailand     -0.04129    1.41037  -0.029 0.976673    
## factor(Country.Name)Timor-Leste  -9.88581    1.99765  -4.949 1.57e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.131 on 203 degrees of freedom
## Multiple R-squared:  0.9713, Adjusted R-squared:  0.9662 
## F-statistic: 190.8 on 36 and 203 DF,  p-value: < 2.2e-16

Based on the estimation results of the Fixed Effects Model using the Least Squares Dummy Variable (LSDV) approach, variations in life expectancy across ASEAN countries are explained remarkably well by the combined influence of economic, health, environmental, and demographic factors together with the country and year dummies, with an R-squared of 0.9713 and an adjusted R-squared of 0.9662. These values indicate that the model captures almost all cross-country and temporal variation from 2000 to 2023, although much of this fit comes from the dummies themselves, since the within R² of the same specification is only 0.2823. In-sample, the model therefore demonstrates a high degree of explanatory power, making it a useful tool for estimating changes in life expectancy driven by socioeconomic characteristics.
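The gap between this R-squared and the 0.2823 reported earlier for the two-way within model can be reconciled with a short calculation. The sketch below assumes that both models are the same specification fitted to the same 240 observations (both report 203 residual degrees of freedom) and that the earlier figure is a within R², so the two models share one residual sum of squares. The variable names are hypothetical.

```r
r2_lsdv   <- 0.9713   # Multiple R-squared from the LSDV summary
r2_within <- 0.2823   # R-squared reported for the two-way within model

# With a shared residual sum of squares (assumption stated above):
#   1 - r2_lsdv   = RSS / total variation
#   1 - r2_within = RSS / variation left after removing country and year means
# so their ratio is the share of total variation that remains after demeaning.
share_within  <- (1 - r2_lsdv) / (1 - r2_within)
share_dummies <- 1 - share_within

round(c(within_share = share_within, dummy_share = share_dummies), 3)
```

Under these assumptions, roughly 96% of the total variation in life expectancy is absorbed by the country and year dummies alone, and the four predictors explain about 28% of the remaining 4%.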

The analysis shows that CO₂ damage has the largest and most statistically significant positive coefficient (1.03, p < 0.001). This is an association rather than a causal effect: it suggests that countries with higher levels of industrialization tend to possess stronger healthcare systems, more developed infrastructure, and greater social welfare capacity. In other words, within the predictive framework, the positive effects of development outweigh the negative externalities of industrialization on life expectancy. Meanwhile, health expenditure as a share of GDP shows a positive relationship with life expectancy (0.23), although it is significant only at the 10% level (p = 0.080). This result is consistent with the view that sustained investment in the health sector plays a critical role in improving longevity. Greater fiscal commitment to healthcare through universal coverage, stronger primary care, and service digitalization remains essential to extend population health and address disparities in healthcare access across ASEAN nations. In contrast, GDP per capita and urban population have statistically insignificant coefficients (p = 0.73 and p = 0.58), implying that economic growth and urbanization alone are insufficient to improve well-being unless accompanied by inclusive and equitable social policies. Economic expansion without redistribution and health-centered planning does not automatically translate into longer or healthier lives.

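The time fixed effects, measured relative to the base year 2000, rise in almost every year up to 2019 (from 0.21 in 2001 to 5.03 in 2019), reflecting improvements in public health systems and quality of life throughout the region. The trend is not strictly monotonic, however: there are small dips in 2008 and in 2015 to 2016, and a marked fall in 2020 and 2021 (to 3.25), years that coincide with the COVID-19 pandemic, before a recovery to 4.66 in 2023. The country fixed effects, measured relative to Brunei Darussalam as the omitted reference country, reveal persistent disparities: Singapore lies 7.10 years above the reference, Malaysia and Thailand are not statistically distinguishable from it, while Lao PDR, Myanmar, and Timor-Leste remain roughly 10 years below.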
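Because the dummy coefficients are expressed relative to an omitted reference country, comparing each country with the regional average requires re-centering them. The sketch below copies the country coefficients from the LSDV output above and assumes that Brunei Darussalam is the reference level (its effect is set to 0 because it has no dummy in the table). The object names are hypothetical.

```r
# Country dummy coefficients copied from the LSDV summary output
country_fe <- c(
  "Brunei Darussalam" =   0.00000,  # assumed reference level (no dummy shown)
  "Cambodia"          =  -7.21957,
  "Indonesia"         =  -6.47294,
  "Lao PDR"           = -10.63232,
  "Malaysia"          =  -0.82468,
  "Myanmar"           = -10.57869,
  "Philippines"       =  -5.19223,
  "Singapore"         =   7.10220,
  "Thailand"          =  -0.04129,
  "Timor-Leste"       =  -9.88581
)

# Express each effect as a deviation from the unweighted ten-country mean
centered <- country_fe - mean(country_fe)
round(sort(centered, decreasing = TRUE), 2)
```

On this centered scale Singapore sits about 11.5 years above the ten-country average and Lao PDR about 6.3 years below it, holding the four predictors and the year effects fixed.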

Overall, the LSDV model fits the data closely, with a residual standard error of 1.131 years and high explanatory power. This approach controls explicitly for both country-specific and temporal heterogeneity and reports the estimated effects directly, providing clearer interpretability than the within estimator, which yields the same slope coefficients but does not display the fixed effects.

These findings offer several key policy implications for ASEAN governments and regional decision-makers:

  1. Increasing public health expenditure should be regarded as a productive investment, not merely a fiscal burden. Countries that allocate a higher share of their budgets to healthcare are predicted to achieve longer life expectancies. Therefore, improving the efficiency and sustainability of health financing through universal health coverage, digital innovation, and better resource allocation should be prioritized.

  2. The results indicate that economic growth and urbanization alone cannot ensure improved well-being. Policymakers must ensure that growth is inclusive and accompanied by equitable access to basic services, clean environments, and health infrastructure in both urban and rural areas.

  3. The significant role of CO₂ damage highlights the need to balance economic development with environmental sustainability. ASEAN countries should accelerate the transition toward low-carbon economies through green recovery programs, renewable energy investment, circular economy initiatives, and incentives for environmentally responsible industries.

  4. The substantial cross-country disparities underscore the importance of regional cooperation. Strengthening ASEAN’s collective health resilience through cross-border data systems, shared medical technology, and coordinated environmental policies is essential to narrow health and welfare gaps among member states.

Overall, this study emphasizes that the future of well-being in ASEAN is not determined by the pace of economic growth but by how effectively that growth is transformed into equitable health improvements. A health-centered development approach should serve as the foundation for long-term regional progress, ensuring that economic gains translate into a healthier, more resilient, and longer-living ASEAN community.