In this session I’m going to replicate some of the results from the paper “Determinants of Long-Term Growth: A Bayesian Averaging of Classical Estimates (BACE) Approach” by Xavier Sala-i-Martin, Gernot Doppelhofer and Ronald I. Miller, The American Economic Review, Vol. 94, No. 4 (Sep. 2004), pp. 813-835.
The objective of the paper is to develop a method for model specification, called “Bayesian Averaging of Classical Estimates” (BACE). In this method the researcher takes a handful of candidate variables and tries to narrow the support to the meaningful ones. Very often different researchers use different specifications to explain the same phenomenon, so one has to ask which specification is the right one. This method tries to address that issue.
The BACE model, as its name suggests, uses a Bayesian approach in order to choose the right model; i.e., it uses the notion that we have prior assumptions about the model’s important explanatory variables. Given those priors we look at the data and update our knowledge. Specifically, \(g(\beta|y) = f(y|\beta)g(\beta)/f(y)\), where \(g(\beta)\) is the prior density on \(\beta\), \(f(y|\beta)\) is the likelihood of the data given \(\beta\), \(f(y)\) is the marginal density of \(y\) and \(g(\beta|y)\) is the posterior. Intuitively, this means that the researcher has prior beliefs, but she also updates those beliefs in light of the data.
The idea that the authors introduce is that the researcher should randomize over specifications, and then see which variables remain in the model given the data and the priors. It is clear, though, that variables with strong priors will be more likely to remain in each model specification. Once we have many model specifications we estimate the coefficients for each specification and average them:
\(E(\beta|y) = \sum_{j=1}^{2^K} P(M_j|y)\hat{\beta}_j\)
where \(K\) is the total number of candidate variables, so the sum runs over all \(2^K\) possible specifications \(M_j\), and \(\hat{\beta}_j\) is the classical (OLS) estimate in the \(j\)th specification. The only quantity the researcher has to choose is \(\bar{k}\), the prior expected number of variables included in a specification.
To wrap it up: given our priors and data we can calculate the probability that each variable is included in the model. All the researcher has to do is choose \(\bar{k}\), the expected number of variables per sampled specification. The bigger \(\bar{k}\) is, the larger the sampled models tend to be. Once we have our multiple models, we average the coefficients across them.
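To be concrete, the paper approximates the posterior model weights with a Schwarz (BIC)-style formula: \(P(M_j|y) = \frac{P(M_j)\, T^{-k_j/2}\, SSE_j^{-T/2}}{\sum_{i=1}^{2^K} P(M_i)\, T^{-k_i/2}\, SSE_i^{-T/2}}\), where \(T\) is the sample size, \(k_j\) is the number of regressors in model \(j\) and \(SSE_j\) is its OLS sum of squared errors; the prior model probability implied by an expected size \(\bar{k}\) is \(P(M_j) = (\bar{k}/K)^{k_j}(1-\bar{k}/K)^{K-k_j}\).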
The paper uses the BACE method to determine which variables are important in explaining the GDP growth of 88 countries. The authors start with 67 candidate variables, from which they are going to choose the relevant ones using BACE.
In what follows I am going to reproduce some of their paper.
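Throughout I use the following packages (a setup chunk inferred from the function calls below; all of them are on CRAN):
library(readxl)     # read_xls, to load the AER data file
library(dplyr)      # filter, left_join and the %>% pipe
library(magrittr)   # the %<>% compound-assignment pipe
library(tibble)     # rownames_to_column
library(BMS)        # bms, the Bayesian model sampling routine
library(knitr)      # kable
library(kableExtra) # kable_styling, scroll_box
library(reshape2)   # melt, for the heat maps
library(ggplot2)    # geom_tile heat maps
library(hdm)        # rlassoEffect, the double-selection lasso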
First, we load the data (the file can be found on the AER website):
data <- readxl::read_xls("C:/Users/dorgo/Documents/R/ml4econ/BACE_data.xls")
Next, we will subset the data:
{
# The three vectors below just split the 88-country sample into
# manageable chunks; the names are loose labels rather than strict
# regions (e.g. Canada and several Latin American countries sit in
# the "Africa" vector). Spellings follow the AER data file.
Africa <- c("Algeria","Benin","Botswana","Burundi","Cameroon",
            "Cent'l Afr. Rep.","Congo","Egypt",
            "Ethiopia","Gabon","Gambia","Ghana","Kenya","Lesotho",
            "Liberia","Madagascar","Malawi",
            "Mauritania","Morocco","Niger","Nigeria","Rwanda","Senegal",
            "South africa","Tanzania","Togo","Tunisia",
            "Uganda","Zaire","Zambia","Zimbabwe","Canada",
            "Costa Rica","Dominican Rep.","El Salvador","Guatemala",
            "Haiti","Honduras","Jamaica")
Europe_etc <- c("Netherlands","Norway","Portugal","Spain","Sweden",
                "Turkey","United Kingdom","Australia","Fiji",
                "Papua New Guinea")
Asia_etc <- c("Mexico","Panama","Trinidad & Tobago","United States",
              "Argentina","Bolivia","Brazil","Chile","Colombia",
              "Ecuador","Paraguay","Peru","Uruguay","Venezuela","Hong Kong",
              "India","Indonesia","Israel","Japan","Jordan","Korea","Malaysia",
              "Nepal","Pakistan","Philippines","Singapore","Sri Lanka","Syria","Taiwan",
              "Thailand","Austria","Belgium","Denmark","Finland","France",
              "Germany, West","Greece","Ireland","Italy")
included <- c(Africa, Europe_etc, Asia_etc)

# Keep only the 88 countries in the paper's sample, and coerce the
# numeric columns (GR6096 plus the 67 candidate regressors).
data1 <- data %>%
  filter(COUNTRY %in% included)
data1[4:71] %<>% sapply(FUN = as.numeric)
}
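A quick sanity check on the subset (assuming every country name above matches the spelling in the data file; the paper's sample has 88 countries, and columns 4 to 71 should now all be numeric):
stopifnot(nrow(data1) == 88)                     # the paper's 88-country sample
stopifnot(all(sapply(data1[4:71], is.numeric)))  # growth + 67 regressors coerced above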
Now we will write some functions to help us replicate the paper. It is important to note that the results will not be exactly the same, due to limits on processing time (the authors used 87 million iterations; I will use only 1 million).
# Run the BMS sampler for a given prior model size and stash the
# result as an object named bms<k_size> in the global environment.
bayesian <- function(iterations, k_size){
  a <- bms(X.data = GR6096 ~ ., mcmc = "bd",        # birth/death MCMC sampler
           iter = iterations,
           g = "UIP", mprior = "fixed",             # unit-information g-prior,
           mprior.size = k_size, user.int = FALSE,  # fixed expected model size kbar
           nmodel = 10, data = data1[, 4:71])
  b <- paste("bms", k_size, sep = "")
  assign(b, a, envir = .GlobalEnv)
}
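Because the birth/death sampler is stochastic, two runs will not match exactly; fixing a seed first makes them reproducible (an optional step I am adding, not in the original workflow):
set.seed(123)  # any fixed seed; makes the MCMC draws below reproducible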
# The prior model sizes kbar examined below; note that seven runs of
# one million draws each take a while.
ks <- c(5, 7, 9, 11, 16, 22, 28)
sapply(ks, bayesian, iterations = 1000000)
# Fetch the stored bms<num> object and pull one column of its coefficient
# summary. coef() sorts rows by PIP by default, which would misalign rows
# across different kbar runs; order.by.pip = FALSE keeps the variables in
# their original order so the runs stay comparable.
extract <- function(num, colu){
  name <- paste("bms", num, sep = "")
  a <- get(name)
  b <- coef(a, order.by.pip = FALSE)[, colu]
  return(b)
}
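For example, the call below pulls the PIP column from the \(\bar{k} = 7\) run:
head(extract(7, 1))   # first few posterior inclusion probabilities from bms7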
Now, I will replicate the results from the paper.
table2 <- coef(bms7) %>% as.data.frame()                      # full summary, kbar = 7
table3 <- sapply(ks, extract, colu = 1) %>% as.data.frame()   # column 1: PIP
colnames(table3) <- paste("kbar", ks, sep = "")
table4 <- sapply(ks, extract, colu = 2) %>% as.data.frame()   # column 2: posterior mean
colnames(table4) <- paste("kbar", ks, sep = "")
table5 <- sapply(ks, extract, colu = 4) %>% as.data.frame()   # column 4: Cond.Pos.Sign
colnames(table5) <- paste("kbar", ks, sep = "")
I will only display two of the results (though the rest were computed in the chunk above).
kable(table2, "html") %>%
kable_styling(position = "center") %>%
scroll_box(width = "500px", height = "200px")
Variable | PIP | Post Mean | Post SD | Cond.Pos.Sign | Idx |
---|---|---|---|---|---|
EAST | 0.838140 | 0.0193115 | 0.0101693 | 0.9999726 | 14 |
P60 | 0.756978 | 0.0189050 | 0.0126921 | 1.0000000 | 44 |
IPRICE1 | 0.703474 | -0.0000580 | 0.0000429 | 0.0000000 | 29 |
TROPICAR | 0.551773 | -0.0080563 | 0.0079246 | 0.0004096 | 62 |
GDPCH60L | 0.494529 | -0.0039265 | 0.0044591 | 0.0005035 | 20 |
DENS65C | 0.342597 | 0.0000028 | 0.0000043 | 0.9994688 | 11 |
MALFAL66 | 0.271176 | -0.0044091 | 0.0078719 | 0.0002581 | 36 |
CONFUC | 0.206310 | 0.0115408 | 0.0247881 | 1.0000000 | 9 |
LIFE060 | 0.191369 | 0.0001490 | 0.0003441 | 0.9917228 | 34 |
LAAM | 0.169619 | -0.0021592 | 0.0053244 | 0.0149570 | 30 |
SAFRICA | 0.148902 | -0.0022143 | 0.0058705 | 0.0040362 | 55 |
SPAIN | 0.130906 | -0.0013871 | 0.0040037 | 0.0097704 | 59 |
MUSLIM00 | 0.098725 | 0.0011993 | 0.0041273 | 0.9904280 | 38 |
BUDDHA | 0.098542 | 0.0021020 | 0.0071650 | 1.0000000 | 5 |
AVELF | 0.097996 | -0.0011479 | 0.0039739 | 0.0006939 | 3 |
GVR61 | 0.092900 | -0.0040794 | 0.0145970 | 0.0038213 | 25 |
MINING | 0.086407 | 0.0032135 | 0.0118888 | 0.9999768 | 37 |
OPENDEC1 | 0.085635 | 0.0007870 | 0.0029552 | 0.9935540 | 41 |
RERD | 0.078479 | -0.0000061 | 0.0000241 | 0.0000000 | 53 |
YRSOPEN | 0.070527 | 0.0008070 | 0.0034025 | 0.9993903 | 66 |
H60 | 0.062957 | -0.0042903 | 0.0192760 | 0.0043998 | 26 |
POP1560 | 0.060308 | 0.0030890 | 0.0153386 | 0.9601545 | 48 |
GOVSH61 | 0.059808 | -0.0021945 | 0.0105906 | 0.0143794 | 24 |
OTHFRAC | 0.058741 | 0.0003834 | 0.0018024 | 0.9976167 | 43 |
DENS60 | 0.056617 | 0.0000006 | 0.0000031 | 0.9969797 | 10 |
TROPPOP | 0.046315 | -0.0004906 | 0.0026574 | 0.0066069 | 63 |
PRIGHTS | 0.046093 | -0.0000771 | 0.0004391 | 0.0362094 | 47 |
BRIT | 0.041742 | 0.0001965 | 0.0011794 | 0.9914714 | 4 |
GGCFD3 | 0.040958 | -0.0022662 | 0.0140473 | 0.0347673 | 22 |
HINDU00 | 0.035763 | 0.0005569 | 0.0037296 | 0.9705282 | 28 |
SCOUT | 0.035608 | -0.0001281 | 0.0008419 | 0.0026679 | 56 |
PRIEXP70 | 0.035540 | -0.0003277 | 0.0022927 | 0.0621272 | 51 |
GOVNOM1 | 0.035295 | -0.0012530 | 0.0081051 | 0.0069698 | 23 |
PROT00 | 0.034142 | -0.0003317 | 0.0024040 | 0.0627380 | 52 |
ABSLATIT | 0.031772 | 0.0000028 | 0.0000475 | 0.6243548 | 1 |
EUROPE | 0.030006 | 0.0000238 | 0.0016896 | 0.5544558 | 17 |
FERTLDC1 | 0.028848 | -0.0001240 | 0.0019735 | 0.3484124 | 18 |
REVCOUP | 0.028195 | -0.0001987 | 0.0015540 | 0.0069870 | 54 |
SIZE60 | 0.027492 | -0.0000305 | 0.0002830 | 0.0988651 | 57 |
CATH00 | 0.026810 | -0.0001690 | 0.0016731 | 0.1383066 | 6 |
CIV72 | 0.026782 | -0.0001796 | 0.0015182 | 0.0392428 | 7 |
COLONY | 0.025539 | -0.0001112 | 0.0010417 | 0.0929950 | 8 |
POP6560 | 0.022854 | -0.0000589 | 0.0183851 | 0.4751903 | 50 |
AIRDIST | 0.022404 | 0.0000000 | 0.0000001 | 0.3221300 | 2 |
LHCPC | 0.020237 | 0.0000046 | 0.0000666 | 0.7334091 | 33 |
DPOP6090 | 0.019800 | 0.0014847 | 0.0437023 | 0.6449495 | 13 |
POP60 | 0.019630 | 0.0000000 | 0.0000000 | 0.9422313 | 49 |
PI6090 | 0.019296 | -0.0000015 | 0.0000171 | 0.0476783 | 45 |
SQPI6090 | 0.019247 | 0.0000000 | 0.0000002 | 0.0985608 | 46 |
TOT1DEC1 | 0.018641 | 0.0005538 | 0.0074665 | 0.8701250 | 60 |
NEWSTATE | 0.018607 | 0.0000178 | 0.0002950 | 0.7736336 | 39 |
GDE1 | 0.018409 | 0.0008055 | 0.0112503 | 0.8679994 | 19 |
WARTORN | 0.018230 | -0.0000120 | 0.0004073 | 0.3302798 | 65 |
GEEREC1 | 0.018129 | 0.0017092 | 0.0269759 | 0.8450549 | 21 |
TOTIND | 0.017559 | -0.0001013 | 0.0015030 | 0.1546216 | 61 |
SOCIALIST | 0.016747 | 0.0000637 | 0.0008142 | 0.9855496 | 58 |
OIL | 0.015823 | 0.0000551 | 0.0009593 | 0.7935284 | 40 |
WARTIME | 0.015761 | -0.0000231 | 0.0011902 | 0.3749128 | 64 |
ENGFRAC | 0.015729 | -0.0000259 | 0.0009258 | 0.3936042 | 16 |
ORTH00 | 0.015568 | 0.0001036 | 0.0018590 | 0.8799460 | 42 |
LT100CR | 0.015507 | -0.0000270 | 0.0007473 | 0.3774424 | 35 |
ZTROPICS | 0.015416 | -0.0000290 | 0.0008354 | 0.3641671 | 67 |
HERF00 | 0.015220 | -0.0000642 | 0.0010526 | 0.1444810 | 27 |
ECORG | 0.015093 | -0.0000009 | 0.0001336 | 0.4694892 | 15 |
LANDLOCK | 0.014914 | -0.0000273 | 0.0005522 | 0.2572750 | 32 |
LANDAREA | 0.014498 | 0.0000000 | 0.0000000 | 0.4819285 | 31 |
DENS65I | 0.014181 | 0.0000000 | 0.0000019 | 0.3790283 | 12 |
The table above follows the second table from the paper, in which \(\bar{k}=7\). In this table we can see the posterior inclusion probability (PIP) in the first column, then the posterior mean conditional on inclusion, its standard deviation, and the certainty of the sign (the posterior probability that the coefficient is positive, conditional on inclusion). The table shows that 18 variables stand out as the most important given \(\bar{k} = 7\).
kable(table5, "html") %>%
kable_styling(position = "center") %>%
scroll_box(width = "500px", height = "200px")
Variable | kbar5 | kbar7 | kbar9 | kbar11 | kbar16 | kbar22 | kbar28 |
---|---|---|---|---|---|---|---|
EAST | 1.0000000 | 0.9999726 | 1.0000000 | 0.0000000 | 0.0000000 | 0.0000000 | 0.0000000 |
P60 | 1.0000000 | 1.0000000 | 0.9998830 | 1.0000000 | 1.0000000 | 1.0000000 | 1.0000000 |
IPRICE1 | 0.0000000 | 0.0000000 | 0.0000000 | 0.9992479 | 0.0000939 | 0.0000356 | 0.0000000 |
TROPICAR | 0.0002919 | 0.0004096 | 0.0004425 | 0.0000921 | 0.9982069 | 1.0000000 | 1.0000000 |
MALFAL66 | 0.0000359 | 0.0005035 | 0.0005803 | 0.0011109 | 0.0020900 | 0.9998971 | 0.9997700 |
GDPCH60L | 0.0014540 | 0.9994688 | 0.9998854 | 0.9995790 | 1.0000000 | 0.9921956 | 0.0006346 |
DENS65C | 1.0000000 | 0.0002581 | 1.0000000 | 1.0000000 | 0.9992396 | 0.0031843 | 0.0042482 |
LIFE060 | 0.9901516 | 1.0000000 | 0.0009440 | 0.9944940 | 0.9998980 | 0.0031432 | 0.9988453 |
SPAIN | 0.0029136 | 0.9917228 | 0.0194972 | 0.0120155 | 0.0029026 | 0.0124670 | 0.0178356 |
CONFUC | 1.0000000 | 0.0149570 | 0.9938396 | 0.0034777 | 0.0124001 | 0.0004788 | 0.9864349 |
LAAM | 0.0135675 | 0.0040362 | 0.0038540 | 0.0012028 | 0.9967089 | 0.9991885 | 0.9975094 |
SAFRICA | 0.0056695 | 0.0097704 | 0.0072230 | 0.9997827 | 0.9997824 | 0.9984524 | 0.9997626 |
GVR61 | 0.0010448 | 0.9904280 | 1.0000000 | 0.9998965 | 0.0028309 | 0.9966282 | 0.0055281 |
AVELF | 0.0002597 | 1.0000000 | 1.0000000 | 0.9953774 | 0.9975634 | 0.9994641 | 0.9935003 |
BUDDHA | 1.0000000 | 0.0006939 | 0.9878105 | 0.0070153 | 0.9965085 | 0.9966865 | 0.9934124 |
OPENDEC1 | 0.9960361 | 0.0038213 | 0.0013374 | 0.0004702 | 0.9990012 | 0.9954507 | 0.9953407 |
YRSOPEN | 1.0000000 | 0.9999768 | 0.0027741 | 0.0050941 | 0.0062613 | 0.0002133 | 0.0066676 |
MUSLIM00 | 0.9764486 | 0.9935540 | 0.9974608 | 0.9973672 | 0.0088176 | 0.0180082 | 0.0008388 |
RERD | 0.0000000 | 0.0000000 | 0.9866140 | 0.0000000 | 0.0000060 | 0.0323440 | 0.0388872 |
GOVSH61 | 0.0258568 | 0.9993903 | 0.9976769 | 0.9988442 | 0.0021382 | 0.0063637 | 0.0292408 |
H60 | 0.0014897 | 0.0043998 | 0.0000000 | 0.0253577 | 0.0303822 | 0.0126392 | 0.0020297 |
MINING | 1.0000000 | 0.9601545 | 0.9969188 | 0.0131370 | 0.0195550 | 0.9616922 | 0.9638914 |
POP1560 | 0.9432698 | 0.0143794 | 0.0236151 | 0.9783628 | 0.9596923 | 0.0026265 | 0.7919412 |
TROPPOP | 0.0088262 | 0.9976167 | 0.0136628 | 0.9957744 | 0.0140196 | 0.0344739 | 0.0392752 |
BRIT | 0.9973896 | 0.9969797 | 0.0052691 | 0.0338398 | 0.0270944 | 0.0273567 | 0.0020213 |
OTHFRAC | 0.9956865 | 0.0066069 | 0.9449499 | 0.0068680 | 0.9849362 | 0.0441570 | 0.9795539 |
PRIEXP70 | 0.0427285 | 0.0362094 | 0.0300684 | 0.9435402 | 0.9640732 | 0.0532203 | 0.0347143 |
DENS60 | 0.9945519 | 0.9914714 | 0.0103547 | 0.0115381 | 0.0181061 | 0.9470223 | 0.0425533 |
PRIGHTS | 0.0297478 | 0.0347673 | 0.0071478 | 0.9730671 | 0.0632707 | 0.0575358 | 0.9187230 |
ABSLATIT | 0.7517146 | 0.9705282 | 0.9779650 | 0.0175436 | 0.0200521 | 0.0028198 | 0.0362566 |
PROT00 | 0.0459113 | 0.0026679 | 0.9790616 | 0.9738543 | 0.0038267 | 0.7156889 | 0.0607893 |
SCOUT | 0.0021540 | 0.0621272 | 0.0018104 | 0.0026235 | 0.6966108 | 0.9663910 | 0.0719798 |
HINDU00 | 0.9767083 | 0.0069698 | 0.0582253 | 0.0041866 | 0.0035652 | 0.9583467 | 0.0486292 |
GOVNOM1 | 0.0164067 | 0.0627380 | 0.0713037 | 0.0858475 | 0.8631939 | 0.0344140 | 0.0557547 |
SIZE60 | 0.0796244 | 0.6243548 | 0.2525893 | 0.5901216 | 0.1246206 | 0.8024032 | 0.9188941 |
EUROPE | 0.5477184 | 0.5544558 | 0.0050172 | 0.0789799 | 0.9476055 | 0.0304840 | 0.8966488 |
GGCFD3 | 0.0825900 | 0.3484124 | 0.5499705 | 0.1847163 | 0.1251417 | 0.0664721 | 0.1965394 |
REVCOUP | 0.0034276 | 0.0069870 | 0.1519555 | 0.0795802 | 0.0660271 | 0.0604580 | 0.0527306 |
CIV72 | 0.0344701 | 0.0988651 | 0.5827687 | 0.0723053 | 0.0728469 | 0.8664876 | 0.0611841 |
COLONY | 0.0956391 | 0.1383066 | 0.1213194 | 0.1391926 | 0.9516164 | 0.9489659 | 0.7522167 |
POP6560 | 0.4377194 | 0.0392428 | 0.0693072 | 0.2476116 | 0.4110972 | 0.1752038 | 0.9760258 |
FERTLDC1 | 0.4170570 | 0.0929950 | 0.2894308 | 0.1605457 | 0.1431437 | 0.3507316 | 0.3227974 |
CATH00 | 0.1834077 | 0.4751903 | 0.8074614 | 0.8746640 | 0.2293214 | 0.2967356 | 0.9414636 |
PI6090 | 0.0405174 | 0.3221300 | 0.5344786 | 0.4809732 | 0.9298539 | 0.3264203 | 0.9316514 |
SQPI6090 | 0.0547602 | 0.7334091 | 0.1346959 | 0.9190784 | 0.7585217 | 0.1573904 | 0.1768022 |
DPOP6090 | 0.6793459 | 0.6449495 | 0.9185238 | 0.6099967 | 0.2374492 | 0.9817841 | 0.9226316 |
GDE1 | 0.9446998 | 0.9422313 | 0.8788833 | 0.8240765 | 0.0997153 | 0.9055678 | 0.2589900 |
WARTORN | 0.2873589 | 0.0476783 | 0.0541076 | 0.8146693 | 0.9190261 | 0.9329807 | 0.1434518 |
NEWSTATE | 0.8217165 | 0.0985608 | 0.8301538 | 0.9948881 | 0.4936576 | 0.0704008 | 0.3399390 |
AIRDIST | 0.2894000 | 0.8701250 | 0.9885421 | 0.1805732 | 0.7809978 | 0.7191525 | 0.0705414 |
SOCIALIST | 0.9987054 | 0.7736336 | 0.6312985 | 0.1144136 | 0.6902596 | 0.1999461 | 0.3786010 |
WARTIME | 0.2028565 | 0.8679994 | 0.7861844 | 0.5507275 | 0.1315756 | 0.5796536 | 0.5382243 |
POP60 | 0.9462052 | 0.3302798 | 0.3868591 | 0.7599710 | 0.6860807 | 0.3158392 | 0.6096466 |
LHCPC | 0.5310281 | 0.8450549 | 0.7983607 | 0.1226738 | 0.9842786 | 0.2503271 | 0.6091377 |
OIL | 0.7358931 | 0.1546216 | 0.3102515 | 0.7799627 | 0.6482800 | 0.5606528 | 0.6189761 |
ENGFRAC | 0.2787334 | 0.9855496 | 0.1451161 | 0.9129400 | 0.5406897 | 0.4975778 | 0.4012236 |
TOTIND | 0.1683113 | 0.7935284 | 0.1890896 | 0.1559409 | 0.2238068 | 0.5509662 | 0.6828875 |
LT100CR | 0.3741114 | 0.3749128 | 0.2888909 | 0.4978541 | 0.3350369 | 0.5440671 | 0.3420667 |
ORTH00 | 0.8885191 | 0.3936042 | 0.4633074 | 0.3466382 | 0.3672915 | 0.4329094 | 0.4591506 |
TOT1DEC1 | 0.8789141 | 0.8799460 | 0.2832606 | 0.4079328 | 0.2616272 | 0.3932785 | 0.4538845 |
GEEREC1 | 0.8475396 | 0.3774424 | 0.4545767 | 0.4846810 | 0.3064087 | 0.6499269 | 0.5344941 |
ZTROPICS | 0.2395763 | 0.3641671 | 0.7779417 | 0.2301488 | 0.4457482 | 0.5041232 | 0.1625185 |
ECORG | 0.5883517 | 0.1444810 | 0.3132020 | 0.3296926 | 0.2866268 | 0.2512713 | 0.5790139 |
LANDLOCK | 0.4828340 | 0.4694892 | 0.1511065 | 0.3264070 | 0.3046215 | 0.1844666 | 0.3224001 |
DENS65I | 0.4065801 | 0.2572750 | 0.8207921 | 0.4095387 | 0.5985423 | 0.6416562 | 0.6047849 |
HERF00 | 0.1408257 | 0.4819285 | 0.4542343 | 0.2606749 | 0.2310337 | 0.3814386 | 0.5458431 |
LANDAREA | 0.2766586 | 0.3790283 | 0.3708982 | 0.7974722 | 0.6899957 | 0.4457971 | 0.2075870 |
The second table follows the fifth table in the paper. It reports the sign certainty per \(\bar{k}\) (with \(\bar{k}\) taking the values 5, 7, 9, 11, 16, 22 and 28). Each entry is the posterior probability, conditional on inclusion, that the coefficient is positive (BMS's Cond.Pos.Sign column); values close to 1 or to 0 mean the sign is nearly certain.
I am going to argue that we can use this method, combined with the lasso, to uncover some patterns of causality. Recall that the lasso throws away variables, in particular variables that are highly correlated with other variables in our set. In our case we have a dummy variable for East Asian countries (EAST) and a variable for the percentage of Confucians in the country (CONFUC). These variables are, obviously, highly correlated, and we expect the lasso to throw the Confucian variable away for a \(\lambda\) high enough. If so, we would be able to say that being an East Asian country -> more growth. Nonetheless, we would not know what mechanism it works through. Here BACE comes in handy.
The BACE method samples specifications whose expected size is \(\bar{k}\). So the higher the \(\bar{k}\), the higher the probability that a sampled model contains both the East Asian dummy and the Confucian variable, and the importance of the dummy will decrease as CONFUC absorbs part of its variance. If we see that happen, we may infer that the mechanism behind the East Asia variable goes through the Confucian variable.
To summarise - we can use the lasso for causation, and BACE for the mechanism, given the lasso's selection.
I will now show that the importance of the East Asian dummy decreases as \(\bar{k}\) goes up. I will also show the influence of being in East Asia on GDP growth. I leave it to someone else to show the relationship between the Confucian population percentage and GDP growth.
First, I will show the change in the importance of the variables as \(\bar{k}\) increases.
# Like bayesian(), but instead of stashing the bms object this returns a
# small data frame per run: variable name, its PIP rank (coef() sorts rows
# by PIP, so the row number is an importance rank) and the PIP itself.
bayesian2 <- function(iterations, k_size){
  a <- bms(X.data = GR6096 ~ ., mcmc = "bd",
           iter = iterations,
           g = "UIP", mprior = "fixed",
           mprior.size = k_size, user.int = FALSE,
           nmodel = 10, data = data1[, 4:71])
  b <- coef(a) %>% as.data.frame() %>% rownames_to_column()
  colnames(b)[1] <- "name"
  b %<>% rownames_to_column()   # adds the PIP rank as a new first column
  b <- b[, c(2, 1, 3)]          # reorder to: name, rank, PIP
  return(b)
}
# Run BACE for every prior model size kbar = 1, ..., 67 and collect, per
# variable: its PIP rank (df1) and its PIP (df2), one column per kbar.
n <- 100000
df <- bayesian2(iterations = n, k_size = 1)
df1 <- df[, 1:2]      # name + rank
df2 <- df[, c(1, 3)]  # name + PIP
colnames(df1)[2] <- "1"
colnames(df2)[2] <- "1"
for (i in 2:67) {
  tmp <- bayesian2(iterations = n, k_size = i)
  tmp1 <- tmp[, 1:2]
  tmp2 <- tmp[, c(1, 3)]
  colnames(tmp1)[2] <- i
  colnames(tmp2)[2] <- i
  df1 %<>% left_join(tmp1, by = "name")   # joining by name keeps rows aligned
  df2 %<>% left_join(tmp2, by = "name")
}
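Before plotting, we can eyeball the EAST row directly (a quick check against the df2 built above):
df2[df2$name == "EAST", ]   # EAST's posterior inclusion probability at each kbar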
The heat maps below show that the EAST variable is the most important one, but its importance goes down as \(\bar{k}\) goes up, because larger sampled models also include the Confucian variable, which takes over some of the variance of EAST. (This is consistent with Table 3 in the paper, which is not shown here.) As \(\bar{k}\) increases, the EAST row fades: in the first graph, which plots each variable's PIP rank per \(\bar{k}\), it turns whiter (a worse rank), and in the second graph, which plots the inclusion probability itself, it turns darker (a lower PIP).
long_df1 <- melt(df1, id = "name")                          # wide kbar columns -> long format
long_df1$name <- factor(long_df1$name, levels = df1$name)   # order the y-axis by the kbar = 1 ranking
long_df1$value <- as.numeric(as.character(long_df1$value))  # the ranks came through as text
p <- long_df1 %>%
ggplot(aes(y = name, x = variable)) +
geom_tile(aes(fill = value),colour = "white") +
scale_fill_gradient(low = "firebrick3",high = "white")
p + theme_grey(base_size = 9) + labs(x = "",
y = "") + scale_x_discrete(expand = c(0, 0)) +
scale_y_discrete(expand = c(0, 0))
long_df2 <- melt(df2, id = "name")
long_df2$name <- factor(long_df2$name, levels = df2$name)
v <- long_df2 %>%
ggplot(aes(y = name, x = variable)) +
geom_tile(aes(fill = value),colour = "white") +
scale_fill_gradient(low = "dodgerblue4",high = "white")
v + theme_grey(base_size = 9) + labs(x = "",
y = "") + scale_x_discrete(expand = c(0, 0)) +
scale_y_discrete(expand = c(0, 0))
Now, to finish this section, we will use the double-selection lasso to estimate the influence of the East Asia dummy on GDP growth. We would like to see whether it throws the Confucian variable away. In any case we can still argue that it is part of the East Asia mechanism, as explained above.
YT <- data1$GR6096                              # outcome: GDP growth
Y0 <- data1$EAST                                # treatment: the East Asia dummy
X <- data1[, c(5:17, 19:71)] %>% as.matrix()    # controls: all candidates except GR6096 and EAST
double_Lasso <- rlassoEffect(x = X, y = YT, d = Y0,
                             method = "double selection")
double_Lasso$selection.index
## ABSLATIT AIRDIST AVELF BRIT BUDDHA CATH00 CIV72
## FALSE FALSE FALSE FALSE TRUE FALSE FALSE
## COLONY CONFUC DENS60 DENS65C DENS65I DPOP6090 ECORG
## FALSE TRUE FALSE TRUE FALSE FALSE FALSE
## ENGFRAC EUROPE FERTLDC1 GDE1 GDPCH60L GEEREC1 GGCFD3
## FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## GOVNOM1 GOVSH61 GVR61 H60 HERF00 HINDU00 IPRICE1
## FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## LAAM LANDAREA LANDLOCK LHCPC LIFE060 LT100CR MALFAL66
## FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## MINING MUSLIM00 NEWSTATE OIL OPENDEC1 ORTH00 OTHFRAC
## FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## P60 PI6090 SQPI6090 PRIGHTS POP1560 POP60 POP6560
## TRUE FALSE FALSE FALSE FALSE FALSE FALSE
## PRIEXP70 PROT00 RERD REVCOUP SAFRICA SCOUT SIZE60
## FALSE FALSE TRUE FALSE FALSE FALSE FALSE
## SOCIALIST SPAIN TOT1DEC1 TOTIND TROPICAR TROPPOP WARTIME
## FALSE FALSE FALSE FALSE FALSE TRUE FALSE
## WARTORN YRSOPEN ZTROPICS
## FALSE TRUE FALSE
double_Lasso$coefficients.reg
## (Intercept) xd1 xBUDDHA xCONFUC xDENS65C
## 1.430168e-02 8.660365e-03 1.383491e-02 5.270209e-02 2.504732e-06
## xP60 xRERD xTROPPOP xYRSOPEN
## 1.304975e-02 -6.886884e-05 -9.932221e-03 9.374284e-03
As we can see, the double-selection lasso kept the Confucian variable. It is possible, though, that for a higher \(\lambda\) it would have been thrown away.
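To see the estimated effect itself with its standard error, one can summarise the fitted object (summary() should dispatch to hdm's method for these objects):
summary(double_Lasso)   # point estimate and s.e. of the EAST treatment effect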
Someone, someday, may try to pin down the exact pattern between CONFUC and growth. I leave it open for further work.