The US International Goods and Services Trade dataset is a combination of two datasets collected from the United States Census Bureau and the Federal Reserve. They can be accessed from the following URLs:
Capital Export: http://www.census.gov/foreign-trade/statistics/historical/SAEXP.xls
Capital Import: http://www.census.gov/foreign-trade/statistics/historical/SAIMP.xls
Capacity Utilization: http://www.federalreserve.gov/datadownload/Build.aspx?rel=G17
Euro Dollar Rate: http://www.federalreserve.gov/datadownload/Build.aspx?rel=H15
###Read in the dataset
#data read in
rm(list=ls())
Trade.data<-read.csv("~/Desktop/Applied_Regression/Logistic/International_Export_Import.csv")
head(Trade.data,n=14L);
## Time Export Import TradeBalance CapacityUtilization EuroDollarRate
## 1 1992-03 16054 11365 1 80.2999 4.43
## 2 1992-04 14347 10863 1 80.6967 4.19
## 3 1992-05 13956 10407 1 80.7975 3.99
## 4 1992-06 15698 11533 1 80.5974 4.00
## 5 1992-07 13979 11485 1 81.1304 3.54
## 6 1992-08 13547 11306 1 80.5548 3.43
## 7 1992-09 14606 11680 1 80.5531 3.22
## 8 1992-10 15625 12177 1 80.9928 3.32
## 9 1992-11 14165 11581 1 81.1599 3.70
## 10 1992-12 15794 12419 1 81.0541 3.60
## 11 1993-01 13903 10521 1 81.2884 3.37
## 12 1993-02 13667 10870 1 81.4514 3.24
## 13 1993-03 16619 13334 1 81.2911 3.21
## 14 1993-04 15222 12367 1 81.4254 3.21
The US International Goods and Services Trade dataset contains 6 variables and 277 monthly observations, covering March 1992 to March 2015.
str(Trade.data)
## 'data.frame': 277 obs. of 6 variables:
## $ Time : Factor w/ 277 levels "1992-03","1992-04",..: 1 2 3 4 5 6 7 8 9 10 ...
## $ Export : num 16054 14347 13956 15698 13979 ...
## $ Import : num 11365 10863 10407 11533 11485 ...
## $ TradeBalance : int 1 1 1 1 1 1 1 1 1 1 ...
## $ CapacityUtilization: num 80.3 80.7 80.8 80.6 81.1 ...
## $ EuroDollarRate : num 4.43 4.19 3.99 4 3.54 3.43 3.22 3.32 3.7 3.6 ...
Time: month of the observation, in the format yyyy-mm
Export: value of total capital goods (the materials used to produce final goods) exported at Time(i), i = 1, 2, 3, …, 277, in millions
Import: value of total capital goods imported at Time(i), i = 1, 2, 3, …, 277, in millions
TradeBalance: 0 if the international trade account is in deficit (Import > Export); 1 if the account is in surplus (Export > Import)
CapacityUtilization: ratio of capacity actually used to installed productive capacity (Wikipedia), in percent
EuroDollarRate: interest rate on Eurodollar deposits, i.e. U.S.-dollar-denominated deposits in foreign banks or foreign branches of American banks (Investopedia), in percent
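The TradeBalance coding described above can be sketched in R as follows. This is a minimal illustration, assuming the indicator is derived by comparing Export and Import; the data frame `trade_head` is a hypothetical name that copies the first three rows of the output shown earlier.

```r
# First three rows copied from the head() output above (hypothetical object name)
trade_head <- data.frame(
  Time   = c("1992-03", "1992-04", "1992-05"),
  Export = c(16054, 14347, 13956),
  Import = c(11365, 10863, 10407)
)

# 1 = surplus (Export > Import), 0 = deficit (Import > Export)
trade_head$TradeBalance <- as.integer(trade_head$Export > trade_head$Import)
print(trade_head)
```

All three rows have Export above Import, so the derived indicator matches the TradeBalance value of 1 printed for those months.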
For this analysis, there must be two continuous independent variables and one categorical dependent variable. I am interested in how Capacity Utilization and the Euro Dollar Rate affect the US international Trade Balance.
In this case, I want to test whether the US international Trade Balance can be explained by US Capacity Utilization and the Euro Dollar Rate (alternative hypothesis).
The null hypothesis is that the US international Trade Balance cannot be explained by US Capacity Utilization and the Euro Dollar Rate. In other words, the Trade Balance is explained by other variables, or it is simply the result of random variation.
In this case, I try to find out whether changes in the Euro Dollar Rate and US Capacity Utilization push the US Trade Balance toward deficit or surplus.
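Written out, the hypotheses correspond to a logistic regression of the following form. The symbols $\beta_0$, $\beta_1$, $\beta_2$ and $p_i$ are notation introduced here for illustration; they do not appear in the original analysis.

```latex
\[
\log\frac{p_i}{1-p_i}
  = \beta_0 + \beta_1\,\mathrm{EuroDollarRate}_i + \beta_2\,\mathrm{CapacityUtilization}_i,
\qquad p_i = P(\mathrm{TradeBalance}_i = 1)
\]
\[
H_0:\ \beta_1 = \beta_2 = 0
\qquad\text{versus}\qquad
H_1:\ \beta_1 \neq 0 \ \text{or}\ \beta_2 \neq 0
\]
```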
#construct new dataset containing the data I used
attach(Trade.data)
Trade.subdata <- subset(Trade.data,select = c(TradeBalance,CapacityUtilization,EuroDollarRate))
cor(Trade.subdata)
## TradeBalance CapacityUtilization EuroDollarRate
## TradeBalance 1.0000000 0.1636181 0.3659614
## CapacityUtilization 0.1636181 1.0000000 0.7056917
## EuroDollarRate 0.3659614 0.7056917 1.0000000
Based on the correlations, I will first fit a regression with a single independent variable (Trade Balance on EuroDollarRate), and then add Capacity Utilization to check whether it contributes a significant additional explanation of the dependent variable.
attach(Trade.subdata)
## The following objects are masked from Trade.data:
##
## CapacityUtilization, EuroDollarRate, TradeBalance
TradeBalance<-factor(TradeBalance)
model.1IV<-glm(TradeBalance~EuroDollarRate,data = Trade.subdata,family = "binomial")
summary(model.1IV)
##
## Call:
## glm(formula = TradeBalance ~ EuroDollarRate, family = "binomial",
## data = Trade.subdata)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.9888 -0.9737 0.6593 0.9553 1.4344
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.74037 0.23088 -3.207 0.00134 **
## EuroDollarRate 0.37559 0.06413 5.857 4.72e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 371.34 on 276 degrees of freedom
## Residual deviance: 332.92 on 275 degrees of freedom
## AIC: 336.92
##
## Number of Fisher Scoring iterations: 4
library(aod)
wald.test(b = coef(model.1IV),Sigma = vcov(model.1IV),Terms =2)
## Wald test:
## ----------
##
## Chi-squared test:
## X2 = 34.3, df = 1, P(> X2) = 4.7e-09
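For a single coefficient, the Wald chi-squared statistic is the square of the z value in the coefficient table, so the test above can be checked by hand. A minimal sketch, with the z value copied from the summary output (the variable names are illustrative):

```r
# z value for EuroDollarRate, copied from summary(model.1IV)
z_euro <- 5.857

# Wald chi-squared statistic with 1 degree of freedom
wald_x2 <- z_euro^2

# Upper-tail p-value from the chi-squared distribution
wald_p <- pchisq(wald_x2, df = 1, lower.tail = FALSE)

cat("X2 =", round(wald_x2, 1), " p =", signif(wald_p, 2), "\n")
```

Up to rounding, this should agree with the X2 and P(> X2) values that wald.test reports.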
attach(Trade.subdata)
## The following object is masked _by_ .GlobalEnv:
##
## TradeBalance
##
## The following objects are masked from Trade.subdata (pos = 4):
##
## CapacityUtilization, EuroDollarRate, TradeBalance
##
## The following objects are masked from Trade.data:
##
## CapacityUtilization, EuroDollarRate, TradeBalance
model.2IV<-glm(TradeBalance~EuroDollarRate+CapacityUtilization,data = Trade.subdata,family = "binomial")
summary(model.2IV)
##
## Call:
## glm(formula = TradeBalance ~ EuroDollarRate + CapacityUtilization,
## family = "binomial", data = Trade.subdata)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.0897 -0.9395 0.6805 0.8985 1.6211
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 7.57677 3.74835 2.021 0.0432 *
## EuroDollarRate 0.51834 0.09228 5.617 1.94e-08 ***
## CapacityUtilization -0.11114 0.05000 -2.223 0.0262 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 371.34 on 276 degrees of freedom
## Residual deviance: 327.67 on 274 degrees of freedom
## AIC: 333.67
##
## Number of Fisher Scoring iterations: 4
wald.test(b = coef(model.2IV),Sigma = vcov(model.2IV),Terms =2:3)
## Wald test:
## ----------
##
## Chi-squared test:
## X2 = 37.9, df = 2, P(> X2) = 5.8e-09
The model with two continuous independent variables has the higher Wald chi-squared statistic (37.9 versus 34.3) and the lower AIC (333.67 versus 336.92). The coefficients of both IVs are also significantly different from zero. Hence, I will use the model with two independent variables as my final model.
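Because the one-variable model is nested in the two-variable model, a likelihood-ratio test on the difference in residual deviance is another way to compare them. The sketch below is not part of the original analysis; it only reuses the deviances and degrees of freedom printed in the two summaries above.

```r
# Residual deviances and degrees of freedom copied from the summaries above
dev_1iv <- 332.92; df_1iv <- 275
dev_2iv <- 327.67; df_2iv <- 274

# Likelihood-ratio statistic and its degrees of freedom
lr_stat <- dev_1iv - dev_2iv
lr_df   <- df_1iv - df_2iv

# Upper-tail chi-squared p-value
lr_p <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)
cat("LR =", lr_stat, " df =", lr_df, " p =", round(lr_p, 4), "\n")
```

With the fitted objects available, `anova(model.1IV, model.2IV, test = "Chisq")` performs the same comparison directly.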
FinalModel<-glm(TradeBalance~EuroDollarRate+CapacityUtilization,data = Trade.subdata, family = "binomial")
summary(FinalModel)
##
## Call:
## glm(formula = TradeBalance ~ EuroDollarRate + CapacityUtilization,
## family = "binomial", data = Trade.subdata)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.0897 -0.9395 0.6805 0.8985 1.6211
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 7.57677 3.74835 2.021 0.0432 *
## EuroDollarRate 0.51834 0.09228 5.617 1.94e-08 ***
## CapacityUtilization -0.11114 0.05000 -2.223 0.0262 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 371.34 on 276 degrees of freedom
## Residual deviance: 327.67 on 274 degrees of freedom
## AIC: 333.67
##
## Number of Fisher Scoring iterations: 4
par(mfrow = c(1,1))
FinalModel.res<-residuals(FinalModel,type = "deviance")
plot(fitted(FinalModel),FinalModel.res,pch=21, cex=1, bg='blue',main="Plot of Fitted Values vs. Residuals ", xlab = "Fitted Values of Model", ylab = "Residuals")
abline(0,0)
In the plots below, I check whether the residuals are correlated with the independent variables.
####Residual Vs. Capacity Utilization
attach(Trade.subdata)
## The following object is masked _by_ .GlobalEnv:
##
## TradeBalance
##
## The following objects are masked from Trade.subdata (pos = 3):
##
## CapacityUtilization, EuroDollarRate, TradeBalance
##
## The following objects are masked from Trade.subdata (pos = 5):
##
## CapacityUtilization, EuroDollarRate, TradeBalance
##
## The following objects are masked from Trade.data:
##
## CapacityUtilization, EuroDollarRate, TradeBalance
plot(CapacityUtilization,FinalModel.res,pch=21, cex=1, bg='blue',main="Plot of Capacity Utilization vs. Residuals ", xlab = "Capacity Utilization", ylab = "Residuals")
abline(1,0)
abline(-1,0)
plot(EuroDollarRate,FinalModel.res,pch=21, cex=1, bg='blue',main="Plot of EuroDollarRate vs. Residuals ", xlab = "EuroDollarRate", ylab = "Residuals")
abline(1,0)
abline(-1,0)
We can see in the plots that the residuals are apparently correlated with Capacity Utilization when utilization is over 75%. The residuals of the model also have a positive correlation with the Euro Dollar Rate, since they keep increasing as the Euro Dollar Rate rises.
The histogram shows that the residuals are not normally distributed: the peak of the histogram is lower than the sides on both the left and the right. I am curious whether I can separate the dataset into two subsets: one containing the data where Trade Balance equals 1, and the other containing the data where Trade Balance equals 0. The residuals can then be viewed as two subsets, one for the prediction of a Trade Balance surplus and one for a deficit, and we can check whether the residuals for each prediction are normally distributed.
hist(FinalModel.res,xlab = "Residuals",main = "The Histogram of Deviance Residuals of model")
#predict on the probability scale (type = "response") so the fit is comparable with the 0/1 outcome
fit1<-predict(FinalModel,subset(Trade.subdata,Trade.subdata$TradeBalance==1),type = "response")
fit0<-predict(FinalModel,subset(Trade.subdata,Trade.subdata$TradeBalance==0),type = "response")
#find the residual around 1 and 0 (fitted probability minus observed value)
resid1<-fit1-1
resid0<-fit0-0
hist(resid1, main = "Histogram of Residual When Dependent Variable = 1",xlab = "Residual")
hist(resid0, main = "Histogram of Residual When Dependent Variable = 0",xlab = "Residual")
Apparently, neither of the residual subsets is normally distributed.
boxplot(FinalModel.res,main="Box Plot of the Residual")
qqnorm(FinalModel.res,main = "QQPlot of the Residual")
qqline(FinalModel.res)
qqnorm(resid1,main="QQplot of the Residual, When Trade Balance =1")
qqline(resid1)
qqnorm(resid0,main = "QQplot of the Residual, When Trade Balance =0")
qqline(resid0)
plot(FinalModel.res, ylab = "Deviance Residual", main = ("Deviance residual plot"))
We can see in the summary that the intercept of the model is 7.58, the slope of Euro Dollar Rate is 0.52 and the slope of Capacity Utilization is -0.11. These coefficients are on the log-odds scale. The intercept shows that when Euro Dollar Rate and Capacity Utilization both equal zero, the log odds of a surplus are 7.58 (odds of roughly 1950 to 1), which corresponds to a US trade surplus. Increasing Euro Dollar Rate by 1 unit (1 percentage point) increases the log odds by 0.52, that is, it multiplies the odds of a surplus by about exp(0.518) ≈ 1.68, while keeping Capacity Utilization unchanged. In a similar way, increasing Capacity Utilization by 1 unit (1 percentage point) decreases the log odds by 0.11, that is, it multiplies the odds by about exp(-0.111) ≈ 0.89.
In this case, the p-values for the estimates are 0.0432 (intercept), 1.94e-08 (Euro Dollar Rate) and 0.0262 (Capacity Utilization). At the 5% significance level (95% confidence), the null hypothesis that the two independent variables cannot explain the dependent variable is rejected. Hence, we accept the alternative hypothesis that the Euro Dollar Rate and US Capacity Utilization can explain whether the US Trade Balance is in deficit or surplus.
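Since glm reports coefficients on the log-odds scale, exponentiating them gives odds ratios, and `plogis` converts a linear predictor into a probability. A small sketch, assuming the coefficient values printed in the summary; the vector `b` and the example inputs (taken from the first row of the data) are illustrative.

```r
# Coefficients copied from summary(FinalModel)
b <- c(Intercept = 7.57677, EuroDollarRate = 0.51834, CapacityUtilization = -0.11114)

# Odds ratios: multiplicative change in the odds of a surplus per 1-unit increase
odds_ratio <- exp(b[c("EuroDollarRate", "CapacityUtilization")])
print(round(odds_ratio, 2))

# Predicted probability of a surplus for the first observation (1992-03)
eta <- b["Intercept"] + b["EuroDollarRate"] * 4.43 + b["CapacityUtilization"] * 80.2999
prob_surplus <- unname(plogis(eta))
print(round(prob_surplus, 2))
```

With the fitted object available, `exp(coef(FinalModel))` gives the same odds ratios without copying numbers by hand.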
summary(FinalModel)
##
## Call:
## glm(formula = TradeBalance ~ EuroDollarRate + CapacityUtilization,
## family = "binomial", data = Trade.subdata)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.0897 -0.9395 0.6805 0.8985 1.6211
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 7.57677 3.74835 2.021 0.0432 *
## EuroDollarRate 0.51834 0.09228 5.617 1.94e-08 ***
## CapacityUtilization -0.11114 0.05000 -2.223 0.0262 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 371.34 on 276 degrees of freedom
## Residual deviance: 327.67 on 274 degrees of freedom
## AIC: 333.67
##
## Number of Fisher Scoring iterations: 4
From the plot, the residuals do not show a linear pattern around zero. In other words, the expected value of the residuals is not zero in this case. The scatter plot of the residuals shows that there are two groups of residuals, each falling around one of the two values of the categorical dependent variable.
par(mfrow = c(1,1))
FinalModel.res<-residuals(FinalModel,type = "deviance")
plot(fitted(FinalModel),FinalModel.res,pch=21, cex=1, bg='blue',main="Plot of Fitted Values vs. Residuals ", xlab = "Fitted Values of Model", ylab = "Residuals")
abline(0,0)
The plot of fitted values vs. residuals is not linear, so the expected value of the residuals is not zero.
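The two bands are a general feature of binary logistic regression rather than something specific to this dataset: a deviance residual is positive whenever the observed outcome is 1 and negative whenever it is 0. The sketch below illustrates this on simulated data; the data, seed and variable names are made up for illustration and are not the trade data.

```r
set.seed(1)

# Simulated binary outcome with one continuous predictor (not the trade data)
n <- 200
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(0.5 + 1.2 * x))

sim_fit <- glm(y ~ x, family = "binomial")
sim_res <- residuals(sim_fit, type = "deviance")

# Residuals split cleanly by the observed outcome
print(range(sim_res[y == 1]))  # all positive
print(range(sim_res[y == 0]))  # all negative
```

This is one reason residual plots for a binary response look different from their linear-regression counterparts.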
par(mfrow = c(1,1))
plot(FinalModel.res,pch=21,cex=1,bg="blue",xlab = "index",ylab = "Residual", main="Residual Value")
The plot does not clearly rule out serial correlation. Autocorrelation appears to exist in the years 2005~2009 (index 150 to 200).
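One way to put a number on serial correlation is the lag-1 autocorrelation of the residuals in time order. The sketch below uses a simulated AR(1) series as a stand-in for `FinalModel.res`; the series and the helper name `lag1_autocorr` are assumptions for illustration only.

```r
# Lag-1 autocorrelation of a series, using stats::acf without plotting
lag1_autocorr <- function(res) {
  # acf() returns lags 0, 1, 2, ...; element 2 is lag 1
  as.numeric(acf(res, lag.max = 1, plot = FALSE)$acf[2])
}

set.seed(88)
# Stand-in series with built-in positive serial correlation (not the real residuals)
fake_res <- as.numeric(arima.sim(model = list(ar = 0.7), n = 277))

rho1 <- lag1_autocorr(fake_res)
print(rho1)
```

Applied to the real residuals, a value of `lag1_autocorr(FinalModel.res)` near zero would suggest little serial correlation, while a clearly positive value would support the pattern seen in the plot.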
The residuals are not normally distributed, as shown in the histogram. I also look at the histograms separated by the value of the dependent variable (1 or 0). These two histograms also appear non-normal, with some skewness. The residuals for observations where the dependent variable equals 1 are closer to a normal distribution than the residuals where the dependent variable equals 0. (The QQ plots show large deviations in both tails.)
hist(FinalModel.res, main = "Residual Histogram")
#find the residual around 1 and 0
hist(resid1, main = "Histogram of Residual When Dependent Variable = 1",xlab = "Residual")
hist(resid0, main = "Histogram of Residual When Dependent Variable = 0",xlab ="Residual")
qqnorm(resid1,main="QQplot of the Residual, When Trade Balance =1")
qqline(resid1)
qqnorm(resid0,main = "QQplot of the Residual, When Trade Balance =0")
qqline(resid0)
The residuals of the model appear to be heteroskedastic. Next, I use the Breusch-Pagan test to check whether the residuals really are heteroskedastic.
plot(FinalModel.res,main = "Residual plot")
attach(Trade.subdata)
## The following object is masked _by_ .GlobalEnv:
##
## TradeBalance
##
## The following objects are masked from Trade.subdata (pos = 3):
##
## CapacityUtilization, EuroDollarRate, TradeBalance
##
## The following objects are masked from Trade.subdata (pos = 4):
##
## CapacityUtilization, EuroDollarRate, TradeBalance
##
## The following objects are masked from Trade.subdata (pos = 6):
##
## CapacityUtilization, EuroDollarRate, TradeBalance
##
## The following objects are masked from Trade.data:
##
## CapacityUtilization, EuroDollarRate, TradeBalance
library(lmtest)
## Loading required package: zoo
##
## Attaching package: 'zoo'
##
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
#test TradeBalance ~ EuroDollarRate
bptest(model.1IV)
##
## studentized Breusch-Pagan test
##
## data: model.1IV
## BP = 3.1605, df = 1, p-value = 0.07544
# test TradeBalance ~ CapacityUtilization
model.1IV2<-glm(TradeBalance~CapacityUtilization,data = Trade.subdata,family = "binomial")
bptest(model.1IV2)
##
## studentized Breusch-Pagan test
##
## data: model.1IV2
## BP = 65.3676, df = 1, p-value = 6.215e-16
# test TradeBalance ~ EuroDollarRate + CapacityUtilization
bptest(FinalModel)
##
## studentized Breusch-Pagan test
##
## data: FinalModel
## BP = 4.9134, df = 2, p-value = 0.08572
The null hypothesis of the Breusch-Pagan test is that the data are homoskedastic, and the alternative hypothesis is that the data are heteroskedastic. The first test is for Trade Balance explained by Euro Dollar Rate, the second for Trade Balance explained by Capacity Utilization, and the last for Trade Balance explained by both independent variables. The p-values for the three tests are 0.07544, 6.215e-16 and 0.08572. If we use alpha = 10%, the null hypothesis is rejected in all three tests, which means the data are heteroskedastic.
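The p-values reported by bptest are upper-tail chi-squared probabilities of the BP statistic, so the decision at alpha = 10% can be reproduced from the printed statistics. A minimal sketch (the data frame `bp` is an illustrative name):

```r
# BP statistics and degrees of freedom copied from the bptest() output above
bp <- data.frame(
  model = c("model.1IV", "model.1IV2", "FinalModel"),
  stat  = c(3.1605, 65.3676, 4.9134),
  df    = c(1, 1, 2)
)

# Upper-tail chi-squared p-values
bp$p_value <- pchisq(bp$stat, df = bp$df, lower.tail = FALSE)

# Decision at alpha = 10%
alpha <- 0.10
bp$reject_h0 <- bp$p_value < alpha
print(bp)
```

Changing `alpha` to 0.05 would leave only model.1IV2 rejected, so the conclusion for the other two models depends on the chosen level.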
Neither the Euro Dollar Rate nor Capacity Utilization is an ultimate or proximal cause of a Trade Balance deficit or surplus; too many other factors make the Trade Balance fluctuate. The Euro Dollar Rate may be a probabilistic cause of the Trade Balance, because there is no clear direct relationship between the two. The relationship between Trade Balance and Capacity Utilization is a little more complicated. It may also be probabilistic, since the Trade Balance can be affected by other factors. However, when domestic capacity utilization increases, an obvious consequence is that the demand for capital goods rises. Hence, the relationship between Capacity Utilization and Trade Balance may also be deterministic.
I used G*Power with the odds ratio (calculated with EuroDollarRate and CapacityUtilization both equal to 1), 0.39 for H0 (from the model), alpha = 0.05 and power = 0.95 to find the required sample size for the model. The recommended sample size is 255, which is very close to the number of observations in the dataset (277). Hence, I decided to use the original dataset for the model. The code below refits the model on a random sample of 255 rows.
#form an index array
samplesize<-255
set.seed(88)
samplerow<- nrow(Trade.subdata)
#randomly pick row indices from the original set
model.index <- sample(samplerow, samplesize, replace = FALSE)
#construct a new set containing the samples
Trade.sample<-Trade.subdata[model.index,]
attach(Trade.sample)
## The following object is masked _by_ .GlobalEnv:
##
## TradeBalance
##
## The following objects are masked from Trade.subdata (pos = 5):
##
## CapacityUtilization, EuroDollarRate, TradeBalance
##
## The following objects are masked from Trade.subdata (pos = 6):
##
## CapacityUtilization, EuroDollarRate, TradeBalance
##
## The following objects are masked from Trade.subdata (pos = 7):
##
## CapacityUtilization, EuroDollarRate, TradeBalance
##
## The following objects are masked from Trade.subdata (pos = 9):
##
## CapacityUtilization, EuroDollarRate, TradeBalance
##
## The following objects are masked from Trade.data:
##
## CapacityUtilization, EuroDollarRate, TradeBalance
model.sample<-glm(Trade.sample$TradeBalance~Trade.sample$EuroDollarRate+Trade.sample$CapacityUtilization,data = Trade.sample, family = "binomial")
summary(model.sample)
##
## Call:
## glm(formula = Trade.sample$TradeBalance ~ Trade.sample$EuroDollarRate +
## Trade.sample$CapacityUtilization, family = "binomial", data = Trade.sample)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.1166 -0.9681 0.6781 0.8845 1.5637
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 7.03602 3.81745 1.843 0.0653 .
## Trade.sample$EuroDollarRate 0.50610 0.09704 5.215 1.84e-07 ***
## Trade.sample$CapacityUtilization -0.10263 0.05100 -2.012 0.0442 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 335.69 on 254 degrees of freedom
## Residual deviance: 298.24 on 252 degrees of freedom
## AIC: 304.24
##
## Number of Fisher Scoring iterations: 4
wald.test(b = coef(model.sample),Sigma = vcov(model.sample),Terms =2:3)
## Wald test:
## ----------
##
## Chi-squared test:
## X2 = 32.8, df = 2, P(> X2) = 7.6e-08
I regressed Euro Dollar Rate on Capacity Utilization to check whether collinearity exists between the two independent variables. The R-squared of the linear regression is 0.498, which is not small. Hence, there is some multicollinearity in the data.
attach(Trade.subdata)
## The following object is masked _by_ .GlobalEnv:
##
## TradeBalance
##
## The following objects are masked from Trade.sample:
##
## CapacityUtilization, EuroDollarRate, TradeBalance
##
## The following objects are masked from Trade.subdata (pos = 6):
##
## CapacityUtilization, EuroDollarRate, TradeBalance
##
## The following objects are masked from Trade.subdata (pos = 7):
##
## CapacityUtilization, EuroDollarRate, TradeBalance
##
## The following objects are masked from Trade.subdata (pos = 8):
##
## CapacityUtilization, EuroDollarRate, TradeBalance
##
## The following objects are masked from Trade.subdata (pos = 10):
##
## CapacityUtilization, EuroDollarRate, TradeBalance
##
## The following objects are masked from Trade.data:
##
## CapacityUtilization, EuroDollarRate, TradeBalance
Colinear<-lm(EuroDollarRate~CapacityUtilization)
summary(Colinear)
##
## Call:
## lm(formula = EuroDollarRate ~ CapacityUtilization)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.2343 -1.0346 0.1970 0.9844 3.6099
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -28.26228 1.91317 -14.77 <2e-16 ***
## CapacityUtilization 0.39916 0.02417 16.52 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.526 on 275 degrees of freedom
## Multiple R-squared: 0.498, Adjusted R-squared: 0.4962
## F-statistic: 272.8 on 1 and 275 DF, p-value: < 2.2e-16
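The R-squared above can be turned into a variance inflation factor (VIF), a common summary of how much collinearity inflates coefficient variances. This is a sketch added for illustration, not part of the original analysis; it only reuses the printed R-squared and the correlation from the earlier cor() output.

```r
# R-squared copied from summary(Colinear)
r_squared <- 0.498

# Variance inflation factor for a model with these two predictors
vif <- 1 / (1 - r_squared)
print(round(vif, 2))

# Cross-check: with one regressor, R-squared equals the squared correlation
r_from_cor <- 0.7056917^2
print(round(r_from_cor, 3))
```

A VIF close to 2 indicates moderate collinearity; informal rules of thumb usually flag values above 5 or 10 as a serious problem.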
The data should be accurate, since it is collected from official government sites. The only source of error we may need to think about is the Euro Dollar Rate, which is not determined purely by the market: manipulation could distort the reported rate if something like the LIBOR rate-rigging case happened to the Euro Dollar Rate.
We add an interaction between the two independent variables to the model and find that the effect of the interaction is significant. The chi-squared statistic improves and, what's more, the slope of Capacity Utilization becomes more significantly different from zero, even at a stricter significance level.
attach(Trade.subdata)
## The following object is masked _by_ .GlobalEnv:
##
## TradeBalance
##
## The following objects are masked from Trade.subdata (pos = 3):
##
## CapacityUtilization, EuroDollarRate, TradeBalance
##
## The following objects are masked from Trade.sample:
##
## CapacityUtilization, EuroDollarRate, TradeBalance
##
## The following objects are masked from Trade.subdata (pos = 7):
##
## CapacityUtilization, EuroDollarRate, TradeBalance
##
## The following objects are masked from Trade.subdata (pos = 8):
##
## CapacityUtilization, EuroDollarRate, TradeBalance
##
## The following objects are masked from Trade.subdata (pos = 9):
##
## CapacityUtilization, EuroDollarRate, TradeBalance
##
## The following objects are masked from Trade.subdata (pos = 11):
##
## CapacityUtilization, EuroDollarRate, TradeBalance
##
## The following objects are masked from Trade.data:
##
## CapacityUtilization, EuroDollarRate, TradeBalance
model.interact<-glm(TradeBalance~EuroDollarRate+CapacityUtilization+EuroDollarRate*CapacityUtilization,data = Trade.subdata, family = "binomial")
summary(model.interact)
##
## Call:
## glm(formula = TradeBalance ~ EuroDollarRate + CapacityUtilization +
## EuroDollarRate * CapacityUtilization, family = "binomial",
## data = Trade.subdata)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.2112 -0.9837 0.5057 0.9955 1.7593
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 28.01647 7.11462 3.938 8.22e-05 ***
## EuroDollarRate -7.67700 2.25556 -3.404 0.000665 ***
## CapacityUtilization -0.37305 0.09233 -4.040 5.34e-05 ***
## EuroDollarRate:CapacityUtilization 0.10266 0.02835 3.621 0.000294 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 371.34 on 276 degrees of freedom
## Residual deviance: 312.97 on 273 degrees of freedom
## AIC: 320.97
##
## Number of Fisher Scoring iterations: 4
wald.test(b = coef(model.interact),Sigma = vcov(model.interact),Terms =2:4)
## Wald test:
## ----------
##
## Chi-squared test:
## X2 = 44.9, df = 3, P(> X2) = 9.9e-10
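With an interaction term, the effect of Euro Dollar Rate on the log odds is no longer a single number; it depends on the level of Capacity Utilization. The sketch below evaluates that conditional slope from the printed coefficients; the function name `euro_slope` and the chosen utilization levels are illustrative.

```r
# Coefficients copied from summary(model.interact)
b_euro     <- -7.67700
b_interact <-  0.10266

# Slope of EuroDollarRate on the log-odds scale at a given Capacity Utilization
euro_slope <- function(capacity) b_euro + b_interact * capacity

print(euro_slope(c(70, 75, 80)))

# Capacity Utilization at which the slope changes sign
turning_point <- -b_euro / b_interact
print(round(turning_point, 2))
```

Under these estimates, the slope of Euro Dollar Rate is positive only when Capacity Utilization is above the turning point, which is why the main-effect coefficient alone should not be read as the overall effect.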
Euro Dollar Rate and Capacity Utilization are likely to explain the log odds of a Trade Balance deficit or surplus. The two independent variables are collinear and their interaction effect is significant. Still, using the two variables together explains the Trade Balance better than a single variable does.