How have expenditures on R&D affected technological spillover across sectors and geographical areas?
Is there a margin of difference between R&D expenditure in Europe and in other parts of the world?
Does that margin of difference suggest a different level of technological spillover between geographical areas?
Can we conclude that the spillover in one geographical area is better than in another?
The data set consists of 1629 observations of 7 variables.
## Classes 'tbl_df', 'tbl' and 'data.frame': 1629 obs. of 7 variables:
## $ year : num 1983 1983 1983 1983 1983 ...
## $ fi : num 1 2 3 4 5 6 7 8 9 10 ...
## $ sector: num 4 5 2 2 11 5 1 11 3 2 ...
## $ geo : num 3 3 3 1 4 1 3 3 3 3 ...
## $ patent: num 18 4 29 45 1 0 1 0 0 47 ...
## $ rdexp : num 5.29 4.31 3.76 5.87 4.21 ...
## $ spil : num 8.98 10.42 9.65 9.63 8.7 ...
## # A tibble: 5 x 7
## year fi sector geo patent rdexp spil
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1983 1 4 3 18 5.29 8.98
## 2 1983 2 5 3 4 4.31 10.4
## 3 1983 3 2 3 29 3.76 9.65
## 4 1983 4 2 1 45 5.87 9.63
## 5 1983 5 11 4 1 4.21 8.70
## [1] 0
This shows that there are no missing values in the data set.
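The source does not show the call that produced the `## [1] 0` output; a count like this is typically obtained with `sum(is.na(...))`. The following is a minimal sketch on a small made-up data frame (`toy` is hypothetical and deliberately contains one missing value so the check has something to find).

```r
# Hypothetical toy data frame standing in for the real data set
toy <- data.frame(
  year  = c(1983, 1983, 1984),
  rdexp = c(5.29, NA, 3.76),
  spil  = c(8.98, 10.42, 9.65)
)

# Total number of missing cells across the whole data frame
total_missing <- sum(is.na(toy))

# Missing cells per column, useful for locating where the gaps are
missing_by_column <- colSums(is.na(toy))

print(total_missing)
print(missing_by_column)
```

A total of 0, as reported for the real data, means no imputation or row removal is needed before modelling.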
Note: fi, sector and geo are categorical variables, but they are encoded as numeric values for the purpose of the analysis.
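To illustrate what the numeric encoding implies, here is a small sketch assuming one wanted to treat the codes as labels instead. With numeric codes, functions such as `mean()` and `lm()` treat the values as quantities; with `factor()`, each code becomes an unordered level. The `toy` data frame and the `_f` column names are hypothetical and are not part of the original analysis.

```r
# Hypothetical toy rows using the same column names as the report
toy <- data.frame(
  sector = c(4, 5, 2, 2, 11),
  geo    = c(3, 3, 3, 1, 4),
  spil   = c(8.98, 10.42, 9.65, 9.63, 8.70)
)

# As numeric codes, an average is computed even though it has no real meaning
mean(toy$geo)  # 2.8

# As factors, the codes become unordered category labels
toy$geo_f    <- factor(toy$geo)
toy$sector_f <- factor(toy$sector)

levels(toy$geo_f)      # the distinct geo codes present in the toy data
nlevels(toy$sector_f)  # number of distinct sectors in the toy data
```

In a regression, a numeric code gets a single slope, whereas a factor gets one coefficient per level (relative to a baseline level), so the choice of encoding changes how the coefficients should be read.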
## year fi sector geo
## Min. :1983 Min. : 1 Min. : 1.00 Min. :1.000
## 1st Qu.:1985 1st Qu.: 46 1st Qu.: 3.00 1st Qu.:3.000
## Median :1987 Median : 91 Median : 5.00 Median :3.000
## Mean :1987 Mean : 91 Mean : 6.32 Mean :2.641
## 3rd Qu.:1989 3rd Qu.:136 3rd Qu.: 9.00 3rd Qu.:3.000
## Max. :1991 Max. :181 Max. :15.00 Max. :4.000
## patent rdexp spil
## Min. : 0.00 Min. :0.8651 Min. : 6.825
## 1st Qu.: 3.00 1st Qu.:4.1617 1st Qu.: 8.864
## Median : 18.00 Median :5.0752 Median : 9.619
## Mean : 60.79 Mean :5.2013 Mean : 9.399
## 3rd Qu.: 57.00 3rd Qu.:6.0911 3rd Qu.: 9.980
## Max. :925.00 Max. :8.7000 Max. :10.759
## vars n mean sd median trimmed mad min max
## year 1 1629 1987.00 2.58 1987.00 1987.00 2.97 1983.00 1991.00
## fi 2 1629 91.00 52.27 91.00 91.00 66.72 1.00 181.00
## sector 3 1629 6.32 4.23 5.00 5.89 4.45 1.00 15.00
## geo 4 1629 2.64 0.73 3.00 2.79 0.00 1.00 4.00
## patent 5 1629 60.79 121.56 18.00 31.13 25.20 0.00 925.00
## rdexp 6 1629 5.20 1.26 5.08 5.12 1.43 0.87 8.70
## spil 7 1629 9.40 0.93 9.62 9.46 0.89 6.82 10.76
## range skew kurtosis se
## year 8.00 0.00 -1.23 0.06
## fi 180.00 0.00 -1.20 1.29
## sector 14.00 0.71 -0.65 0.10
## geo 3.00 -1.58 0.86 0.02
## patent 925.00 3.84 17.36 3.01
## rdexp 7.83 0.46 -0.46 0.03
## spil 3.93 -0.65 -0.29 0.02
From the correlation table, there is a fairly strong relationship between patent and rdexp (a correlation coefficient of about 0.55), which suggests possible multicollinearity if all the variables are used for model building. rdexp, on the other hand, has a correlation coefficient of about 0.35 with spil, the highest among the predictors.
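The correlation table itself is not reproduced here, so the sketch below shows one way such a table and a simple multicollinearity check could be computed in base R. It uses simulated data (`toy`, and the coefficients used to generate it, are assumptions for illustration only). The variance inflation factor (VIF) is computed by hand as 1 / (1 - R²), where R² comes from regressing one predictor on the others.

```r
set.seed(1)

# Simulated stand-in data: patent is built to correlate with rdexp
n      <- 200
rdexp  <- rnorm(n, mean = 5, sd = 1.2)
patent <- 20 * rdexp + rnorm(n, sd = 30)
spil   <- 8 + 0.25 * rdexp + rnorm(n, sd = 0.8)
toy    <- data.frame(patent, rdexp, spil)

# Pairwise Pearson correlations between all columns
cor_matrix <- cor(toy)
print(round(cor_matrix, 2))

# VIF for rdexp: regress it on the other predictor(s) and use 1 / (1 - R^2)
r2_rdexp  <- summary(lm(rdexp ~ patent, data = toy))$r.squared
vif_rdexp <- 1 / (1 - r2_rdexp)
print(vif_rdexp)
```

With only two predictors, the VIF reduces to 1 / (1 - r²), so a pairwise correlation of 0.55 on its own corresponds to a VIF of about 1.43. The VIF in the full model would also depend on the other predictors.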
## # A tibble: 4 x 3
## geo sum_rdexp sum_spil
## <dbl> <dbl> <dbl>
## 1 1 1460. 2378.
## 2 2 707. 1083.
## 3 3 6264. 11771.
## 4 4 41.3 78.8
## # A tibble: 15 x 3
## sector sum_rdexp sum_spil
## <dbl> <dbl> <dbl>
## 1 1 573. 1105.
## 2 2 1295. 2477.
## 3 3 983. 1724.
## 4 4 628. 1080.
## 5 5 1514. 2764.
## 6 6 334. 638.
## 7 7 518. 787.
## 8 8 122. 210.
## 9 9 511. 1038.
## 10 10 391. 826.
## 11 11 124. 236.
## 12 12 342. 599.
## 13 13 173. 324.
## 14 14 67.5 173.
## 15 15 897. 1331.
## # A tibble: 9 x 3
## year sum_rdexp sum_spil
## <dbl> <dbl> <dbl>
## 1 1983 892. 1664.
## 2 1984 912. 1678.
## 3 1985 929. 1692.
## 4 1986 937. 1696.
## 5 1987 946. 1701.
## 6 1988 957. 1711.
## 7 1989 962. 1715.
## 8 1990 969. 1725.
## 9 1991 969. 1731.
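The three summary tables above have the shape produced by `dplyr::group_by()` followed by `summarise()`. The code that generated them is not shown, so the following is a sketch on hypothetical rows. `Yr` is the name used by the plotting code in this report; `toy`, `Geo`, `n_obs` and `mean_rdexp` are illustrative names.

```r
library(dplyr)

# Hypothetical rows using the report's column names
toy <- data.frame(
  year  = c(1983, 1983, 1984, 1984),
  geo   = c(3, 1, 3, 1),
  rdexp = c(5.29, 5.87, 3.76, 4.21),
  spil  = c(8.98, 9.63, 9.65, 8.70)
)

# Totals per year, matching the sum_rdexp / sum_spil layout of the tables
Yr <- toy %>%
  group_by(year) %>%
  summarise(sum_rdexp = sum(rdexp), sum_spil = sum(spil))

# Totals per area, plus group size and mean for a size-independent comparison
Geo <- toy %>%
  group_by(geo) %>%
  summarise(
    sum_rdexp  = sum(rdexp),
    sum_spil   = sum(spil),
    n_obs      = n(),
    mean_rdexp = mean(rdexp)
  )

print(Yr)
print(Geo)
```

Sums grow with the number of observations in a group, so when groups differ a lot in size, reporting the count and the mean alongside the sum can make comparisons between areas or sectors easier to interpret.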
How have expenditures on R&D affected technological spillover across sectors and geographical areas?
It is evident from the chart that the more a sector spends on R&D, the higher the technological spillover effect on that sector. For instance, Sector_1, with about 600m pounds of R&D expenditure, shows a technological spillover of about 1200m pounds, which in turn further improves the sector and the industry.
library(ggplot2)
# Scatter plot of total spillover per year with a fitted linear trend
scatter <- ggplot(Yr, aes(year, sum_spil))
scatter + geom_point() + labs(x = "Year", y = "sum_spil") +
  geom_smooth(method = "lm", colour = "Red", alpha = 0.1, fill = "Blue")
# Scatter plot of total R&D expenditure per year with a fitted linear trend
scatter <- ggplot(Yr, aes(year, sum_rdexp))
scatter + geom_point() + labs(x = "Year", y = "sum_rdexp") +
  geom_smooth(method = "lm", colour = "Red", alpha = 0.1, fill = "Blue")
##
## Call:
## lm(formula = spil ~ ., data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.48192 -0.43267 0.05886 0.46771 1.83142
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -6.174e+01 1.467e+01 -4.209 2.71e-05 ***
## year 3.586e-02 7.389e-03 4.853 1.33e-06 ***
## fi -4.228e-04 3.649e-04 -1.159 0.247
## sector -9.342e-02 4.621e-03 -20.214 < 2e-16 ***
## geo -2.777e-01 2.858e-02 -9.714 < 2e-16 ***
## patent -1.335e-03 1.934e-04 -6.904 7.24e-12 ***
## rdexp 2.568e-01 1.853e-02 13.855 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7639 on 1622 degrees of freedom
## Multiple R-squared: 0.3206, Adjusted R-squared: 0.3181
## F-statistic: 127.6 on 6 and 1622 DF, p-value: < 2.2e-16
# Refit without fi, which was not significant in the full model (p = 0.247)
regressor <- lm(formula = spil ~ year + sector + geo + patent + rdexp,
                data = data)
summary(regressor)
##
## Call:
## lm(formula = spil ~ year + sector + geo + patent + rdexp, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.46087 -0.43013 0.06179 0.47028 1.83642
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -6.160e+01 1.467e+01 -4.199 2.83e-05 ***
## year 3.576e-02 7.389e-03 4.839 1.43e-06 ***
## sector -9.343e-02 4.622e-03 -20.214 < 2e-16 ***
## geo -2.743e-01 2.844e-02 -9.645 < 2e-16 ***
## patent -1.331e-03 1.934e-04 -6.881 8.48e-12 ***
## rdexp 2.583e-01 1.849e-02 13.972 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7639 on 1623 degrees of freedom
## Multiple R-squared: 0.32, Adjusted R-squared: 0.3179
## F-statistic: 152.8 on 5 and 1623 DF, p-value: < 2.2e-16
plot(regressor)
This model is significant overall, as the p-value of the F-test is less than 0.05 (p-value: < 2.2e-16). With an adjusted R-squared of 0.3179, the model explains only about 31.8% of the variation in technological spillover, while the remaining 68.2% is due to factors outside the model. Each of the variables is individually significant, because every p-value is far below 0.05.
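As a check on how the reported figures relate to each other, the adjusted R-squared and the F-statistic can be recomputed from the multiple R-squared, the number of observations and the number of predictors. The sketch below uses only values printed in the second model summary. The variable names are illustrative, and the rounded R-squared of 0.32 is taken as given.

```r
# Values read from the second model summary in the report
r2 <- 0.32   # Multiple R-squared
n  <- 1629   # number of observations
p  <- 5      # predictors: year, sector, geo, patent, rdexp

# Adjusted R-squared penalises R-squared for the number of predictors
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Overall F-statistic on p and n - p - 1 degrees of freedom
f_stat <- (r2 / p) / ((1 - r2) / (n - p - 1))

print(round(adj_r2, 4))  # 0.3179
print(round(f_stat, 1))  # 152.8
```

Both values agree with the printed summary (Adjusted R-squared: 0.3179, F-statistic: 152.8 on 5 and 1623 DF). Because n is large relative to p, the adjustment barely changes R-squared.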