We start the analysis by installing and loading the packages “lmtest” and “tseries”, which provide several useful diagnostics, for example the Breusch-Pagan test for heteroscedasticity (lmtest) and the Jarque-Bera normality test (tseries).
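The calls to bptest() and jarque.bera.test() later in this guide only work once both packages are installed and attached. A minimal sketch of that setup follows; the check-before-install pattern and the variable names pkgs and not_installed are assumptions made for illustration, not part of the original analysis.

```r
# Packages used later: bptest() is in lmtest, jarque.bera.test() is in tseries.
pkgs <- c("lmtest", "tseries")

# Find which of them are not installed yet (this check needs no network access).
not_installed <- pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]

if (length(not_installed) > 0) {
  # Installation is left to the user; run the printed command once interactively.
  message("Run: install.packages(c(",
          paste0('"', not_installed, '"', collapse = ", "), "))")
} else {
  # Attach both packages so the test functions can be called directly.
  library(lmtest)
  library(tseries)
}
```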

Simple Linear Regression Model

Estimate the model that explains the GVA as a function of labor productivity

Load the data used for the analysis. We use only GVA as the dependent variable (y) and Labor Productivity as the independent variable (x); the Business Birth Rate will be used in another publication on multiple linear regression.

y = data$GVA
x = data$Labor.Productivity

Plot the data on a scatter diagram

plot(x,y)

Estimate a simple regression model and draw the fitted regression line on the scatter diagram

model.1 <- lm(y ~ x)
plot(x, y)            
abline(model.1, col = "blue", lwd = 2)
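To see what lm() computes in the single-regressor case, the closed-form least squares estimates can be reproduced by hand: the slope is cov(x, y) / var(x) and the intercept is mean(y) minus the slope times mean(x). The sketch below uses simulated data (x_sim and y_sim are invented for illustration and are not the GVA data set).

```r
# Simulated data for illustration only; this is NOT the GVA data set.
set.seed(1)
x_sim <- runif(12, min = 80, max = 110)
y_sim <- -20 + 0.3 * x_sim + rnorm(12, sd = 1.5)

# Closed-form OLS estimates for a single regressor.
slope_hand     <- cov(x_sim, y_sim) / var(x_sim)
intercept_hand <- mean(y_sim) - slope_hand * mean(x_sim)

# The same estimates from lm().
fit_sim <- lm(y_sim ~ x_sim)

print(c(intercept = intercept_hand, slope = slope_hand))
print(coef(fit_sim))
```

Both printed vectors should agree up to floating-point precision.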

Print the summary of the model, which contains the point estimates of the parameters, their significance, the R-squared, the adjusted R-squared, and the F-test.

summary(model.1)

## 
## Call:
## lm(formula = y ~ x)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.5300 -0.8375 -0.4193  1.0386  3.0149 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -21.38866    3.19237  -6.700 5.36e-05 ***
## x             0.31460    0.03335   9.432 2.71e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.773 on 10 degrees of freedom
## Multiple R-squared:  0.899,  Adjusted R-squared:  0.8889 
## F-statistic: 88.97 on 1 and 10 DF,  p-value: 2.708e-06
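
The figures in this summary are linked by simple arithmetic, which is a useful sanity check when reading such output. The sketch below recomputes the adjusted R-squared and the F-statistic from the printed values only; n = 12 is inferred from the 10 residual degrees of freedom plus the 2 estimated coefficients, and the variable names are chosen here for illustration.

```r
# Values copied from the summary(model.1) output above.
r2      <- 0.899   # Multiple R-squared
n       <- 12      # observations: 10 residual df + 2 estimated coefficients
k       <- 1       # number of regressors
t_slope <- 9.432   # t value of x

# Adjusted R-squared penalises R-squared for the number of regressors.
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - k - 1)

# F-statistic for overall significance, written in terms of R-squared.
f_stat <- (r2 / k) / ((1 - r2) / (n - k - 1))

# With a single regressor, the F-statistic equals the squared t value of the slope.
print(round(adj_r2, 4))
print(round(f_stat, 2))
print(round(t_slope^2, 2))
```

The results match the reported adjusted R-squared and F-statistic up to the rounding of the printed inputs.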

AIC(model.1)

## [1] 51.61663

BIC(model.1)

## [1] 53.07135

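The lower the AIC and BIC, the better the model balances goodness of fit against complexity. These values are only meaningful when compared across candidate models fitted to the same data.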
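
Both criteria are built from the maximised log-likelihood: AIC = -2 logLik + 2k and BIC = -2 logLik + log(n) k, where k counts the estimated parameters (here the intercept, the slope, and the error variance). The sketch below checks these formulas against AIC() and BIC() on simulated data (x_sim and y_sim are invented for illustration, not the GVA data), and then checks that the two values reported above are consistent with each other.

```r
# Simulated data for illustration only; this is NOT the GVA data set.
set.seed(1)
x_sim <- runif(12, min = 80, max = 110)
y_sim <- -20 + 0.3 * x_sim + rnorm(12, sd = 1.5)
fit_sim <- lm(y_sim ~ x_sim)

ll <- as.numeric(logLik(fit_sim))   # maximised log-likelihood
k  <- attr(logLik(fit_sim), "df")   # 3 parameters: intercept, slope, error variance
n  <- nobs(fit_sim)                 # number of observations

aic_hand <- -2 * ll + 2 * k
bic_hand <- -2 * ll + log(n) * k

print(c(aic_hand = aic_hand, AIC = AIC(fit_sim)))
print(c(bic_hand = bic_hand, BIC = BIC(fit_sim)))

# Consistency check on the values reported above for model.1 (n = 12, k = 3):
# BIC - AIC should equal k * (log(n) - 2).
print(53.07135 - 51.61663)
print(3 * (log(12) - 2))
```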

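confint() returns 95% confidence intervals for the parameters. If the interval for the slope does not contain zero, the slope is significantly different from zero at the 5% level.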

confint(model.1)

##                   2.5 %      97.5 %
## (Intercept) -28.5016956 -14.2756203
## x             0.2402859   0.3889174

The effect of x on y is statistically significant and positive.
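
These intervals can be reproduced from the coefficient table as estimate ± t-quantile × standard error, using the t distribution with the 10 residual degrees of freedom. The sketch below uses only the values printed by summary(model.1); the variable names are chosen here for illustration.

```r
# Estimates and standard errors copied from the summary(model.1) output.
est <- c("(Intercept)" = -21.38866, x = 0.31460)
se  <- c("(Intercept)" = 3.19237,   x = 0.03335)
df_resid <- 10

# Two-sided 95% critical value of the t distribution.
t_crit <- qt(0.975, df = df_resid)

ci <- cbind("2.5 %" = est - t_crit * se, "97.5 %" = est + t_crit * se)
print(round(ci, 4))
```

The result agrees with the confint() output up to the rounding of the printed estimates and standard errors.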

Breusch-Pagan heteroscedasticity test. H0: There is no heteroscedasticity. H1: There is heteroscedasticity.

bptest(model.1)

## 
##  studentized Breusch-Pagan test
## 
## data:  model.1
## BP = 0.52669, df = 1, p-value = 0.468

Since the p-value (0.468) is greater than 0.05, we fail to reject H0: the assumption of constant variance is met.
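
The studentized Breusch-Pagan statistic can be obtained by hand as n times the R-squared of an auxiliary regression of the squared residuals on the regressor, compared against a chi-squared distribution with 1 degree of freedom. The sketch below does this on simulated data (x_sim and y_sim are invented for illustration, not the GVA data) and also recomputes the p-value implied by the statistic reported above.

```r
# Simulated data for illustration only; this is NOT the GVA data set.
set.seed(1)
x_sim <- runif(12, min = 80, max = 110)
y_sim <- -20 + 0.3 * x_sim + rnorm(12, sd = 1.5)
fit_sim <- lm(y_sim ~ x_sim)

# Auxiliary regression: squared residuals on the regressor.
res2 <- residuals(fit_sim)^2
aux  <- lm(res2 ~ x_sim)
n    <- length(res2)

bp_hand <- n * summary(aux)$r.squared
p_hand  <- pchisq(bp_hand, df = 1, lower.tail = FALSE)
print(c(BP = bp_hand, p.value = p_hand))

# If lmtest is installed, its statistic can be compared with the hand computation.
if (requireNamespace("lmtest", quietly = TRUE)) {
  print(lmtest::bptest(fit_sim)$statistic)
}

# p-value implied by the statistic reported above (BP = 0.52669, df = 1).
p_reported <- pchisq(0.52669, df = 1, lower.tail = FALSE)
print(round(p_reported, 3))
```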

Jarque-Bera normality test for the regression model residuals. H0: The distribution of residuals is normal. H1: The distribution of residuals is not normal.

jarque.bera.test(model.1$residuals)

## 
##  Jarque Bera Test
## 
## data:  model.1$residuals
## X-squared = 0.37587, df = 2, p-value = 0.8287

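Since the p-value (0.8287) is greater than 0.05, we fail to reject H0, so the residuals are consistent with a normal distribution. You can also use other normality tests, such as Kolmogorov-Smirnov with the function ks.test().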
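
The Jarque-Bera statistic combines the sample skewness S and kurtosis K of the residuals as JB = n/6 × (S² + (K − 3)²/4) and is compared against a chi-squared distribution with 2 degrees of freedom. The sketch below computes it by hand on simulated data (x_sim and y_sim are invented for illustration, not the GVA data), recomputes the p-value implied by the reported statistic, and shows one possible ks.test() call. Passing the estimated mean and standard deviation to ks.test() is an assumption of this sketch, and it makes the resulting p-value only approximate.

```r
# Simulated data for illustration only; this is NOT the GVA data set.
set.seed(1)
x_sim <- runif(12, min = 80, max = 110)
y_sim <- -20 + 0.3 * x_sim + rnorm(12, sd = 1.5)
fit_sim <- lm(y_sim ~ x_sim)

res <- residuals(fit_sim)
n   <- length(res)

# Central sample moments (divisor n).
m2 <- mean((res - mean(res))^2)
m3 <- mean((res - mean(res))^3)
m4 <- mean((res - mean(res))^4)

skew <- m3 / m2^(3 / 2)
kurt <- m4 / m2^2

jb_hand <- n / 6 * (skew^2 + (kurt - 3)^2 / 4)
p_hand  <- pchisq(jb_hand, df = 2, lower.tail = FALSE)
print(c(JB = jb_hand, p.value = p_hand))

# p-value implied by the statistic reported above (X-squared = 0.37587, df = 2).
p_reported <- pchisq(0.37587, df = 2, lower.tail = FALSE)
print(round(p_reported, 4))

# Kolmogorov-Smirnov alternative against a normal distribution whose
# mean and sd are estimated from the residuals (p-value is approximate).
ks <- ks.test(res, "pnorm", mean = mean(res), sd = sd(res))
print(ks$p.value)
```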

So the model that we have found is: y = -21.38866 + 0.31460x + e. All parameters are significant, both jointly (F-test) and individually (t-tests). The R-squared is 0.899 (adjusted R-squared 0.8889), which is close to 1 and means the model represents the data well. There is no evidence of heteroscedasticity, and normality of the residuals is not rejected.

Reference: Piras, Gianfranco, and Giuseppe Arbia. A Primer for Spatial Econometrics: With Applications in R. Palgrave Macmillan, 2021.

An Easy Guidance of The Simple Linear Regression with Study Case

Davin Firmansha

2025-07-27
