## [1] 35 6
The data sample has 35 rows and 6 columns.
## Rows: 35
## Columns: 6
## $ Firm <chr> "NEDBANK", "NEDBANK", "NEDBANK", "NEDBANK", "NEDBANK", "RES", "RE…
## $ Year <dbl> 2019, 2020, 2021, 2022, 2023, 2019, 2020, 2021, 2022, 2023, 2019,…
## $ ROE <dbl> 14.00000000, 9.70000000, 14.40000000, 16.00000000, 17.00000000, 1…
## $ SDTA <dbl> 81.7106734, 87.1510726, 80.2505882, 81.7817823, 78.3318297, 16.15…
## $ LDTA <dbl> 1.9053492, 0.3492974, 4.2922815, 2.1881426, 5.3314670, 18.2492308…
## $ DTE <dbl> 510.3524038, 700.0236787, 546.9506171, 523.8273926, 512.1186051, …
The sample data has five numeric variables (Year and the four financial ratios ROE, SDTA, LDTA and DTE) and one character variable (Firm).
library(stats)
shapiro_test_ROE <- shapiro.test(X7_Listed_Companies_ESE$ROE)
print(shapiro_test_ROE)
##
## Shapiro-Wilk normality test
##
## data: X7_Listed_Companies_ESE$ROE
## W = 0.96939, p-value = 0.4268
The p-value = 0.4268 is greater than the significance level of 0.05, so the null hypothesis of normality cannot be rejected. This indicates that ROE is consistent with a normal distribution.
##
## Shapiro-Wilk normality test
##
## data: X7_Listed_Companies_ESE$SDTA
## W = 0.78096, p-value = 8.827e-06
The p-value = 8.827e-06 is less than the significance level of 0.05, so the null hypothesis of normality is rejected. This indicates that SDTA is not normally distributed.
##
## Shapiro-Wilk normality test
##
## data: X7_Listed_Companies_ESE$LDTA
## W = 0.77905, p-value = 8.133e-06
The p-value = 8.133e-06 is less than the significance level of 0.05, so the null hypothesis of normality is rejected. This indicates that LDTA is not normally distributed.
##
## Shapiro-Wilk normality test
##
## data: X7_Listed_Companies_ESE$DTE
## W = 0.73732, p-value = 1.492e-06
The p-value = 1.492e-06 is less than the significance level of 0.05, so the null hypothesis of normality is rejected. This indicates that DTE is not normally distributed.
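Only the ROE call is shown above; the other three tests presumably follow the same pattern. A minimal sketch of running the Shapiro-Wilk test on all four ratios in one pass is given below. The data frame `df` is a hypothetical stand-in filled with simulated values, not the real `X7_Listed_Companies_ESE` data, so the numbers it prints will not match the output above.

```r
# Hypothetical stand-in data (simulated); in the real analysis,
# replace `df` with X7_Listed_Companies_ESE.
set.seed(1)
df <- data.frame(
  ROE  = rnorm(35, mean = 14, sd = 3),
  SDTA = rexp(35, rate = 1 / 40),
  LDTA = rexp(35, rate = 1 / 5),
  DTE  = rexp(35, rate = 1 / 300)
)

vars <- c("ROE", "SDTA", "LDTA", "DTE")

# Run shapiro.test() on each column and keep the W statistic and p-value.
normality <- t(sapply(df[vars], function(x) {
  res <- shapiro.test(x)
  c(W = unname(res$statistic), p_value = res$p.value)
}))

# One row per variable; a p-value below 0.05 rejects normality.
print(normality)
```

Collecting the results in one matrix makes it easier to compare the variables side by side than reading four separate printouts.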
library(car)
model <- lm(ROE ~ SDTA + LDTA + DTE, data = X7_Listed_Companies_ESE)
vif_results <- vif(model)
print(vif_results)
##      SDTA      LDTA       DTE
##  9.496595  1.450525 10.573011
The VIF of 9.496595 for SDTA indicates high multicollinearity with the other independent variables, just below the common rule-of-thumb threshold of 10. The VIF of 1.450525 for LDTA indicates that it is only weakly correlated with the other independent variables. The VIF of 10.573011 for DTE exceeds 10, which indicates very high multicollinearity with the other independent variables, so DTE might need to be removed from the model.
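To make the VIF figures easier to interpret, it may help to recall that the VIF of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on the remaining predictors. The sketch below computes it by hand in base R. The data are simulated, and DTE is deliberately constructed to be correlated with SDTA for illustration; none of the values come from the real data set.

```r
# Simulated stand-in data; DTE is built from SDTA on purpose so that
# the two predictors are collinear (an assumption for illustration only).
set.seed(2)
n    <- 35
SDTA <- rnorm(n, mean = 50, sd = 20)
LDTA <- rnorm(n, mean = 10, sd = 5)
DTE  <- 5 * SDTA + rnorm(n, mean = 0, sd = 30)
ROE  <- 10 + 0.05 * SDTA + 0.1 * LDTA + rnorm(n)
df   <- data.frame(ROE, SDTA, LDTA, DTE)

predictors <- c("SDTA", "LDTA", "DTE")

# For each predictor, regress it on the others and apply 1 / (1 - R^2).
manual_vif <- sapply(predictors, function(v) {
  others <- setdiff(predictors, v)
  aux    <- lm(reformulate(others, response = v), data = df)
  1 / (1 - summary(aux)$r.squared)
})

print(manual_vif)
```

Read this way, a VIF near 10 means roughly 90% of that predictor's variance is explained by the other predictors.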
library(lmtest)
lm_auto <- lm(ROE ~ SDTA + LDTA + DTE, data = X7_Listed_Companies_ESE)
dw_test <- dwtest(lm_auto)
print(dw_test)
##
## Durbin-Watson test
##
## data: lm_auto
## DW = 1.1955, p-value = 0.001504
## alternative hypothesis: true autocorrelation is greater than 0
The DW statistic of 1.1955 is well below 2, and the p-value of 0.001504 is less than 0.05, so the null hypothesis of no autocorrelation is rejected in favour of positive autocorrelation. This implies that a positive error in one observation tends to be followed by a positive error in the next. This calls for a correction such as the Generalized Least Squares procedure.
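For readers who want to see where the DW statistic comes from, the sketch below computes it directly from the residuals of a fitted model: the sum of squared successive differences of the residuals divided by the sum of squared residuals. The data are simulated with AR(1) errors, which is an assumption made purely for illustration, and the variable names are hypothetical.

```r
# Simulated regression with AR(1) errors (illustrative assumption).
set.seed(3)
n <- 35
x <- rnorm(n)
e <- as.numeric(arima.sim(list(ar = 0.6), n = n))
y <- 1 + 2 * x + e

fit <- lm(y ~ x)
res <- residuals(fit)

# Durbin-Watson statistic: values near 2 suggest no first-order
# autocorrelation, values towards 0 suggest positive autocorrelation.
dw <- sum(diff(res)^2) / sum(res^2)

# Rough implied first-order autocorrelation, using DW ~ 2 * (1 - rho).
rho_hat <- 1 - dw / 2

print(c(DW = dw, rho_hat = rho_hat))
```

Under the same approximation, the reported DW of 1.1955 corresponds to a first-order residual autocorrelation of roughly 0.4.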
##
## studentized Breusch-Pagan test
##
## data: lm_model
## BP = 4.6498, df = 3, p-value = 0.1993
The p-value of 0.1993 is greater than 0.05, so there is no evidence of heteroscedasticity. The variance of the errors/residuals can be treated as constant across all levels of the independent variables. In this respect the results of our model are more robust, that is, less sensitive to changes in the model or data.
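The call that produced the Breusch-Pagan output is not shown above; output of this shape is what `bptest()` from the `lmtest` package prints for a fitted model. As a rough illustration of what the studentized version of the test does, the sketch below reproduces the calculation in base R: regress the squared residuals on the predictors and use n times the R² of that auxiliary regression as a chi-squared statistic. The data frame is simulated with constant error variance and is only a stand-in for the real data.

```r
# Simulated stand-in data with homoscedastic errors (illustrative assumption).
set.seed(4)
n  <- 35
df <- data.frame(
  SDTA = rnorm(n, mean = 50, sd = 20),
  LDTA = rnorm(n, mean = 10, sd = 5),
  DTE  = rnorm(n, mean = 300, sd = 100)
)
df$ROE <- 10 + 0.05 * df$SDTA + 0.1 * df$LDTA + 0.01 * df$DTE + rnorm(n)

lm_model <- lm(ROE ~ SDTA + LDTA + DTE, data = df)

# Auxiliary regression of the squared residuals on the same predictors.
df$res_sq <- residuals(lm_model)^2
aux <- lm(res_sq ~ SDTA + LDTA + DTE, data = df)

# Studentized (Koenker) statistic: n * R^2, chi-squared with one degree
# of freedom per predictor.
bp_stat <- n * summary(aux)$r.squared
bp_df   <- 3
bp_p    <- pchisq(bp_stat, df = bp_df, lower.tail = FALSE)

print(c(BP = bp_stat, df = bp_df, p_value = bp_p))
```

A large p-value here means the squared residuals are not explained by the predictors, which is what constant error variance looks like.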