Loading the necessary packages
library(readxl)
## Warning: package 'readxl' was built under R version 4.4.3
eyecare_data <- read_excel("Exhibits.xlsx", sheet = "Exhibit4")
head(eyecare_data)
summary(eyecare_data)
##       Year      Paying_Screening Paying_Surgery  Free_Screening
##  Min.   :1976   Min.   :     0   Min.   :  248   Min.   :     0
##  1st Qu.:1980   1st Qu.: 28422   1st Qu.: 2286   1st Qu.: 60846
##  Median :1984   Median : 62980   Median : 5342   Median :102323
##  Mean   :1984   Mean   : 91996   Mean   : 7206   Mean   :136315
##  3rd Qu.:1987   3rd Qu.:136940   3rd Qu.:10654   3rd Qu.:193345
##  Max.   :1991   Max.   :241643   Max.   :19511   Max.   :338407
##   Free_Surgery   Total_Screening  Total_Surgery
##  Min.   :    0   Min.   :     0   Min.   :  248
##  1st Qu.: 4678   1st Qu.: 89268   1st Qu.: 6964
##  Median :11587   Median :165303   Median :16930
##  Mean   :13776   Mean   :228311   Mean   :20981
##  3rd Qu.:22080   3rd Qu.:330285   3rd Qu.:32734
##  Max.   :31979   Max.   :569335   Max.   :51490
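We can see that the column ‘Year’ is read as a numeric column (read_excel imports numbers as double rather than integer). Since we are estimating a trend over the years, we keep it numeric instead of converting it to a factor, and cast it to integer before fitting the models.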
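One way to confirm how a column is stored is to inspect its type directly. The snippet below is a minimal sketch that uses a small made-up data frame (`toy`, with illustrative values that are not taken from Exhibit 4) as a stand-in for `eyecare_data`.

```r
# Hypothetical stand-in for eyecare_data; the values are made up for illustration.
toy <- data.frame(
  Year = c(1976, 1977, 1978),        # numeric literals are stored as double
  Total_Surgery = c(100, 200, 300)
)

# Report the storage type of every column.
col_types <- sapply(toy, typeof)
print(col_types)

# Cast Year to integer, mirroring the conversion applied before modelling.
toy$Year <- as.integer(toy$Year)
print(typeof(toy$Year))
```

The same `sapply(..., typeof)` call can be run on `eyecare_data` itself to see what `read_excel` produced.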
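Next, we restrict the data to the years 1977 to 1991. The only year dropped is 1976, where screening counts of 0 (see the minimums in the summary above) would make the log transformation undefined.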
eyecare_filtered <- subset(eyecare_data, Year > 1976 & Year < 1992)
head(eyecare_filtered)
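A quick way to check for values that cannot be log-transformed is to flag non-positive entries before fitting. This is a minimal sketch on a made-up data frame (`toy`, with illustrative values); the column names follow the dataset, but the numbers are assumptions.

```r
# The log of a zero count is not finite.
print(log(0))
print(is.finite(log(0)))

# Hypothetical stand-in for the raw data; the values are made up for illustration.
toy <- data.frame(
  Year = c(1976, 1977, 1978),
  Free_Screening = c(0, 500, 800),
  Paying_Screening = c(0, 300, 450)
)

# Flag rows where a column that will be logged is zero or negative.
bad_rows <- toy$Free_Screening <= 0 | toy$Paying_Screening <= 0
print(toy$Year[bad_rows])

# Keep only the rows that are safe to log-transform.
toy_ok <- toy[!bad_rows, ]
print(nrow(toy_ok))
```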
Model with Log of Year
eyecare_filtered$Year <- as.integer(eyecare_filtered$Year)
reg_model2 <- lm(log(Total_Surgery) ~ log(Free_Screening) + log(Paying_Screening) + log(Year), data = eyecare_filtered)
summary(reg_model2)
##
## Call:
## lm(formula = log(Total_Surgery) ~ log(Free_Screening) + log(Paying_Screening) +
##     log(Year), data = eyecare_filtered)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.19872 -0.07347  0.02328  0.09543  0.14762
##
## Coefficients:
##                         Estimate Std. Error t value Pr(>|t|)
## (Intercept)           2363.30737 1043.67414   2.264  0.04474 *
## log(Free_Screening)      0.52716    0.05779   9.122 1.84e-06 ***
## log(Paying_Screening)    1.26652    0.32976   3.841  0.00274 **
## log(Year)             -312.63997  137.92618  -2.267  0.04456 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1316 on 11 degrees of freedom
## Multiple R-squared: 0.99, Adjusted R-squared: 0.9872
## F-statistic: 362.3 on 3 and 11 DF, p-value: 2.841e-11
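Because both the response and the screening predictors are logged, their coefficients can be read as elasticities: scaling a predictor by a factor k scales the predicted Total_Surgery by k^b, holding the other predictors fixed. The sketch below applies this to the estimates printed above; the 10% increase is a hypothetical scenario chosen only for illustration.

```r
# Coefficients copied from the summary(reg_model2) output above.
b_free   <- 0.52716
b_paying <- 1.26652

# Hypothetical scenario: screenings increase by 10%.
k <- 1.10

# In a log-log model the predicted response is multiplied by k^b.
effect_free   <- k^b_free - 1
effect_paying <- k^b_paying - 1

# Implied percentage change in predicted Total_Surgery for each predictor.
print(round(100 * c(free = effect_free, paying = effect_paying), 1))
```

These are associations within the fitted model, not causal effects, and they hold only with the other terms kept constant.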
Model with Linear Year
reg_model3 <- lm(log(Total_Surgery) ~ log(Free_Screening) + log(Paying_Screening) + Year, data = eyecare_filtered)
summary(reg_model3)
##
## Call:
## lm(formula = log(Total_Surgery) ~ log(Free_Screening) + log(Paying_Screening) +
##     Year, data = eyecare_filtered)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.19885 -0.07305  0.02276  0.09559  0.14721
##
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)
## (Intercept)           301.86816  133.32752   2.264  0.04477 *
## log(Free_Screening)     0.52653    0.05766   9.132 1.82e-06 ***
## log(Paying_Screening)   1.26662    0.32765   3.866  0.00263 **
## Year                   -0.15745    0.06899  -2.282  0.04337 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1313 on 11 degrees of freedom
## Multiple R-squared: 0.99, Adjusted R-squared: 0.9873
## F-statistic: 363.9 on 3 and 11 DF, p-value: 2.774e-11
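The two models fit almost identically, and a likely reason is that Year varies by less than 1% over 1977 to 1991, so log(Year) is nearly a linear function of Year on this range. Since the derivative of log(Year) is 1/Year, the log(Year) coefficient should be roughly the linear Year coefficient multiplied by a typical year. The sketch below checks this with the printed estimates, using the sample midpoint 1984 as the assumed "typical" year.

```r
# Coefficients copied from the two summaries above.
b_year_linear <- -0.15745    # Year term in reg_model3
b_year_log    <- -312.63997  # log(Year) term in reg_model2

# Assumed reference point: the midpoint of 1977-1991.
mid_year <- (1977 + 1991) / 2

# d log(Year) / d Year = 1 / Year, so b_log is approximately b_linear * Year.
implied_log_coef <- b_year_linear * mid_year
print(implied_log_coef)
print(b_year_log)

# With a logged response, a linear Year coefficient implies a constant
# percentage change per year, holding the screening terms fixed.
pct_per_year <- 100 * (exp(b_year_linear) - 1)
print(round(pct_per_year, 1))
```

On this sample the choice between log(Year) and Year therefore changes the scale of the coefficient far more than it changes the fit, and the linear version is the easier one to interpret.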