Introduction

This report demonstrates the simulation of normally and uniformly distributed variables and the fitting of a multiple linear regression model in R.

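set.seed(123)                     # fix the RNG state so this run is reproducible
n <- 43
x <- rnorm(n, mean = 3, sd = 1)   # predictor x ~ Normal(3, 1)
z <- runif(n, min = 3, max = 9)   # predictor z ~ Uniform(3, 9)
error <- rnorm(n)                 # standard normal noise
beta <- c(2.3, 2.9, 1.6)          # true intercept and slopes for x and z
y <- beta[1] + beta[2]*x + beta[3]*z + error
model <- lm(y ~ x + z)
summary(model)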
## 
## Call:
## lm(formula = y ~ x + z)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.34052 -0.59054 -0.01095  0.44613  1.73442 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.88535    0.56042   3.364   0.0017 ** 
## x            3.03422    0.12231  24.808   <2e-16 ***
## z            1.59628    0.06064  26.325   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7034 on 40 degrees of freedom
## Multiple R-squared:  0.9666, Adjusted R-squared:  0.965 
## F-statistic: 579.2 on 2 and 40 DF,  p-value: < 2.2e-16

Data Simulation

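n <- 43

# No new seed is set here, so these draws differ from the seeded run above
x <- rnorm(n, mean = 3, sd = 1)   # predictor x ~ Normal(3, 1)
z <- runif(n, min = 3, max = 9)   # predictor z ~ Uniform(3, 9)
error <- rnorm(n)                 # standard normal noise

beta <- c(2.3, 2.9, 1.6)          # true intercept and slopes for x and z

y <- beta[1] + beta[2]*x + beta[3]*z + error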
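
The simulation steps above can be wrapped in a small function so that the sample size, true coefficients, and seed are explicit. This is only a sketch: `simulate_data` is a hypothetical helper that does not appear in the original report, and the default seed of 123 is an assumption borrowed from the seeded run in the Introduction.

```r
# Hypothetical helper (not part of the original report): bundles the
# simulation steps so every input that affects the data is an argument.
simulate_data <- function(n = 43, beta = c(2.3, 2.9, 1.6), seed = 123) {
  set.seed(seed)                       # assumed seed; change as needed
  x <- rnorm(n, mean = 3, sd = 1)      # predictor x ~ Normal(3, 1)
  z <- runif(n, min = 3, max = 9)      # predictor z ~ Uniform(3, 9)
  error <- rnorm(n)                    # standard normal noise
  y <- beta[1] + beta[2] * x + beta[3] * z + error
  data.frame(y = y, x = x, z = z)
}

sim <- simulate_data()
str(sim)
```

Calling the function twice with the same seed returns identical data, which makes it easier to tell whether a change in the fitted coefficients comes from the code or from a fresh random draw.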

Model Fitting

model <- lm(y ~ x + z)
summary(model)
## 
## Call:
## lm(formula = y ~ x + z)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.2943 -0.6639 -0.1667  0.4958  2.1406 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  3.45124    0.82203   4.198 0.000146 ***
## x            2.63064    0.15636  16.825  < 2e-16 ***
## z            1.55380    0.08547  18.179  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9248 on 40 degrees of freedom
## Multiple R-squared:  0.9212, Adjusted R-squared:  0.9173 
## F-statistic: 233.8 on 2 and 40 DF,  p-value: < 2.2e-16
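
Because the data are simulated, the estimates can be checked against the true coefficients. The sketch below regenerates data under an assumed seed (so its numbers need not match the output above) and uses `confint()` to test whether each 95% confidence interval contains the true value. The names `fit` and `comparison` are illustrative only.

```r
set.seed(123)                          # assumed seed, for illustration only
n <- 43
beta <- c(2.3, 2.9, 1.6)               # true intercept and slopes
x <- rnorm(n, mean = 3, sd = 1)
z <- runif(n, min = 3, max = 9)
y <- beta[1] + beta[2] * x + beta[3] * z + rnorm(n)
fit <- lm(y ~ x + z)

# 95% confidence intervals for the intercept and both slopes
ci <- confint(fit, level = 0.95)

# Put the true values next to the estimates and their intervals
comparison <- data.frame(
  true     = beta,
  estimate = unname(coef(fit)),
  lower    = ci[, 1],
  upper    = ci[, 2],
  row.names = names(coef(fit))
)
comparison$covers_true <- comparison$lower <= beta & beta <= comparison$upper
print(comparison)
```

Applied by hand to the output above, the estimate for x plus or minus about 2.02 standard errors (the 97.5% t quantile on 40 degrees of freedom) gives roughly 2.31 to 2.95, which contains the true value of 2.9.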

Diagnostic Plots

plot(model)
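
The plots drawn by `plot(model)` are read by eye. As a complement, a few numeric checks can be computed from the same fitted object. The sketch below refits the model on data regenerated under an assumed seed and reports a Shapiro-Wilk test on the residuals and the largest Cook's distance. The choice of these two summaries is an assumption made here, not something the original report uses.

```r
set.seed(123)                          # assumed seed, for illustration only
n <- 43
x <- rnorm(n, mean = 3, sd = 1)
z <- runif(n, min = 3, max = 9)
y <- 2.3 + 2.9 * x + 1.6 * z + rnorm(n)
fit <- lm(y ~ x + z)

res <- residuals(fit)

# Normality of residuals: a small p-value would suggest non-normal errors
sw <- shapiro.test(res)

# Influence: unusually large Cook's distances flag influential observations
cd <- cooks.distance(fit)

cat("Mean residual:          ", format(mean(res), digits = 3), "\n")
cat("Shapiro-Wilk p-value:   ", format(sw$p.value, digits = 3), "\n")
cat("Largest Cook's distance:", format(max(cd), digits = 3), "\n")
```

The mean residual is zero up to rounding error for any least-squares fit that includes an intercept, so it serves as a sanity check rather than a diagnostic.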

Interpretation

The multiple linear regression model has an R-squared value of 0.9212, meaning that approximately 92.12% of the variability in the response variable (y) is explained by the predictors x and z.

The adjusted R-squared value of 0.9173 confirms that the model maintains strong explanatory power even after accounting for the number of predictors included.

This indicates that the model fits the simulated data closely, which is expected because y was generated as a linear function of x and z plus standard normal noise.
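
The two statistics discussed above can be reproduced from their definitions: R-squared is one minus the ratio of the residual to the total sum of squares, and the adjusted version rescales that ratio by the degrees of freedom. The sketch below does this on data regenerated under an assumed seed, so its values need not equal those in the report. The names `rss`, `tss`, `r2`, and `adj_r2` are illustrative.

```r
set.seed(123)                          # assumed seed, for illustration only
n <- 43
x <- rnorm(n, mean = 3, sd = 1)
z <- runif(n, min = 3, max = 9)
y <- 2.3 + 2.9 * x + 1.6 * z + rnorm(n)
fit <- lm(y ~ x + z)

p <- 2                                 # number of predictors (x and z)
rss <- sum(residuals(fit)^2)           # residual sum of squares
tss <- sum((y - mean(y))^2)            # total sum of squares

r2 <- 1 - rss / tss
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Compare the hand-computed values with those stored by summary()
print(c(manual = r2, summary = summary(fit)$r.squared))
print(c(manual = adj_r2, summary = summary(fit)$adj.r.squared))
```

With the reported R-squared of 0.9212, n = 43, and two predictors, the same formula gives 1 - 0.0788 × 42/40 ≈ 0.9173, which agrees with the adjusted R-squared in the output.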

Coefficient Interpretation

The estimated regression equation is: ŷ = 3.451 + 2.631x + 1.554z

Holding z constant, a one-unit increase in x is associated with an increase in the predicted value of y of approximately 2.63 units. Similarly, holding x constant, a one-unit increase in z is associated with an increase of approximately 1.55 units.
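
The "holding z constant" reading can be checked directly with `predict()`: two inputs that differ by one unit in x and share the same z should produce predictions that differ by exactly the estimated coefficient of x. The sketch below uses data regenerated under an assumed seed, and the evaluation points (x = 3 and 4, z = 6) are arbitrary choices made for illustration.

```r
set.seed(123)                          # assumed seed, for illustration only
n <- 43
x <- rnorm(n, mean = 3, sd = 1)
z <- runif(n, min = 3, max = 9)
y <- 2.3 + 2.9 * x + 1.6 * z + rnorm(n)
fit <- lm(y ~ x + z)

# Two hypothetical observations: same z, x one unit apart
base   <- data.frame(x = 3, z = 6)
bumped <- data.frame(x = 4, z = 6)

delta <- unname(predict(fit, newdata = bumped) - predict(fit, newdata = base))

# The change in the prediction equals the fitted slope for x
print(c(prediction_change = delta, slope_x = unname(coef(fit)["x"])))
```

The same check works for z by holding x fixed and stepping z by one unit. Because the model has no interaction term, the result does not depend on which baseline values are chosen.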

Both predictors are highly statistically significant (p-values < 0.001), indicating strong evidence that x and z each contribute to explaining the response variable.