\[ Y = \beta_0 + \beta_1 X + \varepsilon \]
This equation models execution time \(Y\) as a linear function of CPU speed \(X\), with intercept \(\beta_0\), slope \(\beta_1\), and random error \(\varepsilon\).
We generate synthetic CPU speed and execution time data for demonstration.
##   cpu_speed execution_time
## 1  2.575155       77.43146
## 2  3.576610       50.95705
## 3  2.817954       53.80784
## 4  3.766035       48.18608
## 5  3.880935       40.01735
## 6  2.091113       72.83862
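The generating code is not shown here, so the following is only a sketch of how such a data frame could be built. The seed, the speed range, the noise level, and the coefficients of the underlying line are all assumptions; only the sample size of 30 follows from the source (28 residual degrees of freedom plus two estimated coefficients). It is not expected to reproduce the exact values printed above.

```r
# Sketch only: seed, ranges, noise level, and true coefficients are assumptions.
set.seed(123)                                   # assumed seed, for reproducibility
n <- 30                                         # 28 residual df + 2 coefficients
cpu_speed <- runif(n, min = 2, max = 4)         # assumed range of CPU speeds
noise <- rnorm(n, mean = 0, sd = 5)             # assumed noise level
execution_time <- 130 - 22 * cpu_speed + noise  # assumed underlying line
df <- data.frame(cpu_speed, execution_time)
head(df)                                        # first six rows
```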
These formulas show how the slope and intercept are estimated using least squares.
\[ \hat{\beta}_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} \]
\[ \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \]
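As a quick check, the two estimators can be computed directly and compared with the coefficients returned by lm(). The data below are illustrative only (the seed and generating values are assumptions), and the variable names are hypothetical.

```r
# Illustrative data; seed and generating values are assumptions.
set.seed(1)
x <- runif(30, min = 2, max = 4)
y <- 130 - 22 * x + rnorm(30, sd = 5)

# Slope: sum of cross-deviations divided by sum of squared deviations of x
beta1_hat <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
# Intercept: makes the fitted line pass through (mean(x), mean(y))
beta0_hat <- mean(y) - beta1_hat * mean(x)

manual  <- c(beta0_hat, beta1_hat)
lm_coef <- unname(coef(lm(y ~ x)))   # (Intercept), then slope
all.equal(manual, lm_coef)           # agreement up to floating-point tolerance
```

The closed-form formulas and lm() should agree to numerical precision, since lm() solves the same least-squares problem by a different numerical route (a QR decomposition).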
The following R code fits a simple linear regression model using the lm() function.
model <- lm(execution_time ~ cpu_speed, data = df)
model
##
## Call:
## lm(formula = execution_time ~ cpu_speed, data = df)
##
## Coefficients:
## (Intercept)    cpu_speed
##      127.43       -22.32
The summary of the fitted model adds residual quantiles, standard errors, significance tests, and goodness-of-fit statistics.
##
## Call:
## lm(formula = execution_time ~ cpu_speed, data = df)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -10.7147  -3.6014   0.0977   3.5281   9.4061
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  127.433      5.030   25.34  < 2e-16 ***
## cpu_speed    -22.325      1.574  -14.19  2.6e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.938 on 28 degrees of freedom
## Multiple R-squared:  0.8779, Adjusted R-squared:  0.8735
## F-statistic: 201.3 on 1 and 28 DF,  p-value: 2.603e-14
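The numbers in this output can also be pulled out programmatically, which makes it easy to see how they relate: each t value is the estimate divided by its standard error, and in a simple regression R-squared equals the squared correlation between the two variables. The sketch below rebuilds an illustrative data frame first (the seed and generating values are assumptions, so the figures will differ from those printed above).

```r
# Illustrative data; seed and generating values are assumptions.
set.seed(123)
cpu_speed <- runif(30, min = 2, max = 4)
execution_time <- 130 - 22 * cpu_speed + rnorm(30, sd = 5)
df <- data.frame(cpu_speed, execution_time)

model <- lm(execution_time ~ cpu_speed, data = df)
s <- summary(model)

coefs <- coef(s)   # matrix: Estimate, Std. Error, t value, Pr(>|t|)
t_manual <- coefs[, "Estimate"] / coefs[, "Std. Error"]   # t = estimate / SE

r2 <- s$r.squared                                         # Multiple R-squared
r2_manual <- cor(df$cpu_speed, df$execution_time)^2       # squared correlation

print(coefs)
print(c(r2 = r2, r2_manual = r2_manual))
```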
We add temperature as a third variable to illustrate how execution time depends on both CPU speed and temperature, which enables a 3D visualization on the next slide.