2024-09-22

Simple Linear Regression (SLR)

Simple linear regression models the relationship between one independent variable and one dependent variable. The two variables are usually displayed as x and y coordinates in a Cartesian coordinate system. Our task is to predict the value of the dependent variable from the value of the independent variable. The “simple” in SLR indicates that the outcome variable is related to just one predictor.

The simple linear regression line is \(\hat{y} = \hat{a} + \hat{\beta}x\). In this formula, \(\hat{a}\) (“a” hat) is the estimated y-intercept and \(\hat{\beta}\) (“beta” hat) is the estimated slope of the line.

The Simple Linear Regression equation is:

\[ \hat{y}=\hat{a}+\hat{\beta}x \]

Berlin COVID-19 Simple Linear Regression Formulation

We will apply the simple linear regression formula to COVID-19 cases in Berlin. \(\hat{\beta}_0\) is the y-intercept of the regression line (the \(\hat{a}\) above). \(\hat{\beta}_1\) is the slope, which sets how steeply the regression line rises or falls (the \(\hat{\beta}\) above).

\[ \begin{aligned} \hat{\beta}_{0} &= \bar{y}-\hat{\beta}_{1}\bar{x} \\ \hat{\beta}_{1} &= \frac{\sum (x_{i}-\bar{x})(y_{i}-\bar{y})}{\sum (x_{i}-\bar{x})^2} \end{aligned} \]
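As a minimal sketch of how these two estimates can be computed, the R snippet below applies the slope and intercept formulas by hand and cross-checks them against lm(). The x and y values are hypothetical toy numbers, not the Berlin data, and the names x_bar, y_bar, beta0 and beta1 are only illustrative.

```r
# Hypothetical toy data, NOT the Berlin data set.
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

x_bar <- mean(x)
y_bar <- mean(y)

# Slope: sum of cross-deviations divided by the sum of squared x-deviations.
beta1 <- sum((x - x_bar) * (y - y_bar)) / sum((x - x_bar)^2)

# Intercept: the fitted line passes through the point (x_bar, y_bar).
beta0 <- y_bar - beta1 * x_bar

# Cross-check against R's built-in least-squares fit.
fit <- lm(y ~ x)
print(c(beta0 = beta0, beta1 = beta1))
print(coef(fit))
```

On this toy data the hand-computed values and the coefficients reported by lm() should agree up to floating-point error.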

Berlin New Cases vs Ongoing Cases

Summary of Berlin New Cases vs Ongoing Cases

The graph compares New Cases of COVID-19 with Ongoing Cases of COVID-19. Our dependent variable is Ongoing Cases, and our independent variable is New Cases. The regression line has a slight upward slope, so New Cases tend to be higher where Ongoing Cases are higher. New Cases peaked at about 500,000 Ongoing Cases. There is an outlier at about 800,000 Ongoing Cases and 12,000 New Cases. Most of the observations are clustered at the lower end of the graph, so low values of both variables dominate the data set.
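A plot of this kind can be sketched in base R as follows. This is only an illustration under stated assumptions: berlin_toy is a hypothetical stand-in data frame with made-up values, its column names cases and new.cases are borrowed from the berlin data frame used later in this post, and the axis assignment follows the variable roles described above rather than the original figure.

```r
# Hypothetical stand-in for the berlin data frame; all values are made up.
berlin_toy <- data.frame(
  new.cases = c(100, 250, 400, 800, 1500, 3000),
  cases     = c(20000, 45000, 60000, 150000, 210000, 480000)
)

# Ongoing Cases (cases) is the dependent variable,
# New Cases (new.cases) is the independent variable.
fit <- lm(cases ~ new.cases, data = berlin_toy)

# Scatter plot with the fitted regression line drawn on top.
plot(berlin_toy$new.cases, berlin_toy$cases,
     xlab = "New Cases", ylab = "Ongoing Cases",
     main = "New Cases vs Ongoing Cases (toy data)")
abline(fit, col = "red")

# A positive slope corresponds to an upward-sloping regression line.
slope <- unname(coef(fit)["new.cases"])
print(slope)
```

The sign of slope is what the summary reads off the regression line: positive for an upward slope, negative for a downward one.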

Berlin 7 Day Incidence vs Ongoing Cases

Summary of Berlin 7 Day Incidence vs Ongoing Cases

The graph compares the 7 Day Incidence with Ongoing Cases, that is, cases still testing positive for COVID-19. The point cloud looks regular, with a bell curve shape. Ongoing Cases is our dependent variable, and 7 Day Incidence is our independent variable. The regression line has a slight negative slope: 7 Day Incidence decreases as Ongoing Cases increase. The 7 Day Incidence stays low at the larger values of Ongoing Cases. We had fewer Ongoing Cases after 7 days of testing positive for COVID-19.

Berlin 3D Plot of Cases vs New Cases vs 7 Day Incidence

Berlin 3D R Code of Plotly Plot in Cases vs New Cases vs 7 Day Incidence

The scatter3D function, which judging by its arguments comes from the plot3D package, was very functional and easy to use.

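library(plot3D)  # assumption: scatter3D is provided by the plot3D package
scatter3D(x = berlin$cases, y = berlin$new.cases, z = berlin$X7_days_incidence,
          main = "3D Plot with Cases vs New Cases vs 7 Day Incidence",  # main sets the title
          bty = "b2", pch = 20, cex = 2, ticktype = "detailed",
          xlab = "Ongoing Cases", ylab = "New Cases", zlab = "7 Day Incidence",
          phi = 10, theta = 30)

*The $ after berlin is R's column selector (berlin$cases is the cases column of the berlin data frame). It must stay a $: replacing it with the number symbol # would turn the rest of the line into a comment.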

Summary of Berlin Cases vs New Cases vs 7 Day Incidence

The data form a 3D bell shape, composed mostly of low 7 Day Incidence values spread across a large range of Ongoing Cases and New Cases. The graph lets us view New and Ongoing Cases as the dependent variables and 7 Day Incidence as the independent variable. With the help of the legend we can read off the 7 Day Incidence: larger values have a brighter color, while lower values have a darker color.