Linear Regression: What Is It?

Linear regression is a statistical method used to model the relationship between a dependent variable and one or more independent variables, assuming a linear relationship. It helps predict the value of a dependent variable based on the values of the independent variables. Essentially, it finds the best-fit line or plane that represents the data, allowing for predictions and understanding of how independent variables influence the dependent variable.

This is what the formula looks like: \[ y = \beta_0 + \beta_1 x + \varepsilon \] Here \(\beta_0\) is the intercept, \(\beta_1\) is the slope, and \(\varepsilon\) is the error term.

Dataset: diamond from the UsingR package

The diamond data set contains two columns: the carat and the price of each diamond.

carat is the weight of the diamond

price is the price of the diamond

We will use linear regression to find out how carat influences price.

##   carat price
## 1  0.17   355
## 2  0.16   328
## 3  0.17   350
## 4  0.18   325
## 5  0.25   642
## 6  0.16   342
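
A minimal sketch of how the preview above could be produced, assuming the UsingR package is installed. `data()` with the `package` argument loads the data set without attaching the whole package; the original slides may have used `library(UsingR)` instead.

```r
# Assumption: UsingR is installed (install.packages("UsingR") once if needed).
# Load only the diamond data set from the package.
data("diamond", package = "UsingR")

# Inspect the structure: two numeric columns, carat and price.
str(diamond)

# Show the first six rows, as in the preview above.
head(diamond)
```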

Plotly for Diamond Price and Carat relationship

An interactive plot where you can inspect each point individually and see whether it falls above or below the regression line.
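
One way such an interactive plot could be built is sketched below. It assumes the plotly and UsingR packages are installed; the trace names and the `fitted` column are illustrative choices, not taken from the original slides.

```r
library(plotly)

# Assumption: UsingR is installed; load only the diamond data set.
data("diamond", package = "UsingR")

# Fit the simple linear model and store the fitted prices alongside the data.
fit <- lm(price ~ carat, data = diamond)
diamond$fitted <- fitted(fit)

# Markers for the observed diamonds, a line for the fitted values.
p <- plot_ly(diamond, x = ~carat) %>%
  add_markers(y = ~price, name = "Diamonds") %>%
  add_lines(y = ~fitted, name = "Regression line")

p
```

Hovering over a marker shows its carat and price, which makes it easy to compare a single diamond with the fitted line.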

ggplot Scatterplot for Diamond Price and Carat relationship

This shows the price of a diamond based on its weight in carats. We can see that the higher the weight, the higher the price.

ggplot Scatterplot with a Linear Regression Line

Adding a linear regression line on top of the scatterplot we already have.
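
A sketch of how the scatterplot and regression line could be drawn, assuming ggplot2 (3.4.0 or later) and UsingR are installed. The axis labels are illustrative. Line thickness is set with `linewidth`, which ggplot2 3.4.0 and later expect in place of `size` for lines.

```r
library(ggplot2)

# Assumption: UsingR is installed; load only the diamond data set.
data("diamond", package = "UsingR")

p <- ggplot(diamond, aes(x = carat, y = price)) +
  # One point per diamond.
  geom_point() +
  # Least-squares line; se = FALSE hides the confidence band.
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, linewidth = 1) +
  labs(x = "Carat (weight)", y = "Price")

p
```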

Understanding the Slope

The slope \(\beta_1\) represents the average change in the dependent variable \(y\) for a one-unit increase in the independent variable \(x\). It indicates how much \(y\) is expected to change for each unit change in \(x\).

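Formula: \[ \beta_1 = \frac{\operatorname{Cov}(X, Y)}{\operatorname{Var}(X)} \] Dividing the covariance of \(x\) and \(y\) by the variance of the independent variable \(x\) gives \(\beta_1\), the slope. If \(\beta_1\) is greater than 0 the relationship is positive; if it is less than 0, it is negative.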
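
The covariance-over-variance formula can be checked numerically. The sketch below uses only the six rows printed earlier as a toy sample, so its coefficients will differ from those of the full data set. The intercept line uses the standard least-squares result \(\beta_0 = \bar{y} - \beta_1 \bar{x}\), which is not stated on the slides.

```r
# Toy sample: the six rows shown in the data preview (not the full data set).
carat <- c(0.17, 0.16, 0.17, 0.18, 0.25, 0.16)
price <- c(355, 328, 350, 325, 642, 342)

# Slope from the formula: Cov(X, Y) / Var(X).
slope_manual <- cov(carat, price) / var(carat)

# Intercept from the standard least-squares result: mean(y) - slope * mean(x).
intercept_manual <- mean(price) - slope_manual * mean(carat)

# Fit the same model with lm() and compare the two sets of coefficients.
fit_toy <- lm(price ~ carat)
coef(fit_toy)
c(intercept_manual, slope_manual)
```

Both lines of output should show the same intercept and slope, since `lm()` solves the same least-squares problem.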

Linear Regression Model: Prediction and Statistics for the Diamond Price and Carat Relationship

The summary reports the residuals (minimum, first quartile, median, third quartile, and maximum), followed by the coefficient estimates, the residual standard error, and R-squared.

## 
## Call:
## lm(formula = price ~ carat, data = diamond)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -85.159 -21.448  -0.869  18.972  79.370 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -259.63      17.32  -14.99   <2e-16 ***
## carat        3721.02      81.79   45.50   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 31.84 on 46 degrees of freedom
## Multiple R-squared:  0.9783, Adjusted R-squared:  0.9778 
## F-statistic:  2070 on 1 and 46 DF,  p-value: < 2.2e-16
## 
## Call:
## lm(formula = price ~ carat, data = diamond)
## 
## Coefficients:
## (Intercept)        carat  
##      -259.6       3721.0
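
To illustrate how the fitted line turns into predictions, the sketch below plugs the rounded coefficients from the summary above into \(y = \beta_0 + \beta_1 x\). The helper `predict_price` and the example carat values are hypothetical, chosen only for illustration.

```r
# Rounded coefficient estimates copied from the model summary above.
b0 <- -259.63   # intercept
b1 <- 3721.02   # slope: expected price change per one-carat increase

# Hypothetical helper: predicted price for a given carat weight.
predict_price <- function(carat) b0 + b1 * carat

# Predicted prices for three example weights.
predict_price(c(0.16, 0.20, 0.25))
# about 335.7, 484.6, 670.6

# Expected price change for a 0.01-carat increase.
b1 * 0.01
# about 37.2
```

With the fitted model object itself, `predict(fit, newdata = data.frame(carat = c(0.16, 0.20, 0.25)))` would give the same predictions using the unrounded coefficients, assuming the model was stored as `fit`.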