Regression and Correlation

Scatter plot

To create a scatter plot: See Graphing with R file.

Regression equation

To calculate the regression equation: lm(response_variable ~ explanatory_variable, data=Dataset)

You can also do:

lm.out = lm(response_variable ~ explanatory_variable, data=Dataset) #stores the linear model (you can give it any name; it doesn't have to be lm.out)

lm.out #prints out the linear model

Example:

Find the linear model for the amount of gas used based on temperature.

lm(gas_consumed~temperature, data=Gas)
## 
## Call:
## lm(formula = gas_consumed ~ temperature, data = Gas)
## 
## Coefficients:
## (Intercept)  temperature  
##       4.571       -0.223

Equation: gas_consumed = 4.571 - 0.223 * temperature
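
Once the equation is written out, it can be used to predict gas consumption at a given temperature. The following is a minimal sketch, assuming the rounded coefficients printed above and a hypothetical temperature of 10 chosen only for illustration:

```r
# Rounded coefficients copied from the lm() output above
intercept <- 4.571
slope <- -0.223

# Hypothetical temperature, chosen only for illustration
temperature <- 10

# Predicted gas consumed = intercept + slope * temperature
predicted_gas <- intercept + slope * temperature
predicted_gas
```

The negative slope means that each one-degree increase in temperature is associated with about 0.223 less gas consumed. If the model has been stored as lm.out, predict(lm.out, newdata = data.frame(temperature = 10)) should give nearly the same value; any small difference comes from the rounding of the printed coefficients.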

Plot the linear model on the scatter plot

To add the linear model to the scatter plot:

gf_point(response_variable~explanatory_variable, data=Dataset, title="type a title for the graph")|>
  gf_lm() #plots the linear model on the scatter plot

Example: Draw the scatter plot with the linear model for gas consumed versus temperature.

gf_point(gas_consumed~temperature, data=Gas, title="Gas Consumed vs Temperature", xlab="Temperature (C)", ylab="gas consumed")|> 
  gf_lm()

Figure: Scatter plot with regression line

Residuals

To find and plot residuals:

lm.out = lm(response_variable ~ explanatory_variable, data=Dataset) #calculates the linear model

residuals(lm.out) #calculates the residuals

gf_point(residuals(lm.out)~explanatory_variable, data=Dataset)|> #plots the residuals against the explanatory variable
  gf_hline(yintercept = 0) #draws a horizontal line at y = 0

Example:

Find and plot the residuals for gas consumed vs temperature.

lm.out<-lm(gas_consumed~temperature, data=Gas)
residuals(lm.out)
##           1           2           3           4           5           6 
##  0.07256170  0.20706857  0.35166949 -0.25912868 -0.03682822 -0.01452777 
##           7           8           9          10          11          12 
##  0.04157544 -0.01382365 -0.51382365 -0.68002090  0.19838276  0.82068322 
##          13          14          15          16          17          18 
##  0.02068322 -0.13471586 -0.11241541  0.15448597 -0.02321357 -0.07861266
gf_point(residuals(lm.out)~temperature, data=Gas, title="Residuals vs Temperature")|> 
  gf_hline(yintercept = 0)

Figure: Residual plot
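
A residual is the observed response minus the value predicted by the line. The sketch below checks this on R's built-in cars dataset, which is used only as a stand-in because the Gas data is not reproduced here. It also confirms that the 18 residuals printed above sum to zero, as least-squares residuals do when the model includes an intercept:

```r
# Built-in dataset used as a stand-in for illustration (not the Gas data)
fit <- lm(dist ~ speed, data = cars)

# Residual = observed response - fitted (predicted) response
by_hand <- cars$dist - fitted(fit)

# Compare the hand calculation with residuals(); names are dropped first
same <- all.equal(unname(by_hand), unname(residuals(fit)))
same

# The residuals printed above for the gas model, copied by hand
printed_residuals <- c(
   0.07256170,  0.20706857,  0.35166949, -0.25912868, -0.03682822, -0.01452777,
   0.04157544, -0.01382365, -0.51382365, -0.68002090,  0.19838276,  0.82068322,
   0.02068322, -0.13471586, -0.11241541,  0.15448597, -0.02321357, -0.07861266
)

# Their sum is zero up to rounding
residual_sum <- sum(printed_residuals)
round(residual_sum, 6)
```

In a residual plot, points scattered evenly above and below the horizontal line with no clear pattern suggest that a straight line is a reasonable model.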

Coefficient of determination

To calculate the coefficient of determination:

Make sure lm.out has been calculated, then run rsquared(lm.out)

Example:

Calculate the coefficient of determination for gas consumed vs temperature.

rsquared(lm.out)
## [1] 0.560199
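
A value of 0.560199 means that about 56% of the variation in gas consumed is explained by the linear relationship with temperature. To show where such a number comes from, here is a minimal sketch that computes the coefficient of determination by hand as 1 - SS_residual / SS_total. It assumes R's built-in cars dataset as a stand-in for the Gas data and uses base R's summary() for the comparison instead of rsquared():

```r
# Built-in dataset used as a stand-in for illustration (not the Gas data)
fit <- lm(dist ~ speed, data = cars)

# Variation left unexplained by the line
ss_res <- sum(residuals(fit)^2)

# Total variation of the response around its mean
ss_tot <- sum((cars$dist - mean(cars$dist))^2)

# Coefficient of determination computed by hand
r2_by_hand <- 1 - ss_res / ss_tot
r2_by_hand

# Value reported by base R for the same model
summary(fit)$r.squared
```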

Correlation coefficient

To calculate the correlation coefficient:

cor(response_variable~explanatory_variable, data=Dataset)

Example:

Find the correlation coefficient for the amount of gas used based on temperature.

cor(gas_consumed~temperature, data= Gas) 
## [1] -0.7484644
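
For a linear model with one explanatory variable, the square of the correlation coefficient equals the coefficient of determination, and the sign of the correlation matches the sign of the slope. The sketch below checks this against the values printed in this section, then repeats the check on R's built-in cars dataset (a stand-in, not the Gas data). It uses the base R form cor(x, y); the formula form shown above appears to rely on the mosaic package, which is an assumption because the source does not name the package.

```r
# Values copied from the output printed in this section
r <- -0.7484644          # correlation coefficient
r2_printed <- 0.560199   # coefficient of determination

# Squaring the correlation should reproduce the coefficient of determination
r_squared <- round(r^2, 6)
r_squared

# Same check on a built-in dataset, using base R only
fit <- lm(dist ~ speed, data = cars)
r_cars <- cor(cars$speed, cars$dist)
matches <- all.equal(r_cars^2, summary(fit)$r.squared)
matches

# The sign of the correlation agrees with the sign of the slope
sign(r_cars) == sign(coef(fit)[["speed"]])
```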