31/07/2021

Overview

We will try to predict a state's murder rate from its assault rate. We will do that in three steps:

  1. Explaining the data set
  2. Building a regression model
  3. Plotting results

Data set used

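We will use the built-in R data set USArrests. It contains the number of arrests per 100,000 residents for murder, assault and rape in each of the 50 US states in 1973, along with the percentage of the population living in urban areas. The data set can be loaded in the following way: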

data("USArrests")
head(USArrests, 5)
           Murder Assault UrbanPop Rape
Alabama      13.2     236       58 21.2
Alaska       10.0     263       48 44.5
Arizona       8.1     294       80 31.0
Arkansas      8.8     190       50 19.5
California    9.0     276       91 40.6

Regression model

We will use a simple linear regression model, regressing the murder rate on the assault rate: \(\text{Murder} = \beta_0 + \beta_1 \cdot \text{Assault} + \varepsilon\).

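The fitted coefficients of the model are 0.63 (intercept) and 0.04 (slope). In other words, each additional assault arrest per 100,000 residents is associated with about 0.04 additional murder arrests per 100,000 residents.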
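A minimal sketch of how such a fit can be obtained with base R's lm(). The variable name fit is an assumption made here for illustration; the original slides do not show the fitting code.

```r
# Load the built-in data set and fit Murder as a linear function of Assault.
data("USArrests")
fit <- lm(Murder ~ Assault, data = USArrests)  # 'fit' is an illustrative name

# Intercept and slope, rounded to two decimals as quoted in the text.
round(coef(fit), 2)

# Predicted murder rate for a hypothetical state with 200 assault arrests
# per 100,000 residents (the value 200 is chosen only as an example).
predict(fit, newdata = data.frame(Assault = 200))
```

The predict() call simply evaluates intercept + slope * 200, which is how the fitted line would be used for prediction.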

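The \(R^2\) coefficient of the model is 0.64, so the assault rate explains about 64% of the variance in the murder rate across states. This corresponds to a correlation of about 0.80 between the two rates, which indicates a strong linear association.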
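As a quick check of this figure, the sketch below extracts \(R^2\) from the model summary and compares it with the squared Pearson correlation, which coincide for a regression with a single predictor. The names fit, r_squared and r are assumptions introduced for this example.

```r
data("USArrests")
fit <- lm(Murder ~ Assault, data = USArrests)  # illustrative name

# R^2 as reported by the model summary.
r_squared <- summary(fit)$r.squared

# Pearson correlation between the two rates.
r <- cor(USArrests$Murder, USArrests$Assault)

# In simple linear regression, R^2 equals the squared correlation.
all.equal(r_squared, r^2)
round(r_squared, 2)
```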

Plot

We plot the observed points together with the fitted regression line.
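One way to draw such a plot with base R graphics is sketched below. The axis labels, title, colour and line width are choices made for this example and are not taken from the original slides.

```r
data("USArrests")
fit <- lm(Murder ~ Assault, data = USArrests)  # illustrative name

# Scatter plot of the raw data: one point per state.
plot(Murder ~ Assault, data = USArrests,
     xlab = "Assault arrests per 100,000 residents",
     ylab = "Murder arrests per 100,000 residents",
     main = "Murder vs. assault rates (USArrests)")

# Overlay the fitted regression line from the model.
abline(fit, col = "red", lwd = 2)
```

abline() accepts the fitted lm object directly and reads the intercept and slope from it, so the line always matches the model.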

Thank you

Thank you for reading the presentation. Any questions?