The probability of obtaining test results at least as extreme as the result actually observed, under the assumption that the null hypothesis is correct.
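This probability can be computed from different kinds of test statistics, for example the one-proportion z statistic and the one-sample t statistic, where \(p_0\) and \(\mu_0\) are the values specified by the null hypothesis: \[z=\frac{\hat{p}-p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}\] \[t=\frac{\bar{x}-\mu_0}{\frac{s}{\sqrt{n}}}\]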
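As a minimal sketch of how these statistics turn into p-values, the R snippet below evaluates both formulas and converts them to two-sided p-values with `pnorm()` and `pt()`. All the input numbers (`p_hat`, `p0`, `n`, `x_bar`, `mu0`, `s`, `m`) are made-up values for illustration only, and the two-sided alternative is an assumption.

```r
# Hypothetical one-proportion example (illustrative values, not real data)
p_hat <- 0.56   # observed sample proportion
p0    <- 0.50   # proportion under the null hypothesis
n     <- 100    # sample size

# z statistic from the formula above
z <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Two-sided p-value: probability of |Z| at least as large as the observed |z|
p_z <- 2 * pnorm(-abs(z))

# Hypothetical one-sample t example (illustrative values, not real data)
x_bar <- 10.4   # sample mean
mu0   <- 10     # mean under the null hypothesis
s     <- 1.5    # sample standard deviation
m     <- 25     # sample size

# t statistic from the formula above, with m - 1 degrees of freedom
t_stat <- (x_bar - mu0) / (s / sqrt(m))
p_t    <- 2 * pt(-abs(t_stat), df = m - 1)

print(c(z = z, p_z = p_z, t = t_stat, p_t = p_t))
```

For a one-sided alternative, the factor of 2 would be dropped and the tail chosen to match the direction of the alternative hypothesis.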
The p-value is usually used as a measure of the evidence against the null hypothesis in a hypothesis test. The lower the value, the stronger the evidence to reject the null hypothesis; conversely, a high value means the data are consistent with the null hypothesis, so we fail to reject it.
linmod <- lm(UrbanPop ~ Murder, data = USArrests)
summary(linmod)
Call:
lm(formula = UrbanPop ~ Murder, data = USArrests)
Residuals:
Min 1Q Median 3Q Max
-32.248 -9.953 1.255 12.482 25.180
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 63.7393 4.2597 14.963 <2e-16 ***
Murder 0.2312 0.4785 0.483 0.631
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 14.59 on 48 degrees of freedom
Multiple R-squared: 0.00484, Adjusted R-squared: -0.01589
F-statistic: 0.2335 on 1 and 48 DF, p-value: 0.6312
On the previous slide, the rightmost column, Pr(>|t|), holds the p-values for the simple linear regression model \(Y_i=\beta_0+\beta_1X_i+\epsilon_i\). For the slope of Murder, which tests the null hypothesis \(\beta_1=0\), the p-value is 0.631. With such a high p-value we fail to reject the null hypothesis: the data give no evidence of a linear relationship between Murder and UrbanPop.
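The p-value in that column can also be pulled out of the fitted model directly, and reproduced by hand from the t value and the residual degrees of freedom. This is a small sketch using only base R; the significance level `alpha = 0.05` is an assumed, conventional choice and not something fixed by the model.

```r
# Refit the model from the earlier slide
linmod <- lm(UrbanPop ~ Murder, data = USArrests)

# Coefficient table: Estimate, Std. Error, t value, Pr(>|t|)
coefs <- coef(summary(linmod))

# p-value and t value for the slope of Murder
p_slope <- coefs["Murder", "Pr(>|t|)"]
t_slope <- coefs["Murder", "t value"]

# Same p-value computed by hand: two-sided tail area of a t distribution
# with the model's residual degrees of freedom (48 here)
p_manual <- 2 * pt(-abs(t_slope), df = df.residual(linmod))

# Decision at an assumed significance level
alpha  <- 0.05
reject <- p_slope < alpha

print(c(p_slope = p_slope, p_manual = p_manual))
print(reject)
```

Because `p_slope` is well above `alpha`, `reject` is `FALSE`, which matches the conclusion drawn from the summary table.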
As the p-value from our hypothesis test suggested, the correlation between Murder and UrbanPop is minimal: with \(R^2=0.00484\), the correlation coefficient is only about 0.07.
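One way to check the correlation directly is `cor.test()`, sketched below with its default settings (Pearson correlation, two-sided alternative). In simple linear regression the squared correlation equals the multiple R-squared, and the test on the correlation gives the same p-value as the test on the slope.

```r
# Pearson correlation test between the two variables used in the model
ct <- cor.test(USArrests$Murder, USArrests$UrbanPop)

# Sample correlation coefficient (drop the "cor" name for cleaner printing)
r <- unname(ct$estimate)

# Squared correlation: matches the Multiple R-squared of the regression
r_squared <- r^2

# p-value of the correlation test: matches the slope's p-value in lm()
p_cor <- ct$p.value

print(c(r = r, r_squared = r_squared, p_value = p_cor))
```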
Here is a plotly graph of the murder rate by state. As expected, the highest rates are not in the states with the largest urban populations, which is consistent with the p-value we obtained from our test.
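The same comparison can be made numerically, without the plot. The sketch below uses only base R and the built-in `USArrests` data to rank states by murder rate and by percentage of urban population; the choice of showing the top five of each is arbitrary.

```r
# Five states with the highest murder rates, with their urban population share
by_murder <- USArrests[order(-USArrests$Murder), c("Murder", "UrbanPop")]
top_murder <- head(by_murder, 5)

# Five states with the highest percentage of urban population
by_urban <- USArrests[order(-USArrests$UrbanPop), c("Murder", "UrbanPop")]
top_urban <- head(by_urban, 5)

print(top_murder)
print(top_urban)

# States that appear in both top-five lists
overlap <- intersect(rownames(top_murder), rownames(top_urban))
print(overlap)
```

An empty `overlap` means none of the five states with the highest murder rates is among the five most urbanized, which is the pattern the graph shows.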