P2 - Coefficient of Determination and Correlation
Data
Download data here.
##   YearsExperience Salary
## 1             1.2  39344
## 2             1.4  46206
## 3             1.6  37732
## 4             2.1  43526
## 5             2.3  39892
## 6             3.0  56643
Regression Equation
Regression Equation with Manual OLS
salary$xdif <- salary$YearsExperience-mean(salary$YearsExperience)
salary$ydif <- salary$Salary-mean(salary$Salary)
salary$crp <- salary$xdif*salary$ydif
salary$xsq <- salary$xdif^2
# estimators b0 and b1
b1 <- sum(salary$crp)/sum(salary$xsq)
b1
## [1] 9449.962
b0 <- mean(salary$Salary) - b1*mean(salary$YearsExperience)
b0
## [1] 24848.2
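The deviation-based estimators can be checked against `lm()`. The sketch below is a minimal illustration, not the page's original code: it uses only the six rows printed in the Data section (so its estimates differ from the full-data values), and `toy` and `ols_manual` are hypothetical names introduced here.

```r
# Toy data: only the six rows printed in the Data section, not the full data set
toy <- data.frame(
  YearsExperience = c(1.2, 1.4, 1.6, 2.1, 2.3, 3.0),
  Salary          = c(39344, 46206, 37732, 43526, 39892, 56643)
)

# Hypothetical helper: least-squares estimates from deviations about the means
ols_manual <- function(x, y) {
  xdif <- x - mean(x)                    # deviations of x from its mean
  ydif <- y - mean(y)                    # deviations of y from its mean
  b1 <- sum(xdif * ydif) / sum(xdif^2)   # slope
  b0 <- mean(y) - b1 * mean(x)           # intercept
  c(b0 = b0, b1 = b1)
}

est <- ols_manual(toy$YearsExperience, toy$Salary)
fit <- lm(Salary ~ YearsExperience, data = toy)

# The two approaches should agree up to floating-point error
print(est)
print(coef(fit))
print(all.equal(unname(est), unname(coef(fit))))
```

The intercept follows from the fact that the fitted line passes through the point of means, so `b0` is the mean of y minus `b1` times the mean of x.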
Regression Equation with an R Function
model1 <- lm(Salary ~ YearsExperience, data = salary)
summary(model1)
##
## Call:
## lm(formula = Salary ~ YearsExperience, data = salary)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -7958.0 -4088.5  -459.9  3372.6 11448.0
##
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)
## (Intercept)      24848.2     2306.7   10.77 1.82e-11 ***
## YearsExperience   9450.0      378.8   24.95  < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5788 on 28 degrees of freedom
## Multiple R-squared: 0.957, Adjusted R-squared: 0.9554
## F-statistic: 622.5 on 1 and 28 DF, p-value: < 2.2e-16
Regression Plot
plot(salary$YearsExperience, salary$Salary,xlab="YearsExperience",ylab="Salary",pch=16)
abline(model1)
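Coefficient of Determination
The four values below are, in order, the residual sum of squares (SSE), the regression sum of squares (SSR), the total sum of squares (SST = SSE + SSR), and the coefficient of determination R² = SSR/SST.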
## [1] 938128552
## [1] 20856849300
## [1] 21794977852
## [1] 0.9569567
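The code that produced these values is not shown on the page. One way to obtain the sums of squares and their ratio is sketched below, assuming a fitted `lm` object. The names `toy`, `sse`, `ssr`, `sst`, and `r2` are introduced here for illustration, and the data are only the six rows printed in the Data section, so the numbers will not match the full-data output.

```r
# Toy data: only the six rows printed in the Data section
toy <- data.frame(
  YearsExperience = c(1.2, 1.4, 1.6, 2.1, 2.3, 3.0),
  Salary          = c(39344, 46206, 37732, 43526, 39892, 56643)
)
fit <- lm(Salary ~ YearsExperience, data = toy)

ybar <- mean(toy$Salary)
sse <- sum(residuals(fit)^2)          # residual (error) sum of squares
ssr <- sum((fitted(fit) - ybar)^2)    # regression sum of squares
sst <- sum((toy$Salary - ybar)^2)     # total sum of squares
r2  <- ssr / sst                      # coefficient of determination

print(c(SSE = sse, SSR = ssr, SST = sst, R2 = r2))

# Checks: the decomposition holds and r2 matches summary()
print(all.equal(sse + ssr, sst))
print(all.equal(r2, summary(fit)$r.squared))
```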
ANOVA
anova(model1)
## Analysis of Variance Table
##
## Response: Salary
##                 Df     Sum Sq    Mean Sq F value    Pr(>F)
## YearsExperience  1 2.0857e+10 2.0857e+10  622.51 < 2.2e-16 ***
## Residuals       28 9.3813e+08 3.3505e+07
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
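The F statistic in the table can be reproduced by hand from the sums of squares printed earlier on this page. The sketch below simply plugs in those reported numbers. The variable names are assumptions introduced here.

```r
# Sums of squares and degrees of freedom as reported on this page
sse    <- 938128552      # residual sum of squares
ssr    <- 20856849300    # regression sum of squares
df_reg <- 1
df_res <- 28

msr <- ssr / df_reg      # mean square for regression
mse <- sse / df_res      # mean square error
f_stat <- msr / mse      # F statistic

# Upper-tail p-value from the F distribution
p_value <- pf(f_stat, df_reg, df_res, lower.tail = FALSE)

print(f_stat)
print(sqrt(f_stat))      # in simple regression, equals |t| for the slope
print(p_value)
```

With a single predictor, the square root of F equals the absolute t value of the slope, which is why the ANOVA table and the coefficient table lead to the same conclusion.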
# correlation coefficient
x1_x1bar<-salary$YearsExperience-mean(salary$YearsExperience)
x2_x2bar<-salary$Salary-mean(salary$Salary)
A<-sum(x1_x1bar*x2_x2bar)
varx1<-sum((x1_x1bar)^2)
varx2<-sum((x2_x2bar)^2)
B<-sqrt(varx1*varx2)
corr<-A/B
corr
## [1] 0.9782416
## [1] 0.9782416
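The manual formula can be cross-checked against the built-in `cor()` function. The sketch below uses only the six rows printed in the Data section, so the value differs from the full-data result. It also illustrates that, in simple linear regression, the squared correlation equals R². The names `toy`, `r_manual`, and `r_builtin` are introduced here for illustration.

```r
# Toy data: only the six rows printed in the Data section
toy <- data.frame(
  YearsExperience = c(1.2, 1.4, 1.6, 2.1, 2.3, 3.0),
  Salary          = c(39344, 46206, 37732, 43526, 39892, 56643)
)
x <- toy$YearsExperience
y <- toy$Salary

# Pearson correlation from sums of cross-products and squares
r_manual <- sum((x - mean(x)) * (y - mean(y))) /
  sqrt(sum((x - mean(x))^2) * sum((y - mean(y))^2))

# Built-in equivalent
r_builtin <- cor(x, y)

# R-squared from the fitted simple regression
r2 <- summary(lm(y ~ x))$r.squared

print(all.equal(r_manual, r_builtin))
print(all.equal(r_manual^2, r2))
```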
Correlation Significance Test
#Pearson Correlation test, alpha 1%
cor.test( ~ YearsExperience + Salary, data=salary, method = "pearson", continuity = FALSE, conf.level = 0.99)
##
## Pearson's product-moment correlation
##
## data: YearsExperience and Salary
## t = 24.95, df = 28, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 99 percent confidence interval:
## 0.9424207 0.9918711
## sample estimates:
## cor
## 0.9782416
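The t statistic and the confidence interval in this output can be reproduced from the reported correlation alone. The sketch below assumes n = 30 (implied by df = 28 = n - 2) and uses the Fisher z transformation for the interval. The variable names are introduced here for illustration.

```r
r <- 0.9782416   # sample correlation as reported
n <- 30          # assumed sample size, since df = n - 2 = 28

# t statistic for H0: true correlation equals 0
t_stat  <- r * sqrt(n - 2) / sqrt(1 - r^2)
p_value <- 2 * pt(abs(t_stat), df = n - 2, lower.tail = FALSE)

# 99% confidence interval via the Fisher z transformation
z  <- atanh(r)
se <- 1 / sqrt(n - 3)
ci <- tanh(z + c(-1, 1) * qnorm(0.995) * se)

print(t_stat)
print(p_value)
print(ci)
```

The t value coincides with the slope's t value in the regression summary, because testing zero correlation and testing a zero slope are equivalent in simple linear regression.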
Exercises
- For a random sample of Indian states, the ANOVA table shown refers to hypothetical data on x = tax revenue in Indian rupees and y = agricultural subsidies in Indian rupees. Fill in the blanks in the table.
- The table here shows the ANOVA table for a regression analysis of y = the selling price (in thousands of dollars) and x = the size of house (in thousands of square feet). The prediction equation is \(\hat{y} = 9.2 + 77x\).
  - What was the sample size?
  - The sample mean house size was 1.53 thousand square feet. What was the sample mean selling price? (Hint: What does \(\hat{y}\) equal when \(x = \bar{x}\)?)
- Fit a regression equation using the data in this link, interpret the equation, compute the coefficient of determination and the correlation coefficient, and test the correlation for significance.