Kelompok 8: 1. Muhammad Jodi At-Takbir(G1401221024) 2. Siti Arbaynah(G1401221047) 3. Fauzan Fajari(G1401221055) 4. Muhammad Fahrezi Maulana(G1401221103)
library(readxl)
## Warning: package 'readxl' was built under R version 4.3.2
data43 <- read_excel("C:/Users/ASUS/Downloads/Data 4.3.xlsx")
data43
## # A tibble: 24 × 2
## X Y
## <dbl> <dbl>
## 1 1 23
## 2 2 29
## 3 3 49
## 4 4 64
## 5 4 74
## 6 5 87
## 7 6 96
## 8 6 97
## 9 7 109
## 10 8 119
## # ℹ 14 more rows
y<-data43$X
x<-data43$Y
plot(x,y)
model <- lm(Y~X,data<-data43)
summary(model)
##
## Call:
## lm(formula = Y ~ X, data = data <- data43)
##
## Residuals:
## Min 1Q Median 3Q Max
## -31.603 -14.801 -0.045 17.335 29.092
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 37.2127 7.9853 4.66 0.00012 ***
## X 9.9695 0.7218 13.81 2.56e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 18.75 on 22 degrees of freedom
## Multiple R-squared: 0.8966, Adjusted R-squared: 0.8919
## F-statistic: 190.7 on 1 and 22 DF, p-value: 2.556e-12
Y = 37.2127 + 9.9695X
Ketika x mengalami kenaikan satu satuan maka y mengalami kenaikan sebesar 9.9695 dan etika x sama dengan 0 maka dugaan rataan y adalah 37.2127
library(car)
## Loading required package: carData
## Warning: package 'carData' was built under R version 4.3.3
library (lmtest)
## Warning: package 'lmtest' was built under R version 4.3.3
## Loading required package: zoo
## Warning: package 'zoo' was built under R version 4.3.3
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
model.reg= lm(formula = Y ~.,data=data43)
model.reg
##
## Call:
## lm(formula = Y ~ ., data = data43)
##
## Coefficients:
## (Intercept) X
## 37.21 9.97
plot(model.reg,1)
plot(model.reg,2)
# A. Kondisi Gauss-Marcov ## 1. Nilai Harapan Sisaan
H0:μ sama dengan 0
H1:μ tidak sama dengan 0
t.test(model.reg$residuals,mu = 0,conf.level = 0.95)
##
## One Sample t-test
##
## data: model.reg$residuals
## t = 2.5206e-16, df = 23, p-value = 1
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
## -7.744817 7.744817
## sample estimates:
## mean of x
## 9.436896e-16
Berdasarkan uji t tersebut, diperoleh p-value > α (1 > 0.05). Maka diputuskan tak tolak H0 dengan nilai harapan sisaan = 0.
H0:μ sama dengan 0
H1:μ tidak sama dengan 0
cek.homogen = lm(formula = abs(model.reg$residuals) ~ X, # y: abs residual
data = data43)
summary(cek.homogen)
##
## Call:
## lm(formula = abs(model.reg$residuals) ~ X, data = data43)
##
## Residuals:
## Min 1Q Median 3Q Max
## -14.289 -11.751 1.434 7.495 16.526
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 10.8125 4.4454 2.432 0.0236 *
## X 0.4065 0.4019 1.012 0.3228
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10.44 on 22 degrees of freedom
## Multiple R-squared: 0.04444, Adjusted R-squared: 0.001009
## F-statistic: 1.023 on 1 and 22 DF, p-value: 0.3228
ncvTest(model.reg)
## Non-constant Variance Score Test
## Variance formula: ~ fitted.values
## Chisquare = 0.3640031, Df = 1, p = 0.54629
Uji Breusch-Pagan menghasilkan nilai p-value > α. Maka dari itu, dapat diputuskan tak tolak H0 yaitu ragam sisaan homogen.
H0 : Sisaan saling bebas
H1 : Sisaan tidak saling bebas
dwtest(model.reg)
##
## Durbin-Watson test
##
## data: model.reg
## DW = 0.2393, p-value = 3.171e-11
## alternative hypothesis: true autocorrelation is greater than 0
Uji di atas menghasilkan nilai p-value < α (0.05) sehingga diputuskan tolak H0. Dengan demikian disimpulkan bahwa tidak sisaan saling bebas.
H0:Galat menyebar normal
H1:Galat tidak menyebar normal
shapiro.test(model.reg$residuals)
##
## Shapiro-Wilk normality test
##
## data: model.reg$residuals
## W = 0.94875, p-value = 0.2546
Pada uji tersebut, diperoleh p-value = 0.2546. Hal ini menunjukkan p-value > α (0.05) sehingga keputusannya tak tolak H0. Maka galat menyebar normal.
H0 : Galat bebas terhadap peubah bebas
H1 : Galat tidak bebas terhadap peubah bebas
model <- lm(X~.,data=data43)
summary(model)
##
## Call:
## lm(formula = X ~ ., data = data43)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.5862 -1.4909 -0.3358 1.0578 3.9064
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.342748 0.945277 -2.478 0.0214 *
## Y 0.089933 0.006512 13.811 2.56e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.781 on 22 degrees of freedom
## Multiple R-squared: 0.8966, Adjusted R-squared: 0.8919
## F-statistic: 190.7 on 1 and 22 DF, p-value: 2.556e-12
Pada uji anova tersebut menunjukkan p-value = 2.556e-12. Hal ini menunjukkan bahwa p-value <0.05 sehingga diputuskan tolak H0 yaitu galat atau sisaan bebas dari peubah bebas.
Berdasarkan serangkaian uji tersebut, dapat disimpulkan bahwa asumsi yang dilanggar oleh data tersebut yaitu sisaan tidak saling bebas.
library(dplyr)
## Warning: package 'dplyr' was built under R version 4.3.2
##
## Attaching package: 'dplyr'
## The following object is masked from 'package:car':
##
## recode
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
data <- readxl::read_excel("C:/Users/ASUS/Downloads/Data 4.10.xlsx")
data
## # A tibble: 12 × 2
## Y X
## <dbl> <dbl>
## 1 8.11 0
## 2 11 5
## 3 8.2 15
## 4 8.3 16
## 5 9.4 17
## 6 9.3 18
## 7 9.6 19
## 8 10.3 20
## 9 11.3 21
## 10 11.4 22
## 11 12.2 23
## 12 12.9 24
x <- data$X
y <- data$Y
model <- lm(y~x, data)
summary(model)
##
## Call:
## lm(formula = y ~ x, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.7852 -0.8997 -0.1394 0.7607 2.2730
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 8.10970 1.05167 7.711 1.62e-05 ***
## x 0.12347 0.05826 2.119 0.0601 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.399 on 10 degrees of freedom
## Multiple R-squared: 0.3099, Adjusted R-squared: 0.2409
## F-statistic: 4.491 on 1 and 10 DF, p-value: 0.0601
anova(model)
## Analysis of Variance Table
##
## Response: y
## Df Sum Sq Mean Sq F value Pr(>F)
## x 1 8.7909 8.7909 4.491 0.0601 .
## Residuals 10 19.5745 1.9574
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
hatvalues(model)
## 1 2 3 4 5 6 7
## 0.56502890 0.31936416 0.08815029 0.08410405 0.08352601 0.08641618 0.09277457
## 8 9 10 11 12
## 0.10260116 0.11589595 0.13265896 0.15289017 0.17658960
s = sqrt(1.9574)
ei = model$residuals
n = dim(data)[1]
p = length(model$coefficients)
xbar = mean(data$X)
d = (data$X - xbar)^2
td = sum(d)
hii = ((1/n)+d)/td ##### hii=hatvalues(model)
ri = ei/(s*sqrt(1-hii))
Obs = c(1:n)
summ <- cbind.data.frame(Obs, data, hii, ri)
head(summ)
## Obs Y X hii ri
## 1 1 8.11 0 0.4818400771 0.0003013294
## 2 2 11.00 5 0.2361753372 1.8588982834
## 3 3 8.20 15 0.0049614644 -1.2623420726
## 4 4 8.30 16 0.0009152216 -1.2765650849
## 5 5 9.40 17 0.0003371869 -0.5780924402
## 6 6 9.30 18 0.0032273603 -0.7389144291
for (i in 1:dim(summ)[1]){
cutoff <- 2*p/n
titik_leverage <- which(hii > cutoff)
}
titik_leverage
## [1] 1
for (i in 1:dim(summ)[1]){
absri <- abs(summ[,5])
pencilan <- which(absri > 2)
}
pencilan
## integer(0)
Di = (ri^2/p)*(hii/(1-hii))
summ <- cbind.data.frame(Obs, data, hii, ri, Di)
head(summ)
## Obs Y X hii ri Di
## 1 1 8.11 0 0.4818400771 0.0003013294 4.221746e-08
## 2 2 11.00 5 0.2361753372 1.8588982834 5.342224e-01
## 3 3 8.20 15 0.0049614644 -1.2623420726 3.972776e-03
## 4 4 8.30 16 0.0009152216 -1.2765650849 7.464141e-04
## 5 5 9.40 17 0.0003371869 -0.5780924402 5.636140e-05
## 6 6 9.30 18 0.0032273603 -0.7389144291 8.839132e-04
for (i in 1:dim(summ)[1]){
fcrit = qf(p=0.95, df1=p, df2=n-p)
inf <- which(Di > fcrit)
}
inf
## named integer(0)
plot(model)
influence.measures(model)
## Influence measures of
## lm(formula = y ~ x, data = data) :
##
## dfb.1_ dfb.x dffit cov.r cook.d hat inf
## 1 0.000356 -0.000328 0.000356 2.838 7.03e-08 0.5650 *
## 2 1.619009 -1.405990 1.635464 0.680 9.10e-01 0.3194 *
## 3 -0.252156 0.100036 -0.427938 0.924 8.40e-02 0.0882
## 4 -0.198944 0.040463 -0.422691 0.911 8.16e-02 0.0841
## 5 -0.059760 -0.008461 -0.176155 1.251 1.66e-02 0.0835
## 6 -0.047078 -0.043860 -0.232217 1.195 2.82e-02 0.0864
## 7 -0.013811 -0.063458 -0.198922 1.251 2.11e-02 0.0928
## 8 0.003656 -0.029334 -0.067691 1.364 2.53e-03 0.1026
## 9 -0.025815 0.083556 0.157634 1.339 1.35e-02 0.1159
## 10 -0.042688 0.100645 0.165053 1.369 1.48e-02 0.1327
## 11 -0.139524 0.277404 0.411275 1.195 8.51e-02 0.1529
## 12 -0.289105 0.515996 0.710052 0.943 2.22e-01 0.1766
dffits.cutoff <- 2*sqrt(p/n)
covrat.lower <- 1-3*(p/n)
covrat.upper <- 1+3*(p/n)
dfbetas.cutoff <- 2/sqrt(n)
summary(influence.measures(model))
## Potentially influential observations of
## lm(formula = y ~ x, data = data) :
##
## dfb.1_ dfb.x dffit cov.r cook.d hat
## 1 0.00 0.00 0.00 2.84_* 0.00 0.57_*
## 2 1.62_* -1.41_* 1.64_* 0.68 0.91_* 0.32
Berdasarkan Serangkaian Uji, terlihat bahwa data no 1 dan 2 adalah data unusual yang memiliki potensi untuk mengaruhi hasil analisis regresi secara signifikan. Terkadang, pengamatan ini dapat menjadi outlier yang mempengaruhi kecocokan model atau representasi umum dari data. Namun, penting untuk memeriksa setiap pengamatan secara individual dan mempertimbangkan konteks analisis sebelum membuat keputusan tentang bagaimana menangani pengaruhnya.