Final Exam (UAS): Business Statistics Practicum

Author

William Christopher Linardi

Part I: Exploration

Explore the data in the wooldridge package. Source: https://quarto.org.

Make sure the "wooldridge" package is installed and loaded.

install.packages("wooldridge")
Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.4'
(as 'lib' is unspecified)
library(wooldridge)

(Optional)

#Modules 1-7

data("wage1")
df <- wage1

Data summary

summary(df)
      wage             educ           exper           tenure      
 Min.   : 0.530   Min.   : 0.00   Min.   : 1.00   Min.   : 0.000  
 1st Qu.: 3.330   1st Qu.:12.00   1st Qu.: 5.00   1st Qu.: 0.000  
 Median : 4.650   Median :12.00   Median :13.50   Median : 2.000  
 Mean   : 5.896   Mean   :12.56   Mean   :17.02   Mean   : 5.105  
 3rd Qu.: 6.880   3rd Qu.:14.00   3rd Qu.:26.00   3rd Qu.: 7.000  
 Max.   :24.980   Max.   :18.00   Max.   :51.00   Max.   :44.000  
    nonwhite          female          married           numdep     
 Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.000  
 1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.000  
 Median :0.0000   Median :0.0000   Median :1.0000   Median :1.000  
 Mean   :0.1027   Mean   :0.4791   Mean   :0.6084   Mean   :1.044  
 3rd Qu.:0.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:2.000  
 Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :6.000  
      smsa           northcen         south             west       
 Min.   :0.0000   Min.   :0.000   Min.   :0.0000   Min.   :0.0000  
 1st Qu.:0.0000   1st Qu.:0.000   1st Qu.:0.0000   1st Qu.:0.0000  
 Median :1.0000   Median :0.000   Median :0.0000   Median :0.0000  
 Mean   :0.7224   Mean   :0.251   Mean   :0.3555   Mean   :0.1692  
 3rd Qu.:1.0000   3rd Qu.:0.750   3rd Qu.:1.0000   3rd Qu.:0.0000  
 Max.   :1.0000   Max.   :1.000   Max.   :1.0000   Max.   :1.0000  
    construc          ndurman          trcommpu           trade       
 Min.   :0.00000   Min.   :0.0000   Min.   :0.00000   Min.   :0.0000  
 1st Qu.:0.00000   1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.:0.0000  
 Median :0.00000   Median :0.0000   Median :0.00000   Median :0.0000  
 Mean   :0.04563   Mean   :0.1141   Mean   :0.04373   Mean   :0.2871  
 3rd Qu.:0.00000   3rd Qu.:0.0000   3rd Qu.:0.00000   3rd Qu.:1.0000  
 Max.   :1.00000   Max.   :1.0000   Max.   :1.00000   Max.   :1.0000  
    services         profserv         profocc          clerocc      
 Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
 1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
 Median :0.0000   Median :0.0000   Median :0.0000   Median :0.0000  
 Mean   :0.1008   Mean   :0.2586   Mean   :0.3669   Mean   :0.1673  
 3rd Qu.:0.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:0.0000  
 Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
    servocc           lwage            expersq          tenursq       
 Min.   :0.0000   Min.   :-0.6349   Min.   :   1.0   Min.   :   0.00  
 1st Qu.:0.0000   1st Qu.: 1.2030   1st Qu.:  25.0   1st Qu.:   0.00  
 Median :0.0000   Median : 1.5369   Median : 182.5   Median :   4.00  
 Mean   :0.1407   Mean   : 1.6233   Mean   : 473.4   Mean   :  78.15  
 3rd Qu.:0.0000   3rd Qu.: 1.9286   3rd Qu.: 676.0   3rd Qu.:  49.00  
 Max.   :1.0000   Max.   : 3.2181   Max.   :2601.0   Max.   :1936.00  

Data summary: The wage1 data set contains 526 observations on 24 variables, including wage (wage), education (educ), work experience (exper), and tenure with the current employer (tenure). The mean wage is about $5.90, well above the median of $4.65.

Data structure

str(df)
'data.frame':   526 obs. of  24 variables:
 $ wage    : num  3.1 3.24 3 6 5.3 ...
 $ educ    : int  11 12 11 8 12 16 18 12 12 17 ...
 $ exper   : int  2 22 2 44 7 9 15 5 26 22 ...
 $ tenure  : int  0 2 0 28 2 8 7 3 4 21 ...
 $ nonwhite: int  0 0 0 0 0 0 0 0 0 0 ...
 $ female  : int  1 1 0 0 0 0 0 1 1 0 ...
 $ married : int  0 1 0 1 1 1 0 0 0 1 ...
 $ numdep  : int  2 3 2 0 1 0 0 0 2 0 ...
 $ smsa    : int  1 1 0 1 0 1 1 1 1 1 ...
 $ northcen: int  0 0 0 0 0 0 0 0 0 0 ...
 $ south   : int  0 0 0 0 0 0 0 0 0 0 ...
 $ west    : int  1 1 1 1 1 1 1 1 1 1 ...
 $ construc: int  0 0 0 0 0 0 0 0 0 0 ...
 $ ndurman : int  0 0 0 0 0 0 0 0 0 0 ...
 $ trcommpu: int  0 0 0 0 0 0 0 0 0 0 ...
 $ trade   : int  0 0 1 0 0 0 1 0 1 0 ...
 $ services: int  0 1 0 0 0 0 0 0 0 0 ...
 $ profserv: int  0 0 0 0 0 1 0 0 0 0 ...
 $ profocc : int  0 0 0 0 0 1 1 1 1 1 ...
 $ clerocc : int  0 0 0 1 0 0 0 0 0 0 ...
 $ servocc : int  0 1 0 0 0 0 0 0 0 0 ...
 $ lwage   : num  1.13 1.18 1.1 1.79 1.67 ...
 $ expersq : int  4 484 4 1936 49 81 225 25 676 484 ...
 $ tenursq : int  0 4 0 784 4 64 49 9 16 441 ...
 - attr(*, "time.stamp")= chr "25 Jun 2011 23:03"

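Data structure: All 24 variables are numeric. wage and lwage are stored as doubles (num) and the rest as integers (int). Categorical attributes such as female and married are coded as 0/1 dummy variables rather than as factors.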
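If factor versions of the 0/1 dummy columns are wanted, for example for plot legends or grouped summaries, they can be created explicitly. The following is a minimal sketch; the column name female_f and the labels "male" and "female" are illustrative assumptions, not part of the data set.

```r
library(wooldridge)
data("wage1")
df <- wage1

# female is stored as an integer dummy (0/1); build a labelled factor from it.
# The new column name and the labels are hypothetical choices for illustration.
df$female_f <- factor(df$female, levels = c(0, 1), labels = c("male", "female"))

# Counts per group
table(df$female_f)
```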

Data visualization

library(ggplot2)

Histogram of the wage variable

ggplot(df, aes(x = wage)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = 1, fill = "blue", color = "black") +
  labs(title = "Wage Distribution", x = "Wage", y = "Density") +
  geom_density(alpha = .2, fill = "#FF6666") +
  stat_function(fun = dnorm, args = list(mean = mean(df$wage, na.rm = TRUE), sd = sd(df$wage, na.rm = TRUE)), col = "red")

Scatter plot of wage against educ

ggplot(df, aes(x = educ, y = wage)) +
  geom_point(color = "blue") +
  labs(title = "Relationship between Education and Wage", x = "Education (years)", y = "Wage")

Boxplot of wage by education level

ggplot(df, aes(x = factor(educ), y = wage)) +
  geom_boxplot(fill = "orange", color = "black") +
  labs(title = "Wage Distribution by Education", x = "Education (years)", y = "Wage")

Visualization: The histogram shows that wages are right-skewed rather than normally distributed. Three quarters of the observations lie below $6.88, a long tail reaches $24.98, and the mean ($5.90) sits above the median ($4.65), so the overlaid normal curve fits the data poorly. The scatter plot shows a positive relationship between education and wage, and the boxplot shows how the wage distribution shifts across education levels.
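The visual impression of the wage distribution can be checked numerically. The sketch below compares the mean with the median and runs a Shapiro-Wilk normality test. The choice of test is an assumption made for illustration; the report itself relies only on the plots.

```r
library(wooldridge)
data("wage1")

# A mean clearly above the median is a simple indicator of right skew
centre <- c(mean = mean(wage1$wage), median = median(wage1$wage))
centre

# Shapiro-Wilk test: the null hypothesis is that the sample is normally distributed
sw <- shapiro.test(wage1$wage)
sw$p.value

# The data set also provides log(wage) as lwage; compare its mean and median
c(mean = mean(wage1$lwage), median = median(wage1$lwage))
```

A small p-value from the test speaks against normality, which is one reason wage models are often fitted to lwage instead of wage.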

Estimating the mean wage

mean_wage <- mean(df$wage, na.rm=TRUE)
mean_wage
[1] 5.896103

Confidence interval for the mean wage

ci <- t.test(df$wage)$conf.int
ci
[1] 5.579768 6.212437
attr(,"conf.level")
[1] 0.95

Mean wage estimate: The mean wage is about $5.90. The 95% confidence interval for the mean wage runs from $5.58 to $6.21.
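To see where the interval comes from, it can be rebuilt by hand as the sample mean plus or minus the t critical value times the standard error. This is a sketch of the standard one-sample formula; the variable names are illustrative.

```r
library(wooldridge)
data("wage1")

x  <- wage1$wage
n  <- length(x)
se <- sd(x) / sqrt(n)              # standard error of the mean
t_crit <- qt(0.975, df = n - 1)    # two-sided 95% critical value

# Lower and upper bound of the 95% confidence interval
ci_manual <- mean(x) + c(-1, 1) * t_crit * se
ci_manual
```

The result should agree with t.test(x)$conf.int, since t.test uses the same formula internally for a one-sample test.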

Hypothesis test

t_test_result <- t.test(df$wage, mu=5)
t_test_result

    One Sample t-test

data:  df$wage
t = 5.5649, df = 525, p-value = 4.186e-08
alternative hypothesis: true mean is not equal to 5
95 percent confidence interval:
 5.579768 6.212437
sample estimates:
mean of x 
 5.896103 

Hypothesis test result: The one-sample t-test of H0: mean wage = 5 gives t = 5.56 on 525 degrees of freedom with a very small p-value (4.186e-08, far below 0.05), so we reject the null hypothesis (H0). The mean wage differs significantly from $5.
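The test statistic itself is the distance between the sample mean and the hypothesized value, measured in standard errors. A minimal sketch that recomputes it without t.test, with illustrative variable names:

```r
library(wooldridge)
data("wage1")

x   <- wage1$wage
mu0 <- 5                                   # hypothesized mean under H0
se  <- sd(x) / sqrt(length(x))             # standard error of the mean

t_stat  <- (mean(x) - mu0) / se            # one-sample t statistic
p_value <- 2 * pt(-abs(t_stat), df = length(x) - 1)  # two-sided p-value

c(t = t_stat, p = p_value)
```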

#Module 8

Analysis of variance (ANOVA) is used to test whether mean wages differ significantly across education levels.

anova_result <- aov(wage ~ factor(educ), data = df)
summary(anova_result)
              Df Sum Sq Mean Sq F value Pr(>F)    
factor(educ)  17   1604   94.33   8.623 <2e-16 ***
Residuals    508   5557   10.94                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
TukeyHSD(anova_result)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = wage ~ factor(educ), data = df)

$`factor(educ)`
             diff          lwr       upr     p adj
2-0    0.21999991 -13.98553674 14.425537 1.0000000
3-0   -0.61000001 -14.81553667 13.595537 1.0000000
4-0   -0.36000009 -10.94818196 10.228182 1.0000000
5-0   -0.63000000 -14.83553665 13.575537 1.0000000
6-0    0.45499988  -9.01535788  9.925358 1.0000000
7-0    0.85749984  -9.18733146 10.902331 1.0000000
8-0    1.50818178  -7.05807914 10.074443 1.0000000
9-0   -0.25411772  -8.92472183  8.416486 1.0000000
10-0   0.30566659  -8.16487891  8.776212 1.0000000
11-0   0.65551716  -7.82415116  9.135185 1.0000000
12-0   1.84136356  -6.40152485 10.084252 0.9999983
13-0   2.06897431  -6.34026366 10.478212 0.9999928
14-0   2.70169803  -5.65318603 11.056582 0.9996713
15-0   2.79142850  -5.79181213 11.374669 0.9996456
16-0   4.51161757  -3.80969013 12.832925 0.9103100
17-0   7.81333323  -1.04537529 16.672042 0.1625694
18-0   7.14894720  -1.47348652 15.771381 0.2520636
3-2   -0.82999992 -17.23314074 15.573141 1.0000000
4-2   -0.58000000 -13.97310840 12.813108 1.0000000
5-2   -0.84999990 -17.25314073 15.553141 1.0000000
6-2    0.23499997 -12.29310577 12.763106 1.0000000
7-2    0.63749993 -12.33032151 13.605321 1.0000000
8-2    1.28818187 -10.57126935 13.147633 1.0000000
9-2   -0.47411763 -12.40915566 11.460920 1.0000000
10-2   0.08566668 -11.70483366 11.876167 1.0000000
11-2   0.43551725 -11.36153883 12.232573 1.0000000
12-2   1.62136365 -10.00666139 13.249389 1.0000000
13-2   1.84897440  -9.89755872 13.595508 1.0000000
14-2   2.48169812  -9.22598503 14.189381 0.9999992
15-2   2.57142859  -9.30029314 14.443150 0.9999989
16-2   4.29161766  -7.39212827 15.975364 0.9983383
17-2   7.59333332  -4.47905144 19.665718 0.7490359
18-2   6.92894729  -4.97114181 18.829036 0.8472369
4-3    0.24999992 -13.14310848 13.643108 1.0000000
5-3   -0.01999998 -16.42314080 16.383141 1.0000000
6-3    1.06499990 -11.46310584 13.593106 1.0000000
7-3    1.46749985 -11.50032159 14.435321 1.0000000
8-3    2.11818179  -9.74126943 13.977633 0.9999999
9-3    0.35588229 -11.57915574 12.290920 1.0000000
10-3   0.91566660 -10.87483374 12.706167 1.0000000
11-3   1.26551718 -10.53153890 13.062573 1.0000000
12-3   2.45136358  -9.17666147 14.079389 0.9999993
13-3   2.67897432  -9.06755880 14.425507 0.9999977
14-3   3.31169805  -8.39598511 15.019381 0.9999469
15-3   3.40142852  -8.47029322 15.273150 0.9999364
16-3   5.12161759  -6.56212835 16.805364 0.9875281
17-3   8.42333325  -3.64905152 20.495718 0.5722838
18-3   7.75894721  -4.14114188 19.659036 0.6927919
5-4   -0.26999990 -13.66310830 13.123108 1.0000000
6-4    0.81499998  -7.38657043  9.016570 1.0000000
7-4    1.21749993  -7.64120859 10.076208 1.0000000
8-4    1.86818187  -5.27036889  9.006733 0.9999824
9-4    0.10588237  -7.15755047  7.369315 1.0000000
10-4   0.66566668  -6.35773862  7.689072 1.0000000
11-4   1.01551726  -6.01888790  8.049922 1.0000000
12-4   2.20136366  -4.54573129  8.948459 0.9996303
13-4   2.42897440  -4.52036831  9.378317 0.9991029
14-4   3.06169812  -3.82177255  9.945169 0.9854780
15-4   3.15142860  -4.00748897 10.310346 0.9869624
16-4   4.87161767  -1.97106037 11.714296 0.5339395
17-4   8.17333333   0.68635813 15.660309 0.0168674
18-4   7.50894729   0.30308550 14.714809 0.0308996
6-5    1.08499988 -11.44310586 13.613106 1.0000000
7-5    1.48749983 -11.48032161 14.455321 1.0000000
8-5    2.13818177  -9.72126945 13.997633 0.9999999
9-5    0.37588227 -11.55915576 12.310920 1.0000000
10-5   0.93566658 -10.85483376 12.726167 1.0000000
11-5   1.28551716 -10.51153892 13.082573 1.0000000
12-5   2.47136356  -9.15666149 14.099389 0.9999992
13-5   2.69897430  -9.04755882 14.445507 0.9999974
14-5   3.33169803  -8.37598513 15.039381 0.9999422
15-5   3.42142850  -8.45029324 15.293150 0.9999310
16-5   5.14161757  -6.54212836 16.825364 0.9870071
17-5   8.44333323  -3.62905154 20.515718 0.5678143
18-5   7.77894719  -4.12114190 19.679036 0.6885327
7-6    0.40249995  -7.08447525  7.889475 1.0000000
8-6    1.05318189  -4.28882034  6.395184 0.9999997
9-6   -0.70911760  -6.21688826  4.798653 1.0000000
10-6  -0.14933330  -5.33646188  5.037795 1.0000000
11-6   0.20051728  -5.00149549  5.402530 1.0000000
12-6   1.38636368  -3.42002490  6.192752 0.9999302
13-6   1.61397442  -3.47242362  6.700372 0.9997441
14-6   2.24669815  -2.74932522  7.242722 0.9837400
15-6   2.33642862  -3.03275956  7.705617 0.9884564
16-6   4.05661769  -0.88305069  8.996286 0.2677057
17-6   7.35833335   1.55894730 13.157719 0.0014383
18-6   6.69394731   1.26232462 12.125570 0.0024727
8-7    0.65068194  -5.65391539  6.955279 1.0000000
9-7   -1.11161756  -7.55727479  5.334040 1.0000000
10-7  -0.55183325  -6.72575117  5.622085 1.0000000
11-7  -0.20198267  -6.38841107  5.984446 1.0000000
12-7   0.98386372  -4.87380908  6.841537 1.0000000
13-7   1.21147447  -4.87805791  7.301007 0.9999997
14-7   1.84419819  -4.17005231  7.858449 0.9998384
15-7   1.93392867  -4.39372028  8.261578 0.9998454
16-7   3.65411773  -2.31340128  9.621637 0.7867720
17-7   6.95583340   0.25927920 13.652388 0.0321396
18-7   6.29144736  -0.08926456 12.672159 0.0582831
9-8   -1.76229950  -5.50778843  1.983189 0.9744517
10-8  -1.20251519  -4.45819525  2.053165 0.9982220
11-8  -0.85266461  -4.13200701  2.426678 0.9999840
12-8   0.33318178  -2.27344840  2.939812 1.0000000
13-8   0.56079253  -2.53187539  3.653460 0.9999999
14-8   1.19351625  -1.74814982  4.135182 0.9946413
15-8   1.28324673  -2.25530261  4.821796 0.9985795
16-8   3.00343580   0.15853273  5.848339 0.0262586
17-8   6.30515146   2.14269685 10.467606 0.0000241
18-8   5.64076542   2.00818027  9.273351 0.0000120
10-9   0.55978431  -2.96129540  4.080864 1.0000000
11-9   0.90963488  -2.63333509  4.452605 0.9999867
12-9   2.09548128  -0.83591268  5.026875 0.5261413
13-9   2.32309203  -1.04783433  5.694018 0.5953861
14-9   2.95581575  -0.27713166  6.188763 0.1209084
15-9   3.04554622  -0.73861565  6.829708 0.3021687
16-9   4.76573529   1.62057661  7.910894 0.0000238
17-9   8.06745095   3.69428528 12.440617 0.0000000
18-9   7.40306492   3.53082725 11.275303 0.0000000
11-10  0.34985058  -2.67064633  3.370347 1.0000000
12-10  1.53569698  -0.73671054  3.808104 0.6310395
13-10  1.76330772  -1.05341197  4.580027 0.7559638
14-10  2.39603145  -0.25400898  5.046072 0.1329766
15-10  2.48576192  -0.81433602  5.785860 0.4248850
16-10  4.20595099   1.66374949  6.748152 0.0000017
17-10  7.50766665   3.54593176 11.469402 0.0000000
18-10  6.84328061   3.44254689 10.244014 0.0000000
12-11  1.18584640  -1.12033439  3.492027 0.9422109
13-11  1.41345715  -1.43057938  4.257494 0.9572551
14-11  2.04618087  -0.63287647  4.725238 0.3982918
15-11  2.13591134  -1.18753271  5.459355 0.7160768
16-11  3.85610041   1.28366531  6.428536 0.0000319
17-11  7.15781607   3.17661308 11.139019 0.0000001
18-11  6.49343003   3.07003643  9.916824 0.0000000
13-12  0.22761075  -1.80437796  2.259599 1.0000000
14-12  0.86033447  -0.93348340  2.654152 0.9692863
15-12  0.95006494  -1.71183563  3.611966 0.9988340
16-12  2.67025401   1.03996057  4.300547 0.0000024
17-12  5.97196967   2.52372206  9.420217 0.0000004
18-12  5.30758363   2.52189555  8.093272 0.0000000
14-13  0.63272372  -1.81428598  3.079733 0.9999852
15-13  0.72245420  -2.41693884  3.861847 0.9999973
16-13  2.44264326   0.11285129  4.772435 0.0286504
17-13  5.74435892   1.91545758  9.573260 0.0000312
18-13  5.07997289   1.83495639  8.324989 0.0000095
15-14  0.08973047  -2.90102078  3.080482 1.0000000
16-14  1.80991954  -0.31534164  3.935181 0.2106537
17-14  5.11163520   1.40363170  8.819639 0.0002535
18-14  4.44724916   1.34580813  7.548690 0.0000999
16-15  1.72018907  -1.17543991  4.615818 0.8247757
17-15  5.02190473   0.82461726  9.219192 0.0041480
18-15  4.35751869   0.68507147  8.029966 0.0047696
17-16  3.30171566  -0.33000076  6.933432 0.1269268
18-16  2.63732962  -0.37248942  5.647149 0.1709978
18-17 -0.66438604  -4.94125002  3.612478 1.0000000

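ANOVA result: The F-test gives F = 8.623 on 17 and 508 degrees of freedom with p < 2e-16, far below 0.05, so mean wages differ significantly across education levels. In the Tukey HSD comparisons, every significant pair (p adj < 0.05) compares 16, 17, or 18 years of education with a level of 15 years or fewer.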
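With 153 pairwise comparisons, the Tukey table is easier to read after filtering. The sketch below keeps only the pairs whose adjusted p-value is below 0.05. The 0.05 cutoff matches the significance level used in this report; the object names are illustrative.

```r
library(wooldridge)
data("wage1")

fit <- aov(wage ~ factor(educ), data = wage1)

# TukeyHSD returns a list with one matrix per model term
tk <- TukeyHSD(fit)[["factor(educ)"]]

# Keep only the significant pairs and sort them by adjusted p-value
sig <- tk[tk[, "p adj"] < 0.05, , drop = FALSE]
sig[order(sig[, "p adj"]), ]
```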

#Module 9

Linear regression is used to examine the relationship between wage (wage) and several independent variables: education (educ), work experience (exper), and tenure (tenure).

Linear regression

reg_model <- lm(wage ~ educ + exper + tenure, data = df)
summary(reg_model)

Call:
lm(formula = wage ~ educ + exper + tenure, data = df)

Residuals:
    Min      1Q  Median      3Q     Max 
-7.6068 -1.7747 -0.6279  1.1969 14.6536 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -2.87273    0.72896  -3.941 9.22e-05 ***
educ         0.59897    0.05128  11.679  < 2e-16 ***
exper        0.02234    0.01206   1.853   0.0645 .  
tenure       0.16927    0.02164   7.820 2.93e-14 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.084 on 522 degrees of freedom
Multiple R-squared:  0.3064,    Adjusted R-squared:  0.3024 
F-statistic: 76.87 on 3 and 522 DF,  p-value: < 2.2e-16

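Regression result: Education and tenure have statistically significant positive effects on wage (p < 0.001 for both), while experience is significant only at the 10% level (p = 0.0645). Holding the other variables fixed, one more year of education is associated with a wage about $0.60 higher, one more year of tenure with about $0.17, and one more year of experience with about $0.02. The model explains about 31% of the variation in wage (R-squared = 0.3064). Visualization: The scatter plot of wage against educ above is consistent with the positive education coefficient.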
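The fitted coefficients can be turned into a prediction for a specific worker. The sketch below uses a hypothetical profile (12 years of education, 5 years of experience, 2 years of tenure) that is not taken from the report.

```r
library(wooldridge)
data("wage1")

reg_model <- lm(wage ~ educ + exper + tenure, data = wage1)

# Hypothetical worker profile, chosen only for illustration
new_worker <- data.frame(educ = 12, exper = 5, tenure = 2)

# Point prediction with a 95% confidence interval for the mean wage
pred <- predict(reg_model, newdata = new_worker, interval = "confidence")
pred
```

Plugging the same values into the printed coefficients by hand gives -2.87273 + 0.59897 * 12 + 0.02234 * 5 + 0.16927 * 2, which is roughly 4.77.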

#Module 10

Forecasting requires time series data. If a data set contains a variable observed over time, methods such as ARIMA or ETS can be applied.

The wage1 data set is not suitable for forecasting because it is not a time series. We therefore pick a more suitable data set from the "wooldridge" package: intdef, which contains time series data.

Load the data set

data("intdef")
# intdef holds one observation per year, so the frequency is 1 (not 4)
ts_data <- ts(intdef$inf, start = 1948, frequency = 1)

Plot the time series

plot(ts_data, main = "Inflation, 1948 to 2003", xlab = "Year", ylab = "Inflation")

Forecasting model using ARIMA

library(forecast)
Registered S3 method overwritten by 'quantmod':
  method            from
  as.zoo.data.frame zoo 
arima_model <- auto.arima(ts_data)
forecast_values <- forecast(arima_model, h = 20)

Forecast plot

plot(forecast_values, main = "Inflation Forecast", xlab = "Year", ylab = "Inflation")

Forecast result: An ARIMA model, selected automatically by auto.arima, is fitted to the historical inflation series and used to predict future values. The forecast plot shows the predicted inflation for the next 20 periods (h = 20) together with prediction intervals.
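Before relying on the forecast, it helps to look at which model auto.arima selected and whether its residuals still show autocorrelation. The sketch below assumes that intdef is annual and takes the start year from its year column. The Ljung-Box lag of 10 is a rule-of-thumb choice, not something specified in the report.

```r
library(wooldridge)
library(forecast)
data("intdef")

# Annual inflation series; the start year is read from the data (assumed annual)
infl <- ts(intdef$inf, start = min(intdef$year), frequency = 1)

fit <- auto.arima(infl)
fit   # prints the selected ARIMA order and the coefficient estimates

# Ljung-Box test on the residuals: a large p-value suggests no remaining
# autocorrelation. lag = 10 is an assumed rule-of-thumb value.
lb <- Box.test(residuals(fit), lag = 10, type = "Ljung-Box")
lb
```

If the test rejects, the forecast intervals are likely too narrow and a different model specification is worth trying.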