Question 2 Use the DISCRIM data to answer this question. These are zip-code-level data on the prices of various items at fast-food restaurants in New Jersey and Pennsylvania, together with characteristics of the zip code population. The idea is to see whether fast-food restaurants charge higher prices in areas with a larger concentration of Black residents. Our model is psoda = β0 + β1 prpblck + β2 income + u

a State what the variables in the model, and the variable prppov, mean.

b Find the mean values of prpblck and income, along with their standard deviations. What are the units of measurement of prpblck and income?

c Estimate this model by OLS and report the results in equation form, including n and R-squared. (Do not use scientific notation when reporting the estimates.) Interpret the coefficient on prpblck. Do you think it is economically large?

d Estimate the simple regression psoda = β0 + β1 prpblck + u. Is the discrimination effect larger or smaller than in the model where you control for income?

Answer 2a

library(wooldridge)  # assumed source of the discrim data; the original does not show this line
data("discrim")
head(discrim)
##   psoda pfries pentree wagest nmgrs nregs hrsopen  emp psoda2 pfries2 pentree2
## 1  1.12   1.06    1.02   4.25     3     5    16.0 27.5   1.11    1.11     1.05
## 2  1.06   0.91    0.95   4.75     3     3    16.5 21.5   1.05    0.89     0.95
## 3  1.06   0.91    0.98   4.25     3     5    18.0 30.0   1.05    0.94     0.98
## 4  1.12   1.02    1.06   5.00     4     5    16.0 27.5   1.15    1.05     1.05
## 5  1.12     NA    0.49   5.00     3     3    16.0  5.0   1.04    1.01     0.58
## 6  1.06   0.95    1.01   4.25     4     4    15.0 17.5   1.05    0.94     1.00
##   wagest2 nmgrs2 nregs2 hrsopen2 emp2 compown chain density    crmrte state
## 1    5.05      5      5     15.0 27.0       1     3    4030 0.0528866     1
## 2    5.05      4      3     17.5 24.5       0     1    4030 0.0528866     1
## 3    5.05      4      5     17.5 25.0       0     1   11400 0.0360003     1
## 4    5.05      4      5     16.0   NA       0     3    8345 0.0484232     1
## 5    5.05      3      3     16.0 12.0       0     1     720 0.0615890     1
## 6    5.05      3      4     15.0 28.0       0     1    4424 0.0334823     1
##     prpblck    prppov   prpncar hseval nstores income county     lpsoda
## 1 0.1711542 0.0365789 0.0788428 148300       3  44534     18 0.11332869
## 2 0.1711542 0.0365789 0.0788428 148300       3  44534     18 0.05826885
## 3 0.0473602 0.0879072 0.2694298 169200       3  41164     12 0.05826885
## 4 0.0528394 0.0591227 0.1366903 171600       3  50366     10 0.11332869
## 5 0.0344800 0.0254145 0.0738020 249100       1  72287     10 0.11332869
## 6 0.0591327 0.0835001 0.1151341 148000       2  44515     18 0.05826885
##       lpfries  lhseval  lincome ldensity NJ BK KFC RR
## 1  0.05826885 11.90699 10.70401 8.301521  1  0   0  1
## 2 -0.09431065 11.90699 10.70401 8.301521  1  1   0  0
## 3 -0.09431065 12.03884 10.62532 9.341369  1  1   0  0
## 4  0.01980261 12.05292 10.82707 9.029418  1  0   0  1
## 5          NA 12.42561 11.18840 6.579251  1  1   0  0
## 6 -0.05129331 11.90497 10.70358 8.394799  1  1   0  0
help(discrim)

Answer 2b

mean(discrim$prpblck)
## [1] NA
sd(discrim$prpblck)
## [1] NA
mean(discrim$income)
## [1] NA
sd(discrim$income)
## [1] NA
sum(is.na(discrim$prpblck))
## [1] 1
sum(is.na(discrim$income))
## [1] 1
mean(discrim$prpblck, na.rm = TRUE)
## [1] 0.1134864
sd(discrim$prpblck, na.rm = TRUE)
## [1] 0.1824165
mean(discrim$income, na.rm = TRUE)
## [1] 47053.78
sd(discrim$income, na.rm = TRUE)
## [1] 13179.29
library(vtable)
## Loading required package: kableExtra
sumtable(discrim, summ = c('notNA(x)', 'countNA(x)', 'mean(x)', 'sd(x)'), out = 'return')
##    Variable NotNA CountNA       Mean        Sd
## 1     psoda   402       8      1.045     0.089
## 2    pfries   393      17      0.922     0.106
## 3   pentree   398      12      1.322     0.643
## 4    wagest   390      20      4.616     0.347
## 5     nmgrs   404       6       3.42     1.018
## 6     nregs   388      22      3.608     1.244
## 7   hrsopen   410       0     14.439      2.81
## 8       emp   404       6     17.622     9.423
## 9    psoda2   388      22      1.045     0.094
## 10  pfries2   382      28      0.941     0.109
## 11 pentree2   386      24      1.354      0.65
## 12  wagest2   389      21      4.996     0.253
## 13   nmgrs2   404       6      3.484      1.14
## 14   nregs2   388      22      3.608     1.244
## 15 hrsopen2   399      11     14.466     2.752
## 16     emp2   397      13     17.567     8.607
## 17  compown   410       0      0.344     0.476
## 18    chain   410       0      2.117      1.11
## 19  density   409       1   4561.803  5132.408
## 20   crmrte   409       1      0.053     0.047
## 21    state   410       0      1.193     0.395
## 22  prpblck   409       1      0.113     0.182
## 23   prppov   409       1      0.071     0.067
## 24  prpncar   409       1      0.115     0.117
## 25   hseval   409       1 147399.267 56070.468
## 26  nstores   410       0      3.139     1.809
## 27   income   409       1  47053.785 13179.286
## 28   county   410       0     13.659     8.045
## 29   lpsoda   402       8       0.04     0.085
## 30  lpfries   393      17     -0.088     0.115
## 31  lhseval   409       1     11.829     0.389
## 32  lincome   409       1      10.72     0.284
## 33 ldensity   409       1      7.959     0.996
## 34       NJ   410       0      0.807     0.395
## 35       BK   410       0      0.417     0.494
## 36      KFC   410       0      0.195     0.397
## 37       RR   410       0      0.241     0.428

Answer 2c

discrimreg <- lm(psoda ~ prpblck + income, data = discrim)
summary(discrimreg)
##
## Call:
## lm(formula = psoda ~ prpblck + income, data = discrim)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.29401 -0.05242  0.00333  0.04231  0.44322
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.563e-01  1.899e-02  50.354  < 2e-16 ***
## prpblck     1.150e-01  2.600e-02   4.423 1.26e-05 ***
## income      1.603e-06  3.618e-07   4.430 1.22e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.08611 on 398 degrees of freedom
##   (9 observations deleted due to missingness)
## Multiple R-squared:  0.06422, Adjusted R-squared:  0.05952
## F-statistic: 13.66 on 2 and 398 DF,  p-value: 1.835e-06

Answer 2d

basitdiscrimreg <- lm(psoda ~ prpblck, data = discrim)
summary(basitdiscrimreg)
##
## Call:
## lm(formula = psoda ~ prpblck, data = discrim)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.30884 -0.05963  0.01135  0.03206  0.44840
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  1.03740    0.00519  199.87  < 2e-16 ***
## prpblck      0.06493    0.02396    2.71  0.00702 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.0881 on 399 degrees of freedom
##   (9 observations deleted due to missingness)
## Multiple R-squared:  0.01808, Adjusted R-squared:  0.01561
## F-statistic: 7.345 on 1 and 399 DF,  p-value: 0.007015
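
Parts c and d ask for the estimates in equation form without scientific notation, and for a comparison of the prpblck coefficient across the two models. The sketch below transcribes the two R summaries into that form, with standard errors in parentheses. The sample size n = 401 is inferred from the residual degrees of freedom (398 + 3 and 399 + 2), and the 0.10 increase in prpblck is an illustrative choice that does not appear in the original. It also assumes psoda is measured in dollars and prpblck is a proportion between 0 and 1, which the sample means (1.045 and 0.113) are consistent with.

```latex
\begin{align*}
% Part c: multiple regression, coefficients written out in decimal form
\widehat{\mathit{psoda}} &= \underset{(0.0190)}{0.9563}
  + \underset{(0.0260)}{0.1150}\,\mathit{prpblck}
  + \underset{(0.00000036)}{0.00000160}\,\mathit{income},
  \qquad n = 401,\quad R^2 = 0.0642 \\[6pt]
% Part d: simple regression without income
\widehat{\mathit{psoda}} &= \underset{(0.0052)}{1.0374}
  + \underset{(0.0240)}{0.0649}\,\mathit{prpblck},
  \qquad n = 401,\quad R^2 = 0.0181 \\[6pt]
% Illustrative effect of a 0.10 increase in prpblck (ten percentage points)
\Delta\widehat{\mathit{psoda}}_{\text{with income}} &= 0.1150 \times 0.10 = 0.0115 \\
\Delta\widehat{\mathit{psoda}}_{\text{without income}} &= 0.0649 \times 0.10 \approx 0.0065
\end{align*}
```

Under these assumptions, a ten-percentage-point increase in the proportion of Black residents is associated with a soda price about 1.2 cents higher when income is held fixed, and about 0.6 cents higher when it is not. The estimated discrimination effect is therefore smaller in the simple regression (0.0649 versus 0.1150). Because the income coefficient is positive, this pattern would be expected if prpblck and income are negatively correlated, although the output here does not report that correlation.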