SSB 201 – Statistics for Social Sciences I

Final Exam

library(MASS)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following object is masked from 'package:MASS':
## 
##     select
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)

Question 1 – Inspecting and Tidying the Data Set

1.a) Examine the variable names and data types in the Boston data set.

data("Boston")
head(Boston)
##      crim zn indus chas   nox    rm  age    dis rad tax ptratio  black lstat
## 1 0.00632 18  2.31    0 0.538 6.575 65.2 4.0900   1 296    15.3 396.90  4.98
## 2 0.02731  0  7.07    0 0.469 6.421 78.9 4.9671   2 242    17.8 396.90  9.14
## 3 0.02729  0  7.07    0 0.469 7.185 61.1 4.9671   2 242    17.8 392.83  4.03
## 4 0.03237  0  2.18    0 0.458 6.998 45.8 6.0622   3 222    18.7 394.63  2.94
## 5 0.06905  0  2.18    0 0.458 7.147 54.2 6.0622   3 222    18.7 396.90  5.33
## 6 0.02985  0  2.18    0 0.458 6.430 58.7 6.0622   3 222    18.7 394.12  5.21
##   medv
## 1 24.0
## 2 21.6
## 3 34.7
## 4 33.4
## 5 36.2
## 6 28.7
names(Boston)
##  [1] "crim"    "zn"      "indus"   "chas"    "nox"     "rm"      "age"    
##  [8] "dis"     "rad"     "tax"     "ptratio" "black"   "lstat"   "medv"
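
The output above lists the variable names, but the question also asks for data types. A minimal sketch of one way to check them, assuming the MASS package is installed; var_types is a name introduced here for illustration only:

```r
library(MASS)   # provides the Boston data set

data("Boston")

# str() prints each column's name, storage type, and first few values
str(Boston)

# Collect the class of every column in a named character vector
# (var_types is an illustrative name, not part of the original answer)
var_types <- sapply(Boston, class)
print(var_types)
```

str() is usually enough on its own; the sapply() call is only a compact way to keep the types in an object for later use.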

1.b) Create a new data set named boston_tr by selecting the following variables:

  • konut_degeri (medv)
  • oda_sayisi (rm)
  • dusuk_sosyoek (lstat)
  • nehir_kenari (chas)
  • emlak_vergisi (tax)
boston_tr <- Boston %>% dplyr::select(medv, rm, lstat, chas, tax)
head(boston_tr)
##   medv    rm lstat chas tax
## 1 24.0 6.575  4.98    0 296
## 2 21.6 6.421  9.14    0 242
## 3 34.7 7.185  4.03    0 242
## 4 33.4 6.998  2.94    0 222
## 5 36.2 7.147  5.33    0 222
## 6 28.7 6.430  5.21    0 222

1.c) Rename the variables in this new data set to their Turkish names and save the result back to boston_tr.

# rename() takes new_name = old_name pairs; the result is stored in boston_tr2
boston_tr2 <- boston_tr %>%
  rename(
    konut_degeri = medv,
    oda_sayisi = rm,
    dusuk_sosyoek = lstat,
    nehir_kenari = chas,
    emlak_vergisi = tax
  )
head(boston_tr2)
##   konut_degeri oda_sayisi dusuk_sosyoek nehir_kenari emlak_vergisi
## 1         24.0      6.575          4.98            0           296
## 2         21.6      6.421          9.14            0           242
## 3         34.7      7.185          4.03            0           242
## 4         33.4      6.998          2.94            0           222
## 5         36.2      7.147          5.33            0           222
## 6         28.7      6.430          5.21            0           222
names(boston_tr2)
## [1] "konut_degeri"  "oda_sayisi"    "dusuk_sosyoek" "nehir_kenari" 
## [5] "emlak_vergisi"

Question 2 – Descriptive Statistics

2.a) Use the summary() function to obtain an overall summary of the boston_tr data set.

summary(boston_tr2)
##   konut_degeri     oda_sayisi    dusuk_sosyoek    nehir_kenari    
##  Min.   : 5.00   Min.   :3.561   Min.   : 1.73   Min.   :0.00000  
##  1st Qu.:17.02   1st Qu.:5.886   1st Qu.: 6.95   1st Qu.:0.00000  
##  Median :21.20   Median :6.208   Median :11.36   Median :0.00000  
##  Mean   :22.53   Mean   :6.285   Mean   :12.65   Mean   :0.06917  
##  3rd Qu.:25.00   3rd Qu.:6.623   3rd Qu.:16.95   3rd Qu.:0.00000  
##  Max.   :50.00   Max.   :8.780   Max.   :37.97   Max.   :1.00000  
##  emlak_vergisi  
##  Min.   :187.0  
##  1st Qu.:279.0  
##  Median :330.0  
##  Mean   :408.2  
##  3rd Qu.:666.0  
##  Max.   :711.0

2.b) Report and interpret the mean, median, and range of the home value variable.

mean(boston_tr2$konut_degeri)
## [1] 22.53281
median(boston_tr2$konut_degeri)
## [1] 21.2
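
The question also asks for the range, which the code above does not compute. A short sketch using base R follows; deger_araligi and ranj are illustrative names. Since the summary() output reports a minimum of 5.00 and a maximum of 50.00, the range should come out as 45.

```r
library(MASS)
data("Boston")

# konut_degeri corresponds to medv in the original Boston data
konut_degeri <- Boston$medv

# range() returns the minimum and maximum as a length-2 vector
deger_araligi <- range(konut_degeri)

# The range statistic is the difference between the maximum and the minimum
ranj <- diff(deger_araligi)

print(deger_araligi)
print(ranj)
```

The mean (22.53) lying above the median (21.2) suggests a right-skewed distribution, which is one point an interpretation could mention.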

Question 3 – Frequency and Percentage Tables

table(boston_tr2$konut_degeri)
## 
##    5  5.6  6.3    7  7.2  7.4  7.5  8.1  8.3  8.4  8.5  8.7  8.8  9.5  9.6  9.7 
##    2    1    1    2    3    1    1    1    2    2    2    1    2    1    1    1 
## 10.2 10.4 10.5 10.8 10.9   11 11.3 11.5 11.7 11.8 11.9   12 12.1 12.3 12.5 12.6 
##    3    2    2    1    2    1    1    1    2    2    2    1    1    1    1    1 
## 12.7 12.8   13 13.1 13.2 13.3 13.4 13.5 13.6 13.8 13.9   14 14.1 14.2 14.3 14.4 
##    3    1    1    4    1    3    4    2    2    5    2    1    3    1    2    2 
## 14.5 14.6 14.8 14.9   15 15.1 15.2 15.3 15.4 15.6 15.7   16 16.1 16.2 16.3 16.4 
##    3    2    1    3    3    1    3    1    2    5    1    1    3    2    1    1 
## 16.5 16.6 16.7 16.8   17 17.1 17.2 17.3 17.4 17.5 17.6 17.7 17.8 17.9   18 18.1 
##    2    2    2    2    1    3    3    1    3    3    1    1    5    1    1    1 
## 18.2 18.3 18.4 18.5 18.6 18.7 18.8 18.9   19 19.1 19.2 19.3 19.4 19.5 19.6 19.7 
##    3    2    3    4    2    3    2    4    2    4    2    5    6    4    5    2 
## 19.8 19.9   20 20.1 20.2 20.3 20.4 20.5 20.6 20.7 20.8 20.9   21 21.1 21.2 21.4 
##    3    4    5    5    2    4    4    3    6    2    3    2    3    2    5    5 
## 21.5 21.6 21.7 21.8 21.9   22 22.1 22.2 22.3 22.4 22.5 22.6 22.7 22.8 22.9   23 
##    2    2    7    2    3    7    1    5    2    2    3    5    2    4    4    4 
## 23.1 23.2 23.3 23.4 23.5 23.6 23.7 23.8 23.9   24 24.1 24.2 24.3 24.4 24.5 24.6 
##    7    4    4    2    1    2    4    4    5    2    3    1    3    4    3    2 
## 24.7 24.8   25 25.1 25.2 25.3 26.2 26.4 26.5 26.6 26.7   27 27.1 27.5 27.9   28 
##    3    4    8    1    1    1    1    2    1    3    1    1    2    4    2    1 
## 28.1 28.2 28.4 28.5 28.6 28.7   29 29.1 29.4 29.6 29.8 29.9 30.1 30.3 30.5 30.7 
##    1    1    2    1    1    3    2    2    1    2    2    1    3    1    1    1 
## 30.8   31 31.1 31.2 31.5 31.6 31.7   32 32.2 32.4 32.5 32.7 32.9   33 33.1 33.2 
##    1    1    1    1    2    2    1    2    1    1    1    1    1    1    2    2 
## 33.3 33.4 33.8 34.6 34.7 34.9 35.1 35.2 35.4   36 36.1 36.2 36.4 36.5   37 37.2 
##    1    2    1    1    1    3    1    1    2    1    1    2    1    1    1    1 
## 37.3 37.6 37.9 38.7 39.8 41.3 41.7 42.3 42.8 43.1 43.5 43.8   44 44.8 45.4   46 
##    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1 
## 46.7 48.3 48.5 48.8   50 
##    1    1    1    1   16
prop.table(table(boston_tr2$konut_degeri)) * 100
## 
##         5       5.6       6.3         7       7.2       7.4       7.5       8.1 
## 0.3952569 0.1976285 0.1976285 0.3952569 0.5928854 0.1976285 0.1976285 0.1976285 
##       8.3       8.4       8.5       8.7       8.8       9.5       9.6       9.7 
## 0.3952569 0.3952569 0.3952569 0.1976285 0.3952569 0.1976285 0.1976285 0.1976285 
##      10.2      10.4      10.5      10.8      10.9        11      11.3      11.5 
## 0.5928854 0.3952569 0.3952569 0.1976285 0.3952569 0.1976285 0.1976285 0.1976285 
##      11.7      11.8      11.9        12      12.1      12.3      12.5      12.6 
## 0.3952569 0.3952569 0.3952569 0.1976285 0.1976285 0.1976285 0.1976285 0.1976285 
##      12.7      12.8        13      13.1      13.2      13.3      13.4      13.5 
## 0.5928854 0.1976285 0.1976285 0.7905138 0.1976285 0.5928854 0.7905138 0.3952569 
##      13.6      13.8      13.9        14      14.1      14.2      14.3      14.4 
## 0.3952569 0.9881423 0.3952569 0.1976285 0.5928854 0.1976285 0.3952569 0.3952569 
##      14.5      14.6      14.8      14.9        15      15.1      15.2      15.3 
## 0.5928854 0.3952569 0.1976285 0.5928854 0.5928854 0.1976285 0.5928854 0.1976285 
##      15.4      15.6      15.7        16      16.1      16.2      16.3      16.4 
## 0.3952569 0.9881423 0.1976285 0.1976285 0.5928854 0.3952569 0.1976285 0.1976285 
##      16.5      16.6      16.7      16.8        17      17.1      17.2      17.3 
## 0.3952569 0.3952569 0.3952569 0.3952569 0.1976285 0.5928854 0.5928854 0.1976285 
##      17.4      17.5      17.6      17.7      17.8      17.9        18      18.1 
## 0.5928854 0.5928854 0.1976285 0.1976285 0.9881423 0.1976285 0.1976285 0.1976285 
##      18.2      18.3      18.4      18.5      18.6      18.7      18.8      18.9 
## 0.5928854 0.3952569 0.5928854 0.7905138 0.3952569 0.5928854 0.3952569 0.7905138 
##        19      19.1      19.2      19.3      19.4      19.5      19.6      19.7 
## 0.3952569 0.7905138 0.3952569 0.9881423 1.1857708 0.7905138 0.9881423 0.3952569 
##      19.8      19.9        20      20.1      20.2      20.3      20.4      20.5 
## 0.5928854 0.7905138 0.9881423 0.9881423 0.3952569 0.7905138 0.7905138 0.5928854 
##      20.6      20.7      20.8      20.9        21      21.1      21.2      21.4 
## 1.1857708 0.3952569 0.5928854 0.3952569 0.5928854 0.3952569 0.9881423 0.9881423 
##      21.5      21.6      21.7      21.8      21.9        22      22.1      22.2 
## 0.3952569 0.3952569 1.3833992 0.3952569 0.5928854 1.3833992 0.1976285 0.9881423 
##      22.3      22.4      22.5      22.6      22.7      22.8      22.9        23 
## 0.3952569 0.3952569 0.5928854 0.9881423 0.3952569 0.7905138 0.7905138 0.7905138 
##      23.1      23.2      23.3      23.4      23.5      23.6      23.7      23.8 
## 1.3833992 0.7905138 0.7905138 0.3952569 0.1976285 0.3952569 0.7905138 0.7905138 
##      23.9        24      24.1      24.2      24.3      24.4      24.5      24.6 
## 0.9881423 0.3952569 0.5928854 0.1976285 0.5928854 0.7905138 0.5928854 0.3952569 
##      24.7      24.8        25      25.1      25.2      25.3      26.2      26.4 
## 0.5928854 0.7905138 1.5810277 0.1976285 0.1976285 0.1976285 0.1976285 0.3952569 
##      26.5      26.6      26.7        27      27.1      27.5      27.9        28 
## 0.1976285 0.5928854 0.1976285 0.1976285 0.3952569 0.7905138 0.3952569 0.1976285 
##      28.1      28.2      28.4      28.5      28.6      28.7        29      29.1 
## 0.1976285 0.1976285 0.3952569 0.1976285 0.1976285 0.5928854 0.3952569 0.3952569 
##      29.4      29.6      29.8      29.9      30.1      30.3      30.5      30.7 
## 0.1976285 0.3952569 0.3952569 0.1976285 0.5928854 0.1976285 0.1976285 0.1976285 
##      30.8        31      31.1      31.2      31.5      31.6      31.7        32 
## 0.1976285 0.1976285 0.1976285 0.1976285 0.3952569 0.3952569 0.1976285 0.3952569 
##      32.2      32.4      32.5      32.7      32.9        33      33.1      33.2 
## 0.1976285 0.1976285 0.1976285 0.1976285 0.1976285 0.1976285 0.3952569 0.3952569 
##      33.3      33.4      33.8      34.6      34.7      34.9      35.1      35.2 
## 0.1976285 0.3952569 0.1976285 0.1976285 0.1976285 0.5928854 0.1976285 0.1976285 
##      35.4        36      36.1      36.2      36.4      36.5        37      37.2 
## 0.3952569 0.1976285 0.1976285 0.3952569 0.1976285 0.1976285 0.1976285 0.1976285 
##      37.3      37.6      37.9      38.7      39.8      41.3      41.7      42.3 
## 0.1976285 0.1976285 0.1976285 0.1976285 0.1976285 0.1976285 0.1976285 0.1976285 
##      42.8      43.1      43.5      43.8        44      44.8      45.4        46 
## 0.1976285 0.1976285 0.1976285 0.1976285 0.1976285 0.1976285 0.1976285 0.1976285 
##      46.7      48.3      48.5      48.8        50 
## 0.1976285 0.1976285 0.1976285 0.1976285 3.1620553
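
Because konut_degeri is continuous, the tables above list almost every distinct value on its own. One possible alternative, sketched here as an assumption rather than as part of the original answer, is to bin the values with cut() before tabulating. The break points (width 10) and the names konut_grubu, frekans and yuzde are hypothetical choices.

```r
library(MASS)
data("Boston")

konut_degeri <- Boston$medv

# Group values into the intervals (0,10], (10,20], ..., (40,50]
konut_grubu <- cut(konut_degeri, breaks = seq(0, 50, by = 10))

# Frequency and percentage tables of the binned variable
frekans <- table(konut_grubu)
yuzde <- prop.table(frekans) * 100

print(frekans)
print(round(yuzde, 1))
```

Other bin widths would work equally well; the point is only that a handful of intervals is easier to read than a table with one column per value.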

Note: Answer the following questions using the boston_tr data set.

3.a) Create a frequency table for the riverside (chas) variable.

table(boston_tr2$nehir_kenari)
## 
##   0   1 
## 471  35

3.b) Compute the percentage (%) distribution of the same variable.

prop.table(table(boston_tr2$nehir_kenari)) * 100
## 
##         0         1 
## 93.083004  6.916996

Question 4 – Scatter Plot and Correlation

4.a) Create a scatter plot showing the relationship between number of rooms (rm) and home value (medv). Add axis labels and a plot title.
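
No code accompanies this question in the source. A hedged sketch of one possible answer with ggplot2 follows; the title and axis label wording are assumptions, p_sacilim is an illustrative name, and the data frame is rebuilt from Boston so that the block runs on its own.

```r
library(MASS)
library(ggplot2)
data("Boston")

# Rebuild the two renamed columns needed for the plot
boston_tr2 <- data.frame(
  konut_degeri = Boston$medv,
  oda_sayisi = Boston$rm
)

# Scatter plot with oda_sayisi on the x-axis and konut_degeri on the y-axis
p_sacilim <- ggplot(boston_tr2, aes(x = oda_sayisi, y = konut_degeri)) +
  geom_point(alpha = 0.6) +
  labs(
    title = "Number of rooms and home value",
    x = "Average number of rooms (oda_sayisi)",
    y = "Median home value, $1000s (konut_degeri)"
  )

print(p_sacilim)
```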

4.b) Compute the correlation coefficient between these two variables and interpret it.
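
The source gives no code for this question. A minimal sketch, assuming the Pearson coefficient is what is intended (it is the default of cor()); r is an illustrative name. For the interpretation, the sign gives the direction of the linear relationship and the absolute value its strength, with values near 0 indicating a weak and values near 1 a strong linear relationship.

```r
library(MASS)
data("Boston")

oda_sayisi <- Boston$rm
konut_degeri <- Boston$medv

# Pearson correlation coefficient (the default method of cor())
r <- cor(oda_sayisi, konut_degeri)
print(r)

# cor.test() additionally reports a significance test and a confidence interval
print(cor.test(oda_sayisi, konut_degeri))
```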

Question 5 – Simple Linear Regression

5.a) Fit a simple linear regression model that tests whether the number of rooms predicts home value.
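
One way this model might be fitted is sketched below with lm(); the object names model, ozet and p_egim are assumptions, and the data frame is rebuilt from Boston so the block is self-contained. Whether oda_sayisi predicts konut_degeri is judged from the t test on the slope in the coefficient table.

```r
library(MASS)
data("Boston")

boston_tr2 <- data.frame(
  konut_degeri = Boston$medv,
  oda_sayisi = Boston$rm
)

# Outcome on the left of ~, predictor on the right
model <- lm(konut_degeri ~ oda_sayisi, data = boston_tr2)

# The coefficient table in the summary includes the t test for the slope
ozet <- summary(model)
print(ozet)

# p-value for the null hypothesis that the slope equals 0
p_egim <- ozet$coefficients["oda_sayisi", "Pr(>|t|)"]
print(p_egim)
```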

5.b) From the regression output, find and interpret the following:

  • slope (β₁)
  • intercept (β₀)
  • R-squared (R²)
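
The three quantities can be read off the summary() output or extracted directly, as in the following sketch. The names katsayilar, kesisim, egim and r_kare are illustrative. The slope is the expected change in konut_degeri for a one-unit increase in oda_sayisi, the intercept is the predicted konut_degeri when oda_sayisi is 0 (outside the observed data, whose minimum is 3.561), and R² is the share of variance in konut_degeri explained by the model.

```r
library(MASS)
data("Boston")

boston_tr2 <- data.frame(
  konut_degeri = Boston$medv,
  oda_sayisi = Boston$rm
)

model <- lm(konut_degeri ~ oda_sayisi, data = boston_tr2)

# coef() returns a named vector: "(Intercept)" and "oda_sayisi"
katsayilar <- coef(model)
kesisim <- katsayilar[["(Intercept)"]]   # beta_0
egim <- katsayilar[["oda_sayisi"]]       # beta_1

# R-squared is stored in the summary object
r_kare <- summary(model)$r.squared

print(c(kesisim = kesisim, egim = egim, r_kare = r_kare))
```

In a simple linear regression, R² equals the square of the Pearson correlation between the two variables, which offers a quick consistency check against the answer to 4.b.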

5.c) Create a scatter plot showing the relationship between number of rooms (oda_sayisi) and home value (konut_degeri). Add suitable axis labels and a title, then overlay the simple linear regression line on the plot.
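
A possible answer, sketched with ggplot2: geom_smooth(method = "lm") draws the least-squares line, and se = FALSE hides the confidence band. The label wording and the name p_regresyon are assumptions.

```r
library(MASS)
library(ggplot2)
data("Boston")

boston_tr2 <- data.frame(
  konut_degeri = Boston$medv,
  oda_sayisi = Boston$rm
)

p_regresyon <- ggplot(boston_tr2, aes(x = oda_sayisi, y = konut_degeri)) +
  geom_point(alpha = 0.6) +
  # Least-squares line of konut_degeri on oda_sayisi, without the confidence band
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(
    title = "Home value by number of rooms, with regression line",
    x = "Average number of rooms (oda_sayisi)",
    y = "Median home value, $1000s (konut_degeri)"
  )

print(p_regresyon)
```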

Question 6 – What is the purpose of the geom_jitter function?
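
geom_jitter() adds a small amount of random noise to each point's position so that points sharing the same or nearby coordinates do not hide one another (overplotting). It is mainly useful when at least one variable is discrete. A minimal sketch, assuming nehir_kenari (0/1) is placed on the x-axis; the width value, the labels and the name p_jitter are illustrative choices:

```r
library(MASS)
library(ggplot2)
data("Boston")

boston_tr2 <- data.frame(
  konut_degeri = Boston$medv,
  nehir_kenari = factor(Boston$chas)   # 0/1 indicator treated as a category
)

p_jitter <- ggplot(boston_tr2, aes(x = nehir_kenari, y = konut_degeri)) +
  # Spread points horizontally only; height = 0 leaves konut_degeri unchanged
  geom_jitter(width = 0.2, height = 0, alpha = 0.6) +
  labs(
    title = "Home value by riverside status",
    x = "Bounds the river (nehir_kenari: 0 = no, 1 = yes)",
    y = "Median home value, $1000s (konut_degeri)"
  )

print(p_jitter)
```

With geom_point() in place of geom_jitter(), all points would fall on two vertical lines, so the 471 observations at 0 would largely overlap.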