In this assignment you will use the Turkey and USA data from one booklet of the TIMSS 2015 administration.
a) The data set is named "TRUSA.RDS". Import it into R.
b) Check whether the data set contains missing data.
c) Compute the sum of the 35 items in the booklet and add it to the data set as a new column.
d) Compute descriptive statistics of the total score for each of the two countries.
e) Use a t test to check whether the total score differs between the Turkey and USA samples.
f) Create missing data in the data set at rates of 5%, 10% and 15%.
g) In the data sets with missing values, first test whether the data are missing at random. Then remove the missing data by listwise deletion, repeat the t test from option e, and compare the results with those from the complete data.
h) Fill in the missing values in the data sets created in option f with an imputation method of your choice. Then repeat the t test from option e and compare the results with those from the complete data.
i) Explain how the missing-data rate affects the performance of the methods applied.
Happy coding :)
Solution 1: Instructor Emrah's version
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.6
## ✔ forcats 1.0.1 ✔ stringr 1.6.0
## ✔ ggplot2 4.0.1 ✔ tibble 3.3.0
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.2.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(psych)
##
## Attaching package: 'psych'
##
## The following objects are masked from 'package:ggplot2':
##
## %+%, alpha
TRUSA <- readRDS("TRUSA.RDS")
TRUSA$toplam <- TRUSA %>% select(starts_with("M")) %>% rowSums()
TRUSA %>% group_by(CNT) %>% summarise(n=n(),
ort=mean(toplam),
sd=sd(toplam),
min=min(toplam),
max=max(toplam))
## # A tibble: 2 × 6
## CNT n ort sd min max
## <chr> <int> <dbl> <dbl> <dbl> <dbl>
## 1 TUR 435 13.5 7.57 2 32
## 2 USA 716 17.0 7.53 1 34
t.test(toplam ~ CNT, data=TRUSA)
##
## Welch Two Sample t-test
##
## data: toplam by CNT
## t = -7.8242, df = 912.31, p-value = 1.41e-14
## alternative hypothesis: true difference in means between group TUR and group USA is not equal to 0
## 95 percent confidence interval:
## -4.494510 -2.691921
## sample estimates:
## mean in group TUR mean in group USA
## 13.45287 17.04609
Solution 2: The instructor's walkthrough
TRUSA <- readRDS("TRUSA.RDS")
#library(naniar)
#miss_var_table(TRUSA)
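The missing-data check asked for in the assignment is commented out above. A minimal base-R sketch of the same idea follows; the `toy` data frame and its values are hypothetical stand-ins, not the real TRUSA data.

```r
# Toy stand-in for TRUSA (hypothetical values, not the real data)
toy <- data.frame(
  M1  = c(1, 0, NA, 1),
  M2  = c(0, 0, 1, 1),
  CNT = c("TUR", "TUR", "USA", NA)
)

any_missing  <- anyNA(toy)            # TRUE if at least one NA exists
n_missing    <- sum(is.na(toy))       # total number of NA cells
miss_per_col <- colSums(is.na(toy))   # NA count per column
prop_missing <- mean(is.na(toy))      # share of NA cells in the whole table

print(miss_per_col)
```

On the real data the same four lines would be run with `TRUSA` in place of `toy`.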
SUM <- TRUSA %>% dplyr::select(starts_with("M")) %>% rowSums()
TRUSA$SUM <-SUM
d) Descriptive statistics of the total score
library(dplyr)
library(psych)
describeBy(TRUSA$SUM, TRUSA$CNT)
##
## Descriptive statistics by group
## group: TUR
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 435 13.45 7.57 11 12.74 7.41 2 32 30 0.71 -0.61 0.36
## ------------------------------------------------------------
## group: USA
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 716 17.05 7.53 17 16.92 8.9 1 34 33 0.1 -0.9 0.28
library(effsize) #to calculate the effect size
##
## Attaching package: 'effsize'
## The following object is masked from 'package:psych':
##
## cohen.d
t_test_result <- t.test(SUM~CNT, data=TRUSA, var.equal=TRUE)
print(t_test_result)
##
## Two Sample t-test
##
## data: SUM by CNT
## t = -7.8348, df = 1149, p-value = 1.064e-14
## alternative hypothesis: true difference in means between group TUR and group USA is not equal to 0
## 95 percent confidence interval:
## -4.493049 -2.693382
## sample estimates:
## mean in group TUR mean in group USA
## 13.45287 17.04609
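Solution 1 uses the `t.test()` default (Welch, unequal variances), while Solution 2 sets `var.equal=TRUE` (pooled variance). The sketch below recomputes both statistics from the summary values printed above. It uses the SDs rounded to two decimals, so it agrees with the outputs only approximately.

```r
# Summary statistics copied from the output above (SDs are rounded)
n1 <- 435; m1 <- 13.45287; s1 <- 7.57   # TUR
n2 <- 716; m2 <- 17.04609; s2 <- 7.53   # USA
mean_diff <- m1 - m2

# Welch: each group keeps its own variance
se_welch <- sqrt(s1^2 / n1 + s2^2 / n2)
t_welch  <- mean_diff / se_welch
df_welch <- se_welch^4 /
  ((s1^2 / n1)^2 / (n1 - 1) + (s2^2 / n2)^2 / (n2 - 1))

# Student: one pooled variance for both groups
sp        <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2))
t_pooled  <- mean_diff / (sp * sqrt(1 / n1 + 1 / n2))
df_pooled <- n1 + n2 - 2

round(c(t_welch = t_welch, df_welch = df_welch,
        t_pooled = t_pooled, df_pooled = df_pooled), 2)
```

Because the two SDs are almost equal here, the two versions of the test lead to practically the same conclusion.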
Calculate Cohen’s d
cohen_d_result <-effsize::cohen.d(TRUSA$SUM[TRUSA$CNT=="TUR"],
TRUSA$SUM[TRUSA$CNT=="USA"])
print(cohen_d_result)
##
## Cohen's d
##
## d estimate: -0.4762813 (small)
## 95 percent confidence interval:
## lower upper
## -0.5971341 -0.3554285
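As a cross-check, Cohen's d can be computed by hand as the mean difference divided by the pooled standard deviation. The sketch below uses the rounded summary values from the output above, and the labels follow Cohen's conventional cut-offs (0.2, 0.5, 0.8); treating these as the cut-offs behind the "(small)" label is an assumption.

```r
# Summary statistics copied from the output above (SDs are rounded)
n1 <- 435; m1 <- 13.45287; s1 <- 7.57   # TUR
n2 <- 716; m2 <- 17.04609; s2 <- 7.53   # USA

# Pooled standard deviation and standardized mean difference
sp <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2))
d  <- (m1 - m2) / sp

# Verbal label based on |d| (assumed conventional cut-offs)
label <- as.character(cut(abs(d),
                          breaks = c(0, 0.2, 0.5, 0.8, Inf),
                          labels = c("negligible", "small", "medium", "large"),
                          right  = FALSE))
round(d, 3)
label
```

The absolute value sits just under 0.5, which is why it is still labelled small even though it is close to the medium range.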
library(mvdalab)
##
## Attaching package: 'mvdalab'
## The following object is masked from 'package:psych':
##
## smc
TRUSA_5 <- introNAs(TRUSA, percent=5)
TRUSA_10 <- introNAs(TRUSA, percent = 10)
TRUSA_15 <- introNAs(TRUSA, percent= 15)
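Note that the calls above put missing values into every column, including the ID columns, CNT and SUM (see the counts below). A minimal base-R sketch of an alternative that touches only the item columns and is reproducible through a seed is given here; `add_mcar` is a hypothetical helper and `toy` is made-up data.

```r
# Hypothetical helper: set a given share of cells to NA, column by column,
# choosing the rows completely at random
add_mcar <- function(data, cols, prop) {
  n_na <- round(prop * nrow(data))
  for (col in cols) {
    idx <- sample(nrow(data), n_na)
    data[idx, col] <- NA
  }
  data
}

set.seed(2015)  # makes the random pattern reproducible
toy <- data.frame(
  id  = 1:200,
  M1  = rbinom(200, 1, 0.5),
  M2  = rbinom(200, 1, 0.4),
  CNT = rep(c("TUR", "USA"), each = 100)
)

toy_10 <- add_mcar(toy, cols = c("M1", "M2"), prop = 0.10)
colSums(is.na(toy_10))
```

With this approach the total score would be recomputed after the missing values are created, rather than receiving missing values of its own.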
TRUSA_5 %>% is.na() %>% colSums()
## IDSTUD IDBOOK M042182 M042081 M042049 M042052 M042076 M042302A
## 53 71 49 63 57 53 63 51
## M042302B M042302C M042100 M042202 M042240 M042093 M042271 M042268
## 60 63 54 66 55 50 62 50
## M042159 M042164 M042167 M062208 M062208A M062208B M062208C M062208D
## 57 56 64 54 76 58 59 58
## M062153 M062111A M062111B M062237 M062314 M062074 M062183 M062202
## 64 49 56 64 55 56 54 46
## M062246 M062286 M062325 M062106 M062124 CNT SUM
## 50 51 60 65 50 66 56
TRUSA_10 %>% is.na() %>% colSums()
## IDSTUD IDBOOK M042182 M042081 M042049 M042052 M042076 M042302A
## 114 127 115 119 112 115 116 111
## M042302B M042302C M042100 M042202 M042240 M042093 M042271 M042268
## 112 107 123 108 114 103 105 106
## M042159 M042164 M042167 M062208 M062208A M062208B M062208C M062208D
## 112 127 117 131 135 114 99 117
## M062153 M062111A M062111B M062237 M062314 M062074 M062183 M062202
## 116 102 104 120 121 120 126 118
## M062246 M062286 M062325 M062106 M062124 CNT SUM
## 111 107 116 136 117 101 115
TRUSA_15 %>% is.na() %>% colSums()
## IDSTUD IDBOOK M042182 M042081 M042049 M042052 M042076 M042302A
## 175 160 187 198 201 174 157 185
## M042302B M042302C M042100 M042202 M042240 M042093 M042271 M042268
## 164 189 187 147 160 164 147 195
## M042159 M042164 M042167 M062208 M062208A M062208B M062208C M062208D
## 171 170 171 153 161 166 192 177
## M062153 M062111A M062111B M062237 M062314 M062074 M062183 M062202
## 147 181 162 179 169 168 163 173
## M062246 M062286 M062325 M062106 M062124 CNT SUM
## 177 170 184 173 176 185 175
TRUSA_5_lw <- na.omit(TRUSA_5)
TRUSA_10_lw <- na.omit(TRUSA_10)
TRUSA_15_lw <- na.omit(TRUSA_15)
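Listwise deletion over all 39 columns is very costly here. A rough sketch of the expected number of complete rows follows, assuming (as the column counts above suggest, but this is an assumption) that each column loses the nominal share of values independently of the others.

```r
n_rows <- 1151                 # rows in TRUSA
n_cols <- 39                   # columns that received missing values
rates  <- c(0.05, 0.10, 0.15)  # nominal missing rates

# A row survives na.omit() only if all 39 cells are observed
p_complete        <- (1 - rates)^n_cols
expected_complete <- n_rows * p_complete

data.frame(rate = rates,
           p_complete = round(p_complete, 4),
           expected_rows = round(expected_complete, 1))
```

Under these assumptions only a small fraction of rows is left even at 5%, and almost none at 15%. Since the t test needs only SUM and CNT, applying `na.omit()` to those two columns alone would keep far more cases.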
If the items are ordinal, use a poly method (in mice, `polr` is the one for ordered factors).
We need to use multiple imputation.
#library(mice)
# The 35 items are in columns 3:37; logreg is meant for binary variables.
#TRUSA_5_im1 <- mice(TRUSA_5[,3:37],m=5,maxit = 50, method = 'logreg', seed=500)
#TRUSA_10_im1 <- mice(TRUSA_10[,3:37],m=5, maxit=50, method='logreg',seed=500)
#TRUSA_15_im1 <- mice(TRUSA_15[,3:37],m=5, maxit = 50, method='logreg',seed=500)
#completed_data_1 <- complete(TRUSA_5_im1, 1)
#completed_data_2 <- complete(TRUSA_5_im1,2)
#completed_data_3 <- complete(TRUSA_5_im1,3)
#completed_data_4 <- complete(TRUSA_5_im1, 4)
#completed_data_5 <- complete(TRUSA_5_im1, 5)
# The completed data hold only the items: recompute SUM and re-attach CNT before the t test.
#t_test_result_1 <- t.test(SUM~CNT, data=completed_data_1, var.equal=TRUE)
# Note: knitting took too long, so I commented this part out :(
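Since the multiple-imputation code above is not run, here is a much simpler stand-in: single imputation with the mode of each binary item, in base R. This is not multiple imputation and tends to understate uncertainty; it only illustrates the fill-then-recompute workflow. `impute_mode` is a hypothetical helper and `toy` is made-up data.

```r
# Hypothetical helper: replace NAs in a 0/1 item with its most frequent value
impute_mode <- function(x) {
  mode_val <- as.numeric(names(which.max(table(x))))  # table() drops NAs
  x[is.na(x)] <- mode_val
  x
}

toy <- data.frame(
  M1  = c(1, 1, NA, 0, 1),
  M2  = c(0, NA, 0, 0, 1),
  CNT = c("TUR", "TUR", "TUR", "USA", "USA")
)
items <- c("M1", "M2")

# Fill each item, then recompute the total score from the filled items
toy[items] <- lapply(toy[items], impute_mode)
toy$SUM    <- rowSums(toy[items])
toy
```

On the real data the same steps would be followed by the t test on the recomputed total, so that the result can be compared with the complete-data test.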
library(tidyverse)
library(stevemisc)
##
## Attaching package: 'stevemisc'
## The following object is masked from 'package:lubridate':
##
## dst
## The following object is masked from 'package:dplyr':
##
## tbl_df
library(knitr)
library(haven)
library(summarytools)
##
## Attaching package: 'summarytools'
## The following object is masked from 'package:tibble':
##
## view
library(outliers)
##
## Attaching package: 'outliers'
## The following object is masked from 'package:psych':
##
## outlier
library(ggplot2)
library(plotly)
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
library(ggpmisc)
## Loading required package: ggpp
## Registered S3 methods overwritten by 'ggpp':
## method from
## heightDetails.titleGrob ggplot2
## widthDetails.titleGrob ggplot2
##
## Attaching package: 'ggpp'
## The following object is masked from 'package:ggplot2':
##
## annotate
library(psych)
library(sur)
##
## Attaching package: 'sur'
## The following object is masked from 'package:psych':
##
## skew
library(moments)
library(corrplot)
## corrplot 0.95 loaded
library(olsrr)
##
## Attaching package: 'olsrr'
## The following object is masked from 'package:datasets':
##
## rivers
NORMALITY
library(dplyr)
library(haven) # Use the haven package to import SPSS files into R.
screen <- read_sav("SCREEN.sav")
screen <- expss::drop_var_labs(screen)
head(screen) # Show the first few rows of the data set
## # A tibble: 6 × 8
## SUBNO TIMEDRS ATTDRUG ATTHOUSE INCOME EMPLMNT MSTATUS RACE
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 1 8 27 5 1 2 1
## 2 2 3 7 20 6 0 2 1
## 3 3 0 8 23 3 0 2 1
## 4 4 13 9 28 8 1 2 1
## 5 5 15 7 24 1 1 2 1
## 6 6 3 8 25 4 0 2 1
Handling missing data
screen <- screen %>%
mutate(INCOME = ifelse(is.na(INCOME), mean(INCOME, na.rm =TRUE),INCOME)) %>% na.omit()
summary(screen)
## SUBNO TIMEDRS ATTDRUG ATTHOUSE
## Min. : 1.0 Min. : 0.000 Min. : 5.00 Min. : 2.00
## 1st Qu.:136.8 1st Qu.: 2.000 1st Qu.: 7.00 1st Qu.:21.00
## Median :313.5 Median : 4.000 Median : 8.00 Median :24.00
## Mean :317.3 Mean : 7.914 Mean : 7.69 Mean :23.54
## 3rd Qu.:483.2 3rd Qu.:10.000 3rd Qu.: 9.00 3rd Qu.:27.00
## Max. :758.0 Max. :81.000 Max. :10.00 Max. :35.00
## INCOME EMPLMNT MSTATUS RACE
## Min. : 1.000 Min. :0.000 Min. :1.00 Min. :1.000
## 1st Qu.: 3.000 1st Qu.:0.000 1st Qu.:2.00 1st Qu.:1.000
## Median : 4.000 Median :0.000 Median :2.00 Median :1.000
## Mean : 4.208 Mean :0.472 Mean :1.78 Mean :1.086
## 3rd Qu.: 6.000 3rd Qu.:1.000 3rd Qu.:2.00 3rd Qu.:1.000
## Max. :10.000 Max. :1.000 Max. :2.00 Max. :2.000
x <- c(3,5,7,NA,9)
ifelse(is.na(x),mean(x,na.rm=TRUE),x)
## [1] 3 5 7 6 9
For categorical variables:
library(dplyr)
table(screen$RACE)
##
## 1 2
## 424 40
library(summarytools)
freq(screen$RACE,
round.digits=2,report.nas = FALSE,
style = "rmarkdown")
## setting plain.ascii to FALSE
## ### Frequencies
## #### screen$RACE
## **Type:** Numeric
##
## | | Freq | % | % Cum. |
## |----------:|-----:|-------:|-------:|
## | **1** | 424 | 91.38 | 91.38 |
## | **2** | 40 | 8.62 | 100.00 |
## | **Total** | 464 | 100.00 | 100.00 |
library(knitr)
freq(screen$MSTATUS,report.nas = FALSE) %>%
kable(format='markdown',
caption="Frekans Tablosu",digits = 2)
| | Freq | % Valid | % Valid Cum. | % Total | % Total Cum. |
|---|---|---|---|---|---|
| 1 | 102 | 21.98 | 21.98 | 21.98 | 21.98 |
| 2 | 362 | 78.02 | 100.00 | 78.02 | 100.00 |
| <NA> | 0 | NA | NA | 0.00 | 100.00 |
| Total | 464 | 100.00 | 100.00 | 100.00 | 100.00 |
Look into the summarytools package.
Outliers in continuous variables:
library(outliers)
z.scores <- screen %>%
select(2:5) %>%
scores(type = "z") %>%
round(2)
head(z.scores)
## TIMEDRS ATTDRUG ATTHOUSE INCOME
## 1 -0.63 0.27 0.77 0.34
## 2 -0.45 -0.60 -0.79 0.76
## 3 -0.72 0.27 -0.12 -0.51
## 4 0.46 1.13 0.99 1.61
## 5 0.65 -0.60 0.10 -1.36
## 6 -0.45 0.27 0.33 -0.09
summarytools::descr(z.scores,
stats = c("min", "max"),
transpose = TRUE,
headings = FALSE)
##
## Min Max
## -------------- ------- ------
## ATTDRUG -2.33 2.00
## ATTHOUSE -4.80 2.56
## INCOME -1.36 2.46
## TIMEDRS -0.72 6.67
library(DT)
DT::datatable(z.scores)
library(ggplot2)
ggplot(screen, aes(x = TIMEDRS)) +
geom_histogram(bins = 30L, fill = "#0c4c8a")
# library(ggpmisc)
ggplot(screen, aes(x = TIMEDRS)) + geom_histogram() +
geom_vline(xintercept =7.914, color = "red",
linetype = "dashed") +
annotate("text", label = "Mean = 7.914", x = 10, y = 100, color ="black")
## `stat_bin()` using `bins = 30`. Pick better value `binwidth`.
ggplot(screen, aes(x = TIMEDRS)) +
geom_histogram(aes(y = after_stat(density)))+
geom_density(alpha=.5, fill="#0c4c8a") +
theme_minimal()
## `stat_bin()` using `bins = 30`. Pick better value `binwidth`.
library(plotly)
plot_ly(x = screen$TIMEDRS, type = "histogram",
histnorm = "probability")
ggplot(screen, aes(y = TIMEDRS)) +
geom_boxplot()
out <- boxplot.stats(screen$TIMEDRS)$out
out
## [1] 60 23 39 33 38 34 27 30 25 49 60 27 27 52 24 57 52 58 57 43 37 75 29 30 25
## [26] 37 56 29 37 81 27 23
out_ind <- which(screen$TIMEDRS %in% c(out))
out_ind
## [1] 40 64 67 76 79 96 102 117 150 163 168 170 178 193 203 206 213 249 274
## [20] 278 285 289 309 342 344 362 367 374 388 404 408 443
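What `boxplot.stats()` flags can be reproduced by hand: a value counts as an outlier when it lies more than 1.5 times the hinge spread beyond the lower or upper hinge. A minimal sketch on a made-up vector (the values in `x` are hypothetical):

```r
x <- c(2, 3, 3, 4, 5, 5, 6, 7, 8, 30, 45)  # made-up values

# boxplot.stats() works with Tukey's hinges, which fivenum() returns
hinges <- fivenum(x)[c(2, 4)]   # lower and upper hinge
spread <- diff(hinges)          # hinge-based IQR

lower_fence <- hinges[1] - 1.5 * spread
upper_fence <- hinges[2] + 1.5 * spread

manual_out <- x[x < lower_fence | x > upper_fence]
manual_ind <- which(x < lower_fence | x > upper_fence)

manual_out
boxplot.stats(x)$out   # same values as manual_out
```

The row positions in `manual_ind` play the same role as `out_ind` above.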
plot_ly(y = screen$TIMEDRS, type = 'box')
plot_ly(y = screen$TIMEDRS, type = 'box') %>%
layout(title = 'Box Plot',
annotations = list( x = -0.01, y = boxplot.stats(screen$TIMEDRS)$out,
text = paste(out_ind), showarrow = FALSE,
xanchor = "right"))
ggplot(screen, aes(x = factor(MSTATUS),
y = TIMEDRS, fill = factor(MSTATUS))) +
geom_boxplot() +
theme_minimal()
ggplot(screen) + aes(x = ATTDRUG) +
geom_histogram( bins = 6, fill = "#0c4c8a")+
theme_minimal()
ggplot(screen) +
aes(x = ATTHOUSE) +
geom_histogram( bins = 10, fill = "darkgreen") +
theme_minimal()
plot_ly(y = screen$ATTHOUSE, type = 'box')
screen[c(260,298),]
## # A tibble: 2 × 8
## SUBNO TIMEDRS ATTDRUG ATTHOUSE INCOME EMPLMNT MSTATUS RACE
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 346 2 8 2 1 0 1 1
## 2 407 2 8 2 4 0 1 1
screen2 <- screen[-c(260,298),]
Mahalanobis Distance
library(psych)
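# Note: columns 1:5 include the ID column SUBNO, so it enters the distance; use 2:5 to restrict it to the four substantive variables.
veri <- screen2[,1:5]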
md <- mahalanobis(veri, center = colMeans(veri), cov = cov(veri))
head(md,20)
## [1] 3.785517 4.541493 3.501077 7.281365 5.457240 2.896550 5.807898
## [8] 3.879478 4.751166 7.415405 10.602100 5.249121 6.073732 3.271885
## [15] 12.316463 4.440749 4.836160 6.362806 4.126524 10.797545
library(psych)
alpha <- .001
cutoff <- (qchisq(p = 1 - alpha, df = ncol(veri)))
cutoff
## [1] 20.51501
ucdegerler <- which(md > cutoff)
veri[ucdegerler, ]
## # A tibble: 9 × 5
## SUBNO TIMEDRS ATTDRUG ATTHOUSE INCOME
## <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 48 60 7 24 1
## 2 235 60 10 29 4
## 3 276 57 9 24 2
## 4 291 52 8 19 1
## 5 330 58 7 29 4
## 6 370 57 8 23 4
## 7 398 75 9 33 9
## 8 502 56 8 19 3
## 9 548 81 8 24 9
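To see what `mahalanobis()` returns, the squared distance can be written out with matrix algebra and compared with the built-in function. The sketch below uses simulated data with one planted outlier; the variable names and values are hypothetical.

```r
set.seed(1)
X <- cbind(v1 = rnorm(100), v2 = rnorm(100), v3 = rnorm(100))
X[1, ] <- c(6, -6, 6)   # plant one obvious multivariate outlier

center <- colMeans(X)
S      <- cov(X)

# Squared distance for each row: (x - mean)' S^{-1} (x - mean)
centered   <- sweep(X, 2, center)
md_manual  <- rowSums((centered %*% solve(S)) * centered)
md_builtin <- mahalanobis(X, center, S)

# Chi-square cut-off with df = number of variables, alpha = .001
cutoff  <- qchisq(1 - 0.001, df = ncol(X))
flagged <- which(md_manual > cutoff)
flagged
```

The values are squared distances, which is why they are compared with a chi-square quantile rather than with its square root.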
data_temiz <- veri[-ucdegerler, ]
Multivariate Normality Assumption
library(sur)
attach(screen)
skew(screen$TIMEDRS)
## [1] 3.234045
skew(data_temiz$TIMEDRS)
se.skew(TIMEDRS)
## [1] 0.1133494
skew.ratio(TIMEDRS)
## [1] 28.53164
skew(TIMEDRS)/se.skew(TIMEDRS)
## [1] 28.53164
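The standard error printed above is consistent with the usual small-sample formula for the standard error of skewness, which depends only on the sample size. That the sur package uses exactly this formula is an assumption; the sketch simply reproduces the printed numbers for n = 464.

```r
# Standard error of skewness as a function of sample size only
se_skew_n <- function(n) {
  sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
}

n        <- 464        # rows in screen after na.omit()
skewness <- 3.234045   # skew(TIMEDRS) from the output above

se    <- se_skew_n(n)
ratio <- skewness / se   # a ratio far beyond about 2 in absolute value suggests marked skew

round(c(se = se, ratio = ratio), 4)
```

Because the standard error shrinks as n grows, even moderate skewness produces a large ratio in large samples, so the ratio is best read together with the plots.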
The jarque.test function tests the null hypothesis that the data do not deviate from a normal distribution.
library(moments)
library(labelled)
jarque.test(remove_labels(TIMEDRS))
##
## Jarque-Bera Normality Test
##
## data: remove_labels(TIMEDRS)
## JB = 4034.9, p-value < 2.2e-16
## alternative hypothesis: greater
jarque.test(remove_labels(ATTDRUG))
##
## Jarque-Bera Normality Test
##
## data: remove_labels(ATTDRUG)
## JB = 5.0552, p-value = 0.07985
## alternative hypothesis: greater
skew.ratio(ATTDRUG)
## [1] -1.10762
jarque.test(remove_labels(ATTHOUSE))
##
## Jarque-Bera Normality Test
##
## data: remove_labels(ATTHOUSE)
## JB = 61.092, p-value = 5.418e-14
## alternative hypothesis: greater
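The Jarque-Bera statistic combines skewness and kurtosis into a single number, JB = n/6 * (S^2 + (K - 3)^2 / 4), which is compared with a chi-square distribution with 2 degrees of freedom. A minimal base-R sketch follows; `jb_stat` is a hypothetical helper, not the `moments` implementation, and the input vector is made up.

```r
# Hypothetical helper: Jarque-Bera statistic from moment-based skewness and kurtosis
jb_stat <- function(x) {
  n    <- length(x)
  dev  <- x - mean(x)
  m2   <- mean(dev^2)               # second central moment
  skew <- mean(dev^3) / m2^1.5      # S
  kurt <- mean(dev^4) / m2^2        # K (equals 3 for a normal distribution)
  jb   <- n / 6 * (skew^2 + (kurt - 3)^2 / 4)
  c(JB = jb, p_value = pchisq(jb, df = 2, lower.tail = FALSE))
}

res <- jb_stat(c(-2, -1, 0, 1, 2))   # symmetric toy data, so S = 0
res
```

In the outputs above, TIMEDRS and ATTHOUSE reject normality, while ATTDRUG (p = 0.07985) does not at the .05 level.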
set.seed(0)
normal <- rnorm(200)
non_normal <- rexp(200, rate=3)
par(mfrow=c(1,2))
hist(normal, col='steelblue', main='Normal')
hist(non_normal, col='steelblue', main='Non-normal')
par(mfrow=c(1,2))
qqnorm(normal, main='Normal')
qqline(normal)
qqnorm(non_normal, main='Non-normal')
qqline(non_normal)
ggplot(data=screen, aes(sample=ATTHOUSE))+
geom_qq()+
geom_qq_line()
This part is an extra note I wrote during class :)
#any_na(TRUSA)
#n_miss(TRUSA)
#prop_miss(TRUSA)
#TRUSA %>% is.na() %>% colSums()
#miss_var_summary(TRUSA)
#miss_var_table(TRUSA)
TRUSA$toplam <-rowSums(TRUSA[,3:37],na.rm =TRUE)
TRUSA$toplam
## [1] 10 18 8 11 6 21 19 18 19 20 12 6 7 28 15 12 26 18 9 25 13 7 5 22
## [25] 20 16 26 23 10 13 9 27 20 7 6 5 7 5 7 6 30 28 30 6 20 8 29 13
## [49] 12 5 8 9 11 7 15 12 21 22 12 7 4 6 27 13 22 20 7 15 3 10 6 28
## [73] 8 12 30 30 21 13 11 12 9 14 19 7 16 13 6 30 27 8 9 18 11 31 5 8
## [97] 5 10 27 9 3 7 8 16 22 6 18 10 9 3 7 16 14 7 21 8 8 9 7 14
## [121] 27 7 26 20 9 12 20 29 17 24 23 9 2 26 22 10 21 9 12 22 25 11 13 5
## [145] 9 6 16 13 10 7 15 11 15 27 21 4 14 12 10 9 13 6 9 11 5 13 15 21
## [169] 12 26 29 23 8 11 14 9 7 5 8 8 9 27 22 16 15 5 8 19 12 8 22 10
## [193] 10 6 24 19 16 16 8 17 6 12 24 9 9 8 7 21 18 11 9 7 15 9 26 6
## [217] 24 7 13 10 31 8 23 5 6 11 8 13 8 15 9 9 15 7 25 9 17 8 6 10
## [241] 31 28 32 32 11 13 17 28 7 21 4 13 8 7 12 26 25 11 7 18 21 12 16 24
## [265] 6 6 12 4 8 6 4 7 4 8 21 7 29 3 24 17 12 6 15 7 8 15 11 10
## [289] 17 8 3 6 9 16 13 6 17 11 4 6 16 8 10 6 6 6 14 26 12 7 16 2
## [313] 28 4 6 27 17 14 7 3 17 8 6 5 8 9 9 25 4 6 7 28 16 7 7 7
## [337] 3 13 26 8 25 19 9 8 12 15 18 16 12 13 19 29 5 32 10 10 22 4 6 9
## [361] 5 10 10 6 6 4 24 8 6 7 20 11 14 15 8 24 23 25 16 21 25 8 10 8
## [385] 5 9 9 8 6 6 28 7 27 6 13 9 6 5 8 21 20 9 8 10 16 28 4 11
## [409] 6 24 17 10 5 8 26 9 17 23 19 8 6 15 16 23 21 14 21 13 20 4 17 19
## [433] 5 28 32 7 22 8 19 20 27 13 8 29 33 16 12 26 22 19 20 11 10 1 1 1
## [457] 14 22 22 25 14 6 17 28 29 30 28 28 18 22 21 26 7 7 18 11 6 27 18 5
## [481] 17 23 19 22 15 9 22 29 12 23 27 17 23 8 23 14 18 24 28 28 32 21 15 10
## [505] 34 20 17 19 14 16 6 17 14 9 20 8 12 7 8 18 17 7 22 11 21 17 26 15
## [529] 11 27 5 9 4 23 6 23 19 16 20 8 6 14 24 7 19 29 8 22 24 21 10 22
## [553] 22 13 17 28 28 24 10 15 4 24 22 19 18 16 13 6 17 9 19 13 19 7 8 11
## [577] 11 19 16 11 19 16 22 22 31 16 30 30 19 25 11 8 19 5 30 12 15 28 20 19
## [601] 22 3 18 18 19 21 23 24 22 9 32 20 9 9 6 29 20 9 23 26 15 13 14 5
## [625] 10 15 17 32 31 31 20 10 15 21 22 22 30 31 17 20 32 20 11 9 12 13 10 17
## [649] 22 6 6 15 21 18 9 6 22 14 9 24 16 10 28 26 15 19 6 18 27 16 6 7
## [673] 4 8 9 16 30 15 10 10 23 28 18 9 20 12 18 25 31 20 28 27 10 20 24 30
## [697] 4 23 21 26 20 11 30 16 8 7 32 24 7 9 22 17 24 29 12 7 17 27 11 17
## [721] 9 8 15 6 9 27 30 7 11 10 11 19 27 15 1 8 8 6 8 14 12 15 6 18
## [745] 7 15 21 32 9 11 18 22 25 16 30 27 25 16 12 17 14 18 20 18 19 20 21 22
## [769] 19 29 24 29 23 11 16 11 17 13 23 22 5 8 9 5 10 7 7 14 4 17 22 10
## [793] 3 11 23 23 27 8 6 3 10 6 10 8 8 15 15 7 21 16 24 10 12 8 24 24
## [817] 25 18 17 11 13 17 17 12 26 26 15 13 6 15 20 22 8 29 22 7 11 17 10 22
## [841] 10 2 4 19 8 26 21 26 22 27 29 22 26 7 19 17 25 13 17 22 12 12 9 14
## [865] 19 20 16 24 33 29 12 25 30 24 16 23 26 15 6 29 29 17 18 7 7 25 31 17
## [889] 12 15 15 8 14 7 22 24 12 7 18 10 12 16 20 18 14 32 29 29 24 29 17 6
## [913] 10 13 24 22 18 30 21 8 7 20 27 25 10 23 26 11 11 12 27 24 28 23 26 19
## [937] 16 18 30 31 21 26 17 29 18 8 16 23 11 12 17 18 17 17 22 20 4 15 13 11
## [961] 20 20 12 19 30 15 11 20 32 14 28 6 16 17 25 23 16 10 24 29 8 8 23 22
## [985] 20 26 29 16 19 20 17 28 22 16 29 22 12 2 11 15 21 23 16 17 19 11 6 15
## [1009] 11 15 16 8 9 9 10 24 19 17 29 18 22 15 17 21 20 10 28 30 20 19 22 15
## [1033] 10 16 15 14 28 15 19 9 5 11 19 26 5 16 25 14 32 23 17 25 17 17 18 21
## [1057] 13 6 19 15 18 5 14 25 10 12 12 14 4 11 6 10 8 15 16 18 18 6 11 10
## [1081] 20 17 8 11 15 20 3 13 12 6 19 2 9 22 17 11 11 16 7 28 25 24 25 20
## [1105] 28 31 14 17 11 21 22 17 28 32 12 23 12 18 29 17 14 15 13 22 8 17 11 7
## [1129] 9 9 7 12 27 9 6 7 26 27 23 22 27 10 5 19 14 29 30 25 26 13 13
veri_1 <- TRUSA %>%
group_by(toplam) %>%
select(CNT)
## Adding missing grouping variables: `toplam`
veri_1
## # A tibble: 1,151 × 2
## # Groups: toplam [34]
## toplam CNT
## <dbl> <chr>
## 1 10 TUR
## 2 18 TUR
## 3 8 TUR
## 4 11 TUR
## 5 6 TUR
## 6 21 TUR
## 7 19 TUR
## 8 18 TUR
## 9 19 TUR
## 10 20 TUR
## # ℹ 1,141 more rows
R is just like a foreign language: it takes constant repetition. Today we talked about missing data. This class was the first time I heard of jarque.test(), a newer-generation normality test.