In this assignment you will use the Türkiye and USA data for one booklet from the TIMSS 2015 administration.
a) The data set is named “TRUSA.RDS”. Import it into R.
b) Check whether the data set contains missing data.
c) Compute the sum of the 35 items in the booklet and add it to the data set as a new column.
d) Compute descriptive statistics of the total score for each of the two countries.
e) Use a t-test to test whether the total score differs between the Türkiye and USA samples.
f) Create versions of the data set with 5%, 10% and 15% missing data.
g) In the data sets with missing values, first test whether the missingness is random. Then remove the missing data by listwise deletion, repeat the t-test from step e, and compare the results with those from the complete data.
h) Fill in the missing values in the data sets created in step f using a missing-data imputation method of your choice. Then repeat the t-test from step e and compare the results with those from the complete data.
i) Explain how the missing-data rate affects the performance of the methods applied.
Happy coding :)
Solution 1: Instructor Emrah's version
library(tidyverse)
## Warning: package 'ggplot2' was built under R version 4.5.3
## Warning: package 'dplyr' was built under R version 4.5.3
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.2.1 ✔ readr 2.1.6
## ✔ forcats 1.0.1 ✔ stringr 1.6.0
## ✔ ggplot2 4.0.2 ✔ tibble 3.3.0
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.2.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(psych)
##
## Attaching package: 'psych'
##
## The following objects are masked from 'package:ggplot2':
##
## %+%, alpha
TRUSA <- readRDS("TRUSA.RDS")
TRUSA$toplam <- TRUSA %>% select(starts_with("M")) %>% rowSums()
TRUSA %>% group_by(CNT) %>% summarise(n=n(),
ort=mean(toplam),
sd=sd(toplam),
min=min(toplam),
max=max(toplam))
## # A tibble: 2 × 6
## CNT n ort sd min max
## <chr> <int> <dbl> <dbl> <dbl> <dbl>
## 1 TUR 435 13.5 7.57 2 32
## 2 USA 716 17.0 7.53 1 34
t.test(toplam ~ CNT, data=TRUSA)
##
## Welch Two Sample t-test
##
## data: toplam by CNT
## t = -7.8242, df = 912.31, p-value = 1.41e-14
## alternative hypothesis: true difference in means between group TUR and group USA is not equal to 0
## 95 percent confidence interval:
## -4.494510 -2.691921
## sample estimates:
## mean in group TUR mean in group USA
## 13.45287 17.04609
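Solution 1 goes straight from the import to the total score, so step b is not shown. A minimal base-R sketch of that check, run on a small made-up data frame (`toy` and its column names are illustrative assumptions, not the TRUSA data):

```r
# Toy data standing in for TRUSA; names and values are made up
toy <- data.frame(
  CNT  = c("TUR", "TUR", "USA", "USA"),
  M001 = c(1, 0, NA, 1),
  M002 = c(0, 1, 1, NA)
)

any_missing <- anyNA(toy)            # TRUE if at least one NA exists
na_per_col  <- colSums(is.na(toy))   # number of NAs in each column
na_share    <- mean(is.na(toy))      # share of missing cells overall

any_missing
na_per_col
na_share
```

On the real data the same three calls would take `TRUSA` in place of `toy`.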
Solution 2: The instructor's walkthrough
TRUSA <- readRDS("TRUSA.RDS")
#library(naniar)
#miss_var_table(TRUSA)
SUM <- TRUSA %>% dplyr::select(starts_with("M")) %>% rowSums()
TRUSA$SUM <-SUM
d) Descriptive statistics of the total score
library(dplyr)
library(psych)
describeBy(TRUSA$SUM, TRUSA$CNT)
##
## Descriptive statistics by group
## group: TUR
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 435 13.45 7.57 11 12.74 7.41 2 32 30 0.71 -0.61 0.36
## ------------------------------------------------------------
## group: USA
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 716 17.05 7.53 17 16.92 8.9 1 34 33 0.1 -0.9 0.28
library(effsize) #to calculate the effect size
##
## Attaching package: 'effsize'
## The following object is masked from 'package:psych':
##
## cohen.d
t_test_result <- t.test(SUM~CNT, data=TRUSA, var.equal=TRUE)
print(t_test_result)
##
## Two Sample t-test
##
## data: SUM by CNT
## t = -7.8348, df = 1149, p-value = 1.064e-14
## alternative hypothesis: true difference in means between group TUR and group USA is not equal to 0
## 95 percent confidence interval:
## -4.493049 -2.693382
## sample estimates:
## mean in group TUR mean in group USA
## 13.45287 17.04609
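The two solutions differ only in the variance assumption: Solution 1 relies on R's default Welch test, while Solution 2 sets var.equal=TRUE. As a rough check, both t statistics can be rebuilt from the group summaries reported above. The standard deviations are rounded to two decimals there, so the results should agree with the printed output only approximately:

```r
# Group summaries copied from the descriptive output above (sd rounded)
n1 <- 435; m1 <- 13.45287; s1 <- 7.57   # TUR
n2 <- 716; m2 <- 17.04609; s2 <- 7.53   # USA

# Student (pooled-variance) t, as in var.equal = TRUE
sp         <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2))
t_student  <- (m1 - m2) / (sp * sqrt(1 / n1 + 1 / n2))
df_student <- n1 + n2 - 2

# Welch t with Satterthwaite degrees of freedom (R's default)
v1 <- s1^2 / n1
v2 <- s2^2 / n2
t_welch  <- (m1 - m2) / sqrt(v1 + v2)
df_welch <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

round(c(t_student = t_student, t_welch = t_welch, df_welch = df_welch), 2)
```

Because the two groups have almost the same standard deviation, the two versions of the test lead to practically the same conclusion here.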
Calculate Cohen’s d
cohen_d_result <-effsize::cohen.d(TRUSA$SUM[TRUSA$CNT=="TUR"],
TRUSA$SUM[TRUSA$CNT=="USA"])
print(cohen_d_result)
##
## Cohen's d
##
## d estimate: -0.4762813 (small)
## 95 percent confidence interval:
## lower upper
## -0.5971341 -0.3554285
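Cohen's d in this setting is the mean difference divided by the pooled standard deviation. A hand-rolled sketch of that formula, checked on simulated scores (the helper name `cohens_d_pooled` is made up for illustration and is not part of effsize):

```r
# Cohen's d with a pooled standard deviation (illustrative helper)
cohens_d_pooled <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  sp <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / sp
}

set.seed(1)
g1 <- rnorm(50, mean = 13, sd = 7)   # simulated, not the TIMSS scores
g2 <- rnorm(60, mean = 17, sd = 7)
d  <- cohens_d_pooled(g1, g2)
d
```

With the group summaries reported above (435 and 716 cases, standard deviations of 7.57 and 7.53) the same formula gives roughly -0.476, in line with the effsize output.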
library(mvdalab)
##
## Attaching package: 'mvdalab'
## The following object is masked from 'package:psych':
##
## smc
TRUSA_5 <- introNAs(TRUSA, percent=5)
TRUSA_10 <- introNAs(TRUSA, percent = 10)
TRUSA_15 <- introNAs(TRUSA, percent= 15)
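introNAs() is applied to the whole data frame here, so IDSTUD, CNT and SUM also receive missing values, and no seed is set, so the result is presumably not reproducible. A base-R sketch of an alternative that deletes cells completely at random only in chosen columns (`add_mcar_na` and the toy data are illustrative assumptions, not the mvdalab implementation):

```r
# Set a given share of cells to NA, completely at random, in selected columns
add_mcar_na <- function(df, cols, prop, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  n_cells <- nrow(df) * length(cols)
  n_miss  <- round(prop * n_cells)
  hit     <- sample.int(n_cells, n_miss)     # distinct cell positions to blank out
  rows    <- (hit - 1) %% nrow(df) + 1
  cidx    <- (hit - 1) %/% nrow(df) + 1
  for (k in seq_along(hit)) df[rows[k], cols[cidx[k]]] <- NA
  df
}

set.seed(1)
toy <- data.frame(id  = 1:100,
                  CNT = rep(c("TUR", "USA"), 50),
                  M1  = rbinom(100, 1, 0.5),
                  M2  = rbinom(100, 1, 0.5))
toy_5 <- add_mcar_na(toy, cols = c("M1", "M2"), prop = 0.05, seed = 123)
colSums(is.na(toy_5))
```

Keeping the ID and country columns intact means the grouping variable of the t-test stays complete, and the total score can be recomputed from the items afterwards.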
TRUSA_5 %>% is.na() %>% colSums()
## IDSTUD IDBOOK M042182 M042081 M042049 M042052 M042076 M042302A
## 55 50 52 64 51 53 54 59
## M042302B M042302C M042100 M042202 M042240 M042093 M042271 M042268
## 67 58 47 63 61 54 48 66
## M042159 M042164 M042167 M062208 M062208A M062208B M062208C M062208D
## 57 52 60 54 67 55 53 58
## M062153 M062111A M062111B M062237 M062314 M062074 M062183 M062202
## 56 64 50 65 55 74 52 71
## M062246 M062286 M062325 M062106 M062124 CNT SUM
## 59 57 52 60 58 58 55
TRUSA_10 %>% is.na() %>% colSums()
## IDSTUD IDBOOK M042182 M042081 M042049 M042052 M042076 M042302A
## 110 115 126 122 105 115 133 133
## M042302B M042302C M042100 M042202 M042240 M042093 M042271 M042268
## 128 139 109 119 107 127 120 127
## M042159 M042164 M042167 M062208 M062208A M062208B M062208C M062208D
## 114 123 98 105 124 115 120 101
## M062153 M062111A M062111B M062237 M062314 M062074 M062183 M062202
## 125 100 114 99 122 108 103 118
## M062246 M062286 M062325 M062106 M062124 CNT SUM
## 122 99 122 104 112 104 102
TRUSA_15 %>% is.na() %>% colSums()
## IDSTUD IDBOOK M042182 M042081 M042049 M042052 M042076 M042302A
## 193 174 162 180 170 188 174 202
## M042302B M042302C M042100 M042202 M042240 M042093 M042271 M042268
## 162 186 157 176 163 173 160 147
## M042159 M042164 M042167 M062208 M062208A M062208B M062208C M062208D
## 165 204 194 160 177 151 173 148
## M062153 M062111A M062111B M062237 M062314 M062074 M062183 M062202
## 187 164 179 168 176 188 173 176
## M062246 M062286 M062325 M062106 M062124 CNT SUM
## 190 166 151 164 169 178 165
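Step g also asks whether the missingness is random. One simple, informal check (not Little's MCAR test) is to see whether the missingness indicator of a variable is related to another observed variable, for example country. A sketch on simulated data, where all names are illustrative:

```r
set.seed(42)
n   <- 400
toy <- data.frame(CNT  = rep(c("TUR", "USA"), each = n / 2),
                  item = rbinom(n, 1, 0.5))
toy$item[sample.int(n, 40)] <- NA        # 10% deleted completely at random

# Cross-tabulate "is missing" against country and test for association
miss_tab <- table(missing = is.na(toy$item), country = toy$CNT)
miss_tab
res <- chisq.test(miss_tab)
res$p.value   # a large p-value gives no evidence against random missingness
```

A non-significant result does not prove that the data are missing completely at random; it only fails to show a dependence on the variable that was checked.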
TRUSA_5_lw <- na.omit(TRUSA_5)
TRUSA_10_lw <- na.omit(TRUSA_10)
TRUSA_15_lw <- na.omit(TRUSA_15)
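Listwise deletion keeps only rows without a single missing cell, and with 39 columns that becomes costly quickly. Assuming, as a simplification, that every cell is missing independently at the stated rate (an assumption about how the NAs were generated, not something the output confirms), the expected share of complete rows is (1 - rate)^39:

```r
rates  <- c(0.05, 0.10, 0.15)
n_cols <- 39      # columns in TRUSA after adding SUM
n_rows <- 1151    # cases in TRUSA

share_complete <- (1 - rates)^n_cols
expected_rows  <- n_rows * share_complete

data.frame(rate           = rates,
           share_complete = round(share_complete, 4),
           expected_rows  = round(expected_rows))
```

Under that assumption only about 13.5%, 1.6% and 0.2% of the rows would survive, so it is worth checking nrow(TRUSA_5_lw), nrow(TRUSA_10_lw) and nrow(TRUSA_15_lw) before rerunning the t-test.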
If the items are ordinal, use poly (in mice, `polr` for ordered factors or `polyreg` for unordered ones).
We need to use multiple imputation.
#library(mice)
#TRUSA_5_im1 <- mice(TRUSA_5[,3:37],m=5,maxit = 50, method = 'logreg', seed=500)
#TRUSA_10_im1 <- mice(TRUSA_10[,3:37],m=5, maxit=50, method='logreg',seed=500)
#TRUSA_15_im1 <- mice(TRUSA_15[,3:37],m=5, maxit = 50, method='logreg',seed=500)
#completed_data_1 <- complete(TRUSA_5_im1, 1)
#completed_data_2 <- complete(TRUSA_5_im1,2)
#completed_data_3 <- complete(TRUSA_5_im1,3)
#completed_data_4 <- complete(TRUSA_5_im1, 4)
#completed_data_5 <- complete(TRUSA_5_im1, 5)
#t_test_result_1 <- t.test(SUM~CNT, data=completed_data_1, var.equal=TRUE)
#note: knitting this took too long, so I disabled it :(
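Since the mice code is commented out, step h is not actually run in these notes. As a much simpler stand-in (single imputation, not the multiple imputation mentioned above), each item can be filled with its most frequent observed value, after which the total and the t-test are recomputed. A sketch on simulated 0/1 items; `impute_mode` and the toy data are illustrative assumptions:

```r
# Replace NAs in a vector with its most frequent observed value
impute_mode <- function(x) {
  mode_val <- as.numeric(names(which.max(table(x))))
  x[is.na(x)] <- mode_val
  x
}

set.seed(7)
n   <- 200
mat <- matrix(rbinom(n * 5, 1, 0.6), ncol = 5,
              dimnames = list(NULL, paste0("M", 1:5)))
cnt <- rep(c("TUR", "USA"), each = n / 2)

mat_na <- mat
mat_na[sample.int(length(mat), 30)] <- NA      # 3% of the cells set to NA

items_imp <- apply(mat_na, 2, impute_mode)     # impute each item separately
total_imp <- rowSums(items_imp)                # rebuild the total afterwards
t.test(total_imp ~ cnt, var.equal = TRUE)
```

Single imputation of this kind tends to understate uncertainty, which is one reason the notes point to multiple imputation instead.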
library(tidyverse)
library(stevemisc)
##
## Attaching package: 'stevemisc'
## The following object is masked from 'package:lubridate':
##
## dst
## The following object is masked from 'package:dplyr':
##
## tbl_df
library(knitr)
library(haven)
library(summarytools)
##
## Attaching package: 'summarytools'
## The following object is masked from 'package:tibble':
##
## view
library(outliers)
##
## Attaching package: 'outliers'
## The following object is masked from 'package:psych':
##
## outlier
library(ggplot2)
library(plotly)
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
library(ggpmisc)
## Loading required package: ggpp
## Registered S3 methods overwritten by 'ggpp':
## method from
## heightDetails.titleGrob ggplot2
## widthDetails.titleGrob ggplot2
##
## Attaching package: 'ggpp'
## The following object is masked from 'package:ggplot2':
##
## annotate
library(psych)
library(sur)
##
## Attaching package: 'sur'
## The following object is masked from 'package:psych':
##
## skew
library(moments)
library(corrplot)
## corrplot 0.95 loaded
library(olsrr)
##
## Attaching package: 'olsrr'
## The following object is masked from 'package:datasets':
##
## rivers
NORMALITY
library(dplyr)
library(haven) # Use the haven package to import SPSS files into R.
screen <- read_sav("SCREEN.sav")
screen <- expss::drop_var_labs(screen)
head(screen) # Show the first few rows of the data set
## # A tibble: 6 × 8
## SUBNO TIMEDRS ATTDRUG ATTHOUSE INCOME EMPLMNT MSTATUS RACE
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 1 8 27 5 1 2 1
## 2 2 3 7 20 6 0 2 1
## 3 3 0 8 23 3 0 2 1
## 4 4 13 9 28 8 1 2 1
## 5 5 15 7 24 1 1 2 1
## 6 6 3 8 25 4 0 2 1
Handling missing data
screen <- screen %>%
mutate(INCOME = ifelse(is.na(INCOME), mean(INCOME, na.rm =TRUE),INCOME)) %>% na.omit()
summary(screen)
## SUBNO TIMEDRS ATTDRUG ATTHOUSE
## Min. : 1.0 Min. : 0.000 Min. : 5.00 Min. : 2.00
## 1st Qu.:136.8 1st Qu.: 2.000 1st Qu.: 7.00 1st Qu.:21.00
## Median :313.5 Median : 4.000 Median : 8.00 Median :24.00
## Mean :317.3 Mean : 7.914 Mean : 7.69 Mean :23.54
## 3rd Qu.:483.2 3rd Qu.:10.000 3rd Qu.: 9.00 3rd Qu.:27.00
## Max. :758.0 Max. :81.000 Max. :10.00 Max. :35.00
## INCOME EMPLMNT MSTATUS RACE
## Min. : 1.000 Min. :0.000 Min. :1.00 Min. :1.000
## 1st Qu.: 3.000 1st Qu.:0.000 1st Qu.:2.00 1st Qu.:1.000
## Median : 4.000 Median :0.000 Median :2.00 Median :1.000
## Mean : 4.208 Mean :0.472 Mean :1.78 Mean :1.086
## 3rd Qu.: 6.000 3rd Qu.:1.000 3rd Qu.:2.00 3rd Qu.:1.000
## Max. :10.000 Max. :1.000 Max. :2.00 Max. :2.000
x <- c(3,5,7,NA,9)
ifelse(is.na(x),mean(x,na.rm=TRUE),x)
## [1] 3 5 7 6 9
For categorical variables:
library(dplyr)
table(screen$RACE)
##
## 1 2
## 424 40
library(summarytools)
freq(screen$RACE,
round.digits=2,report.nas = FALSE,
style = "rmarkdown")
## setting plain.ascii to FALSE
## ### Frequencies
## #### screen$RACE
## **Type:** Numeric
##
## | | Freq | % | % Cum. |
## |----------:|-----:|-------:|-------:|
## | **1** | 424 | 91.38 | 91.38 |
## | **2** | 40 | 8.62 | 100.00 |
## | **Total** | 464 | 100.00 | 100.00 |
library(knitr)
freq(screen$MSTATUS,report.nas = FALSE) %>%
kable(format='markdown',
caption="Frequency Table",digits = 2)
| | Freq | % Valid | % Valid Cum. | % Total | % Total Cum. |
|---|---|---|---|---|---|
| 1 | 102 | 21.98 | 21.98 | 21.98 | 21.98 |
| 2 | 362 | 78.02 | 100.00 | 78.02 | 100.00 |
| <NA> | 0 | NA | NA | 0.00 | 100.00 |
| Total | 464 | 100.00 | 100.00 | 100.00 | 100.00 |
Take a closer look at the summarytools package.
Outliers in continuous variables:
library(outliers)
z.scores <- screen %>%
select(2:5) %>%
scores(type = "z") %>%
round(2)
head(z.scores)
## TIMEDRS ATTDRUG ATTHOUSE INCOME
## 1 -0.63 0.27 0.77 0.34
## 2 -0.45 -0.60 -0.79 0.76
## 3 -0.72 0.27 -0.12 -0.51
## 4 0.46 1.13 0.99 1.61
## 5 0.65 -0.60 0.10 -1.36
## 6 -0.45 0.27 0.33 -0.09
summarytools::descr(z.scores,
stats = c("min", "max"),
transpose = TRUE,
headings = FALSE)
##
## Min Max
## -------------- ------- ------
## ATTDRUG -2.33 2.00
## ATTHOUSE -4.80 2.56
## INCOME -1.36 2.46
## TIMEDRS -0.72 6.67
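The same kind of z-scores can be produced in base R with scale(), and cases beyond a chosen cutoff can then be listed. The cutoff of 3.29 (two-tailed p < .001) is a common convention and an assumption here, since the notes do not state one; the data are simulated:

```r
set.seed(3)
x <- c(rnorm(99, mean = 8, sd = 2), 40)   # simulated variable with one extreme value

z <- as.numeric(scale(x))                 # (x - mean(x)) / sd(x)
range(round(z, 2))

cutoff  <- 3.29                           # assumed convention, not from the notes
flagged <- which(abs(z) > cutoff)         # positions of the extreme cases
flagged
```

Working with positions rather than values makes it easy to inspect or drop the flagged rows in the original data frame.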
library(DT)
DT::datatable(z.scores)
library(ggplot2)
ggplot(screen, aes(x = TIMEDRS)) +
geom_histogram(bins = 30L, fill = "#0c4c8a")
# library(ggpmisc)
ggplot(screen, aes(x = TIMEDRS)) + geom_histogram() +
geom_vline(xintercept =7.914, color = "red",
linetype = "dashed") +
annotate("text", label = "Mean = 7.914", x = 10, y = 100, color ="black")
## `stat_bin()` using `bins = 30`. Pick better value `binwidth`.
ggplot(screen, aes(x = TIMEDRS)) +
geom_histogram(aes(y = after_stat(density))) +
geom_density(alpha=.5, fill="#0c4c8a") +
theme_minimal()
## `stat_bin()` using `bins = 30`. Pick better value `binwidth`.
library(plotly)
plot_ly(x = screen$TIMEDRS, type = "histogram",
histnorm = "probability")
ggplot(screen, aes(y = TIMEDRS)) +
geom_boxplot()
out <- boxplot.stats(screen$TIMEDRS)$out
out
## [1] 60 23 39 33 38 34 27 30 25 49 60 27 27 52 24 57 52 58 57 43 37 75 29 30 25
## [26] 37 56 29 37 81 27 23
out_ind <- which(screen$TIMEDRS %in% c(out))
out_ind
## [1] 40 64 67 76 79 96 102 117 150 163 168 170 178 193 203 206 213 249 274
## [20] 278 285 289 309 342 344 362 367 374 388 404 408 443
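boxplot.stats() flags values that lie more than 1.5 times the hinge spread below the lower hinge or above the upper hinge. A sketch that rebuilds this rule from fivenum() on simulated data (`tukey_out` is an illustrative name):

```r
# Values outside [lower hinge - coef * spread, upper hinge + coef * spread]
tukey_out <- function(x, coef = 1.5) {
  h      <- fivenum(x)            # min, lower hinge, median, upper hinge, max
  spread <- h[4] - h[2]
  x[x < h[2] - coef * spread | x > h[4] + coef * spread]
}

set.seed(11)
visits <- c(rpois(60, lambda = 5), 40, 55)   # simulated counts with two extremes
tukey_out(visits)
all.equal(tukey_out(visits), boxplot.stats(visits)$out)
```

Because the rule depends only on the hinges, a strongly right-skewed variable such as TIMEDRS will typically have many values flagged on the upper side.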
plot_ly(y = screen$TIMEDRS, type = 'box')
plot_ly(y = screen$TIMEDRS, type = 'box') %>%
layout(title = 'Box Plot',
annotations = list( x = -0.01, y = boxplot.stats(screen$TIMEDRS)$out,
text = paste(out_ind), showarrow = FALSE,
xanchor = "right"))
ggplot(screen, aes(x = factor(MSTATUS),
y = TIMEDRS, fill = factor(MSTATUS))) +
geom_boxplot() +
theme_minimal()
ggplot(screen) + aes(x = ATTDRUG) +
geom_histogram( bins = 6, fill = "#0c4c8a")+
theme_minimal()
ggplot(screen) +
aes(x = ATTHOUSE) +
geom_histogram( bins = 10, fill = "darkgreen") +
theme_minimal()
plot_ly(y = screen$ATTHOUSE, type = 'box')
screen[c(260,298),]
## # A tibble: 2 × 8
## SUBNO TIMEDRS ATTDRUG ATTHOUSE INCOME EMPLMNT MSTATUS RACE
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 346 2 8 2 1 0 1 1
## 2 407 2 8 2 4 0 1 1
screen2 <- screen[-c(260,298),]
Mahalanobis Distance
library(psych)
veri <- screen2[,1:5]
md <- mahalanobis(veri, center = colMeans(veri), cov = cov(veri))
head(md,20)
## [1] 3.785517 4.541493 3.501077 7.281365 5.457240 2.896550 5.807898
## [8] 3.879478 4.751166 7.415405 10.602100 5.249121 6.073732 3.271885
## [15] 12.316463 4.440749 4.836160 6.362806 4.126524 10.797545
library(psych)
alpha <- .001
cutoff <- (qchisq(p = 1 - alpha, df = ncol(veri)))
cutoff
## [1] 20.51501
ucdegerler <- which(md > cutoff)
veri[ucdegerler, ]
## # A tibble: 9 × 5
## SUBNO TIMEDRS ATTDRUG ATTHOUSE INCOME
## <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 48 60 7 24 1
## 2 235 60 10 29 4
## 3 276 57 9 24 2
## 4 291 52 8 19 1
## 5 330 58 7 29 4
## 6 370 57 8 23 4
## 7 398 75 9 33 9
## 8 502 56 8 19 3
## 9 548 81 8 24 9
data_temiz <- veri[-ucdegerler, ]
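In the code above, `veri` keeps column 1 (SUBNO), which is a case identifier, so the ID enters both the distance and the degrees of freedom. A sketch of the same procedure on simulated data with the identifier left out (variable names are illustrative; alpha = .001 follows the notes):

```r
set.seed(5)
n   <- 300
dat <- data.frame(id = seq_len(n),
                  x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat[1, c("x1", "x2", "x3")] <- c(6, -6, 6)     # plant one multivariate outlier

vars <- dat[, c("x1", "x2", "x3")]             # substantive variables only, no id
md   <- mahalanobis(vars, center = colMeans(vars), cov = cov(vars))

cutoff  <- qchisq(1 - 0.001, df = ncol(vars))  # df = number of variables used
extreme <- which(md > cutoff)
dat$id[extreme]

# Guard against an empty index: dat[-integer(0), ] would drop every row
dat_clean <- if (length(extreme) > 0) dat[-extreme, ] else dat
nrow(dat_clean)
```

The guard on the last step matters in practice, because negative indexing with an empty vector returns a data frame with zero rows.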
Multivariate Normality Assumption
library(sur)
attach(screen)
skew(screen$TIMEDRS)
## [1] 3.234045
skew(data_temiz$TIMEDRS)
se.skew(TIMEDRS)
## [1] 0.1133494
skew.ratio(TIMEDRS)
## [1] 28.53164
skew(TIMEDRS)/se.skew(TIMEDRS)
## [1] 28.53164
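The ratio of skewness to its standard error is read much like a z statistic. A base-R sketch of both quantities follows. The standard error formula reproduces the 0.1133 reported above for n = 464, while the skewness estimator used here (the bias-adjusted G1) is an assumption and may not be the exact formula behind sur::skew():

```r
# Bias-adjusted sample skewness (G1); assumed estimator
skew_g1 <- function(x) {
  n  <- length(x)
  m2 <- mean((x - mean(x))^2)
  m3 <- mean((x - mean(x))^3)
  (m3 / m2^1.5) * sqrt(n * (n - 1)) / (n - 2)
}

# Standard error of skewness; depends on the sample size only
se_skew <- function(n) sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

se_skew(464)                       # about 0.1133

set.seed(9)
x <- rexp(464, rate = 1 / 8)       # simulated right-skewed data
skew_g1(x) / se_skew(length(x))    # ratios far beyond 2 or 3 suggest clear skew
```

With several hundred cases the standard error is small, so even modest skewness produces a large ratio; the histogram is a useful companion to the number.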
The jarque.test function tests the null hypothesis that the data do not deviate from a normal distribution, so a small p-value indicates a departure from normality.
library(moments)
library(labelled)
jarque.test(remove_labels(TIMEDRS))
##
## Jarque-Bera Normality Test
##
## data: remove_labels(TIMEDRS)
## JB = 4034.9, p-value < 2.2e-16
## alternative hypothesis: greater
jarque.test(remove_labels(ATTDRUG))
##
## Jarque-Bera Normality Test
##
## data: remove_labels(ATTDRUG)
## JB = 5.0552, p-value = 0.07985
## alternative hypothesis: greater
skew.ratio(ATTDRUG)
## [1] -1.10762
jarque.test(remove_labels(ATTHOUSE))
##
## Jarque-Bera Normality Test
##
## data: remove_labels(ATTHOUSE)
## JB = 61.092, p-value = 5.418e-14
## alternative hypothesis: greater
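The Jarque-Bera statistic combines sample skewness S and kurtosis K as JB = n/6 * (S^2 + (K - 3)^2 / 4) and is compared with a chi-square distribution with 2 degrees of freedom. A base-R sketch of that calculation (`jb_test` is an illustrative helper, not the moments implementation):

```r
jb_test <- function(x) {
  n  <- length(x)
  d  <- x - mean(x)
  s  <- mean(d^3) / mean(d^2)^1.5       # skewness from central moments
  k  <- mean(d^4) / mean(d^2)^2         # kurtosis (3 under normality)
  jb <- n / 6 * (s^2 + (k - 3)^2 / 4)
  c(JB = jb, p.value = pchisq(jb, df = 2, lower.tail = FALSE))
}

set.seed(0)
jb_test(rnorm(200))            # normal draws
jb_test(rexp(200, rate = 3))   # clearly skewed draws
```

Read this way, the very large JB for TIMEDRS reflects its strong skew, whereas ATTDRUG does not reach significance at the .05 level.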
set.seed(0)
normal <- rnorm(200)
non_normal <- rexp(200, rate=3)
par(mfrow=c(1,2))
hist(normal, col='steelblue', main='Normal')
hist(non_normal, col='steelblue', main='Non-normal')
par(mfrow=c(1,2))
qqnorm(normal, main='Normal')
qqline(normal)
qqnorm(non_normal, main='Non-normal')
qqline(non_normal)
ggplot(data=screen, aes(sample=ATTHOUSE))+
geom_qq()+
geom_qq_line()
This part is an extra note; apparently I wrote it in class :)
#any_na(TRUSA)
#n_miss(TRUSA)
#prop_miss(TRUSA)
#TRUSA %>% is.na() %>% colSums()
#miss_var_summary(TRUSA)
#miss_var_table(TRUSA)
TRUSA$toplam <-rowSums(TRUSA[,3:37],na.rm =TRUE)
TRUSA$toplam
## [1] 10 18 8 11 6 21 19 18 19 20 12 6 7 28 15 12 26 18 9 25 13 7 5 22
## [25] 20 16 26 23 10 13 9 27 20 7 6 5 7 5 7 6 30 28 30 6 20 8 29 13
## [49] 12 5 8 9 11 7 15 12 21 22 12 7 4 6 27 13 22 20 7 15 3 10 6 28
## [73] 8 12 30 30 21 13 11 12 9 14 19 7 16 13 6 30 27 8 9 18 11 31 5 8
## [97] 5 10 27 9 3 7 8 16 22 6 18 10 9 3 7 16 14 7 21 8 8 9 7 14
## [121] 27 7 26 20 9 12 20 29 17 24 23 9 2 26 22 10 21 9 12 22 25 11 13 5
## [145] 9 6 16 13 10 7 15 11 15 27 21 4 14 12 10 9 13 6 9 11 5 13 15 21
## [169] 12 26 29 23 8 11 14 9 7 5 8 8 9 27 22 16 15 5 8 19 12 8 22 10
## [193] 10 6 24 19 16 16 8 17 6 12 24 9 9 8 7 21 18 11 9 7 15 9 26 6
## [217] 24 7 13 10 31 8 23 5 6 11 8 13 8 15 9 9 15 7 25 9 17 8 6 10
## [241] 31 28 32 32 11 13 17 28 7 21 4 13 8 7 12 26 25 11 7 18 21 12 16 24
## [265] 6 6 12 4 8 6 4 7 4 8 21 7 29 3 24 17 12 6 15 7 8 15 11 10
## [289] 17 8 3 6 9 16 13 6 17 11 4 6 16 8 10 6 6 6 14 26 12 7 16 2
## [313] 28 4 6 27 17 14 7 3 17 8 6 5 8 9 9 25 4 6 7 28 16 7 7 7
## [337] 3 13 26 8 25 19 9 8 12 15 18 16 12 13 19 29 5 32 10 10 22 4 6 9
## [361] 5 10 10 6 6 4 24 8 6 7 20 11 14 15 8 24 23 25 16 21 25 8 10 8
## [385] 5 9 9 8 6 6 28 7 27 6 13 9 6 5 8 21 20 9 8 10 16 28 4 11
## [409] 6 24 17 10 5 8 26 9 17 23 19 8 6 15 16 23 21 14 21 13 20 4 17 19
## [433] 5 28 32 7 22 8 19 20 27 13 8 29 33 16 12 26 22 19 20 11 10 1 1 1
## [457] 14 22 22 25 14 6 17 28 29 30 28 28 18 22 21 26 7 7 18 11 6 27 18 5
## [481] 17 23 19 22 15 9 22 29 12 23 27 17 23 8 23 14 18 24 28 28 32 21 15 10
## [505] 34 20 17 19 14 16 6 17 14 9 20 8 12 7 8 18 17 7 22 11 21 17 26 15
## [529] 11 27 5 9 4 23 6 23 19 16 20 8 6 14 24 7 19 29 8 22 24 21 10 22
## [553] 22 13 17 28 28 24 10 15 4 24 22 19 18 16 13 6 17 9 19 13 19 7 8 11
## [577] 11 19 16 11 19 16 22 22 31 16 30 30 19 25 11 8 19 5 30 12 15 28 20 19
## [601] 22 3 18 18 19 21 23 24 22 9 32 20 9 9 6 29 20 9 23 26 15 13 14 5
## [625] 10 15 17 32 31 31 20 10 15 21 22 22 30 31 17 20 32 20 11 9 12 13 10 17
## [649] 22 6 6 15 21 18 9 6 22 14 9 24 16 10 28 26 15 19 6 18 27 16 6 7
## [673] 4 8 9 16 30 15 10 10 23 28 18 9 20 12 18 25 31 20 28 27 10 20 24 30
## [697] 4 23 21 26 20 11 30 16 8 7 32 24 7 9 22 17 24 29 12 7 17 27 11 17
## [721] 9 8 15 6 9 27 30 7 11 10 11 19 27 15 1 8 8 6 8 14 12 15 6 18
## [745] 7 15 21 32 9 11 18 22 25 16 30 27 25 16 12 17 14 18 20 18 19 20 21 22
## [769] 19 29 24 29 23 11 16 11 17 13 23 22 5 8 9 5 10 7 7 14 4 17 22 10
## [793] 3 11 23 23 27 8 6 3 10 6 10 8 8 15 15 7 21 16 24 10 12 8 24 24
## [817] 25 18 17 11 13 17 17 12 26 26 15 13 6 15 20 22 8 29 22 7 11 17 10 22
## [841] 10 2 4 19 8 26 21 26 22 27 29 22 26 7 19 17 25 13 17 22 12 12 9 14
## [865] 19 20 16 24 33 29 12 25 30 24 16 23 26 15 6 29 29 17 18 7 7 25 31 17
## [889] 12 15 15 8 14 7 22 24 12 7 18 10 12 16 20 18 14 32 29 29 24 29 17 6
## [913] 10 13 24 22 18 30 21 8 7 20 27 25 10 23 26 11 11 12 27 24 28 23 26 19
## [937] 16 18 30 31 21 26 17 29 18 8 16 23 11 12 17 18 17 17 22 20 4 15 13 11
## [961] 20 20 12 19 30 15 11 20 32 14 28 6 16 17 25 23 16 10 24 29 8 8 23 22
## [985] 20 26 29 16 19 20 17 28 22 16 29 22 12 2 11 15 21 23 16 17 19 11 6 15
## [1009] 11 15 16 8 9 9 10 24 19 17 29 18 22 15 17 21 20 10 28 30 20 19 22 15
## [1033] 10 16 15 14 28 15 19 9 5 11 19 26 5 16 25 14 32 23 17 25 17 17 18 21
## [1057] 13 6 19 15 18 5 14 25 10 12 12 14 4 11 6 10 8 15 16 18 18 6 11 10
## [1081] 20 17 8 11 15 20 3 13 12 6 19 2 9 22 17 11 11 16 7 28 25 24 25 20
## [1105] 28 31 14 17 11 21 22 17 28 32 12 23 12 18 29 17 14 15 13 22 8 17 11 7
## [1129] 9 9 7 12 27 9 6 7 26 27 23 22 27 10 5 19 14 29 30 25 26 13 13
veri_1 <- TRUSA %>%
group_by(TRUSA$toplam) %>%
select(CNT)
## Adding missing grouping variables: `TRUSA$toplam`
veri_1
## # A tibble: 1,151 × 2
## # Groups: TRUSA$toplam [34]
## `TRUSA$toplam` CNT
## <dbl> <chr>
## 1 10 TUR
## 2 18 TUR
## 3 8 TUR
## 4 11 TUR
## 5 6 TUR
## 6 21 TUR
## 7 19 TUR
## 8 18 TUR
## 9 19 TUR
## 10 20 TUR
## # ℹ 1,141 more rows
R is just like a foreign language: you have to keep practicing it. Today we talked about missing data. This class was the first time I heard of jarque.test(), a newer-generation normality test.