ASSIGNMENT 4

R Markdown

In this assignment we explain in detail what we have learned and practiced in RStudio so far, working through the topics in order, step by step.

R MARKDOWN

Once we have chosen a data set in R, we can display it as a table in an R Markdown document by using the paged_table() function from the rmarkdown package.

WOOLDRIDGE

wooldridge is a package of data sets prepared for practice exercises.

library(), require()

An installed package must be loaded into the current R session before it can be used. The library() and require() commands do this. Note that R is case-sensitive, so both names are written in lower case.

Now let's move on to practice and apply all of this. First we choose a data set. Ours is "alcohol".
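The two loading commands behave differently when a package is not installed. A minimal sketch of that difference follows. The package name stored in missing_pkg is hypothetical and is assumed not to exist on the machine.

```r
# stats ships with every R installation, so this call always succeeds.
library(stats)

# Hypothetical package name, assumed NOT to be installed.
missing_pkg <- "surelyNotAnInstalledPackage"

# require() does not stop the script: it returns FALSE when loading fails.
loaded <- suppressWarnings(
  require(missing_pkg, character.only = TRUE, quietly = TRUE)
)

# library() raises an error in the same situation;
# tryCatch() converts that error into an ordinary value here.
result <- tryCatch(
  {
    library(missing_pkg, character.only = TRUE)
    "loaded"
  },
  error = function(e) "error"
)

print(loaded)
print(result)
```

Because library() fails loudly, it is usually the safer choice at the top of a script. require() is handy when the code needs to test whether a package is available.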

library(wooldridge)
library(rmarkdown)
data("alcohol")
paged_table(alcohol)

There are two ways to refer to the variables of a data set in a model. The first is to pass the data set through the data argument:

lm(abuse~status+unemrate+age+educ+married+famsize+white+exhealth,data = alcohol)

## 
## Call:
## lm(formula = abuse ~ status + unemrate + age + educ + married + 
##     famsize + white + exhealth, data = alcohol)
## 
## Coefficients:
## (Intercept)       status     unemrate          age         educ      married  
##   2.152e-01   -7.905e-03   -2.274e-03    1.518e-05   -3.496e-03   -8.039e-03  
##     famsize        white     exhealth  
##  -1.304e-02    1.975e-02   -2.297e-02

The second is to reference each column directly with the $ operator:

lm(formula = alcohol$abuse~alcohol$status+alcohol$unemrate+alcohol$age+alcohol$educ+alcohol$famsize+alcohol$white+alcohol$exhealth)

## 
## Call:
## lm(formula = alcohol$abuse ~ alcohol$status + alcohol$unemrate + 
##     alcohol$age + alcohol$educ + alcohol$famsize + alcohol$white + 
##     alcohol$exhealth)
## 
## Coefficients:
##      (Intercept)    alcohol$status  alcohol$unemrate       alcohol$age  
##        2.163e-01        -8.332e-03        -2.362e-03        -7.819e-05  
##     alcohol$educ   alcohol$famsize     alcohol$white  alcohol$exhealth  
##       -3.466e-03        -1.387e-02         1.927e-02        -2.315e-02

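Whichever of the two we use, the estimates are identical as long as the same regressors are included. The coefficients shown above differ slightly only because the second call omits married.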
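The equivalence of the two notations can be checked directly. The sketch below uses the built-in mtcars data as a stand-in (an assumption, chosen so that the example runs without extra packages) and fits the same model both ways.

```r
# mtcars is a built-in data set, used here only as a stand-in for alcohol.

# First notation: variable names plus the data argument.
fit_data <- lm(mpg ~ wt + hp, data = mtcars)

# Second notation: each column referenced with the $ operator.
fit_dollar <- lm(mtcars$mpg ~ mtcars$wt + mtcars$hp)

# The coefficient names differ ("wt" vs "mtcars$wt"), so compare values only.
same_estimates <- isTRUE(all.equal(
  unname(coef(fit_data)),
  unname(coef(fit_dollar))
))
print(same_estimates)
```

The data argument form is generally easier to read, and it also lets functions such as predict() work with new data.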

SUMMARY

We use the summary() command to summarize the model results in an easily readable form. Let's put it into practice right away.

summary(lm(formula = abuse~status+unemrate+age+educ+married+famsize+white+exhealth, data=alcohol))

## 
## Call:
## lm(formula = abuse ~ status + unemrate + age + educ + married + 
##     famsize + white + exhealth, data = alcohol)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.1935 -0.1154 -0.0957 -0.0713  0.9961 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.152e-01  2.863e-02   7.516 6.14e-14 ***
## status      -7.905e-03  5.890e-03  -1.342 0.179610    
## unemrate    -2.274e-03  2.004e-03  -1.135 0.256500    
## age          1.518e-05  3.330e-04   0.046 0.963650    
## educ        -3.496e-03  1.079e-03  -3.239 0.001203 ** 
## married     -8.039e-03  8.975e-03  -0.896 0.370477    
## famsize     -1.304e-02  2.187e-03  -5.964 2.55e-09 ***
## white        1.975e-02  8.634e-03   2.288 0.022173 *  
## exhealth    -2.297e-02  6.280e-03  -3.658 0.000256 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2977 on 9813 degrees of freedom
## Multiple R-squared:  0.008511,   Adjusted R-squared:  0.007702 
## F-statistic: 10.53 on 8 and 9813 DF,  p-value: 8.005e-15
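The printed summary can also be used programmatically. The following sketch, which again assumes the built-in mtcars data as a stand-in, pulls the coefficient table, the p-values and the R-squared values out of the summary object.

```r
# mtcars is a built-in stand-in data set (assumption, not the alcohol data).
fit <- lm(mpg ~ wt + hp, data = mtcars)
fit_summary <- summary(fit)

# Coefficient table as a matrix with the columns
# Estimate, Std. Error, t value and Pr(>|t|).
coef_table <- coef(fit_summary)

# Names of the terms whose p-value is below 0.05.
p_values <- coef_table[, "Pr(>|t|)"]
significant <- names(p_values)[p_values < 0.05]

# Goodness-of-fit measures stored in the summary object.
r_squared <- fit_summary$r.squared
adj_r_squared <- fit_summary$adj.r.squared

print(coef_table)
print(significant)
```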

STARGAZER

stargazer lets us present our results in more descriptive tables and compare one or more models side by side. We use the stargazer package to show different regression tables together.

model1<-lm(abuse~I(status*2)+unemrate+age+educ+married+famsize+white+exhealth,data = alcohol)
model2<-lm(abuse~status+unemrate+I(age/2)+educ+married+famsize+white+exhealth,data = alcohol)

library(stargazer)

## 
## Please cite as:

##  Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.

##  R package version 5.2.3. https://CRAN.R-project.org/package=stargazer

stargazer(list(model1,model2),type = "text")

## 
## ============================================================
##                                     Dependent variable:     
##                                 ----------------------------
##                                            abuse            
##                                      (1)            (2)     
## ------------------------------------------------------------
## I(status * 2)                       -0.004                  
##                                    (0.003)                  
##                                                             
## status                                            -0.008    
##                                                   (0.006)   
##                                                             
## unemrate                            -0.002        -0.002    
##                                    (0.002)        (0.002)   
##                                                             
## age                                0.00002                  
##                                    (0.0003)                 
##                                                             
## I(age/2)                                          0.00003   
##                                                   (0.001)   
##                                                             
## educ                              -0.003***      -0.003***  
##                                    (0.001)        (0.001)   
##                                                             
## married                             -0.008        -0.008    
##                                    (0.009)        (0.009)   
##                                                             
## famsize                           -0.013***      -0.013***  
##                                    (0.002)        (0.002)   
##                                                             
## white                              0.020**        0.020**   
##                                    (0.009)        (0.009)   
##                                                             
## exhealth                          -0.023***      -0.023***  
##                                    (0.006)        (0.006)   
##                                                             
## Constant                           0.215***      0.215***   
##                                    (0.029)        (0.029)   
##                                                             
## ------------------------------------------------------------
## Observations                        9,822          9,822    
## R2                                  0.009          0.009    
## Adjusted R2                         0.008          0.008    
## Residual Std. Error (df = 9813)     0.298          0.298    
## F Statistic (df = 8; 9813)        10.529***      10.529***  
## ============================================================
## Note:                            *p<0.1; **p<0.05; ***p<0.01

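Coefficient on I(status * 2) in model (1): -0.004, half the coefficient on status (-0.008), because the regressor was doubled.

Coefficient on I(age/2) in model (2): 0.00003, twice the coefficient on age (0.00002 after rounding), because the regressor was halved.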
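The effect of rescaling a regressor can be verified numerically. This sketch assumes the built-in mtcars data as a stand-in and doubles one regressor with I(), in the same way as model1 above.

```r
# mtcars is a built-in stand-in data set (assumption).
base_fit <- lm(mpg ~ wt + hp, data = mtcars)

# I() protects the arithmetic so that wt * 2 is computed before fitting.
scaled_fit <- lm(mpg ~ I(wt * 2) + hp, data = mtcars)

# Position 2 is the (rescaled) wt term in both models.
coef_original <- unname(coef(base_fit)[2])
coef_rescaled <- unname(coef(scaled_fit)[2])

# Doubling the regressor halves its coefficient.
is_half <- isTRUE(all.equal(coef_rescaled, coef_original / 2))

# The intercept and the other slope are unaffected.
others_equal <- isTRUE(all.equal(
  unname(coef(scaled_fit)[-2]),
  unname(coef(base_fit)[-2])
))
print(c(is_half = is_half, others_equal = others_equal))
```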

STANDARDIZATION

lm(scale(abuse)~scale(status)+scale(unemrate)+scale(age)+scale(educ)+scale(married)+scale(famsize)+scale(white)+scale(exhealth),data = alcohol)

## 
## Call:
## lm(formula = scale(abuse) ~ scale(status) + scale(unemrate) + 
##     scale(age) + scale(educ) + scale(married) + scale(famsize) + 
##     scale(white) + scale(exhealth), data = alcohol)
## 
## Coefficients:
##     (Intercept)    scale(status)  scale(unemrate)       scale(age)  
##       1.252e-14       -1.403e-02       -1.145e-02        4.894e-04  
##     scale(educ)   scale(married)   scale(famsize)     scale(white)  
##      -3.391e-02       -1.041e-02       -6.651e-02        2.340e-02  
## scale(exhealth)  
##      -3.788e-02
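Wrapping every variable in scale() converts it to mean zero and standard deviation one, so each slope measures the change in the dependent variable, in standard deviations, for a one standard deviation change in the regressor. The sketch below, which assumes the built-in mtcars data as a stand-in, checks that the same standardized slopes can be obtained from the ordinary coefficients as b * sd(x) / sd(y).

```r
# mtcars is a built-in stand-in data set (assumption).
raw_fit <- lm(mpg ~ wt + hp, data = mtcars)
std_fit <- lm(scale(mpg) ~ scale(wt) + scale(hp), data = mtcars)

# Standardized slopes taken directly from the scaled regression.
std_slopes <- as.numeric(coef(std_fit))[-1]

# The same slopes computed by hand from the unscaled regression.
sd_x <- c(sd(mtcars$wt), sd(mtcars$hp))
sd_y <- sd(mtcars$mpg)
manual_slopes <- as.numeric(coef(raw_fit))[-1] * sd_x / sd_y

slopes_match <- isTRUE(all.equal(std_slopes, manual_slopes))

# The intercept of the standardized model is zero up to rounding error.
std_intercept <- as.numeric(coef(std_fit))[1]
print(slopes_match)
```

This also explains why the intercept in the output above is about 1e-14: it is zero apart from floating-point error.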

DPLYR

The dplyr package is known as a grammar of data manipulation. Its core functions help with many kinds of transformation and tidying. They include select(), filter(), mutate(), group_by() and, finally, summarize().

Now let's prepare our data.
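A short sketch of how these functions chain together is given here. It assumes the built-in mtcars data as a stand-in, and the column name wt_kg is introduced only for illustration.

```r
library(dplyr)

# mtcars is a built-in stand-in data set (assumption).
result <- mtcars %>%
  select(mpg, cyl, wt) %>%            # keep three columns
  filter(cyl %in% c(4, 6)) %>%        # keep cars with 4 or 6 cylinders
  mutate(wt_kg = wt * 453.592) %>%    # wt is in 1000 lbs; convert to kilograms
  group_by(cyl) %>%                   # one group per cylinder count
  summarize(mean_mpg = mean(mpg),     # average fuel economy per group
            n = n())                  # number of cars per group

print(result)
```

Each function takes a data frame and returns a data frame, which is what makes the %>% pipeline possible.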

library(wooldridge)
library(rmarkdown)
data("alcohol")
paged_table(alcohol)

With dplyr installed, let's load it and work with our data set.

require(dplyr)

## Loading required package: dplyr

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

alcohol%>%
  group_by(educ)%>%
  summarise("ortalama büyüklük(unemrate)"=mean(unemrate))

## # A tibble: 20 × 2
##     educ `ortalama büyüklük(unemrate)`
##    <int>                         <dbl>
##  1     0                          6.02
##  2     1                          5.79
##  3     2                          6.07
##  4     3                          5.78
##  5     4                          5.86
##  6     5                          5.71
##  7     6                          5.55
##  8     7                          5.73
##  9     8                          5.53
## 10     9                          5.62
## 11    10                          5.71
## 12    11                          5.82
## 13    12                          5.56
## 14    13                          5.64
## 15    14                          5.54
## 16    15                          5.76
## 17    16                          5.47
## 18    17                          5.51
## 19    18                          5.51
## 20    19                          5.62

To see the mean of every variable by age, we can use across(everything()) inside summarize().

require(dplyr)
alcohol%>%
  group_by(age) %>%
  summarize(across(everything(),mean))

## # A tibble: 35 × 33
##      age  abuse status unemrate  educ married famsize white exhealth vghealth
##    <int>  <dbl>  <dbl>    <dbl> <dbl>   <dbl>   <dbl> <dbl>    <dbl>    <dbl>
##  1    25 0.126    2.82     5.54  13.3   0.489    2.16 0.861    0.434    0.344
##  2    26 0.1      2.86     5.54  13.2   0.578    2.22 0.867    0.453    0.325
##  3    27 0.0815   2.86     5.53  13.2   0.614    2.43 0.834    0.497    0.307
##  4    28 0.0948   2.85     5.61  13.3   0.641    2.43 0.855    0.456    0.302
##  5    29 0.0894   2.85     5.54  13.5   0.709    2.65 0.849    0.453    0.341
##  6    30 0.0921   2.85     5.55  13.4   0.710    2.78 0.832    0.461    0.344
##  7    31 0.0823   2.88     5.61  13.5   0.769    2.80 0.835    0.468    0.293
##  8    32 0.0865   2.89     5.59  13.6   0.749    2.74 0.896    0.539    0.268
##  9    33 0.0877   2.88     5.60  13.4   0.777    2.98 0.832    0.436    0.338
## 10    34 0.111    2.87     5.48  13.7   0.785    2.93 0.848    0.459    0.299
## # … with 25 more rows, and 23 more variables: goodhealth <dbl>,
## #   fairhealth <dbl>, northeast <dbl>, midwest <dbl>, south <dbl>,
## #   centcity <dbl>, outercity <dbl>, qrt1 <dbl>, qrt2 <dbl>, qrt3 <dbl>,
## #   beertax <dbl>, cigtax <dbl>, ethanol <dbl>, mothalc <dbl>, fathalc <dbl>,
## #   livealc <dbl>, inwf <dbl>, employ <dbl>, agesq <dbl>, beertaxsq <dbl>,
## #   cigtaxsq <dbl>, ethanolsq <dbl>, educsq <dbl>
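across(everything(), mean) works here because every column of alcohol is numeric. When a data set also contains text columns, the selection can be restricted with where(is.numeric). A minimal sketch on a small made-up data frame (toy and its columns are hypothetical):

```r
library(dplyr)

# Hypothetical toy data with one text column that cannot be averaged.
toy <- data.frame(
  group = c("a", "a", "b", "b"),
  label = c("w", "x", "y", "z"),
  x     = c(1, 3, 5, 7),
  y     = c(10, 20, 30, 40)
)

# Average only the numeric columns within each group.
means <- toy %>%
  group_by(group) %>%
  summarize(across(where(is.numeric), mean))

print(means)
```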


YOUSSOUF ABDEL-MOUNTALİB MAHAMAT SALEH

2022-11-24
