1 rmarkdown
We use the rmarkdown package when we want to display data as a table. For example, paged_table() renders a data frame as a paged table that can be browsed in HTML output. The example below makes this concrete.
library(wooldridge)
library(rmarkdown)
data("crime1")
paged_table(crime1)
Now we can see why we use rmarkdown: the data frame is shown as a readable table.
2 summary()
Let's move on to another topic. We use summary() when we want to pull the results out of a fitted model. We will look at a regression in two different ways; the first is shown in the example below.
summary(lm(formula = narr86 ~ nfarr86 + nparr86 + pcnv + avgsen + tottime + ptime86 + qemp86 + inc86, data = crime1))
##
## Call:
## lm(formula = narr86 ~ nfarr86 + nparr86 + pcnv + avgsen + tottime +
## ptime86 + qemp86 + inc86, data = crime1)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -1.3416 -0.1585 -0.1193 -0.0634  6.1119
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.1777307  0.0202125   8.793  < 2e-16 ***
## nfarr86      0.9459576  0.0206513  45.806  < 2e-16 ***
## nparr86      0.4411978  0.0245568  17.966  < 2e-16 ***
## pcnv        -0.0583139  0.0232363  -2.510  0.01214 *
## avgsen      -0.0004440  0.0070463  -0.063  0.94976
## tottime      0.0048042  0.0054347   0.884  0.37678
## ptime86     -0.0139259  0.0050745  -2.744  0.00610 **
## qemp86       0.0029124  0.0083282   0.350  0.72659
## inc86       -0.0006131  0.0001963  -3.123  0.00181 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4773 on 2716 degrees of freedom
## Multiple R-squared: 0.6922, Adjusted R-squared: 0.6913
## F-statistic: 763.6 on 8 and 2716 DF, p-value: < 2.2e-16
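Individual numbers can also be pulled out of a summary() object instead of reading them off the printed output. The following is a minimal sketch, assuming a small illustrative model on R's built-in airquality data (not the crime1 regression above); the object names fit, s and coef_table are chosen here for illustration only.

```r
# Illustrative model on the built-in airquality data (an assumption, not the crime1 model).
fit <- lm(Ozone ~ Wind + Temp, data = airquality)
s <- summary(fit)

# The coefficient table is a numeric matrix:
# estimates, standard errors, t values and p values.
coef_table <- s$coefficients
print(coef_table)

# Individual pieces can be pulled out by row and column name.
temp_estimate <- coef_table["Temp", "Estimate"]
temp_p_value  <- coef_table["Temp", "Pr(>|t|)"]

# Goodness-of-fit measures are stored as list elements.
r_squared     <- s$r.squared
adj_r_squared <- s$adj.r.squared
print(c(r_squared = r_squared, adj_r_squared = adj_r_squared))
```

The same pattern should work for the crime1 regression: store the result of summary() in an object and index into it.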
The second way is standardization. We can standardize the variables by wrapping each of them in scale() inside the regression formula. Let's see the example below.
lm(scale(qemp86) ~ scale(narr86) + scale(nfarr86) + scale(nparr86) + scale(pcnv) + scale(avgsen) + scale(tottime) + scale(ptime86) + scale(inc86), data = crime1 )
##
## Call:
## lm(formula = scale(qemp86) ~ scale(narr86) + scale(nfarr86) +
## scale(nparr86) + scale(pcnv) + scale(avgsen) + scale(tottime) +
## scale(ptime86) + scale(inc86), data = crime1)
##
## Coefficients:
##    (Intercept)   scale(narr86)  scale(nfarr86)  scale(nparr86)     scale(pcnv)
##     -7.665e-15       8.247e-03      -7.891e-02       8.734e-03       6.749e-03
##  scale(avgsen)  scale(tottime)  scale(ptime86)    scale(inc86)
##      3.261e-02      -3.950e-02      -1.555e-01       6.748e-01
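As we can see above, the regression works in the same way after standardization, and the coefficients are now expressed in standard-deviation units. Note that this example uses qemp86 as the dependent variable, so its coefficients are not the same numbers as in the first regression. Let's move on to the next topic.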
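To see what standardization changes and what it leaves alone, the same model can be fitted with and without scale(). This is a minimal sketch, assuming a small illustrative model on the built-in airquality data rather than crime1; the names aq, raw_fit and std_fit are made up for this example.

```r
# Keep complete cases so that scale() and lm() use exactly the same rows.
aq <- na.omit(airquality[, c("Ozone", "Wind", "Temp")])

# Same model, once on the raw variables and once on standardized variables.
raw_fit <- lm(Ozone ~ Wind + Temp, data = aq)
std_fit <- lm(scale(Ozone) ~ scale(Wind) + scale(Temp), data = aq)

raw_sum <- summary(raw_fit)
std_sum <- summary(std_fit)

# The slopes change: they are now measured in standard deviations.
print(coef(raw_fit))
print(coef(std_fit))

# The t statistics of the slopes and the R-squared do not change.
raw_t <- unname(raw_sum$coefficients[-1, "t value"])
std_t <- unname(std_sum$coefficients[-1, "t value"])
print(all.equal(raw_t, std_t))
print(all.equal(raw_sum$r.squared, std_sum$r.squared))
```

In this sense the two fits give the same result: significance and fit are identical, and only the units of the coefficients differ.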
ANOVA
The ANOVA table shows how much explanatory power each variable adds to the model. For example:
library(car)
## Loading required package: carData
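A minimal sketch of such an ANOVA table follows. It assumes the model is a regression of Ozone on Wind, Temp, Month and Day in R's built-in airquality data; the exact formula is an assumption made for illustration. Anova() from the car package uses Type II tests by default for lm objects.

```r
library(car)

# Assumed model (illustrative): Ozone explained by Wind, Temp, Month and Day.
aq_fit <- lm(Ozone ~ Wind + Temp + Month + Day, data = airquality)

# Type II ANOVA table: one row per predictor plus a Residuals row.
# A larger "Sum Sq" / "F value" means the variable explains more,
# given the other variables in the model.
aov_table <- Anova(aq_fit)
print(aov_table)
```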
We can see that Temp is the most explanatory variable in this model.
INTERCEPT
A linear regression can be estimated with an intercept or without an intercept.
Linear regression with intercept
Linear regression without intercept
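The two variants can be sketched as follows, assuming a simple illustrative model of Ozone on Temp from the built-in airquality data (the model and the object names are assumptions for this example). In an R formula the intercept is included by default, and adding "- 1" (or "+ 0") removes it.

```r
aq <- na.omit(airquality[, c("Ozone", "Temp")])

# With intercept (the default): Ozone = b0 + b1 * Temp
with_intercept <- lm(Ozone ~ Temp, data = aq)

# Without intercept: "- 1" drops the constant,
# so the fitted line is forced through the origin.
without_intercept <- lm(Ozone ~ Temp - 1, data = aq)

print(coef(with_intercept))
print(coef(without_intercept))
```

The R-squared values of the two fits should not be compared directly, because R computes R-squared around zero rather than around the mean when the intercept is removed.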
plot
Now let's look at how to plot the linear regression.
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.4.0 ✔ purrr 0.3.5
## ✔ tibble 3.1.8 ✔ dplyr 1.0.10
## ✔ tidyr 1.2.1 ✔ stringr 1.4.1
## ✔ readr 2.1.3 ✔ forcats 0.5.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ✖ dplyr::recode() masks car::recode()
## ✖ purrr::some() masks car::some()
# qplot() is deprecated since ggplot2 3.4.0, so the scatter plot is built with ggplot() instead
ggplot(crime1, aes(x = qemp86, y = narr86)) + geom_point()
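A scatter plot alone does not show the regression itself. One way to draw the fitted line on top of the points is geom_smooth(method = "lm"). This is a minimal sketch, assuming an illustrative model of Ozone on Temp from the built-in airquality data; the names aq and p are made up for this example.

```r
library(ggplot2)

# Drop missing values so that no rows are silently removed while plotting.
aq <- na.omit(airquality[, c("Ozone", "Temp")])

# Scatter plot of the data with the least-squares line drawn on top.
p <- ggplot(aq, aes(x = Temp, y = Ozone)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

print(p)
```

Setting se = TRUE instead would add a confidence band around the line.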
library(wooldridge)
data("crime1")
New York Air Quality Measurements
Daily readings of the following air quality values for May 1, 1973 (a Tuesday) to September 30, 1973.
This data set gives air quality measurements from New York State in the United States, and the analysis tries to explain air quality using wind and temperature. In other words, we want to see whether there is a relationship between air quality, wind and temperature. The control variables are wind, temperature, month and day. The base year in this regression is 1973, and the intercept is 68..
The t statistic came out very high, at 15. The correlation came out as 0.3, which indicates that there is a relationship between air quality and our variables. The most explanatory variable is month. We can say that the null hypothesis is rejected and that the alternative hypothesis is supported (the effect is statistically significant). As the month changes, air quality changes as well: for example, air quality was better in May, while it gets somewhat worse toward the winter period. Finally, we can conclude that the relationship is statistically significant.
dplyr
The last topic is dplyr. This part introduces dplyr's basic set of tools and shows how to apply them to data frames. dplyr also supports databases via the dbplyr package. Let's look at its main functions:
select: selects columns from the data
filter: subsets rows of the data
mutate: creates new columns
arrange: sorts the data
group_by: groups the data so that it can be aggregated
summarise: calculates summary statistics
I have not fully understood summarise yet; I am still working on it, and we will read about it later.
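The six functions listed above can be chained in a single pipeline. The following is a minimal sketch, assuming the built-in airquality data; the new column names (TempC, mean_ozone, mean_temp_c, n) are made up for this example.

```r
library(dplyr)

result <- airquality %>%
  select(Ozone, Temp, Month) %>%            # keep three columns
  filter(!is.na(Ozone)) %>%                 # keep rows where Ozone is not missing
  mutate(TempC = (Temp - 32) * 5 / 9) %>%   # new column: temperature in Celsius
  group_by(Month) %>%                       # one group per month
  summarise(mean_ozone  = mean(Ozone),      # one summary row per group
            mean_temp_c = mean(TempC),
            n           = n()) %>%
  arrange(desc(mean_ozone))                 # sort months by mean ozone, highest first

print(result)
```

summarise() collapses each group created by group_by() into a single row, so the result here has one row per month.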