This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button, a document is generated that includes both the content and the output of any R code chunks embedded in it. You can embed an R code chunk like this:
library(dplyr)
library(ggplot2)
housedt %>% group_by(condition) %>% summarise(pr = mean(price))
## # A tibble: 5 x 2
## condition pr
## <int> <dbl>
## 1 1 334431.7
## 2 2 327287.1
## 3 3 542012.6
## 4 4 521200.4
## 5 5 612418.1
housedt %>% filter(zipcode == 98001) %>% ggplot(aes(x = yr_built, y = condition)) + geom_col()
housedt %>% group_by(yr_built) %>% summarise(pr=mean(condition)) %>% arrange(-pr)
## # A tibble: 116 x 2
## yr_built pr
## <int> <dbl>
## 1 1917 3.910714
## 2 1916 3.898734
## 3 1909 3.829787
## 4 1931 3.819672
## 5 1911 3.808219
## 6 1904 3.800000
## 7 1925 3.781818
## 8 1928 3.777778
## 9 1905 3.770270
## 10 1908 3.767442
## # ... with 106 more rows
housedt %>% group_by(zipcode,price) %>% summarise(cd=mean(condition)) %>% arrange(-cd)
## # A tibble: 14,796 x 3
## # Groups: zipcode [70]
## zipcode price cd
## <int> <dbl> <dbl>
## 1 98001 170000 5
## 2 98001 210500 5
## 3 98001 227950 5
## 4 98001 246900 5
## 5 98001 247000 5
## 6 98001 254000 5
## 7 98001 262500 5
## 8 98001 329500 5
## 9 98002 159995 5
## 10 98002 161500 5
## # ... with 14,786 more rows
housedt %>% group_by(zipcode,price,floors) %>% summarise(cd=mean(condition)) %>% arrange(cd)
## # A tibble: 17,023 x 4
## # Groups: zipcode, price [14,796]
## zipcode price floors cd
## <int> <dbl> <dbl> <dbl>
## 1 98006 380000 1.0 1
## 2 98011 270000 1.0 1
## 3 98023 150000 1.5 1
## 4 98024 142000 1.0 1
## 5 98028 196000 1.0 1
## 6 98033 535000 1.0 1
## 7 98065 235000 1.0 1
## 8 98103 352950 1.5 1
## 9 98106 125000 1.0 1
## 10 98112 427000 1.5 1
## # ... with 17,013 more rows
housedt %>% group_by(zipcode,price,floors,bathrooms) %>% summarise(cd=mean(condition)) %>% arrange(-cd)
## # A tibble: 19,641 x 5
## # Groups: zipcode, price, floors [17,023]
## zipcode price floors bathrooms cd
## <int> <dbl> <dbl> <dbl> <dbl>
## 1 98001 170000 1.5 1.00 5
## 2 98001 210500 1.0 1.00 5
## 3 98001 215000 1.0 2.00 5
## 4 98001 227950 1.0 1.50 5
## 5 98001 240000 1.0 1.75 5
## 6 98001 246900 1.0 1.50 5
## 7 98001 247000 1.0 2.00 5
## 8 98001 254000 1.0 2.00 5
## 9 98001 262500 1.5 1.75 5
## 10 98001 329500 1.0 2.50 5
## # ... with 19,631 more rows
summary(cars)
## speed dist
## Min. : 4.0 Min. : 2.00
## 1st Qu.:12.0 1st Qu.: 26.00
## Median :15.0 Median : 36.00
## Mean :15.4 Mean : 42.98
## 3rd Qu.:19.0 3rd Qu.: 56.00
## Max. :25.0 Max. :120.00
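As a rough illustration of what knitting does with an embedded chunk, the sketch below builds a tiny R Markdown source in memory and runs it through `knitr::knit()`. It assumes the knitr package is installed; the chunk label `cars-summary` and the variable names are made up for this example.

```r
library(knitr)

# Assemble the chunk fence programmatically so this example does not
# need to contain a literal fence of its own.
fence <- strrep("`", 3)

# A minimal R Markdown source: one line of prose and one R chunk.
# The chunk label "cars-summary" is a hypothetical name for illustration.
rmd_source <- c(
  "Summary of the built-in cars data:",
  "",
  paste0(fence, "{r cars-summary}"),
  "summary(cars)",
  fence
)

# knit() runs the chunk and returns Markdown text that holds both the
# chunk's code and the output it produced.
knitted <- knit(text = rmd_source, quiet = TRUE)
cat(knitted, sep = "\n")
```

The returned text should contain the prose line, the `summary(cars)` call, and its printed output, which is the same mix of content and chunk output that the Knit button produces for a whole file.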
You can also embed plots, for example:
Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.
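The effect of `echo = FALSE` can be checked with a small sketch that knits the same chunk twice, once with default options and once with the option set. It assumes the knitr package is installed, and `knit_chunk` is a hypothetical helper written only for this example. A text-producing chunk is used instead of a plot so that no figure files are written; the option is expected to hide the source in the same way for plotting code.

```r
library(knitr)

# Build the chunk fence programmatically to avoid a literal fence here.
fence <- strrep("`", 3)

# Hypothetical helper for this sketch: knit a single chunk whose header
# carries the given option string and return the resulting Markdown.
knit_chunk <- function(options) {
  src <- c(paste0(fence, "{r", options, "}"), "summary(cars)", fence)
  knit(text = src, quiet = TRUE)
}

with_code    <- knit_chunk("")                # default: echo = TRUE
without_code <- knit_chunk(", echo = FALSE")  # hide the source, keep the output

# Check whether the source line appears in each knitted result.
shows_code_default <- any(grepl("summary(cars)", with_code, fixed = TRUE))
shows_code_hidden  <- any(grepl("summary(cars)", without_code, fixed = TRUE))
print(c(default = shows_code_default, echo_false = shows_code_hidden))
```

In both cases the chunk is still evaluated and its output kept; only the printing of the R code changes.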