This document refreshes your knowledge of R Markdown and introduces a few new functions needed to publish data analysis results as HTML documents.
Load the packages first (install any that are missing with install.packages() beforehand):
library("knitr")
library("rmarkdown")
library("summarytools")
## Warning: package 'summarytools' was built under R version 4.2.3
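The library() calls above only work for packages that are already installed. As a minimal sketch, assuming the three packages used in this document, you can check which ones are missing before loading them. The variable names are made up for illustration, and the install line is left commented out so nothing is downloaded by accident.

```r
# Packages this document relies on
needed <- c("knitr", "rmarkdown", "summarytools")

# requireNamespace() returns TRUE if a package can be loaded, FALSE otherwise
is_installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install from CRAN
}
```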
This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button, a new document will be generated that includes both the content and the output of any embedded R code chunks within the document. It will be saved as an HTML file in your project folder. Press the Knit button now to see if it works.
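Knitting can also be started from the R console instead of the button. The sketch below assumes pandoc is available (RStudio ships with it) and writes a throwaway .Rmd file to a temporary folder. The file content and title are invented for illustration only.

```r
library(rmarkdown)

# Write a very small R Markdown file to a temporary location
rmd_file <- tempfile(fileext = ".Rmd")
writeLines(
  c("---",
    'title: "Knit test"',
    "output: html_document",
    "---",
    "",
    "Hello from R Markdown."),
  rmd_file
)

# render() does what the Knit button does; it needs pandoc to build the HTML
if (pandoc_available()) {
  html_file <- render(rmd_file, quiet = TRUE)
  print(basename(html_file))
}
```

For your own work you would pass the path of your .Rmd file to render() instead of the temporary file.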
A code chunk starts with three backticks followed by curly brackets
{} that hold the letter r (optionally followed by a chunk name and options), and ends with three
backticks. The R code goes between these two lines.
Figure 1. Screenshot of R chunk
Explore the cog icon on the right side of the chunk to see how you can quickly change the settings of the R chunk.
As code and output, the chunk looks like this:
summary(cars)
##      speed           dist
##  Min.   : 4.0   Min.   :  2.00
##  1st Qu.:12.0   1st Qu.: 26.00
##  Median :15.0   Median : 36.00
##  Mean   :15.4   Mean   : 42.98
##  3rd Qu.:19.0   3rd Qu.: 56.00
##  Max.   :25.0   Max.   :120.00
cars is a demo dataset that comes with R. Note that after ‘r’ in the chunk header I added a chunk name. This is useful because chunk names are used to name outputs, such as figure files, when they are saved.
You can write as much code as you want in an R chunk, but it is better not to put too many operations in one chunk, because that makes it harder to find a problem.
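To see the chunk syntax without leaving R, here is a minimal sketch that builds a tiny R Markdown source as text and knits it with knit() from “knitr”. The chunk name cars-summary is only an example, and building the backticks with strrep() is just a trick to keep this example readable.

```r
library(knitr)

# Three backticks, built in code so they do not clash with this example's own fence
fence <- strrep("`", 3)

# A tiny R Markdown source: one sentence of text and one named chunk
rmd_lines <- c(
  "Summary of the cars data:",
  "",
  paste0(fence, "{r cars-summary}"),  # opening line: engine r, then the chunk name
  "summary(cars)",                    # the R code inside the chunk
  fence                               # closing line
)

# knit() runs the chunk and returns Markdown containing the code and its output
knitted <- knit(text = rmd_lines, quiet = TRUE)
cat(knitted)
```

In a real .Rmd file you would simply type the three lines of the chunk directly into the document.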
Inline R code is embedded in the narrative of the document by wrapping it in single
backticks, with the letter r placed right after the opening backtick.
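As a small sketch of inline code, the example below knits a sentence whose number is computed by R rather than typed by hand. The chunk name count-rows and the variable n_rows are made up for illustration.

```r
library(knitr)

fence <- strrep("`", 3)  # three backticks for the chunk
tick  <- "`"             # one backtick for inline code

rmd_lines <- c(
  paste0(fence, "{r count-rows, include=FALSE}"),  # hidden chunk that prepares a value
  "n_rows <- nrow(cars)",
  fence,
  "",
  # Inline code: backtick, the letter r, a space, the expression, backtick
  paste0("The cars data has ", tick, "r n_rows", tick, " rows.")
)

knitted <- knit(text = rmd_lines, quiet = TRUE)
cat(knitted)
```

When the document is knitted, the inline expression is replaced by its value, so the sentence always matches the data.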
You can embed any outputs, such as graphs, for example:
Note that the echo = FALSE parameter was added to the
code chunk to prevent printing of the R code that generated the
plot.
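If you want to hide the code in most chunks, the same option can be set once for the whole document instead of chunk by chunk. A minimal sketch, assuming you place it in a setup chunk near the top of the document:

```r
library(knitr)

# Hide the source code of every chunk that follows; the output is still shown
opts_chunk$set(echo = FALSE)

# Check the current default
opts_chunk$get("echo")
```

An individual chunk can still override this by setting echo = TRUE in its own header.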
The kable function from the “knitr” package produces good-looking
tables in HTML format. This
section of an R Markdown book provides more examples of how to use
it.
Let’s display the first 5 rows of our data using the kable function.
kable(cars[1:5, ], caption = "Table 1. Overview of first 5 rows in data")
| speed | dist |
|---|---|
| 4 | 2 |
| 4 | 10 |
| 7 | 4 |
| 7 | 22 |
| 8 | 16 |
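kable has a few more arguments that are worth trying. The sketch below is one possible variation, not the only way to do it: it renames the columns and centres them. The units in the column labels come from the help page of the cars dataset (?cars).

```r
library(knitr)

tbl <- kable(
  head(cars, 5),                                           # first 5 rows
  col.names = c("Speed (mph)", "Stopping distance (ft)"),  # readable column labels
  align     = "c",                                         # centre both columns
  caption   = "Table 1. Overview of first 5 rows in data"
)
tbl
```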
A more informative table would show summary statistics for the
variables in the dataset. Note that for functions from the
“summarytools” package we had to add plain.ascii = FALSE and
style = 'rmarkdown' to the function call and set the knitr
chunk option results="asis" so that the tables render properly in R Markdown.
descr(cars, plain.ascii = FALSE, style = 'rmarkdown')
N: 50
| | dist | speed |
|---|---|---|
| Mean | 42.98 | 15.40 |
| Std.Dev | 25.77 | 5.29 |
| Min | 2.00 | 4.00 |
| Q1 | 26.00 | 12.00 |
| Median | 36.00 | 15.00 |
| Q3 | 56.00 | 19.00 |
| Max | 120.00 | 25.00 |
| MAD | 23.72 | 5.93 |
| IQR | 30.00 | 7.00 |
| CV | 0.60 | 0.34 |
| Skewness | 0.76 | -0.11 |
| SE.Skewness | 0.34 | 0.34 |
| Kurtosis | 0.12 | -0.67 |
| N.Valid | 50.00 | 50.00 |
| Pct.Valid | 100.00 | 100.00 |
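Typing the same options in every call gets repetitive. As a sketch, assuming you want these settings for the whole document, they can be set once in a setup chunk with st_options() from “summarytools” and opts_chunk$set() from “knitr”:

```r
library(knitr)
library(summarytools)

# Make every summarytools table Markdown-friendly by default
st_options(plain.ascii = FALSE, style = "rmarkdown")

# Pass the output of every chunk through as-is, so the Markdown table is rendered
opts_chunk$set(results = "asis")

# No extra arguments needed now
descr(cars)
```

Be aware that results = "asis" set globally affects all chunks, not only the ones with “summarytools” tables, so you may prefer to keep it as a chunk-level option.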
Now let’s improve the table by tidying it up a little.
descr(cars, stats = c("mean", "sd", "min", "max"), transpose = TRUE, headings = FALSE, caption = "Table 2. Descriptive statistics for key variables")
| | Mean | Std.Dev | Min | Max |
|---|---|---|---|---|
| dist | 42.98 | 25.77 | 2.00 | 120.00 |
| speed | 15.40 | 5.29 | 4.00 | 25.00 |
Read more on the descr function in the “summarytools”
package here.
Look into the basic formatting options of R Markdown and play with the text below: https://www.rstudio.com/wp-content/uploads/2015/03/rmarkdown-reference.pdf
Make this word bold
Make this word italics
###Make this line a 3-level header
1.Create two level list:
This is a quote, so use formatting to create a block quote.
Add a hyperlink to this word to the SMI website.
Copy the r chunk with the plot twice and change the r chunk options so:
plot(pressure)
plot(pressure)
Embed a picture - e.g. SMI logo below
Picture of SMI Logo
Improve the table below so that it is displayed in an R Markdown-friendly format (not as plain ASCII text). Decide which descriptive statistics should be displayed. Add a table number and title.
descr(cars)
## Descriptive Statistics
## cars
## N: 50
##
##                       dist    speed
## ----------------- -------- --------
##              Mean    42.98    15.40
##           Std.Dev    25.77     5.29
##               Min     2.00     4.00
##                Q1    26.00    12.00
##            Median    36.00    15.00
##                Q3    56.00    19.00
##               Max   120.00    25.00
##               MAD    23.72     5.93
##               IQR    30.00     7.00
##                CV     0.60     0.34
##          Skewness     0.76    -0.11
##       SE.Skewness     0.34     0.34
##          Kurtosis     0.12    -0.67
##           N.Valid    50.00    50.00
##         Pct.Valid   100.00   100.00
Look into the summarytools package here: https://cran.r-project.org/web/packages/summarytools/vignettes/introduction.html
and use the freq function to create another table for the same
data.
freq(cars)
## Frequencies
## cars$speed
## Type: Numeric
##
##               Freq   % Valid   % Valid Cum.   % Total   % Total Cum.
## ----------- ------ --------- -------------- --------- --------------
##           4      2      4.00           4.00      4.00           4.00
##           7      2      4.00           8.00      4.00           8.00
##           8      1      2.00          10.00      2.00          10.00
##           9      1      2.00          12.00      2.00          12.00
##          10      3      6.00          18.00      6.00          18.00
##          11      2      4.00          22.00      4.00          22.00
##          12      4      8.00          30.00      8.00          30.00
##          13      4      8.00          38.00      8.00          38.00
##          14      4      8.00          46.00      8.00          46.00
##          15      3      6.00          52.00      6.00          52.00
##          16      2      4.00          56.00      4.00          56.00
##          17      3      6.00          62.00      6.00          62.00
##          18      4      8.00          70.00      8.00          70.00
##          19      3      6.00          76.00      6.00          76.00
##          20      5     10.00          86.00     10.00          86.00
##          22      1      2.00          88.00      2.00          88.00
##          23      1      2.00          90.00      2.00          90.00
##          24      4      8.00          98.00      8.00          98.00
##          25      1      2.00         100.00      2.00         100.00
##        <NA>      0                               0.00         100.00
##       Total     50    100.00         100.00    100.00         100.00
Add a table of contents option to the YAML settings at the top of the document.
Change the date and set your name as the author.
Knit as an HTML page, click the Publish button at the top right, and publish to your RPubs account. Send the link to Aneta for feedback when all is done.
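The table of contents can also be requested from the console rather than the YAML header, which is handy for checking how it looks before you edit the document. This is a minimal sketch using html_document() from “rmarkdown”. The file name my_report.Rmd is hypothetical, so the render line is left commented out.

```r
library(rmarkdown)

# An HTML output format with a table of contents down to level-3 headers
fmt <- html_document(toc = TRUE, toc_depth = 3)

# "my_report.Rmd" is a hypothetical file name; replace it with your own file
# render("my_report.Rmd", output_format = fmt)
```

For the exercise itself you still need to add the option to the YAML settings, so that the Knit button picks it up.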
sessionInfo()
## R version 4.2.2 (2022-10-31 ucrt)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 22621)
##
## Matrix products: default
##
## locale:
## [1] LC_COLLATE=English_United Kingdom.utf8
## [2] LC_CTYPE=English_United Kingdom.utf8
## [3] LC_MONETARY=English_United Kingdom.utf8
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United Kingdom.utf8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] summarytools_1.0.1 rmarkdown_2.20 knitr_1.42
##
## loaded via a namespace (and not attached):
## [1] Rcpp_1.0.10 highr_0.10 plyr_1.8.8 bslib_0.4.2
## [5] compiler_4.2.2 pillar_1.8.1 pryr_0.1.6 jquerylib_0.1.4
## [9] base64enc_0.1-3 tools_4.2.2 digest_0.6.31 jsonlite_1.8.4
## [13] lubridate_1.9.2 evaluate_0.20 lifecycle_1.0.3 tibble_3.1.8
## [17] checkmate_2.1.0 timechange_0.2.0 pkgconfig_2.0.3 rlang_1.0.6
## [21] cli_3.6.0 rstudioapi_0.14 magick_2.7.4 yaml_2.3.7
## [25] xfun_0.37 fastmap_1.1.0 withr_2.5.0 stringr_1.5.0
## [29] dplyr_1.1.0 generics_0.1.3 vctrs_0.5.2 sass_0.4.5
## [33] tidyselect_1.2.0 glue_1.6.2 R6_2.5.1 fansi_1.0.4
## [37] tcltk_4.2.2 purrr_1.0.1 tidyr_1.3.0 reshape2_1.4.4
## [41] pander_0.6.5 magrittr_2.0.3 MASS_7.3-58.1 rapportools_1.1
## [45] backports_1.4.1 codetools_0.2-18 htmltools_0.5.4 matrixStats_0.63.0
## [49] utf8_1.2.3 stringi_1.7.12 cachem_1.0.6