class: center, middle, inverse, title-slide .title[ # An Introduction to _R_
and RStudio for Educational Researchers ] .subtitle[ ## Introduction to
[ comment ]
and
] .author[ ### Jorge Sinval ] .date[ ### 2025-11-18 ] --- <style> .orange { color: #EB811B; } .kbd { display: inline-block; padding: .2em .5em; font-size: 0.75em; line-height: 1.75; color: #555; vertical-align: middle; background-color: #fcfcfc; border: solid 1px #ccc; border-bottom-color: #bbb; border-radius: 3px; box-shadow: inset 0 -1px 0 #bbb } </style>
# Introduction to <svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg> .pull-left[ <br> - _R_ and _RStudio_ - Working with projects - Objects and functions - Packages ] .pull-right[ <img src="assets/img/art.jpg"> ] --- # 1.1 Histo<svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg>y <br> <br> <center> <img src="assets/img/R_logo.svg.png" style = "display: block; margin-left: auto; margin-right: auto;" width = 45%> </center> --- # 1.1 Histo<svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 
119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg>y - _S_ was created by John Chambers in 1976 at Bell Labs. -- - _R_ is a programming language born from _S_. -- - The project was conceived in 1992, with an initial version released in 1995 and a stable beta version (v1.0) on February 29, 2000. -- - Developed at the University of Auckland by Ross Ihaka and Robert Gentleman. -- - It is widely used by statisticians and scientists. -- - It is available **free** for Windows, Linux and macOS. -- - It is currently developed by the R Core Team. -- - The name <svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg> comes partly from the first names of its two original authors and partly from the name of the _S_ language. 
--- # _RStudio_ <center> <img src="assets/img/rstudio_logo.png" width = 50%> </center> -- <br> _RStudio_ is a mature, feature-rich integrated development environment (IDE) optimized for programming with <svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg>, and it has become popular among users of this statistical programming language. -- <br> The _Open Source Edition_ is completely **_open source_** (as seen in the project's GitHub repository). It can be installed on all major operating systems via the [_RStudio_](https://rstudio.com "RStudio Website") website. --- # _RStudio_ _RStudio_ has many keyboard shortcuts that help a lot (.kbd[Shift]+.kbd[Alt]+.kbd[K] shows the full list). To acquire good habits early on, it's best to try using _RStudio_ without the mouse<sup>🤓</sup>. -- _RStudio_ has 4 windows (or panels) <sup>🤔</sup>, presented on the following slide. -- It is also possible to zoom in or out of a panel<sup>💡</sup>. -- .footnote[ 🤓 A list of features and keyboard shortcuts is available [here](https://raw.githubusercontent.com/rstudio/cheatsheets/main/rstudio-ide.pdf "RStudio IDE Cheat Sheet") 🤔 On first use, only 3 panels are open. Later you will see how to open the 4th panel. 💡 In the upper right corner of each panel there are options to minimize or maximize it. 
By dragging the edges between panels you can adjust their width or height.] --- # _RStudio_: Panels <center> <img src="assets/img/rstudio_panels_blank.png" width = 65%> <br> 💡 </center> -- .footnote[💡 By default, the panels are arranged in a different order from the one suggested here. It is possible to change the arrangement of the panels in the _RStudio_ preferences (`Tools -> Global Options... -> Pane Layout`). .orange[The arrangement shown next is suggested...]] --- # _RStudio_: Panels <center> <img src="assets/img/RStudio_panels_description.png" width = 65%> </center> -- See panel by panel in detail<sup>👀</sup>... --- # .orange[Source] <br> <center> <img src="assets/img/source.png" width = 65%> </center> <center> Data from <i>Short Index of Job Satisfaction</i> (SIJS) <sup>📜</sup>. </center> .footnote[ <sup>📜</sup> Sinval, J., & Marôco, J. (2020). Short Index of Job Satisfaction: Validity evidence from Portugal and Brazil. _PLoS ONE, 15_(4), 1–21. [https://doi.org/10.1371/journal.pone.0231474](https://doi.org/10.1371/journal.pone.0231474)] --- # .orange[Source] **_Source_**: Edit, save, and send _R_ code to the **_console_**. By default this panel does not exist when starting _RStudio_: it appears when a document is opened, for example an _R_ script via `File -> New File -> R Script`. A common task in this panel is to send the code on the selected line(s) to the **_console_** via .kbd[Ctrl]+.kbd[Enter]. -- There are two main types of documents that are used in `Source`: - `*.R` documents that contain only _R_ code, with **no** _chunks_<sup>🤔</sup>. - `*.Rmd` (known as `R Markdown`) dynamic documents where you can add text, code `chunks` and `YAML` metadata. -- <br> <br> See in detail 👀... -- .footnote[🤔 _chunks_ are pieces of code that are embedded in a document that may have other elements (e.g. text, images, video)] --- # .orange[Source]: `*.R` <br> `R Script` documents allow us to create a draft of our `R` code. 
Each one is a simple text document in which code can be written. To create a new `*.R` document, select `File -> New File -> R Script`. RStudio will place a new document in the `Source` panel. -- <br> It is possible to add comments with `#`: everything after the hash mark (on the same line) will not be executed as code. It is **highly recommended** that the code is written and edited here. The `Console` panel should only be used to execute small pieces of code. Thus, when the code writing is finished, it will be possible to save the document and start over without losing any of the written code. --- # .orange[Source]: `*.R` _Scripts_ are very useful for editing and reviewing code. To save a _script_, just use .kbd[Ctrl]+.kbd[s] or go to `File -> Save`. By clicking `Run` or pressing .kbd[Ctrl]+.kbd[Enter], _R_ will execute the line of code where the cursor is blinking. If multiple lines are selected, _R_ will execute the selected code. By clicking on `Source`, all the code in the `*.R` document will be executed in the `Console`<sup>💡</sup>. <center> <img src="assets/img/source.png" width = 65%> </center> <br> .footnote[💡 you can switch between the `Source` and `Console` panels with .kbd[Ctrl]+.kbd[2] (cursor moves to `Console`) and with .kbd[Ctrl]+.kbd[1] (cursor moves to `Source`)] --- # .orange[Source]: `*.Rmd` <br> `R Markdown` documents contain three main elements: **code**, **text** and **`YAML`** metadata. The code included can be from several languages at the same time(!): `R`, `Python`, `Julia`, `Bash`, `Stan` (among others). The text can be simply formatted with markdown (e.g. italics, bold, headlines) or use other languages like `HTML` or `CSS`. -- <br> `R Markdown` is a great solution for supporting the **reproducibility** of research data and code<sup>📜</sup>. It is also useful for producing reports<sup>🤓</sup>... It allows creating output in various formats (e.g. `*.html`, `*.pdf`, and `*.docx`). 
-- <br> .footnote[📜 Goodman, S. N., Fanelli, D., & Ioannidis, J. P. A. (2016). What does research reproducibility mean? _Science Translational Medicine, 8_(341), 341ps12. [https://doi.org/10.1126/scitranslmed.aaf5027](https://doi.org/10.1126/scitranslmed.aaf5027) 🤓 These slides are the result of an `R Markdown` document 😯] --- # .orange[Source]: `*.Rmd` To create a new `*.Rmd` document, select `File -> New File -> R Markdown...`. -- When the `*.Rmd` file is open there is a `Knit` option (top) that allows you to generate the document in the desired format, i.e., export the `*.Rmd` to another format: `html`, `pdf`, `docx`. -- To combine text and code properly, each `R` code `chunk` must open with a header on its first line: ` ```{ ` followed by `r`, the `chunk` name<sup>💡</sup> and other (optional) arguments, closed with `}`<sup>🧙‍♂️</sup>. Examples: -- - ` ```{r} ` or -- - ` ```{r chunk_name, tidy=TRUE} ` -- On the following lines, write the desired `R` code. Each `chunk` ends with ` ``` `. -- .footnote[🧙‍♂️ .kbd[Ctrl]+.kbd[Alt]+.kbd[i] opens a `chunk`. It's one of the most useful shortcuts! <sup>💡</sup> Useful when there are errors: it makes it easy to identify the `chunk` in which the error occurred.] --- # .orange[Source]: `*.Rmd` It is also possible to define parameters to generate a series of documents from one template. This makes it possible to produce a document that contains the code that other interested parties will need to reproduce the work, along with the description that the reader needs to understand what was done. 
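A minimal sketch of such a parameterised report, assuming the `rmarkdown` package and Pandoc are installed. The file name `report.Rmd`, the chunk name `summary_chunk`, and the parameter `group` with its values are hypothetical and only serve as an illustration:

```r
# Build a tiny parameterised R Markdown document in a temporary folder.
# The parameter `group` and its values are hypothetical.
fence <- strrep("`", 3)  # three backticks, used to open and close the chunk
rmd_lines <- c(
  "---",
  "title: \"Report\"",
  "output: html_document",
  "params:",
  "  group: \"A\"",
  "---",
  "",
  paste0(fence, "{r summary_chunk, message=FALSE}"),
  "cat(\"Results for group\", params$group)",
  fence
)
rmd_file <- file.path(tempdir(), "report.Rmd")
writeLines(rmd_lines, rmd_file)

# Render one HTML document per parameter value, but only if the
# rmarkdown package and Pandoc are actually available.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  for (g in c("A", "B")) {
    rmarkdown::render(
      rmd_file,
      params = list(group = g),
      output_file = paste0("report_", g, ".html"),
      quiet = TRUE
    )
  }
}
```

Each pass through the loop reuses the same template and changes only the value of `params$group`, so the reports stay identical in structure.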
-- Useful options for `chunk`: -- - `tidy = TRUE` makes the <svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg> code more readable (proper spacing) -- - `results = 'hide'` hides `chunk` results (does not show them) -- - `results = 'hold'` holds `chunk` results and shows them only after all commands in the chunk have been executed -- - `warning = FALSE` does not show any warning messages (e.g., when `ggplot2` removes observations) -- - `message = FALSE` does not show any messages (e.g., when packages are loaded) --- # .orange[Source]: `*.Rmd` <center> <img src="assets/img/source_rmd.png" width = 55%> <br> 🤔 </center> -- .footnote[ 🤔 there is an [_R Markdown Cheat Sheet_](https://rstudio.com/wp-content/uploads/2016/03/rmarkdown-cheatsheet-2.0.pdf "R Markdown Cheat Sheet") that summarizes the information about `R Markdown` in _RStudio_, and an [_R Markdown Reference Guide_](https://rstudio.com/wp-content/uploads/2015/03/rmarkdown-reference.pdf "R Markdown Reference Guide") documenting `R Markdown` style options.] --- # .orange[Console] <center> <img src="assets/img/console.png" width = 120%> </center> <center> The result of the executed code... </center> 
--- # .orange[Console] **_Console_**: Any code entered here is processed by <svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg>, line by line. This panel is ideal for interactively testing ideas before saving the final code in the **Source** panel <sup>🤓,⚠️,🤔</sup>. You can have the output colored on Linux or macOS with the package [`colorout`](https://github.com/jalvesaq/colorout "Package page on GitHub"). -- .footnote[🤓 The .kbd[Ctrl]+.kbd[l] combination clears the `Console`. ⚠️ Some processes can take a long time to run; to cancel a running process just use .kbd[Ctrl]+.kbd[c]. 🤔 Pressing .kbd[Ctrl]+.kbd[↑] scrolls through the history of executed code. If you know the first letters of the line of code you are looking for, type them and then press .kbd[Ctrl]+.kbd[↑]: the search is restricted to the history entries that start with those characters.] --- # .orange[Environment/.../Connections] <center> <img src="assets/img/environment.png" width = 120%> </center> --- # .orange[Environment/.../Connections]<sup>🤓</sup> **_Environment_**: contains information about the objects currently loaded in the working environment, including each object's class, dimensions (if it is a `data.frame`) and name. 
It lets you inspect the rows (observations) of an object. It has a useful `Import Dataset` option that opens a window through which a document can be selected and imported into a <svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg> object. -- **_History_**: (searchable) history of the code that was sent to the `Console` -- **_Connections_**: Connection to databases of various types (e.g. _Livy_, _Spark_) -- .footnote[🤓 Other options such as **Build** and **Git** (if applicable to the `project` in question) may appear.] --- # .orange[Files/Plots/.../Help/Viewer] <center> <img src="assets/img/files.png" width = 85%> </center> --- # .orange[Files/Plots/.../Help/Viewer] **_Files_**: Contains a (simple!) file explorer, which gives access to the contents of the system. Once you are in the folder where you want to read and write documents, the folder in question can be set as the .orange[Working Directory] via the `More -> Set As Working Directory` option. -- **_Plots_**: Allows viewing plots; there are options to open the plot in a separate window, and to export it as `*.pdf`, `*.jpg`, `*.png`, `*.tiff`, `*.bmp`, or `*.eps`. -- **_Packages_**: Shows a complete list of all installed packages, and indicates with a `✓` whether they are loaded in the current session. 
It facilitates installing and uninstalling packages, as well as loading and unloading them in the current session. -- **_Help_**: Allows searching the help documentation for the desired function. <sup>🤓</sup> -- **_Viewer_**: allows the (interactive) visualization of outputs from <svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg>, such as HTML `widgets`<sup>▶️</sup>. -- .footnote[🤓 the `help()` function or a `?` opens the help panel (e.g. `help(hist)` or `?hist` will display the help page for the `hist` function). <sup>▶️</sup> one example on the next slide.] --- exclude: false # .orange[Viewer]: `HTML Widgets`
```r
# Read the example data, keep the first 10 rows of the first 5 columns,
# and display them as an interactive table in the Viewer panel.
DT::datatable(
  head(readr::read_csv("https://ndownloader.figshare.com/files/22299075")[, 1:5], 10),
  fillContainer = FALSE,
  options = list(pageLength = 8)
)
```
<center> <img src="assets/img/viewer.png" width = 75%> </center> --- # _RStudio_: Panels Altogether, the four panels are used in conjunction in a common data analysis workflow <sup>👨‍💻</sup> -- <center> <img src="assets/img/RStudio_panels_worked.png" width = 60%> </center> -- .footnote[👨‍💻 in `Tools -> Global Options... -> Appearance`, it is possible to select a `Theme` that makes code easier to write and read and that matches your personal taste.] 
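As a rough sketch of how the four panels work together, the few lines below could be run one at a time from the `Source` panel with .kbd[Ctrl]+.kbd[Enter]. The object name `scores` and its values are made up for illustration:

```r
# Each line below mainly involves a different RStudio panel.
scores <- c(3, 5, 4, 2, 5, 4)  # the new object is listed in the Environment panel
mean(scores)                   # the result is printed in the Console
hist(scores)                   # the histogram is drawn in the Plots panel
?hist                          # the documentation opens in the Help panel
```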
--- # Import .pull-left[ <br> - Text files - Excel files - Files from other software (SAS, SPSS) ] .pull-right[ <br> <img src="assets/img/arte_janitor.png"> ] --- # Wrangling .pull-left[ <br> - Select columns - Filter rows - Create or modify columns - Group and summarize - Join tables ] .pull-right[ <img src="assets/img/art_dplyr.png"> ] --- # Visualization <center> <img src="assets/img/art_ggplot2.png" width = 60%> </center> --- # Communication - Automated reports - Dashboards <center> <img src="assets/img/art_rmarkdown.png" width = 58%> </center> --- class: inverse, center, middle # 👩‍💻 Programming? Why? 👨‍💻 <html><div style='float:left'></div><hr color='#EB811B' size=1px width=800px></html> --- # Programming? Why? ## It is necessary to communicate with the computer <img src="assets/img/diagram_programming.png" style="display: block; margin-left: auto; margin-right: auto; width: 50%"> ??? Writing code is really about creative problem solving. When working with code, we can define problem solving as writing a piece of code that performs a particular set of tasks given a set of constraints. Writing good code often involves trading off two types of constraints: cognition time and computation time. Hadley Wickham, author of many great R packages, describes a data analysis workflow as follows: before we write any code we need to think about the problem and we need to describe exactly which steps are needed to achieve a given solution. Both steps can be done without opening your computer and writing one line of code, and they primarily involve cognitive constraints. When we’re done thinking, we need the program to execute our code and deliver the output. How quickly and reliably this is done is what we can call the computational constraint. 
One simple way to think about the difference between big and small/medium sized data is as follows: - cognition time > computation time: you’re dealing with small/medium sized data - computation time > cognition time: you could be dealing with big data. --- # Programming? Why? ## Code is text ### It is possible to copy and paste <img src="assets/img/copy-paste.png" style="width: 50%"> --- # Programming? Why? ## Code documents the data analysis
```r
library(magick)  # provides the image_*() functions and the %>% pipe

# Each step of the image manipulation is recorded in the code itself.
image_read("https://jeroen.github.io/images/superfrink.gif") %>%
  image_rotate(270) %>%
  image_background("blue", flatten = TRUE) %>%
  image_border("red", "5x5") %>%
  image_annotate(
    " Despite their different outlook,\n programming languages remain\n languages!",
    color = "black", size = 28)
```
-- .pull-left[ <img src = "https://jeroen.github.io/images/superfrink.gif" style="width:40%"> ] -- .pull-right[ <img src="data:image/png;base64,#slides2of9_files/figure-html/unnamed-chunk-4-1.gif" width="45%" /> ] --- # Code is "open" Many modern programming languages, including _R_, are free and open source: - Students can use the same tools as professionals. -- - Everyone can use the best tools regardless of financial power. -- - The code facilitates the reproducibility of analyses. -- - Possibility to fix problems. -- - Possibility to develop your own functions. 
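To illustrate the last point, a user-defined function might look like the sketch below. The function name `standardize` and the example values are hypothetical:

```r
# Hypothetical helper: convert a numeric vector to z-scores.
standardize <- function(x, na.rm = TRUE) {
  (x - mean(x, na.rm = na.rm)) / sd(x, na.rm = na.rm)
}

scores <- c(12, 15, 9, 18, 11)  # made-up example data
z <- standardize(scores)
z                               # standardized values: mean 0, standard deviation 1
```

Once defined, the function can be reused on any numeric vector, which keeps the analysis code shorter and easier to review.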
--- class: inverse, center, middle # Use <svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg>? Why? <html><div style='float:left'></div><hr color='#EB811B' size=1px width=800px></html> --- # Data science cycle <img src="assets/img/data-science-cicle.png" style = "width:80%; display: block; margin-left: auto; margin-right: auto;"> --- # Data science cycle with <svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg> <img src="assets/img/data-science-cicle-packages.png" style = "width:80%; display: block; margin-left: auto; margin-right: auto;"> --- class: inverse, center, middle # Important suggestions <sup>💡</sup> <html><div 
style='float:left'></div><hr color='#EB811B' size=1px width=800px></html> --- # Lea<svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg>ning **Just like any language, the only way to learn a programming language (whether statistical or not) is with practice.** <br> <img src="https://media.giphy.com/media/o0vwzuFwCGAFO/giphy.gif" style = "display: block; margin-left: auto; margin-right: auto;"> --- # Asking for help .pull-left[ <br> - <svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg> documentation - Google - Stack Overflow ] .pull-right[ <img src="assets/img/stack.png" width = 350%> ] --- # Rules, best practices, and style -- - Rules: must be 
followed for the code to work (syntax, vocabulary) -- - Best practices: recommended for writing readable code (spacing, names, organization) -- - Style: everyone can choose the one they feel most comfortable with (indentation types, formatting) --- class: center, bottom, inverse # More info -- Slides created with the <svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg> package [`xaringan`](https://github.com/yihui/xaringan). 
-- <svg viewBox="0 0 512 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;fill:currentColor;position:relative;display:inline-block;top:.1em;"> <g label="icon" id="layer6" groupmode="layer"> <path id="path2" d="M 132.62426,316.69067 C 119.2805,301.94483 112.56962,274.5073 112.56962,234.39862 v -54.79191 c 0,-37.32217 -5.81677,-63.58084 -17.532347,-78.83466 -11.6757,-15.293118 -31.159702,-22.922596 -58.353466,-22.922596 -5.958581,0 -11.409226,0.22492 -16.45319,0.5917 -5.04455,0.427121 -9.742846,1.037046 -14.1564111,1.83092 V 95.057199 H 16.671281 c 12.325533,0 20.908335,3.82414 25.667559,11.532201 4.77973,7.74964 7.139712,25.48587 7.139712,53.14663 v 68.01321 c 0,42.12298 13.016861,74.19672 39.233939,96.16314 19.627549,16.47424 46.636229,27.23363 81.030059,32.40064 v -20.17708 c -16.3928,-4.27176 -29.04346,-10.51565 -37.11829,-19.44413 z m 246.75144,0 c 13.34377,-14.74584 20.05466,-42.18337 20.05466,-82.29205 v -54.79191 c 0,-37.32217 5.81673,-63.58084 17.53235,-78.83466 11.67568,-15.293118 31.15971,-22.922596 58.35348,-22.922596 5.95858,0 11.40922,0.22492 16.45315,0.5917 5.04457,0.427121 9.74287,1.037046 14.15645,1.83092 v 14.785125 h -10.59712 c -12.32549,0 -20.90826,3.82414 -25.66752,11.532201 -4.77974,7.74964 -7.13972,25.48587 -7.13972,53.14663 v 68.01321 c 0,42.12298 -13.01688,74.19672 -39.23394,96.16314 -19.6275,16.47424 -46.63622,27.23363 -81.03006,32.40064 v -20.17708 c 16.39279,-4.27176 29.04347,-10.51565 37.11827,-19.44413 z M 303.95857,87.165762 c 8.42049,-6.691524 25.52576,-10.536158 51.23486,-11.492333 V 63.999997 H 156.80716 v 11.673432 c 26.1755,0.956175 43.38268,4.800809 51.68248,11.492333 8.31852,6.73139 12.40691,20.033568 12.40691,39.904818 V 384.6851 c 0,20.80641 -4.08839,34.5146 -12.40691,41.02332 -8.2998,6.56905 -25.50698,10.10729 -51.68248,10.65744 V 448 h 197.71597 l 0.67087,-11.63414 c -25.50471,-0.54955 -42.56835,-4.35266 -51.07201,-11.40918 -8.4182,-6.95638 -12.73153,-20.44184 -12.73153,-40.27158 V 127.07058 c 0,-19.87125 
4.16983,-33.173428 12.56922,-39.904818 z" style="stroke-width:0.0753388"></path> </g></svg> + <svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg> = <svg viewBox="0 0 512 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:red;"> [ comment ] <path d="M462.3 62.6C407.5 15.9 326 24.3 275.7 76.2L256 96.5l-19.7-20.3C186.1 24.3 104.5 15.9 49.7 62.6c-62.8 53.6-66.1 149.8-9.9 207.9l193.5 199.8c12.5 12.9 32.8 12.9 45.3 0l193.5-199.8c56.3-58.1 53-154.3-9.8-207.9z"></path></svg> -- <svg viewBox="0 0 581 512" xmlns="http://www.w3.org/2000/svg" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#384CB7;"> [ comment ] <path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"></path></svg> has infinite possibilities. 
-- Practice is the best strategy for learning. -- . -- _In God we trust, all others bring data_ -- W. Edwards Deming -- . -- . -- . -- THE END --- class: center, bottom, inverse 