Hello! This document was created as part of the CC Bio INSITES Quantitative Workshop #2, specifically as a guide for the pre-activity to be completed before our meeting on April 10th, 2021. This guide has been written assuming that you have little to no familiarity with R, and so begins with a brief overview of the software installation. You shouldn’t feel pressured to memorize or even completely understand everything written here – this is designed as an introduction and a future resource, and as such can be a bit overwhelming at first.
This guide will have sections of text, images, R code, and R output. The text and image sections are straightforward, but the R code/output sections can be confusing. R code will always be displayed in gray boxes, with white boxes directly beneath it showing the output of the code:
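A minimal sketch of that pattern, using a made-up calculation (the object name `total` is only for illustration). The line beginning with `#>` stands in for the white output box:

```r
# Add two numbers and store the result in an object called total
total <- 2 + 3

# Typing the object's name prints its value
total
#> [1] 5
```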
Sometimes there will be code with no output. In either case, to use the code yourself, all you have to do is copy/paste or re-type what’s in the gray box!
Your worksheet is the hub for this assignment. This document and the others like it will contain extra instructions and examples, but the worksheet is designed as the touchstone for you to return to.
If you run into questions, please feel free to send us an email! You can reach us at hlperkin@purdue.edu and seddy@fiu.edu, and we’re happy to help you out if you run into any tangles. Also, please let us know if you spot any typos or errors so we can fix them!
Installing R can be unexpectedly complicated at first, as there are two programs from separate websites that need to be installed. This page provides a good overview for PC and Mac users, but in case it isn’t available (or you would just prefer to stay on this page) we’ll also provide directions for Windows and Macs here. If you use Linux, the link also provides step-by-step directions for Ubuntu towards the bottom of the page.
Head over to CRAN to download the current version of R. The link you need is right on the front page, labeled ‘Download R for Windows’; from there, look for the bolded ‘install R for the first time’ link and click to download the software. You can save the downloaded EXE wherever you like; when run, it will install R into the default directory, which you shouldn’t need to alter.
The download page for RStudio makes it very easy to install RStudio for Windows. Hit the big blue button to get the installer, and then run it to finish setting up RStudio. After that, you’re done! The link provided above will walk you through some additional stuff, but we’ll also cover that on our own so you don’t have to worry about it now.
Head over to CRAN to download the current version of R. The link you need is right on the front page, labeled ‘Download R for (Mac) OS X’; from there, look for the section titled ‘Latest release’ and download the indicated PKG. When run, it will install R into the default directory, which you shouldn’t need to alter.
On the download page for RStudio, check out the section titled ‘All Installers’, and then locate the link for macOS. Save the indicated file and run it; once again, you should accept the defaults (unless you’re familiar enough with your system to know why you shouldn’t). After that, you’re done! The link provided above will walk you through some additional stuff, but we’ll also cover that on our own so you don’t have to worry about it now.
Now that you’ve got everything installed, here’s a quick overview of what’s what in the program. When you open RStudio for the first time, you should see this:
The only thing that won’t match up is the contents of your ‘Files’ window, but that’s okay. Now for a quick breakdown of what the various windows are:
This window, the Console, is where most of the output will be displayed, including errors and warnings. Everything in this window is text: if you create a plot, for instance, it will appear in the ‘Plots’ tab (in the same window as the ‘Files’ tab). The window does have a limit; if you run a bunch of code, or ask R to display something lengthy, some of your older code and output may get cut off as the window fills with new output. You can write code here, but it won’t be saved, so make sure you’re writing all of your code in the script window (#4, discussed more below).
This window, the Environment, lets you keep track of the objects you’re working with. If you import a dataset, that will be one object (usually a ‘dataframe’ that holds all of your rows and columns). If you run a regression and save the output, that will be another object (usually an ‘lm’, a collection of all the output produced by the test, such as coefficients, residuals, fitted values, and so on). Some objects in this window can be clicked (e.g., a dataframe), which will either print the object in the console or open it in a new window. There are some other tabs here as well, but they aren’t used often.
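To make the two kinds of objects concrete, here is a small sketch. The data frame `scores` and its values are invented purely for illustration and are not workshop data; `lm()`, `class()`, `coef()`, `residuals()`, and `fitted()` are all base R functions:

```r
# A small made-up data frame (values are for illustration only)
scores <- data.frame(
  hours = c(1, 2, 3, 4, 5),
  grade = c(62, 68, 75, 79, 88)
)

# Run a simple linear regression and save the output as an object
fit <- lm(grade ~ hours, data = scores)

# Check what kind of object each one is
class(scores)
#> [1] "data.frame"
class(fit)
#> [1] "lm"

# Pieces of the regression output can be pulled out of the saved object
coef(fit)       # coefficients
residuals(fit)  # residuals
fitted(fit)     # fitted values
```

After running these lines, both `scores` and `fit` should be listed among your objects.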
This window, which holds the ‘Files’ and ‘Plots’ tabs, can display lots of stuff in its tabs, but it’s not as interactive as the others.
This is the Script window, where you’ll spend most of your time, but it doesn’t show up automatically. It only appears when you create a code file (File > New File) for writing and saving code. New files start out as ‘Untitled’ and unsaved, just as in other coding software or word processing programs you might be familiar with, so don’t forget to save! RStudio has a handy default setting that lets it save your workspace, specifically all of the code windows you have open and all of the objects in your environment, but sometimes there are issues, so don’t rely on it too much. Note: when you save files, the save location will default to your current working directory.
Once you’ve entered code into your script, you have to run it to make anything happen.
If the code is a single line: You can hit the ‘Run’ button in the top-right corner of the Script window, or hit Ctrl-Enter on your keyboard. (For Macs, the ‘Run’ button has a dropdown that shows you the shortcuts for your system).
If the code is multiple lines: You can highlight the lines of code and hit ‘Run’ or Ctrl-Enter, or you can run them one line at a time (also with ‘Run’ or Ctrl-Enter). I recommend using the Ctrl-Enter shortcut – it can make things a lot faster!
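A short practice snippet for trying both approaches (the numbers and object names are made up for illustration). Either place your cursor on the first line and press Ctrl-Enter three times, or highlight all three lines and press Ctrl-Enter once:

```r
x <- c(4, 8, 15)   # creates an object; nothing is printed
mean_x <- mean(x)  # stores the average; still nothing is printed
mean_x             # prints the stored value in the console
#> [1] 9
```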
The best way to get familiar with R is to use it, so let’s get started. First step: creating a Project!
Projects are a useful way of organizing your work environment, working directory, and file structure. They’re especially useful if you’re working on multiple projects (lowercase p). For instance, you might have a dataset consisting of information about your students’ grades, science interest, science self-efficacy, and growth mindsets. One collaborator might be interested in analyzing the relationship between grades and mindsets, while another wants to examine the relationship between interest and self-efficacy. You can, of course, run both analyses from the same Project, but splitting them helps keep things organized. For instance, one project might limit the dataset to first-year students, while the other looks at the entire sample, or even requires you to combine your data with some collected by your collaborator. Keeping track of any such decisions you’ve made is easier when you keep the projects in separate Projects.
They also make switching between projects easier. If you’re going from a meeting with Collaborator 1 to Collaborator 2, all you have to do is pull up the second project and everything (your working directory, the scripts you had open, the objects in your environment, etc.) will come up exactly as you left it. Your working directory is the location R uses by default to read and write files, and it’s easy to forget about until something goes wrong (e.g., you save something to the wrong location and can’t find it, or overwrite a file you didn’t intend to). So again, Projects help you avoid this hassle by setting your working directory and allowing you to focus on other things.
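If you’re ever unsure which folder R is currently treating as the working directory, a quick check along these lines may help. Both functions are part of base R; the object name `wd` is just an illustrative choice:

```r
# Store and show the folder R is currently using as its working directory
wd <- getwd()
wd

# List the files R can see in that folder
list.files(wd)
```

The path printed will depend on your own computer and on which Project (if any) is open.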
In the top right corner of your screen is a little blue box and a label that says Project: (None) (assuming, of course, that you don’t have a project open). Clicking it and selecting ‘New Project’ will open a new window with the option to create a project in a new directory.
The next screen provides a list of potential project types. In this case, we want to select ‘New Project’ again.
The last screen lets you name your project and select the project’s working directory. You don’t need to create a new folder just for your new project! Whatever you input as the directory name will become the new folder for your project. If you created the folder structure recommended by the previous workshop, you can navigate to this location and select it as your new project subdirectory, or you can move your previously created folders into your new project location outside of R. I titled my project ‘Quant Workshop 2’.
When complete, the project button will have the name of your new project, and the Files window will change to reflect your new working directory. If you look closely at the Files window in the screenshot below, you can see that the full directory for my new project is D:\Projects\Workshop 2.
And you’re done! Your new project is ready for you to start creating script files, adding code, and importing your data.
File > New File will produce a list of all the programming languages that RStudio can handle, but the most commonly used is the R Script. Once you’ve created the new script, File > Save As will let you save it. The default location will be your project subdirectory, although you can save it outside this folder as well (not recommended).
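As a rough sketch of what a first script might contain, consider the lines below. The file name `first_script.R` and the object names are hypothetical; use whatever name you chose when saving:

```r
# first_script.R (hypothetical file name)
# Purpose: a first test script for the workshop pre-activity

# Lines starting with # are comments; R ignores them when the script runs
greeting <- "Hello, R!"
print(greeting)
#> [1] "Hello, R!"

# Check whether the script was saved in the working directory;
# this is TRUE only if a file with this exact name exists there
saved <- file.exists("first_script.R")
saved
```

If `saved` comes back `FALSE`, the script was probably saved under a different name or outside the project folder.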
Now you’ve got R open and your first script file created. How are you feeling about R? Does anything worry you? Does anything excite you? Head back over to the worksheet to share your thoughts and get the next step.