Lab 2 - Instructions
Introduction
Last week you did the following:
- Installed R and RStudio
- Downloaded and opened a script in RStudio
- Learned how to write your own documentation with the # sign
- Used the function runif() to generate random numbers from a uniform distribution
- Learned how to ask R to give you more information about a function by typing ? in front of it
- Created objects by storing the results of multiple runs of runif() to variables you named (e.g., random.2 <- runif(n=10, min=1, max=10) stored 10 random uniform values between 1 and 10 in the object random.2, and when you typed random.2 into the console multiple times, those same 10 numbers kept showing up)
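Here is a minimal sketch of that last recap point. Once the result of runif() is stored in an object, looking at the object again shows the stored values rather than drawing new ones. The names first_look, second_look, and new_draw are made up for this illustration.

```r
# Store 10 random uniform values between 1 and 10 (same call as in Lab 1)
random.2 <- runif(n = 10, min = 1, max = 10)

# Looking at the object twice gives back the same stored numbers
first_look  <- random.2   # hypothetical name, for illustration only
second_look <- random.2   # hypothetical name, for illustration only
identical(first_look, second_look)

# Calling runif() again draws a fresh set of numbers instead
new_draw <- runif(n = 10, min = 1, max = 10)
identical(random.2, new_draw)
```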
Remember how doable that lab felt? Take a minute to feel good about that. You’re already figuring out the basics of a new language.
Today we are going to build on those basics with a more formal introduction to programming in R. This lab will proceed in four parts.
- Setting your working directory
- Using the interface in RStudio
- Working with R’s four different data types (i.e., character, logical, numeric, and factor)
- Storing data as “objects,” AKA “data structures” (e.g., vectors and data frames)
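As a rough preview of parts 3 and 4, the sketch below creates one value of each data type and two simple data structures. All object names and values here are invented for illustration and are not taken from lab2script.R.

```r
# One example of each data type named above (values are made up)
a_character <- "hello"
a_logical   <- TRUE
a_numeric   <- 3.5
a_factor    <- factor(c("low", "high", "low"))

# class() reports the type of each object
class(a_character)  # "character"
class(a_logical)    # "logical"
class(a_numeric)    # "numeric"
class(a_factor)     # "factor"

# Two data structures: a vector and a data frame
a_vector     <- c(1, 2, 3)
a_data_frame <- data.frame(id = 1:3, score = c(90, 85, 77))
str(a_data_frame)
```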
We will do parts 1 and 2 (as well as “Part 0”) together. Parts 3 and 4 you will complete with a partner. Please answer all questions completely as instructed.
0. Find Your SOCY 392/data Folder and Save the Lab 2 Data
Before we begin, find the data folder you made last class, which is nested in the SOCY 392 folder you also made during Lab 1. Now download the file called lab2data.rda, and save that file in the data folder. You will explore this data set if you have time at the end of the lab.
The file you just downloaded and saved to SOCY 392/data includes data about the songs of musical artists people mentioned in the extra credit “Bonus Homework Assignment.” If you submit your response to this homework assignment by today at 11:59pm, your favorite musicians will also be added to the data set.
1. Setting your Working Directory
The first thing R always needs to know is where to read in any files that you want to work with, and where to export any files you want to save after cleaning up your data or doing an analysis. There are two ways to tell R how to do this. The first way you learned in Lab #1 - clicking "Session > Set Working Directory > Choose Directory…" and then locating the "SOCY 392/data" folder.
The second way - which is cleaner - is to write some code telling R where the working directory is. Try it out:
Type getwd() into the console (the panel on the left side of RStudio when you open it) and hit enter. The console will now show you where your default working directory is on your computer. The result (which might be different on your computer than on mine) shows you where files will be read and written unless you change the working directory. You definitely want to change the working directory, or your file system will become a mess.
Change the working directory to the "data" folder within the "SOCY 392" folder you made last lab. Finding the name of that file path is different on Macs and PCs:
- If you’re on a Mac: use the Finder (look this up if you don’t know what it is) and navigate to the data folder. Right-click it. A menu should appear. Press and hold the Option button on your keyboard, and the menu will change (thanks Viki for discovering this). Click the option that says “Copy data as Pathname.”
- If you’re on a PC: use the File Explorer (look this up if you don’t know what it is) and navigate to the data folder. Right-click the folder while holding Shift, and select “Copy as path.”
Head back to the console, and use R’s special function for setting the working directory: setwd(). In the parentheses, paste the file path you just copied, and make sure the whole thing is surrounded by quotes. All slashes in the path need to be forward slashes (/), not backslashes (\). If you have a PC, you will need to replace all of the backslashes with forward slashes after pasting. Check to make sure it worked by typing getwd() into the console once you finish.
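The steps above can be sketched as follows. To keep the example runnable on any computer, it creates a stand-in folder inside R's temporary directory instead of using your real SOCY 392/data folder. The names demo_dir, old_wd, and win_path, and the Windows path itself, are hypothetical.

```r
# Remember where R is currently looking, so we can go back afterwards
old_wd <- getwd()

# Hypothetical stand-in for your "SOCY 392/data" folder
demo_dir <- file.path(tempdir(), "SOCY 392", "data")
dir.create(demo_dir, recursive = TRUE, showWarnings = FALSE)

# Set the working directory, then check that it worked
setwd(demo_dir)
getwd()

# A path copied on a PC uses backslashes; swap them for forward slashes.
# (Inside an R string, a single backslash is written as "\\".)
win_path   <- "C:\\Users\\me\\SOCY 392\\data"   # hypothetical path
fixed_path <- gsub("\\", "/", win_path, fixed = TRUE)
fixed_path

# Return to the original working directory
setwd(old_wd)
```

In your own session you would skip the stand-in folder and paste your copied path directly inside setwd(), in quotes.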
Now that you have done this, you can tell R to go find the data you downloaded and saved in the data folder in Part 0. Try this now: type load("lab2data.rda") into the console. Then type head(lab2data).
Want to see who else is in the data set beyond Luke Combs so far? Type unique(lab2data$artist_name) into the console and run the line of code.
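If you want to see how load(), head(), and unique() fit together without the real file, here is a small sketch using a toy data set. The object toy, its song column, the file name toy.rda, and every value in it are invented for illustration. Only the column name artist_name comes from the lab data.

```r
# A made-up data set standing in for lab2data (all values are invented)
toy <- data.frame(
  artist_name = c("Artist A", "Artist A", "Artist B"),
  song        = c("Song 1", "Song 2", "Song 3"),
  stringsAsFactors = FALSE
)

# Save it as an .rda file, then remove it from working memory
toy_file <- file.path(tempdir(), "toy.rda")
save(toy, file = toy_file)
rm(toy)

# load() restores the saved object under its original name
loaded_names <- load(toy_file)
loaded_names

# Peek at the first rows, then list each artist once
head(toy)
artists <- unique(toy$artist_name)
artists
```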
2. Exploring the RStudio Interface
Whenever you open a clean instance of RStudio, you will usually see three panels like those in the screenshot below.
The left panel is called the “Console.”
The top right panel includes details about your “Environment.”
The bottom right panel will show you “Help” (recall from Lab 1), “Plots” and “Packages” (which we’ll get to in Lab 3).
The Console is a big calculator. It’s also the place where all of the code you ever write in R will “execute” - though you might save all of that code in a separate “script,” like you did in Lab 1 (and will do in Lab 2, in just a few minutes).
Click inside of the console, type 2+3, then hit enter on your keyboard. You can multiply (*), divide (/), add (+), and subtract (-) numbers here.
You can also run functions here. Remember the function we used in Lab 1 to generate random numbers? Use it in the console now. Enter runif(n=10, min=0, max=1) into the console and see what happens.
You can even create objects here. Let’s create 10 random numbers and assign the result to an object called potato (remember, we can call objects whatever we want to - R does not recognize them as functions). Type potato <- runif(n=10, min=0, max=1) into the console and hit enter. To check and see if this worked, type potato and hit enter, or get a little fancier and type head(potato) to show the first 6 numbers of the object you created called potato (head() shows 6 values unless you ask for a different number).
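The console steps above can be collected into one short sketch. Everything here comes from the steps above except the n argument to head(), which is an optional extra shown for comparison.

```r
# The console as a calculator
2 + 3
2 * 3
6 / 3
5 - 2

# Run a function: 10 random numbers between 0 and 1
runif(n = 10, min = 0, max = 1)

# Store a fresh set of 10 random numbers in an object called potato
potato <- runif(n = 10, min = 0, max = 1)

# Print the whole object
potato

# head() shows the first 6 values by default
head(potato)

# Ask for all 10 values by setting n
head(potato, n = 10)
```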
The Environment (upper right panel) will show you all of the objects and datasets that are stored in R’s working memory. If you’ve been following along and opened a new RStudio session when we started this lab, you should now have two objects in your Environment tab: lab2data and potato. This panel of RStudio has other tabs that we will not use much in this class.
The Help, Plots, and Packages tabs all appear in the bottom-right panel. We will only work with Help in Lab 2. We will explore the Plots and Packages tabs in Lab 3.
- Whenever you query R for information about a function (i.e., when you put a ? in front of a function), the result will pop up in the Help tab. Try that now - type ?runif() in the console. Let’s also try it for another function you used during this lab: type ?head().
Now that you are familiar with these three panels of RStudio, it is time to add a 4th panel - the Script panel. Download the R script for your Lab 2 problem set (lab2script.R on BlackBoard), save it to SOCY 392/programs, and open it in RStudio. Notice that it opens above the Console panel.
The Console panel and the Script panel serve a similar purpose: they are both places where you enter code. But there are a few huge differences:
In the Console, whatever code you type gets executed right away.
In a Script, no code executes until you highlight the lines you’d like to run, and click the Run button in the top-right corner of the panel. You also keep a record of all of your work, which you can use again if you ever need to re-analyze your data or write a similar program.
Regardless of where you run the code from (the Script or the Console), all of the results will show up in the Console only.
We will almost exclusively use the Script panel for writing code in this class. But you must always check the console after running chunks of your script to see what happened and to check your work.
Begin Your Problem Set (Parts 3 & 4)
You have completed Parts 1 and 2. Get going on Parts 3 and 4 with a lab partner. Read through the instructions in the lab2script.R file you opened in RStudio carefully, and answer the questions completely by writing code, and by explaining your answers when appropriate with the comment sign (#).
When you are finished, upload your answers to BlackBoard’s submission page. Use the file naming convention lastname_firstname_lab2.R.