Lab 2 - Instructions

Author

Joseph Quinn, PhD

Introduction

Last week you did the following:

  • Installed R and RStudio
  • Downloaded and opened a script in RStudio
  • Learned how to write your own documentation with the # sign
  • Used the function runif() to generate random numbers from a uniform distribution
  • Learned how to ask R to give you more information about a function by typing ? in front of it
  • Created objects by storing the results of multiple runs of runif() to variables you named
    • (e.g., random.2 <- runif(n=10, min=1, max=10) stored 10 random uniform values between 1 and 10 in the object random.2, and when you typed random.2 into the console multiple times, those same 10 numbers kept showing up)
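
As a refresher on that last point, here is a minimal sketch of the difference between re-running runif() and printing a stored object. The name random.2 comes from Lab 1; first.draw and second.draw are made-up names used only for this illustration.

```r
# Each call to runif() draws a fresh set of random numbers
first.draw  <- runif(n = 10, min = 1, max = 10)
second.draw <- runif(n = 10, min = 1, max = 10)
first.draw     # one set of 10 values
second.draw    # a different set of 10 values

# Storing one run in an object keeps those 10 values fixed
random.2 <- runif(n = 10, min = 1, max = 10)
random.2       # prints the stored values
random.2       # prints the very same values again
```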

Remember how doable that lab felt? Take a minute to feel good about that. You’re already figuring out the basics of a new language.

Today we are going to build on those basics with a more formal introduction to programming in R. This lab will proceed in four parts.

  1. Setting your working directory
  2. Using the interface in RStudio
  3. Working with R’s four different data types (i.e., character, logical, numeric, and factor)
  4. Storing data as “objects,” AKA “data structures” (e.g., vectors and data frames)

We will do Parts 1 and 2 (as well as “Part 0”) together. Parts 3 and 4 you will complete with a partner. Please answer all questions completely as instructed.

0. Find Your SOCY 392/data Folder and Save the Lab 2 Data

Before we begin, find the data folder you made last class, which is nested in the SOCY 392 folder you also made during Lab 1. Now download the file called lab2data.rda, and save that file in the data folder. You will explore this data set if you have time at the end of the lab.

The file you just downloaded and saved to SOCY 392/data includes data about the songs of musical artists people mentioned in the extra credit “Bonus Homework Assignment.” If you submit your response to this homework assignment by today at 11:59pm, your favorite musicians will also be added to the data set.

1. Setting Your Working Directory

The first thing R always needs to know is where to read in any files that you want to work with, and where to export any files you want to save after cleaning up your data or doing an analysis. There are two ways to tell R where that is. The first way you learned in Lab 1 - clicking "Session > Set Working Directory > Choose Directory…" and then locating the "SOCY 392/data" folder.

The second way - which is cleaner - is to write some code telling R where the working directory is. Try it out:

  • Type getwd() into the console (the panel on the left side of RStudio when you open it) and hit enter. The console will now show you where your default working directory is on your computer. On my computer, it says this:

    The result (which might be different on your computer) shows you where files will be read and written unless you change the working directory. You definitely want to change the working directory, or your file system will become a mess. Change the working directory to the "data" folder within the "SOCY 392" folder you made last lab. Finding the name of that file path is different on Macs and PCs:

    • If you’re on a Mac: use the Finder (look this up if you don’t know what it is) and navigate to the data folder. Right-click it. A menu should appear. Press and hold the Option button on your keyboard, and the menu will change (thanks Viki for discovering this). Click the option that says “Copy data as Pathname.”
    • If you’re on a PC: use the File Explorer (look this up if you don’t know what it is) and navigate to the data folder. Right-click the folder while holding Shift, and select “Copy as path.”

    Head back to the console, and use R’s special function for setting the working directory: setwd(). In the parentheses, paste the file path you just copied, and make sure the whole thing is surrounded by quotes. All slashes in the path need to be forward slashes (/), not backslashes (\). If you have a PC, you will need to replace all of the backslashes with forward slashes after pasting. Check to make sure it worked by typing getwd() into the console once you finish.

    Here’s what the result looks like on my computer (which might be different than yours):

  • Now that you have done this, you can tell R to go find the data you downloaded and saved in the data folder in Part 0. Try this now: type load("lab2data.rda") into the console. Then type head(lab2data).

  • Want to see who else is in the data set beyond Luke Combs so far? Type unique(lab2data$artist_name) into the console and run the line of code.
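
If you would like to practice the steps in this part without touching your real folders, here is a minimal sketch. It rests on a few assumptions: a temporary folder made by tempdir() stands in for SOCY 392/data, and toy.data, toydata.rda, and the song_name column are made-up stand-ins for lab2data (only the artist_name column name comes from the lab data).

```r
# Remember where R is currently reading and writing files
old.wd <- getwd()

# A temporary folder stands in for "SOCY 392/data" (assumption for this sketch)
practice.folder <- tempdir()
setwd(practice.folder)
getwd()                  # check that the working directory changed

# Make a tiny, made-up data set and save it as an .rda file
toy.data <- data.frame(artist_name = c("Artist A", "Artist A", "Artist B"),
                       song_name   = c("Song 1", "Song 2", "Song 3"))
save(toy.data, file = "toydata.rda")
rm(toy.data)             # remove the object from R's memory

# Because the working directory is set, the file name alone is enough
load("toydata.rda")
head(toy.data)                  # first rows of the data set
unique(toy.data$artist_name)    # each artist listed once

# Go back to the original working directory
setwd(old.wd)
```

The same pattern applies to the real file: once setwd() points at your data folder, load("lab2data.rda") needs no folder names in front of the file name.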

2. Exploring the RStudio Interface

Whenever you open a clean instance of RStudio, you will usually see three panels like those in the screenshot below.

  • The left panel is called the “Console.”

  • The top right panel includes details about your “Environment.”

  • The bottom right panel will show you “Help” (recall from Lab 1), “Plots” and “Packages” (which we’ll get to in Lab 3).

  • The Console is a big calculator. It’s also the place where all of the code you ever write in R will “execute” - though you might save all of that code in a separate “script,” like you did in Lab 1 (and will do in Lab 2, in just a few minutes).

    • Click inside of the console, type 2+3, then hit enter on your keyboard. You can multiply (*), divide (/), add (+), and subtract (-) numbers here.

    • You can also run functions here. Remember the function we used in Lab 1 to generate random numbers? Use it in the console now. Enter runif(n=10, min=0, max=1) into the console and see what happens.

    • You can even create objects here. Let’s create 10 random numbers and assign the result to an object called potato (remember, we can call objects whatever we want to - R does not recognize them as functions). Type potato <- runif(n=10, min=0, max=1) into the console and hit enter. To check and see if this worked, type potato and hit enter, or get a little fancier and type head(potato) to show the first 6 numbers of the object you created called potato (head() shows 6 values unless you tell it otherwise).

  • The Environment (upper right tab) will show you all of the objects and datasets that are stored in R’s working memory. If you’ve been following along and opened a new RStudio session when we started this lab, you should now have two objects in your Environment tab: lab2data and potato. This panel of RStudio has other tabs that we will not use much in this class.

  • The Help, Plots, and Packages tabs all appear in the bottom-right panel. We will still only work with Help in Lab 2. We will explore the Plots and Packages tabs in Lab 3.

    • Whenever you query R for information about a function (i.e., when you put a ? in front of a function), the result will pop up in the Help tab. Try that now - type ?runif() in the console. Let’s also try it for another function you used during this lab: type ?head().
  • Now that you are familiar with these three panels of RStudio, it is time to add a 4th panel - the Script panel. Download the R script for your Lab 2 problem set (lab2script.R on BlackBoard), save it to SOCY 392/programs, and open it in RStudio. Notice that it opens above the Console panel.

    • The Console panel and the Script panel serve a similar purpose: they are both places where you enter code. But there are a few huge differences:

      • In the Console, whatever code you type gets executed right away.

      • In a Script, no code executes until you highlight the lines you’d like to run, and click the Run button in the top-right corner of the panel. You also keep a record of all of your work, which you can use again if you ever need to re-analyze your data or write a similar program.

      • Regardless of where you run the code from (the Script or the Console), all of the results will show up in the Console only.

    • We will almost exclusively use the Script panel for writing code in this class. But you must always check the console after running chunks of your script to see what happened and to check your work.
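
The Console steps from this part can be sketched as the short set of lines below. The object name potato comes from the steps above; ls() is not covered elsewhere in this lab and is included here only as an optional way to list, in the Console, the same objects the Environment tab displays.

```r
# The Console works like a calculator
2 + 3     # addition: 5
6 * 7     # multiplication: 42
10 / 4    # division: 2.5
9 - 12    # subtraction: -3

# Store 10 random numbers in an object; the name is up to you
potato <- runif(n = 10, min = 0, max = 1)
length(potato)          # 10 values are stored

# head() shows the first 6 values by default; the n argument changes that
head(potato)
head(potato, n = 3)

# ls() lists the objects currently stored in R's working memory
ls()
```

To run these lines from a script instead, highlight them in the Script panel and click Run; the results still show up in the Console.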

Begin Your Problem Set (Parts 3 & 4)

You have completed Parts 1 and 2. Get going on Parts 3 and 4 with a lab partner. Read through the instructions in the lab2script.R file you opened in RStudio carefully, and answer the questions completely by writing code, and by explaining your answers when appropriate with comments (#).

When you are finished, upload your answers to BlackBoard’s submission page. Use the file naming convention lastname_firstname_lab2.R.