1 Introduction

Today you will get familiar with the RStudio environment, specifically the Console window. The Console window displays the output of executed code and can also be used to run short pieces of code quickly. However, any work done in the Console window will be lost once you exit RStudio.

2 Console computation

The console can be used as a calculator.

  • addition: +
  • subtraction: -
  • division: /
  • multiplication: *
  • modulus: %%
  • integer division: %/%
  • raise to power: ^
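The modulus and integer division operators may be less familiar than the others. A minimal sketch of how they relate, using numbers chosen only for illustration (they are not part of the exercises):

```r
17 %/% 5                     # integer division: 5 fits into 17 three whole times, so 3
17 %% 5                      # modulus: the remainder after that division, so 2
(17 %/% 5) * 5 + (17 %% 5)   # quotient times divisor plus remainder gives back 17
2 ^ 10                       # raise to power: 1024
```

Anything after a # on a line is a comment and is ignored by R.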

Evaluate the following expressions in the Console window.

  1. 3 + 4 * 0 - (100 / 3)
  2. (4 + 6) * (2 ^ 6)
  3. 1 / 0
  4. 10 ^ 10 ^ 10 ^ 10
  5. 0 / 0
  6. 0.0000003 * 2

When you launch RStudio, numerous functions are immediately available to you. These include many of the mathematical and statistical operations you already know.

R function   Purpose
abs          absolute value
sin          sine
cos          cosine
tan          tangent
log          logarithm
exp          exponential
mean         arithmetic mean
median       median
sd           standard deviation
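Functions are called by writing the function name followed by its input in parentheses. The sketch below shows a few of them on a small made-up set of numbers; c() is the function that combines values into a vector, and the numbers are an assumption chosen only for illustration.

```r
abs(-3.5)                    # 3.5
cos(0)                       # 1
exp(0)                       # 1
mean(c(2, 4, 6, 8, 10))      # 6
median(c(2, 4, 6, 8, 10))    # 6
sd(c(2, 4, 6, 8, 10))        # about 3.162, the square root of 10
```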

Evaluate the following expressions in the Console window.

  1. abs(7)
  2. sin(3.1415)
  3. exp(1)
  4. the logarithm of each of expressions 1 - 6 from the first list of expressions in this section

What logarithm did you just take? Was it the natural log, base 10, or base 2? Type ?log in the console. Typing a question mark before the name of a function or built-in data object opens its help page.
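The question mark is shorthand for the help function. As a small sketch, using median only as a stand-in for any function name, the first two lines below open the same help page, and example() runs the code from the Examples section of that page for you:

```r
?median             # shorthand form
help("median")      # the same request written as a function call
example("median")   # runs the Examples section of the median help page
```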

Type the following in the Console window.

  1. ?sd
  2. ?mtcars
  3. ?longley

What are mtcars and longley?

The most useful parts of an R help page are the description and the examples. Examples are always at the end of the help page. How many examples are given in the help for sd?

  1. Run the example provided in the help for sd.

Investigate what the following functions do:

  1. sqrt
  2. round
  3. floor
  4. ceiling

3 Longley data

3.1 Examine the data

Consider Longley’s Economic Regression Data. This data set is built into R, which means it is available as soon as RStudio is launched. Type longley in your console to see the entire data set. The same data is given below.

Answer the following questions about the longley data set.

  1. How many rows and columns does longley have?
  2. What is the difference between the first column of years and the column with the label Year?
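Counting by eye works for a small data set, but R can also report the shape and labels of a data frame. The sketch below uses mtcars, the other built-in data set you looked up earlier, so that the questions about longley are left for you; the same functions work on any data frame.

```r
dim(mtcars)             # number of rows, then number of columns: 32 11
nrow(mtcars)            # 32
ncol(mtcars)            # 11
names(mtcars)           # the column labels
rownames(mtcars)[1:3]   # the first three row names, which are not a column
```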

Type head(longley) in the Console window. What does this do? How about tail(longley)?

3.2 Summary statistics

The data set longley is stored in R as a data frame. Each column is a vector, and all values within a column are of the same type. We will learn about these details later. For now, to access a specific column as a vector, use longley$variable_name, where variable_name is the name of one of the variables in longley. For example,

longley$GNP
 [1] 234.289 259.426 258.054 284.599 328.975 346.999 365.385 363.112
 [9] 397.469 419.180 442.769 444.546 482.704 502.601 518.173 554.894
longley$Year
 [1] 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960
[15] 1961 1962

give the GNP and Year vectors of data.
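A few related commands may help when reading this output. The bracketed number at the start of each output line, such as [1] or [9], is the position of the first value on that line. The sketch below goes slightly beyond what the lab requires: square brackets after a vector pick out values by position, and double square brackets are another way to pull a column out of a data frame.

```r
names(longley)                             # the variable names you can use after $
longley$GNP[9]                             # the 9th GNP value: 397.469
identical(longley$GNP, longley[["GNP"]])   # TRUE: both forms return the same vector
```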

In your Console window get the following vectors:

  1. Unemployed
  2. Population
  3. Employed

In your Console window compute the mean, median, and standard deviation for each of the vectors in 1 - 3. You can press the up-arrow on your keyboard to cycle through previous inputs in the Console.

Compute the maximum and minimum for each of the vectors in 1 - 3.
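As a pattern to follow, here is a sketch of the same computations on the GNP vector shown earlier, which is not one of the vectors you were asked about. The range function is an extra that returns the minimum and maximum together.

```r
mean(longley$GNP)     # arithmetic mean
median(longley$GNP)   # median
sd(longley$GNP)       # standard deviation
min(longley$GNP)      # smallest value: 234.289
max(longley$GNP)      # largest value: 554.894
range(longley$GNP)    # minimum and maximum in one call
```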

The summary function in R will give you many of these statistics. For example,

summary(longley$GNP)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  234.3   317.9   381.4   387.7   454.1   554.9 

gives us the minimum, first quartile, median, mean, third quartile, and maximum of the GNP vector of data.
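The quartiles in this output can also be requested directly. A minimal sketch, assuming you only want some of the values: quantile reports the cut points by percentage, and a single entry of the summary can be pulled out by its label.

```r
quantile(longley$GNP)               # values at 0%, 25%, 50%, 75%, and 100%
quantile(longley$GNP, 0.25)         # only the first quartile
summary(longley$GNP)[["Median"]]    # one entry of the summary, selected by its label
```

The printed summary is rounded for display, so values from quantile may show more digits.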

Use the summary function on two variables of your choice in longley. Do you think it makes sense to use the summary function on the variable Year in longley?

3.3 Employment investigation

Suppose it is 1962. Two economists are discussing employment. Each makes the following claim.

Economist A: Employment has never been higher in the past 15 years; we have seen a gradual increase from 1947 to 1962.

Economist B: Employment has been range-bound since 1947 and is at its lowest level since 1947.

Which economist is correct?

To make a simple plot use the function plot. Let’s look at the variable Employed across time.

plot(x = longley$Year, y = longley$Employed)

Is this the best way to look at employment over time? Discuss this with those in your team. Think of and create a more meaningful representation of employment and plot it.

Which economist is correct?

The graphic doesn’t look as nice as the election plots we examined last time. The plot function is part of base R graphics. The plots used last time were constructed from functions in the ggplot2 package. We will use that package extensively in the future. However, we can tidy the above plot with the following code.

plot(x = longley$Year, y = longley$Employed, xlab = "Year", ylab = "Employed")

Tidy up the plot you created by adding labels. Feel free to add colors or other features.
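If you are unsure where to start, here is a sketch of some optional arguments to plot, shown on GNP rather than on your employment plot. The title text and the color are arbitrary choices made for illustration.

```r
plot(x = longley$Year, y = longley$GNP,
     type = "b",            # draw both points and connecting lines
     pch = 19,              # use solid circles for the points
     col = "steelblue",     # any name listed by colors() works here
     main = "GNP by year",  # title shown above the plot
     xlab = "Year", ylab = "GNP")
```

Type ?plot and ?par in the Console to see more of the available options.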

Later we will learn about more advanced graphics, such as the one below. Place your cursor on each point to see that year’s GNP.

4 References

  1. J. W. Longley (1967) An appraisal of least-squares programs from the point of view of the user. Journal of the American Statistical Association 62, 819-841.