Try to work through this Problem Set 0. You will have time to discuss questions you have with the TAs.
The simplest thing we can do is to use R as a calculator. For
instance, we can type 2 + 3 * (6 - 4)^5 after the
> and then hit Enter. Below is a code chunk
showing the output of typing 2 + 3 * (6 - 4)^5 after the
> and then hitting Enter.
We see that R returns the value of the expression \(2 + 3 \times (6 - 4)^5\) immediately underneath and the answer is preceded by a [1]. For now, don’t worry about this “[1]”; we’ll return to it later.
2 + 3 * (6 - 4)^5
## [1] 98
Like any scientific calculator, R comes with a number of built-in functions:
sqrt(4)
## [1] 2
log(10)
## [1] 2.302585
cos(pi)
## [1] -1
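As a small sketch of a few more built-in functions: note that log() computes the natural logarithm (base e), which is why log(10) above is not 1. The particular functions shown here (log10(), exp(), abs()) are just illustrative picks from base R, not a list taken from the problem set.

```r
# log() is the natural logarithm unless a base is supplied
log(100, base = 10)
## [1] 2

# log10() is a shortcut for the base-10 logarithm
log10(1000)
## [1] 3

# exp(1) gives the constant e, and log() undoes exp()
exp(1)
## [1] 2.718282
log(exp(1))
## [1] 1

# absolute value
abs(-3)
## [1] 3
```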
In Statistics Lecture 1, Prof. Wyner will discuss the season record of Major League Baseball’s Oakland A’s. Each season (well, each complete season), an MLB team plays 162 regular season games. In the 2022 MLB World Series, the American League’s highest-seeded team, the Houston Astros, with a season record of 106 wins and 56 losses, played the Philadelphia Phillies, the National League team that slipped into the playoffs as the 6th seed (87 wins and 75 losses).
Let’s calculate the season winning percentages of these two teams:
106/162
## [1] 0.654321
87/162
## [1] 0.537037
Close, but winning percentages are usually reported as a number between 0 and 100. So let’s multiply by 100:
106/162 *100
## [1] 65.4321
87/162 *100
## [1] 53.7037
Now let’s calculate the difference in the two teams’ winning percentages. We can do this in a few ways.
106/162 * 100 - 87/162 * 100
## [1] 11.7284
(106/162* 100) - (87/162* 100)
## [1] 11.7284
Note that we don’t need the ‘( )’ around the division and multiplication operations. The division and multiplication are calculated first, followed by the subtraction.
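Parentheses do matter when they regroup the operations. The following is a minimal sketch, reusing the expressions from earlier in this problem set, of how moving the parentheses changes (or does not change) the result.

```r
# Exponent first, then multiplication, then addition
2 + 3 * (6 - 4)^5
## [1] 98

# Forcing the addition to happen first gives a different answer
(2 + 3) * (6 - 4)^5
## [1] 160

# Forcing the multiplication to happen before the exponent
2 + (3 * (6 - 4))^5
## [1] 7778

# Subtracting the wins first gives the same difference in
# winning percentages, because both terms share the factor 100/162
(106 - 87)/162 * 100
## [1] 11.7284
```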
When you enter expressions like those above, R evaluates them, prints
them, and then immediately discards them. Oftentimes, however, you’ll
want to store a value as a named variable and use it in subsequent
calculations. For instance, let’s say that we want to store the value of
\(2 + 3\times (6 - 4)^5\) as \(x\) and then compute the following: \(1/x, x + 1,\) and \(\sqrt{x}.\) To assign the value of
the expression 2 + 3*(6 - 4)^5 to the variable
x we use the assignment operator
<-. The assignment operator evaluates the expression
immediately to the right of it and stores that value in an object whose
name is whatever text came to the left of the operator.
x <- 2 + 3*(6-4)^5
Now when we execute this expression, R does not automatically print
anything like it did in the earlier examples. However, if you look closely
at the Environment pane (top right-hand side of the
RStudio window), you’ll see that it now lists x and its value 98.
This pane will show every variable that we have defined. As we start creating more and more variables, this list will be really helpful to keep track of what we’ve defined. Now that we have created the variable x, we can use the symbol “x” in more expressions. For example, we can compute \(1/x, x + 1,\) and \(\sqrt{x}\) as follows:
x
## [1] 98
1/x
## [1] 0.01020408
x + 1
## [1] 99
sqrt(x)
## [1] 9.899495
We can also round numbers with the built-in function round():
round(3.14159, digits = 2)
## [1] 3.14
round(3.14159, digits = 4)
## [1] 3.1416
round(sqrt(x), digits = 4)
## [1] 9.8995
Let’s return to our two 2022 World Series teams. Since all teams play the same total number of regular season games, it would be convenient to define a variable, total_games, and assign to it the total number of games, 162.
total_games <- 162
106/total_games
## [1] 0.654321
87/total_games
## [1] 0.537037
Let’s also define variables for games won by the Astros and Phillies.
wins_astros <- 106
wins_phillies <- 87
Take a look at the Environment pane and observe the variables that we have created and assigned values to.
Let’s calculate our two winning percentages one more time, for the Astros and then the Phillies.
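The same list of variables can also be printed in the console. As a brief sketch, the base R function ls() returns the names of the objects defined in the current environment; the exact output depends on what else you have defined in your session, so none is shown here.

```r
# Define the three variables from this section
total_games <- 162
wins_astros <- 106
wins_phillies <- 87

# List the names of all objects currently defined
ls()
```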
wins_astros/total_games * 100
## [1] 65.4321
wins_phillies/total_games * 100
## [1] 53.7037
And then let’s round these to 1 decimal place.
round(wins_astros/total_games * 100, 1)
## [1] 65.4
round(wins_phillies/total_games * 100, 1)
## [1] 53.7
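Since we will often reuse these percentages, it can help to store them as variables too. The sketch below does this and then computes the difference between the two teams; the names pct_astros, pct_phillies, and pct_diff are hypothetical names chosen for this illustration.

```r
total_games <- 162
wins_astros <- 106
wins_phillies <- 87

# Store each winning percentage in its own variable
pct_astros <- wins_astros/total_games * 100
pct_phillies <- wins_phillies/total_games * 100

# Difference in winning percentages, rounded to 1 decimal place
pct_diff <- pct_astros - pct_phillies
round(pct_diff, 1)
## [1] 11.7
```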
In several of the examples above, we used the function
round(), which takes 2 arguments (entered inside the
parentheses). The first argument is the number that we want to round and
the second argument (following the comma) is the number of decimal places to
which we want to round the first argument. This is our first example of
a multi-argument function and we’ll be seeing a lot more of them later
on. In the example round(sqrt(x), digits = 4), we didn’t give round() an
explicit number to round. Instead, R first evaluated
sqrt(x) and then rounded the result. R evaluates nested function
calls from the inside out: it evaluates the inner-most expression
first and then works its way out.
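To make the inside-out evaluation concrete, here is a short sketch that gets the same rounded value three ways: with the second argument named, with it given by position, and with the inner expression stored first. The variable name inner is a hypothetical name used only for this illustration.

```r
x <- 2 + 3*(6 - 4)^5

# Second argument given by name
round(sqrt(x), digits = 4)
## [1] 9.8995

# Second argument given by position
round(sqrt(x), 4)
## [1] 9.8995

# Doing the inner step by hand: evaluate sqrt(x) first, then round
inner <- sqrt(x)
round(inner, 4)
## [1] 9.8995
```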
If we try to evaluate an expression using the name of a variable that we haven’t defined yet, R will throw an error. For instance, in the code below, if we try to add 5 to a previously undeclared variable, we get an error.
y + 5
## Error in eval(expr, envir, enclos): object 'y' not found
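If you are unsure whether a variable has been defined, one option is to ask R before using it. The sketch below uses the base R function exists(), which takes the variable name as a quoted string; the name not_yet_defined is a hypothetical name chosen so that it is unlikely to clash with anything in your session.

```r
# Before assignment, the variable does not exist
exists("not_yet_defined")
## [1] FALSE

# After assignment, it does, and we can use it in a calculation
not_yet_defined <- 5
exists("not_yet_defined")
## [1] TRUE
not_yet_defined + 5
## [1] 10
```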
What does R return if you try to do a calculation with a variable that you have not yet assigned a value? At the prompt, type: 1 + a
Now assign the value 2 to a, like this: a <- 2 Then add 1 to a. What does R return?
In R, what does the code below return? (Make sure to use all caps for ‘TRUE’.) 1 + TRUE
In R, what does the code below return? (Make sure to use all caps for ‘FALSE’.) 1 + FALSE
Save the value of 8/5 + 3^3 as a new variable y.
Save the value of y/x as a new variable z.
Compute the square roots of x, y, and z.
Round these values to 3 decimal places.