R-Basics Exercises

Exercise-3.11

Importing Library

library(dslabs)

Loading the Dataset

data(murders)

Question-01

Use the $ operator to access the population size data and store it as the object pop. Then use the sort function to redefine pop so that it is sorted. Finally, use the [ operator to report the smallest population size.

# Access the population size data using the $ operator
pop <- murders$population

# Redefine pop so that it is sorted in ascending order, as the exercise asks
pop <- sort(pop)

# Report the smallest population size (the first element of the sorted vector)
smallest_population <- pop[1]

# Print the smallest population size
print(smallest_population)
## [1] 563626

Question-02

Now instead of the smallest population size, find the index of the entry with the smallest population size. Hint: use order instead of sort.

# Use the order function to get the indices that would sort the population data
sorted_indices <- order(murders$population)
sorted_indices
##  [1] 51  9 46 35  2 42  8 27 40 30 20 12 13 28 49 32 29 45 17  4 25 16  7 37 38
## [26] 18 19 41  1  6 24 50 21 26 43  3 15 22 48 47 31 34 23 11 36 39 14 33 10 44
## [51]  5
# Find the index of the entry with the smallest population size (the first index)
smallest_population_index <- sorted_indices[1]

# Print the index of the entry with the smallest population size
print(smallest_population_index)
## [1] 51

Question-03

We can actually perform the same operation as in the previous exercise using the function which.min. Write one line of code that does this.

smallest_population_index <- which.min(murders$population)
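
The relationship between sort, order and which.min used in the first three questions can be seen on a small vector. The following is only a sketch: x_toy is a made-up vector standing in for murders$population, not data from dslabs.

```r
# Hypothetical toy vector (an assumption for illustration, not from murders)
x_toy <- c(31, 4, 15, 92, 65)

sorted_values <- sort(x_toy)        # the values themselves, in increasing order
sorting_index <- order(x_toy)       # the positions in x_toy that produce that order
min_index     <- which.min(x_toy)   # position of the smallest value only

print(sorted_values)   # 4 15 31 65 92
print(sorting_index)   # 2 3 1 5 4
print(min_index)       # 2
```

Indexing with the result of order reproduces sort, so x_toy[order(x_toy)] and sort(x_toy) give the same vector, and the first element of order(x_toy) matches which.min(x_toy).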

Question-04

Now we know how small the smallest state is and we know which row represents it. Which state is it? Define a variable states to be the state names from the murders data frame. Report the name of the state with the smallest population.

# Find the index of the entry with the smallest population
smallest_population_index <- which.min(murders$population)

# Access the state names from the murders data frame
states <- murders$state

# Find the state with the smallest population
smallest_state <- states[smallest_population_index]

# Print the name of the state with the smallest population
print(smallest_state)
## [1] "Wyoming"

Question-05

You can create a data frame using the data.frame function. Here is a quick example:

temp <- c(35, 88, 42, 84, 81, 30)

city <- c("Beijing", "Lagos", "Paris", "Rio de Janeiro", "San Juan", "Toronto")

city_temps <- data.frame(name = city, temperature = temp)

Use the rank function to determine the population rank of each state from smallest population size to biggest. Save these ranks in an object called ranks, then create a data frame with the state name and its rank. Call the data frame my_df.

temp <- c(35, 88, 42, 84, 81, 30)
city <- c("Beijing", "Lagos", "Paris", "Rio de Janeiro", "San Juan", "Toronto")
city_temps <- data.frame(name = city, temperature = temp)

# Use the rank function to determine the population rank
ranks <- rank(murders$population)

# Create a data frame with state name and its rank
my_df <- data.frame(State = murders$state, Rank = ranks)
my_df
##                   State Rank
## 1               Alabama   29
## 2                Alaska    5
## 3               Arizona   36
## 4              Arkansas   20
## 5            California   51
## 6              Colorado   30
## 7           Connecticut   23
## 8              Delaware    7
## 9  District of Columbia    2
## 10              Florida   49
## 11              Georgia   44
## 12               Hawaii   12
## 13                Idaho   13
## 14             Illinois   47
## 15              Indiana   37
## 16                 Iowa   22
## 17               Kansas   19
## 18             Kentucky   26
## 19            Louisiana   27
## 20                Maine   11
## 21             Maryland   33
## 22        Massachusetts   38
## 23             Michigan   43
## 24            Minnesota   31
## 25          Mississippi   21
## 26             Missouri   34
## 27              Montana    8
## 28             Nebraska   14
## 29               Nevada   17
## 30        New Hampshire   10
## 31           New Jersey   41
## 32           New Mexico   16
## 33             New York   48
## 34       North Carolina   42
## 35         North Dakota    4
## 36                 Ohio   45
## 37             Oklahoma   24
## 38               Oregon   25
## 39         Pennsylvania   46
## 40         Rhode Island    9
## 41       South Carolina   28
## 42         South Dakota    6
## 43            Tennessee   35
## 44                Texas   50
## 45                 Utah   18
## 46              Vermont    3
## 47             Virginia   40
## 48           Washington   39
## 49        West Virginia   15
## 50            Wisconsin   32
## 51              Wyoming    1
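
It is easy to confuse rank with order, so a small side-by-side comparison may help. This is a sketch on made-up values: state_toy, pop_toy and the other *_toy names are hypothetical and not part of the murders data.

```r
# Hypothetical toy data (assumed values, for illustration only)
state_toy <- c("A", "B", "C")
pop_toy   <- c(300, 100, 200)

# rank: for each element, its position in the sorted vector
ranks_toy <- rank(pop_toy)    # 3 1 2

# order: the indices that would sort the vector
order_toy <- order(pop_toy)   # 2 3 1

toy_df <- data.frame(state = state_toy, rank = ranks_toy,
                     stringsAsFactors = FALSE)
print(toy_df)
```

Here A has the largest population, so its rank is 3, while order starts with 2 because the second element (B) is the smallest.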

Question-06

Repeat the previous exercise, but this time order my_df so that the states are ordered from least populous to most populous. Hint: create an object ind that stores the indexes needed to order the population values. Then use the bracket operator [ to re-order each column in the data frame.

# Create an object ind to store the indexes needed to order the population values
# (ordering by murders$population gives the same indexes as ordering by my_df$Rank)
ind <- order(murders$population)

# Use the bracket operator to re-order each column in the data frame
my_df <- my_df[ind, ]

# Print the re-ordered data frame
print(my_df)
##                   State Rank
## 51              Wyoming    1
## 9  District of Columbia    2
## 46              Vermont    3
## 35         North Dakota    4
## 2                Alaska    5
## 42         South Dakota    6
## 8              Delaware    7
## 27              Montana    8
## 40         Rhode Island    9
## 30        New Hampshire   10
## 20                Maine   11
## 12               Hawaii   12
## 13                Idaho   13
## 28             Nebraska   14
## 49        West Virginia   15
## 32           New Mexico   16
## 29               Nevada   17
## 45                 Utah   18
## 17               Kansas   19
## 4              Arkansas   20
## 25          Mississippi   21
## 16                 Iowa   22
## 7           Connecticut   23
## 37             Oklahoma   24
## 38               Oregon   25
## 18             Kentucky   26
## 19            Louisiana   27
## 41       South Carolina   28
## 1               Alabama   29
## 6              Colorado   30
## 24            Minnesota   31
## 50            Wisconsin   32
## 21             Maryland   33
## 26             Missouri   34
## 43            Tennessee   35
## 3               Arizona   36
## 15              Indiana   37
## 22        Massachusetts   38
## 48           Washington   39
## 47             Virginia   40
## 31           New Jersey   41
## 34       North Carolina   42
## 23             Michigan   43
## 11              Georgia   44
## 36                 Ohio   45
## 39         Pennsylvania   46
## 14             Illinois   47
## 33             New York   48
## 10              Florida   49
## 44                Texas   50
## 5            California   51
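
The row-reordering step can be sketched on a tiny data frame. The names toy, ind_toy and toy_sorted are assumptions made for this example only.

```r
# Hypothetical toy data frame (assumed values, for illustration only)
toy <- data.frame(state = c("A", "B", "C"),
                  population = c(300, 100, 200),
                  stringsAsFactors = FALSE)

# Indexes that order the population from smallest to largest
ind_toy <- order(toy$population)

# Putting the index before the comma reorders whole rows,
# so every column stays aligned with its state
toy_sorted <- toy[ind_toy, ]
print(toy_sorted)
```

The original row names (2, 3, 1 here) are kept after reordering, which is why the output above starts with row 51 rather than row 1.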

Question-07

The na_example vector represents a series of counts. You can quickly examine the object using:

data("na_example")
str(na_example)
#>  int [1:1000] 2 1 3 2 1 3 1 4 3 2 ...

However, when we compute the average with the function mean, we obtain an NA:

mean(na_example)
#> [1] NA

The is.na function returns a logical vector that tells us which entries are NA. Assign this logical vector to an object called ind and determine how many NAs na_example has.

# Load the na_example vector
data("na_example")

# Check the structure of na_example
str(na_example)
##  int [1:1000] 2 1 3 2 1 3 1 4 3 2 ...
# Compute a logical vector indicating NA values
ind <- is.na(na_example)

# Count the number of NA values
num_nas <- sum(ind)

# Print the number of NA values
print(num_nas)
## [1] 145
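
Counting with sum works because R treats TRUE as 1 and FALSE as 0 when a logical vector is summed. A minimal sketch on a made-up vector (x_na is hypothetical, not na_example):

```r
# Hypothetical toy vector with two missing values
x_na <- c(2, NA, 3, 1, NA)

# A single NA is enough to make the mean NA
plain_mean <- mean(x_na)

# Logical vector: TRUE where the entry is missing
ind_na <- is.na(x_na)

# Summing a logical vector counts its TRUE entries
n_missing <- sum(ind_na)

print(plain_mean)   # NA
print(ind_na)       # FALSE TRUE FALSE FALSE TRUE
print(n_missing)    # 2
```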

Question-08

Now compute the average again, but only for the entries that are not NA. Hint: remember the ! operator.

# Load the na_example vector
data("na_example")

# Compute a logical vector indicating non-NA values
ind <- !is.na(na_example)

# Compute the average for non-NA entries
average_non_na <- mean(na_example[ind])

# Print the average for non-NA entries
print(average_non_na)
## [1] 2.301754
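
Base R's mean also accepts an na.rm argument that drops missing values before averaging, which should give the same result as subsetting with !is.na. A short sketch on a made-up vector (x_na is hypothetical):

```r
# Hypothetical toy vector with two missing values
x_na <- c(2, NA, 3, 1, NA)

# Route 1: keep only the non-missing entries, then average
mean_subset <- mean(x_na[!is.na(x_na)])

# Route 2: let mean drop the missing entries itself
mean_narm <- mean(x_na, na.rm = TRUE)

print(mean_subset)   # 2
print(mean_narm)     # 2
```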

Exercise-3.13

Question-01

Previously we created this data frame:

temp <- c(35, 88, 42, 84, 81, 30)
city <- c("Beijing", "Lagos", "Paris", "Rio de Janeiro", "San Juan", "Toronto")
city_temps <- data.frame(name = city, temperature = temp)

Remake the data frame using the code above, but add a line that converts the temperature from Fahrenheit to Celsius. The conversion is C = 5/9 × (F − 32).

# Define the temperature data in Fahrenheit
temp_fahrenheit <- c(35, 88, 42, 84, 81, 30)

# Convert Fahrenheit to Celsius using the conversion formula
temp_celsius <- (5/9) * (temp_fahrenheit - 32)

# Define the city names
city <- c("Beijing", "Lagos", "Paris", "Rio de Janeiro", "San Juan", "Toronto")

# Create the data frame with city names and temperatures in Celsius
city_temps <- data.frame(name = city, temperature = temp_celsius)

# Print the data frame
print(city_temps)
##             name temperature
## 1        Beijing    1.666667
## 2          Lagos   31.111111
## 3          Paris    5.555556
## 4 Rio de Janeiro   28.888889
## 5       San Juan   27.222222
## 6        Toronto   -1.111111
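
One way to sanity-check the conversion is to apply it to temperatures whose Celsius values are known. The helper f_to_c below is a hypothetical name introduced for this sketch; it is not part of the exercise or of any package.

```r
# Hypothetical helper wrapping the formula C = 5/9 * (F - 32)
f_to_c <- function(f) {
  5 / 9 * (f - 32)
}

# Arithmetic on vectors is element-wise, so a whole vector converts at once
check_points <- f_to_c(c(32, 212))   # freezing and boiling points of water
print(check_points)                  # 0 and 100, up to floating-point rounding

# The same helper applied to the exercise temperatures
temp_f <- c(35, 88, 42, 84, 81, 30)
print(round(f_to_c(temp_f), 2))
```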

Question-02

What is the following sum: 1 + 1/2² + 1/3² + … + 1/100²? Hint: thanks to Euler, we know it should be close to π²/6.

# Build the integers 1 to 100
n <- 1:100

# Square each term, take reciprocals, and add them up with vector arithmetic
result <- sum(1 / n^2)

# Print the result
print(result)
## [1] 1.634984
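
To see how close this is to Euler's value, the partial sum can be compared with π²/6 directly. The variable names below are assumptions made for this sketch.

```r
# Partial sum of 1/k^2 for k = 1..100, computed with vector arithmetic
partial_sum <- sum(1 / (1:100)^2)

# Euler's limit for the infinite series
euler_limit <- pi^2 / 6

# The gap is the tail of the series (terms from k = 101 onward),
# which lies between 1/101 and 1/100
gap <- euler_limit - partial_sum
print(gap)
```

The gap is a little under 0.01, so 100 terms agree with π²/6 to about two decimal places.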

Question-03

Compute the per 100,000 murder rate for each state and store it in the object murder_rate. Then compute the average murder rate for the US using the function mean. What is the average?

# Compute the murder rate per 100,000 for each state
murder_rate <- (murders$total * 100000) / murders$population

# Compute the average murder rate for the US
average_murder_rate <- mean(murder_rate, na.rm = TRUE)

# Print the average murder rate
print(average_murder_rate)
## [1] 2.779125
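
Note that mean(murder_rate) weights every state equally, so it need not equal the rate for the country as a whole, which would be total murders divided by total population. A sketch on made-up numbers (the *_toy values are hypothetical, not from murders):

```r
# Hypothetical toy data: a small state and a large state
total_toy <- c(10, 100)      # murders
pop_toy   <- c(1e6, 2e7)     # population

# Per 100,000 rate for each state: 1.0 and 0.5
rate_toy <- total_toy / pop_toy * 1e5

# Unweighted average of the state rates
avg_of_rates <- mean(rate_toy)

# Overall rate, which gives more weight to the larger state
overall_rate <- sum(total_toy) / sum(pop_toy) * 1e5

print(avg_of_rates)   # 0.75
print(overall_rate)   # about 0.524
```

On the murders data, the analogous overall figure would be sum(murders$total) / sum(murders$population) * 100000.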