Assignment 3.11, 3.13

library(dslabs)
data("murders")
df<-murders

Exercise 3.11

Q1: Use the $ operator to access the population size data and store it as the object pop. Then use the sort function to redefine pop so that it is sorted. Finally, use the [ operator to report the smallest population size.

pop <- df$population
pop <- sort(pop)
pop[1]

## [1] 563626
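
As a quick cross-check, the first element of a sorted vector is the same value that min returns. A minimal sketch on a small made-up vector (toy_pop is a hypothetical name and its values are not from the murders data):

```r
# Hypothetical population values, for illustration only
toy_pop <- c(1000000, 500000, 200000, 3000000)

# sort() returns the values in increasing order, so element 1 is the minimum
smallest_by_sort <- sort(toy_pop)[1]
smallest_by_min <- min(toy_pop)

# Both approaches should agree
identical(smallest_by_sort, smallest_by_min)
```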

Q2: Now instead of the smallest population size, find the index of the entry with the smallest population size. Hint: use order instead of sort.

# Use the unsorted population column so the index refers to a row of murders
pop <- murders$population

index_of_smallest <- order(pop)[1]

index_of_smallest

## [1] 51

Q3: We can actually perform the same operation as in the previous exercise using the function which.min. Write one line of code that does this.

which.min(murders$population)

## [1] 51

Q4: Now we know how small the smallest state is and we know which row represents it. Which state is it? Define a variable states to be the state names from the murders data frame. Report the name of the state with the smallest population.

states <- murders$state

state_with_smallest_population <- states[index_of_smallest]

state_with_smallest_population

## [1] "Wyoming"
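
To see how order, which.min, and the name lookup fit together, here is a minimal sketch on a made-up data frame (toy and its values are hypothetical, not taken from murders):

```r
# Hypothetical data, for illustration only
toy <- data.frame(
    state = c("State A", "State B", "State C", "State D"),
    population = c(1000000, 500000, 200000, 3000000)
)

# order() returns the row indexes that would sort the vector;
# its first element is the position of the smallest value
idx_order <- order(toy$population)[1]

# which.min() returns that position directly
idx_min <- which.min(toy$population)

# Either index can be used to look up the matching name
smallest_state <- toy$state[idx_min]
smallest_state
```

The index only makes sense against the unsorted vector, which is why order is applied to the original column rather than to a sorted copy.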

Q5: Use the rank function to determine the population rank of each state from smallest population size to biggest. Save these ranks in an object called ranks, then create a data frame with the state name and its rank. Call the data frame my_df.

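# Small made-up data frame used for illustration; note that it overwrites the dslabs murders object
murders <- data.frame(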
    state = c("State A", "State B", "State C", "State D"),
    population = c(1000000, 500000, 2000000, 300000)
)

ranks <- rank(murders$population)

my_df <- data.frame(state = murders$state, rank = ranks)

my_df

##     state rank
## 1 State A    3
## 2 State B    2
## 3 State C    4
## 4 State D    1

Q6: Repeat the previous exercise, but this time order my_df so that the states are ordered from least populous to most populous. Hint: create an object ind that stores the indexes needed to order the population values. Then use the bracket operator [ to re-order each column in the data frame.

ranks <- rank(murders$population)
my_df <- data.frame(state = murders$state, rank = ranks)
ind <- order(murders$population)
my_df <- my_df[ind, ]
my_df

##     state rank
## 4 State D    1
## 2 State B    2
## 1 State A    3
## 3 State C    4
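
The relationship between sort, order, and rank used in the last few questions can be checked directly. A minimal sketch, assuming a made-up numeric vector x with no ties:

```r
# Hypothetical values, for illustration only
x <- c(31, 4, 15, 92, 65)

sorted <- sort(x)   # the values in increasing order
ind <- order(x)     # the positions that put x in increasing order
ranks <- rank(x)    # the rank of each original entry (1 = smallest)

# Indexing by order() reproduces sort()
all(x[ind] == sorted)

# Reordering the ranks by order() gives 1, 2, ..., length(x)
all(ranks[ind] == seq_along(x))
```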

Q7: The is.na function returns a logical vector that tells us which entries are NA. Assign this logical vector to an object called ind and determine how many NAs na_example has.

# na_example ships with dslabs and has to be loaded before use
data("na_example")
ind <- is.na(na_example)
num_nas <- sum(ind)
num_nas

## [1] 145

Q8: Now compute the average again, but only for the entries that are not NA. Hint: remember the ! operator.

average_non_na <- mean(na_example[!is.na(na_example)])
average_non_na

## [1] 2.301754
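
Dropping the NA entries with ! gives the same result as passing na.rm = TRUE to mean. A minimal sketch on a made-up vector (x is hypothetical, not na_example):

```r
# Hypothetical vector with missing values, for illustration only
x <- c(2, NA, 4, 1, NA, 3)

ind <- is.na(x)          # TRUE where an entry is NA
n_missing <- sum(ind)    # TRUE counts as 1, so this counts the NAs

mean_with_na <- mean(x)               # NA, because x contains missing values
mean_subset <- mean(x[!ind])          # drop the NAs by logical indexing
mean_na_rm <- mean(x, na.rm = TRUE)   # let mean() drop them instead

c(n_missing, mean_subset, mean_na_rm)
```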

Exercise 3.13

Q1: Remake the data frame using the code above, but add a line that converts the temperature from Fahrenheit to Celsius. The conversion is C = 5/9 × (F − 32).

# Original temperature data in Fahrenheit
temp <- c(35, 88, 42, 84, 81, 30)
city <- c("Beijing", "Lagos", "Paris", "Rio de Janeiro", "San Juan", "Toronto")

temp_celsius <- (5/9) * (temp - 32)
city_temps <- data.frame(name = city, temperature_fahrenheit = temp, temperature_celsius = temp_celsius)
city_temps

##             name temperature_fahrenheit temperature_celsius
## 1        Beijing                     35            1.666667
## 2          Lagos                     88           31.111111
## 3          Paris                     42            5.555556
## 4 Rio de Janeiro                     84           28.888889
## 5       San Juan                     81           27.222222
## 6        Toronto                     30           -1.111111
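
One way to sanity-check the conversion is to wrap it in a function and convert back. A minimal sketch; the helper names fahrenheit_to_celsius and celsius_to_fahrenheit are hypothetical and not part of the exercise:

```r
# Hypothetical helpers, for illustration only
fahrenheit_to_celsius <- function(fahrenheit) (5 / 9) * (fahrenheit - 32)
celsius_to_fahrenheit <- function(celsius) (9 / 5) * celsius + 32

temp <- c(35, 88, 42, 84, 81, 30)
temp_celsius <- fahrenheit_to_celsius(temp)

# Converting back should recover the original values up to rounding error
round_trip_ok <- isTRUE(all.equal(celsius_to_fahrenheit(temp_celsius), temp))
round_trip_ok
```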

Q2: What is the following sum: 1 + 1/2^2 + 1/3^2 + … + 1/100^2? Hint: thanks to Euler, we know it should be close to π^2/6.

# Sum 1/n^2 for n = 1, 2, ..., 100
sum_value <- 0
for (n in 1:100) {
    sum_value <- sum_value + 1/n^2
}
sum_value

## [1] 1.634984
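
Euler's result says the partial sums of 1/n^2, starting at n = 1, approach π^2/6 as more terms are added. A minimal sketch of that convergence, using cumsum as an assumed alternative to an explicit loop:

```r
# cumsum() gives every partial sum 1/1^2 + ... + 1/n^2 in one pass
n <- 1:1000
partial_sums <- cumsum(1 / n^2)
limit <- pi^2 / 6

# Gap between the limit and the partial sum after 10, 100, and 1000 terms
gaps <- limit - partial_sums[c(10, 100, 1000)]
gaps
```

The gaps are positive and shrink as terms are added, which is what "close to π^2/6" means for the 100-term sum.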

Q3: Compute the per 100,000 murder rate for each state and store it in the object murder_rate. Then compute the average murder rate for the US using the function mean. What is the average?

# Reload the dslabs data; the murder counts are in the total column
data("murders")

murder_rate <- (murders$total / murders$population) * 100000
average_murder_rate <- mean(murder_rate)

average_murder_rate

## [1] 2.779125
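
Note that the mean of the per-state rates weights every state equally, so it generally differs from the rate computed from pooled totals. A minimal sketch with made-up numbers (toy and its values are hypothetical):

```r
# Hypothetical counts and populations, for illustration only
toy <- data.frame(
    total = c(100, 50, 75),
    population = c(500000, 1000000, 750000)
)

# Per 100,000 rate for each row
rate <- toy$total / toy$population * 100000

# Unweighted average of the individual rates
mean_of_rates <- mean(rate)

# Rate computed from the pooled totals instead
pooled_rate <- sum(toy$total) / sum(toy$population) * 100000

c(mean_of_rates, pooled_rate)
```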