R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:

library(dslabs)

data(murders)

#q1: Use the $ operator to access the population size data and store it as the object pop. Then use the sort function to redefine pop so that it is sorted. Finally, use the [ operator to report the smallest population size.

pop <- murders$population
pop

##  [1]  4779736   710231  6392017  2915918 37253956  5029196  3574097   897934
##  [9]   601723 19687653  9920000  1360301  1567582 12830632  6483802  3046355
## [17]  2853118  4339367  4533372  1328361  5773552  6547629  9883640  5303925
## [25]  2967297  5988927   989415  1826341  2700551  1316470  8791894  2059179
## [33] 19378102  9535483   672591 11536504  3751351  3831074 12702379  1052567
## [41]  4625364   814180  6346105 25145561  2763885   625741  8001024  6724540
## [49]  1852994  5686986   563626

length(pop)

## [1] 51

murders$population[which.min(pop)]

## [1] 563626
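
The exercise asks for sort and the [ operator, while the chunk above reaches the same value through which.min. A minimal sketch of the sort route, using a short made-up vector (pop_demo is a hypothetical name, not the full murders data):

```r
# Hypothetical stand-in for murders$population (three values only)
pop_demo <- c(4779736, 710231, 563626)

# sort() returns the values in ascending order by default
pop_demo <- sort(pop_demo)

# The first element of an ascending vector is the minimum
smallest <- pop_demo[1]
smallest
```

On the real data the same two steps would be applied to pop instead of pop_demo.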

#q2: Now, instead of the smallest population size, find the index of the entry with the smallest population size.

# order returns the indexes that sort the vector, so the first one points at the minimum
ind <- order(murders$population)
ind[1]

## [1] 51

murders$population[ind[1]]

## [1] 563626
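
Indexing a vector with its own values asks for positions that do not exist, which is where a run of NA comes from. A small sketch of the difference between values and indexes, assuming a made-up vector x:

```r
# Made-up values for illustration only
x <- c(31, 4, 15)

# order() gives the positions that would sort x, not the sorted values
ind <- order(x)

# The first position in ind is the index of the smallest entry
first_index <- ind[1]

# Indexing x by ind reproduces sort(x)
sorted_by_index <- x[ind]

# which.min() returns the same index in one call
same_index <- which.min(x)
```

Here order(x)[1] and which.min(x) agree, which is the relationship the next exercise relies on.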

#q3: We can actually perform the same operation as in the previous exercise using the function which.min. Write one line of code that does this.

# Use the population column here; murders$total would locate the fewest murders instead
i_min <- which.min(murders$population)
i_min

## [1] 51

#q4: Now we know how small the smallest state is and we know which row represents it. Which state is it? Define a variable states to be the state names from the murders data frame. Report the name of the state with the smallest population.

# Column names are case-sensitive: the murders columns are "state" and "population"
states <- murders$state

index_of_smallest_population <- which.min(murders$population)

state_with_smallest_population <- states[index_of_smallest_population]
cat("The state with the smallest population is:", state_with_smallest_population)

## The state with the smallest population is: Wyoming
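
Column names in R are case-sensitive, and $ returns NULL without raising an error when the name does not match. A sketch with a toy data frame (df and its values are hypothetical) showing how a capitalised name can silently produce an empty result:

```r
# Toy data frame with lowercase column names, as in murders
df <- data.frame(state = c("A", "B"), population = c(10, 5))

# No column is called "State", so $ returns NULL
wrong_column <- df$State

# which.min() of NULL is a zero-length index, so any lookup with it is empty
empty_index <- which.min(df$Population)

# With the correct names the lookup works
smallest_state <- df$state[which.min(df$population)]
```

Checking names(df) before indexing is a quick way to catch this kind of mismatch.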

#q5: You can create a data frame using the data.frame function. Here is a quick example:

Creating a data frame

temp <- c(35, 88, 42, 84, 81, 30)
city <- c("Beijing", "Lagos", "Paris", "Rio de Janeiro", "San Juan", "Toronto")
city_temps <- data.frame(name = city, temperature = temp)

Accessing and displaying the data frame

print(city_temps)

##             name temperature
## 1        Beijing          35
## 2          Lagos          88
## 3          Paris          42
## 4 Rio de Janeiro          84
## 5       San Juan          81
## 6        Toronto          30

Accessing specific columns

names <- city_temps$name
temperatures <- city_temps$temperature

Accessing specific rows

paris_row <- city_temps[3, ]  # Get the data for Paris
print(paris_row)

##    name temperature
## 3 Paris          42

Filtering data

hot_cities <- city_temps[city_temps$temperature > 80, ]
print(hot_cities)

##             name temperature
## 2          Lagos          88
## 4 Rio de Janeiro          84
## 5       San Juan          81
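
The filter works because the comparison produces a logical vector that is then used as a row index. A short sketch on a two-row subset of the same cities (city_demo and keep are names introduced here for illustration):

```r
# Two of the cities from the example above
city_demo <- data.frame(name = c("Lagos", "Paris"), temperature = c(88, 42))

# The comparison yields one TRUE/FALSE per row
keep <- city_demo$temperature > 80

# Rows where keep is TRUE are retained; the empty column slot keeps all columns
hot_demo <- city_demo[keep, ]

# subset() is a base R alternative that expresses the same filter
hot_demo_subset <- subset(city_demo, temperature > 80)
```

Both forms select the same rows; the bracket form makes the logical index explicit.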

# The murders data frame from dslabs has lowercase columns "state" and "population"

# Use the rank function to determine the population rank
ranks <- rank(murders$population)

# Create a new data frame "my_df" with state names and their ranks
my_df <- data.frame(state = murders$state, rank = ranks)

# Print the "my_df" data frame
print(my_df)
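
rank reports, for each entry in its original position, where that entry would fall in the sorted vector. A minimal sketch on made-up values (x_demo is hypothetical):

```r
# Made-up values for illustration only
x_demo <- c(31, 4, 15)

# 31 is the largest (rank 3), 4 the smallest (rank 1), 15 in between (rank 2)
ranks_demo <- rank(x_demo)

# The entry with rank 1 is the minimum
smallest_demo <- x_demo[ranks_demo == 1]
```

Because the ranks stay aligned with the original order, they can be placed next to the state names in a data frame without any reordering.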

#q6. Repeat the previous exercise, but this time order my_df so that the states are ordered from least populous to most populous. Hint: create an object ind that stores the indexes needed to order the population values.

The “murders” data frame from dslabs has lowercase columns “state” and “population”

Use the rank function to determine the population rank

ranks <- rank(murders$population)

Create a new data frame “my_df” with state names and their ranks

my_df <- data.frame(state = murders$state, rank = ranks)

Create an “ind” object to store the indexes needed to order the population values

ind <- order(murders$population)

Reorder the “my_df” data frame based on the “ind” object

my_df <- my_df[ind, ]

Print the ordered “my_df” data frame

print(my_df)
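
Reordering a data frame by population means computing the ordering indexes once and applying them to the rows. A sketch on a three-row toy data frame (toy and its values are hypothetical, not the murders data):

```r
# Toy data: three made-up states and populations
toy <- data.frame(state = c("A", "B", "C"), population = c(300, 100, 200))

# Rank of each state by population, in the original row order
toy$rank <- rank(toy$population)

# Indexes that sort the rows from least to most populous
ind_demo <- order(toy$population)

# Apply the same indexes to every column by indexing rows
toy_ordered <- toy[ind_demo, ]
```

After reordering, the rank column runs from 1 upward, which is a convenient check that the rows ended up in the intended order.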

#q8

Create vectors for city names and temperatures in Fahrenheit

city <- c("Beijing", "Lagos", "Paris", "Rio de Janeiro", "San Juan", "Toronto")
temp_fahrenheit <- c(35, 88, 42, 84, 81, 30)

Convert temperatures from Fahrenheit to Celsius

temp_celsius <- (5/9) * (temp_fahrenheit - 32)

Create the data frame

city_temps <- data.frame(name = city, temperature_Fahrenheit = temp_fahrenheit, temperature_Celsius = temp_celsius)

Print the data frame

print(city_temps)

##             name temperature_Fahrenheit temperature_Celsius
## 1        Beijing                     35            1.666667
## 2          Lagos                     88           31.111111
## 3          Paris                     42            5.555556
## 4 Rio de Janeiro                     84           28.888889
## 5       San Juan                     81           27.222222
## 6        Toronto                     30           -1.111111
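
The conversion is vectorised, so it can be wrapped in a small helper and checked against two well-known reference points. A sketch, assuming a helper name f_to_c that does not appear in the original:

```r
# Hypothetical helper wrapping the same formula used above
f_to_c <- function(f) {
  (5 / 9) * (f - 32)
}

# Water freezes at 32 F (0 C) and boils at 212 F (100 C)
reference <- f_to_c(c(32, 212))

# Works on a whole vector at once, like the expression above
converted <- f_to_c(c(35, 88, 42, 84, 81, 30))
```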

#q2. What is the following sum 1 + 1/2^2 + 1/3^2 + … + 1/100^2? Hint: thanks to Euler, we know it should be close to π^2/6.

Calculate the sum

n <- 100  # upper limit of the sum, as stated in the exercise
result <- sum(1 / (1:n)^2)

Reference value π^2/6

pi_squared_over_6 <- pi^2 / 6

Print the result

cat("Approximation of π^2/6:", result, "\n")

## Approximation of π^2/6: 1.634984

cat("Exact value of π^2/6:", pi_squared_over_6, "\n")

## Exact value of π^2/6: 1.644934
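
The gap between the partial sum and π^2/6 is the tail of the series, which lies between 1/(n+1) and 1/n. A sketch that checks this for a few values of n (the vector n_values and the helper gap_for are names introduced here):

```r
# Limit of the series, as given by Euler's result
target <- pi^2 / 6

# Hypothetical helper: difference between the limit and the partial sum up to n
gap_for <- function(n) {
  target - sum(1 / (1:n)^2)
}

# Evaluate the gap for increasing n
n_values <- c(10, 100, 1000)
gaps <- sapply(n_values, gap_for)

# Each gap should sit between 1/(n+1) and 1/n
within_bounds <- gaps > 1 / (n_values + 1) & gaps < 1 / n_values
```

This is why adding ten times as many terms only buys roughly one more correct decimal place.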

#3. Compute the per 100,000 murder rate for each state and store it in the object murder_rate. Then compute the average murder rate for the US using the function mean. What is the average?

Calculate the murder rate per 100,000 people for each state

murder_rate <- murders$total / murders$population * 100000

Compute the average of the state murder rates

mean(murder_rate)

## [1] 2.779125
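
The mean of per-state rates gives every state equal weight, so it generally differs from the rate computed from national totals. A sketch with two made-up states (toy_states and its numbers are hypothetical) that shows the distinction:

```r
# Two made-up states: a small one and a large one
toy_states <- data.frame(total = c(10, 200), population = c(1e5, 1e6))

# Per 100,000 rate for each state
toy_rate <- toy_states$total / toy_states$population * 100000

# Unweighted average of the state rates
mean_of_rates <- mean(toy_rate)

# Rate for the combined population, which weights states by size
overall_rate <- sum(toy_states$total) / sum(toy_states$population) * 100000
```

The exercise asks for the first quantity, mean of the state rates; the second is shown only for contrast.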

exercises 3.11 and 3.13

laiba tahir

2023-10-19
