This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button, a document is generated that includes both the content and the output of any embedded R code chunks. You can embed an R code chunk like this:
library(dslabs)
A chunk can also load data and build objects, for example:
data(murders)
BS5th<-data.frame(name=c("ali","ahmad","sania","sara","adil","sharjeel","subhan","arbaz","asad","hassan","waleed"),cgpa=c(2.5,3.2,3.9,2.99,3.10,2.99,2.91,2.82,3.2,3.52,2.50),grade=c("D","B","A","C","B","A","B","B","B","A","D"))
BS5th
## name cgpa grade
## 1 ali 2.50 D
## 2 ahmad 3.20 B
## 3 sania 3.90 A
## 4 sara 2.99 C
## 5 adil 3.10 B
## 6 sharjeel 2.99 A
## 7 subhan 2.91 B
## 8 arbaz 2.82 B
## 9 asad 3.20 B
## 10 hassan 3.52 A
## 11 waleed 2.50 D
ind<-BS5th$cgpa>=3.10
BS5th$name[ind]
## [1] "ahmad" "sania" "adil" "asad" "hassan"
ind<-BS5th$grade=="B"
BS5th$name[ind]
## [1] "ahmad" "adil" "subhan" "arbaz" "asad"
#q3………cgpa>3.1 and got “A”?
cgpa_ind<-BS5th$cgpa>3.1
grade_ind<-BS5th$grade=="A"
BS5th$name[cgpa_ind&grade_ind]
## [1] "sania" "hassan"
murder_rate = murders$total / murders$population * 100000
murder_rate
## [1] 2.8244238 2.6751860 3.6295273 3.1893901 3.3741383 1.2924531
## [7] 2.7139722 4.2319369 16.4527532 3.3980688 3.7903226 0.5145920
## [13] 0.7655102 2.8369608 2.1900730 0.6893484 2.2081106 2.6732010
## [19] 7.7425810 0.8280881 5.0748655 1.8021791 4.1786225 0.9992600
## [25] 4.0440846 5.3598917 1.2128379 1.7521372 3.1104763 0.3798036
## [31] 2.7980319 3.2537239 2.6679599 2.9993237 0.5947151 2.6871225
## [37] 2.9589340 0.9396843 3.5977513 1.5200933 4.4753235 0.9825837
## [43] 3.4509357 3.2013603 0.7959810 0.3196211 3.1246001 1.3829942
## [49] 1.4571013 1.7056487 0.8871131
low = murder_rate < 1
low
## [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE
## [13] TRUE FALSE FALSE TRUE FALSE FALSE FALSE TRUE FALSE FALSE FALSE TRUE
## [25] FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE TRUE FALSE
## [37] FALSE TRUE FALSE FALSE FALSE TRUE FALSE FALSE TRUE TRUE FALSE FALSE
## [49] FALSE FALSE TRUE
#q5:Now use the results from the previous exercise and the function which to determine the indices of murder_rate associated with values lower than 1.
low<-murder_rate<1
which(low)
## [1] 12 13 16 20 24 30 35 38 42 45 46 51
#q6:Use the results from the previous exercise to report the names of the states with murder rates lower than 1.
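# murder_rate is a standalone vector, not a column of murders
ind<-which(murder_rate<1)
murders$state[ind]
## [1] "Hawaii" "Idaho" "Iowa" "Maine"
## [5] "Minnesota" "New Hampshire" "North Dakota" "Oregon"
## [9] "South Dakota" "Utah" "Vermont" "Wyoming"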
#q7:Now extend the code from exercise 2 and 3 to report the states in the Northeast with murder rates lower than 1. Hint: use the previously defined logical vector low and the logical operator &.
murder_rate_ind<-(murder_rate<1)
murder_region_ind<-(murders$region=="Northeast")
murders$state[which(murder_rate_ind & murder_region_ind)]
## [1] "Maine" "New Hampshire" "Vermont"
#q8:In a previous exercise we computed the murder rate for each state and the average of these numbers.How many states are below the average?
avg<-mean(murder_rate)
avg
## [1] 2.779125
sum(murder_rate<avg)
## [1] 27
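A vector computed on its own, such as murder_rate, is not a column of murders, so the `$` operator only finds it after it has been stored in the data frame. The following is a minimal sketch of that distinction, using a made-up three-row data frame; the names `toy` and `rate` and all values are hypothetical and are not taken from the murders dataset.

```r
# Hypothetical toy data, made up for illustration
toy <- data.frame(state = c("X", "Y", "Z"),
                  total = c(30, 40, 5),
                  population = c(1e6, 2e6, 1e6),
                  stringsAsFactors = FALSE)

# A column that was never created comes back as NULL
missing_before <- is.null(toy$rate)

# Store the computed rate as a column of the data frame
toy$rate <- toy$total / toy$population * 100000

# Now the column can be used to index the data frame's own rows
low_states <- toy$state[toy$rate < 1]
low_states
```

Keeping the rate inside the data frame avoids having to remember which values live in a separate vector and which live in a column.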
#q9:Use the match function to identify the states with abbreviations AK, MI, and IA. Hint: start by defining an index of the entries of murders$abb that match the three abbreviations, then use the [ operator to extract the states.
ind<-match(c("AK", "MI","IA"),murders$abb)
murders$state[ind]
## [1] "Alaska" "Michigan" "Iowa"
#q10:Use the %in% operator to create a logical vector that answers the question: which of the following are actual abbreviations: MA, ME, MI, MO, MU
c( "MA", "ME", "MI", "MO", "MU" )%in% murders$abb
## [1] TRUE TRUE TRUE TRUE FALSE
#q11:Extend the code you used in exercise 7 to report the one entry that is not an actual abbreviation.
abbs<-c("MA", "ME", "MI", "MO", "MU")
ind<-which(!abbs%in%murders$abb)
abbs[ind]
## [1] "MU"
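The last few exercises rely on three related tools: which returns the positions of TRUE values, match returns the position of each query element in a lookup vector, and %in% returns a logical vector. A small side-by-side sketch may help keep them apart; the vectors `abb` and `query` are made up for illustration and are not the murders data.

```r
# Hypothetical lookup vector and query, made up for illustration
abb <- c("AK", "IA", "MI", "MO")
query <- c("MI", "AK", "MU")

# which(): positions of the TRUE values in a logical vector
pos_ia <- which(abb == "IA")

# match(): position in abb of each element of query (NA when absent)
pos_query <- match(query, abb)

# %in%: one TRUE/FALSE per element of query
is_known <- query %in% abb

# Negating %in% picks out the entries of query that are not in abb
unknown <- query[!query %in% abb]
unknown
```

Note that match keeps the order of the query, while which always returns positions in increasing order.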