Creating vectors

Here I am tasked with creating three vectors – x, y, and z – as shown below.

x<-c(5,10,15,20,25,30)
y<-c(-1,NA,75,3,5,8)
z<-5

Multiply

Now I multiply the first two vectors (x and y) by z and store the results in two new objects. I also print the new objects.

new_x<-x*z
new_y<-y*z

print(new_x)

## [1]  25  50  75 100 125 150

print(new_y)

## [1]  -5  NA 375  15  25  40
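The NA in the second position of new_y is expected: arithmetic on a missing value returns a missing value. As a minimal sketch using the same three vectors, this can be checked with is.na():

```r
x <- c(5, 10, 15, 20, 25, 30)
y <- c(-1, NA, 75, 3, 5, 8)
z <- 5

# z has length one, so R recycles it across every element of x and y
new_x <- x * z
new_y <- y * z

# Missing values propagate through arithmetic: NA * 5 is still NA
is.na(new_y)
## [1] FALSE  TRUE FALSE FALSE FALSE FALSE
```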

Replace the missing element

Here, I will replace the missing element of the vector “new_y” with the value 2.5, using the ifelse() function. I will also print this modified vector.

new_y<-ifelse(test = is.na(new_y), yes = 2.5, no = new_y)
print(new_y)

## [1]  -5.0   2.5 375.0  15.0  25.0  40.0

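As you can see, every other number in “new_y” is now printed with one decimal place. This comes from R’s print() formatting rather than a change to the data: the vector was numeric all along, and print() shows all elements with the same number of decimals once one of them is 2.5.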
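As an aside, the same replacement can be done without ifelse() by assigning directly to the missing positions. The sketch below recreates new_y by hand (so it runs on its own) and also checks the storage type, which is double both before and after the replacement:

```r
# Recreate new_y as it stood before the replacement
new_y <- c(-5, NA, 375, 15, 25, 40)
typeof(new_y)
## [1] "double"

# Alternative to ifelse(): assign 2.5 only to the positions that are NA
new_y[is.na(new_y)] <- 2.5

# The type has not changed; only the printed formatting differs
typeof(new_y)
## [1] "double"
```

Either approach gives the same vector here; logical indexing changes only the missing elements and leaves the rest untouched.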

Load the PRB data

Here, I will load the PRB data from my professor’s GitHub account.

#using the excellent readr package
library(readr)
mydata<-read_csv(file = "https://raw.githubusercontent.com/coreysparks/data/master/PRB2008_All.csv")

## Parsed with column specification:
## cols(
##   .default = col_integer(),
##   Country = col_character(),
##   Continent = col_character(),
##   Region = col_character(),
##   Population. = col_double(),
##   Rate.of.natural.increase = col_double(),
##   ProjectedPopMid2025 = col_double(),
##   ProjectedPopMid2050 = col_double(),
##   IMR = col_double(),
##   TFR = col_double(),
##   PercPop1549HIVAIDS2001 = col_double(),
##   PercPop1549HIVAIDS2007 = col_double(),
##   PercPpUnderNourished0204 = col_double(),
##   PopDensPerSqMile = col_double()
## )

## See spec(...) for full column specifications.

Print first ten country names

Here, I will print the first ten country names from our new data frame.

print(mydata$Country[1:10])

##  [1] "Afghanistan"         "Albania"             "Algeria"            
##  [4] "Andorra"             "Angola"              "Antigua and Barbuda"
##  [7] "Argentina"           "Armenia"             "Australia"          
## [10] "Austria"

How many countries in the data?

There are lots of ways we can count data in R. Here are two of them:

#Assuming there are no missing countries, we can ask for a summary of mydata$Country

summary(mydata$Country)

##    Length     Class      Mode 
##       209 character character

#Or, we can ask for the number of rows in the data frame (assuming each observation is a country, with no missing countries)

nrow(mydata)

## [1] 209

If we wanted to make extra-super-sure that no countries are missing (that is, no countries are “NA”), we could run the following expression, which counts only those observations which are NOT NA:

length(which(!is.na(mydata$Country)))

## [1] 209
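A shorter way to get the same kind of count is to sum the logical vector that is.na() returns, since TRUE counts as 1. The sketch below uses a small made-up vector (countries is a hypothetical stand-in for mydata$Country) so that it runs without downloading the data:

```r
# Hypothetical toy vector standing in for mydata$Country
countries <- c("Afghanistan", "Albania", NA, "Andorra")

# Number of non-missing entries
sum(!is.na(countries))
## [1] 3

# Number of missing entries
sum(is.na(countries))
## [1] 1

# Quick yes/no check for any missing value
anyNA(countries)
## [1] TRUE
```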

How many countries are missing the e0Total variable?

Here, I will discover how many countries are missing the e0Total (life expectancy) variable.

#similar to the last exercise, we just count the elements of e0Total that are NA
length(which(is.na(mydata$e0Total)))

## [1] 2

So, we see that there are two countries missing e0Total.

Which countries are missing the e0Total variable?

Finally, I will see specifically which countries are missing the e0Total variable:

#Here, I will call for a subset of country names where e0Total is NA
subset(mydata$Country,is.na(mydata$e0Total))

## [1] "Andorra" "Monaco"

We see that Andorra and Monaco are missing e0Total (life expectancy) data – probably on account of them being very small countries.
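The same lookup can also be written with logical indexing, or with subset() applied to the whole data frame. The sketch below uses a made-up data frame (toy and its values are hypothetical, not taken from the PRB file) so that it runs on its own:

```r
# Hypothetical toy data frame; the values are made up for illustration
toy <- data.frame(
  Country = c("A", "B", "C"),
  e0Total = c(70, NA, 81),
  stringsAsFactors = FALSE
)

# Logical indexing on a single column
missing_names <- toy$Country[is.na(toy$e0Total)]
print(missing_names)
## [1] "B"

# Row filtering on the data frame, keeping only the Country column
missing_rows <- subset(toy, is.na(e0Total), select = Country)
```

The first form returns a character vector, while the second returns a one-column data frame, which may be more convenient if other columns are needed later.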

DEM7273_Homework1

Karlerik Naslund

August 24, 2017