Subsetting, Sorting & Dates: Exercise 3

Exercises

These exercises accompany the Subsetting, Sorting and Dates tutorial.

Import the ‘so2_data.csv’ file we created in Exercise 2 using the read.csv() function and save it to a variable called so2.data. Create a histogram plot of the SO2 values.
Use indexing ([]) to find out the value within the 15th row of the 4th column of the chicago_air dataset (the dataset we used during the lecture).
Using either indexing or the subset() function, create a subset of the chicago_air data that contains only the rows with ozone values greater than 0.065 ppm. Assign this subset to a new variable called high.ozone.
Create a variable called my.date which contains a vector of two values, ‘October 7 2015’ and ‘November 2 2015’. Convert the values in my.date to a date class in R and save it as a variable called new.date. Convert the dates in new.date to a different date format: ‘mmddyyyy’. Save this to a new variable called new.date.format.
Import the csv file in your datasets folder called ‘dates_values.csv’. Save the dataframe to a variable called my.date2. Use str() to see how the dates were imported. Convert the numbers in the date column into dates in R and save as new.date2. Then convert them to the format ‘mm-dd-yyyy’ and save this as new.date.format2.
Sort the chicago_air data in descending order by ozone concentration and save the sorted data to a new variable called chi_air_desc. Using the write.csv() function, save this sorted data to a csv file on your thumb drive named “OzoneSorted.csv”.

Solutions

Solution 1

so2.data <- read.csv("E:/RIntro/datasets/so2_data.csv", header = TRUE, stringsAsFactors = FALSE)
hist(so2.data$SO2) # The missing-value code (-999) in the SO2 data distorts the histogram. We will learn later how to remove these values.

Solution 2

library(region5air)
data(chicago_air)
chicago_air[15,4]

## [1] 0.66

Solution 3

head(chicago_air)

##         date ozone temp solar month weekday
## 1 2013-01-01 0.032   17  0.65     1       3
## 2 2013-01-02 0.020   15  0.61     1       4
## 3 2013-01-03 0.021   28  0.17     1       5
## 4 2013-01-04 0.028   18  0.62     1       6
## 5 2013-01-05 0.025   26  0.48     1       7
## 6 2013-01-06 0.026   36  0.47     1       1

high.ozone <- chicago_air[(chicago_air$ozone > .065),] # Bracket indexing returns a row of NAs for every row where ozone is NA.
high.ozone <- subset(chicago_air, ozone > .065) # The subset() function drops rows where ozone is NA, because the condition is not TRUE for them.
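
The difference between the two approaches is easier to see on a small made-up data frame. The sketch below uses a hypothetical data frame called toy (its values are invented for illustration, not taken from chicago_air). It also shows that wrapping the condition in which() is one way to make bracket indexing drop the NA rows.

```r
# Hypothetical toy data with one missing ozone value (not from chicago_air)
toy <- data.frame(day = 1:4, ozone = c(0.070, NA, 0.050, 0.080))

# Bracket indexing: the comparison with NA yields NA, which returns an all-NA row
by.bracket <- toy[toy$ozone > 0.065, ]

# subset(): rows where the condition is NA are dropped
by.subset <- subset(toy, ozone > 0.065)

# which() returns only the positions where the condition is TRUE
by.which <- toy[which(toy$ozone > 0.065), ]

nrow(by.bracket)  # 3 rows, one of them all NA
nrow(by.subset)   # 2 rows
nrow(by.which)    # 2 rows
```

Either form can be used for this exercise; the only thing to watch for is the extra NA rows when counting or summarizing the result of bracket indexing.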

Solution 4

my.date <- c('October 7 2015','November 2 2015')
new.date <- as.Date(my.date,format="%B %d %Y")
new.date.format <- format(new.date, format='%m%d%Y')
new.date.format

## [1] "10072015" "11022015"
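
One reason to convert text to the Date class before reformatting is that Date values support arithmetic and sorting, while the output of format() is plain text again. The sketch below illustrates this with the same two dates written with numeric months. The variable names raw, d and out are hypothetical, and the numeric input is an assumption made here because %B matches month names in the current locale and may not parse English names on every system.

```r
# Hypothetical input: the same two dates written with numeric months
raw <- c("10/07/2015", "11/02/2015")

# Parse the text into Date objects
d <- as.Date(raw, format = "%m/%d/%Y")
class(d)             # "Date"

# Dates support arithmetic: number of days between the two dates
as.numeric(diff(d))  # 26

# format() turns the dates back into character strings
out <- format(d, format = "%m%d%Y")
class(out)           # "character"
out
```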

Solution 5

my.date2 <- read.csv('E:/RIntro/datasets/dates_values.csv', header = TRUE)
str(my.date2) #Notice these were imported as Excel integer dates

## 'data.frame':    10 obs. of  3 variables:
##  $ Date  : int  42393 42394 42395 42396 42397 42398 42399 42400 42401 42402
##  $ Value : int  1 2 3 4 5 6 7 8 9 10
##  $ Value2: int  5 8 -99 3 4 -999 6 1 3 4

new.date2 <- as.Date(my.date2[,1], origin = '1899-12-30') # Use the Excel origin date to convert these numbers into R dates
new.date.format2 <- format(new.date2,format="%m-%d-%Y") # Change from the default R date format ("yyyy-mm-dd") to "mm-dd-yyyy"
new.date.format2

##  [1] "01-24-2016" "01-25-2016" "01-26-2016" "01-27-2016" "01-28-2016"
##  [6] "01-29-2016" "01-30-2016" "01-31-2016" "02-01-2016" "02-02-2016"
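
As a quick sanity check on the origin, the first number in the Date column can be converted on its own and then recovered by subtracting the origin from the resulting date. This is only a sketch; the variable names serial, first.date and back are hypothetical.

```r
# First value from the Date column shown by str() above
serial <- 42393

# Convert the Excel serial number using the origin from the solution
first.date <- as.Date(serial, origin = "1899-12-30")
format(first.date, "%m-%d-%Y")  # "01-24-2016"

# Going the other way: days elapsed between the origin and that date
back <- as.numeric(as.Date("2016-01-24") - as.Date("1899-12-30"))
back  # 42393
```

If a converted date looks off by several years, it is worth checking whether the file came from a source that uses a different origin.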

Solution 6

library(region5air)
data(chicago_air)
chi_air_desc <- chicago_air[order(-chicago_air$ozone),]
head(chi_air_desc)

##           date ozone temp solar month weekday
## 134 2013-05-14 0.081   74  1.40     5       3
## 252 2013-09-09 0.078   83  1.11     9       2
## 171 2013-06-20 0.074   80  1.35     6       5
## 139 2013-05-19 0.069   73  1.21     5       1
## 140 2013-05-20 0.069   81  1.38     5       2
## 121 2013-05-01 0.068   80  1.36     5       4

write.csv(chi_air_desc,file="E:/RIntro/datasets/OzoneSorted.csv")
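
An equivalent way to sort in descending order is the decreasing argument of order(), and write.csv() accepts row.names = FALSE if the original row numbers are not wanted as an extra column in the file. The sketch below uses a hypothetical toy data frame and a temporary file in place of the chicago_air data and the thumb drive path.

```r
# Hypothetical toy data (not from chicago_air)
toy <- data.frame(day = 1:4, ozone = c(0.050, 0.081, NA, 0.069))

# Sort by ozone, largest first; rows with NA ozone are placed last by default
toy.desc <- toy[order(toy$ozone, decreasing = TRUE), ]
toy.desc$day  # 2 4 1 3

# Write to a temporary file without the row-name column, then read it back
out.file <- tempfile(fileext = ".csv")
write.csv(toy.desc, file = out.file, row.names = FALSE)
check <- read.csv(out.file)
names(check)  # "day" "ozone"
```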