Advanced R Programming assignment part 2

Assignment Part 2: Longitudinal Data Class and Methods

The purpose of this part is to create a new class for representing longitudinal data, which is data that is collected over time on a given subject/person. This data may be collected at multiple visits, in multiple locations. You will need to write a series of generics and methods for interacting with this kind of data.

The data for this part come from a small study on indoor air pollution on 10 subjects. Each subject was visited 3 times for data collection. Indoor air pollution was measured using a high-resolution monitor which records pollutant levels every 5 minutes and the monitor was placed in the home for about 1 week. In addition to measuring pollutant levels in the bedroom, a separate monitor was usually placed in another room in the house at roughly the same time.

Before doing this part you may want to review the section on object oriented programming (you can also read that section here).

The variables in the dataset are

id: the subject identification number
visit: the visit number which can be 0, 1, or 2
room: the room in which the monitor was placed
value: the level of pollution in micrograms per cubic meter
timepoint: the time point of the monitor value for a given visit/room

You will need to design a class called “LongitudinalData” that characterizes the structure of this longitudinal dataset. You will also need to design classes to represent the concept of a “subject”, a “visit”, and a “room”.

In addition you will need to implement the following functions

make_LD: a function that converts a data frame into a “LongitudinalData” object
subject: a generic function for extracting subject-specific information
visit: a generic function for extracting visit-specific information
room: a generic function for extracting room-specific information

For each generic/class combination you will need to implement a method, although not all combinations are necessary (see below). You will also need to write print and summary methods for some classes (again, see below).
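
As a minimal sketch of what a generic/class combination looks like in S3, the example below defines one generic with a class-specific method and a fallback. The names `describe` and `Toy` are hypothetical and are not part of the assignment; they only illustrate how `UseMethod()` dispatches on the class attribute.

```r
# Hypothetical generic: dispatches on the class of its first argument
describe <- function(x, ...) UseMethod("describe")

# Method used when the object carries the (illustrative) class "Toy"
describe.Toy <- function(x, ...) {
  paste("Toy object with", length(x$values), "values")
}

# Fallback used when no class-specific method exists
describe.default <- function(x, ...) {
  "No specific method for this class"
}

# Build an S3 object by attaching a class attribute to a list
toy <- structure(list(values = c(1, 2, 3)), class = "Toy")

describe(toy)  # handled by describe.Toy
describe(42)   # handled by describe.default
```

The same pattern applies to subject, visit and room: one generic per concept and one method per class that needs it.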

To complete this Part, you can use either the S3 system, the S4 system, or the reference class system to implement the necessary functions. It is probably not wise to mix any of the systems together, but you should be able to complete the assignment using any of the three systems. The amount of work required should be the same when using any of the systems.
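
For comparison, here is a rough sketch of how a single class and generic might look in the S4 system. The class name `SubjectS4` and the generic `subject_id` are assumptions made up for this illustration; the solution in this document uses S3 instead.

```r
library(methods)

# Hypothetical S4 class with formally declared slots
setClass("SubjectS4", slots = c(id = "numeric", data = "data.frame"))

# Hypothetical S4 generic and its method for SubjectS4 objects
setGeneric("subject_id", function(object) standardGeneric("subject_id"))
setMethod("subject_id", "SubjectS4", function(object) object@id)

# Create an instance; the values are invented for the example
s <- new("SubjectS4", id = 14, data = data.frame(value = c(2.5, 3.0)))

subject_id(s)
```

The main visible difference is that S4 declares slots and their types up front, whereas S3 simply attaches a class attribute to an existing object.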

In order to submit this assignment, please prepare two files:

oop_code.R: an R script file that contains the code implementing your classes, methods, and generics for the longitudinal dataset.
oop_output.txt: a text file containing the output of running the above input code.

Loading data

In this part the data from the longitudinal dataset are loaded (the file was downloaded manually) and a first glimpse of the data is printed.

URLfile <- "C:/Users/menno_000/Documents/R/Course R Programming/Course 1/data/MIE.csv"
longdata <- read.csv(URLfile, sep = ",")
class(longdata)

## [1] "data.frame"

summary(longdata)

##        id          visit                 room           value         
##  Min.   : 14   Min.   :0.000   bedroom     :54827   Min.   :   2.000  
##  1st Qu.: 41   1st Qu.:0.000   living room :37564   1st Qu.:   2.750  
##  Median : 46   Median :1.000   den         : 7198   Median :   7.875  
##  Mean   : 57   Mean   :1.001   kitchen     : 4032   Mean   :  17.412  
##  3rd Qu.: 74   3rd Qu.:2.000   tv room     : 4017   3rd Qu.:  16.000  
##  Max.   :106   Max.   :2.000   family  room: 3879   Max.   :1775.000  
##                                (Other)     : 9360                     
##    timepoint   
##  Min.   :   1  
##  1st Qu.: 562  
##  Median :1065  
##  Mean   :1088  
##  3rd Qu.:1569  
##  Max.   :3075  
##

lapply(longdata, class)

## $id
## [1] "integer"
## 
## $visit
## [1] "integer"
## 
## $room
## [1] "factor"
## 
## $value
## [1] "numeric"
## 
## $timepoint
## [1] "integer"

Transformation into an object

The Stack Overflow question at https://stackoverflow.com/questions/27219132/creating-classes-in-r-s3-s4-r5-rc-or-r6 discusses how to choose between the different class systems. In this case an S3 implementation is sufficient.

The output in the OOPoutput.R file shows that methods are needed for the generic functions subject, visit and room. In the code below, the expected results from the file oop_output.txt are quoted in the comment lines.

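# Load the packages that provide %>%, nest(), spread() and the dplyr verbs
library(dplyr)
library(tidyr)

# Define generic functions ---------------------------------------------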
subject <- function(loadlongdata, id) UseMethod("subject")

visit <- function(subject, visitid) UseMethod("visit")

room <- function(visit, roomid) UseMethod("room")

# Define the constructor and methods for LongitudinalData objects
make_LD <- function(longdata) {
  loadlongdata <- longdata %>% nest(-id)
  structure(loadlongdata, class = c("LongitudinalData"))
}

# oop_output.txt: command: print(class(x)) result: [1] "LongitudinalData"
print.LongitudinalData <- function(variable) {
  cat("Longitudinal dataset with", length(variable[["id"]]), "subjects")
  invisible(variable)
}

subject.LongitudinalData <- function(loadlongdata, id) {
  index <- which(loadlongdata[["id"]] == id)
  if (length(index) == 0)
    return(NULL)
  structure(list(id = id, data = loadlongdata[["data"]][[index]]), class = "Subject")
}

# Define methods for Subject objects

# oop_output.txt: command: out <- subject(x, 10) (Longitudinal dataset with 10 subjects) 
# result: Subject 10 doesn't exist
# oop_output.txt: command: out <- subject(x, 14), result: Subject ID: 14

print.Subject <- function(variable) {
  cat("Subject ID:", variable[["id"]])
  invisible(variable)
}

# oop_output.txt: command: out <- subject(x, 54) %>% summary, result: 
# ID: 54 
#   visit  bedroom       den living room    office
# 1     0       NA        NA    2.792601 13.255475
# 2     1       NA 13.450946          NA  4.533921
# 3     2 4.193721  3.779225          NA        NA


summary.Subject <- function(object) {
  output <- object[["data"]] %>% 
    group_by(visit, room) %>%
    summarise(value = mean(value)) %>% 
    spread(room, value) %>% 
    as.data.frame
  structure(list(id = object[["id"]],
                 output = output), class = "Summary")
}

visit.Subject <- function(subject, visitid) {
  data <- subject[["data"]] %>% 
    filter(visit == visitid) %>% 
    select(-visit)
  structure(list(id = subject[["id"]],
                 visitid = visitid,
                 data = data), class = "Visit")
}

# oop_output.txt: command: out <- subject(x, 44) %>% visit(0) %>% room("bedroom")
# as an example. The results are printed at the room methods

# Define methods for Visit objects
room.Visit <- function(visit, roomid) {
  data <- visit[["data"]] %>% 
    filter(room == roomid) %>% 
    select(-room)
  structure(list(id = visit[["id"]],
                 visitid = visit[["visitid"]],
                 room = roomid,
                 data = data), class = "Room")
}

# Define methods for Room objects 
# oop_output.txt: command: out <- subject(x, 44) %>% visit(0) %>% room("bedroom")
# result: 
# ID: 44 
# Visit: 0 
# Room: bedroom 

print.Room <- function(variable) {
  cat("ID:", variable[["id"]], "\n")
  cat("Visit:", variable[["visitid"]], "\n")
  cat("Room:", variable[["room"]])
  invisible(variable)
}

# Show a summary of the pollutant values
# oop_output.txt: command:
# out <- subject(x, 44) %>% visit(0) %>% room("bedroom") %>% summary
# result:
# ID: 44
#    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
#     8.0    30.0    51.0    88.8    80.0   911.0
summary.Room <- function(object) {
  output <- summary(object[["data"]][["value"]])
  structure(list(id = object[["id"]],
                 output = output), class = "Summary")
}

# Define methods for Summary objects
# out <- subject(x, 44) %>% visit(1) %>% room("living room") %>% summary
# ID: 44 
# Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#    2.75   14.00   24.00   41.37   37.00 1607.00 

print.Summary <- function(variable) {
  cat("ID:", variable[[1]], "\n")
  print(variable[[2]])
  invisible(variable)
}
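
The nesting step inside make_LD can be hard to picture, so here is a small sketch on a toy data frame. The data frame `toy` and its values are invented for illustration. The call uses the named form `nest(data = -id)`, which assumes tidyr 1.0 or later; the code above uses the older `nest(-id)` form.

```r
library(tidyr)

# Toy data, invented for illustration: two subjects, three rows
toy <- data.frame(id    = c(1, 1, 2),
                  visit = c(0, 1, 0),
                  value = c(2.5, 3.0, 7.1))

# Collapse all non-id columns into a list-column called "data",
# leaving one row per subject
nested <- nest(toy, data = -id)

nrow(nested)       # number of distinct subjects
nested$data[[1]]   # the visit/value rows belonging to the first id
```

This is why subject.LongitudinalData can look up a subject by position in the id column and then pull the matching element out of the data column.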

Code from the output script

The following code is copied directly from the file oop_output.R. The first variable name has been changed to the one used in the previous code, and the code is divided into separate chunks.

x <- make_LD(longdata)
print(class(x))

## [1] "LongitudinalData"

print(x)

## Longitudinal dataset with 10 subjects

## Subject 10 doesn't exist
out <- subject(x, 10)
print(out)

## NULL

out <- subject(x, 14)
print(out)

## Subject ID: 14

out <- subject(x, 54) %>% summary
print(out)

## ID: 54 
##   visit  bedroom       den living room    office
## 1     0       NA        NA    2.792601 13.255475
## 2     1       NA 13.450946          NA  4.533921
## 3     2 4.193721  3.779225          NA        NA

out <- subject(x, 14) %>% summary
print(out)

## ID: 14 
##   visit   bedroom family  room living room
## 1     0  4.786592           NA     2.75000
## 2     1  3.401442     8.426549          NA
## 3     2 18.583635           NA    22.55069

out <- subject(x, 44) %>% visit(0) %>% room("bedroom")
print(out)

## ID: 44 
## Visit: 0 
## Room: bedroom

## Show a summary of the pollutant values
out <- subject(x, 44) %>% visit(0) %>% room("bedroom") %>% summary
print(out)

## ID: 44 
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     8.0    30.0    51.0    88.8    80.0   911.0

out <- subject(x, 44) %>% visit(1) %>% room("living room") %>% summary
print(out)

## ID: 44 
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    2.75   14.00   24.00   41.37   37.00 1607.00

Menno Oerlemans

9 February 2018
