Assignment Part 2: Longitudinal Data Class and Methods

The purpose of this part is to create a new class for representing longitudinal data, which is data that is collected over time on a given subject/person. This data may be collected at multiple visits, in multiple locations. You will need to write a series of generics and methods for interacting with this kind of data.

The data for this part come from a small study on indoor air pollution on 10 subjects. Each subject was visited 3 times for data collection. Indoor air pollution was measured using a high-resolution monitor which records pollutant levels every 5 minutes and the monitor was placed in the home for about 1 week. In addition to measuring pollutant levels in the bedroom, a separate monitor was usually placed in another room in the house at roughly the same time.

Before doing this part you may want to review the section on object oriented programming (you can also read that section here).

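The variables in the dataset are id (subject identifier), visit (visit number, 0 to 2), room (room in which the monitor was placed), value (measured pollutant level) and timepoint (index of the measurement).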

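In addition you will need to implement the following functions: make_LD, subject, visit and room, together with print and summary methods for the classes they return.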

To complete this Part, you can use either the S3 system, the S4 system, or the reference class system to implement the necessary functions. It is probably not wise to mix the systems, but you should be able to complete the assignment using any of the three. The amount of work required should be about the same whichever system you use.
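To make the difference between the systems concrete, here is a minimal sketch of the same one-method class written in S3 and in S4. The class names `ToyS3` and `ToyS4` and the generics `describe` and `describe4` are hypothetical and not part of the assignment; reference classes are left out for brevity.

```r
library(methods)  # provides setClass(), setGeneric(), setMethod(), new()

# S3: the class is just an attribute, and the generic dispatches via UseMethod()
make_toy_s3 <- function(id) structure(list(id = id), class = "ToyS3")
describe <- function(x, ...) UseMethod("describe")
describe.ToyS3 <- function(x, ...) paste("S3 subject", x[["id"]])

# S4: the class and the generic are declared formally, with typed slots
setClass("ToyS4", representation(id = "numeric"))
setGeneric("describe4", function(x, ...) standardGeneric("describe4"))
setMethod("describe4", "ToyS4", function(x, ...) paste("S4 subject", x@id))

s3_result <- describe(make_toy_s3(14))
s4_result <- describe4(new("ToyS4", id = 14))
print(s3_result)
print(s4_result)
```

Both versions dispatch on the class of the first argument; the S4 version needs more declarations but checks the slot types when the object is created.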

In order to submit this assignment, please prepare two files:

Loading data

In this part the data from the longitudinal dataset are loaded (the file was downloaded manually) and a first glimpse of the data is printed.

URLfile <- "C:/Users/menno_000/Documents/R/Course R Programming/Course 1/data/MIE.csv"
longdata <- read.csv(URLfile, sep = ",")
class(longdata)
## [1] "data.frame"
summary(longdata)
##        id          visit                 room           value         
##  Min.   : 14   Min.   :0.000   bedroom     :54827   Min.   :   2.000  
##  1st Qu.: 41   1st Qu.:0.000   living room :37564   1st Qu.:   2.750  
##  Median : 46   Median :1.000   den         : 7198   Median :   7.875  
##  Mean   : 57   Mean   :1.001   kitchen     : 4032   Mean   :  17.412  
##  3rd Qu.: 74   3rd Qu.:2.000   tv room     : 4017   3rd Qu.:  16.000  
##  Max.   :106   Max.   :2.000   family  room: 3879   Max.   :1775.000  
##                                (Other)     : 9360                     
##    timepoint   
##  Min.   :   1  
##  1st Qu.: 562  
##  Median :1065  
##  Mean   :1088  
##  3rd Qu.:1569  
##  Max.   :3075  
## 
lapply(longdata, class)
## $id
## [1] "integer"
## 
## $visit
## [1] "integer"
## 
## $room
## [1] "factor"
## 
## $value
## [1] "numeric"
## 
## $timepoint
## [1] "integer"
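Before building classes on top of the data, it can help to check the study design (number of subjects, visits per subject, rooms per visit). The following is a small base R sketch on a made-up data frame `toy` with the same five columns; `toy` and its values are assumptions for illustration, and on the real data `toy` would be replaced by `longdata`.

```r
# Toy stand-in for longdata with the same five columns (values are made up)
toy <- data.frame(
  id        = c(1, 1, 1, 1, 2, 2),
  visit     = c(0, 0, 1, 2, 0, 1),
  room      = c("bedroom", "den", "bedroom", "bedroom", "bedroom", "kitchen"),
  value     = c(2.5, 3.0, 4.0, 6.0, 10.0, 12.0),
  timepoint = c(1, 1, 1, 1, 1, 1)
)

# Number of distinct subjects
n_subjects <- length(unique(toy$id))

# Number of distinct visits recorded for each subject
visits_per_subject <- tapply(toy$visit, toy$id, function(v) length(unique(v)))

# Rooms monitored at each subject/visit combination
rooms_per_visit <- unique(toy[, c("id", "visit", "room")])

print(n_subjects)
print(visits_per_subject)
print(rooms_per_visit)
```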

Transformation into an object

The Stack Overflow question https://stackoverflow.com/questions/27219132/creating-classes-in-r-s3-s4-r5-rc-or-r6 discusses how to choose between the different class systems. In this case an S3 implementation is sufficient.

The calls in the oop_output.R file show that generics are needed for subject, visit and room, along with print and summary methods. In the code below, the expected results from the file oop_output.txt are quoted in the comments.

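# Load the packages that provide %>%, nest(), spread() and the dplyr verbs
library(dplyr)
library(tidyr)

# Define generics ------------------------------------------------------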
subject <- function(loadlongdata, id) UseMethod("subject")

visit <- function(subject, visitid) UseMethod("visit")

room <- function(visit, roomid) UseMethod("room")

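# Define the constructor and methods for LongitudinalData objects
make_LD <- function(longdata) {
  # One row per subject; all other columns are nested in a "data" list-column
  loadlongdata <- longdata %>% nest(data = -id)
  structure(loadlongdata, class = c("LongitudinalData"))
}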

# oop_output.txt: command: print(class(x)) result: [1] "LongitudinalData"
# Argument names follow the print(x, ...) generic
print.LongitudinalData <- function(x, ...) {
  cat("Longitudinal dataset with", length(x[["id"]]), "subjects")
  invisible(x)
}

subject.LongitudinalData <- function(loadlongdata, id) {
  index <- which(loadlongdata[["id"]] == id)
  if (length(index) == 0)
    return(NULL)
  structure(list(id = id, data = loadlongdata[["data"]][[index]]), class = "Subject")
}

# Define methods for Subject objects

# oop_output.txt: command: out <- subject(x, 10); print(out)
# result: NULL (subject 10 doesn't exist, so subject() returns NULL)
# oop_output.txt: command: out <- subject(x, 14); print(out), result: Subject ID: 14

print.Subject <- function(x, ...) {
  cat("Subject ID:", x[["id"]])
  invisible(x)
}

# oop_output.txt: command: out <- subject(x, 54) %>% summary, result: 
# ID: 54 
#   visit  bedroom       den living room    office
# 1     0       NA        NA    2.792601 13.255475
# 2     1       NA 13.450946          NA  4.533921
# 3     2 4.193721  3.779225          NA        NA


summary.Subject <- function(object, ...) {
  # Mean pollutant value per visit and room, spread into one column per room
  output <- object[["data"]] %>% 
    group_by(visit, room) %>%
    summarise(value = mean(value)) %>% 
    spread(room, value) %>% 
    as.data.frame()
  structure(list(id = object[["id"]],
                 output = output), class = "Summary")
}

visit.Subject <- function(subject, visitid) {
  data <- subject[["data"]] %>% 
    filter(visit == visitid) %>% 
    select(-visit)
  structure(list(id = subject[["id"]],
                 visitid = visitid,
                 data = data), class = "Visit")
}

# oop_output.txt: command: out <- subject(x, 44) %>% visit(0) %>% room("bedroom")
# as an example. The results are printed at the room methods

# Define methods for Visit objects
room.Visit <- function(visit, roomid) {
  data <- visit[["data"]] %>% 
    filter(room == roomid) %>% 
    select(-room)
  structure(list(id = visit[["id"]],
                 visitid = visit[["visitid"]],
                 room = roomid,
                 data = data), class = "Room")
}

# Define methods for Room objects 
# oop_output.txt: command: out <- subject(x, 44) %>% visit(0) %>% room("bedroom")
# result: 
# ID: 44 
# Visit: 0 
# Room: bedroom 

print.Room <- function(x, ...) {
  cat("ID:", x[["id"]], "\n")
  cat("Visit:", x[["visitid"]], "\n")
  cat("Room:", x[["room"]])
  invisible(x)
}

# Show a summary of the pollutant values
# oop_output.txt: command:
# out <- subject(x, 44) %>% visit(0) %>% room("bedroom") %>% summary
# result: 
# ID: 44 
#    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#     8.0    30.0    51.0    88.8    80.0   911.0 
summary.Room <- function(object, ...) {
  output <- summary(object[["data"]][["value"]])
  structure(list(id = object[["id"]],
                 output = output), class = "Summary")
}

# Define methods for Summary objects
# out <- subject(x, 44) %>% visit(1) %>% room("living room") %>% summary
# ID: 44 
# Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#    2.75   14.00   24.00   41.37   37.00 1607.00 

print.Summary <- function(x, ...) {
  cat("ID:", x[["id"]], "\n")
  print(x[["output"]])
  invisible(x)
}
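The visit-by-room table that summary.Subject builds with group_by, summarise and spread can be cross-checked with base R alone. The sketch below uses a made-up data frame `one_subject` (a hypothetical stand-in for the `data` element of a Subject object) and `tapply` to compute the same kind of table of means; it is an alternative for checking results, not the method used above.

```r
# Toy measurements for a single subject (values are made up)
one_subject <- data.frame(
  visit = c(0, 0, 0, 1, 1, 2),
  room  = c("bedroom", "bedroom", "den", "den", "den", "bedroom"),
  value = c(2, 4, 10, 6, 8, 5)
)

# Mean value per visit (rows) and room (columns);
# visit/room pairs without measurements become NA
wide <- with(one_subject,
             tapply(value, list(visit = visit, room = room), mean))
print(wide)
```

As in the summary output for subjects 54 and 14, rooms that were not monitored during a visit show up as NA.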

Code from the output script

The following code is copied directly from the file oop_output.R. The name of the first variable has been changed to the one used in the previous code, and the code has been divided into separate chunks.

x <- make_LD(longdata)
print(class(x))
## [1] "LongitudinalData"
print(x)
## Longitudinal dataset with 10 subjects
## Subject 10 doesn't exist
out <- subject(x, 10)
print(out)
## NULL
out <- subject(x, 14)
print(out)
## Subject ID: 14
out <- subject(x, 54) %>% summary
print(out)
## ID: 54 
##   visit  bedroom       den living room    office
## 1     0       NA        NA    2.792601 13.255475
## 2     1       NA 13.450946          NA  4.533921
## 3     2 4.193721  3.779225          NA        NA
out <- subject(x, 14) %>% summary
print(out)
## ID: 14 
##   visit   bedroom family  room living room
## 1     0  4.786592           NA     2.75000
## 2     1  3.401442     8.426549          NA
## 3     2 18.583635           NA    22.55069
out <- subject(x, 44) %>% visit(0) %>% room("bedroom")
print(out)
## ID: 44 
## Visit: 0 
## Room: bedroom
## Show a summary of the pollutant values
out <- subject(x, 44) %>% visit(0) %>% room("bedroom") %>% summary
print(out)
## ID: 44 
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     8.0    30.0    51.0    88.8    80.0   911.0
out <- subject(x, 44) %>% visit(1) %>% room("living room") %>% summary
print(out)
## ID: 44 
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    2.75   14.00   24.00   41.37   37.00 1607.00