The purpose of this part is to create a new class for representing longitudinal data, which is data that is collected over time on a given subject/person. This data may be collected at multiple visits, in multiple locations. You will need to write a series of generics and methods for interacting with this kind of data.
The data for this part come from a small study on indoor air pollution on 10 subjects. Each subject was visited 3 times for data collection. Indoor air pollution was measured using a high-resolution monitor which records pollutant levels every 5 minutes and the monitor was placed in the home for about 1 week. In addition to measuring pollutant levels in the bedroom, a separate monitor was usually placed in another room in the house at roughly the same time.
Before doing this part you may want to review the section on object oriented programming.
The variables in the dataset are id, visit, room, value, and timepoint.
In addition you will need to implement the following functions: make_LD, subject, visit, and room.
To complete this Part, you can use either the S3 system, the S4 system, or the reference class system to implement the necessary functions. It is probably not wise to mix any of the systems together, but you should be able to complete the assignment using any of the three systems. The amount of work required should be the same when using any of the systems.
In order to submit this assignment, please prepare two files:
In this part the data from the Longitudinal Dataset are loaded (the file was downloaded manually) and a first glimpse of the data is printed.
URLfile <- "C:/Users/menno_000/Documents/R/Course R Programming/Course 1/data/MIE.csv"
longdata <- read.csv(URLfile, sep = ",")
class(longdata)
## [1] "data.frame"
summary(longdata)
##        id          visit               room           value
##  Min.   : 14   Min.   :0.000   bedroom    :54827   Min.   :   2.000
##  1st Qu.: 41   1st Qu.:0.000   living room:37564   1st Qu.:   2.750
##  Median : 46   Median :1.000   den        : 7198   Median :   7.875
##  Mean   : 57   Mean   :1.001   kitchen    : 4032   Mean   :  17.412
##  3rd Qu.: 74   3rd Qu.:2.000   tv room    : 4017   3rd Qu.:  16.000
##  Max.   :106   Max.   :2.000   family room: 3879   Max.   :1775.000
##                                (Other)    : 9360
##    timepoint
##  Min.   :   1
##  1st Qu.: 562
##  Median :1065
##  Mean   :1088
##  3rd Qu.:1569
##  Max.   :3075
##
lapply(longdata, class)
## $id
## [1] "integer"
##
## $visit
## [1] "integer"
##
## $room
## [1] "factor"
##
## $value
## [1] "numeric"
##
## $timepoint
## [1] "integer"
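The `factor` class reported for `room` depends on the R version: since R 4.0.0, `read.csv` keeps strings as character unless `stringsAsFactors = TRUE` is passed. A minimal sketch of that behaviour, assuming a made-up three-row file (the rows are invented for illustration and are not taken from MIE.csv):

```r
# Toy CSV written to a temporary file; the rows are invented for illustration.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,visit,room,value,timepoint",
             "14,0,bedroom,6.0,53",
             "14,0,living room,2.75,54",
             "20,1,den,11.5,1"), tmp)

# Ask for factors explicitly so the result is the same on R 3.x and R >= 4.0.
toy <- read.csv(tmp, stringsAsFactors = TRUE)
sapply(toy, class)

# Remove the temporary file again.
unlink(tmp)
```

The S3 code in this part compares `room` with `==`, which works for both factor and character columns, so the choice mainly affects how `summary()` displays the column.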
The Stack Overflow thread https://stackoverflow.com/questions/27219132/creating-classes-in-r-s3-s4-r5-rc-or-r6 discusses how to choose between the different class systems. In this case an S3 implementation is sufficient.
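As a reminder of how S3 dispatch works, here is a minimal sketch with a hypothetical `Reading` class and a hypothetical `describe` generic. Neither name is part of the assignment; they only illustrate the constructor, generic, and method pattern.

```r
# Constructor: an S3 object is an ordinary list with a class attribute.
new_reading <- function(room, value) {
  structure(list(room = room, value = value), class = "Reading")
}

# Generic: UseMethod() picks the method from the class of the first argument.
describe <- function(x, ...) UseMethod("describe")

# Method that is called for objects of class "Reading".
describe.Reading <- function(x, ...) {
  paste0(x[["room"]], ": ", x[["value"]])
}

# Fallback that is called for every other class.
describe.default <- function(x, ...) "not a Reading"

r <- new_reading("bedroom", 6)
describe(r)   # dispatches to describe.Reading
describe(42)  # dispatches to describe.default
```

The same three ingredients are all that S3 needs, which is why it tends to be the lightest option when no formal validation or mutable state is required.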
The calls in the oop_output.R file show that generics are needed for subject, visit and room, together with print and summary methods for the resulting objects. In the code, the expected results from the file oop_output.txt are quoted in the comments.
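# Load the packages that provide %>%, nest(), group_by(), summarise(),
# spread(), filter() and select()
library(dplyr)
library(tidyr)

# Define generic functions ---------------------------------------------
subject <- function(loadlongdata, id) UseMethod("subject")
visit <- function(subject, visitid) UseMethod("visit")
room <- function(visit, roomid) UseMethod("room")

# Define the constructor and methods for LongitudinalData objects
make_LD <- function(longdata) {
  # Keep one row per subject; the measurements go into the list column "data"
  loadlongdata <- longdata %>% nest(data = -id)
  structure(loadlongdata, class = c("LongitudinalData"))
}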
# oop_output.txt: command: print(class(x)) result: [1] "LongitudinalData"
print.LongitudinalData <- function(x, ...) {
  cat("Longitudinal dataset with", length(x[["id"]]), "subjects\n")
  invisible(x)
}

subject.LongitudinalData <- function(loadlongdata, id) {
  # Position of the requested subject; NULL is returned for unknown ids
  index <- which(loadlongdata[["id"]] == id)
  if (length(index) == 0)
    return(NULL)
  structure(list(id = id, data = loadlongdata[["data"]][[index]]), class = "Subject")
}
# Define methods for Subject objects
# oop_output.txt: command: out <- subject(x, 10), result: NULL
#   (subject 10 doesn't exist in the dataset with 10 subjects)
# oop_output.txt: command: out <- subject(x, 14), result: Subject ID: 14
print.Subject <- function(x, ...) {
  cat("Subject ID:", x[["id"]], "\n")
  invisible(x)
}

# oop_output.txt: command: out <- subject(x, 54) %>% summary, result:
# ID: 54
#   visit  bedroom       den living room    office
# 1     0       NA        NA    2.792601 13.255475
# 2     1       NA 13.450946          NA  4.533921
# 3     2 4.193721  3.779225          NA        NA
summary.Subject <- function(object, ...) {
  # Mean pollutant value per visit and room, with one column per room
  output <- object[["data"]] %>%
    group_by(visit, room) %>%
    summarise(value = mean(value)) %>%
    spread(room, value) %>%
    as.data.frame
  structure(list(id = object[["id"]],
                 output = output), class = "Summary")
}

visit.Subject <- function(subject, visitid) {
  # Keep only the rows of the requested visit
  data <- subject[["data"]] %>%
    filter(visit == visitid) %>%
    select(-visit)
  structure(list(id = subject[["id"]],
                 visitid = visitid,
                 data = data), class = "Visit")
}
# oop_output.txt: command: out <- subject(x, 44) %>% visit(0) %>% room("bedroom")
# as an example. The results are printed by the Room methods.
# Define methods for Visit objects
room.Visit <- function(visit, roomid) {
  data <- visit[["data"]] %>%
    filter(room == roomid) %>%
    select(-room)
  structure(list(id = visit[["id"]],
                 visitid = visit[["visitid"]],
                 room = roomid,
                 data = data), class = "Room")
}
# Define methods for Room objects
# oop_output.txt: command: out <- subject(x, 44) %>% visit(0) %>% room("bedroom")
# result:
# ID: 44
# Visit: 0
# Room: bedroom
print.Room <- function(x, ...) {
  cat("ID:", x[["id"]], "\n")
  cat("Visit:", x[["visitid"]], "\n")
  cat("Room:", x[["room"]], "\n")
  invisible(x)
}

# Show a summary of the pollutant values
# oop_output.txt: command:
# out <- subject(x, 44) %>% visit(0) %>% room("bedroom") %>% summary
# result: a Summary object, printed by print.Summary
summary.Room <- function(object, ...) {
  output <- summary(object[["data"]][["value"]])
  structure(list(id = object[["id"]],
                 output = output), class = "Summary")
}
# Define methods for Summary objects
# oop_output.txt: command:
# out <- subject(x, 44) %>% visit(1) %>% room("living room") %>% summary
# result:
# ID: 44
#    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
#    2.75   14.00   24.00   41.37   37.00 1607.00
print.Summary <- function(x, ...) {
  cat("ID:", x[["id"]], "\n")
  print(x[["output"]])
  invisible(x)
}
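The nesting step in make_LD keeps one entry per subject and stores that subject's measurements next to the id. A rough base-R analogue of the same idea is `split()`. The sketch below uses a made-up data frame (rows invented for illustration) and is not the approach used in the code above:

```r
# Toy measurements; the values are invented for illustration.
toy <- data.frame(
  id    = c(14, 14, 20),
  visit = c(0, 1, 0),
  room  = c("bedroom", "den", "bedroom"),
  value = c(6, 2.75, 11.5)
)

# One data frame per subject, keyed by the id (list names are character).
by_subject <- split(toy[, names(toy) != "id"], toy$id)

names(by_subject)         # the subject ids that are present
nrow(by_subject[["14"]])  # number of measurements for subject 14
by_subject[["10"]]        # an unknown id yields NULL, like subject(x, 10)
```

Looking a subject up by name in such a list behaves like `subject()` above: a known id returns that subject's data and an unknown id returns NULL.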
The following code is copied directly from the file oop_output.R. Only the name of the first variable has been changed to match the one used in the code above, and the code is divided into separate chunks.
x <- make_LD(longdata)
print(class(x))
## [1] "LongitudinalData"
print(x)
## Longitudinal dataset with 10 subjects
## Subject 10 doesn't exist
out <- subject(x, 10)
print(out)
## NULL
out <- subject(x, 14)
print(out)
## Subject ID: 14
out <- subject(x, 54) %>% summary
print(out)
## ID: 54
##   visit  bedroom       den living room    office
## 1     0       NA        NA    2.792601 13.255475
## 2     1       NA 13.450946          NA  4.533921
## 3     2 4.193721  3.779225          NA        NA
out <- subject(x, 14) %>% summary
print(out)
## ID: 14
##   visit   bedroom family room living room
## 1     0  4.786592          NA     2.75000
## 2     1  3.401442    8.426549          NA
## 3     2 18.583635          NA    22.55069
out <- subject(x, 44) %>% visit(0) %>% room("bedroom")
print(out)
## ID: 44
## Visit: 0
## Room: bedroom
## Show a summary of the pollutant values
out <- subject(x, 44) %>% visit(0) %>% room("bedroom") %>% summary
print(out)
## ID: 44
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##     8.0    30.0    51.0    88.8    80.0   911.0
out <- subject(x, 44) %>% visit(1) %>% room("living room") %>% summary
print(out)
## ID: 44
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##    2.75   14.00   24.00   41.37   37.00 1607.00