Loading the data and libraries

In the previous post, I worked on my own data. Now it’s time to look at data from the PhysioNet SDDB database.

I have copied the SDDB database and used the script to produce text files with raw heart rate data. Now I’ll import the data into R, plot it, and see what the SAX representation looks like (for a subset of 5 subjects).

Load the libraries

# load library TSclust
library(TSclust)
## Loading required package: wmtsa
## Loading required package: splus2R
## Loading required package: ifultools
## Loading required package: MASS
## Loading required package: pdc
## Loading required package: cluster
# load ggplot2 for plotting time series
library(ggplot2)
# load seewave library
library(seewave)
## 
## ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Welcome to seewave ! 
## The package is regularly updated, please check for new version [http://rug.mnhn.fr/seewave]
## Thanks to use the right reference when citing seewave in publications
## See citation('seewave')
## ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# load list of subjects
subjectsList <- read.csv("/home//crtah/CloudStation/Projects//physionet/sddb/RECORDS", header = FALSE, col.names="Subject")
# isolate a subset of subjects (it takes too long otherwise)
subjectsList <- subjectsList[1:5,]

# load heart rates (constant intervals of measurement, frequency 4Hz) of subjects
heartRates <- list()
for (subject in subjectsList) {
  filename <- paste0("/home//crtah/CloudStation/Projects//physionet/sddb/", subject, ".unauditedHRconstint.txt")
  heartRate <- read.delim(filename, col.names=c("Seconds","HeartRate"))
  heartRates[[as.character(subject)]] <- heartRate
  rm(filename, heartRate)
  }
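
One thing worth checking in the loading step: read.delim() assumes by default that the first line of a file is a header, even when col.names is supplied. This post does not show whether the .unauditedHRconstint.txt files start with a header line, so the sketch below uses a small made-up temporary file (not SDDB data) to show the difference. The names sampleFile, withHeader and noHeader are only for illustration.

```r
# Made-up sample in a two-column, tab-separated layout (not real SDDB data)
sampleFile <- tempfile(fileext = ".txt")
writeLines(c("0\t70", "0.25\t71", "0.5\t72"), sampleFile)

# Default header = TRUE: the first line is consumed as a header, then renamed
withHeader <- read.delim(sampleFile, col.names = c("Seconds", "HeartRate"))

# header = FALSE: every line is kept as data
noHeader <- read.delim(sampleFile, header = FALSE,
                       col.names = c("Seconds", "HeartRate"))

nrow(withHeader)  # one row fewer than noHeader
nrow(noHeader)

# clean up the temporary file
unlink(sampleFile)
```

If the real files have no header line, adding header = FALSE to the read.delim() call would keep the first measurement; if they do have one, the call as written is fine.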

Plot the timeseries

For each subject, the code below plots the time series, computes its PAA representation, draws the SAX plot and produces the SAX word.

# loop through all the subjects
for (subject in subjectsList){
  subject <- as.character(subject)
  # print out subject ID
  print(paste("Data for subject:", subject))
  # plot a heart rate curve for each subject
  # heart rate data frame for this subject
  hrData <- heartRates[[subject]]
  hrPlot <- ggplot(data = hrData) + geom_point(aes(x=Seconds, y=HeartRate), color="blue") + theme_bw()
  print(hrPlot)
  
  # make a time series from the heart rate values (not really necessary?)
  heartRate <-  ts(hrData["HeartRate"], frequency = 4)  
  
  # do a PAA on the heart rate time series
  # (assigned to a variable: inside a for loop the result is not auto-printed)
  heartRatePAA <- PAA(heartRate, w=100)
  
  # plot a SAX "curve"
  SAX.plot(series=heartRate, w=100, alpha = 4, col.ser=c("black","blue","red"))
  
  # produce a SAX word of the heart rate time series
  # using alphabet size of 4 and word length 100
  heartRateSAX <- SAX( x=heartRate, alphabet_size = 4, PAA_number = 100 )
  }
## [1] "Data for subject: 30"

[Figures for subject 30: heart rate scatter plot and SAX plot]

## [1] "Data for subject: 31"

[Figures for subject 31: heart rate scatter plot and SAX plot]

## [1] "Data for subject: 32"

[Figures for subject 32: heart rate scatter plot and SAX plot]

## [1] "Data for subject: 33"

[Figures for subject 33: heart rate scatter plot and SAX plot]

## [1] "Data for subject: 34"

[Figures for subject 34: heart rate scatter plot and SAX plot]
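
To make the PAA and SAX steps a bit more concrete, here is a minimal base-R sketch of the general idea on a synthetic series. It is not the TSclust or seewave implementation: paaSketch() and saxSketch() are hypothetical helper names, the "heart rate" is simulated, and the sketch assumes the series length is a multiple of the number of segments.

```r
# Minimal base-R sketch of PAA followed by SAX symbolisation.
# paaSketch() and saxSketch() are hypothetical helpers, not package functions.

# z-normalise the series, then average over w equal-length segments
paaSketch <- function(x, w) {
  stopifnot(length(x) %% w == 0)      # keep the sketch simple: equal segments only
  z <- (x - mean(x)) / sd(x)
  colMeans(matrix(z, ncol = w))       # each column holds one consecutive segment
}

# map each PAA value to a letter using Gaussian breakpoints
saxSketch <- function(paa, alpha) {
  # breakpoints split N(0, 1) into alpha equiprobable regions
  breakpoints <- qnorm(seq_len(alpha - 1) / alpha)
  letters[findInterval(paa, breakpoints) + 1]
}

set.seed(1)
# synthetic "heart rate": 60 s sampled at 4 Hz, slow oscillation plus noise
hr <- 70 + 5 * sin(seq(0, 2 * pi, length.out = 240)) + rnorm(240)

paa <- paaSketch(hr, w = 12)          # 12 segment means
saxWord <- saxSketch(paa, alpha = 4)  # 12 letters from "a" to "d"
paste(saxWord, collapse = "")
```

The word length (12) and alphabet size (4) here are arbitrary choices for a short example; the loop above uses a word length of 100 with the same alphabet size.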

Conclusion

The SAX representations of subjects from the SDDB database were produced only as an exercise, as much in getting to the data as in plotting it.