In the previous post, I worked on my own data. Now it's time to look at data from the PhysioNet SDDB database.
I have copied the SDDB database and used the script to produce text files with raw heart rate data. Now I'll import the data into R, plot it, and see what the SAX representation looks like (for a subset of 5 subjects).
Load the libraries
# load library TSclust
library(TSclust)
## Loading required package: wmtsa
## Loading required package: splus2R
## Loading required package: ifultools
## Loading required package: MASS
## Loading required package: pdc
## Loading required package: cluster
# load ggplot2 for plotting time series
library(ggplot2)
# load seewave library
library(seewave)
##
## ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Welcome to seewave !
## The package is regularly updated, please check for new version [http://rug.mnhn.fr/seewave]
## Thanks to use the right reference when citing seewave in publications
## See citation('seewave')
## ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# load list of subjects
subjectsList <- read.csv("/home//crtah/CloudStation/Projects//physionet/sddb/RECORDS", header = FALSE, col.names="Subject")
# isolate a subset of subjects (it takes too long otherwise)
subjectsList <- subjectsList[1:5,]
# load heart rates (constant intervals of measurement, frequency 4Hz) of subjects
heartRates <- list()
for (subject in subjectsList) {
  filename <- paste0("/home//crtah/CloudStation/Projects//physionet/sddb/", subject, ".unauditedHRconstint.txt")
  heartRate <- read.delim(filename, col.names=c("Seconds","HeartRate"))
  heartRates[[as.character(subject)]] <- heartRate
  rm(filename, heartRate)
}
For each subject: a plot of the time series, its PAA representation, a SAX plot, and the SAX word.
# loop through all the subjects
for (subject in subjectsList) {
  subject <- as.character(subject)
  # print out subject ID
  print(paste("Data for subject:", subject))
  # plot a heart rate curve for each subject
  data <- heartRates[[subject]]
  # named p so that base R's plot() is not shadowed
  p <- ggplot(data = data) + geom_point(aes(x=Seconds, y=HeartRate), color="blue") + theme_bw()
  print(p)
  # make a time series from the heart rate values (not really necessary?)
  heartRate <- ts(heartRates[[subject]]["HeartRate"], frequency = 4)
  # do a PAA on the heart rate time series
  # (inside a for loop the result is not auto-printed; wrap the call in print() to display it)
  PAA(heartRate, w=100)
  # plot a SAX "curve"
  SAX.plot(series=heartRate, w=100, alpha = 4, col.ser=c("black","blue","red"))
  # produce a SAX word of the heart rate time series
  # using alphabet size of 4 and word length 100
  # (as with PAA above, the result is not auto-printed inside the loop)
  SAX( x=heartRate, alphabet_size = 4, PAA_number = 100 )
}
## [1] "Data for subject: 30"
## [1] "Data for subject: 31"
## [1] "Data for subject: 32"
## [1] "Data for subject: 33"
## [1] "Data for subject: 34"
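To make the PAA and SAX steps in the loop above more concrete, here is a minimal base R sketch of what they compute, as I understand the method. The helper names `paa_sketch` and `sax_sketch` are made up for this illustration and are not the TSclust or seewave implementations. The sketch assumes that the series length is an exact multiple of the number of segments and that the breakpoints are equiprobable quantiles of the standard normal distribution.

```r
# Hypothetical helpers for illustration only (base R, no packages needed).

# Piecewise Aggregate Approximation: split x into w equal-length
# segments and replace each segment by its mean.
paa_sketch <- function(x, w) {
  # assumption: length(x) is an exact multiple of w
  stopifnot(length(x) %% w == 0)
  # matrix() fills column by column, so each column is one segment
  colMeans(matrix(x, ncol = w))
}

# SAX: z-normalise, reduce with PAA, then map each segment mean to a
# letter using breakpoints that cut the standard normal distribution
# into alpha equally probable regions.
sax_sketch <- function(x, w, alpha) {
  z <- (x - mean(x)) / sd(x)
  segments <- paa_sketch(z, w)
  breakpoints <- qnorm(seq_len(alpha - 1) / alpha)
  # findInterval() returns 0..(alpha - 1); shift to 1..alpha
  symbols <- findInterval(segments, breakpoints) + 1
  paste(letters[symbols], collapse = "")
}

# toy series: 8 steadily increasing values, 4 segments, 4 letters
x <- c(2, 4, 6, 8, 10, 12, 14, 16)
sax_word <- sax_sketch(x, w = 4, alpha = 4)
print(sax_word)
```

For this steadily increasing toy series each segment falls into the next region, so the word comes out as "abcd". The loop above does the same kind of thing on real data, with 100 segments and an alphabet of 4 letters.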
The SAX representations of subjects from the SDDB database were produced only as an exercise, as much in getting to the data as in plotting it.