Recreating “Bilateral step length estimation using a single inertial measurement unit attached to the pelvis”[1] using the “Human Activity Recognition with Smartphones”[2] data.

  1. https://jneuroengrehab.biomedcentral.com/articles/10.1186/1743-0003-9-9
  2. https://www.kaggle.com/uciml/human-activity-recognition-with-smartphones

Method

Wavelet Decomposition Instructions

The X and Y accelerometer signals were decomposed using a “Stationary wavelet decomposition”. A Daubechies level 5 (“db5”) mother wavelet was chosen given its similarity to the IMU signals in the proximity of HS. The original signals were then decomposed in an approximation curve plus ten levels of detail. Thresholds were applied to the first three detail levels and the other detail levels were discarded. Thresholds for these levels were 1/5, 1/4 and 1/3 of its magnitude for the first, second and third level, respectively. The signals were reconstructed using only the first three levels of detail after thresholds were applied. An interval of interest for the accelerometer signals was defined as the interval of time during which the reconstructed signals differed from zero.

Gait Cycle Feature Identification

Based on the preliminary visual investigation, the right HS (rHS) was detected as the instant of time in the middle between the maximum of the Y accelerometer signal and the first minimum of the accelerometer signal along the X accelerometer signal in the corresponding intervals of interest. The identification of the left HS (lHS) required the identification of both right and left toe off instances (rTO, lTO). The lTO was detected as the instant of time in the middle between the first maximum of the Y accelerometer signal after the rHS and the second minimum of the X accelerometer signal in the corresponding intervals of interest. The rTO was found as the time of the minimum negative peak value between two consecutive lTO and rHS. The lHS was found as the first local maximum of the Z accelerometer signal before the rTO.

Clean Data

library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.3     v purrr   0.3.4
## v tibble  3.1.1     v dplyr   1.0.5
## v tidyr   1.1.3     v stringr 1.4.0
## v readr   1.4.0     v forcats 0.5.1
## Warning: package 'tibble' was built under R version 4.0.5
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
setwd("~/CUNY/HAPT Data Set")

LOAD Prepped Data

Loading the raw accelerometer data.

filenames <- list.files("~/CUNY/HAPT Data Set/RawData", full.names=TRUE)
ldf <- lapply(filenames[1:122], read_file)
foo <- str_split(ldf, "\n", )
test <- data.frame(do.call(rbind, foo))
## Warning in (function (..., deparse.level = 1) : number of columns of result is
## not a multiple of vector length (arg 1)
test[1:3,1:3]
##                                                           X1
## 1  0.9180555898766518 -0.1124999994242935 0.5097222514293852
## 2  0.4430555869047616 0.03750000183409193 0.8888889575760315
## 3 0.4138889059802831 -0.01527777880638848 0.9222222655264142
##                                                           X2
## 1 0.9111111304603812 -0.09305556168259389 0.5375000404706096
## 2  0.4402778031382534 0.04166666836688091 0.8805556062765068
## 3 0.4138889059802831 -0.01944444533917746 0.9333334005924472
##                                                           X3
## 1  0.8819444981597608 -0.0861111144222878 0.5138889270791476
## 2   0.4513888895804282 0.0430555572111439 0.8763889306267443
## 3 0.4041666871094333 -0.02083333418344045 0.9263889411761765
testt <- as.data.frame(t(test))
names(testt) <- filenames[1:122]
testp <- pivot_longer(testt, cols = 1:122)
testp$name <- str_replace(testp$name, "C:/Users/Randy/Documents/CUNY/HAPT Data Set/RawData/", "")
testp$name <- str_replace(testp$name, ".txt", "")

testw <- cbind(str_split(testp$name, "_", simplify = TRUE), testp)
testww <- cbind(str_split(testp$value, " ", simplify = TRUE), testw)
names(testww) <- c("x", "y", "z", "sensor", "exp", "user", "name", "value")



#testww <- pivot_wider(testww, sensor)

testww$Index <- ave( 1:nrow(testww), factor(testww$exp), factor(testww$user), factor(testww$sensor), FUN=seq_along )

testww$exp <- str_replace(testww$exp, "exp0", "")
testww$exp <- str_replace(testww$exp, "exp", "")
testww$user <- str_replace(testww$user, "user0", "")
testww$user <- str_replace(testww$user, "user", "")
testww$con <- paste(testww$exp, testww$user, testww$Index, sep = "_")
labels <- read_delim("C:/Users/Randy/Documents/CUNY/HAPT Data Set/RawData/labels.txt", " ", col_names = c("exp", "user", "activity", "a", "b"))
## 
## -- Column specification --------------------------------------------------------
## cols(
##   exp = col_double(),
##   user = col_double(),
##   activity = col_double(),
##   a = col_double(),
##   b = col_double()
## )
seqs <- mapply(FUN = function(a, b) {
      seq(from = a, to = b, by = 1)
  }, a = labels$a, b = labels$b)


df<-data.frame(lapply(seqs, "length<-", max(lengths(seqs))))

df <-t(df)

df <- cbind(labels, df)

df <- pivot_longer(df, 6:2037)

df$con <- paste(df$exp, df$user, df$value, sep = "_")
something <- left_join(testww, df, by = "con")

label_names <- read_delim("C:/Users/Randy/Documents/CUNY/HAPT Data Set/activity_labels.txt", " ", col_names = c("activity", "label"))
## 
## -- Column specification --------------------------------------------------------
## cols(
##   activity = col_double(),
##   label = col_character()
## )
## Warning: 11 parsing failures.
## row col  expected    actual                                                              file
##   1  -- 2 columns 3 columns 'C:/Users/Randy/Documents/CUNY/HAPT Data Set/activity_labels.txt'
##   2  -- 2 columns 3 columns 'C:/Users/Randy/Documents/CUNY/HAPT Data Set/activity_labels.txt'
##   4  -- 2 columns 3 columns 'C:/Users/Randy/Documents/CUNY/HAPT Data Set/activity_labels.txt'
##   5  -- 2 columns 3 columns 'C:/Users/Randy/Documents/CUNY/HAPT Data Set/activity_labels.txt'
##   6  -- 2 columns 3 columns 'C:/Users/Randy/Documents/CUNY/HAPT Data Set/activity_labels.txt'
## ... ... ......... ......... .................................................................
## See problems(...) for more details.
something <- left_join(something, label_names, by = "activity")

something$exp.x <- as.numeric(something$exp.x)

rawAF <- something %>% 
  filter(label == "WALKING") %>% 
  arrange(exp.x)

rawAF %>% 
  count(exp.x)
##    exp.x    n
## 1      1 6708
## 2      2 6994
## 3      3 4282
## 4      4 4010
## 5      5 4434
## 6      6 3806
## 7      7 4372
## 8      8 4014
## 9      9 4310
## 10    10 3642
## 11    11 4160
## 12    12 3872
## 13    13 4144
## 14    14 3888
## 15    15 3628
## 16    16 3376
## 17    17 3684
## 18    18 3636
## 19    19 3774
## 20    21 3646
## 21    22 4244
## 22    23 4202
## 23    24 3886
## 24    25 3384
## 25    26 4130
## 26    27 3874
## 27    28 4306
## 28    29 4072
## 29    30 3906
## 30    31 3782
## 31    32 3704
## 32    33 3592
## 33    34 4504
## 34    35 4134
## 35    36 4034
## 36    37 3920
## 37    38 3870
## 38    39 3566
## 39    40 3790
## 40    41 3416
## 41    42 3868
## 42    43 3636
## 43    44 3264
## 44    45 3310
## 45    46 4184
## 46    47 4148
## 47    48 3956
## 48    49 4230
## 49    50 4982
## 50    51 5218
## 51    52 3928
## 52    53 4346
## 53    54 3972
## 54    55 4072
## 55    56 4130
## 56    57 3688
## 57    58 3710
## 58    59 3808
## 59    60 4644
## 60    61 4392
#now known as cleaned_raw

Data Load

Load data and create table to work with.

library(tidyverse)
raw <- read_csv("Cleaned_Raw")
acc <- raw %>% 
  dplyr::filter(sensor == "acc", exp.x == 1) %>% 
  dplyr::select(x, y, z, sensor, exp.x, Index) %>% 
  arrange(Index)

gyro <- raw %>% 
  dplyr::filter(sensor == "gyro", exp.x == 1) %>% 
  dplyr::select(x, y, z, sensor, exp.x, Index) %>% 
  arrange(Index)
names(gyro) <- c("gx", "gy", "gz", "sensor", "exp.x1", "Index")
signals <- cbind(acc[,1:3],gyro[,c(1:3, 6)])
head(signals)
##           x           y          z         gx         gy          gz Index
## 1 1.4208334 -0.34027778 -0.1250000 -0.2758057  1.6426166 -0.08216137  7496
## 2 1.0027778 -0.20416667 -0.1083333 -0.6759224  0.6704246 -0.08338311  7497
## 3 0.6833333 -0.06111112 -0.1083333 -1.1334605 -0.3915646  0.11881329  7498
## 4 0.7333334 -0.08333334 -0.1208333 -1.2907583 -0.7635816  0.10567969  7499
## 5 0.9569445 -0.26388890 -0.1375000 -1.2049317 -0.7596110  0.03451389  7500
## 6 1.0500000 -0.40277780 -0.1444445 -0.8530733 -0.6325510 -0.08704830  7501

Wavelet Decomposition

DB5 Sationary Wavelet Decomposition using wavethresh package. With a window size of 2^7, there are 7 levels of detail. The first level is the original signal so the first three levels of detail we take are 6, 5, and 4. This is where we apply the thresholds of 1/5, 1/4, and 1/3.

library(wavethresh)
ywd <- wd(signals$y[200:327], filter.number=5, family="DaubExPhase", type="station") 

ywdT1 <- threshold(ywd, policy="manual", value=.2,
    levels=6, #by.level = TRUE,
    verbose=TRUE)

ywdT2 <- threshold(ywd, policy="manual", value=.25,
    levels=5, #by.level = TRUE,
    verbose=TRUE)

ywdT3 <- threshold(ywd, policy="manual", value=1/3,
    levels=4, #by.level = TRUE,
    verbose=TRUE)



spec1 <- accessD(ywdT1, level = 6)
spec2 <- accessD(ywdT2, level = 5)
spec3 <- accessD(ywdT3, level = 4)

test <- as.data.frame(cbind(spec1, spec2, spec3, spec1+spec2+spec3, signals$y[1200:1327], signals$Index[1200:1327]))
head(test)
plot(ywd)

## [1] 1.088341 1.088341 1.088341 1.088341 1.088341 1.088341 1.088341
plot(spec1+spec2+spec3, type="l")

ggplot(test, aes(V6)) +
  geom_line(aes(y = spec1, color = "x")) + 
  geom_line(aes(y = spec2, color = "y")) + 
  geom_line(aes(y = spec3, color = "z")) + 
  geom_line(aes(y = V5, color = "a")) + 
  geom_line(aes(y = V4, color = "w")) + 
  labs(x = "Time", y = "Signal")

library(wavethresh)
xwd <- wd(signals$x[200:327], filter.number=5, family="DaubExPhase", type="station")


xwdT1 <- threshold(xwd, policy="manual", value=.2,
    levels=6, #by.level = TRUE,
    verbose=TRUE)

xwdT2 <- threshold(xwd, policy="manual", value=.25,
    levels=5, #by.level = TRUE,
    verbose=TRUE)

xwdT3 <- threshold(xwd, policy="manual", value=1/3,
    levels=4, #by.level = TRUE,
    verbose=TRUE)



spec1 <- accessD(xwdT1, level = 6)
spec2 <- accessD(xwdT2, level = 5)
spec3 <- accessD(xwdT3, level = 4)

test <- as.data.frame(cbind(spec1, spec2, spec3, spec1+spec2+spec3, signals$x[1200:1327], signals$Index[1200:1327]))
plot(xwd)

## [1] 1.568444 1.568444 1.568444 1.568444 1.568444 1.568444 1.568444
plot(spec1+spec2+spec3, type="l")

ggplot(test, aes(V6)) +
  geom_line(aes(y = spec1, color = "x")) + 
  geom_line(aes(y = spec2, color = "y")) + 
  geom_line(aes(y = spec3, color = "z")) + 
  geom_line(aes(y = V5, color = "a")) + 
  #geom_line(aes(y = V4, color = "w")) + 
  labs(x = "Time", y = "Signal")

#plot(getpacket(ywd, level = 2, index = 1))

Create Function

stepwaves <- function(chunk) {

xwd <- wd(chunk$x, filter.number=5, family="DaubExPhase", type="station")


xwdT1 <- threshold(xwd, policy="manual", value=.2,
    levels=6, #by.level = TRUE,
    verbose=TRUE)

xwdT2 <- threshold(xwd, policy="manual", value=.25,
    levels=5, #by.level = TRUE,
    verbose=TRUE)

xwdT3 <- threshold(xwd, policy="manual", value=1/3,
    levels=4, #by.level = TRUE,
    verbose=TRUE)



spec1x <- accessD(xwdT1, level = 6)
spec2x <- accessD(xwdT2, level = 5)
spec3x <- accessD(xwdT3, level = 4)

ywd <- wd(chunk$y, filter.number=5, family="DaubExPhase", type="station")


ywdT1 <- threshold(ywd, policy="manual", value=.2,
    levels=6, #by.level = TRUE,
    verbose=TRUE)

ywdT2 <- threshold(ywd, policy="manual", value=.25,
    levels=5, #by.level = TRUE,
    verbose=TRUE)

ywdT3 <- threshold(ywd, policy="manual", value=1/3,
    levels=4, #by.level = TRUE,
    verbose=TRUE)



spec1y <- accessD(ywdT1, level = 6)
spec2y <- accessD(ywdT2, level = 5)
spec3y <- accessD(ywdT3, level = 4)

xdec <- spec1x+spec2x+spec3x
ydec <- spec1y+spec2y+spec3y
x <- chunk$x
y <- chunk$y
z <- chunk$z
user <- chunk$user
exp <- chunk$exp
index <- chunk$Index

allwaves <- as.data.frame(cbind(spec1x, spec2x, spec3x, xdec, spec1y, spec2y, spec3y, ydec, x, y, z, user, exp, index))

return(allwaves)

}

Add data for all walking experiments

explist <- c(1,3,5,7,9,11,13,15,17,19,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54,56,58,60)

acc <- raw %>% 
  dplyr::filter(sensor == "acc") %>% #, exp.x %in% explist) %>% 
  dplyr::select(x, y, z, sensor, exp.x, user.x, Index) %>% 
  arrange(Index)

gyro <- raw %>% 
  dplyr::filter(sensor == "gyro") %>% #, exp.x %in% explist ) %>% 
  dplyr::select(x, y, z, sensor, exp.x, user.x, Index) %>% 
  arrange(Index)
names(gyro) <- c("gx", "gy", "gz", "sensor", "exp", "user", "Index")
signals <- cbind(acc[,1:3],gyro[,c(1:3, 5:7)])

Window every 128

#test <- signals %>% dplyr::filter(exp == 1)

#tmp <- split(test,(seq(nrow(test))-1) %/% 128) 

#big <- lapply(tmp[1:(length(tmp)-1)], stepwaves)

signals$exp1 <- signals$exp

test <- signals %>% group_by(exp1) %>% group_map(split)

bar <- list()

for (i in test){
  foo = i[[1]]
  big = split(foo, (seq(nrow(foo))-1) %/% 128)
  bash = lapply(big[1:(length(big)-1)], stepwaves)
  bar <- append(bar, bash)
}

df <- do.call(rbind.data.frame, bar)
head(df)

Results for Wavelet Decomposition

Let’s put them together.

It looks nothing like the diagram in the original study but it looks close for the data here. https://jneuroengrehab.biomedcentral.com/articles/10.1186/1743-0003-9-9/figures/2

ggplot(df[10000:10207,], aes(index)) +
  geom_line(aes(y = xdec, color = "xd")) + 
  geom_line(aes(y = ydec, color = "yd")) + 
  geom_line(aes(y = x, color = "x")) + 
  geom_line(aes(y = y, color = "y")) + 
  #geom_point(aes(y = rhs-1, color = "rsh")) +
  #geom_line(aes(y = V4, color = "w")) + 
  labs(x = "Time", y = "Signal")

Finding gait cycle range with Autocorrelation

detach("package:wavethresh")
library(tidyverse)
#gait <- read_csv("step_output_v1")
slim <- df %>% dplyr::select(x,y,z,exp)

slim <- slim %>% 
  mutate(mag = sqrt(((x-1)^2)+(y^2)+(z^2)))
library(fpp2)
## Warning: package 'fpp2' was built under R version 4.0.5
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
## -- Attaching packages ---------------------------------------------- fpp2 2.4 --
## v forecast  8.14     v expsmooth 2.3 
## v fma       2.4
## Warning: package 'forecast' was built under R version 4.0.5
## Warning: package 'fma' was built under R version 4.0.5
## Warning: package 'expsmooth' was built under R version 4.0.5
## 
#turns out only 1 experiment per subject was using lateral pelvic sensor location.
explist <- c(1,3,5,7,9,11,13,15,17,19,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54,56,58,60)
allexp <- c(1:19,21:61)
y <- data.frame()

for(i in c(1:19, 21:61)){#explist){
mag1 <- slim %>% 
  filter(exp == i) %>% #group_by(exp) %>% 
  dplyr::select(mag)

acftest <- forecast::ggAcf(mag1, lag=100)

x <- acftest$data[40:70,] %>% filter(Freq == max(Freq))

x <- as.data.frame(x)

x$exp <- i

y <- rbind(y, x)
#print(ggAcf(mag1, lag=100))


#print(autoplot(ts(mag1[200:400,])))
}

acflist <- y
detach("package:fpp2")
acftest <- forecast::ggAcf(mag1, lag=100)
max(acftest$data$Freq[40:70])
## [1] 0.7199533
acftest$data[40:70,] %>% filter(Freq == max(Freq))
##   Var2 Var3      Freq lag
## 1  mag  mag 0.7199533  60

Gait Cycle Feature Creation

Tagging local minima and maxima within the windows of interest.

df <- df %>% 
  mutate(xwin = if_else(xdec != 0, 1, 0)) %>% 
  mutate(ywin = if_else(ydec != 0, 1, 0)) %>% 
  mutate(rownum = row_number())

library(data.table)
xwinlist <- list()
ywinlist <- list()
xwinlist <- split(seq_along(df$xwin[1:188784])[df$xwin != 0], rleid(df$xwin[1:188784])[df$xwin != 0])
ywinlist <- split(seq_along(df$ywin[1:188784])[df$ywin != 0], rleid(df$ywin[1:188784])[df$ywin != 0])


xlocalmax <- which(diff(sign(diff(df$x)))==-2)+1 #max
xlocalmax <- as.data.frame(xlocalmax) %>% mutate(xmax = 1)
names(xlocalmax) <- c("rownum", "xmax")

xlocalmin <- which(diff(sign(diff(df$x)))==+2)+1 #min
xlocalmin <- as.data.frame(xlocalmin) %>% mutate(xmin = 1)
names(xlocalmin) <- c("rownum", "xmin")

ylocalmax <- which(diff(sign(diff(df$y)))==-2)+1 #max
ylocalmax <- as.data.frame(ylocalmax) %>% mutate(ymax = 1)
names(ylocalmax) <- c("rownum", "ymax")

ylocalmin <- which(diff(sign(diff(df$y)))==+2)+1 #min
ylocalmin <- as.data.frame(ylocalmin) %>% mutate(ymin = 1)
names(ylocalmin) <- c("rownum", "ymin")

zlocalmax <- which(diff(sign(diff(df$z)))==-2)+1 #max
zlocalmax <- as.data.frame(zlocalmax) %>% mutate(zmax = 1)
names(zlocalmax) <- c("rownum", "zmax")


df <- df %>% 
  left_join(xlocalmax, by = "rownum") %>%
  left_join(xlocalmin, by = "rownum") %>%
  left_join(ylocalmax, by = "rownum") %>%
  left_join(ylocalmin, by = "rownum") %>% 
  left_join(zlocalmax, by = "rownum")

df <- df %>% 
  mutate(rlidx = rleid(xwin==0)) %>% 
  mutate(rlidy = rleid(ywin==0))
head(df)
##       spec1x      spec2x spec3x        xdec spec1y      spec2y      spec3y
## 1  0.1353014 -0.30450842      0 -0.16920704      0  0.07651245  0.00000000
## 2 -0.0100473 -0.26462043      0 -0.27466773      0  0.01334185 -0.03472984
## 3  0.0000000  0.00000000      0  0.00000000      0  0.00000000 -0.11342998
## 4  0.0000000  0.14942648      0  0.14942648      0 -0.03043552 -0.03941623
## 5  0.0000000  0.07618343      0  0.07618343      0  0.00000000  0.00000000
## 6  0.0000000  0.00000000      0  0.00000000      0  0.00000000  0.00000000
##          ydec         x           y          z user exp index xwin ywin rownum
## 1  0.07651245 1.4208334 -0.34027778 -0.1250000    1   1  7496    1    1      1
## 2 -0.02138799 1.0027778 -0.20416667 -0.1083333    1   1  7497    1    1      2
## 3 -0.11342998 0.6833333 -0.06111112 -0.1083333    1   1  7498    0    1      3
## 4 -0.06985175 0.7333334 -0.08333334 -0.1208333    1   1  7499    1    1      4
## 5  0.00000000 0.9569445 -0.26388890 -0.1375000    1   1  7500    1    0      5
## 6  0.00000000 1.0500000 -0.40277780 -0.1444445    1   1  7501    0    0      6
##   xmax xmin ymax ymin zmax rlidx rlidy
## 1   NA   NA   NA   NA   NA     1     1
## 2   NA   NA   NA   NA   NA     1     1
## 3   NA    1    1   NA   NA     2     1
## 4   NA   NA   NA   NA   NA     3     1
## 5   NA   NA   NA   NA   NA     3     2
## 6    1   NA   NA   NA   NA     4     2
ggplot(df[200:327,], aes(index)) +
  #geom_line(aes(y = xdec, color = "xd")) + 
  #geom_line(aes(y = ydec, color = "yd")) + 
  geom_line(aes(y = x, color = "x")) + 
  geom_line(aes(y = y, color = "y")) + 
  #geom_point(aes(y = rhs-1, color = "rsh")) +
  #geom_line(aes(y = V4, color = "w")) + 
  labs(x = "Time", y = "Signal")

Right Heel Strike

RHS - halfway between max y and first x min

acflist
##    Var2 Var3      Freq lag exp
## 1   mag  mag 0.6231667  55   1
## 2   mag  mag 0.6862755  59   2
## 3   mag  mag 0.6157629  57   3
## 4   mag  mag 0.5493169  55   4
## 5   mag  mag 0.5402958  61   5
## 6   mag  mag 0.7360311  58   6
## 7   mag  mag 0.6139289  52   7
## 8   mag  mag 0.6000483  51   8
## 9   mag  mag 0.4920101  59   9
## 10  mag  mag 0.6121896  54  10
## 11  mag  mag 0.6114783  57  11
## 12  mag  mag 0.6772004  57  12
## 13  mag  mag 0.5820849  56  13
## 14  mag  mag 0.6761947  54  14
## 15  mag  mag 0.7126567  55  15
## 16  mag  mag 0.6600908  52  16
## 17  mag  mag 0.4714725  52  17
## 18  mag  mag 0.6814041  52  18
## 19  mag  mag 0.7301755  56  19
## 20  mag  mag 0.7387067  55  21
## 21  mag  mag 0.6759565  60  22
## 22  mag  mag 0.7594183  57  23
## 23  mag  mag 0.6936834  58  24
## 24  mag  mag 0.7235355  56  25
## 25  mag  mag 0.5939962  54  26
## 26  mag  mag 0.6732414  51  27
## 27  mag  mag 0.6238475  62  28
## 28  mag  mag 0.7835763  59  29
## 29  mag  mag 0.7215232  58  30
## 30  mag  mag 0.7197561  56  31
## 31  mag  mag 0.6875469  56  32
## 32  mag  mag 0.7498070  55  33
## 33  mag  mag 0.6826632  62  34
## 34  mag  mag 0.7287710  61  35
## 35  mag  mag 0.7369380  55  36
## 36  mag  mag 0.7526903  55  37
## 37  mag  mag 0.6467269  52  38
## 38  mag  mag 0.6650576  50  39
## 39  mag  mag 0.5916666  56  40
## 40  mag  mag 0.7112189  54  41
## 41  mag  mag 0.7443458  53  42
## 42  mag  mag 0.7730848  51  43
## 43  mag  mag 0.6973920  51  44
## 44  mag  mag 0.7463018  50  45
## 45  mag  mag 0.4902211  59  46
## 46  mag  mag 0.5391015  60  47
## 47  mag  mag 0.5958135  61  48
## 48  mag  mag 0.6102064  61  49
## 49  mag  mag 0.5276549  63  50
## 50  mag  mag 0.5302286  64  51
## 51  mag  mag 0.5361400  57  52
## 52  mag  mag 0.6855913  59  53
## 53  mag  mag 0.7500505  55  54
## 54  mag  mag 0.8085641  56  55
## 55  mag  mag 0.6392366  58  56
## 56  mag  mag 0.6962261  56  57
## 57  mag  mag 0.7777401  57  58
## 58  mag  mag 0.7861890  57  59
## 59  mag  mag 0.7901076  62  60
## 60  mag  mag 0.7199533  60  61
c <- data.frame()
a <- data.frame()
for(i in c(1:19,21:27,29:49,51:60)){
  t1 <- df %>% dplyr::filter(exp == acflist$exp[i]) %>% dplyr::select(y, index, rownum, ydec, spec2y)

  t2 <- t1 %>% filter(rownum %in% rownum[1:200]) %>% dplyr::filter(spec2y != 0) %>%  filter(y == min(y))
  
  a <- t2$rownum
  b <- max(t1$rownum)
  
  t3 <- df %>% dplyr::filter(exp == acflist$exp[i]) %>% 
    dplyr::filter(ydec != 0) %>% 
    dplyr::select(y, index, rownum)
  repeat{
    if(
      count(t3 %>% 
            dplyr::filter(rownum %in% 
                          (tail(a, 1)+acflist$lag[i]-10):(tail(a,1)+acflist$lag[i]+10)
                          ) 
            ) == 0
    ){
      t4 <- t1 %>% dplyr::filter(rownum %in% (tail(a, 1)+acflist$lag[i]-10):(tail(a,1)+acflist$lag[i]+10)) %>% filter(y == max(y)) %>% tail(1)
    } else {
      t4 <- t3 %>% dplyr::filter(rownum %in% (tail(a, 1)+acflist$lag[i]-10):(tail(a,1)+acflist$lag[i]+10)) %>% filter(y == max(y)) %>% tail(1)
    }
    
    a <- append(a, t4$rownum)
    if (tail(a, 1) >  (b-acflist$lag[i])) {
      break
    }
  }
  c <- append(c, a)
}

Don’t use

c <- c %>% unlist() %>% as.data.frame() %>% mutate(j = 1) 
names(c) <- c("rownum", "rhs")
c <- c %>% dplyr::filter(rownum != 0) %>% distinct()

df <- df %>% 
  left_join(c, by = "rownum")

Old RHS

The two red dots on this graph correctly identify right foot heel strike.

ggplot(df[200:327,], aes(index)) +
  #geom_line(aes(y = xdec, color = "xd")) + 
  geom_line(aes(y = ydec, color = "yd")) + 
  geom_line(aes(y = x, color = "x")) + 
  geom_line(aes(y = y, color = "y")) + 
  geom_point(aes(y = rhs-1, color = "rhs")) +
  #geom_line(aes(y = V4, color = "w")) + 
  labs(x = "Time", y = "Signal")

Left Toe Off

LTO - halfway between first max y after RHS and second x min

ymax <- df %>% dplyr::filter(ymax==1) %>% 
  dplyr::select(rownum) %>% 
  mutate(var = "y")

rhs <- df %>% dplyr::filter(rhs==1) %>% 
  dplyr::select(rownum) %>% 
  mutate(var = "rhs")

ymax_after_rhs <- rbind(ymax, rhs) %>% arrange(rownum)

x <- data.frame()

for(i in 1:nrow(ymax_after_rhs)){
  a <- if_else(ymax_after_rhs$var[i] == "rhs" & ymax_after_rhs$var[i+1] == "y" , 
               ymax_after_rhs$rownum[i+1], 0)
  x <- append(x,a)
}

x <- x %>% unlist() %>% as.data.frame() %>% mutate(j = "ymax_after_rhs")
names(x) <- c("rownum", "var")
ymax_after_rhs <- x %>% dplyr::filter(rownum != 0) %>% distinct()


xmin2 <- df %>% dplyr::filter(xwin==1,xmin==1) %>% 
  arrange(desc(rownum)) %>% 
  distinct(rlidx, .keep_all = TRUE) %>% #only keeps first xmin
  arrange(rownum) %>% 
  dplyr::select(rownum) %>% 
  mutate(var = "x2")

lto <- rbind(ymax_after_rhs, xmin2) %>% arrange(rownum)

x <- data.frame()

for(i in 1:nrow(lto)){
  a <- if_else(lto$var[i] == "ymax_after_rhs" & lto$var[i+1] == "x2" | lto$var[i-1] == "x2", 
    round((lto$rownum[i] + 
             if_else(abs(lto$rownum[i] - lto$rownum[i+1]) >
                       abs(lto$rownum[i] - lto$rownum[i-1]),
                     lto$rownum[i-1],
                     lto$rownum[i+1]
           ))/2), 0)
  x <- append(x,a)
}

x <- x %>% unlist() %>% as.data.frame() %>% mutate(j = 1) 
names(x) <- c("rownum", "lto")
x <- x %>% dplyr::filter(rownum != 0) %>% distinct()

df <- df %>% 
  left_join(x, by = "rownum")

Right Toe Off

RTO - y min between LTO to next RHS

lto <- df %>% dplyr::filter(lto==1) %>% 
  dplyr::select(rownum) %>% 
  mutate(var = "lto")

rhs <- df %>% dplyr::filter(rhs==1) %>% 
  dplyr::select(rownum) %>% 
  mutate(var = "rhs")

rto <- rbind(lto, rhs) %>% 
  arrange(var) %>% 
  distinct(rownum, .keep_all = TRUE) %>% #keep lto
  arrange(rownum)

x <- data.frame()
y <- data.frame()
a <- data.frame()
b <- data.frame()
for(i in 1:(nrow(rto)-1)){
  if(rto$var[i] == "lto" & rto$var[i+1] == "rhs"){ 
    a <- rto$rownum[i]
    b <- rto$rownum[i+1]
  }
    x <- append(x,a)
    y <- append(y,b)
}
x <- data.frame()
y <- data.frame()

for(i in 1:(nrow(rto)-1)){
  if(rto$var[i] == "lto" & rto$var[i+1] == "rhs"){ 
    a <- rto$rownum[i]
    b <- rto$rownum[i+1]
  }
    x <- append(x,a)
    y <- append(y,b)
}

x <- x %>% unlist() %>% as.data.frame() 
names(x) <- "seq1"
x <- x %>% distinct()

y <- y %>% unlist() %>% as.data.frame()  
names(y) <- "seq2"
y <- y %>% distinct()

find_ymin <- cbind(x,y)

seqs <- mapply(FUN = function(a, b) {
      seq(from = a, to = b, by = 1)
  }, a = find_ymin$seq1, b = find_ymin$seq2)

find_ymin <- data.frame(lapply(seqs, "length<-", max(lengths(seqs))))

find_ymin <- t(find_ymin)

test <- c(find_ymin) %>% na.omit() %>% as.data.frame()

names(test) <- "x"

test <- test %>% 
  arrange(x)

x <- test$x

y <- sort(x)
g <- cumsum(c(1, abs(y[-length(y)] - y[-1]) > 1))
test <- cbind(g, y) %>% as.data.frame()
names(test) <- c("seqs", "rownum")

findy <- df %>% 
  dplyr::select(rownum, y) %>% 
  left_join(test, by = "rownum") %>% 
  na.omit()

rto <- data.frame()

rto <- findy %>% 
  group_by(seqs) %>% 
  slice(which.min(y)) %>% 
  ungroup() %>% 
  dplyr::select(rownum) %>% 
  mutate(rto = 1)

df <- df %>% 
  left_join(rto, by = "rownum")

Left Heel Strike

LHS - first max z before RTO

zmax <- df %>% dplyr::filter(zmax==1) %>% 
  dplyr::select(rownum) %>% 
  mutate(var = "zmax")

rto <- df %>% dplyr::filter(rto==1) %>% 
  dplyr::select(rownum) %>% 
  mutate(var = "rto")

lhs <- rbind(zmax, rto) %>% 
  arrange(var) %>% 
  distinct(rownum, .keep_all = TRUE) %>% 
  arrange(rownum)

x <- data.frame()

for(i in 1:nrow(lhs)){
  a <- if_else(lhs$var[i] == "rto" & lhs$var[i-1] == "zmax",
    lhs$rownum[i-1],0)
  #return(a)
  x <- append(x,a)
}

x <- x %>% unlist() %>% as.data.frame() %>% mutate(j = 1) 
names(x) <- c("rownum", "lhs")
x <- x %>% dplyr::filter(rownum != 0) %>% distinct(, .keep_all = TRUE)

df <- df %>% 
  left_join(x, by = "rownum")

head(df)
##       spec1x      spec2x spec3x        xdec spec1y      spec2y      spec3y
## 1  0.1353014 -0.30450842      0 -0.16920704      0  0.07651245  0.00000000
## 2 -0.0100473 -0.26462043      0 -0.27466773      0  0.01334185 -0.03472984
## 3  0.0000000  0.00000000      0  0.00000000      0  0.00000000 -0.11342998
## 4  0.0000000  0.14942648      0  0.14942648      0 -0.03043552 -0.03941623
## 5  0.0000000  0.07618343      0  0.07618343      0  0.00000000  0.00000000
## 6  0.0000000  0.00000000      0  0.00000000      0  0.00000000  0.00000000
##          ydec         x           y          z user exp index xwin ywin rownum
## 1  0.07651245 1.4208334 -0.34027778 -0.1250000    1   1  7496    1    1      1
## 2 -0.02138799 1.0027778 -0.20416667 -0.1083333    1   1  7497    1    1      2
## 3 -0.11342998 0.6833333 -0.06111112 -0.1083333    1   1  7498    0    1      3
## 4 -0.06985175 0.7333334 -0.08333334 -0.1208333    1   1  7499    1    1      4
## 5  0.00000000 0.9569445 -0.26388890 -0.1375000    1   1  7500    1    0      5
## 6  0.00000000 1.0500000 -0.40277780 -0.1444445    1   1  7501    0    0      6
##   xmax xmin ymax ymin zmax rlidx rlidy rhs lto rto lhs
## 1   NA   NA   NA   NA   NA     1     1  NA  NA  NA  NA
## 2   NA   NA   NA   NA   NA     1     1  NA  NA  NA  NA
## 3   NA    1    1   NA   NA     2     1  NA  NA  NA  NA
## 4   NA   NA   NA   NA   NA     3     1  NA  NA  NA  NA
## 5   NA   NA   NA   NA   NA     3     2  NA  NA  NA  NA
## 6    1   NA   NA   NA   NA     4     2  NA  NA  NA  NA
#write.csv(df, "step_output_v1")

Gait Cycle Feature Results

This is 2 and a half gait cycles, there should be 10 dots.

ggplot(df[200:327,], aes(rownum)) +
  #geom_line(aes(y = xdec, color = "xd")) + 
  #geom_line(aes(y = ydec, color = "yd")) + 
  geom_line(aes(y = x, color = "x")) + 
  geom_line(aes(y = y, color = "y")) + 
  geom_point(aes(y = rhs-1, color = "rhs")) +
  geom_point(aes(y = lhs-1, color = "lhs")) +
  geom_point(aes(y = rto-1, color = "rto")) +
  geom_point(aes(y = lto-1, color = "lto")) +
  #geom_line(aes(y = V4, color = "w")) + 
  labs(x = "Time", y = "Signal")
## Warning: Removed 126 rows containing missing values (geom_point).
## Warning: Removed 125 rows containing missing values (geom_point).

## Warning: Removed 125 rows containing missing values (geom_point).

## Warning: Removed 125 rows containing missing values (geom_point).

New gait cycle rules

Preping data for picking gait cycle features in order.

Pros: maintaining gait cycle feature order will help filter extra features.

Con: missing features could result in a missed gait cycle.

gob <- df %>% dplyr::filter(rhs == 1 | lhs == 1 | rto == 1 | lto == 1) %>% 
  dplyr::select(rownum, rhs, lto, lhs, rto) 

gob <- gob %>% 
  mutate(ID = seq(1,nrow(gob),1))



data1 <- gob %>% pivot_longer(2:5) %>% na.omit() %>% as.data.frame()
data1 <- data1 %>% 
  mutate(ID1 = seq(1,nrow(data1),1))
names(data1) <- c("rownum", "ID","x", "other", "ID1")
#data1 <- data1 %>% mutate(ID = seq(1,nrow(data1),1))
list1 <- c("rhs", "lto", "lhs", "rto")


bim <- 0
doe <- 0

Match does not return list in order. Must break out match into each individual gait feature.

bim <- 0
doe <- 0
repeat{
  j<-length(data1$x)
  i <- tail(doe, n=1) 
foo <- match("rhs", data1$x[i:j])
doe <- foo+i
cu <- data1$ID1[doe]
bim <- append(bim, cu)
  i <- tail(doe, n=1) 
foo <- match("lto", data1$x[i:j])
doe <- foo+i
cu <- data1$ID1[doe]
bim <- append(bim, cu)
  i <- tail(doe, n=1) 
foo <- match("lhs", data1$x[i:j])
doe <- foo+i
cu <- data1$ID1[doe]
bim <- append(bim, cu)
  i <- tail(doe, n=1) 
foo <- match("rto", data1$x[i:j])
doe <- foo+i
cu <- data1$ID1[doe]
bim <- append(bim, cu)
  if(tail(bim, n=1) > 9386){#6470) { #7560) { #15924) {
    break
  }
}

trun <- c(1,2,4,5,6,7,9,10,11,12,16,17,21)
trun1 <- c(1,2,5,6,7,8,11,12,13,14,15,16,17,19,20,22,26,29,33 )

gait <- bim[2:11489]-1 
gait[1] <- 1

rownum <- as.data.frame(gait)

names(rownum) <- "ID1"

gait <- rownum %>% mutate(gait1 = "gait") #%>% mutate(step = rep_len(c("rhs", "lto", "lhs", "rto"), length(rownum)))

test <- data1 %>% left_join(gait, by = "ID1") %>% filter(gait1 == "gait") %>% pivot_wider(names_from = x, values_from = other)

test <- test %>% pivot_longer(cols = c("rhs", "lto", "lhs", "rto")) %>% filter(value == 1)

test1 <- test %>% dplyr::select(rownum, gait1, name)

df <- df %>% left_join(test1, by = "rownum")
head(test1)
## # A tibble: 6 x 3
##   rownum gait1 name 
##    <dbl> <chr> <chr>
## 1     66 gait  rhs  
## 2     69 gait  lto  
## 3    110 gait  lhs  
## 4    114 gait  rto  
## 5    117 gait  rhs  
## 6    121 gait  lto
head(df)
##       spec1x      spec2x spec3x        xdec spec1y      spec2y      spec3y
## 1  0.1353014 -0.30450842      0 -0.16920704      0  0.07651245  0.00000000
## 2 -0.0100473 -0.26462043      0 -0.27466773      0  0.01334185 -0.03472984
## 3  0.0000000  0.00000000      0  0.00000000      0  0.00000000 -0.11342998
## 4  0.0000000  0.14942648      0  0.14942648      0 -0.03043552 -0.03941623
## 5  0.0000000  0.07618343      0  0.07618343      0  0.00000000  0.00000000
## 6  0.0000000  0.00000000      0  0.00000000      0  0.00000000  0.00000000
##          ydec         x           y          z user exp index xwin ywin rownum
## 1  0.07651245 1.4208334 -0.34027778 -0.1250000    1   1  7496    1    1      1
## 2 -0.02138799 1.0027778 -0.20416667 -0.1083333    1   1  7497    1    1      2
## 3 -0.11342998 0.6833333 -0.06111112 -0.1083333    1   1  7498    0    1      3
## 4 -0.06985175 0.7333334 -0.08333334 -0.1208333    1   1  7499    1    1      4
## 5  0.00000000 0.9569445 -0.26388890 -0.1375000    1   1  7500    1    0      5
## 6  0.00000000 1.0500000 -0.40277780 -0.1444445    1   1  7501    0    0      6
##   xmax xmin ymax ymin zmax rlidx rlidy rhs lto rto lhs gait1 name
## 1   NA   NA   NA   NA   NA     1     1  NA  NA  NA  NA  <NA> <NA>
## 2   NA   NA   NA   NA   NA     1     1  NA  NA  NA  NA  <NA> <NA>
## 3   NA    1    1   NA   NA     2     1  NA  NA  NA  NA  <NA> <NA>
## 4   NA   NA   NA   NA   NA     3     1  NA  NA  NA  NA  <NA> <NA>
## 5   NA   NA   NA   NA   NA     3     2  NA  NA  NA  NA  <NA> <NA>
## 6    1   NA   NA   NA   NA     4     2  NA  NA  NA  NA  <NA> <NA>

Results after new gait cycle rules

We have our 10 dots. Beautiful.

expAll <- df %>% dplyr::filter(gait1 == "gait")

expAll <- left_join(df, expAll, by = "rownum")

head(expAll)
##     spec1x.x    spec2x.x spec3x.x      xdec.x spec1y.x    spec2y.x    spec3y.x
## 1  0.1353014 -0.30450842        0 -0.16920704        0  0.07651245  0.00000000
## 2 -0.0100473 -0.26462043        0 -0.27466773        0  0.01334185 -0.03472984
## 3  0.0000000  0.00000000        0  0.00000000        0  0.00000000 -0.11342998
## 4  0.0000000  0.14942648        0  0.14942648        0 -0.03043552 -0.03941623
## 5  0.0000000  0.07618343        0  0.07618343        0  0.00000000  0.00000000
## 6  0.0000000  0.00000000        0  0.00000000        0  0.00000000  0.00000000
##        ydec.x       x.x         y.x        z.x user.x exp.x index.x xwin.x
## 1  0.07651245 1.4208334 -0.34027778 -0.1250000      1     1    7496      1
## 2 -0.02138799 1.0027778 -0.20416667 -0.1083333      1     1    7497      1
## 3 -0.11342998 0.6833333 -0.06111112 -0.1083333      1     1    7498      0
## 4 -0.06985175 0.7333334 -0.08333334 -0.1208333      1     1    7499      1
## 5  0.00000000 0.9569445 -0.26388890 -0.1375000      1     1    7500      1
## 6  0.00000000 1.0500000 -0.40277780 -0.1444445      1     1    7501      0
##   ywin.x rownum xmax.x xmin.x ymax.x ymin.x zmax.x rlidx.x rlidy.x rhs.x lto.x
## 1      1      1     NA     NA     NA     NA     NA       1       1    NA    NA
## 2      1      2     NA     NA     NA     NA     NA       1       1    NA    NA
## 3      1      3     NA      1      1     NA     NA       2       1    NA    NA
## 4      1      4     NA     NA     NA     NA     NA       3       1    NA    NA
## 5      0      5     NA     NA     NA     NA     NA       3       2    NA    NA
## 6      0      6      1     NA     NA     NA     NA       4       2    NA    NA
##   rto.x lhs.x gait1.x name.x spec1x.y spec2x.y spec3x.y xdec.y spec1y.y
## 1    NA    NA    <NA>   <NA>       NA       NA       NA     NA       NA
## 2    NA    NA    <NA>   <NA>       NA       NA       NA     NA       NA
## 3    NA    NA    <NA>   <NA>       NA       NA       NA     NA       NA
## 4    NA    NA    <NA>   <NA>       NA       NA       NA     NA       NA
## 5    NA    NA    <NA>   <NA>       NA       NA       NA     NA       NA
## 6    NA    NA    <NA>   <NA>       NA       NA       NA     NA       NA
##   spec2y.y spec3y.y ydec.y x.y y.y z.y user.y exp.y index.y xwin.y ywin.y
## 1       NA       NA     NA  NA  NA  NA     NA    NA      NA     NA     NA
## 2       NA       NA     NA  NA  NA  NA     NA    NA      NA     NA     NA
## 3       NA       NA     NA  NA  NA  NA     NA    NA      NA     NA     NA
## 4       NA       NA     NA  NA  NA  NA     NA    NA      NA     NA     NA
## 5       NA       NA     NA  NA  NA  NA     NA    NA      NA     NA     NA
## 6       NA       NA     NA  NA  NA  NA     NA    NA      NA     NA     NA
##   xmax.y xmin.y ymax.y ymin.y zmax.y rlidx.y rlidy.y rhs.y lto.y rto.y lhs.y
## 1     NA     NA     NA     NA     NA      NA      NA    NA    NA    NA    NA
## 2     NA     NA     NA     NA     NA      NA      NA    NA    NA    NA    NA
## 3     NA     NA     NA     NA     NA      NA      NA    NA    NA    NA    NA
## 4     NA     NA     NA     NA     NA      NA      NA    NA    NA    NA    NA
## 5     NA     NA     NA     NA     NA      NA      NA    NA    NA    NA    NA
## 6     NA     NA     NA     NA     NA      NA      NA    NA    NA    NA    NA
##   gait1.y name.y
## 1    <NA>   <NA>
## 2    <NA>   <NA>
## 3    <NA>   <NA>
## 4    <NA>   <NA>
## 5    <NA>   <NA>
## 6    <NA>   <NA>
exp4plots <- expAll %>% filter(exp.x == c(1)) %>% group_by(exp.x) %>% slice(200:327)

ggplot(exp4plots, aes(rownum)) +
  #geom_line(aes(y = xdec, color = "xd")) + 
  #geom_line(aes(y = ydec, color = "yd")) + 
  geom_line(aes(y = x.x, color = "x")) + 
  geom_line(aes(y = y.x, color = "y")) + 
  geom_point(aes(y = rhs.y-1, color = "rhs")) +
  geom_point(aes(y = lhs.y-1, color = "lhs")) +
  geom_point(aes(y = rto.y-1, color = "rto")) +
  geom_point(aes(y = lto.y-1, color = "lto")) +
  #geom_line(aes(y = V4, color = "w")) + 
  labs(x = "Time", y = "Signal") +
  facet_wrap(vars(exp.x), scales = "free_x") 

  #scale_x_continuous(limits = c(min_xy, max_xy))

Gait Cycle Analysis

#gait <- read_csv("step_output_v1")
gait <- expAll
cycle <- gait %>% filter(name.x == "rhs") %>% 
  mutate(diff = rownum - lag(rownum, default = first(rownum))) %>% 
  dplyr::select(diff, exp.x) 

step <- gait %>% filter(name.x == "rhs" | name.x == "lhs") %>% 
  mutate(diff = rownum - lag(rownum, default = first(rownum))) %>% 
  dplyr::select(diff, exp.x)

lstep <- gait %>% filter(name.x == "lto" | name.x == "lhs") %>% 
  mutate(diff = rownum - lag(rownum, default = first(rownum))) %>% 
  dplyr::select(diff, exp.x, name.x) %>% 
  filter(row_number() %% 2 != 0)

lswing <- gait %>% filter(name.x == "lto" | name.x == "lhs") %>% 
  mutate(diff = rownum - lag(rownum, default = first(rownum))) %>% 
  dplyr::select(diff, exp.x, name.x, rownum) %>% 
  filter(row_number() %% 2 != 1)

rstep <- gait %>% filter(name.x == "rto" | name.x == "rhs") %>% 
  mutate(diff = rownum - lag(rownum, default = first(rownum))) %>% 
  dplyr::select(diff, exp.x, name.x) %>% 
  filter(row_number() %% 2 != 0)

rswing <- gait %>% filter(name.x == "rto" | name.x == "rhs") %>% 
  mutate(diff = rownum - lag(rownum, default = first(rownum))) %>% 
  dplyr::select(diff, exp.x, name.x, rownum) %>% 
  filter(row_number() %% 2 != 1)

gaitsetSkinny <- cbind(cycle[1:1627,], lstep[1:1627,1], lswing[1:1627,1], rstep[1:1627,1], rswing[1:1627,1])
names(gaitsetSkinny) <- c("cycle", "exp", "lstep", "lswing", "rstep", "rswing")
gaitsetSkinny %>% group_by(exp) %>%  dplyr::select(cycle, exp) %>% 
  skimr::skim()
Data summary
Name Piped data
Number of rows 1627
Number of columns 2
_______________________
Column type frequency:
numeric 1
________________________
Group variables exp

Variable type: numeric

skim_variable exp n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
cycle 1 0 1 59.74 17.91 0 54.25 56.0 59.00 118 ▁▁▇▁▁
cycle 2 0 1 67.16 24.15 49 57.50 59.0 61.50 169 ▇▁▁▁▁
cycle 3 0 1 69.63 49.03 47 56.00 57.0 62.00 318 ▇▁▁▁▁
cycle 4 0 1 77.56 38.31 49 56.00 57.0 107.00 219 ▇▃▁▁▁
cycle 5 0 1 82.15 39.83 52 61.00 64.0 104.75 226 ▇▁▁▁▁
cycle 6 0 1 82.86 39.35 51 58.00 60.0 117.75 174 ▇▁▂▁▁
cycle 7 0 1 57.37 23.38 50 51.25 52.5 53.00 187 ▇▁▁▁▁
cycle 8 0 1 53.42 11.38 49 50.00 51.0 52.00 114 ▇▁▁▁▁
cycle 9 0 1 77.08 35.69 50 58.00 61.0 92.50 208 ▇▁▂▁▁
cycle 10 0 1 65.36 40.33 50 54.00 54.0 56.00 259 ▇▁▁▁▁
cycle 11 0 1 69.38 23.95 50 56.00 57.0 65.00 124 ▇▁▁▁▂
cycle 12 0 1 64.47 20.32 51 56.00 57.0 58.00 123 ▇▁▁▁▁
cycle 13 0 1 64.90 26.70 46 51.00 56.0 63.50 170 ▇▁▁▁▁
cycle 14 0 1 76.38 47.11 44 53.00 60.0 82.00 266 ▇▂▁▁▁
cycle 15 0 1 68.81 41.54 45 54.00 56.0 61.75 247 ▇▁▁▁▁
cycle 16 0 1 63.46 25.81 50 52.00 53.5 56.75 162 ▇▁▁▁▁
cycle 17 0 1 59.00 17.60 42 51.00 54.0 61.50 111 ▇▂▁▁▁
cycle 18 0 1 58.29 24.84 42 50.00 53.0 54.00 170 ▇▁▁▁▁
cycle 19 0 1 61.93 23.18 51 55.00 56.0 57.00 167 ▇▁▁▁▁
cycle 22 0 1 158.71 384.29 54 59.75 62.0 114.25 1956 ▇▁▁▁▁
cycle 23 0 1 65.56 34.64 56 57.00 58.0 58.25 248 ▇▁▁▁▁
cycle 24 0 1 61.26 15.33 49 53.00 59.0 62.00 123 ▇▁▁▁▁
cycle 25 0 1 76.00 38.61 46 52.50 57.0 90.50 177 ▇▁▂▁▁
cycle 26 0 1 60.62 23.81 46 53.00 55.0 61.00 184 ▇▁▁▁▁
cycle 27 0 1 57.25 27.50 41 50.00 51.0 54.00 197 ▇▁▁▁▁
cycle 28 0 1 83.96 44.28 52 59.00 66.5 83.50 222 ▇▁▂▁▁
cycle 30 0 1 181.81 438.48 55 58.00 59.0 116.00 2088 ▇▁▁▁▁
cycle 31 0 1 71.04 41.20 51 55.00 57.0 59.00 230 ▇▁▁▁▁
cycle 32 0 1 72.12 30.39 46 56.00 57.0 65.00 168 ▇▁▂▁▁
cycle 33 0 1 67.62 32.31 45 54.25 56.5 64.00 204 ▇▁▁▁▁
cycle 34 0 1 74.03 31.59 55 61.25 63.0 64.75 188 ▇▁▁▁▁
cycle 35 0 1 65.97 16.58 58 59.00 61.0 63.00 129 ▇▁▁▁▁
cycle 36 0 1 73.38 34.28 0 49.75 56.0 103.50 149 ▁▇▁▅▁
cycle 37 0 1 61.65 20.11 47 49.50 54.0 60.50 118 ▇▁▁▁▁
cycle 38 0 1 73.85 54.14 49 51.25 53.0 56.75 264 ▇▁▁▁▁
cycle 39 0 1 59.41 27.99 41 49.00 50.0 53.00 150 ▇▁▁▁▁
cycle 40 0 1 68.89 43.61 46 55.50 57.0 59.50 275 ▇▁▁▁▁
cycle 41 0 1 59.75 23.89 48 53.00 54.0 55.00 177 ▇▁▁▁▁
cycle 42 0 1 65.52 31.59 43 52.00 53.0 61.00 192 ▇▁▁▁▁
cycle 43 0 1 56.75 24.03 41 49.00 51.0 53.50 149 ▇▁▁▁▁
cycle 44 0 1 59.68 26.98 42 51.00 52.0 54.00 171 ▇▁▁▁▁
cycle 45 0 1 51.97 10.41 45 50.00 50.0 51.00 106 ▇▁▁▁▁
cycle 46 0 1 80.84 48.42 51 59.00 61.0 67.00 279 ▇▂▁▁▁
cycle 47 0 1 75.07 43.49 55 59.75 61.0 65.00 272 ▇▁▁▁▁
cycle 48 0 1 85.76 33.75 51 61.00 63.0 124.00 147 ▇▁▁▃▁
cycle 49 0 1 79.67 39.26 54 58.50 62.0 67.50 195 ▇▁▂▁▁
cycle 50 0 1 152.88 101.24 58 66.75 130.5 181.75 393 ▇▇▁▁▂
cycle 52 0 1 158.71 490.02 47 55.75 59.5 64.25 2655 ▇▁▁▁▁
cycle 53 0 1 68.97 32.58 51 58.25 59.5 62.75 219 ▇▁▁▁▁
cycle 54 0 1 69.74 34.68 49 54.50 56.0 61.00 170 ▇▁▁▁▁
cycle 55 0 1 64.47 19.87 48 54.00 56.0 61.75 113 ▇▁▁▁▂
cycle 56 0 1 76.30 40.19 48 56.50 58.0 68.00 195 ▇▁▂▁▁
cycle 57 0 1 60.50 17.10 46 53.25 55.0 60.50 111 ▇▂▁▁▁
cycle 58 0 1 66.30 32.26 47 56.00 57.0 58.00 209 ▇▁▁▁▁
cycle 59 0 1 63.64 27.53 51 56.00 57.0 58.00 192 ▇▁▁▁▁
cycle 60 0 1 92.71 40.45 54 61.75 64.0 130.25 184 ▇▁▂▂▁
cycle 61 0 1 85.08 46.32 51 61.00 65.0 108.00 255 ▇▂▁▁▁
gaitsetSkinny %>% group_by(exp) %>%  dplyr::select(cycle, exp) %>% 
  filter(cycle < 100 ) %>% filter(cycle > 0 ) %>%  
  skimr::skim()
Data summary
Name Piped data
Number of rows 1376
Number of columns 2
_______________________
Column type frequency:
numeric 1
________________________
Group variables exp

Variable type: numeric

skim_variable exp n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
cycle 1 0 1 56.82 7.37 46 54.00 56.0 58.00 99 ▇▆▁▁▁
cycle 2 0 1 58.82 3.11 49 57.00 59.0 60.00 67 ▁▁▇▃▁
cycle 3 0 1 57.48 5.28 47 56.00 57.0 62.00 67 ▂▁▇▃▂
cycle 4 0 1 58.56 10.61 49 55.00 56.0 57.00 99 ▇▁▁▁▁
cycle 5 0 1 61.53 5.11 52 59.50 62.0 64.50 71 ▃▂▇▇▂
cycle 6 0 1 58.40 3.74 51 58.00 58.0 60.00 67 ▂▁▇▂▁
cycle 7 0 1 52.39 1.32 50 51.00 52.0 53.00 55 ▇▆▇▂▂
cycle 8 0 1 51.69 4.73 49 50.00 51.0 52.00 78 ▇▁▁▁▁
cycle 9 0 1 58.74 3.71 50 57.50 59.0 61.00 64 ▂▁▇▇▅
cycle 10 0 1 54.36 1.96 50 54.00 54.0 55.00 58 ▂▂▇▆▂
cycle 11 0 1 58.00 8.18 50 56.00 57.0 57.00 93 ▇▁▁▁▁
cycle 12 0 1 56.69 1.72 51 56.00 57.0 58.00 60 ▁▁▇▇▁
cycle 13 0 1 55.70 6.66 46 49.50 55.0 63.00 65 ▆▃▃▁▇
cycle 14 0 1 58.57 12.78 44 52.00 55.0 63.00 96 ▇▇▁▁▂
cycle 15 0 1 56.17 4.45 45 54.00 55.0 57.50 65 ▁▂▇▁▂
cycle 16 0 1 55.61 9.90 50 52.00 53.0 55.00 99 ▇▁▁▁▁
cycle 17 0 1 55.50 11.89 42 50.75 53.5 59.25 97 ▇▇▁▁▁
cycle 18 0 1 51.11 4.67 42 48.50 52.0 53.00 61 ▂▂▇▂▁
cycle 19 0 1 56.04 2.34 51 55.00 56.0 57.00 62 ▂▃▇▂▁
cycle 22 0 1 60.31 3.55 54 58.00 60.5 62.00 69 ▂▇▇▁▁
cycle 23 0 1 57.97 1.99 56 57.00 58.0 58.00 65 ▇▇▁▁▁
cycle 24 0 1 57.55 5.16 49 53.00 59.0 61.00 68 ▆▅▇▅▂
cycle 25 0 1 57.59 11.84 46 51.00 56.0 59.00 99 ▇▅▁▁▁
cycle 26 0 1 55.44 5.17 46 53.00 55.0 59.50 64 ▆▇▇▃▇
cycle 27 0 1 51.10 5.20 41 50.00 51.0 52.75 61 ▂▁▇▁▂
cycle 28 0 1 61.44 6.40 52 56.75 62.0 67.00 71 ▆▅▅▇▅
cycle 30 0 1 57.58 1.68 55 56.00 58.5 59.00 59 ▂▂▁▁▇
cycle 31 0 1 56.09 2.94 51 54.00 56.0 58.50 62 ▆▂▇▅▂
cycle 32 0 1 56.68 5.19 46 55.00 57.0 58.00 65 ▂▁▇▂▂
cycle 33 0 1 58.26 10.34 45 54.00 56.0 62.00 98 ▇▇▁▁▁
cycle 34 0 1 62.50 2.66 55 61.00 63.0 63.00 68 ▁▂▆▇▂
cycle 35 0 1 62.03 6.59 58 59.00 61.0 63.00 95 ▇▁▁▁▁
cycle 36 0 1 57.47 14.51 46 49.00 54.0 57.00 95 ▇▂▁▁▁
cycle 37 0 1 54.81 9.22 47 49.00 54.0 55.50 93 ▇▂▁▁▁
cycle 38 0 1 52.81 2.32 49 51.00 52.0 54.00 58 ▂▇▅▂▂
cycle 39 0 1 50.12 3.84 41 49.00 50.0 52.25 60 ▁▂▇▃▁
cycle 40 0 1 57.00 4.23 46 54.00 57.0 58.25 66 ▁▅▇▃▂
cycle 41 0 1 55.41 6.67 48 53.00 54.0 55.00 86 ▇▂▁▁▁
cycle 42 0 1 52.96 4.48 43 51.00 52.0 54.25 63 ▁▃▇▁▂
cycle 43 0 1 50.80 5.52 41 47.00 51.0 53.00 61 ▂▂▇▁▂
cycle 44 0 1 53.00 10.15 42 51.00 52.0 53.00 97 ▇▂▁▁▁
cycle 45 0 1 50.10 2.09 45 50.00 50.0 51.00 57 ▁▁▇▁▁
cycle 46 0 1 59.79 4.14 51 58.00 60.0 62.00 67 ▂▂▇▅▅
cycle 47 0 1 60.92 2.98 55 59.00 61.0 62.25 67 ▂▃▇▂▃
cycle 48 0 1 60.46 5.95 51 59.00 61.0 63.00 71 ▃▁▇▁▂
cycle 49 0 1 60.43 3.92 54 58.00 61.0 63.00 69 ▅▃▇▃▁
cycle 50 0 1 63.40 3.58 58 62.00 64.0 66.00 67 ▃▁▃▃▇
cycle 52 0 1 57.75 5.58 47 55.00 58.0 61.25 67 ▆▂▇▇▅
cycle 53 0 1 59.52 3.51 51 58.00 59.0 61.00 67 ▂▃▇▅▃
cycle 54 0 1 56.13 3.38 49 54.00 55.0 57.50 63 ▁▅▇▂▃
cycle 55 0 1 57.38 8.09 48 54.00 56.0 60.50 91 ▇▃▁▁▁
cycle 56 0 1 57.29 5.82 48 55.00 57.0 60.00 68 ▅▂▇▂▂
cycle 57 0 1 55.19 5.69 46 52.00 55.0 57.50 66 ▅▃▇▂▃
cycle 58 0 1 56.50 2.32 47 56.00 57.0 58.00 59 ▁▁▁▃▇
cycle 59 0 1 56.73 2.03 51 56.00 57.0 58.00 60 ▁▁▆▇▃
cycle 60 0 1 61.36 3.69 54 60.25 62.0 62.75 69 ▂▂▇▁▁
cycle 61 0 1 62.00 5.90 51 58.50 62.5 67.25 70 ▃▅▆▆▇
gaitset_simple <- gaitsetSkinny %>% 
#filter(!exp %in% c(11, 13, 17, 22, 28, 30, 32, 36, 38, 42, 44, 46, 48, 50, 54, 56, 60)) %>% 
  filter(lswing < 50) %>%
  filter(lstep < 50) %>%
  filter(rswing < 50) %>%
  filter(rstep < 50) %>% 
  filter(cycle < 100)
  
gaitset_simple %>% 
  pivot_longer(c("cycle", "lstep", "lswing", "rstep", "rswing")) %>% 
  ggplot(aes(group=exp, value)) +
  geom_boxplot() +
  facet_wrap(vars(name))

gaitsetSkinny %>% filter_all(all_vars(. > 10)) %>% filter_all(all_vars(. < 100)) %>% 
  count(exp) %>% filter(n < 5)
##    exp n
## 1   30 3
## 2   38 3
## 3   39 3
## 4   43 3
## 5   46 4
## 6   47 4
## 7   48 4
## 8   50 1
## 9   54 4
## 10  55 4
## 11  60 1
## 12  61 1
gaitfinal <- gaitsetSkinny %>% 
  filter(lswing < 50) %>%
  filter(lstep < 50) %>%
  filter(rswing < 50) %>%
  filter(rstep < 50) %>%
  filter_all(all_vars(. < 100)) %>%
  filter_all(all_vars(. > 10)) %>%
  mutate(stepdiff = rstep - lstep) %>% 
  mutate(swingdiff = rswing - lswing) 

#gaitfinalhalf <- gaitfinal

gaitfinal %>% 
  pivot_longer(c("cycle", "lstep", "lswing", "rstep", "rswing", "swingdiff", "stepdiff")) %>% 
  ggplot(aes(group=exp, value)) +
  geom_boxplot() +
  facet_wrap(vars(name))

the_summary <- gaitfinal %>% group_by(exp) %>% skimr::skim()
summary <- the_summary %>% dplyr::select(skim_variable, exp, numeric.mean, numeric.sd) %>% pivot_wider(names_from = c(skim_variable), values_from = c(numeric.mean, numeric.sd))
skimr::skim(summary)
Data summary
Name summary
Number of rows 43
Number of columns 15
_______________________
Column type frequency:
numeric 15
________________________
Group variables None

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
exp 0 1.00 35.60 14.67 11.00 23.50 37.00 47.50 59.00 ▇▆▆▇▇
numeric.mean_cycle 0 1.00 56.41 5.17 45.00 53.00 56.70 60.19 67.00 ▂▆▇▇▃
numeric.mean_lstep 0 1.00 27.47 6.26 15.00 24.26 28.00 29.53 43.33 ▂▃▇▁▁
numeric.mean_lswing 0 1.00 29.06 5.57 13.00 25.27 27.82 31.68 42.00 ▁▃▇▅▂
numeric.mean_rstep 0 1.00 28.18 7.61 14.27 22.61 28.75 31.92 47.50 ▃▇▇▃▂
numeric.mean_rswing 0 1.00 29.57 7.88 12.00 23.00 30.00 36.47 41.27 ▁▇▅▆▇
numeric.mean_stepdiff 0 1.00 0.72 10.39 -15.45 -6.08 -0.60 5.95 29.00 ▅▇▇▂▁
numeric.mean_swingdiff 0 1.00 0.50 8.70 -22.50 -5.51 0.75 7.06 14.00 ▁▂▇▇▆
numeric.sd_cycle 4 0.91 5.06 4.68 0.00 2.30 3.81 6.11 20.83 ▇▅▁▁▁
numeric.sd_lstep 4 0.91 8.72 3.14 0.00 6.81 8.77 10.34 14.36 ▁▃▇▇▃
numeric.sd_lswing 4 0.91 9.59 3.32 3.54 7.27 9.33 11.58 19.09 ▃▇▇▂▁
numeric.sd_rstep 4 0.91 4.05 3.25 0.00 1.92 3.29 5.36 15.39 ▇▆▃▁▁
numeric.sd_rswing 4 0.91 4.76 2.95 0.71 2.51 3.78 7.28 10.78 ▇▇▅▅▃
numeric.sd_stepdiff 4 0.91 9.65 3.91 0.00 7.54 8.71 12.05 18.51 ▁▂▇▃▂
numeric.sd_swingdiff 4 0.91 10.64 3.87 4.95 7.57 9.98 13.13 18.68 ▇▇▅▃▂
summary_total <- summary %>%  
  dplyr::select(exp, numeric.mean_cycle, numeric.mean_lstep, numeric.mean_lswing, numeric.mean_rstep, numeric.mean_rswing,  numeric.mean_swingdiff, numeric.mean_stepdiff, numeric.sd_cycle, numeric.sd_lstep, numeric.sd_lswing, numeric.sd_rstep, numeric.sd_rswing, numeric.sd_swingdiff, numeric.sd_stepdiff) %>% 
  mutate(swingratio = numeric.mean_swingdiff/((numeric.mean_lswing + numeric.mean_rswing)/2)) %>% 
  mutate(stepratio = numeric.mean_stepdiff/((numeric.mean_lstep + numeric.mean_rstep)/2)) %>% 
  round(2)
summary_final <- summary_total %>%  
  filter(abs(stepratio) < .2) %>% 
  filter(abs(swingratio) < .2) 
names(summary_final) <- c("exp", "cycle_m", "lstep_m", "lswing_m", "rstep_m", "rswing_m", "swingdiff_m", "stepdiff_m", "cycle_sd", "lstep_sd", "lswing_sd", "rstep_sd", "rswing_sd", "swingdiff_sd", "stepdiff_sd", "swingratio", "stepratio")
summary_final$exp <- as.factor(summary_final$exp)
summary_final[,2:15] <- summary_final[,2:15]/50
summary_final %>% 
  dplyr::select(exp, cycle_m, lstep_m, lswing_m, rstep_m, rswing_m, swingdiff_m, stepdiff_m) %>% 
  pivot_longer(c("cycle_m", "lstep_m", "lswing_m", "rstep_m", "rswing_m", "swingdiff_m", "stepdiff_m")) %>%
  ggplot(aes(exp, value, color = name)) +
  geom_point(size = 3) +
  coord_flip() +
  scale_y_continuous()

#40 and 41 are the same person, 42 and 43 are also the same person
#11 our of 30 subjects passed symetry criteria
summary_final
## # A tibble: 13 x 17
##    exp   cycle_m lstep_m lswing_m rstep_m rswing_m swingdiff_m stepdiff_m
##    <fct>   <dbl>   <dbl>    <dbl>   <dbl>    <dbl>       <dbl>      <dbl>
##  1 13      1.04    0.576    0.748   0.5      0.648     -0.1       -0.076 
##  2 18      0.973   0.357    0.753   0.387    0.753      0          0.03  
##  3 22      1.2     0.58     0.695   0.575    0.79       0.095     -0.005 
##  4 24      1.06    0.632    0.516   0.552    0.6        0.084     -0.08  
##  5 27      0.998   0.458    0.63    0.388    0.67       0.04      -0.07  
##  6 28      1.21    0.6      0.636   0.588    0.708      0.072     -0.012 
##  7 35      1.22    0.542    0.63    0.498    0.73       0.1       -0.045 
##  8 37      1.21    0.472    0.59    0.565    0.488     -0.102      0.0924
##  9 40      1.14    0.587    0.487   0.693    0.5        0.0134     0.107 
## 10 41      1.07    0.574    0.509   0.634    0.451     -0.0572     0.06  
## 11 42      1.04    0.564    0.452   0.464    0.54       0.088     -0.1   
## 12 43      0.9     0.607    0.533   0.58     0.467     -0.0666    -0.0266
## 13 54      1.15    0.585    0.6     0.6      0.615      0.015      0.015 
## # ... with 9 more variables: cycle_sd <dbl>, lstep_sd <dbl>, lswing_sd <dbl>,
## #   rstep_sd <dbl>, rswing_sd <dbl>, swingdiff_sd <dbl>, stepdiff_sd <dbl>,
## #   swingratio <dbl>, stepratio <dbl>
skimr::skim(summary_final)
Data summary
Name summary_final
Number of rows 13
Number of columns 17
_______________________
Column type frequency:
factor 1
numeric 16
________________________
Group variables None

Variable type: factor

skim_variable n_missing complete_rate ordered n_unique top_counts
exp 0 1 FALSE 13 13: 1, 18: 1, 22: 1, 24: 1

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
cycle_m 0 1 1.09 0.10 0.90 1.04 1.07 1.20 1.22 ▂▃▇▃▇
lstep_m 0 1 0.55 0.08 0.36 0.54 0.58 0.59 0.63 ▁▁▁▅▇
lswing_m 0 1 0.60 0.10 0.45 0.52 0.60 0.64 0.75 ▆▃▇▂▆
rstep_m 0 1 0.54 0.09 0.39 0.50 0.56 0.59 0.69 ▃▆▃▇▃
rswing_m 0 1 0.61 0.12 0.45 0.50 0.62 0.71 0.79 ▇▂▆▃▆
swingdiff_m 0 1 0.01 0.07 -0.10 -0.06 0.01 0.08 0.10 ▅▂▅▂▇
stepdiff_m 0 1 -0.01 0.07 -0.10 -0.07 -0.01 0.03 0.11 ▇▃▆▃▃
cycle_sd 0 1 0.12 0.06 0.04 0.08 0.11 0.13 0.29 ▆▇▁▁▁
lstep_sd 0 1 0.21 0.04 0.15 0.18 0.20 0.22 0.29 ▇▆▆▂▃
lswing_sd 0 1 0.21 0.04 0.14 0.20 0.22 0.24 0.28 ▆▁▇▇▃
rstep_sd 0 1 0.12 0.08 0.02 0.07 0.11 0.15 0.31 ▇▆▃▂▂
rswing_sd 0 1 0.14 0.06 0.03 0.11 0.13 0.19 0.22 ▃▂▇▃▇
swingdiff_sd 0 1 0.23 0.07 0.14 0.20 0.24 0.26 0.36 ▆▆▇▂▃
stepdiff_sd 0 1 0.23 0.08 0.11 0.17 0.20 0.27 0.37 ▃▇▃▃▃
swingratio 0 1 0.02 0.13 -0.19 -0.12 0.03 0.13 0.18 ▆▁▅▂▇
stepratio 0 1 -0.02 0.13 -0.19 -0.14 -0.02 0.08 0.18 ▇▂▇▃▃
summary_final %>% 
  mutate(rownum = row_number()) %>% 
  filter(abs(stepratio) < .1) 
## # A tibble: 6 x 18
##   exp   cycle_m lstep_m lswing_m rstep_m rswing_m swingdiff_m stepdiff_m
##   <fct>   <dbl>   <dbl>    <dbl>   <dbl>    <dbl>       <dbl>      <dbl>
## 1 18      0.973   0.357    0.753   0.387    0.753      0          0.03  
## 2 22      1.2     0.58     0.695   0.575    0.79       0.095     -0.005 
## 3 28      1.21    0.6      0.636   0.588    0.708      0.072     -0.012 
## 4 35      1.22    0.542    0.63    0.498    0.73       0.1       -0.045 
## 5 43      0.9     0.607    0.533   0.58     0.467     -0.0666    -0.0266
## 6 54      1.15    0.585    0.6     0.6      0.615      0.015      0.015 
## # ... with 10 more variables: cycle_sd <dbl>, lstep_sd <dbl>, lswing_sd <dbl>,
## #   rstep_sd <dbl>, rswing_sd <dbl>, swingdiff_sd <dbl>, stepdiff_sd <dbl>,
## #   swingratio <dbl>, stepratio <dbl>, rownum <int>
kclusts <-
  tibble(k = 1:9) %>%
  mutate(
    kclust = map(k, ~ kmeans(summary_final[2:17], .x)),
    glanced = map(kclust, broom::glance),
  )

kclusts %>%
  unnest(cols = c(glanced)) %>%
  ggplot(aes(k, tot.withinss)) +
  geom_line(alpha = 0.5, size = 1.2, color = "midnightblue") +
  geom_point(size = 2, color = "midnightblue")

k <- kmeans(summary_final[2:17], 2, iter.max = 10, nstart = 10)
Clusters_for_Summary_Phases_of_Gait<- summary_final %>%  
  dplyr::select(exp, cycle_m, cycle_sd, lstep_m, rstep_m, lswing_m, rswing_m, swingdiff_m, swingratio, stepdiff_m, stepratio)

cluster::clusplot(Clusters_for_Summary_Phases_of_Gait[2:11], k$cluster, color=T, shade=F, labels=3, lines=0, plotchar=T, col.p = k$cluster, main = "Clusters for Summary Phases of Gait")

2,3,6,7,12,13 had less than 10% threshold 8,9,10,12 are group 1 1,2,3,4,5,6,7,11,13 Group 1 avg threshold 0.0166 Group 2 avg threshold -0.005

klustR::pacoplot(Clusters_for_Summary_Phases_of_Gait[2:11], k$cluster)
klustR::pcplot(Clusters_for_Summary_Phases_of_Gait[2:11], k$cluster,  barColor = "red",
       colorScheme = c("red", "green", "orange", "blue", "yellow"),
       labelSizes = list(yaxis = 20, yticks = 15, tooltip = 25),
       pcGridlines = TRUE, barGridlines = TRUE)
myPCA <- prcomp(Clusters_for_Summary_Phases_of_Gait[2:11], rank. = 2)

library(reshape2)
## 
## Attaching package: 'reshape2'
## The following objects are masked from 'package:data.table':
## 
##     dcast, melt
## The following object is masked from 'package:tidyr':
## 
##     smiths
melted <- melt(myPCA$rotation[,1:2])
melted %>% 
  mutate(Var1 = tidytext::reorder_within(Var1, value, Var2)) %>% 
ggplot(aes(x=Var1, y=value, fill=Var1)) +
        theme(legend.position = "none", axis.text.x = element_text(angle=90), 
              axis.ticks.x = element_blank()) + 
        xlab("Summary of Phases of Gait") +
        ylab("Relative importance in each principle component") +
        ggtitle("Variables in Principal Component Analysis") +
        geom_bar(stat="identity") +
        facet_wrap(~Var2, scales = "free") +
        tidytext::scale_x_reordered()