1 Motivation

Several approaches based on machine learning techniques have been proposed to increase MEMS inertial sensors’ performances. Generally, such approaches have assumed that errors cannot be explained by linear models, as well as the observed error probability distribution is not pure Gaussian. However, according to manufacturers the error non-linearity observed in MEMS sensors can be considered in most cases negligible. Therefore, it seems necessary to evaluate linear models in the context of MEMS inertial sensor errors. In particular the application of Time delayed Multiple Linear Regression models. (TD-MLR)

2 Main Hypotesis

Simpler Machine learning linear models are capable of compensating errors observed in MEMS sensors.

3 Models considered

Three models are considered:

  1. Time Delayed - Multi Linear Regression (TD-MLR) : A simple linear model
  2. Multi Layer Perceptron (MLP), a non linear model
  3. Moving Average (MA): the common strategy used in navigation systems for dealing with noise

A brief description of each one of the models is presented in the following subsections.

3.1 Multi Linear Regression Mode (MLR).

A TD-MLR model for representing the relationship between an explanatory variable \(x\) and a response variable \(y\) can be written as,

\[y_t = \alpha + \beta_t L^q( x_{t}) + \epsilon_{t}\]

where, the subscript \(t\) refers to the \(t^{th}\) time unit in the population of size \(M \in \mathbb{N}\). \(y_t\) denotes the observed response for experimental time unit \(t\). \(\epsilon_t\) and \(\alpha\) are respectively the residual error and the intercept terms for the same time unit \(t\). The expression \(L^q( x_{t})\) represents a tapped delay line for the \(x_t\) variable. A tapped delay line is a well-known method to model non-stationary time series implemented as a short-term memory with \(q\) unit delay operators

3.2 Multi Layer Perceptron (MLP)

(From Wikipedia) A multilateral perceptron (MLP) is a feed forward artificial neural network model that maps sets of input data onto a set of appropriate outputs. An MLP consists of multiple layers of nodes in a directed graph, with each layer fully connected to the next one. Except for the input nodes, each node is a neuron (or processing element) with a nonlinear activation function. MLP utilizes a supervised learning technique called back-propagation for training the network.[1][2] MLP is a modification of the standard linear perceptron and can distinguish data that are not linearly separable.

3.3 Moving Average (MA)

The usual technique for dealing with correlated noise consists of applying a Moving Average (MA) filter. The MA filter operates by averaging a predefined number of points from the input signal to produce each point in the output signal. As its name implies, a MA is an average that moves along the input signal values. An old value is dropped as a new value comes available. The number of values to keep will depend of the size of the time window considered. More formally, given an input signal \(x\) of size \(M\) defined as \({x_t}_{(t=1)}^{M}\), a MA with a time window of size \(q\) will output a new signal \({y_t}_{(t=1)}^{M-q}\) defined from the \(x_i\) by taking the arithmetic mean of sub-sequences of \(q\) terms,

\[y_t = \frac{1}{q} \sum_{s=0}^{M-q+1} x_{t-s}\]

4 Experiments

4.1 Dataset Description

The performance of TD-MLR models is evaluated on the well-known Melbourne dataset , which was kindly provided by research members from The Ohio State University (OSU) and the University of Melbourne. Since its publication in 2011, the Melbourne dataset has become a reference in the research field of MEMS inertial sensors.

The dataset contains measures provided by a mobile platform that includes various grades of IMUs, several GPS receivers and data-logging software. The IMUs included in the mobile platform varies from low-quality MEMS sensors to high-quality, navigation-grade sensors. The information provided by IMUs were logged during a predefined trajectory performed by a ground vehicle. As shown in Fig. , the trajectory has two well-defined stretches.

  trajectory1=read.csv("/home/harpo/Dropbox/shared/MEMS-ANN/datasets/dgps_T1.txt",header=F)
  trajectory2=read.csv("/home/harpo/Dropbox/shared/MEMS-ANN/datasets/dgps_T2.txt",header=F)
  par(mfrow=c(1,2))
  plot(trajectory1$V1,trajectory1$V2,type="l",lwd=4,ylab="Latitude [degree]",xlab="Longitude [degree]",main="Stretch 1 (T1)",col="orange",cex=1.2)
  plot(trajectory2$V1,trajectory2$V2,type="l",lwd=4,ylab="Latitude [degree]",xlab="Longitude [degree]",main="Stretch 2 (T2)",col="orange",cex=1.2)

In the first stretch (T1, for short) the vehicle performs several 8-pattern loops at various velocities and accelerations in a parking area at OSU for about 16 minutes. Then, in the second stretch (T2) the vehicle is driven on an internal OSU road for an 8 minutes trip. Notice that T1 has more rich dynamics than T2. The complete dataset covers about 24 minutes.

In the present work, four MEMS IMUs are considered:

  1. Gladiator Landmark 10 (Gladiator).
  2. Xbow IMU400CD (Crossbow)
  3. Xsens (xsns1)
  4. Xtal (xtal)

These four IMUs represent MEMS inertial sensors of different grades, Gladiator is a low quality IMU very sensible to noise, while the remaining are mid-range quality IMUS with variable sensor characteristics . A fifth Non-MEMS IMU is also considered as reference. Namely, the Honeywell H764G-1 (Honeywell), a well-established navigation-grade IMU in use on most military aircraft

4.2 Finding the optimal models

TD-MLR and MLP models are built for each one of the available sensors (i.e AccX, AccY, GyroZ). The models are built using the response observations (i.e. \(y_t\)) from nav-grade Honeywell IMU while the explanatory variables (i.e. \(X_t\)) will consist of time delayed values coming from the corresponding low and mid quality IMUs sensors (i.e Gladiator, Xbow, Xtal and Xsens).

Either for TD-MLR and for MA models, the optimal size of \(q\) should be determined. The strategy followed in this work consists of starting with a very short \(q\) value and incrementing it by one period. For each new \(q\) value the Root Square Mean Error (RMSE) of the resulting model is calculated. For improving RSME estimate, a k-fold cross validation with \(k=10\) is used to evaluate the models for every new lag term. The process is repeated until \(q\) reaches a predefined and sufficiently large value. Then, the optimal value for \(q\) is determined by the model with the lower RSME.

In the case of MLP, not only \(q\) but also the different number of hidden neurons \(h\) are evaluated. In this case, a similar approach is followed for a subset of the of possible network typologies (i.e. \(h\)), again a k-fold cross validation with \(k=10\) is used to evaluate the performance of every \(q\) for each one of the selected \(h\). Similarly to previous models, the model with the lower RMSE is used for determining the optimal values of \(q\) and \(h\)

Following standard machine learning methodology, the models will be adjusted and evaluated using data from Stretch T1. Then, after the optimal lag length (\(q\)) for each model had been selected, an independent evaluation on stretch T2 is conducted in order to get a better estimation of the error on unseen data.

4.2.1 Results for the X1 (AccX) sensor

The following box plot shows the distribution for the selected models according to the strategy described in the previous section. The plot shows the RSME distributions for sensor X1 provided by TD-MLR, MA and MLP considering different values for \(h\) (i.e. topologies)

# Select the models with the smallest RMSE
best_models=group_by(results,model,imu,timedelay) %>% summarise(mean=mean(X1),sd=sd(X1)) %>% filter(mean==min(mean)) %>% arrange(desc(imu))
# adding the desaggregated lag info
bw_data=inner_join(best_models,results, by=c("timedelay","imu","model"))
bwplot(X1~model|imu,groups=timedelay,data=bw_data,scale='free'
       ,auto.key = F
       ,panel = panel.superpose
       ,panel.groups=function(x,y,group.number,...){
         xt <- x[x==min(x)] # find latest year
         yt <- y[y==min(y)] # find value at latest year 
         panel.text(x,yt,labels=levels(factor(bw_data$timedelay))[group.number],pos=1,...)
         panel.bwplot(x,y,...)
             },ylab="X1"
         )

The plot shows also the value of \(q\) for all the models. In the case of the models based on MLP, only the one with the lower RMSE are considered during the testing stage. In this case the those models are:

mlp100 for Gladiator with \(q\) equals to 100

mlp40 for Crossbow with \(q\) equals to 40

mlp80 for Xsens with \(q\) equals to 40

mlp30 for Xtal with \(q\) equals to 10

4.2.2 Results for the X2 (AccY) sensor

The following box plot shows the distribution for the selected models according to the strategy described in the previous section. The plot shows the RSME distributions for sensor X2 provided by TD-MLR, MA and MLP considering different values for \(h\) (i.e typologies)

# Select the models with the smallest RMSE
best_models=group_by(results,model,imu,timedelay) %>% summarise(mean=mean(X2),sd=sd(X2)) %>% filter(mean==min(mean)) %>% arrange(desc(imu))
# adding the desaggregated lag info
bw_data=inner_join(best_models,results, by=c("timedelay","imu","model"))
bwplot(X2~model|imu,groups=timedelay,data=bw_data,scale='free'
       ,auto.key = F
       ,panel = panel.superpose
       ,panel.groups=function(x,y,group.number,...){
         xt <- x[x==min(x)] # find latest year
         yt <- y[y==min(y)] # find value at latest year 
         panel.text(x,yt,labels=levels(factor(bw_data$timedelay))[group.number],pos=1,...)
         panel.bwplot(x,y,...)
             },ylab="X2"
         )

The plot shows also the value of \(q\) for all the models. In the case of the models based on MLP, only the one with the lower RMSE are considered during the testing stage. In this case the those models are:

mlp20 for Gladiator with \(q\) equals to 100

mlp5 for Crossbow with \(q\) equals to 15

mlp60 for Xsens with \(q\) equals to 5

mlp5 for Xtal with \(q\) equals to 15

4.2.3 Results for the X6 (GiroZ) sensor

The following box plot shows the distribution for the selected models according to the strategy described in the previous section. The plot shows the RSME distributions for sensor X6 provided by TD-MLR, MA and MLP considering different values for \(h\) (i.e typologies)

# Select the models with the smallest RMSE
best_models=group_by(results,model,imu,timedelay) %>% summarise(mean=mean(X6),sd=sd(X6)) %>% filter(mean==min(mean)) %>% arrange(desc(imu))
# adding the desaggregated lag info
bw_data=inner_join(best_models,results, by=c("timedelay","imu","model"))
bwplot(X6~model|imu,groups=timedelay,data=bw_data,scale='free'
       ,auto.key = F
       ,panel = panel.superpose
       ,panel.groups=function(x,y,group.number,...){
         xt <- x[x==min(x)] # find latest year
         yt <- y[y==min(y)] # find value at latest year 
         panel.text(x,yt,labels=levels(factor(bw_data$timedelay))[group.number],pos=1,...)
         panel.bwplot(x,y,...)

             },ylab="X6"
      
         )

The plot shows also the value of \(q\) for all the models. In the case of the models based on MLP, only the one with the lower RMSE are considered during the testing stage. In this case the those models are:

mlp20 for Gladiator with \(q\) equals to 100

mlp5 for Crossbow with \(q\) equals to 15

mlp60 for Xsens with \(q\) equals to 5

mlp5 for Xtal with \(q\) equals to 15

## reshape results for selecting best models for ma and MLP
results_reduced=results %>% select(imuid,model,timedelay,X1,X2,X6)
results_reduced=melt(results_reduced,id.vars=c("imuid","model","timedelay"),variable.name="sensor",value.name="value")
best_taps=group_by(results_reduced,model,imuid,sensor,timedelay) %>% summarise(mean=mean(value),sd=sd(value)) %>% filter(mean==min(mean)) %>% arrange(desc(imuid)) %>% select(imuid,sensor,timedelay)
## Adding missing grouping variables: `model`
best_taps=reshape2::dcast(best_taps,imuid+model~sensor,value.var = 'timedelay')

4.2.4 Final Models

In summary, after a 10-Fold Cross-validation the models with the lower RMSE are:

4.2.4.1 For MLP

mlp=cbind(imuid=5,glad_mlp_model)
mlp=rbind(mlp,cbind(imuid=6,xbow_mlp_model))
mlp=rbind(mlp,cbind(imuid=8,xsens_mlp_model))
mlp=rbind(mlp,cbind(imuid=9,xtal_mlp_model))

selected_models_mlp_info=cbind(X1=inner_join(mlp,X1_best_models,by=c("imuid","X1.model"="model")) %>% select(imuid,X1.model,timedelay,mean,sd), X2=inner_join(mlp,X2_best_models,by=c("imuid","X2.model"="model")) %>% select(imuid,X2.model,timedelay,mean,sd),X6=inner_join(mlp,X6_best_models,by=c("imuid","X6.model"="model")) %>% select(imuid,X6.model,timedelay,mean,sd))

selected_models_mlp_info[,c(1:5)]
##   X1.imuid X1.X1.model X1.timedelay    X1.mean        X1.sd
## 1        5      mlp100          100 0.18295484 0.0017184835
## 2        6       mlp40           40 0.09681763 0.0007688444
## 3        8       mlp80           40 0.07473004 0.0004065169
## 4        9       mlp30           10 0.10157406 0.0010362920
selected_models_mlp_info[,c(6:10)]
##   X2.imuid X2.X2.model X2.timedelay    X2.mean       X2.sd
## 1        5       mlp20          100 0.21921947 0.004694435
## 2        6        mlp5           15 0.11489287 0.002785178
## 3        8       mlp60            5 0.08838199 0.000778862
## 4        9        mlp5           15 0.13205592 0.003246703
selected_models_mlp_info[,c(11:15)]
##   X6.imuid X6.X6.model X6.timedelay    X6.mean       X6.sd
## 1        5       mlp20          100 0.21921947 0.004694435
## 2        6        mlp5           15 0.11489287 0.002785178
## 3        8       mlp60            5 0.08838199 0.000778862
## 4        9        mlp5           15 0.13205592 0.003246703

4.2.4.2 For TD-MLR

mlr=best_taps %>% filter(model=='mlr') 
selected_models_mlr_info=cbind(X1=inner_join(mlr,X1_best_models,by=c("imuid","model")) %>% select(imuid,model,timedelay,mean,sd),X2=inner_join(mlr,X2_best_models,by=c("imuid","model")) %>% select(imuid,model,timedelay,mean,sd),X6=inner_join(mlr,X6_best_models,by=c("imuid","model")) %>% select(imuid,model,timedelay,mean,sd))
selected_models_mlr_info[,c(1:5)]
##   X1.imuid X1.model X1.timedelay    X1.mean        X1.sd
## 1        5      mlr          100 0.18889949 0.0010077841
## 2        6      mlr           95 0.09569152 0.0007235566
## 3        8      mlr           75 0.07560705 0.0006192475
## 4        9      mlr           95 0.09827752 0.0007294742
selected_models_mlr_info[,c(6:10)]
##   X2.imuid X2.model X2.timedelay    X2.mean        X2.sd
## 1        5      mlr           85 0.22290536 0.0019346862
## 2        6      mlr          100 0.10769060 0.0008168191
## 3        8      mlr          100 0.08915207 0.0004961713
## 4        9      mlr          100 0.12188105 0.0007277664
selected_models_mlr_info[,c(11:15)]
##   X6.imuid X6.model X6.timedelay    X6.mean        X6.sd
## 1        5      mlr           85 0.22290536 0.0019346862
## 2        6      mlr          100 0.10769060 0.0008168191
## 3        8      mlr          100 0.08915207 0.0004961713
## 4        9      mlr          100 0.12188105 0.0007277664

4.2.4.3 For MA

ma=best_taps %>% filter(model=='ma') 
selected_models_ma_info=cbind(X1=inner_join(ma,X1_best_models,by=c("imuid","model")) %>% select(imuid,model,timedelay,mean,sd),X2=inner_join(ma,X2_best_models,by=c("imuid","model")) %>% select(imuid,model,timedelay,mean,sd),X6=inner_join(ma,X6_best_models,by=c("imuid","model")) %>% select(imuid,model,timedelay,mean,sd))
selected_models_ma_info[,c(1:5)]
##   X1.imuid X1.model X1.timedelay    X1.mean        X1.sd
## 1        5       ma           50 0.21023893 0.0016912437
## 2        6       ma           15 0.12967777 0.0009625942
## 3        8       ma            1 0.08199104 0.0007452200
## 4        9       ma            1 0.12979918 0.0005084080
selected_models_ma_info[,c(6:10)]
##   X2.imuid X2.model X2.timedelay   X2.mean       X2.sd
## 1        5       ma           35 0.5996878 0.001635709
## 2        6       ma           15 0.1559341 0.001056591
## 3        8       ma            1 0.1123862 0.001083526
## 4        9       ma            1 0.1587269 0.001674404
selected_models_ma_info[,c(11:15)]
##   X6.imuid X6.model X6.timedelay   X6.mean       X6.sd
## 1        5       ma           35 0.5996878 0.001635709
## 2        6       ma           15 0.1559341 0.001056591
## 3        8       ma            1 0.1123862 0.001083526
## 4        9       ma            1 0.1587269 0.001674404

4.3 Evaluation on T2 Stretch

The 100% of the training dataset (T1) is used for building the models and then evaluated on unseen data. For performing the evaluation the model on unseen data, a sampling method is used on the Stretch T2. In this case, a fixed window encompasing the 10% of the Stretch T2 is randomly and uniformly sampled with replacement. Such process is repeteated 40 times.

load("/home/harpo/Dropbox/shared/MEMS-ANN/results/results_samples_mlrnew.Rda")

4.3.1 Comparing MLR vs Raw

This section analyze the results in terms of RMSE for the TD-MLR compared with the original RMSE.

4.3.1.1 Evaluation on Gladiator

glad_test_results=test_results %>% filter(imuid==5 &  model=='mlr') %>%select(imuid,X1,X1vsRaw,X2,X2vsRaw,X6,X6vsRaw)
names(glad_test_results)=c("imuid","X1_mlr","X1_raw","X2_mlr","X2_raw","X6_mlr","X6_raw")
glad_test_results=tidyr::gather(glad_test_results,"sensor","value",2:7)
glad_test_results=tidyr::separate(glad_test_results,sensor,c("sensor","source"),sep="_") 
px1=wilcox.test(value~source,data=glad_test_results %>% filter(sensor=="X1" & (source=="mlr"|source=="raw")))
px2=wilcox.test(value~source,data=glad_test_results %>% filter(sensor=="X2" & (source=="mlr"|source=="raw")))
px6=wilcox.test(value~source,data=glad_test_results %>% filter(sensor=="X6" & (source=="mlr"|source=="raw")))

The Figure below show the error distribution for the original Gladiator and the TD-MLR on the three sensors

A man-Whitney-Wilcox test is conducted for the three sensors and results are presented in the following table As can be observed the differences are statistically significant for the three sensors

Sensor Name X1 X2 X6
P Value 1.860340410^{-23} 1.860340410^{-23} 1.860340410^{-23}

4.3.1.2 Evaluation on Xbow

xbow_test_results=test_results %>% filter(imuid==6 &  model=='mlr') %>%select(imuid,X1,X1vsRaw,X2,X2vsRaw,X6,X6vsRaw)
names(xbow_test_results)=c("imuid","X1_mlr","X1_raw","X2_mlr","X2_raw","X6_mlr","X6_raw")
xbow_test_results=tidyr::gather(xbow_test_results,"sensor","value",2:7)
xbow_test_results=tidyr::separate(xbow_test_results,sensor,c("sensor","source"),sep="_") 
px1=wilcox.test(value~source,data=xbow_test_results %>% filter(sensor=="X1" & (source=="mlr"|source=="raw")))
px2=wilcox.test(value~source,data=xbow_test_results %>% filter(sensor=="X2" & (source=="mlr"|source=="raw")))
px6=wilcox.test(value~source,data=xbow_test_results %>% filter(sensor=="X6" & (source=="mlr"|source=="raw")))

The Figure below show the error distribution for the original Xbow and the TD-MLR on the three sensors

bwplot(value~source|sensor,data=xbow_test_results,groups=source,layout=c(3,1),scales=list(relation='free'))

A man-Whitney-Wilcox test is conducted for the three sensors and results are presented in the following table As can be observed the differences are statistically significant for the three sensors

Sensor Name X1 X2 X6
P Value 6.259903210^{-6} 5.002387410^{-8} 1.860340410^{-23}

4.3.1.3 Evaluation on Xsens

xsens_test_results=test_results %>% filter(imuid==8 &  model=='mlr') %>%select(imuid,X1,X1vsRaw,X2,X2vsRaw,X6,X6vsRaw)
names(xsens_test_results)=c("imuid","X1_mlr","X1_raw","X2_mlr","X2_raw","X6_mlr","X6_raw")
xsens_test_results=tidyr::gather(xsens_test_results,"sensor","value",2:7)
xsens_test_results=tidyr::separate(xsens_test_results,sensor,c("sensor","source"),sep="_") 
px1=wilcox.test(value~source,data=xsens_test_results %>% filter(sensor=="X1" & (source=="mlr"|source=="raw")))
px2=wilcox.test(value~source,data=xsens_test_results %>% filter(sensor=="X2" & (source=="mlr"|source=="raw")))
px6=wilcox.test(value~source,data=xsens_test_results %>% filter(sensor=="X6" & (source=="mlr"|source=="raw")))

The Figure below show the error distribution for the original Xsens and the TD-MLR on the three sensors

bwplot(value~source|sensor,data=xsens_test_results,groups=source,layout=c(3,1),scales=list(relation='free'))

A man-Whitney-Wilcox test is conducted for the three sensors and results are presented in the following table As can be observed the differences are statistically significant for the three sensors

Sensor Name X1 X2 X6
P Value 0.005236 2.421971910^{-6} 1.860340410^{-23}

4.3.1.4 Evaluation on Xtal

xtal_test_results=test_results %>% filter(imuid==9 &  model=='mlr') %>%select(imuid,X1,X1vsRaw,X2,X2vsRaw,X6,X6vsRaw)
names(xtal_test_results)=c("imuid","X1_mlr","X1_raw","X2_mlr","X2_raw","X6_mlr","X6_raw")
xtal_test_results=tidyr::gather(xtal_test_results,"sensor","value",2:7)
xtal_test_results=tidyr::separate(xtal_test_results,sensor,c("sensor","source"),sep="_") 
px1=wilcox.test(value~source,data=xtal_test_results %>% filter(sensor=="X1" & (source=="mlr"|source=="raw")))
px2=wilcox.test(value~source,data=xtal_test_results %>% filter(sensor=="X2" & (source=="mlr"|source=="raw")))
px6=wilcox.test(value~source,data=xtal_test_results %>% filter(sensor=="X6" & (source=="mlr"|source=="raw")))

The Figure below show the error distribution for the original Xtal and the TD-MLR on the three sensors

bwplot(value~source|sensor,data=xtal_test_results,groups=source,layout=c(3,1),scales=list(relation='free'))

A man-Whitney-Wilcox test is conducted for the three sensors and results are presented in the following table As can be observed the differences are statistically significant for the three sensors

Sensor Name X1 X2 X6
P Value 5.46471210^{-5} 1.055711710^{-6} 1.860340410^{-23}

4.3.2 Comparing MLR vs MLP

The proposed method TD-MLR is now compared with an standard MLP on the four IMUS. Notice that model parameters result from Cross-validation on Stretch T1. For TD-MLR and MLP, the models with the Lower RSME are selected.

4.3.2.1 Evaluation on Gladiator

The Figure below show the error distribution for the TD-MLR and MLP for the three sensors on Gladiator

bwplot(value~model|sensor,data=glad_test_results %>% filter((model=="mlp"|model=="mlr")),groups=factor(model),layout=c(3,1),scales=list(relation='free'))

px1=wilcox.test(value~model,data=glad_test_results %>% filter(sensor=="X1" & (model=="mlp"|model=="mlr")))
px2=wilcox.test(value~model,data=glad_test_results %>% filter(sensor=="X2" & (model=="mlp"|model=="mlr")))
px6=wilcox.test(value~model,data=glad_test_results %>% filter(sensor=="X6" & (model=="mlp"|model=="mlr")))

A man-Whitney-Wilcox test is conducted for the three sensors and the results are presented in the following table As can be observed the differences are NOT statistically significant for the three sensors

Sensor Name X1 X2 X6
P Value 0.8295338 0.628813 0.9123817

4.3.2.2 Evaluation on Xbow

xbow_test_results=test_results %>% select(model,imuid,sample,X1,X2,X6) %>% filter(imuid==6 & 
  (   model=='mlr' | 
      model==as.character(xbow_mlp_model$X1.model) | 
      model==as.character(xbow_mlp_model$X2.model) |
      model==as.character(xbow_mlp_model$X6.model) 
  ))

xbow_test_results=melt(xbow_test_results,id.vars=c("imuid","model","sample"),variable.name="sensor",value.name="value")

xbow_test_results=xbow_test_results %>% filter(model=='mlr' | 
  ( model==as.character(xbow_mlp_model$X1.model) & sensor==substr(names(xbow_mlp_model)[1], 1,2)  ) | 
  ( model==as.character(xbow_mlp_model$X2.model) & sensor==substr(names(xbow_mlp_model)[3], 1,2)) | 
  ( model==as.character(xbow_mlp_model$X6.model) & sensor==substr(names(xbow_mlp_model)[5], 1,2))  
    )
xbow_test_results$model=factor(xbow_test_results$model, levels=c(levels(xbow_test_results$model), "mlp"))
# rename mlpxxx to mlp
xbow_test_results[grepl("mlp",xbow_test_results$model),]$model='mlp'

The Figure below show the error distribution for the TD-MLR and MLP for the three sensors on Xbow

bwplot(value~model|sensor,data=xbow_test_results %>% filter( model=="mlp"|model=="mlr") ,groups=factor(model),layout=c(3,1),scales=list(relation='free'))

px1=wilcox.test(value~model,data=xbow_test_results %>% filter(sensor=="X1" & (model=="mlp"|model=="mlr")))
px2=wilcox.test(value~model,data=xbow_test_results %>% filter(sensor=="X2" & (model=="mlp"|model=="mlr")))
px6=wilcox.test(value~model,data=xbow_test_results %>% filter(sensor=="X6" & (model=="mlp"|model=="mlr")))

A man-Whitney-Wilcox test is conducted for the three sensors and the results are presented in the following table As can be observed the differences are NOT statistically significant for the X1 and X2 sensors. However in this case the difference for the X6 sensor are significant with a p.value <0.05

Sensor Name X1 X2 X6
P Value 0.8370025 0.4072711 0.0024169

4.3.2.3 Evaluation on XSENS

xsens_test_results=test_results %>% select(model,imuid,sample,X1,X2,X6) %>% filter(imuid==5 & 
  (   model=='mlr' | 
      model==as.character(xsens_mlp_model$X1.model) | 
      model==as.character(xsens_mlp_model$X2.model) |
      model==as.character(xsens_mlp_model$X6.model) 
  ))

xsens_test_results=melt(xsens_test_results,id.vars=c("imuid","model","sample"),variable.name="sensor",value.name="value")

xsens_test_results=xsens_test_results %>% filter(model=='mlr' | 
  ( model==as.character(xsens_mlp_model$X1.model) & sensor==substr(names(xsens_mlp_model)[1], 1,2)  ) | 
  ( model==as.character(xsens_mlp_model$X2.model) & sensor==substr(names(xsens_mlp_model)[3], 1,2)) | 
  ( model==as.character(xsens_mlp_model$X6.model) & sensor==substr(names(xsens_mlp_model)[5], 1,2))  
    )
xsens_test_results$model=factor(xsens_test_results$model, levels=c(levels(xsens_test_results$model), "mlp"))
# rename mlpxxx to mlp
xsens_test_results[grepl("mlp",xsens_test_results$model),]$model='mlp'

The Figure below show the error distribution for the TD-MLR and MLP for the three sensors on Xsens

bwplot(value~model|sensor,data=xsens_test_results %>% filter( model=="mlp"|model=="mlr") ,groups=factor(model),layout=c(3,1),scales=list(relation='free'))

px1=wilcox.test(value~model,data=xsens_test_results %>% filter(sensor=="X1" & (model=="mlp"|model=="mlr")))
px2=wilcox.test(value~model,data=xsens_test_results %>% filter(sensor=="X2" & (model=="mlp"|model=="mlr")))
px6=wilcox.test(value~model,data=xsens_test_results %>% filter(sensor=="X6" & (model=="mlp"|model=="mlr")))

A man-Whitney-Wilcox test is conducted for the three sensors and results are presented in the following table As can be observed the differences are NOT statistically significant for the three sensors

Sensor Name X1 X2 X6
P Value 0.7850629 0.7485202 0.8820976

4.3.2.4 Evaluation on Xtal

xtal_test_results=test_results %>% select(model,imuid,sample,X1,X2,X6) %>% filter(imuid==5 & 
  (   model=='mlr' | 
      model==as.character(xtal_mlp_model$X1.model) | 
      model==as.character(xtal_mlp_model$X2.model) |
      model==as.character(xtal_mlp_model$X6.model) 
  ))

xtal_test_results=melt(xtal_test_results,id.vars=c("imuid","model","sample"),variable.name="sensor",value.name="value")

xtal_test_results=xtal_test_results %>% filter(model=='mlr' | 
  ( model==as.character(xtal_mlp_model$X1.model) & sensor==substr(names(xtal_mlp_model)[1], 1,2)  ) | 
  ( model==as.character(xtal_mlp_model$X2.model) & sensor==substr(names(xtal_mlp_model)[3], 1,2)) | 
  ( model==as.character(xtal_mlp_model$X6.model) & sensor==substr(names(xtal_mlp_model)[5], 1,2))  
    )
xtal_test_results$model=factor(xtal_test_results$model, levels=c(levels(xtal_test_results$model), "mlp"))
# rename mlpxxx to mlp
xtal_test_results[grepl("mlp",xtal_test_results$model),]$model='mlp'

The Figure below show the error distribution for the TD-MLR and MLP for the three sensors on Xtal

bwplot(value~model|sensor,data=xtal_test_results %>% filter( model=="mlp"|model=="mlr") ,groups=factor(model),layout=c(3,1),scales=list(relation='free'))

px1=wilcox.test(value~model,data=xtal_test_results %>% filter(sensor=="X1" & (model=="mlp"|model=="mlr")))
px2=wilcox.test(value~model,data=xtal_test_results %>% filter(sensor=="X2" & (model=="mlp"|model=="mlr")))
px3=wilcox.test(value~model,data=xtal_test_results %>% filter(sensor=="X6" & (model=="mlp"|model=="mlr")))

A man-Whitney-Wilcox test is conducted for the three sensors and results are presented in the following table As can be observed the differences are NOT statistically significant for the three sensors

Sensor Name X1 X2 X6
P Value 0.9199741 0.8896544 0.8820976

4.3.2.5 Discussion

After evaluating MLR and MLP, it is possible to conclude that both model performances are comparable in terms of RMSE for the four IMUs evaluated. MLR has shown an equivalent performance in most of the the cases yet outperforming in the case of sensor X6 (GyroZ) for the Crossbow(xbow).

4.3.3 Comparing MLR vs MA

In this section the proposed MLR model is compared with the standar MA method for filtering noise.

4.3.3.1 Evaluation on Gladiator

glad_test_results=test_results %>% select(model,imuid,sample,X1,X2,X6) %>% filter(imuid==5 & (model=='ma' | model=='mlr'))
glad_test_results=melt(glad_test_results,id.vars=c("imuid","model","sample"),variable.name="sensor",value.name="value")

The Figure below show the error distribution for the TD-MLR and MA for the three sensors on Gladiator

bwplot(value~model|sensor,data=glad_test_results %>% filter((model=="ma"|model=="mlr")),groups=factor(model),layout=c(3,1),scales=list(relation='free'))

px1=wilcox.test(value~model,data=glad_test_results %>% filter(sensor=="X1" & (model=="ma"|model=="mlr")))
px2=wilcox.test(value~model,data=glad_test_results %>% filter(sensor=="X2" & (model=="ma"|model=="mlr")))
px6=wilcox.test(value~model,data=glad_test_results %>% filter(sensor=="X6" & (model=="ma"|model=="mlr")))

A man-Whitney-Wilcox test is conducted for the three sensors and results are presented in the following table As can be observed the differences are NOT statistically significant for the two out of the three sensors

Sensor Name X1 X2 X6
P Value 0.0757543 1.860340410^{-23} 0.5242813

4.3.3.2 Evaluation on Xbow

xbow_test_results=test_results %>% select(model,imuid,sample,X1,X2,X6) %>% filter(imuid==6 & (model=='ma' | model=='mlr' | model=='mlp80'| model=='mlp5'))
xbow_test_results=melt(xbow_test_results,id.vars=c("imuid","model","sample"),variable.name="sensor",value.name="value")
xbow_test_results=xbow_test_results %>% filter(model=='mlr' | model=='ma' | ( model=='mlp80' & (sensor=='X1' | sensor=='X6')) | ( model=='mlp5' & sensor=='X2'))

The Figure below show the error distribution for the TD-MLR and MA for the three sensors on Xbow

bwplot(value~model|sensor,data=xbow_test_results %>% filter( model=="ma"|model=="mlr") ,groups=factor(model),layout=c(3,1),scales=list(relation='free'))

px1=wilcox.test(value~model,data=xbow_test_results %>% filter(sensor=="X1" & (model=="ma"|model=="mlr")))
px2=wilcox.test(value~model,data=xbow_test_results %>% filter(sensor=="X2" & (model=="ma"|model=="mlr")))
px6=wilcox.test(value~model,data=xbow_test_results %>% filter(sensor=="X6" & (model=="ma"|model=="mlr")))

A man-Whitney-Wilcox test is conducted for the three sensors and results are presented in the following table As can be observed the differences are statistically significant for the three sensors

Sensor Name X1 X2 X6
P Value 3.45965910^{-5} 1.230704410^{-7} 4.184204310^{-14}

4.3.3.3 Evaluation on XSENS

xsens_test_results=test_results %>% select(model,imuid,sample,X1,X2,X6) %>% filter(imuid==8 & (model=='ma' | model=='mlr' | model=='mlp80'| model=='mlp100'))
xsens_test_results=melt(xsens_test_results,id.vars=c("imuid","model","sample"),variable.name="sensor",value.name="value")
xsens_test_results=xsens_test_results %>% filter(model=='mlr' | model=='ma' | ( model=='mlp100' & (sensor=='X2' | sensor=='X6')) | ( model=='mlp80' & sensor=='X1'))

The Figure below show the error distribution for the TD-MLR and MA for the three sensors on Xsens

bwplot(value~model|sensor,data=xsens_test_results %>% filter( model=="ma"|model=="mlr") ,groups=factor(model),layout=c(3,1),scales=list(relation='free'))

px1=wilcox.test(value~model,data=xsens_test_results %>% filter(sensor=="X1" & (model=="ma"|model=="mlr")))
px2=wilcox.test(value~model,data=xsens_test_results %>% filter(sensor=="X2" & (model=="ma"|model=="mlr")))
px6=wilcox.test(value~model,data=xsens_test_results %>% filter(sensor=="X6" & (model=="ma"|model=="mlr")))

A man-Whitney-Wilcox test is conducted for the three sensors and results are presented in the following table As can be observed the differences are statistically significant for the three sensors

Sensor Name X1 X2 X6
P Value 0.005236 2.421971910^{-6} 1.860340410^{-23}

4.3.3.4 Evaluation on Xtal

xtal_test_results=test_results %>% select(model,imuid,sample,X1,X2,X6) %>% filter(imuid==9 & (model=='ma' | model=='mlr' | model=='mlp80'| model=='mlp100'| model=='mlp5'))
xtal_test_results=melt(xtal_test_results,id.vars=c("imuid","model","sample"),variable.name="sensor",value.name="value")
xtal_test_results=xtal_test_results %>% filter(model=='mlr' | model=='ma' | ( model=='mlp80' & sensor=='X1') | ( model=='mlp100' & sensor=='X2')| ( model=='mlp100' & sensor=='X6'))

The Figure below show the error distribution for the TD-MLR and MA for the three sensors on Xtal

bwplot(value~model|sensor,data=xtal_test_results %>% filter( model=="ma"|model=="mlr") ,groups=factor(model),layout=c(3,1),scales=list(relation='free'))

px1=wilcox.test(value~model,data=xtal_test_results %>% filter(sensor=="X1" & (model=="ma"|model=="mlr")))
px2=wilcox.test(value~model,data=xtal_test_results %>% filter(sensor=="X2" & (model=="ma"|model=="mlr")))
px6=wilcox.test(value~model,data=xtal_test_results %>% filter(sensor=="X6" & (model=="ma"|model=="mlr")))

A man-Whitney-Wilcox test is conducted for the three sensors and results are presented in the following table As can be observed the differences are statistically significant for the three sensors

Sensor Name X1 X2 X6
P Value 6.251688610^{-5} 9.975520710^{-7} 1.860340410^{-23}

4.3.3.5 Discussion

In general the TD-MLR outperforms in three out of four IMUS. Namely, Xbow, Xsens and Xtal, while in the case of Gladiator, the performance of TD-MLR has not show a statistically significant difference. Clearly, given the simplicity of the MA method, it is fair to say that for the Gladiator IMU the application of a TD-MLR model is not justified. However, the MLR has shown a significant difference in the remaining IMUs.

5 Concluding Remarks