This report analyzes Xiaomi Mi Band data extracted from the Android app’s SQLite database.
To analyze your own data, copy it from the Android phone’s path “/data/data/com.xiaomi.hm.health”. Use RootExplorer to copy the whole directory named “databases” to your computer. The data used for this report can be accessed from GitHub.
# Install the MiBand package from GitHub if it is not already available
if (!requireNamespace("MiBand", quietly = TRUE)) {
  devtools::install_github("BigBorg/MiBand_R_Package")
}
library(MiBand)
library(ggplot2)
library(plotly)
I’ve already packaged my data reading and cleaning code inside the MiBand package. If you are interested in that code, you can find it in my GitHub repository.
MiData <- loadMiData("./data/databases","963276123")
str(MiData)
## List of 3
## $ data_clean:'data.frame': 193 obs. of 6 variables:
## ..$ date : Date[1:193], format: "2015-11-03" ...
## ..$ sleep.light: int [1:193] NA 286 247 238 290 278 284 214 251 229 ...
## ..$ sleep.deep : int [1:193] NA 106 153 167 134 162 151 179 178 169 ...
## ..$ step : int [1:193] 1015 13743 11000 14548 10582 10334 10652 16382 9026 8471 ...
## ..$ efficiency : num [1:193] NaN 0.27 0.383 0.412 0.316 ...
## ..$ weekday : Factor w/ 7 levels "Sunday","Monday",..: 6 7 5 1 3 4 2 6 7 5 ...
## $ data_week :'data.frame': 193 obs. of 6 variables:
## ..$ date : Date[1:193], format: "2015-11-03" ...
## ..$ sleep.light: num [1:193] 289 286 247 238 290 ...
## ..$ sleep.deep : num [1:193] 126 106 153 167 134 ...
## ..$ step : int [1:193] 1015 13743 11000 14548 10582 10334 10652 16382 9026 8471 ...
## ..$ efficiency : num [1:193] 0.288 0.27 0.383 0.412 0.316 ...
## ..$ weekday : Factor w/ 7 levels "Sunday","Monday",..: 6 7 5 1 3 4 2 6 7 5 ...
## $ avg_week :Classes 'tbl_df', 'tbl' and 'data.frame': 7 obs. of 5 variables:
## ..$ weekday : Factor w/ 7 levels "Sunday","Monday",..: 1 2 3 4 5 6 7
## ..$ sleep.light: num [1:7] 271 272 307 319 293 ...
## ..$ sleep.deep : num [1:7] 124 128 132 128 126 ...
## ..$ step : num [1:7] 7621 9002 6936 5757 7110 ...
## ..$ efficiency : num [1:7] 0.313 0.325 0.303 0.298 0.304 ...
head(MiData$data_clean)
## date sleep.light sleep.deep step efficiency weekday
## 1 2015-11-03 NA NA 1015 NaN Friday
## 2 2015-11-04 286 106 13743 0.2704082 Saturday
## 3 2015-11-05 247 153 11000 0.3825000 Thursday
## 4 2015-11-06 238 167 14548 0.4123457 Sunday
## 5 2015-11-07 290 134 10582 0.3160377 Tuesday
## 6 2015-11-08 278 162 10334 0.3681818 Wednesday
MiData is a list. Its element “data_clean” still contains missing values. “data_week” is the same data with each missing value replaced by the mean for that day of the week (means grouped by weekday). “avg_week” holds one row of averages per weekday.
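The weekday-mean substitution described above can be sketched in base R. This is only an illustration of the idea, not the code inside the MiBand package: `impute_weekday_mean` and the `toy` data frame are hypothetical names and values made up for the example.

```r
# Replace each NA with the mean of the non-missing values
# that share the same weekday (hypothetical helper).
impute_weekday_mean <- function(x, weekday) {
  ave(x, weekday, FUN = function(v) {
    v[is.na(v)] <- mean(v, na.rm = TRUE)
    v
  })
}

# Made-up example data: two weekdays, two missing values
toy <- data.frame(
  weekday    = c("Monday", "Tuesday", "Monday", "Tuesday", "Monday"),
  sleep.deep = c(100, NA, 120, 140, NA)
)
toy$sleep.deep <- impute_weekday_mean(toy$sleep.deep, toy$weekday)
toy
```

Note that imputed rows sit exactly on their weekday mean, so they reduce the apparent variance of the series.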
datedur <- range(MiData$data_week$date)  # first and last recorded dates
n_rows <- nrow(MiData$data_week)         # named n_rows to avoid masking base::nrow()
summary(MiData$data_week$sleep.light)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 61.0 257.0 288.0 292.3 325.0 468.0
summary(MiData$data_week$sleep.deep)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 44.0 109.0 126.0 126.8 146.0 205.0
summary(MiData$data_week$step)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 642 4750 7872 7574 10410 19760
The data frame records data from 2015-11-03 to 2016-05-13 and has 193 rows, one per day. Sleep duration is recorded in minutes.
Plotting:
Histogram
ggplotly(miPlot(MiData,"hist","sleep"))
ggplotly(miPlot(MiData,"hist","step"))
ggplotly(miPlot(MiData,"box","sleep"))
ggplotly(miPlot(MiData,"ts","sleep"))
ggplotly(miPlot(MiData,"ts","step"))
Time series analysis of steps:
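# Decompose a daily series into trend, weekly pattern and remainder
weekly_ts_analysis <- function(data){
  # frequency = 7 treats one week as one seasonal cycle
  tsobj <- ts(data, start = 1, frequency = 7)
  components <- decompose(tsobj)
  plot(components)
}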
weekly_ts_analysis(MiData$data_week$step)
Time series analysis of deep sleep:
weekly_ts_analysis(MiData$data_week$sleep.deep)
Time series analysis of light sleep:
weekly_ts_analysis(MiData$data_week$sleep.light)
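With frequency = 7, decompose() splits a series into a trend (a 7-day moving average), a repeating weekly pattern, and a remainder, using an additive model by default. The sketch below runs it on a synthetic series to show how the weekly pattern can be read numerically rather than from the plot. The step values and the `weekly_effect` vector are invented for illustration.

```r
set.seed(42)
n_weeks <- 12
# Invented weekly pattern (sums to zero) added to a flat level plus noise
weekly_effect <- c(-500, 300, 0, -200, 100, 800, -500)
steps <- 8000 + rep(weekly_effect, n_weeks) + rnorm(7 * n_weeks, sd = 400)

tsobj <- ts(steps, start = 1, frequency = 7)
parts <- decompose(tsobj)  # additive model by default

# parts$figure holds the 7 estimated weekly effects, centered on zero
round(parts$figure)

# Position within the week that has the largest estimated effect
which.max(parts$figure)
```

The positions in `parts$figure` are counted from the first observation, so position 1 corresponds to the weekday of the first recorded date, not necessarily to Sunday or Monday.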
ggplotly(miPlot(MiData,"week","sleep"))
ggplotly(miPlot(MiData,"week","step"))
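The weekly plots group values by the weekday column. In the head(MiData$data_clean) output that column does not match the calendar (2015-11-03 is printed as Friday, but that date was a Tuesday), so the labels are worth checking before interpreting per-weekday results. How the MiBand package builds the column is not shown in this report. The sketch below is only one locale-independent way to derive a weekday factor from dates, and `weekday_factor` is a hypothetical helper, not part of the package.

```r
# Calendar-ordered day names (English labels chosen for this sketch)
day_names <- c("Sunday", "Monday", "Tuesday", "Wednesday",
               "Thursday", "Friday", "Saturday")

# as.POSIXlt()$wday is 0 for Sunday through 6 for Saturday,
# so the result does not depend on the system locale
weekday_factor <- function(dates) {
  factor(as.POSIXlt(dates)$wday, levels = 0:6, labels = day_names)
}

# The first seven dates of the report's recording period
dates <- as.Date("2015-11-03") + 0:6
data.frame(date = dates, weekday = weekday_factor(dates))
```

Passing `levels` and `labels` together pins each numeric day code to its name, which avoids relying on the alphabetical level order that factor() uses by default for character input.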
MiData$data_week$month<-months(MiData$data_week$date)
vacation<-MiData$data_week[MiData$data_week$month %in% c("January","February","July","August"),]
schoolday<-MiData$data_week[!MiData$data_week$month %in% c("January","February","July","August"),]
boxplot(vacation$step,schoolday$step,names = c("vacation","school"))
title(main="Step")
As shown in the boxplot, the subject typically takes more steps on school days than during vacation.
set.seed(0)
# Draw 1000 step values with replacement and average them in 100 groups of 10
schoolresample <- matrix(sample(schoolday$step, 1000, replace = TRUE), nrow = 100)
schoolmean <- rowMeans(schoolresample)
vacationresample <- matrix(sample(vacation$step, 1000, replace = TRUE), nrow = 100)
vacationmean <- rowMeans(vacationresample)
# Welch two-sample t-test on the resampled group means
testresult <- t.test(schoolmean, vacationmean)
difference <- mean(schoolmean) - mean(vacationmean)
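The t-test on the resampled means gives a p-value of 3.7499339 × 10^-57, so the hypothesis that school days and vacation days have the same mean step count is rejected. The mean difference is 3639.006 steps (school mean - vacation mean). Because the test compares 100 resampled means per group rather than the original days, the p-value reflects the number of resamples as well as the data.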
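As a cross-check on this comparison, a percentile bootstrap interval for the difference in means can be computed directly from the daily values, alongside a Welch t-test on the raw days. This is a minimal sketch on made-up numbers: `school_steps` and `vacation_steps` are hypothetical stand-ins for schoolday$step and vacation$step, and 2000 replicates is an arbitrary choice.

```r
set.seed(1)
# Hypothetical daily step counts, invented for this example
school_steps   <- round(rnorm(120, mean = 9000, sd = 3000))
vacation_steps <- round(rnorm(60,  mean = 5500, sd = 2500))

# Each replicate resamples both groups at their original size
boot_diff <- replicate(2000, {
  mean(sample(school_steps, replace = TRUE)) -
    mean(sample(vacation_steps, replace = TRUE))
})

# 95% percentile interval for (school mean - vacation mean)
ci <- quantile(boot_diff, c(0.025, 0.975))
ci

# Welch t-test on the original daily values
welch <- t.test(school_steps, vacation_steps)
welch$p.value
```

Resampling each group at its own size keeps the uncertainty tied to the number of recorded days, so adding more replicates sharpens the interval estimate without shrinking it.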
MiData$data_week$efficiency<-with(MiData$data_week,sleep.deep/(sleep.deep+sleep.light))
cors<-with(MiData$data_week,c(
cor(step,sleep.light),
cor(step,sleep.deep),
cor(step,sleep.light+sleep.deep),
cor(step,efficiency)
)
)
names(cors)<-c("step-sleep.light","step-sleep.deep","step-total sleep","step-efficiency")
cors
## step-sleep.light step-sleep.deep step-total sleep step-efficiency
## -0.271091602 -0.007771756 -0.252226057 0.139983026
The correlations indicate that the longer you sleep, the fewer steps you are likely to take, but the relationship is quite weak. Note that within one row (that is, on the same day) the steps are recorded after the sleep.
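Because each row pairs a night's sleep with the steps taken afterwards, the same-row correlation describes sleep followed by activity. To look at the opposite direction (steps on one day against sleep in the following row), the two columns can be shifted by one row. A minimal sketch, with `lagged_cor` as a hypothetical helper and toy vectors invented so the result is easy to check:

```r
# Correlate steps on day i with sleep recorded on day i + 1
lagged_cor <- function(step, sleep) {
  stopifnot(length(step) == length(sleep), length(step) > 2)
  cor(head(step, -1), tail(sleep, -1))
}

# Toy data: from the second row on, sleep is exactly twice the previous day's steps
step  <- c(1, 2, 3, 4, 5)
sleep <- c(7, 2, 4, 6, 8)

cor(step, sleep)         # same-row correlation
lagged_cor(step, sleep)  # previous-day steps against sleep
```

On the report's data this would be called as lagged_cor(MiData$data_week$step, MiData$data_week$sleep.deep), assuming the rows are consecutive days in date order.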
# Use manipulate if you are copy-pasting this code into RStudio
# library(manipulate)
# sleep_df <- transform(MiData$data_week, total = sleep.light + sleep.deep)
# fit <- loess(efficiency ~ total, data = sleep_df)
# manipulate({
#   Y <- predict(fit, data.frame(total = M))
#   ggplot(data = sleep_df, aes(total, efficiency)) +
#     geom_point() + geom_smooth(method = "auto") +
#     geom_vline(xintercept = M) +
#     labs(x = "Total sleep", title = paste0("Efficiency: ", Y))
# },
# M = slider(
#   min(sleep_df$total),
#   max(sleep_df$total),
#   initial = min(sleep_df$total)
# )
# )
ggplotly(ggplot(data=MiData$data_week,aes(sleep.light+sleep.deep,efficiency))+
geom_point()+geom_smooth(method="auto")+
labs(title="Efficiency"))
Efficiency is extremely high when total sleep is very short. That might be the body trying to compensate for the lost sleep time by increasing the ratio of deep sleep. Although efficiency is high when you sleep for a short time, the deep sleep duration itself is not sufficient. As total sleep increases, efficiency reaches a local maximum, and it then declines when you sleep for too long.
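The shape of the smoothed curve can also be read off numerically by fitting a loess model and predicting on a grid of total sleep values. The sketch below uses synthetic data, since the report's data is not reproduced here: the `total` and `efficiency` vectors, the grid spacing, and the curve used to generate them are all invented for illustration.

```r
set.seed(2)
# Synthetic data: efficiency peaks near 420 minutes of total sleep
total <- runif(150, 200, 600)
efficiency <- 0.30 - 1e-6 * (total - 420)^2 + rnorm(150, sd = 0.02)
sleep_df <- data.frame(total, efficiency)

# Local regression of efficiency on total sleep
fit <- loess(efficiency ~ total, data = sleep_df)

# Predict on an evenly spaced grid inside the observed range
grid <- data.frame(total = seq(240, 560, by = 20))
grid$fitted <- predict(fit, newdata = grid)

# Total sleep (in minutes) with the highest fitted efficiency on the grid
best <- grid$total[which.max(grid$fitted)]
best
```

By default loess() does not extrapolate, so grid values outside the observed range would return NA. Sparse regions, such as very short nights, are also fitted from only a few points and should be read with caution.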
coefs <- summary(lm(step ~ sleep.light + sleep.deep, data = MiData$data_week))$coefficients
With deep sleep held fixed, a one-minute increase in light sleep is associated with a change of -16.3624989 steps. With light sleep held fixed, a one-minute increase in deep sleep is associated with a change of -1.9898671 steps.
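The slopes quoted here come from the coefficient matrix returned by summary(). Indexing that matrix by row and column name, and adding confidence intervals, makes it easier to judge how precisely each slope is estimated. A minimal sketch on simulated data (all numbers below are invented, and the true slopes of -16 and -2 are chosen only to resemble the report's estimates):

```r
set.seed(3)
n <- 200
# Simulated sleep durations in minutes and step counts
sleep.light <- rnorm(n, mean = 290, sd = 50)
sleep.deep  <- rnorm(n, mean = 125, sd = 30)
step <- 14000 - 16 * sleep.light - 2 * sleep.deep + rnorm(n, sd = 1500)

fit <- lm(step ~ sleep.light + sleep.deep)
coefs <- summary(fit)$coefficients

# Slope for light sleep, holding deep sleep fixed
coefs["sleep.light", "Estimate"]

# p-values and 95% confidence intervals for all terms
coefs[, "Pr(>|t|)"]
confint(fit)
```

A slope whose confidence interval includes zero, which can happen for the smaller of the two coefficients, should not be read as evidence of an effect.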
The subject sleeps longer on Wednesday and walks more on Monday. Step counts on school days differ from those during vacation. There is a weak correlation between sleep and steps. Around 7 hours of sleep gives a relatively high sleep efficiency (deep sleep / total sleep).