# A tibble: 11 x 3
# Groups: year [?]
year month steps
<chr> <chr> <dbl>
1 2016 12 101189
2 2017 01 93218
3 2017 02 71086
4 2017 03 94068
5 2017 04 84502
6 2017 05 68170
7 2017 06 80500
8 2017 07 95243
9 2017 08 77403
10 2017 09 69795
11 2017 10 27339
# A tibble: 319 x 4
# Groups: date, month [?]
date month year steps
<chr> <chr> <chr> <dbl>
1 2016-12-01 12 2016 58
2 2016-12-02 12 2016 908
3 2016-12-03 12 2016 1077
4 2016-12-04 12 2016 1025
5 2016-12-05 12 2016 4071
6 2016-12-06 12 2016 1544
7 2016-12-07 12 2016 5574
8 2016-12-08 12 2016 1172
9 2016-12-09 12 2016 2481
10 2016-12-10 12 2016 2107
# ... with 309 more rows
# A tibble: 11 x 3
# Groups: year [?]
year month flights
<chr> <chr> <dbl>
1 2016 12 45
2 2017 01 45
3 2017 02 35
4 2017 03 53
5 2017 04 33
6 2017 05 53
7 2017 06 35
8 2017 07 59
9 2017 08 40
10 2017 09 44
11 2017 10 11
# A tibble: 198 x 4
# Groups: date, month [?]
date month year flights
<chr> <chr> <chr> <dbl>
1 2016-12-02 12 2016 2
2 2016-12-05 12 2016 1
3 2016-12-08 12 2016 1
4 2016-12-09 12 2016 1
5 2016-12-11 12 2016 1
6 2016-12-12 12 2016 13
7 2016-12-13 12 2016 1
8 2016-12-15 12 2016 3
9 2016-12-16 12 2016 1
10 2016-12-17 12 2016 2
# ... with 188 more rows
Summary Statistics of Daily Steps by Month for 2016 (December) and 2017
# A tibble: 11 x 8
month mean sd median max min `25%` `75%`
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 01 3007.03 1684.16 3169.0 6183 268 1596.50 4349.5
2 02 2538.79 1922.85 2602.0 6064 160 573.25 4188.5
3 03 3034.45 1600.43 3602.0 5757 72 2092.50 4271.5
4 04 2816.73 1436.81 2786.0 5681 86 2031.75 3419.5
5 05 2199.03 1358.19 2064.0 5310 57 1217.00 2950.0
6 06 2683.33 1496.21 2583.5 5690 208 1640.50 3261.0
7 07 3072.35 1887.57 3076.0 7572 113 1723.50 3943.0
8 08 2496.87 1433.31 2398.0 6295 76 1729.50 3118.0
9 09 2326.50 1906.04 2004.0 7974 62 858.75 3062.5
10 10 1822.60 1505.98 1725.0 5855 93 720.50 2218.0
11 12 3264.16 1974.65 3071.0 8832 58 1591.00 4677.5
Summary Statistics of Daily Steps by Day of Week for 2016 and 2017
# A tibble: 7 x 8
dayofweek mean sd median max min `25%` `75%`
<ord> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Friday 3796.78 1695.58 3672.0 7572 336 2550.00 5091.75
2 Tuesday 3063.87 1537.77 2947.0 6792 205 1823.00 4002.00
3 Wednesday 3166.09 1287.23 2875.0 5574 262 2281.00 4207.00
4 Thursday 2746.46 1247.41 2782.0 5690 58 1905.75 3589.25
5 Monday 2971.20 1620.79 2730.0 8832 57 1916.00 4058.00
6 Saturday 2090.72 1751.11 1595.5 7974 113 561.50 3142.50
7 Sunday 1115.20 1282.16 548.5 5855 62 208.50 1636.25
Summary Statistics of Daily Flights Climbed by Month for 2016 (December) and 2017
# A tibble: 11 x 8
month mean sd median max min `25%` `75%`
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 01 2.37 2.45 2.0 12 1 1 2.00
2 02 1.84 1.07 2.0 5 1 1 2.00
3 03 2.65 3.47 1.0 16 1 1 2.25
4 04 1.94 1.56 1.0 6 1 1 2.00
5 05 2.52 2.18 2.0 8 1 1 3.00
6 06 1.94 1.11 1.5 4 1 1 3.00
7 07 2.95 5.07 2.0 24 1 1 2.00
8 08 2.35 2.71 1.0 11 1 1 2.00
9 09 2.44 3.40 1.0 15 1 1 2.75
10 10 1.22 0.44 1.0 2 1 1 1.00
11 12 2.25 2.69 1.0 13 1 1 2.25
Summary Statistics of Daily Flights Climbed by Day of Week for 2016 and 2017
# A tibble: 7 x 8
dayofweek mean sd median max min `25%` `75%`
<ord> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Saturday 3.40 3.31 2 15 1 1.25 3.75
2 Sunday 2.41 4.39 1 24 1 1.00 2.00
3 Monday 2.75 3.75 1 16 1 1.00 2.00
4 Tuesday 1.77 1.24 1 5 1 1.00 2.00
5 Wednesday 2.00 1.96 1 11 1 1.00 2.00
6 Thursday 1.72 1.06 1 5 1 1.00 2.00
7 Friday 1.89 1.30 1 6 1 1.00 2.00
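Summary tables with the columns shown above (mean, sd, median, max, min, 25%, 75%) could be produced with a grouped summary. The following is only a minimal sketch, assuming dplyr and a data frame of daily totals; the name `daily` and its four rows (copied from the first rows of the daily steps table) are illustrative, and the quantile method is R's default, which may differ from what was actually used.

```r
library(dplyr)

# Hypothetical daily totals; the real data has one row per recorded day.
daily <- tibble(
  month = c("12", "12", "12", "12"),
  steps = c(58, 908, 1077, 1025)
)

# One row per month with the same statistics as the tables above.
monthly_summary <- daily %>%
  group_by(month) %>%
  summarise(
    mean   = mean(steps),
    sd     = sd(steps),
    median = median(steps),
    max    = max(steps),
    min    = min(steps),
    `25%`  = quantile(steps, 0.25, names = FALSE),
    `75%`  = quantile(steps, 0.75, names = FALSE)
  )

print(monthly_summary)
```

Grouping by a day-of-week column instead of `month`, or summarising a flights column instead of `steps`, would give the other three tables in the same way.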
The comparative histogram by month shows a decreasing trend in the number of steps taken after July 2017. This makes sense: I was promoted at work in August, and the new position involves a lot of computer work, whereas my previous responsibilities were hands-on laboratory experiments that involved a significant amount of walking.
The boxplot shows a similar pattern. The months with fewer steps, i.e. less walking, are also fairly consistent, with less variability. December 2016 included the maximum number of steps.
The box-and-whisker plot compares the pattern of steps, i.e. walking and running, by day of week between 2016 and 2017.
Again, this plot makes sense given the change in my work pattern. In 2016 I had recently joined this job, which included bench laboratory work and involved a great deal of walking; the plot reflects this with a higher number of steps than in 2017 on almost every day.
It is also worth noting that no weekday shows any outliers, which suggests how predictable and consistent my work habits have been throughout both years.
The box-and-whisker plot and the scatter plot with a linear model line both represent the difference in lifestyle, or physical activity, on weekends compared to weekdays.
No surprise here: both visualizations show significantly less physical activity, in terms of number of steps taken, on weekends than on weekdays, which is when the work hours are. Weekends have mostly been spent completing school deliverables while sitting in front of a computer.
A heatmap is a very good tool for visualizing a pattern. The heatmaps on the left bring out a distinctive lifestyle at my workplace, along with other important findings.
Consistent with the other visualizations, weekends (Saturdays and Sundays) have been less physically active for me than weekdays overall.
The most activity can be observed in the morning hours between 9 and 11, which is when I start work and is usually the peak time given the type of work.
The most important finding of this visualization concerns Friday. On Fridays we usually have a group meeting between 9 a.m. and 12 noon, which appears in the heatmap as a lighter red, representing less physical activity. My Friday routine is to go to the animal facility, where we keep experimental animals, for husbandry and experiment-related procedures that involve a lot of walking. This generally happens between 2 and 6 p.m. and is accurately picked up by the heatmap.
The flights-climbed visualization shows an important pattern. Both the bar graph and the boxplot show that more flights were climbed in 2017, when I started climbing 8 flights a day during work hours around March to cope with work-related stress.
The boxplot of flights climbed by day of week, comparing 2016 and 2017, gives a somewhat conflicting picture relative to the earlier visualization, as it shows taller boxes compared to 2017 for the most part. However, it is worth noting that, due to a routine exercise in 2016, the interquartile range in 2016 is much smaller than that seen in 2017. Also, there is no outlier on any day of the week in 2017, which can be interpreted as the 2016 pattern being the result of a routine, planned exercise and the 2017 pattern being a random phenomenon.
Comparing weekdays and weekends, the weekends show an overall higher number of flights climbed. This makes sense, as I live on the second floor and most errands and daily activities involve going up and down a floor. In contrast, weekdays, when I am mostly at work on the first floor, do not show a similar trend.
The scatter plot, however, shows an upward surge in the number of flights climbed on weekdays in specific months only, which agrees with the previous visualization: I started the stair-climbing exercise in March. Still, the overall trend supports more flights climbed on weekends.
Finally, the heatmap of flights climbed shows an intuitive pattern of the type of day (weekend vs. weekday) on which I climbed the most flights, along with the activity by time of day.
This heatmap also corroborates the visualizations shown above, i.e. it shows higher numbers on weekends (darker red overall) than on weekdays (lighter red overall). One outlier can be seen on Sunday at 3:00 pm; other than that, it seems I had the highest flight-climb counts on Saturdays.
The quantified self is self-knowledge through numbers. It is an analysis (more of a visualization) of self-tracked data, with the aim of improving the quality of daily functioning.
For this project, I decided to use the Health app that comes built into Apple’s iPhone. This app records the user’s daily physical activity, such as miles walked or run, flights climbed, and steps taken, depending on the permissions the user grants to the app.
The data that I used, covering the period between December 2016 and October 2017, are:
1. Number of steps
2. Flights climbed
The Health app’s statistics can be sent as an email attachment, which is in XML format. I converted the XML to a CSV file in R using the xmlAttrsToDataFrame function from the XML package. Using another R package called lubridate, I added date and time fields such as month, date, day of week, hour, and type of day (weekend or weekday).
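The preparation step described above can be sketched roughly as follows. This is an illustration under assumptions, not the exact script used for this project: the file name `export.xml`, the column names `type`, `startDate` and `value`, the record type labels, and the three sample rows are all placeholders, and timestamps are assumed to have no time-zone offset. The XML parsing lines are left as comments so the sketch runs without an export file.

```r
library(lubridate)

# The parsing step would look roughly like this (not run here;
# "export.xml" is a placeholder file name):
#   xml     <- XML::xmlParse("export.xml")
#   records <- XML:::xmlAttrsToDataFrame(xml["//Record"])

# Hypothetical stand-in for the parsed records.
records <- data.frame(
  type      = c("StepCount", "StepCount", "FlightsClimbed"),
  startDate = c("2016-12-01 09:15:00", "2016-12-03 14:05:00",
                "2016-12-04 15:00:00"),
  value     = c("58", "1077", "2"),
  stringsAsFactors = FALSE
)

# XML attributes arrive as text, so convert values and timestamps first.
records$value     <- as.numeric(records$value)
records$timestamp <- ymd_hms(records$startDate, tz = "UTC")

# Derived date and time fields.
records$date      <- as.Date(records$timestamp)
records$year      <- year(records$timestamp)
records$month     <- month(records$timestamp)
records$hour      <- hour(records$timestamp)
records$dayofweek <- wday(records$timestamp, label = TRUE, abbr = FALSE)

# With week_start = 1, Monday is 1, so Saturday and Sunday are 6 and 7.
records$daytype <- ifelse(wday(records$timestamp, week_start = 1) >= 6,
                          "weekend", "weekday")

# Write the enriched records to a CSV file (a temporary path in this sketch).
csv_path <- file.path(tempdir(), "health_records.csv")
write.csv(records, csv_path, row.names = FALSE)
```

Passing `week_start = 1` explicitly avoids depending on the session's week-start setting when deciding which days count as the weekend.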