Data tables

Number of Steps by Month by Year

# A tibble: 11 x 3
# Groups:   year [?]
    year month  steps
   <chr> <chr>  <dbl>
 1  2016    12 101189
 2  2017    01  93218
 3  2017    02  71086
 4  2017    03  94068
 5  2017    04  84502
 6  2017    05  68170
 7  2017    06  80500
 8  2017    07  95243
 9  2017    08  77403
10  2017    09  69795
11  2017    10  27339
# A tibble: 319 x 4
# Groups:   date, month [?]
         date month  year steps
        <chr> <chr> <chr> <dbl>
 1 2016-12-01    12  2016    58
 2 2016-12-02    12  2016   908
 3 2016-12-03    12  2016  1077
 4 2016-12-04    12  2016  1025
 5 2016-12-05    12  2016  4071
 6 2016-12-06    12  2016  1544
 7 2016-12-07    12  2016  5574
 8 2016-12-08    12  2016  1172
 9 2016-12-09    12  2016  2481
10 2016-12-10    12  2016  2107
# ... with 309 more rows
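
Tables like the two above can be produced by grouping record-level data and summing. The following is a minimal sketch, assuming a data frame named `records` with `date`, `month`, `year` and `steps` columns; the object names and the sample values are hypothetical and are not the original analysis code.

```r
library(dplyr)

# Toy record-level data; the values are invented for illustration only.
records <- tibble(
  date  = c("2016-12-01", "2016-12-01", "2016-12-02", "2017-01-05"),
  month = c("12", "12", "12", "01"),
  year  = c("2016", "2016", "2016", "2017"),
  steps = c(30, 28, 908, 500)
)

# Daily totals: one row per calendar date.
steps_by_day <- records %>%
  group_by(date, month, year) %>%
  summarise(steps = sum(steps), .groups = "drop")

# Monthly totals: one row per year-month pair.
steps_by_month <- records %>%
  group_by(year, month) %>%
  summarise(steps = sum(steps), .groups = "drop")

print(steps_by_day)
print(steps_by_month)
```

The same two pipelines would apply to flights climbed by swapping the summed column.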

Number of Flights Climbed by Month by Year

# A tibble: 11 x 3
# Groups:   year [?]
    year month flights
   <chr> <chr>   <dbl>
 1  2016    12      45
 2  2017    01      45
 3  2017    02      35
 4  2017    03      53
 5  2017    04      33
 6  2017    05      53
 7  2017    06      35
 8  2017    07      59
 9  2017    08      40
10  2017    09      44
11  2017    10      11
# A tibble: 198 x 4
# Groups:   date, month [?]
         date month  year flights
        <chr> <chr> <chr>   <dbl>
 1 2016-12-02    12  2016       2
 2 2016-12-05    12  2016       1
 3 2016-12-08    12  2016       1
 4 2016-12-09    12  2016       1
 5 2016-12-11    12  2016       1
 6 2016-12-12    12  2016      13
 7 2016-12-13    12  2016       1
 8 2016-12-15    12  2016       3
 9 2016-12-16    12  2016       1
10 2016-12-17    12  2016       2
# ... with 188 more rows

Summary Statistics of “steps”

Summary Statistics by Month for 2016 (December only) and 2017

# A tibble: 11 x 8
   month    mean      sd median   max   min   `25%`  `75%`
   <chr>   <dbl>   <dbl>  <dbl> <dbl> <dbl>   <dbl>  <dbl>
 1    01 3007.03 1684.16 3169.0  6183   268 1596.50 4349.5
 2    02 2538.79 1922.85 2602.0  6064   160  573.25 4188.5
 3    03 3034.45 1600.43 3602.0  5757    72 2092.50 4271.5
 4    04 2816.73 1436.81 2786.0  5681    86 2031.75 3419.5
 5    05 2199.03 1358.19 2064.0  5310    57 1217.00 2950.0
 6    06 2683.33 1496.21 2583.5  5690   208 1640.50 3261.0
 7    07 3072.35 1887.57 3076.0  7572   113 1723.50 3943.0
 8    08 2496.87 1433.31 2398.0  6295    76 1729.50 3118.0
 9    09 2326.50 1906.04 2004.0  7974    62  858.75 3062.5
10    10 1822.60 1505.98 1725.0  5855    93  720.50 2218.0
11    12 3264.16 1974.65 3071.0  8832    58 1591.00 4677.5

Summary Statistics by Day of Week for 2016 and 2017

# A tibble: 7 x 8
  dayofweek    mean      sd median   max   min   `25%`   `75%`
      <ord>   <dbl>   <dbl>  <dbl> <dbl> <dbl>   <dbl>   <dbl>
1    Friday 3796.78 1695.58 3672.0  7572   336 2550.00 5091.75
2   Tuesday 3063.87 1537.77 2947.0  6792   205 1823.00 4002.00
3 Wednesday 3166.09 1287.23 2875.0  5574   262 2281.00 4207.00
4  Thursday 2746.46 1247.41 2782.0  5690    58 1905.75 3589.25
5    Monday 2971.20 1620.79 2730.0  8832    57 1916.00 4058.00
6  Saturday 2090.72 1751.11 1595.5  7974   113  561.50 3142.50
7    Sunday 1115.20 1282.16  548.5  5855    62  208.50 1636.25
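
A summary table with these columns can be computed per group with `summarise`. This is a minimal sketch, assuming a data frame of daily totals named `daily` with `dayofweek` and `steps` columns; the names and the sample values are hypothetical. It uses R's default quantile method, which may or may not match the one used for the table above.

```r
library(dplyr)

# Toy daily totals; the values are invented for illustration only.
daily <- tibble(
  dayofweek = c("Monday", "Monday", "Monday", "Sunday", "Sunday", "Sunday"),
  steps     = c(1000, 2000, 3000, 100, 200, 600)
)

# One row per day of week with the same statistics as the table above.
stats_by_day <- daily %>%
  group_by(dayofweek) %>%
  summarise(
    mean   = mean(steps),
    sd     = sd(steps),
    median = median(steps),
    max    = max(steps),
    min    = min(steps),
    `25%`  = quantile(steps, 0.25, names = FALSE),  # first quartile
    `75%`  = quantile(steps, 0.75, names = FALSE),  # third quartile
    .groups = "drop"
  )

print(stats_by_day)
```

Grouping by `month` instead of `dayofweek` would give the by-month variant.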

Summary Statistics of “Flights Climbed”

Summary Statistics by Month for 2016 (December only) and 2017

# A tibble: 11 x 8
   month  mean    sd median   max   min `25%` `75%`
   <chr> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl>
 1    01  2.37  2.45    2.0    12     1     1  2.00
 2    02  1.84  1.07    2.0     5     1     1  2.00
 3    03  2.65  3.47    1.0    16     1     1  2.25
 4    04  1.94  1.56    1.0     6     1     1  2.00
 5    05  2.52  2.18    2.0     8     1     1  3.00
 6    06  1.94  1.11    1.5     4     1     1  3.00
 7    07  2.95  5.07    2.0    24     1     1  2.00
 8    08  2.35  2.71    1.0    11     1     1  2.00
 9    09  2.44  3.40    1.0    15     1     1  2.75
10    10  1.22  0.44    1.0     2     1     1  1.00
11    12  2.25  2.69    1.0    13     1     1  2.25

Summary Statistics by Day of Week for 2016 and 2017

# A tibble: 7 x 8
  dayofweek  mean    sd median   max   min `25%` `75%`
      <ord> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl>
1  Saturday  3.40  3.31      2    15     1  1.25  3.75
2    Sunday  2.41  4.39      1    24     1  1.00  2.00
3    Monday  2.75  3.75      1    16     1  1.00  2.00
4   Tuesday  1.77  1.24      1     5     1  1.00  2.00
5 Wednesday  2.00  1.96      1    11     1  1.00  2.00
6  Thursday  1.72  1.06      1     5     1  1.00  2.00
7    Friday  1.89  1.30      1     6     1  1.00  2.00

Data Visualization

Steps by month by year


The comparative histogram by month shows a decreasing trend in the number of steps taken after July 2017. This makes sense: I was promoted at work in August, and the new position involves a lot of computer work, whereas my previous responsibilities were hands-on laboratory experiments that involved a significant amount of walking.

The box plot shows a similar pattern. The lower step counts, i.e. less walking, are fairly consistent and show less variability. December 2016 had the highest number of steps.

Steps by Day of Week-year (Box Plot)


The box-and-whisker plot compares the number of steps, i.e. the walking/running pattern, by day of week between 2016 and 2017.

Again, this plot makes sense given the change in my work pattern. In 2016 I had recently joined this job, which included bench laboratory work and involved a great deal of walking. This is reflected in the plot, which shows a higher number of steps in 2016 than in 2017 on almost every day.

It is also worth noting that no weekday shows outliers, which suggests how predictable and consistent my work habits have been throughout both years.
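
For reference, box plots conventionally draw a point as an outlier when it lies more than 1.5 times the interquartile range beyond the quartiles. The sketch below applies that rule by hand; it assumes the plots here follow the same convention, and the step values are invented for illustration.

```r
# Toy step counts for one day of the week; the values are invented.
steps <- c(2500, 2700, 2800, 2900, 3100, 3300, 9000)

# First and third quartiles and the interquartile range.
q   <- quantile(steps, c(0.25, 0.75), names = FALSE)
iqr <- q[2] - q[1]

# Fences of the conventional 1.5 * IQR rule.
lower <- q[1] - 1.5 * iqr
upper <- q[2] + 1.5 * iqr

# Anything outside the fences would be drawn as an outlier point.
outliers <- steps[steps < lower | steps > upper]
print(outliers)
```

A day with an empty `outliers` vector corresponds to a box with no points plotted beyond its whiskers.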

Steps: Weekdays vs Weekends


The box-and-whisker plot and the scatter plot with a linear model line show the difference in lifestyle, or physical activity, on weekends compared to weekdays.

No surprise here: both visualizations show significantly less physical activity, in terms of number of steps taken, on weekends than on weekdays, which is when the work hours are. Weekends have mostly been spent completing school deliverables in front of a computer.
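
The weekday/weekend split can be derived from the date with lubridate. This is a minimal sketch, not the original code: the column name `type_of_day` is an assumption, and the four rows are the daily totals for 2016-12-02 to 2016-12-05 from the daily steps table.

```r
library(lubridate)

# Four daily totals from the daily steps table (Friday to Monday).
daily <- data.frame(
  date  = ymd(c("2016-12-02", "2016-12-03", "2016-12-04", "2016-12-05")),
  steps = c(908, 1077, 1025, 4071)
)

# week_start = 1 numbers the days Monday = 1 ... Sunday = 7,
# so 6 and 7 are Saturday and Sunday.
daily$type_of_day <- ifelse(
  wday(daily$date, week_start = 1) >= 6, "weekend", "weekday"
)

# Mean daily steps for each type of day.
mean_steps <- tapply(daily$steps, daily$type_of_day, mean)
print(mean_steps)
```

For these four days the weekday mean is 2489.5 and the weekend mean is 1051, which is the direction the plots show for the full data set.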

Heatmap of Steps by Day of Week-hour of Day


A heatmap is a very good tool for visualizing a pattern. The heatmaps on the left bring out a distinctive lifestyle at my workplace, along with other important findings.

  1. Consistent with the other visualizations, weekends (Saturdays and Sundays) have been less physically active for me than weekdays overall.

  2. The most activity can be observed in the morning hours between 9 and 11, which is when I start work and is usually the peak time given the type of work.

  3. The most important finding of this visualization is Friday. On Fridays we usually have a group meeting between 9 am and 12 noon, which shows up as a lighter red, representing less physical activity, in the heatmap. My Friday routine is then to go to the animal facility, where we keep experimental animals, to do husbandry and experiment-related procedures that involve a lot of walking. This generally happens between 2 and 6 pm and is accurately picked up by the heatmap.
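
A day-of-week by hour-of-day heatmap of this kind can be built by aggregating to one value per cell and drawing tiles. The sketch below assumes a `records` data frame with `dayofweek`, `hour` and `steps` columns and assumes that cells hold summed steps; the names, the sample values and the white-to-red colour scale are illustrative choices, not the original plotting code.

```r
library(dplyr)
library(ggplot2)

day_levels <- c("Monday", "Tuesday", "Wednesday", "Thursday",
                "Friday", "Saturday", "Sunday")

# Toy records; the values are invented for illustration only.
records <- tibble(
  dayofweek = factor(c("Monday", "Monday", "Monday", "Friday", "Friday", "Sunday"),
                     levels = day_levels),
  hour      = c(9, 9, 10, 14, 15, 15),
  steps     = c(400, 350, 600, 900, 700, 50)
)

# One row per (day of week, hour) cell, holding the summed steps.
steps_by_cell <- records %>%
  group_by(dayofweek, hour) %>%
  summarise(steps = sum(steps), .groups = "drop")

# Darker red means more steps in that cell.
heat <- ggplot(steps_by_cell, aes(x = hour, y = dayofweek, fill = steps)) +
  geom_tile() +
  scale_fill_gradient(low = "white", high = "red") +
  labs(x = "Hour of day", y = "Day of week", fill = "Steps")

print(steps_by_cell)
```

Calling `print(heat)` renders the plot. Using a mean instead of a sum per cell would be an equally reasonable choice if the number of recorded days differs between weekdays.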

Number of Flights Climbed by Month by Year


The flights-climbed visualization shows an important pattern. Both the bar graph and the box plot show that more flights were climbed in 2017, when I started climbing 8 flights a day during work hours around March to cope with some work-related stress.

Number of Flights Climbed by Day of Week-year (Box Plot)


The box plot of flights climbed by day of week, comparing 2016 and 2017, shows a somewhat conflicting picture when compared with the earlier visualization, as it shows taller boxes compared to 2017 for the most part. However, it is worth noting that, due to a routine exercise in 2016, the interquartile range in 2016 is much smaller than that seen in 2017. Also, there is no outlier on any day of the week in 2017, which can be interpreted as the 2016 pattern being a phenomenon of routine, planned exercise and the 2017 pattern being a random phenomenon.

Number of Flights Climbed: Weekdays vs Weekends


Comparing weekdays and weekends, the weekends show an overall higher number of flights climbed. This makes sense, as I live on the second floor and most errands and daily activities involve going up and down a floor. In contrast, weekdays, when I am mostly at work on the first floor, do not show a similar trend.

The scatter plot, however, shows an upward surge in the number of flights climbed on weekdays in specific months only, which agrees with the previous visualization, where I started the exercise of climbing flights in March. Still, the overall trend supports more flights climbed on weekends.

Heatmap of Flights Climbed by Day of Week-hour of Day


Finally, the heatmap of flights climbed shows an intuitive pattern of which type of day (weekend vs. weekday) I climbed the most flights on, along with the activity by time of day.

This heatmap also corroborates the visualizations shown above, i.e. it shows higher numbers on weekends (darker red overall) than on weekdays (lighter red overall). One outlier can be seen on Sunday at 3:00 pm; other than that, it seems I had the highest flight-climb counts on Saturdays.

Write Up (Discussion)

The quantified self is self-knowledge through numbers. It is an analysis (more of a visualization) of one's own tracked data, aimed at improving the quality of daily functioning.

The Apple Health app's statistics can be sent by email as an attachment, which is itself in XML format. I converted the XML to a CSV file in R using the xmlAttrsToDataFrame function from the XML package. Using another R package, lubridate, I added date and time values such as month, date, day of week, hour and type of day (weekend or weekday).
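
The conversion step can be sketched as follows. This is a stand-in under stated assumptions, not the exact code used: it collects the attributes with the exported `getNodeSet` and `xmlAttrs` functions of the XML package rather than `xmlAttrsToDataFrame`, and the inline XML, its attribute values and the output file name are hypothetical. A real export's timestamps also carry a UTC offset, which would need time zone handling that is omitted here.

```r
library(XML)
library(lubridate)

# Tiny hypothetical stand-in for the exported file.
xml_text <- '<HealthData>
  <Record type="StepCount" startDate="2016-12-01 08:15:00" value="58"/>
  <Record type="StepCount" startDate="2016-12-03 14:30:00" value="1077"/>
</HealthData>'

doc   <- xmlParse(xml_text, asText = TRUE)
nodes <- getNodeSet(doc, "//Record")

# One row per <Record>, one column per attribute (all records share the same attributes).
records <- as.data.frame(do.call(rbind, lapply(nodes, xmlAttrs)),
                         stringsAsFactors = FALSE)
records$value <- as.numeric(records$value)

# Derived date and time fields.
start <- ymd_hms(records$startDate)
records$date      <- as.character(as_date(start))
records$month     <- format(start, "%m")
records$year      <- format(start, "%Y")
records$hour      <- hour(start)
records$dayofweek <- wday(start, label = TRUE, abbr = FALSE)

# Write the result out as CSV (to a temporary directory in this sketch).
csv_path <- file.path(tempdir(), "health_records.csv")
write.csv(records, csv_path, row.names = FALSE)
```

Keeping `month` and `year` as character columns matches the `<chr>` types shown in the data tables, and `wday(..., label = TRUE)` returns an ordered factor like the `<ord>` column in the day-of-week summaries.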