The Quantified Self is a term that embodies "self-knowledge through numbers" (Ferriss, 2013). It was popularised in 2007 by Gary Wolf, who described the incorporation of technology into data acquisition on aspects of a person's daily life. Tracking the details of one's daily life to find patterns or determine causation has become quite popular since then, although Wolf originally framed the term as the intersection of data and self-improvement. The overall intent of this assignment, carried out by our team "AIrborne Analytics", was to collect and share data measures about ourselves over a four-week period.
How is mood affected by digital screen time, and is there any relationship between the two parameters? To answer this, the team collected and shared personal data over the four-week period and examined the relationship between mood and screen time. Further down the track we added other parameters, such as academic events (assignment deadlines), to determine their effect on a person's mood.
Based on reports in popular media (Alltucker, 2019), we hypothesised that digital addiction can negatively affect mood. The most intuitive way to collect digital-addiction data seemed to be via an app that measures mobile phone usage and time spent on particular apps. In this study we also examined how, alongside screen usage, other factors such as academic workload (events in the form of assignment deadlines) leave an impression (awful, neutral, happy) on mood. A measurement of daily mood was collected through a self-report form via Google Sheets (Appendix A2).
A Slack team, "AIrborne Analytics", was established on Slack (https://slack.com) so that assignment issues and ideas could be quickly shared amongst all team members. At our first meeting the team considered several different options for collecting personal data, and several apps were tried and reviewed in order to improve the reliability of the data collection methods. Several issues arose with the smartphone applications due to differences between devices, so after trials the "Moment" app was selected for collecting phone-usage data, as it works on both Android and iPhone. The datasets and methods listed in the following table were agreed, and the team recorded data from March 24th to April 28th, 2019. We also created a "Data Plan", uploaded to our Google Drive, which specified what data had to be collected and how.
Screen Usage: To measure smartphone usage, we experimented with several apps. The chosen app had to satisfy the following two conditions:
* Functional across both iPhone and Android platforms (the app Moment was suggested and selected)
* The derived data should make sense
Moment Data: Moment data was required to determine each user's app-usage time over the collection period, so that the impact of that time on a person's mood could be assessed. Moment exports its data as JSON, which has a hierarchical format, so we converted the JSON to Excel using online JSON-to-CSV converters (http://convertcsv.com/json-to-csv.htm and https://json-csv.com/).
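As an alternative to the online converters, the flattening step can also be scripted. The sketch below assumes a hypothetical export layout (a `days` array with nested `appUsages` entries); the field names are illustrative, not Moment's exact schema.

```r
# Minimal sketch: flatten a Moment-style JSON export into a flat table.
# Field names (days, date, appUsages, appName, sessionLength) are assumptions.
library(jsonlite)

moment_raw <- fromJSON("moment_export.json", simplifyVector = FALSE)

usage_df <- do.call(rbind, lapply(moment_raw$days, function(day) {
  do.call(rbind, lapply(day$appUsages, function(app) {
    data.frame(date             = day$date,
               app_name         = app$appName,
               screentime_secs  = app$sessionLength,
               stringsAsFactors = FALSE)
  }))
}))

write.csv(usage_df, "moment_usage.csv", row.names = FALSE)
```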
Data Frequency: The data needed to be aggregated to a daily frequency, starting from the 25th of March, for a period of five weeks. Each team member extracted their own data and then entered it into the Google Sheets template in the shared folder on Google Drive. Once data collection was complete, all individual data sets were consolidated into an output Google Sheet. This output sheet was designed in a normalised database format and contained the following fields.
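A minimal sketch of this daily aggregation step is shown below; the column names, the user-number tagging and the start date are assumptions about the shared template rather than its exact layout.

```r
# Sketch: aggregate per-app screen time to one row per user per day.
# Column names (user_no, date, screentime_secs) are assumptions.
library(dplyr)

usage <- read.csv("moment_usage.csv", stringsAsFactors = FALSE)
usage$user_no <- 1   # each member tags their own rows before uploading (assumed)

daily_usage <- usage %>%
  mutate(date = as.Date(date)) %>%
  filter(date >= as.Date("2019-03-25")) %>%   # start of the collection window
  group_by(user_no, date) %>%
  summarise(total_screentime_secs = sum(screentime_secs), .groups = "drop")
```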
Events: Events include the countdowns to academic assignment deadlines for the three courses the users are taking this semester (STDS, DSI and DAM). This feature plays an important role in the mood analysis, as all users are enrolled in at least two of these courses. Our aim was to determine how the deadlines for different assignments affect a person's mood, and to what level: some people are not affected by deadlines and daily-life challenges, while others are greatly affected, mentally and physically, by approaching deadlines and tasks.
Mood Data: The user's state of mind/feeling was noted at a specific time of day, to determine whether it was affected by factors such as phone usage or an approaching assignment deadline. One's mood is also greatly influenced by where (location) that person is; for example, late at night at university one might be exhausted, while at home a person may be doing better than at any other place.
Collection of Mood Data: Google Forms was used to create a "Mood Form", which collected the person's mood as free-form text. Mood entries were made at 9 pm every night, to measure the cumulative impact of any digital addiction during the day. The Google Sheet behind the form captured the following fields.
Google Timeline Data: My Google timeline data was used to determine how many location points Google had saved about me. The timeline statistics give the number of points Google has collected per day, per month and per year, and were also used to generate heatmaps of where I travelled across the globe and how frequently.
Google Drive was used for all shared information, and each team member uploaded the data they had collected. The datasets were small enough to be uploaded as individual Google Sheets / Excel documents without any problems. The data was then combined into one Excel workbook, with one spreadsheet per dataset containing the combined data for all team members.
Everyone in the team was happy with the general idea of collecting phone-usage data and checking whether mood is influenced by it. The data collection process itself ran smoothly, but data quality presented a bigger challenge than we first expected. We assumed that the smartphone applications would perform reliably, but this was often far from the case.
Before the analysis, the entire data set was cleaned and consolidated into a single data file on the basis of the unique user_id assigned to each user in the datasets.
Moment Dataset: Outliers and discrepancies (such as time spent on the phone being recorded as 26 hours in a day, or background apps appearing as the most used on a regular basis) were called out as and when noticed and removed. Many background apps also showed high screen-time values in seconds, so these were adjusted (background value set to 0 for app categories such as tools and system). These values were treated as outliers due to their exceptionally high values.
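A sketch of these cleaning rules in R is shown below; the column names and category labels are assumptions about the consolidated sheet.

```r
# Sketch: apply the Moment cleaning rules (drop impossible days, zero out
# background categories). Column names and category labels are assumptions.
library(dplyr)

app_usage <- read.csv("consolidated_moment.csv", stringsAsFactors = FALSE)

clean_usage <- app_usage %>%
  group_by(user_no, date) %>%
  filter(sum(screentime_secs) <= 24 * 3600) %>%   # more than 24 h in a day is impossible
  ungroup() %>%
  mutate(screentime_secs = ifelse(app_category %in% c("tools", "system"),
                                  0, screentime_secs))  # background apps set to 0
```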
Mood Dataset: Some missing values were observed where a user had forgotten to submit a mood entry on a specific date. The start dates of data collection also differed between users, which is why there are missing entries in the initial few days for two of the users; to handle this, the data set was restricted to dates from the 24th of March onwards (where data is present for all users). The remaining missing values were imputed with the mode of mood for that specific user.
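A sketch of the per-user mode imputation is shown below; the column names are assumptions.

```r
# Sketch: impute missing mood entries with that user's most frequent mood.
# Column names (user_no, date, mood) are assumptions.
library(dplyr)

mood <- read.csv("consolidated_mood.csv", stringsAsFactors = FALSE)

mode_of <- function(x) {
  x <- x[!is.na(x) & x != ""]
  names(sort(table(x), decreasing = TRUE))[1]
}

mood_imputed <- mood %>%
  group_by(user_no) %>%
  mutate(mood = ifelse(is.na(mood) | mood == "", mode_of(mood), mood)) %>%
  ungroup()
```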
Each team member was assigned a user number (1, 2, 3 and 4; initially there were six group members, but two dropped out, leaving four of us), and a unique user_id was generated by combining the user number and date, in order to determine whether multiple entries existed for a single day. In the final consolidated data set used for the analysis, the names of team members are not mentioned, to maintain anonymity (data ethics). User anonymity keeps each person's identity, psychological behaviour, the places they have been to and their responses to the mood survey protected within the group as well as from other people who have access to the data and this analysis, so the chances of this data being misused or of user information being disclosed are minimised. User information has also been removed from the Moment data for this analysis to preserve the personal identity of each individual user; doing so means a specific user, and the apps most frequently used by them, cannot be identified from this analysis. Since Moment has access to many phone features, there is a global possibility of data breaches; however, masking user identities in this specific analysis helps protect each user's information and activities. Masking user names and email addresses in the events data likewise helps protect users from being exposed.
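The sketch below illustrates how the user_id construction, duplicate check and anonymised consolidation could be implemented; the file and column names are assumptions rather than the team's exact sheet layout.

```r
# Sketch: build the combined user_id (user number + date), flag duplicate
# daily entries, and consolidate the anonymised datasets. Names are assumptions.
library(dplyr)

mood  <- read.csv("mood_imputed.csv", stringsAsFactors = FALSE)   # user_no, date, mood, location
usage <- read.csv("clean_usage.csv",  stringsAsFactors = FALSE)   # user_no, date, screentime_secs

mood$user_id  <- paste(mood$user_no,  mood$date,  sep = "_")
usage$user_id <- paste(usage$user_no, usage$date, sep = "_")

# any user_id appearing more than once means multiple mood entries on that day
duplicates <- mood %>% count(user_id) %>% filter(n > 1)

# one anonymised row per user per day; no names or email addresses retained
consolidated <- usage %>%
  group_by(user_id, user_no, date) %>%
  summarise(total_screentime_secs = sum(screentime_secs), .groups = "drop") %>%
  left_join(mood %>% select(user_id, mood, location), by = "user_id")

write.csv(consolidated, "consolidated.csv", row.names = FALSE)
```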
Due to the significant under-collection of data for the 23rd of March and the 28th of April by two of the users, I decided to remove this incomplete data from the analysis; moving forward, I have selected the data from March 24th to April 28th, 2019.
Tools used for data analysis:
* Microsoft Excel: dataset arrangement and consolidation
* Tableau: group data
* R: individual data
Analysing the outliers and missing data revealed significant problems with the reliability and accuracy of our data recording. It was somewhat ironic that the automated smartphone applications were, for some team members, in fact the least reliable way of recording data.
Question: Is there a relationship between the distribution of screen time across app categories for different users?
Findings & Conclusions: As outlined in Fig. 1, the average screen time for a given category varies from user to user based on the interests, demands and personal activities of each individual. The average screen time recorded for social apps is generally higher than for other categories for every user except user 3, whose reference-app usage exceeds social. Since most of our assignment-related and group conversations were carried out through WhatsApp and other social apps, this could be one reason for the high social usage (usually high before assignment deadlines). The reference category, on the other hand, includes apps such as Slack and other apps used for assignment conversations and related work, so it appears user 3 already had these apps installed on his/her phone (user 3 has the highest screen time for reference, even higher than social). Compared with all other users, user 4 has the lowest screen usage and user 3 has the highest recorded time (Appendix).
Fig. 1 Average screen-time distribution across each user and category
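A Fig. 1 style summary (average screen time per user and app category) could be produced along the lines of the sketch below; the column names are assumptions about the consolidated data.

```r
# Sketch: average daily screen time per user and app category (Fig. 1 style).
# Column names are assumptions about the consolidated data.
library(dplyr)
library(ggplot2)

usage <- read.csv("clean_usage.csv", stringsAsFactors = FALSE)

avg_by_category <- usage %>%
  group_by(user_no, app_category) %>%
  summarise(avg_screentime_mins = mean(screentime_secs) / 60, .groups = "drop")

ggplot(avg_by_category,
       aes(x = app_category, y = avg_screentime_mins, fill = factor(user_no))) +
  geom_col(position = "dodge") +
  labs(x = "App category", y = "Average screen time (minutes)", fill = "User")
```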
Question: Is there a relationship between mood (self-rated) and phone screen time (Horwood, 2019)?
Studies show that "increased duration of mobile phone use is associated with unfavourable psychological mood, in particular, a depressed mood" (Ikeda, 2014). Comparing each day's phone screen time across the team against its effect on mood revealed several insights that I was not expecting, and which differ considerably from what such studies report (Ikeda, 2014); these are shown below.
Findings & Conclusions: As outlined in Fig. 2a, there is no negative effect of screen time upon self-rated mood; it must also be noted that positive moods are largely observed across average screen time for all users. At this stage I was yet to determine the accuracy of the sentiment score applied to the unstructured data set (mood notes), or the accuracy of the screen time reported by the Moment app, so this also serves as a sense check of analysis using larger datasets (global-level analysis). From Table 1 (Appendix), values of -1, 0 and +1 were assigned to the different mood types, classifying them as awful (sad), neutral and happy (excited). Fig. 2a plots the relationship between average screen time and the different mood types for each user, and Fig. 2b shows the mood and screen-time correlation per user.
Fig. 2a Average screen time and mood distribution for each user
Fig. 2b Mood and screen-time correlation per user
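The scoring and per-user correlation behind a Fig. 2b style chart could be computed as sketched below; the mood labels and column names are assumptions.

```r
# Sketch: map mood labels to -1/0/+1 and compute the per-user correlation
# between screen time and mood (Fig. 2b style). Labels and columns are assumptions.
library(dplyr)

data <- read.csv("consolidated.csv", stringsAsFactors = FALSE)

mood_score <- c("Awful" = -1, "Neutral" = 0, "Happy" = 1)

per_user_corr <- data %>%
  mutate(mood_value = mood_score[mood]) %>%
  group_by(user_no) %>%
  summarise(correlation = cor(total_screentime_secs, mood_value,
                              use = "complete.obs"),
            .groups = "drop")
```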
Comparing the time spent on the phone each day against the team average, and its effect on mood, revealed several insights that I was not expecting (Appendix A3_1).
Question: Is there any effect on mood (self-rated) from the place where the user is?
The mood dataset required manual data recording at a set time each day, along with a location entry. Looking at the figure (mood vs location), it is clear that some team members got close to this, but they were often entering a standard response each time and perhaps not putting in the thought required. Alternatively, it is possible that a team member in a bad mood would not want to record low moods in this dataset. In hindsight, we did not discuss this issue as a team, and it is a difficult one to comment on at this stage. Surprisingly for me, it actually took some thought to work out what my mood was, and initially I often found myself entering the same value I had entered previously because I didn't feel any different from the previous data-entry time. No doubt others are more in touch with their mood and may therefore also have wider variation in readings than I had. From my experience there is clearly a lot of variance in people's susceptibility to mood changes, yet surprisingly I could not see that sort of variance in the mood in our analysis.
Findings & Conclusions: As outlined in Fig. 3, it is always going to be difficult to draw meaningful conclusions from this sort of subjective data. However, the analysis surprisingly shows a higher positive mood at all locations. From Fig. 3a we can say that the "+1" value, i.e. the overall positive mood count recorded at different locations, is higher than the overall awful and neutral mood counts. The figure also shows a higher count of positive mood for users at home and then at university (the remaining locations were filtered out due to relatively low counts for all mood types, and only a few, including home and university, are shown).
Fig. 3a Mood distribution across different locations
Fig. 3b Mood distribution across different locations for each user
As is clear from Fig. 3a, the home location has the highest recorded positive mood relative to the other locations. Fig. 3b shows that, among all users, user 2 has the highest recorded positive mood at home, followed by user 1.
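The counts behind Fig. 3a and 3b can be tabulated along these lines; the column names and location labels are assumptions.

```r
# Sketch: tabulate mood counts by location (Fig. 3a) and by user and location
# (Fig. 3b). Column names and location labels are assumptions.
library(dplyr)

data <- read.csv("consolidated.csv", stringsAsFactors = FALSE)

data %>%
  filter(location %in% c("Home", "University")) %>%
  count(location, mood)                 # Fig. 3a style counts

data %>%
  filter(location %in% c("Home", "University")) %>%
  count(user_no, location, mood)        # Fig. 3b style counts
```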
Question: Is there any effect on mood (self-rated) from the events (academic deadlines) taking place?
The events variables were added to the analysis at a much later stage, to figure out whether our moods had been affected by the academic assignment deadlines we had. I was also keen to see how my own mood had responded to the deadlines, as, being new to this field, the study load was affecting me inside out. But, just like all the other team members, my recorded mood remained unaffected by these events, irrespective of the rough time I had meeting all those deadlines.
Findings & Conclusions: Fig. 4 shows the mood distribution across the different events for all users. From Fig. A3-g (Appendix) it is also clear that an overall positive mood was observed in response to all the academic events, which include the countdowns to subject assignments. The figure below shows that an overall positive mood was recorded by all users.
Fig. 4 Mood distribution across different academic events
Question: Is it true my phone keeps a map that tracks my whereabouts? Why, and if so, how can I see it? (Biersdorfer, 2017)
Findings: To find an answer to this question, I dug into my location history to check what Google has been recording about me in all the years I have been using a smartphone, and the results were more surprising than anticipated. What I found:
Question: For how long has Google been collecting my data?
Findings: The Google timeline analysis shows that Google has been collecting my data from November 2014 to the present. However, this analysis covers the data collected from November 2014 to May 2019 (Appendix A3-e).
Question: What places, and which locations, have I visited around the globe?
Findings: Using my timeline data, I was able to see the places I have visited. Fig. 5 shows a heat map of the places visited, with counts showing the number of data points Google has collected about me. Zooming in on each location gives further information about the places I have been to there, and since Google is linked to my gallery and photos it even provides the snapshots and memories associated with that location that I have saved on my phone.
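A sketch of how the point counts and heat map can be produced from a Google Takeout "Location History.json" export is shown below; the field names (latitudeE7, longitudeE7, timestampMs) follow the legacy export format and may differ in newer exports.

```r
# Sketch: count stored location points per day/month/year and draw a simple
# density map from a Google Takeout "Location History.json" export.
# Field names follow the legacy export format and may differ in newer exports.
library(jsonlite)
library(dplyr)
library(ggplot2)

hist_raw <- fromJSON("Location History.json")

points <- hist_raw$locations %>%
  transmute(lat  = latitudeE7 / 1e7,
            lon  = longitudeE7 / 1e7,
            date = as.Date(as.POSIXct(as.numeric(timestampMs) / 1000,
                                      origin = "1970-01-01", tz = "UTC")))

points %>% count(date)                           # points per day
points %>% count(month = format(date, "%Y-%m"))  # points per month
points %>% count(year  = format(date, "%Y"))     # points per year

# simple heat map of visited places
ggplot(points, aes(lon, lat)) +
  geom_bin2d(bins = 200) +
  coord_quickmap() +
  labs(x = "Longitude", y = "Latitude", fill = "Point count")
```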
Processing time for this analysis (R system.time output): user 154.50 s, system 3.69 s, elapsed 160.82 s.