Write a 300-word piece containing your analysis and your reflection on this experience. You will write it all here in RStudio. Make sure you address the following questions:
# Write your code to create data visualization #1 here. If you make more charts, you can copy and paste this code chunk as many times as you want. REMEMBER TO ADD TITLES AND LABELS TO ALL OF YOUR CHARTS!
The question given was “How often do I go to the gym on days where I’m busy?” To answer it, we first break the question into smaller pieces: determine what a busy day looks like, determine how often the person goes to the gym, and then examine how often gym visits fall on those “busy” days.
To find out how often the person goes to the gym, we first look at the time spent on each activity on each day of the three-week period beginning 10/16. A column chart was created to represent this data visually. Observation: The height of each column shows the total time logged that day, and the colors show how that time is split across activities. From the chart, Oct 16, Oct 18, Oct 23, Oct 25, and Oct 30 stand out as the busiest days.
ggplot(calendar_data, mapping = aes(x = date, y = duration, fill = activity)) +
  geom_col() +
  ggtitle("Time spent on each activity over the three weeks from 10/16") +
  xlab("Date") + ylab("Time (Minutes)")
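The busiest days can also be read off numerically rather than by eye. The sketch below is only an illustration: `toy_calendar` is an invented stand-in with the same three columns as `calendar_data`, and the rule "busy means above the median daily total" is an assumption, not a definition taken from the assignment.

```r
library(dplyr)

# Invented toy data with the same columns as calendar_data (not the real data).
toy_calendar <- data.frame(
  date = as.Date(c("2000-01-03", "2000-01-03", "2000-01-04",
                   "2000-01-05", "2000-01-05", "2000-01-06")),
  activity = c("work", "gym", "work", "work", "gym", "class"),
  duration = c(480, 90, 240, 420, 60, 300)  # minutes
)

# Total minutes logged per day, busiest day first.
daily_totals <- toy_calendar %>%
  group_by(date) %>%
  summarise(total_minutes = sum(duration), .groups = "drop") %>%
  arrange(desc(total_minutes))

# Assumed rule: a day is "busy" if its total exceeds the median daily total.
busy_days <- daily_totals %>%
  filter(total_minutes > median(total_minutes)) %>%
  pull(date)

print(daily_totals)
print(busy_days)
```

A different threshold (for example, the top five days) would work just as well; the point is to state the rule explicitly so the chart reading can be checked.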
Additionally, a bar graph was generated to show how many times each activity was logged over the three-week period. Observation: The bar graph indicates that both work and gym activities occurred frequently throughout the three weeks. Notably, gym sessions appear on every one of the busiest days identified in the column chart.
ggplot(calendar_data, mapping = aes(x = activity)) +
  geom_bar() +
  ggtitle("Number of times each activity was logged over the 3-week period") +
  xlab("Activity") + ylab("Number of entries") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
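The two observations above (how often each activity occurs, and whether the gym shows up on every busy day) can be checked directly in code. This is a minimal sketch on invented data: `toy_calendar` and the `busy_days` vector are hypothetical placeholders, not values from the real calendar.

```r
library(dplyr)

# Invented toy data with the same columns as calendar_data (not the real data).
toy_calendar <- data.frame(
  date = as.Date(c("2000-01-03", "2000-01-03", "2000-01-04",
                   "2000-01-05", "2000-01-05", "2000-01-06")),
  activity = c("work", "gym", "work", "work", "gym", "class"),
  duration = c(480, 90, 240, 420, 60, 300)  # minutes
)

# The same counts that geom_bar() draws: number of entries per activity.
activity_counts <- toy_calendar %>%
  count(activity, sort = TRUE)

# Hypothetical busy days; in the real analysis these come from the column chart.
busy_days <- as.Date(c("2000-01-03", "2000-01-05"))

# Dates that have at least one gym entry.
gym_days <- toy_calendar %>%
  filter(activity == "gym") %>%
  distinct(date) %>%
  pull(date)

# TRUE only if every busy day also has a gym entry.
gym_on_every_busy_day <- all(busy_days %in% gym_days)

print(activity_counts)
print(gym_on_every_busy_day)
```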
Next, a table was created showing the total number of minutes spent on each activity per day. This table was then reshaped into a wide calendar table with one row per date and one column per activity. Observation: The wide table helped identify the busiest days and quantify the minutes allocated to each activity on those days. The key observation is that, on the busy days identified earlier, an average of 99 minutes is spent at the gym.
byactivityAdate <- calendar_data %>%
  group_by(date, activity) %>%
  summarize(duration = sum(duration) %>% as.numeric())
## `summarise()` has grouped output by 'date'. You can override using the
## `.groups` argument.
widecalendar_data <- byactivityAdate %>%
  select(activity, duration, date) %>%
  mutate(duration = as.numeric(duration)) %>%
  pivot_wider(id_cols = date, names_from = activity, values_from = duration, values_fill = 0)
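An average such as gym minutes per busy day can be computed from a wide table of this shape. The sketch below shows one possible way, using invented data: `toy_long`, `toy_wide`, and `busy_days` are hypothetical names and values, so the result does not reproduce the figure reported in this analysis.

```r
library(dplyr)
library(tidyr)

# Invented toy data with the same columns as calendar_data (not the real data).
toy_long <- data.frame(
  date = as.Date(c("2000-01-03", "2000-01-03", "2000-01-04",
                   "2000-01-05", "2000-01-05", "2000-01-06")),
  activity = c("work", "gym", "work", "work", "gym", "class"),
  duration = c(480, 90, 240, 420, 60, 300)  # minutes
)

# One row per date, one column per activity; missing combinations become 0.
toy_wide <- toy_long %>%
  group_by(date, activity) %>%
  summarise(duration = sum(duration), .groups = "drop") %>%
  pivot_wider(id_cols = date, names_from = activity,
              values_from = duration, values_fill = 0)

# Hypothetical busy days; in the real analysis these come from the charts.
busy_days <- as.Date(c("2000-01-03", "2000-01-05"))

# Mean gym minutes across the busy days only.
avg_gym_minutes <- toy_wide %>%
  filter(date %in% busy_days) %>%
  summarise(avg = mean(gym)) %>%
  pull(avg)

print(toy_wide)
print(avg_gym_minutes)
```

Passing `.groups = "drop"` to `summarise()` returns an ungrouped table and silences the "has grouped output by 'date'" message.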
What difficulties in the data collection & analysis process did you encounter?
The retrieved data contained information unnecessary for the project, and sifting through over 300 rows of observations was overwhelming. However, the provided code for data cleaning and analysis made the process significantly easier.
When providing your data, what expectations do you have of the person receiving your data?
I expect that they will be able to analyze my data easily once it has been cleaned. The intention is that the data, once processed and organized, supports a straightforward analysis by anyone who accesses it.
As someone who receives and analyzes others’ data, what are your ethical responsibilities?
As a recipient and analyst of others’ data, my ethical responsibilities include ensuring the confidentiality and privacy of the data I receive. It is essential to follow legal and ethical guidelines for data protection and to use the information only for its intended purpose. Transparency in the analysis process is also essential, so I must communicate clearly how the data is used and interpreted. Additionally, any insights or findings I draw from the data should be presented accurately and without bias. Finally, I must obtain consent where required and respect the rights and intentions of the data provider, which is a fundamental ethical consideration in the use and analysis of data.