Project Plotly 2

Pavel

2024-03-07

Interactive graphics

An interactive user interface makes graphics much more attractive: you can click on an element of a graph and get additional details.

The plot_ly() function from the plotly package is a great tool for building such interactive graphics. It is demonstrated below using a previous project on physical activity that analyzed the number of steps taken.

The base graphics were replaced with plot_ly graphics, which made the report livelier.

It is also VERY EASY to switch from ggplot to plot_ly with the ggplotly() function.
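
As a minimal sketch of that switch, the snippet below builds an ordinary ggplot and passes it to ggplotly(). The built-in mtcars dataset and the names p and p_interactive are used purely for illustration and are not part of this project.

```r
library(ggplot2)
library(plotly)

# Static ggplot2 scatter plot (mtcars is only an illustrative dataset)
p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

# Convert the ggplot object into an interactive plotly widget
p_interactive <- ggplotly(p)
p_interactive
```

The ggplot code itself does not change; only the final object is wrapped.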

Loading and pre-processing the data

The activity data were downloaded from Activity monitoring data on January 6, 2024.

The zip file (“activity.zip”) was downloaded and extracted to “activity.csv” without any pre-processing or modification. The file “activity.csv” was used for the analysis presented below.

The R function read.csv() was used to load the data into R.

        df <- read.csv("activity.csv")
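
The extraction and loading steps described above could be scripted roughly as follows. This is a minimal sketch that assumes “activity.zip” sits in the working directory; the names zip_file and csv_file are introduced here only for illustration.

```r
# Assumed file locations (working directory); adjust as needed
zip_file <- "activity.zip"
csv_file <- "activity.csv"

# Extract the CSV only if it is not already present
if (!file.exists(csv_file) && file.exists(zip_file)) {
        unzip(zip_file)
}

# Load the data and inspect the column types before any analysis
if (file.exists(csv_file)) {
        df <- read.csv(csv_file)
        str(df)
}
```

Guarding both steps with file.exists() keeps the script safe to re-run.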

What is the mean total number of steps taken per day? 1/2

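        suppressPackageStartupMessages(library(plotly))

        # Total steps per day; the formula interface drops rows with missing steps
        total_step_number <- aggregate(steps ~ date, df, sum)
        # as.integer() truncates the mean and median to whole steps
        s_mean <- as.integer(mean(total_step_number$steps))
        s_median <- as.integer(median(total_step_number$steps))
        cat("Number of steps per day:",
            paste("Mean -", s_mean, " Median -", s_median))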
## Number of steps per day: Mean - 10766  Median - 10765
        # Histogram of daily totals, stored in fig1
        fig1 <- plot_ly(total_step_number, x = ~steps, type = "histogram",
                        marker = list(color = "lightgreen")) %>%
                layout(title = "Number of steps per day")
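
One detail of the aggregate() call above is worth illustrating: with the formula interface, rows with missing values are dropped by default (na.action = na.omit), so a day with no recorded steps is left out of the totals rather than counted as zero. The sketch below shows this on a small made-up data frame; toy and toy_totals are hypothetical names and the values are not taken from the activity data.

```r
# Made-up example data: day "d2" has only missing step counts
toy <- data.frame(
        steps = c(10, 20, NA, NA, 5, 15),
        date  = c("d1", "d1", "d2", "d2", "d3", "d3")
)

# Formula interface: rows with NA are removed before summing
toy_totals <- aggregate(steps ~ date, toy, sum)
print(toy_totals)   # only d1 and d3 appear; d2 is dropped
```

This means the mean and median are computed only over days that have at least one recorded value.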

What is the mean total number of steps taken per day? 2/2

What is the average daily activity pattern? 1/2

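        # Mean steps for each interval, averaged across all days
        average_activity <- aggregate(steps ~ interval, df, mean)
        # Interval with the highest average number of steps
        max_int <- average_activity$interval[which.max(average_activity$steps)]
        cat(paste("Maximum activity interval -", max_int))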
## Maximum activity interval - 835
        # Line chart of the average daily profile, stored in fig2
        fig2 <- plot_ly(average_activity, x = ~interval, y = ~steps,
                        type = "scatter", mode = "lines") %>%
                layout(title = "Average activity across the day")
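
The interval value reported above is easier to read as a time of day. The sketch below assumes that the interval identifier encodes the clock time as hours × 100 + minutes, so that 835 would mean 08:35. This encoding is an assumption about the dataset rather than something stated in this report, and interval_to_time is a hypothetical helper.

```r
# Hypothetical helper: convert an HHMM-style interval id to "HH:MM"
# (assumes interval = hours * 100 + minutes)
interval_to_time <- function(interval) {
        interval <- as.integer(interval)
        sprintf("%02d:%02d", interval %/% 100L, interval %% 100L)
}

interval_to_time(835)   # "08:35"
```

The same helper could be used to relabel the x-axis of the line chart if clock times are preferred.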

What is the average daily activity pattern? 2/2

Are there differences in activity patterns between weekdays and weekends? 1/2

        suppressPackageStartupMessages({
                library(dplyr)
                library(lubridate)
                library(ggplot2)
        })
        # wday() numbers days 1 (Sunday) to 7 (Saturday) by default
        df <- mutate(df, day = wday(date))
        df <- mutate(df, day2 = "weekday")
        df[df$day %in% c(1, 7), "day2"] <- "weekend"

        # Average steps per interval, separately for weekdays and weekends
        df_wd <- group_by(df, day2, interval)
        sum_df <- summarise(df_wd, av_steps = mean(steps, na.rm = TRUE),
                            .groups = "drop_last")

        # Build the faceted ggplot, then convert it to plotly with ggplotly()
        plt <- ggplot(sum_df, aes(x = interval, y = av_steps)) + geom_line()
        prn <- plt + ggtitle("Activity by working days and weekends") +
                xlab("Interval") + ylab("Steps") +
                facet_grid(day2 ~ .)
        prn <- ggplotly(prn)
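
The weekday/weekend split above relies on lubridate's wday(), whose numbering depends on its week_start setting. As an alternative sketch (not the method used in this report), the same labels can be derived in base R with format(x, "%u"), which returns 1 (Monday) to 7 (Sunday). The three dates below are illustrative only, and dates, iso_day and day_type are hypothetical names.

```r
# Illustrative dates: a Friday, a Saturday and a Sunday
dates <- as.Date(c("2012-10-05", "2012-10-06", "2012-10-07"))

# "%u" gives the ISO weekday number: 1 = Monday, ..., 7 = Sunday
iso_day <- as.integer(format(dates, "%u"))

# Saturday (6) and Sunday (7) are labelled as weekend
day_type <- ifelse(iso_day >= 6, "weekend", "weekday")
day_type
```

Because "%u" is numeric, this labelling does not depend on the locale's day names.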

Are there differences in activity patterns between weekdays and weekends? 2/2