## Introduction
This report provides an exploratory data analysis (EDA) of the website usage data for PlanningAlerts.ie. The goal is to identify key trends and insights that can inform future marketing strategies.
```r
library(tidyverse)

# Import planning_alerts_data.csv and parse the tfc_stamped text field
# (day-month-year hour:minute) into a datetime, shown as YYYY-MM-DD HH:MM:SS.
# The parsed column then replaces the original tfc_stamped field.
pa_data <- read_csv("planning_alerts_data.csv") %>%
  mutate(tfc_stamped_dt = dmy_hm(tfc_stamped)) %>%
  select(tfc_id, tfc_stamped_dt, tfc_cookie:tfc_referrer) %>%
  rename(tfc_stamped = tfc_stamped_dt)
```

## Data Preprocessing
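Before relying on the converted timestamps, it may be worth confirming that `dmy_hm()` parsed every value, since values it cannot parse become `NA`. The following is a minimal sketch on made-up strings: the sample values are hypothetical, and the day/month/year hour:minute layout is an assumption inferred from the use of `dmy_hm()` on the import step.

```r
library(lubridate)

# Hypothetical raw values; the real tfc_stamped layout is assumed to be "DD/MM/YYYY HH:MM"
raw <- c("01/03/2024 09:15", "15/03/2024 23:40", "not a date")

# quiet = TRUE suppresses the parse warning; values that fail to parse become NA
parsed <- dmy_hm(raw, quiet = TRUE)

# Number of values that failed to parse
n_failed <- sum(is.na(parsed))
n_failed

# Successfully parsed values can be shown as YYYY-MM-DD HH:MM:SS (UTC by default)
first_formatted <- format(parsed[1], "%Y-%m-%d %H:%M:%S")
first_formatted
```

Running the same `sum(is.na(...))` check on the `tfc_stamped` column of the imported data would show whether any rows need attention before the hourly analysis.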
Extract the hour from the `tfc_stamped` column and count the distinct `tfc_cookie` values (unique users) in each hour:
users_by_hour <- pa_data %>% mutate(hour = hour(tfc_stamped)) %>% group_by(hour) %>% summarise(user_count = n_distinct(tfc_cookie)) # Count unique users by hour
View the result:
print(users_by_hour)
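To illustrate why `n_distinct(tfc_cookie)` is used rather than a plain row count, here is a small sketch on a hypothetical four-row sample. The data and the `page_views` column name are made up for illustration, and the sketch assumes each row represents one traffic record.

```r
library(dplyr)
library(lubridate)

# Hypothetical sample: cookie "a" appears twice within the 09:00 hour
toy <- tibble(
  tfc_cookie  = c("a", "a", "b", "c"),
  tfc_stamped = ymd_hms(c("2024-03-01 09:05:00", "2024-03-01 09:45:00",
                          "2024-03-01 09:30:00", "2024-03-01 14:10:00"))
)

toy_by_hour <- toy %>%
  mutate(hour = hour(tfc_stamped)) %>%
  group_by(hour) %>%
  summarise(
    page_views = n(),                    # every row counts
    user_count = n_distinct(tfc_cookie)  # each cookie counts once per hour
  )

toy_by_hour
```

In this sample the 09:00 hour has three rows but only two distinct cookies, so a row count would overstate the number of users. Note that a cookie active in several hours is counted once in each of them, so the hourly counts should not be summed to get a total number of users.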