Planning Alerts Project

Introduction

This report provides an Exploratory Data Analysis (EDA) of website usage data for PlanningAlerts.ie, aiming to uncover key trends and insights in user behavior. The analysis includes examining session frequency, referral traffic, device usage, and user retention. Insights gained from the data will inform future marketing strategies, helping optimize resource allocation and user engagement. Understanding peak usage times, referral sources, and device preferences will guide improvements to the user experience. The goal is to leverage these findings to enhance platform performance and increase user retention.

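library(tidyverse)
library(lubridate)  # provides dmy_hm(); tidyverse versions before 2.0.0 do not attach it

# Read the raw log, parse the timestamp, and keep the columns used below
planning_data <- read_csv("planning_alerts_data.csv") %>%
  mutate(tfc_stamped_dt = dmy_hm(tfc_stamped)) %>%
  select(tfc_id, tfc_stamped_dt, tfc_cookie:tfc_referrer) %>%
  rename(tfc_stamped = tfc_stamped_dt)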

Data Preprocessing

# Load necessary library
library(dplyr)
library(lubridate)

# Extract the hour from the 'tfc_stamped' column and count unique users by hour
users_by_hour <- planning_data %>%
  mutate(hour = hour(tfc_stamped)) %>%
  group_by(hour) %>%
  summarise(user_count = n_distinct(tfc_cookie))  # Count unique users by hour

# Re-read the raw file into 'data', the data frame used in the rest of the analysis
data <- read_csv("planning_alerts_data.csv")

# Convert date-time column to datetime format
data$tfc_stamped <- dmy_hm(data$tfc_stamped)

# Create additional time-related columns
data$date <- as.Date(data$tfc_stamped)
data$hour <- hour(data$tfc_stamped)
data$day_of_week <- wday(data$tfc_stamped, label = TRUE)
data$week <- isoweek(data$tfc_stamped)
data$month <- month(data$tfc_stamped, label = TRUE)
# Load necessary libraries
library(dplyr)
library(ggplot2)
library(kableExtra)
library(lubridate)  # Ensure 'hour()' function works if not already extracted

# Extract hour from 'tfc_stamped' if needed and calculate unique users by hour
unique_users_hour <- data %>%
  mutate(hour = hour(tfc_stamped)) %>%  # Extract hour if not already done
  group_by(hour) %>%
  summarise(unique_users = n_distinct(tfc_cookie))

# Display the table using kable and kable_styling
unique_users_hour %>%
  arrange(hour) %>%
  kable(caption = "Unique Users by Hour", col.names = c("Hour", "Number of Unique Users")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)
Unique Users by Hour
Hour Number of Unique Users
0 5907
1 5549
2 5360
3 7057
4 5895
5 5630
6 5556
7 6830
8 8768
9 9133
10 11012
11 10348
12 9577
13 9059
14 10130
15 11469
16 12032
17 10613
18 10224
19 9369
20 9769
21 9148
22 8731
23 7643

# Plot unique users by hour (geom_col draws bars from pre-aggregated counts)
ggplot(unique_users_hour, aes(x = hour, y = unique_users)) +
  geom_col(fill = "#aed6f1") +
  labs(title = "Unique Users by Hour (Bar Chart)", x = "Hour", y = "Number of Unique Users") +
  theme_minimal()

Key Insights and Actionable Outcome:

  • Peak Hours: Counted over the whole period, unique users peak at 16:00 (12,032), with 15:00 close behind (11,469) and a second, mid-morning peak at 10:00 (11,012). Every hour from 09:00 to 21:00 records more than 9,000 unique users.

    • Actionable Insight: Send email notifications and promotions mid-morning or mid-afternoon, when users are most likely to be online, and make sure server capacity covers the 15:00 to 16:00 peak.
  • Off-Peak Hours: Activity is lowest between 00:00 and 06:00, bottoming out at 02:00 (5,360 unique users). The 03:00 hour is an exception at 7,057 and is worth checking (for example, for automated traffic) before it is read as genuine demand.

    • Actionable Insight: Schedule feature rollouts and maintenance tasks in the early-morning hours, where they affect the fewest users.

In Summary:

This analysis helps the team understand when users are most active, facilitating better resource allocation, targeted marketing, and operational improvements based on hourly user behaviour.
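
The peak and off-peak hours discussed above can be read off programmatically rather than by eye. The following is a minimal sketch in base R, assuming the hourly counts are typed in from the "Unique Users by Hour" table; in the report itself the same ranking could be applied directly to `unique_users_hour`. The names `ranked`, `peak_hours`, and `off_peak_hours` are illustrative and not part of the original analysis.

```r
# Hourly unique-user counts copied from the "Unique Users by Hour" table
unique_users_hour <- data.frame(
  hour = 0:23,
  unique_users = c(5907, 5549, 5360, 7057, 5895, 5630, 5556, 6830,
                   8768, 9133, 11012, 10348, 9577, 9059, 10130, 11469,
                   12032, 10613, 10224, 9369, 9769, 9148, 8731, 7643)
)

# Rank hours from busiest to quietest
ranked <- unique_users_hour[order(-unique_users_hour$unique_users), ]

peak_hours <- head(ranked, 3)      # three busiest hours
off_peak_hours <- tail(ranked, 3)  # three quietest hours

print(peak_hours)
print(off_peak_hours)
```

Ranking the table this way keeps the choice of "peak" explicit (here, the top three hours), which makes it easy to re-run when the data set is refreshed.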

# Load necessary libraries
library(dplyr)
library(ggplot2)
library(kableExtra)
library(lubridate)

# Calculate unique users by day
unique_users_day <- data %>%
  mutate(date = as.Date(tfc_stamped)) %>%  # Extract the date
  group_by(date) %>%
  summarise(unique_users = n_distinct(tfc_cookie))

# Display the table using kable and kable_styling
unique_users_day %>%
  arrange(date) %>%
  kable(
    caption = "Unique Users by Day",
    col.names = c("Date", "Number of Unique Users")
  ) %>%
  kable_styling(
    bootstrap_options = c("striped", "hover", "condensed"),
    full_width = FALSE
  )
Unique Users by Day
Date Number of Unique Users
2024-06-14 2067
2024-06-15 1549
2024-06-16 1079
2024-06-17 1277
2024-06-18 1447
2024-06-19 1710
2024-06-20 3344
2024-06-21 1957
2024-06-22 2511
2024-06-23 3576
2024-06-24 1913
2024-06-25 1843
2024-06-26 3008
2024-06-27 3174
2024-06-28 1394
2024-06-29 1308
2024-06-30 851
2024-07-01 2097
2024-07-02 1104
2024-07-03 1527
2024-07-04 3410
2024-07-05 2146
2024-07-06 1152
2024-07-07 1881
2024-07-08 1288
2024-07-09 1785
2024-07-10 2424
2024-07-11 2337
2024-07-12 1454
2024-07-13 1065
2024-07-14 4371
2024-07-15 1433
2024-07-16 2456
2024-07-17 4853
2024-07-18 4592
2024-07-19 3921
2024-07-20 3023
2024-07-21 2067
2024-07-22 4707
2024-07-23 4649
2024-07-24 3703
2024-07-25 2586
2024-07-26 1999
2024-07-27 2310
2024-07-28 3190
2024-07-29 3495
2024-07-30 3560
2024-07-31 4202
2024-08-01 3266
2024-08-02 3390
2024-08-03 2852
2024-08-04 2355
2024-08-05 2061
2024-08-06 3756
2024-08-07 2984
2024-08-08 2490
2024-08-09 1682
2024-08-10 1369
2024-08-11 2130
2024-08-12 2440
2024-08-13 2498
2024-08-14 2115
2024-08-15 2882
2024-08-16 2465
2024-08-17 1885
2024-08-18 2619
2024-08-19 2707
2024-08-20 4601
2024-08-21 3799
2024-08-22 3523
2024-08-23 3604
2024-08-24 2731
2024-08-25 2479
2024-08-26 4054
2024-08-27 2950
2024-08-28 2893
2024-08-29 359

# Plot unique users by day
ggplot(unique_users_day, aes(x = date, y = unique_users)) +
  geom_line(color = "#aed6f1", linewidth = 1.2) +  # 'linewidth' replaces 'size' for lines in ggplot2 >= 3.4.0
  geom_point(color = "#3498db", size = 2) +   # Add points for clarity
  labs(
    title = "Unique Users by Day (Line Chart)",
    x = "Date",
    y = "Number of Unique Users"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(hjust = 0.5, size = 14, face = "bold"),
    axis.title.x = element_text(size = 12),
    axis.title.y = element_text(size = 12),
    axis.text.x = element_text(angle = 45, hjust = 1)  # Rotate date labels for clarity
  )

Key Insights and Actionable Outcome:

  • Trends Over Time: Across full days, daily unique users range from 851 (Sunday 30 June) to 4,853 (Wednesday 17 July). Engagement steps up from mid-July: most days before 14 July stay below 2,500 unique users, while most days from 14 July onward exceed that level.

    • Actionable Insight: Weekends are not reliably busier. Several are among the quietest days (1,079 on Sunday 16 June, 1,065 on Saturday 13 July), although Sunday 14 July reached 4,371. Plan campaigns around the weekday peaks and treat weekend spikes as events to investigate, not as a pattern to rely on.
  • Identifying Peaks and Drops: Sharp changes stand out clearly, such as the jump from 1,710 to 3,344 between 19 and 20 June and the fall from 3,174 to 1,394 between 27 and 28 June. The last day, 29 August (359), appears to be a partial day at the end of the export and should be excluded from comparisons.

    • Actionable Insight: Check each sharp drop against site downtime, poor user experience, or competing promotions, and check each spike against campaigns or content releases, so that the causes of high-traffic days can be repeated.
  • Long-Term Engagement: The plot shows whether engagement is sustained or erratic. Here the level is higher in late July and August than in June, but day-to-day swings remain large, so a smoothed view is a better guide to the underlying trend than any single day.

In Summary:

The line graph gives a clear visual representation of user activity over time. By analyzing daily unique user trends, the team can make data-driven decisions on how to improve user engagement, optimize marketing efforts, and manage site traffic.

Strategic Recommendations Based on Insights:

  1. Enhance Engagement on Low-Traffic Days: If certain days have lower user activity, consider running targeted promotions, notifications, or content updates to boost engagement on those days.

  2. Leverage High-Traffic Days: The busiest days in this period were weekdays (for example Wednesday 17 July and Monday 22 July), so focus marketing efforts and content releases on weekdays instead of assuming a weekend peak.

  3. Investigate User Drop-offs: Identify and investigate sharp drops in user activity. Look for possible causes, such as site performance issues, and address them to ensure consistent user engagement.

  4. Target Repeat Users: The data can also be used to track repeat engagement. If a day shows a large influx of new users, consider following up with personalized recommendations to encourage repeat visits.

Conclusion Summary:

The unique users by day analysis gives clear insights into user behaviour trends, which can be used to make strategic decisions about content scheduling, marketing campaigns, and platform optimization. By using these insights, the Planning Alerts platform can boost user engagement, identify peak activity times, and develop strategies to improve user retention and satisfaction.
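
Whether weekends really differ from weekdays can be checked with a short comparison. The sketch below uses base R and, as an assumption made to keep it self-contained, only the first 14 days of the "Unique Users by Day" table; the column `day_type` and the object `avg_by_type` are illustrative names. Applied to the full `unique_users_day` data frame, the same steps would cover the whole period.

```r
# First 14 days of the "Unique Users by Day" table (14 to 27 June 2024)
daily <- data.frame(
  date = seq(as.Date("2024-06-14"), by = "day", length.out = 14),
  unique_users = c(2067, 1549, 1079, 1277, 1447, 1710, 3344,
                   1957, 2511, 3576, 1913, 1843, 3008, 3174)
)

# POSIXlt weekday numbers are locale-independent: 0 = Sunday, 6 = Saturday
wday_num <- as.POSIXlt(daily$date)$wday
daily$day_type <- ifelse(wday_num %in% c(0, 6), "Weekend", "Weekday")

# Average unique users per day for each day type
avg_by_type <- aggregate(unique_users ~ day_type, data = daily, FUN = mean)
print(avg_by_type)
```

Two weeks is too short a window to draw conclusions from, so the result is only meaningful once the full date range is used.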

# Load necessary libraries
library(dplyr)
library(ggplot2)
library(kableExtra)

# Calculate user distribution by device type
device_type_counts <- data %>%
  group_by(tfc_device_type) %>%
  summarise(visits = n()) %>%
  arrange(desc(visits))

# Display the table using kable and kable_styling
device_type_counts %>%
  kable(
    caption = "User Distribution by Device Type",
    col.names = c("Device Type", "Number of Visits")
  ) %>%
  kable_styling(
    bootstrap_options = c("striped", "hover", "condensed"),
    full_width = FALSE
  )
User Distribution by Device Type

Device Type         Number of Visits
Desktop                       253774
Mobile (browser)               96146
Android App                    44101
iPhone App                      3133
Tablet (browser)                3061

# Plot device type distribution (geom_col draws bars from pre-aggregated counts)
ggplot(device_type_counts, aes(x = reorder(tfc_device_type, -visits), y = visits)) +
  geom_col(fill = "#aed6f1", width = 0.4) +
  labs(
    title = "User Distribution by Device Type (Bar Chart)",
    x = "Device Type",
    y = "Number of Visits"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),
    plot.title = element_text(hjust = 0.5, size = 14, face = "bold"),
    axis.title.x = element_text(size = 12),
    axis.title.y = element_text(size = 12)
  )

Key Insights and Actionable Outcome:

  • Device Preferences: Desktop accounts for 253,774 of the 400,215 recorded visits (63.4%). Mobile browsers follow with 96,146 (24.0%) and the Android app with 44,101 (11.0%); the iPhone app (3,133) and tablet browsers (3,061) each contribute under 1%.

    • Actionable Insight: Desktop is the primary experience and should be protected first. Mobile browser and app traffic together make up roughly 36% of visits, which is large enough to justify continued mobile optimization.
  • Marketing Strategies: Knowing which devices visitors use most can guide marketing strategies.

    • Actionable Insight: Design campaigns and landing pages for desktop first, and check that they also render well in mobile browsers.

    • The Android app draws about fourteen times the visits of the iPhone app, so iPhone app promotion is a gap worth examining.

  • Resource Allocation: This analysis also shows where to allocate optimization and user-support effort.

    • Actionable Insight: Weight performance testing and support roughly in line with the shares above: desktop first, then mobile browsers and the Android app.

In Summary:

The device type distribution bar chart provides insights into how users are accessing the platform and which devices they prefer. By understanding the distribution, the team can focus on optimizing device-specific experiences, target marketing efforts effectively, and ensure the platform delivers the best performance across all devices.

Strategic Recommendations Based on Insights:

  1. Enhance Mobile Experience: If mobile devices account for a significant portion of visits, prioritize improving mobile user interface, performance, and responsive design.

  2. Optimize Desktop Features: If desktop usage is dominant, ensure that desktop-centric features are well optimized and that the platform supports larger screens, detailed content, and richer interactions.

  3. Targeted Marketing: Use the insights to design targeted marketing campaigns for different devices. For instance, mobile-specific ads or campaigns can be tailored for mobile users.

  4. Cross-Device Consistency: Ensure a consistent user experience across all devices. If there is a noticeable difference in user experience between mobile and desktop, it could affect user satisfaction and retention.

Conclusion Summary:

The user distribution by device type analysis reveals the extent of device-specific engagement on the platform. By identifying the most used devices, the platform can tailor its marketing, design, and resource allocation to provide an optimized experience for the majority of users. Ultimately, understanding user preferences based on device type allows the team to create more targeted campaigns and ensure the platform’s usability is maximized across different device categories.
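
Raw visit counts are easier to compare once they are expressed as shares of the total. The following base R sketch assumes the counts are copied from the "User Distribution by Device Type" table; the column `share_pct` is an illustrative addition and does not appear in the original code.

```r
# Visit counts copied from the "User Distribution by Device Type" table
device_type_counts <- data.frame(
  tfc_device_type = c("Desktop", "Mobile (browser)", "Android App",
                      "iPhone App", "Tablet (browser)"),
  visits = c(253774, 96146, 44101, 3133, 3061)
)

# Share of all visits per device type, in percent, rounded to one decimal
total_visits <- sum(device_type_counts$visits)
device_type_counts$share_pct <- round(100 * device_type_counts$visits / total_visits, 1)

print(device_type_counts)
```

Note that these are visit (row) counts, as in the original `summarise(visits = n())`, so a heavy user on one device weighs more than a light user on another.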

# Load necessary libraries
library(dplyr)
library(ggplot2)
library(kableExtra)
library(lubridate)

# Calculate sessions by hour
sessions_hour <- data %>%
  mutate(hour = hour(tfc_stamped)) %>%  # Extract the hour if not already done
  group_by(hour) %>%
  summarise(sessions = n_distinct(tfc_session)) %>%
  arrange(hour)

# Display the table using kable and kable_styling
sessions_hour %>%
  kable(
    caption = "Sessions by Hour",
    col.names = c("Hour", "Number of Sessions")
  ) %>%
  kable_styling(
    bootstrap_options = c("striped", "hover", "condensed"),
    full_width = FALSE
  )
Sessions by Hour
Hour Number of Sessions
0 8442
1 7738
2 7275
3 8859
4 7739
5 7463
6 7667
7 9098
8 11847
9 13389
10 16006
11 15319
12 14667
13 14095
14 15177
15 16752
16 17089
17 15062
18 13977
19 13216
20 13447
21 12875
22 12387
23 10838

# Plot sessions by hour as a bar chart
ggplot(sessions_hour, aes(x = hour, y = sessions)) +
  geom_col(fill = "steelblue") +  # geom_col() is the idiomatic geom for pre-aggregated counts
  labs(
    title = "Sessions by Hour (Bar Chart)",
    x = "Hour",
    y = "Number of Sessions"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(hjust = 0.5, size = 14, face = "bold"),
    axis.title.x = element_text(size = 12),
    axis.title.y = element_text(size = 12)
  )

Key Insights and Actionable Outcome:

  • Peak Activity Hours: Sessions peak at 16:00 (17,089), followed by 15:00 (16,752) and 10:00 (16,006), mirroring the unique-user pattern.

    • Actionable Insight: Allocate more resources to the 10:00 to 17:00 window, when every hour exceeds 14,000 sessions, so the platform remains responsive, and time marketing or targeted engagement to the same window.
  • Off-Peak Hours: Sessions are lowest between 00:00 and 06:00, with a minimum of 7,275 at 02:00.

    • Actionable Insight: Use these low-traffic hours for maintenance, content updates, or experiments without disturbing user engagement. Off-peak times might also suit targeted promotions that encourage more user activity.
  • Session Trends: Because this chart aggregates all days, it shows the shape of a typical day but not whether sessions at a given hour are growing or shrinking over time. Tracking that requires splitting the hourly counts by week or month.

In Summary:

The sessions by hour analysis provides insights into user engagement based on the time of day. By understanding when users are most active and when engagement is low, the platform can optimize its infrastructure, marketing campaigns, and user experience to ensure that it delivers a seamless experience throughout the day.

Strategic Recommendations Based on Insights:

  1. Optimize Resources for Peak Hours: If certain hours show a significant increase in sessions, consider scaling server resources or implementing strategies to handle high traffic during these hours.

  2. Run Targeted Marketing Campaigns: Use the insights from the peak activity hours to schedule promotions, push notifications, or email campaigns during these times to maximize engagement.

  3. Schedule Maintenance During Off-Peak Hours: For low-traffic hours, consider scheduling website maintenance, updates, or testing new features without affecting user experience.

  4. Encourage Activity During Low-Traffic Times: If certain hours show a decline in sessions, design strategies to increase engagement during these periods, such as special offers, new content releases, or personalized notifications.

Conclusion Summary:

The sessions by hour analysis offers a valuable understanding of when users are most and least active on the platform. This insight enables the team to make data-driven decisions for optimizing resource allocation, marketing timing, and site performance. By aligning platform operations with user activity patterns, the business can ensure it delivers the best possible experience for users at all times of the day.
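
Combining the two hourly tables gives a rough intensity measure: sessions per unique user in each hour. The base R sketch below assumes both sets of counts are copied from the "Unique Users by Hour" and "Sessions by Hour" tables; the data frame `hourly` and the column `sessions_per_user` are illustrative names. As in the original code, a "user" here means a distinct `tfc_cookie` value and a "session" a distinct `tfc_session` value.

```r
# Counts copied from the two hourly tables in this report
hourly <- data.frame(
  hour = 0:23,
  unique_users = c(5907, 5549, 5360, 7057, 5895, 5630, 5556, 6830,
                   8768, 9133, 11012, 10348, 9577, 9059, 10130, 11469,
                   12032, 10613, 10224, 9369, 9769, 9148, 8731, 7643),
  sessions = c(8442, 7738, 7275, 8859, 7739, 7463, 7667, 9098,
               11847, 13389, 16006, 15319, 14667, 14095, 15177, 16752,
               17089, 15062, 13977, 13216, 13447, 12875, 12387, 10838)
)

# Average number of sessions started per unique user in each hour
hourly$sessions_per_user <- round(hourly$sessions / hourly$unique_users, 2)

print(hourly)
```

A ratio well above 1 in a given hour would suggest users returning several times within that hour across the period; a ratio near 1 suggests mostly single visits.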

# Ensure necessary libraries are loaded
library(dplyr)
library(ggplot2)
library(lubridate)

# Calculate sessions by month. Note: the data runs from 14 June to 29 August 2024,
# so June and August are partial months and are not directly comparable with July
sessions_month <- data %>%
  mutate(month = floor_date(as.Date(tfc_stamped), "month")) %>%  # Extract the month
  group_by(month) %>%
  summarise(sessions = n_distinct(tfc_session)) %>%
  arrange(month)

# Plot sessions by month as a bar chart
ggplot(sessions_month, aes(x = month, y = sessions)) +
  geom_bar(stat = "identity", fill = "#aed6f1", width = 0.5) +  # Bar chart with color and width
  geom_text(aes(label = sessions), vjust = -0.3, size = 3) +  # Add labels on bars
  labs(
    title = "Sessions by Month (Bar Chart)",
    x = "Month",
    y = "Number of Unique Sessions"
  ) +
  scale_x_date(
    date_labels = "%b %Y",  # Format x-axis labels as "Month Year"
    date_breaks = "1 month"  # Show labels for every month
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(hjust = 0.5, size = 14, face = "bold"),
    axis.title.x = element_text(size = 12),
    axis.title.y = element_text(size = 12),
    axis.text.x = element_text(angle = 45, hjust = 1)  # Rotate labels for better readability
  )

# Load necessary libraries
library(dplyr)
library(ggplot2)
library(kableExtra)
library(lubridate)

# Calculate sessions by day
sessions_day <- data %>%
  mutate(date = as.Date(tfc_stamped)) %>%  # Extract the date if not already done
  group_by(date) %>%
  summarise(sessions = n_distinct(tfc_session)) %>%
  arrange(date)

# Display the table using kable and kable_styling
sessions_day %>%
  kable(
    caption = "Sessions by Day",
    col.names = c("Date", "Number of Sessions")
  ) %>%
  kable_styling(
    bootstrap_options = c("striped", "hover", "condensed"),
    full_width = FALSE
  )
Sessions by Day
Date Number of Sessions
2024-06-14 3163
2024-06-15 2277
2024-06-16 1738
2024-06-17 2241
2024-06-18 2377
2024-06-19 3301
2024-06-20 5058
2024-06-21 3223
2024-06-22 3357
2024-06-23 4634
2024-06-24 3600
2024-06-25 2870
2024-06-26 4258
2024-06-27 3900
2024-06-28 1912
2024-06-29 1913
2024-06-30 1342
2024-07-01 3320
2024-07-02 2150
2024-07-03 2554
2024-07-04 4093
2024-07-05 3797
2024-07-06 2422
2024-07-07 3124
2024-07-08 2759
2024-07-09 3132
2024-07-10 3272
2024-07-11 3500
2024-07-12 2629
2024-07-13 1501
2024-07-14 4929
2024-07-15 2142
2024-07-16 3233
2024-07-17 5525
2024-07-18 5473
2024-07-19 4711
2024-07-20 3659
2024-07-21 2632
2024-07-22 5567
2024-07-23 5720
2024-07-24 5064
2024-07-25 4292
2024-07-26 3264
2024-07-27 3271
2024-07-28 3898
2024-07-29 4354
2024-07-30 4557
2024-07-31 5641
2024-08-01 4870
2024-08-02 4804
2024-08-03 3963
2024-08-04 3433
2024-08-05 3377
2024-08-06 5419
2024-08-07 4614
2024-08-08 4314
2024-08-09 2771
2024-08-10 2295
2024-08-11 3009
2024-08-12 3914
2024-08-13 3683
2024-08-14 3682
2024-08-15 4291
2024-08-16 3932
2024-08-17 3198
2024-08-18 3957
2024-08-19 4505
2024-08-20 6209
2024-08-21 5250
2024-08-22 4407
2024-08-23 4911
2024-08-24 3600
2024-08-25 3206
2024-08-26 5535
2024-08-27 4804
2024-08-28 4449
2024-08-29 575

# Plot sessions by day as a line chart
ggplot(sessions_day, aes(x = date, y = sessions)) +
  geom_line(color = "#aed6f1", linewidth = 1.2) +  # 'linewidth' replaces 'size' for lines in ggplot2 >= 3.4.0
  geom_point(color = "#3498db", size = 2) +   # Add points for clarity
  labs(
    title = "Sessions by Day (Line Chart)",
    x = "Date",
    y = "Number of Sessions"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(hjust = 0.5, size = 14, face = "bold"),
    axis.title.x = element_text(size = 12),
    axis.title.y = element_text(size = 12),
    axis.text.x = element_text(angle = 45, hjust = 1)  # Rotate date labels for readability
  )

Key Insights and Actionable Outcome:

  • Daily Activity Trends: Across full days, sessions range from 1,342 (Sunday 30 June) to 6,209 (Tuesday 20 August). The quietest days are mostly weekends, for example 1,501 on Saturday 13 July.

    • Actionable Insight: Plan promotions, content releases, and marketing campaigns for the weekday peaks, and check what drove the largest spikes (5,720 on 23 July, 6,209 on 20 August) so they can be repeated.
  • Identifying Activity Lulls: The line plot also shows low-engagement stretches, such as 28 to 30 June (1,912, 1,913, and 1,342 sessions). The final day, 29 August (575), appears to be a partial day and should not be read as a lull.

    • Actionable Insight: For days with fewer sessions, plan to re-engage users through targeted campaigns, new content, or personalized notifications.
  • Long-Term Engagement Trends: Sessions rise over the period. Only one day before 14 July reaches 5,000 sessions (5,058 on 20 June), whereas ten days from 17 July onward do.

    • Actionable Insight: Keep monitoring for a reversal. A steady decline in sessions would point to issues such as declining user interest or content relevance, and should prompt a deeper analysis of contributing factors such as competition or user experience.

In Summary:

The sessions by day analysis provides a clear view of how user engagement fluctuates over time, helping to identify daily trends and patterns. By understanding these trends, you can optimize your marketing strategies, schedule content releases, and ensure that the platform has sufficient resources during peak user activity.

Strategic Recommendations Based on Insights:

  1. Target High-Traffic Days: If certain days show higher user activity, use these peaks for marketing campaigns, special events, or new content launches to maximize user interaction.

  2. Address Low-Traffic Days: If certain days show lower engagement, consider running promotions or special offers on those days to increase sessions.

  3. Track Long-Term Engagement: Continuously track daily sessions to identify long-term growth patterns or declines. A steady increase in sessions indicates positive platform growth, while a decline may indicate the need for user retention strategies.

  4. Optimize Resources for Busy Days: Identify days with high user activity and ensure that platform infrastructure (e.g., server capacity, load balancing) is optimized for increased traffic.

Conclusion Summary:

The sessions by day analysis is a valuable tool for understanding user behavior patterns over time. By visualizing daily session counts, the platform can make data-driven decisions for optimizing user engagement, improving operational efficiency, and aligning marketing efforts with peak activity periods. This ensures that the Planning Alerts platform is responsive to user needs and continues to enhance its overall user experience.
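
Daily session counts swing widely, so a rolling average makes the longer-term direction easier to see. The sketch below is one way to compute a trailing 7-day mean in base R; it assumes the first 14 days of the "Sessions by Day" table as sample input, and the column `rolling_7d` is an illustrative name. `stats::filter()` is called with its namespace because dplyr, which the report loads, masks `filter()`.

```r
# First 14 days of the "Sessions by Day" table (14 to 27 June 2024)
sessions_day <- data.frame(
  date = seq(as.Date("2024-06-14"), by = "day", length.out = 14),
  sessions = c(3163, 2277, 1738, 2241, 2377, 3301, 5058,
               3223, 3357, 4634, 3600, 2870, 4258, 3900)
)

# Trailing 7-day mean: each value averages the current day and the six before it.
# The first six days have no complete window and are left as NA.
sessions_day$rolling_7d <- as.numeric(
  stats::filter(sessions_day$sessions, rep(1 / 7, 7), sides = 1)
)

print(sessions_day)
```

A 7-day window is a natural choice here because it averages over one full weekly cycle, so weekday and weekend differences do not show up as noise in the trend line.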

# Load necessary libraries
library(dplyr)
library(ggplot2)
library(stringr)
library(kableExtra)

# Identify external referrals using specific keywords
referral_keywords <- c("google", "facebook", "bing", "instagram", "linkedin")
external_referrals <- data %>%
  filter(str_detect(tolower(tfc_referrer), paste(referral_keywords, collapse = "|")))

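# Count by external referral source.
# Note: n() counts log rows (hits); use n_distinct(tfc_session) to count distinct sessions instead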
referral_counts <- external_referrals %>%
  mutate(source = case_when(
    str_detect(tolower(tfc_referrer), "google") ~ "Google",
    str_detect(tolower(tfc_referrer), "facebook") ~ "Facebook",
    str_detect(tolower(tfc_referrer), "bing") ~ "Bing",
    str_detect(tolower(tfc_referrer), "instagram") ~ "Instagram",
    str_detect(tolower(tfc_referrer), "linkedin") ~ "LinkedIn",
    TRUE ~ "Other"
  )) %>%
  group_by(source) %>%
  summarise(sessions = n()) %>%
  arrange(desc(sessions))

# Display the table using kable and kable_styling
referral_counts %>%
  kable(
    caption = "External Referral Sources",
    col.names = c("Referral Source", "Number of Sessions")
  ) %>%
  kable_styling(
    bootstrap_options = c("striped", "hover", "condensed"),
    full_width = FALSE
  )
External Referral Sources

Referral Source     Number of Sessions
Google                           72422
Bing                              1146
Facebook                            29
Instagram                            1
LinkedIn                             1

# Plot external referral sources
ggplot(referral_counts, aes(x = reorder(source, -sessions), y = sessions)) +
  geom_col(width = 0.4, fill = "#aed6f1") +  # geom_col() is the idiomatic geom for pre-aggregated counts
  labs(
    title = "External Referral Sources (Bar Chart)",
    x = "Referral Source",
    y = "Number of Sessions"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(hjust = 0.5, size = 14, face = "bold"),
    axis.title.x = element_text(size = 12),
    axis.title.y = element_text(size = 12),
    axis.text.x = element_text(angle = 45, hjust = 1)  # Rotate x-axis labels for readability
  )

Key Insights and Actionable Outcome:

  • Referral Source Performance: Google dominates the matched referrals with 72,422 of 73,599 (98.4%). Bing contributes 1,146 (1.6%), and social platforms are negligible: 29 from Facebook and one each from Instagram and LinkedIn.

    • Actionable Insight: Search is the main discovery channel, so SEO is the lever with the most impact. The near absence of social referrals suggests that social media campaigns are either not running or not converting into visits.
  • Marketing Strategy Optimization: Understanding which external sources bring the most sessions allows for a more targeted marketing strategy.

    • Actionable Insight: Prioritize search visibility (SEO and, where used, Google Ads), since that is where almost all referred traffic already comes from.
  • Diversifying Referral Traffic: With a single source supplying about 98% of matched referrals, the platform relies heavily on one channel.

    • Actionable Insight: Experiment with Facebook, LinkedIn, Instagram, or affiliate marketing to build secondary channels and increase traffic diversity.
  • Monitor “Other” Sources: The code above keeps only referrers that match one of the five keywords before classifying them, so the “Other” category is always empty and all remaining referrers are left out of the table.

    • Actionable Insight: Classify the full set of referrers, without the keyword filter, to see how much traffic arrives from unlisted sources and whether any can be turned into strategic marketing channels.

In Summary:

The external referral sources analysis provides a clear picture of where users are coming from. By identifying the top referral sources, the platform can optimize marketing campaigns, increase visibility in high-traffic sources, and explore new channels to drive more traffic.

Strategic Recommendations Based on Insights:

  1. Optimize High-Performing Channels: Focus on the most successful referral source, Google, by continuing to invest in SEO, since it supplies about 98% of matched referrals.

  2. Diversify Referral Traffic: Because one source dominates, explore additional channels such as Facebook, Instagram, LinkedIn, or affiliate networks to broaden the reach.

  3. Target New Sources: Extend the classification so that referrers outside the five keywords fall into an “Other” category, and investigate them to uncover potential new traffic channels.

  4. Measure the Impact of Campaigns: By analyzing the referral data regularly, the team can determine the effectiveness of marketing campaigns and adjust them accordingly.

Conclusion Summary:

The external referral sources analysis allows the platform to understand where its traffic is coming from and how effectively different external channels are driving users. By leveraging this data, the platform can optimize marketing efforts, diversify its traffic sources, and ensure that resources are allocated to the channels with the highest potential for growth.
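
The "Other" category only becomes visible if every referrer is classified, not just those that match a keyword. The following is a minimal base R sketch of that idea. The helper `classify_referrer()`, the "Direct / none" label for missing or empty referrers, and the sample URLs are all hypothetical and are not taken from the data set; the keyword list and its priority order follow the original `case_when()`.

```r
# Hypothetical helper: label every referrer, keeping unmatched ones as "Other"
classify_referrer <- function(referrer) {
  ref <- tolower(referrer)
  source <- rep("Other", length(ref))

  # Assumption: a missing or empty referrer is treated as a direct visit
  source[is.na(ref) | ref == ""] <- "Direct / none"

  keywords <- c(Google = "google", Facebook = "facebook", Bing = "bing",
                Instagram = "instagram", LinkedIn = "linkedin")

  # Assign in reverse order so earlier keywords win when several match
  for (label in rev(names(keywords))) {
    source[grepl(keywords[[label]], ref, fixed = TRUE)] <- label
  }
  source
}

# Made-up referrer values, for illustration only
sample_referrers <- c("https://www.google.com/",
                      "https://www.bing.com/search?q=planning",
                      "https://example.org/forum",
                      NA,
                      "",
                      "https://m.facebook.com/")

referral_source <- classify_referrer(sample_referrers)
print(table(referral_source))
```

In the real data, referrers from the site's own pages would also land in "Other" under this scheme, so an extra rule for internal links may be needed before the category is interpreted as external traffic.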

# Load necessary libraries
library(dplyr)
library(ggplot2)
library(kableExtra)

# Count sessions per user
sessions_per_user <- data %>%
  group_by(tfc_cookie) %>%
  summarise(num_sessions = n_distinct(tfc_session))

# Classify users as one-time or repeat visitors and count them
visitor_counts <- sessions_per_user %>%
  mutate(visitor_type = ifelse(num_sessions == 1, "One-Time Visitors", "Repeat Visitors")) %>%
  group_by(visitor_type) %>%
  summarise(count = n()) %>%
  arrange(desc(count))

# Display the table using kable and kable_styling
visitor_counts %>%
  kable(
    caption = "One-Time vs Repeat Visitors",
    col.names = c("Visitor Type", "Number of Users")
  ) %>%
  kable_styling(
    bootstrap_options = c("striped", "hover", "condensed"),
    full_width = FALSE
  )
One-Time vs Repeat Visitors

Visitor Type        Number of Users
One-Time Visitors            175408
Repeat Visitors               13620

# Plot one-time vs repeat visitors (geom_col draws bars from pre-aggregated counts)
ggplot(visitor_counts, aes(x = visitor_type, y = count)) +
  geom_col(width = 0.4, fill = "#aed6f1") +
  labs(
    title = "One-Time vs Repeat Visitors (Bar Chart)",
    x = "Visitor Type",
    y = "Number of Users"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(hjust = 0.5, size = 14, face = "bold"),
    axis.title.x = element_text(size = 12),
    axis.title.y = element_text(size = 12)
  )

Key Insights and Actionable Outcome:

  • User Retention: Of 189,028 users (distinct cookies), 175,408 (92.8%) had a single session and 13,620 (7.2%) returned at least once. Because users are identified by cookie, anyone who clears cookies or switches device is counted again as a new one-time visitor, so the true repeat rate is probably somewhat higher.

    • Actionable Insight: With repeat visitors at about 7% of users, the platform should focus on encouraging return visits, for example through personalized content, loyalty programs, or incentives.
  • Marketing and Engagement Strategies: The large one-time share shows that few new users are currently being converted into repeat users.

    • Actionable Insight: Run targeted campaigns to bring one-time visitors back, such as follow-up emails, special offers, or content updates tailored to their interests.
  • Platform Health: A higher share of repeat visitors would indicate stronger retention, which makes the repeat rate a useful health metric to track over time.

    • Actionable Insight: For the existing repeat visitors, new features, rewards, or exclusive content can help keep them coming back.

In Summary:

The one-time vs repeat visitors analysis provides insights into user retention. By visualizing the distribution between one-time and repeat users, the platform can adjust its engagement strategies, improve user retention, and create more targeted campaigns to increase return visits.

Strategic Recommendations Based on Insights:

  1. Enhance Retention Programs: If one-time visitors are the majority, introduce personalized experiences, loyalty programs, or special offers to convert them into repeat visitors.

  2. Target One-Time Visitors: Create follow-up campaigns for one-time visitors, such as email marketing or personalized ads, to encourage them to return and explore more of the platform.

  3. Maximize Engagement for Repeat Visitors: For repeat users, introduce exclusive content, rewards, or new features to keep them engaged and encourage continued visits.

  4. Monitor Retention Metrics: Regularly track the ratio of one-time to repeat visitors to monitor the effectiveness of user retention strategies.

Conclusion Summary:

The one-time vs repeat visitors analysis highlights the level of user engagement and retention on the platform. By understanding the balance between new and returning users, the platform can implement strategies to increase user loyalty, improve conversion rates, and ensure a sustained growth trajectory for long-term success.


Final Conclusion and Recommendations

The analysis of PlanningAlerts.ie has provided valuable insights into user behavior and platform performance, offering opportunities to improve user engagement, optimize resources, and enhance overall platform effectiveness.

Key Insights:

1. Peak Usage Times: User activity peaks at 16:00, with a second peak at 10:00, and is lowest around 02:00. The busiest days in the period were weekdays in late July and August, highlighting when to schedule campaigns for maximum engagement.

2. Referral Sources: Google accounts for about 98% of the referrals matched by the analysis (72,422 of 73,599), with Bing a distant second (1,146). Social media referrals are negligible (29 from Facebook, one each from Instagram and LinkedIn), underscoring the importance of maintaining strong SEO practices.

3. Device Usage: Desktop accounts for the majority of visits (63.4%). Mobile browsers and apps together make up about 36%, so mobile-friendly design still matters for a seamless user experience.

4. User Retention: About 93% of users (175,408 of 189,028) are one-time visitors, revealing an opportunity to implement retention strategies, such as personalized experiences and loyalty programs, to encourage repeat visits.

5. Session Trends: Analysis of hourly, daily, and monthly session patterns reveals actionable insights to optimize resource allocation and improve marketing efforts.

Recommendations:

Leverage Peak Activity:

1. Target users with tailored campaigns and promotions during peak hours and days.

2. Schedule content releases and announcements during high-traffic periods to maximize visibility and engagement.

Maintain Desktop Quality and Enhance Mobile Optimization:

1. Keep the desktop experience, which serves most visits, fast and reliable.

2. Continuously test and improve mobile performance for a smooth user experience.

Focus on User Retention:

1. Implement follow-up offers, personalized emails, and exclusive rewards to convert one-time visitors into repeat users.

2. Develop loyalty programs to encourage long-term user engagement.

Diversify Traffic Sources:

1. Expand beyond Google by exploring additional referral channels such as Facebook, LinkedIn, Instagram, and affiliate marketing partnerships.

2. Analyze lesser-known referral sources to identify and capitalize on new traffic opportunities.

Optimize Resource Allocation:

1. Use low-traffic periods to perform platform maintenance and updates to minimize user disruption.

2. Allocate resources effectively to support peak traffic periods, ensuring smooth platform performance.