1. Introduction

This project concerns quintadasquintas.com, the website of a rural tourism property in Portugal that offers accommodation, spa, and nature experiences. The A/B test evaluates whether resizing and repositioning the booking button on the homepage (labelled “VER ALOJAMENTOS” in Portuguese and “ACCOMMODATIONS” in English) leads to a meaningful increase in click-through rate (CTR) and, ultimately, to more booking inquiries.

Motivation: Based on internal team experience with the website, the current button is considered undersized and poorly positioned. A larger, more centrally placed button is hypothesised to reduce friction in the user journey and increase the likelihood of a visitor initiating a booking.

Expected Impact: A more prominent call-to-action (CTA) button should raise the button’s CTR and, as a result, increase the number of booking inquiries.

Hypotheses

  • Null Hypothesis (H_0): The booking button click-through rate is the same in the control and variant groups, where \(\mu_A\) and \(\mu_B\) denote the CTR of the control and the variant.
    \[H_0: \mu_A = \mu_B\]

  • Alternative Hypothesis (H_A): The variant (larger, repositioned button) produces a higher CTR than the control.
    \[H_A: \mu_B > \mu_A\]

  • Significance level (α): 0.05

  • Statistical power (1 - β): 0.80


2. Data Loading

library(dplyr)
library(ggplot2)
library(readr)

# Load traffic and button click data
traffic <- read_csv2("data/baseline_Jan-21-–Feb-20/Traffic_2026-01-21-2026-02-20.csv")
clicks  <- read_csv2("data/baseline_Jan-21-–Feb-20/button_clicks_table_2026-01-21-2026-02-20.csv")

3. Data Overview

# Inspect traffic data
head(traffic)
## # A tibble: 6 × 3
##   `Caminho da página`        `Sessões do site` `Visitantes únicos`
##   <chr>                                  <dbl>               <dbl>
## 1 /                                        433                 357
## 2 /alojamentos                             180                 153
## 3 /piscinas                                117                 103
## 4 /serviços                                 99                  82
## 5 /reservas/estúdio-oliveira                65                  52
## 6 /galeria                                  60                  51
# Inspect clicks data
head(clicks)
## # A tibble: 6 × 8
##   `Texto do botão`    `Tipo de botão` `Caminho da página` `Item vinculado`
##   <chr>               <chr>           <chr>               <chr>           
## 1 Ver fotos.          Link de texto   /serviços           Endereço web    
## 2 VER ALOJAMENTOS     Botão           /                   Endereço web    
## 3 Ver opções e preços Link de texto   /serviços           Endereço web    
## 4 SPA INTERIOR        Link de texto   /                   Endereço web    
## 5 VER MAIS FOTOS      Botão           /                   Endereço web    
## 6 SPA INTERIOR        Link de texto   /galeria            Endereço web    
## # ℹ 4 more variables: `Detalhes do link` <chr>, `Visitantes únicos` <dbl>,
## #   `Cliques únicos` <dbl>, CTR <chr>

3.1 Data Dictionary - Traffic

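| Column            | Type    | Description                            |
|:------------------|:--------|:---------------------------------------|
| Caminho da página | Text    | URL path of the page visited           |
| Sessões do site   | Numeric | Total number of sessions on that page  |
| Visitantes únicos | Numeric | Number of unique visitors to that page |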

3.2 Data Dictionary - Clicks

| Column            | Type    | Description                                |
|:------------------|:--------|:-------------------------------------------|
| Texto do botão    | Text    | Label/name of the button clicked           |
| Tipo de botão     | Text    | Type of element (button, text link, image) |
| Caminho da página | Text    | Page where the button is located           |
| Item vinculado    | Text    | Destination the button links to            |
| Detalhes do link  | Text    | Full URL or anchor of the destination      |
| Visitantes únicos | Numeric | Unique visitors who saw the button         |
| Cliques únicos    | Numeric | Unique clicks on the button                |
| CTR               | Text    | Click-through rate (clicks ÷ visitors)     |
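
Because the CTR column is read as text, it has to be converted before it can be used in calculations. The sketch below shows one way to do this in base R. It assumes the values look like "4,2%" (decimal comma plus a percent sign); the raw values are not shown in this report, so both the format and the helper name parse_ctr are assumptions.

```r
# Hypothetical helper: convert CTR strings such as "4,2%" into proportions.
# The input format (decimal comma, trailing percent sign) is an assumption.
parse_ctr <- function(x) {
  cleaned <- gsub("%", "", x, fixed = TRUE)        # drop the percent sign
  cleaned <- gsub(",", ".", cleaned, fixed = TRUE) # decimal comma -> decimal point
  as.numeric(trimws(cleaned)) / 100                # percentage -> proportion
}

# Made-up example values, not taken from the export
ctr_example <- c("4,2%", "12,5%", "0%")
parse_ctr(ctr_example)
```

In this report the CTR is recomputed from the click and visitor counts instead, which avoids relying on the text column at all.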

4. Exploratory Data Analysis

4.1 Traffic Analysis

To establish the context for our experiment, we will begin by analyzing the website’s traffic data from January 21 to February 20, 2026. By understanding how visitors navigate the site, we can identify where the booking funnel begins and where users are lost before reaching the accommodations page.

summary(traffic)
##  Caminho da página  Sessões do site  Visitantes únicos
##  Length:38          Min.   :  1.00   Min.   :  1.00   
##  Class :character   1st Qu.:  5.25   1st Qu.:  5.00   
##  Mode  :character   Median : 11.00   Median : 10.00   
##                     Mean   : 38.00   Mean   : 31.89   
##                     3rd Qu.: 42.50   3rd Qu.: 33.75   
##                     Max.   :433.00   Max.   :357.00
traffic %>%
  arrange(desc(`Visitantes únicos`)) %>%
  head(10) %>%
  ggplot(aes(x = reorder(`Caminho da página`, `Visitantes únicos`),
             y = `Visitantes únicos`)) +
  geom_col(fill = "lightblue") +
  coord_flip() +
  labs(
    title = "Top 10 Pages by Unique Visitors",
    x     = "Page",
    y     = "Unique Visitors"
  ) +
  theme_minimal(base_size = 12)

Note: The traffic data confirms that the homepage is the primary entry point, with 357 unique visitors. The accommodations page received only 153 unique visitors, about 43% of the homepage figure. Because these are page-level totals rather than tracked user paths, the ratio is an approximation, but it points to a substantial drop-off at the first conversion step and is consistent with the view that the current placement and size of the booking button do not direct users effectively.

Visitors who do reach the accommodations page show clear intent: individual room pages, such as /reservas/estúdio-oliveira and /reservas/estúdio-eira, appear among the top ten most visited pages. The issue therefore appears to be not a lack of interest in the product but a failure to guide engaged users from the homepage into the booking funnel, which is what this A/B test aims to address.
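
The 43% figure can be reproduced from the rows printed by head(traffic). The sketch below copies those six rows by hand into a small data frame (the names top_pages and share_of_homepage are illustrative) and expresses each page's unique visitors as a share of the homepage's. It compares page-level totals only, so it should be read as a rough indicator rather than a measured funnel.

```r
# Top rows of the traffic export, copied from the head(traffic) output
top_pages <- data.frame(
  page     = c("/", "/alojamentos", "/piscinas", "/serviços",
               "/reservas/estúdio-oliveira", "/galeria"),
  visitors = c(357, 153, 103, 82, 52, 51)
)

# Unique visitors on the homepage, used as the reference level
homepage_visitors <- top_pages$visitors[top_pages$page == "/"]

# Each page's unique visitors as a percentage of homepage visitors
top_pages$share_of_homepage <- round(100 * top_pages$visitors / homepage_visitors, 1)
top_pages
```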

4.2 Clicks Analysis

Next, we analyze the click distribution across all buttons on the website during the same observation period. This allows us to identify the most engaging buttons and to isolate the performance of the VER ALOJAMENTOS button, whose CTR is our primary KPI and the focus of this A/B test.

summary(clicks)
##  Texto do botão     Tipo de botão      Caminho da página  Item vinculado    
##  Length:96          Length:96          Length:96          Length:96         
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##  Detalhes do link   Visitantes únicos Cliques únicos       CTR           
##  Length:96          Min.   :  5.0     Min.   : 1.000   Length:96         
##  Class :character   1st Qu.: 27.0     1st Qu.: 1.000   Class :character  
##  Mode  :character   Median : 51.5     Median : 1.000   Mode  :character  
##                     Mean   :105.6     Mean   : 2.729                     
##                     3rd Qu.:153.0     3rd Qu.: 3.000                     
##                     Max.   :357.0     Max.   :19.000
clicks %>%
  arrange(desc(`Cliques únicos`)) %>%
  head(10) %>%
  ggplot(aes(x = reorder(`Texto do botão`, `Cliques únicos`),
             y = `Cliques únicos`)) +
  geom_col(fill = "lightblue") +
  coord_flip() +
  labs(
    title = "Top 10 Buttons by Unique Clicks",
    x     = "Button",
    y     = "Unique Clicks"
  ) 

Note: The EDA shows that VER ALOJAMENTOS receives the highest absolute click volume, yet its CTR is only 4.2% of the 357 homepage visitors who were exposed to it. The working explanation, and the motivation for this A/B test, is that the button’s current size and placement make it hard to notice, so the main conversion point is effectively hidden from visitors who are otherwise engaged with the site. We hypothesize that a larger, more centrally positioned button would be more visible and would bring its CTR more in line with its business importance.

4.3 Baseline CTR Calculation

Click-through rate (CTR) is the percentage of unique visitors who click the booking button and is the primary key performance indicator (KPI) for this experiment. Although CTR is already available for each button in our dataset, the website operates in both Portuguese and English with separate button entries for each language. Therefore, we combine clicks and visitors across both language versions of the homepage to calculate a unified baseline CTR for the power calculation.

# All accommodation/booking button labels across languages
booking_buttons <- c("VER ALOJAMENTOS", "ACCOMMODATIONS")

booking_data <- clicks %>%
    filter(`Texto do botão` %in% booking_buttons,
           `Caminho da página` %in% c("/", "/en"))

total_clicks   <- sum(booking_data$`Cliques únicos`)
total_visitors <- sum(booking_data$`Visitantes únicos`)

p_control <- total_clicks / total_visitors
cat("Baseline CTR:", round(p_control * 100, 2), "%")
## Baseline CTR: 4.79 %

Note: The baseline CTR of 4.79% represents the pre-experiment conversion rate of the booking button under the current design (Control). This value is carried forward as p_control into the power calculation, where it is used to derive the variance of the data and determine the minimum sample size required to detect a meaningful improvement.
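
The baseline is computed from pooled counts rather than by averaging the two per-language CTRs, because pooling weights each language version by its traffic. The sketch below illustrates the difference with made-up counts; the data frame example and its values are hypothetical and are not taken from the export.

```r
# Hypothetical counts for two language versions of the same button
# (illustrative only; not the values in the export)
example <- data.frame(
  page     = c("/", "/en"),
  clicks   = c(15, 4),
  visitors = c(300, 20)
)

# Pooled CTR: total clicks over total visitors (the approach used above)
pooled_ctr <- sum(example$clicks) / sum(example$visitors)

# Unweighted mean of the per-page CTRs, shown only for contrast
mean_of_ctrs <- mean(example$clicks / example$visitors)

round(c(pooled = pooled_ctr, mean_of_rates = mean_of_ctrs) * 100, 2)
```

With a low-traffic page that happens to have a high CTR, the unweighted mean is pulled upward, while the pooled figure stays close to the rate experienced by most visitors.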


5. Power Calculation

Before running the test, we calculate the minimum sample size required per group to detect a meaningful lift with sufficient power.

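Assumptions: the baseline CTR is p_control (4.79%), the minimum detectable effect (MDE) is +2 percentage points, α = 0.05 with a two-sided critical value, and power = 0.80. The variance of the binary click outcome is taken from the baseline rate alone.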

# Load necessary library for normal distribution functions
library(stats)

# Parameters
mu_0 <- p_control         # baseline CTR from real data
mu_1 <- p_control + 0.02  # MDE: +2 percentage points
sigma <- sqrt(mu_0 * (1 - mu_0))  # standard deviation derived from real data
effect_size <- (mu_1 - mu_0) / sigma

# Desired power and significance level
power <- 0.80
alpha <- 0.05

# Z-values for alpha and power
z_alpha_2 <- qnorm(1 - (alpha / 2))  # two-sided test
z_beta    <- qnorm(power)

# Calculate sample size per group: n = 2 * (z_alpha/2 + z_beta)^2 / d^2,
# where d = effect_size is already standardized by sigma
n <- (2 * (z_alpha_2 + z_beta)^2) / effect_size^2

# Round up since you can't have a fraction of a sample
n <- ceiling(n)

# Print required sample size per group and in total
cat("Sample size per group:", n)
cat("\nTotal sample size:", n * 2)

Interpretation: With a baseline CTR of 4.79%, detecting a +2pp lift at 80% power and a two-sided 5% significance level requires roughly 1,790 users per group (about 3,580 in total). The site receives ~357 homepage visitors per month, so a sample of this size cannot be collected within the observation window; with the traffic available, the test is powered to detect only lifts considerably larger than +2pp.
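
The sample-size relationship can also be read in the other direction: for a given number of users per group, what is the smallest lift the test could be expected to detect? The sketch below inverts the normal-approximation formula under the same assumptions as above (baseline variance for both groups, two-sided critical value). The function name mde_for_n and the sample sizes in n_grid are hypothetical and chosen only for illustration.

```r
# Smallest absolute lift detectable with n users per group, using the
# normal approximation and the baseline variance p0 * (1 - p0) for both groups
mde_for_n <- function(n, p0, alpha = 0.05, power = 0.80) {
  z_alpha_2 <- qnorm(1 - alpha / 2)  # two-sided critical value
  z_beta    <- qnorm(power)          # quantile matching the desired power
  (z_alpha_2 + z_beta) * sqrt(2 * p0 * (1 - p0) / n)
}

p0     <- 0.0479                   # baseline CTR reported above (rounded)
n_grid <- c(100, 250, 500, 1000)   # hypothetical per-group sample sizes

data.frame(
  n_per_group = n_grid,
  mde_pp      = round(100 * mde_for_n(n_grid, p0), 1)
)
```

A table like this makes it easier to judge, before the test starts, whether the traffic expected in the observation window is enough for the lift one hopes to detect.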


6. A/B Test

Important Note: Due to technical constraints of Wix, traffic could not be split between the two versions at the same time, so the experiment was run sequentially: Version A was shown during week one and Version B during week two. This design introduces potential temporal effects, that is, differences between the two weeks that are unrelated to the button. Because traffic levels and audience characteristics were similar in both weeks, we treat the comparison as an approximation of an A/B test and apply standard statistical methods.


6.1 Data Loading - New Data

# Version A
A_traffic <- read_csv2('data/Data_VA_Wk1/VA_traffic_Wk1.csv')
A_clicks <- read_csv2('data/Data_VA_Wk1/VA_clicks_Wk1.csv')

# Version B
B_traffic <- read_csv2('data/Data_VB_Wk2/VB_traffic_Wk2.csv')
B_clicks <- read_csv2('data/Data_VB_Wk2/VB_clicks_Wk2.csv')
head(A_clicks)
## # A tibble: 6 × 8
##   `Texto do botão` `Tipo de botão` `Caminho da página` `Item vinculado`
##   <chr>            <chr>           <chr>               <chr>           
## 1 VER ALOJAMENTOS  Botão           /                   Endereço web    
## 2 VER MAIS FOTOS   Botão           /                   Endereço web    
## 3 VER MAIS         Botão           /alojamentos        Endereço web    
## 4 Ver fotos.       Link de texto   /serviços           Endereço web    
## 5 VER MAIS         Botão           /alojamentos        Endereço web    
## 6 VER ALOJAMENTOS  Botão           /piscinas           Endereço web    
## # ℹ 4 more variables: `Detalhes do link` <chr>, `Visitantes únicos` <dbl>,
## #   `Cliques únicos` <dbl>, CTR <chr>
head(B_clicks)
## # A tibble: 6 × 8
##   `Texto do botão`    `Tipo de botão` `Caminho da página` `Item vinculado`
##   <chr>               <chr>           <chr>               <chr>           
## 1 VER DISPONIBILIDADE Botão           /                   Endereço web    
## 2 VER MAIS            Botão           /alojamentos        Endereço web    
## 3 ESTÚDIOS            Botão           /                   Endereço web    
## 4 VER MAIS            Botão           /alojamentos        Endereço web    
## 5 VER MAIS            Botão           /alojamentos        Endereço web    
## 6 SEE AVAILABILITY    Botão           /en                 Endereço web    
## # ℹ 4 more variables: `Detalhes do link` <chr>, `Visitantes únicos` <dbl>,
## #   `Cliques únicos` <dbl>, CTR <chr>

6.2 Adjusting the Data

To prepare the data for statistical analysis, we first filter the click records for the booking button in each version, using the Portuguese and English labels of the control (“VER ALOJAMENTOS” / “ACCOMMODATIONS”) and of the variant (“VER DISPONIBILIDADE” / “SEE AVAILABILITY”), and restrict them to the homepage paths (“/” and “/en”). We then take the total unique homepage visitors from the traffic dataset and the total unique clicks from the clicks dataset for each version, and calculate the observed click-through rate (CTR) for each group.

From these totals we construct a user-level data frame, df_AB, in which each row represents one unique visitor to the homepage. Because the exports contain only aggregate counts, the rows are reconstructed from the totals. The data frame contains two columns:

  • group: Indicates whether the visitor was exposed to the control (version A, original button) or the variant (version B, redesigned button).

  • clicked: A binary variable equal to 1 if the visitor clicked the booking button and 0 if they did not.

# Version A (Control): original button
A_booking <- A_clicks %>%
  filter(`Texto do botão` %in% c("VER ALOJAMENTOS", "ACCOMMODATIONS"),
         `Caminho da página` %in% c("/", "/en"))

# Version B (Variant): redesigned button
B_booking <- B_clicks %>%
  filter(`Texto do botão` %in% c("VER DISPONIBILIDADE", "SEE AVAILABILITY"),
         `Caminho da página` %in% c("/", "/en"))

# Get homepage unique visitors from each version 
A_visitors <- A_traffic %>%
  filter(`Caminho da página` %in% c("/", "/en")) %>%
  summarise(total = sum(`Visitantes únicos`)) %>%
  pull(total)

B_visitors <- B_traffic %>%
  filter(`Caminho da página` %in% c("/", "/en")) %>%
  summarise(total = sum(`Visitantes únicos`)) %>%
  pull(total)

# Get total clicks for each version 
A_total_clicks <- sum(A_booking$`Cliques únicos`)
B_total_clicks <- sum(B_booking$`Cliques únicos`)

# Calculate CTR for each version
A_ctr <- A_total_clicks / A_visitors
B_ctr <- B_total_clicks / B_visitors

cat("Control CTR (Version A):", round(A_ctr * 100, 2), "%\n")
## Control CTR (Version A): 15 %
cat("Variant CTR (Version B):", round(B_ctr * 100, 2), "%\n")
## Variant CTR (Version B): 34.13 %
# Build user-level dataframe for t-test and regression
df_AB <- data.frame(
  group = c(rep("Control", A_visitors), rep("Variant", B_visitors)),
  clicked = c(
    rep(1, A_total_clicks), rep(0, A_visitors - A_total_clicks),
    rep(1, B_total_clicks), rep(0, B_visitors - B_total_clicks)
  )
)

# Check structure
head(df_AB)
##     group clicked
## 1 Control       1
## 2 Control       1
## 3 Control       1
## 4 Control       1
## 5 Control       1
## 6 Control       1
table(df_AB$group, df_AB$clicked)
##          
##             0   1
##   Control 102  18
##   Variant  83  43
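
The counts in this table are enough to summarise the size of the effect before any formal test. The sketch below computes the absolute and relative lift and a two-sided 95% Wald interval for the difference in proportions. The variable names are illustrative, and the Wald interval is a simple normal approximation chosen here for brevity, not a method used elsewhere in this report.

```r
# Counts taken from the contingency table above
n_clicks   <- c(Control = 18,  Variant = 43)
n_visitors <- c(Control = 120, Variant = 126)

ctr <- n_clicks / n_visitors

abs_lift <- unname(ctr["Variant"] - ctr["Control"])      # difference in proportions
rel_lift <- unname(ctr["Variant"] / ctr["Control"] - 1)  # relative change vs control

# Two-sided 95% Wald interval for the difference (normal approximation)
se_diff <- sqrt(sum(ctr * (1 - ctr) / n_visitors))
ci_95   <- abs_lift + c(-1, 1) * qnorm(0.975) * se_diff

round(c(abs_lift_pp  = 100 * abs_lift,
        rel_lift_pct = 100 * rel_lift,
        ci_low_pp    = 100 * ci_95[1],
        ci_high_pp   = 100 * ci_95[2]), 1)
```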

6.3 Statistical Analysis

# Separate into groups
group_A <- df_AB %>% filter(group == "Control") %>% pull(clicked)
group_B <- df_AB %>% filter(group == "Variant")  %>% pull(clicked)

# Two-sample t-test (one-sided: variant > control)
t_result <- t.test(group_B, group_A, alternative = "greater")
print(t_result)
## 
##  Welch Two Sample t-test
## 
## data:  group_B and group_A
## t = 3.5704, df = 231.86, p-value = 0.0002166
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  0.1027997       Inf
## sample estimates:
## mean of x mean of y 
## 0.3412698 0.1500000
df_AB %>%
  group_by(group) %>%
  summarise(
    Visitors = n(),
    Clicks   = sum(clicked),
    CTR      = round(mean(clicked) * 100, 2)
  ) %>%
  knitr::kable(caption = "Table: Observed CTR by Group")
Table: Observed CTR by Group

|group   | Visitors| Clicks|   CTR|
|:-------|--------:|------:|-----:|
|Control |      120|     18| 15.00|
|Variant |      126|     43| 34.13|

Note: The Welch two-sample t-test indicates that the difference in click-through rate (CTR) between the control and variant groups is statistically significant (t = 3.57, df = 231.86, one-sided p = 0.0002). Because the p-value is well below the \(\alpha = 0.05\) threshold, we reject the null hypothesis \(H_0\) and conclude that the redesigned button produced a significantly higher CTR than the original. CTR rose from 15% in the control group to 34.13% in the variant group, an increase of 19.13 percentage points, which far exceeds the pre-specified minimum detectable effect (MDE) of +2pp. The one-sided 95% confidence bound of 0.103 means that, at the 95% confidence level, the true difference is at least 10.3 percentage points. A difference of this size is unlikely to be due to random chance alone, although the sequential design means that week-to-week effects cannot be ruled out.
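
Because the outcome is binary, the t statistic can be reproduced from the four summary counts alone, and a two-proportion test offers a natural cross-check. The sketch below does both in base R. The variable names are illustrative, and prop.test() is used here as an alternative check, not as the method applied in this report.

```r
# Summary counts reported above
n_A <- 120; x_A <- 18   # control: visitors, clicks
n_B <- 126; x_B <- 43   # variant: visitors, clicks

p_A <- x_A / n_A
p_B <- x_B / n_B

# Sample variance of a 0/1 vector with mean p and length n
# is p * (1 - p) * n / (n - 1)
var_A <- p_A * (1 - p_A) * n_A / (n_A - 1)
var_B <- p_B * (1 - p_B) * n_B / (n_B - 1)

# Welch t statistic; this should agree with the t.test() output above
t_welch <- (p_B - p_A) / sqrt(var_A / n_A + var_B / n_B)

# Two-proportion test on the same counts, one-sided (variant > control)
prop_result <- prop.test(x = c(x_B, x_A), n = c(n_B, n_A),
                         alternative = "greater")

round(t_welch, 4)
prop_result$p.value
```

If the two approaches led to different conclusions at α = 0.05, that would be a signal to look more closely at the sample sizes and at the normal approximation.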


6.4 Regression Analysis

# Encode group as binary: Variant = 1, Control = 0
df_AB$variant <- ifelse(df_AB$group == "Variant", 1, 0)

# Linear regression: clicked ~ variant
model <- lm(clicked ~ variant, data = df_AB)
summary(model)
## 
## Call:
## lm(formula = clicked ~ variant, data = df_AB)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.3413 -0.3413 -0.1500 -0.1500  0.8500 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.15000    0.03860   3.886 0.000131 ***
## variant      0.19127    0.05393   3.546 0.000468 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4228 on 244 degrees of freedom
## Multiple R-squared:  0.04902,    Adjusted R-squared:  0.04512 
## F-statistic: 12.58 on 1 and 244 DF,  p-value: 0.0004682
library(broom)

# Show regression output in table
tidy(model) %>%
  knitr::kable(
    digits  = 4,
    caption = "Table: Regression Analysis: Effect of Button Variant on CTR"
  )
Table: Regression Analysis: Effect of Button Variant on CTR

|term        | estimate| std.error| statistic| p.value|
|:-----------|--------:|---------:|---------:|-------:|
|(Intercept) |   0.1500|    0.0386|    3.8860|   1e-04|
|variant     |   0.1913|    0.0539|    3.5463|   5e-04|

Note: The regression agrees with the t-test. The intercept of 0.15 is the click-through rate (CTR) of the control group (15%), and the coefficient of 0.191 on variant means that visitors shown the redesigned button were 19.1 percentage points more likely to click than those in the control group. The coefficient is statistically significant (two-sided p = 0.0005, marked *** because it falls below the 0.001 threshold), so the difference is unlikely to be due to random chance. The R-squared of 0.049 shows that group membership explains about 4.9% of the variation in click behavior. A value this small is to be expected in a real-world behavioral setting, where the outcome is binary and many factors influence whether a user clicks. The key finding remains that the variant has a large, positive, and statistically significant effect on CTR.
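
A linear model on a 0/1 outcome is a linear probability model. A logistic regression is a common alternative for binary outcomes, and with a single binary predictor it reproduces the same group CTRs while expressing the effect as an odds ratio. The sketch below rebuilds the user-level data from the reported counts so that it runs on its own. The names df_check and logit_model are illustrative, and this model is a supplementary check rather than part of the original analysis.

```r
# Rebuild the user-level data from the reported counts
df_check <- data.frame(
  variant = c(rep(0, 120), rep(1, 126)),
  clicked = c(rep(1, 18), rep(0, 102),   # control: 18 clicks, 102 non-clicks
              rep(1, 43), rep(0, 83))    # variant: 43 clicks, 83 non-clicks
)

# Logistic regression: log-odds of clicking as a function of group
logit_model <- glm(clicked ~ variant, family = binomial, data = df_check)

# Odds ratio for the variant and the fitted click probability in each group
odds_ratio  <- exp(coef(logit_model)[["variant"]])
fitted_ctrs <- predict(logit_model,
                       newdata = data.frame(variant = c(0, 1)),
                       type = "response")

round(odds_ratio, 2)
round(fitted_ctrs, 4)
```

The fitted probabilities should match the observed CTRs of 15% and 34.13%, since the model has one parameter per group.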