Develop a script in R to create a funnel chart illustrating user drop-off on an e-commerce website using real dataset.
What we will do?
Load required libraries
Load dataset
Perform exploratory data analysis
Create funnel data
Create funnel chart
Analyze results
Step 1: Load Required Libraries
In this step, we initialize the R environment by loading specific software packages.
ggplot2: The primary engine for creating the map and charts.
dplyr: A tool for “data plumbing” used to filter, group, and calculate revenue.
maps: Provides the spatial coordinates (longitude and latitude) for world countries.
readr: Used to import your data.csv file quickly and accurately .
library(ggplot2)
Warning: package 'ggplot2' was built under R version 4.5.3
library(dplyr)
Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
library(maps)
Warning: package 'maps' was built under R version 4.5.3
Step 2: Load Dataset
We bring your external CSV file into R’s memory as an object named data. Using head(data) allows us to verify that columns like Quantity, UnitPrice, and Country loaded correctly.
# Load the actual sales datadata <-read.csv("data.csv")# Verify the column names existnames(data)
Cleaning: We filter out transactions where Quantity or UnitPrice are zero or negative to ensure the revenue calculation is realistic.
Calculation: We create a new Revenue column by multiplying Quantity and UnitPrice .
Aggregation: We use group_by(Country) and summarise() to get one total revenue figure for each country rather than thousands of individual rows.
library(dplyr)# Check if columns exist before filteringif("Quantity"%in%names(data) &"UnitPrice"%in%names(data)) { data_clean <- data %>%filter(Quantity >0, UnitPrice >0) %>%mutate(Revenue = Quantity * UnitPrice)# Group by country for the map country_revenue <- data_clean %>%group_by(Country) %>%summarise(TotalRevenue =sum(Revenue)) %>%mutate(Country =tolower(Country))head(country_revenue)} else {print("Error: Columns 'Quantity' or 'UnitPrice' not found in the dataset.")}
# A tibble: 6 × 2
Country TotalRevenue
<chr> <dbl>
1 australia 138521.
2 austria 10199.
3 bahrain 754.
4 belgium 41196.
5 brazil 1144.
6 canada 3666.
Step 4: Create Funnel Data
Before we can map the data, R needs to know the “shapes” of the countries . We use map_data("world") to create a table of thousands of points that outline global borders. We convert all country names to lowercase to ensure they match our sales data perfectly .
world_map <-map_data("world")world_map$region <-tolower(world_map$region)# This will display the output for Step 4head(world_map)
long lat group order region subregion
1 -69.89912 12.45200 1 1 aruba <NA>
2 -69.89571 12.42300 1 2 aruba <NA>
3 -69.94219 12.43853 1 3 aruba <NA>
4 -70.00415 12.50049 1 4 aruba <NA>
5 -70.06612 12.54697 1 5 aruba <NA>
6 -70.05088 12.59707 1 6 aruba <NA>
Step 5: Merge Data with Map
This is the most critical technical step. We use a left_join to attach our calculated sales figures to the map coordinates. By matching on the “region” (country name), we tell R exactly which “shape” on the map should represent which “revenue” value .
# Synchronize country names for the mergecountry_revenue$region <- country_revenue$Countrycountry_revenue$region[country_revenue$region =="united kingdom"] <-"uk"country_revenue$region[country_revenue$region =="usa"] <-"usa"# Merge the datasetsmerged_data <-left_join(world_map, country_revenue, by ="region")# This will display the output for Step 5head(merged_data)
long lat group order region subregion Country TotalRevenue
1 -69.89912 12.45200 1 1 aruba <NA> <NA> NA
2 -69.89571 12.42300 1 2 aruba <NA> <NA> NA
3 -69.94219 12.43853 1 3 aruba <NA> <NA> NA
4 -70.00415 12.50049 1 4 aruba <NA> <NA> NA
5 -70.06612 12.54697 1 5 aruba <NA> <NA> NA
6 -70.05088 12.59707 1 6 aruba <NA> <NA> NA
Step 6: Create Choropleth Map
The final step is the visualization.
geom_polygon: Draws the country shapes using the coordinates from Step 4.
aes(fill = TotalRevenue): Tells R to color each country based on its sales volume.
scale_fill_gradient: Defines the color scheme (e.g., light blue for low sales, dark blue for high sales) .
labs(): Adds the professional finishing touches like the title and legend labels .
ggplot(merged_data, aes(x = long, y = lat, group = group)) +geom_polygon(aes(fill = TotalRevenue), color ="white", linewidth =0.1) +# Changed 'size' to 'linewidth'scale_fill_gradient(low ="lightblue", high ="darkblue", na.value ="grey90") +theme_minimal() +labs(title ="Global E-commerce Revenue (2010-2011)",x ="Longitude", y ="Latitude",fill ="Revenue ($)")
Global Revenue Distribution Map
LA-1
Problem Statement
Develop a script in R to create a funnel chart illustrating user drop-off on an e-commerce website using real dataset.
What we will do?
Load required libraries
Load dataset
Perform exploratory data analysis
Create funnel data
Create funnel chart
Analyze results
Step 1: Load Required Libraries
In this step , we load required Libraries
ggplot2: The primary engine for creating the map and charts.
dplyr: A tool for “data plumbing” used to filter, group, and calculate revenue.
maps: Provides the spatial coordinates (longitude and latitude) for world countries.
readr: Used to import your data.csv file quickly and accurately .
Step 2: Load Dataset
We bring your external CSV file into R’s memory as an object named data. Using head(data) allows us to verify that columns like Quantity, UnitPrice, and Country loaded correctly.
Load the actual sales data
data <- read.csv(“data.csv”)
Verify the column names exist
names(data)
Step 3: Exploratory Data Analysis
This is the “cleaning” phase .
Cleaning: We filter out transactions where Quantity or UnitPrice are zero or negative to ensure the revenue calculation is realistic.
Calculation: We create a new Revenue column by multiplying Quantity and UnitPrice .
Aggregation: We use group_by(Country) and summarise() to get one total revenue figure for each country rather than thousands of individual rows.
Step 4: Create Funnel Data
Before we can map the data, R needs to know the “shapes” of the countries . We use map_data("world") to create a table of thousands of points that outline global borders. We convert all country names to lowercase to ensure they match our sales data perfectly .
world_map <- map_data(“world”)
world_map\(region <- tolower(world_map\)region)
This will display the output for Step 4
head(world_map)
Step 5: Merge Data with Map
This is the most critical technical step. We use a left_join to a left_join to attach our calculated sales figures to the map coordinates. By matching on the “region” (country name), we tell R exactly which “shape” on the map should represent which “revenue” value .