filtered_data <- customer_data %>%
  filter(age > 50,
         purchase_amount > 500,
         purchase_date >= (Sys.Date() - years(1))) %>%
  arrange(desc(purchase_amount))
print(filtered_data)
##   customer_id age region purchase_amount purchase_date
## 1     CUST026  69   East          978.33    2024-06-19
## 2     CUST006  59  South          959.58    2024-06-29
## 3     CUST024  56  North          916.77    2024-07-23
## 4     CUST005  59  North          892.83    2025-01-17
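Explanation of Each Function:
- filter() – keeps only the rows where all of the following conditions hold (comma-separated conditions are combined with AND):
  - age > 50
  - purchase_amount > 500
  - purchase_date falls within the past year, counted back from today's date (Sys.Date()) using lubridate::years()
- arrange(desc(purchase_amount)) – sorts the remaining rows in descending order by purchase amount.
- %>% – the pipe operator passes the result of each step to the next function, creating a readable data processing pipeline.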
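To make the three filter conditions easier to check by hand, here is a minimal sketch of the same pipeline on a small made-up data frame. It assumes dplyr and lubridate are installed. The names `toy_customers`, `today`, `cutoff`, and `recent_big_spenders` are hypothetical and introduced only for illustration; the column names are assumed to match those in `customer_data`. The sketch uses a fixed reference date instead of `Sys.Date()` so the result is reproducible, and it swaps plain subtraction for lubridate's `%m-%` operator, which rolls back to the last valid day of the month (relevant only when the reference date is February 29).

```r
library(dplyr)
library(lubridate)

# Hypothetical toy data with the same column names as customer_data
toy_customers <- data.frame(
  customer_id     = c("A", "B", "C", "D", "E"),
  age             = c(62, 45, 58, 71, 55),
  purchase_amount = c(750, 900, 300, 620, 980),
  purchase_date   = as.Date(c("2024-11-05", "2024-12-01", "2025-01-10",
                              "2023-06-30", "2025-02-14")),
  stringsAsFactors = FALSE
)

# Fixed reference date (an assumption for reproducibility; the original uses Sys.Date())
today <- as.Date("2025-03-01")

# %m-% subtracts the period and rolls back to the last valid day of the month
cutoff <- today %m-% years(1)

recent_big_spenders <- toy_customers %>%
  filter(age > 50,                   # B is dropped (age 45)
         purchase_amount > 500,      # C is dropped (amount 300)
         purchase_date >= cutoff) %>%  # D is dropped (purchase before the cutoff)
  arrange(desc(purchase_amount))     # largest purchase first

print(recent_big_spenders)
```

Only customers E and A satisfy all three conditions, and E is listed first because it has the larger purchase amount.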
library(ggplot2)
print(
  ggplot(customer_data, aes(x = region, y = purchase_amount, fill = region)) +
    geom_boxplot() +
    labs(title = "Distribution of Purchase Amounts by Region",
         x = "Region",
         y = "Purchase Amount ($)") +
    theme_minimal() +
    theme(legend.position = "none",
          axis.text.x = element_text(angle = 45, hjust = 1))
)
````