R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

Note: This analysis was performed using the open-source software R and RStudio.

library(readr)
data <- read_csv('purchase_behavior_data.csv')
## Rows: 150 Columns: 9
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (6): Customer_ID, Gender, Region, Product_Category, Payment_Method, Loy...
## dbl  (2): Age, Purchase_Amount
## date (1): Purchase_Date
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(data)
## # A tibble: 6 × 9
##   Customer_ID   Age Gender Region Product_Category Purchase_Amount
##   <chr>       <dbl> <chr>  <chr>  <chr>                      <dbl>
## 1 CUST0001       56 Male   East   Books                      148. 
## 2 CUST0002       69 Other  West   Home Goods                  67.9
## 3 CUST0003       46 Other  South  Books                      351. 
## 4 CUST0004       32 Female North  Clothing                   318. 
## 5 CUST0005       60 Female East   Clothing                   440. 
## 6 CUST0006       25 Male   North  Beauty                     370. 
## # ℹ 3 more variables: Payment_Method <chr>, Loyalty_Program <chr>,
## #   Purchase_Date <date>
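The readr message above suggests specifying the column types. A minimal sketch of that, assuming the column names and types shown in the output and an ISO `YYYY-MM-DD` date format (the real file's date format is not shown). The temporary file, the `sample_data` name, and its two rows are made up for illustration and are not taken from `purchase_behavior_data.csv`.

```r
library(readr)

# Hypothetical two-row sample mirroring a few of the columns listed above;
# the values are invented purely so the example runs on its own.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "Customer_ID,Age,Product_Category,Purchase_Amount,Purchase_Date",
  "CUST9001,30,Books,10.50,2024-01-15",
  "CUST9002,41,Beauty,20.00,2024-02-01"
), tmp)

# Declaring the types up front avoids type guessing and silences the
# column-specification message. col_factor() with no levels argument
# takes the levels from the data in order of appearance.
sample_data <- read_csv(
  tmp,
  col_types = cols(
    Customer_ID      = col_character(),
    Age              = col_double(),
    Product_Category = col_factor(),
    Purchase_Amount  = col_double(),
    Purchase_Date    = col_date(format = "%Y-%m-%d")
  )
)

str(sample_data)
```

Reading `Product_Category` as a factor at import time would also make a later `as.factor()` conversion unnecessary.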
# Convert Product_Category to factor if it's not already
data$Product_Category <- as.factor(data$Product_Category)

# To compare amounts across categories, put the numeric variable on the left of the
# formula (Purchase_Amount ~ Product_Category), not the factor.
# boxplot() or ggplot2 gives one box per category.
boxplot(Purchase_Amount ~ Product_Category, data = data,
        main = "Purchase Amount by Product Category",
        xlab = "Product Category", ylab = "Purchase Amount",
        col = "lightblue", las = 2)
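The comment above mentions ggplot2 as an alternative to `boxplot()`. A rough ggplot2 equivalent might look like the sketch below. The `toy` data frame is hypothetical and stands in for `data` so the example runs on its own. The rotated axis labels approximate what `las = 2` does in base graphics.

```r
library(ggplot2)

# Toy data, made up for illustration; these are not values from the CSV.
toy <- data.frame(
  Product_Category = factor(c("Books", "Books", "Beauty",
                              "Beauty", "Clothing", "Clothing")),
  Purchase_Amount  = c(100, 200, 50, 150, 300, 400)
)

# One box per category, with the numeric variable on the y axis.
p <- ggplot(toy, aes(x = Product_Category, y = Purchase_Amount)) +
  geom_boxplot(fill = "lightblue") +
  labs(title = "Purchase Amount by Product Category",
       x = "Product Category", y = "Purchase Amount") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
p

# The centre line of each box is the per-category median,
# which can also be computed directly.
medians <- tapply(toy$Purchase_Amount, toy$Product_Category, median)
medians
```

With the real data, replacing `toy` with `data` should be the only change needed, assuming `Product_Category` has already been converted to a factor.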

# Answers

library(ggplot2)

ggplot(data, aes(x = Purchase_Amount)) +
  geom_histogram(binwidth = 20, fill = "steelblue", color = "black") +
  labs(title = "Distribution of Purchase Amounts",
       x = "Purchase Amount", y = "Frequency")
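The histogram above uses a fixed `binwidth = 20`. One data-driven alternative, shown here only as a sketch, is the Freedman-Diaconis rule, which sets the bin width to `2 * IQR(x) / n^(1/3)`. The `amounts` vector is hypothetical toy data, not the purchase amounts from the CSV, and whether this rule suits the real data is an open judgment call.

```r
library(ggplot2)

# Toy amounts, made up for illustration.
amounts <- c(50, 100, 150, 200, 250, 300, 350, 400)

# Freedman-Diaconis bin width: 2 * IQR / cube root of the sample size.
fd_width <- 2 * IQR(amounts) / length(amounts)^(1 / 3)
fd_width

# Plug the computed width into the same histogram call.
ggplot(data.frame(Purchase_Amount = amounts), aes(x = Purchase_Amount)) +
  geom_histogram(binwidth = fd_width, fill = "steelblue", color = "black") +
  labs(title = "Distribution of Purchase Amounts",
       x = "Purchase Amount", y = "Frequency")
```

A wider bin smooths out noise in small samples, while a narrower one shows more detail. Comparing a few widths is usually worthwhile before settling on one.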