transaction_hour: The hour of the day when the transaction occurred.
weekend_transaction: Indicates whether the transaction took place on a weekend (TRUE/FALSE).
velocity_last_hour: The number of transactions made by the customer in the last hour.
is_fraud: A binary label indicating whether the transaction was fraudulent (TRUE/FALSE).
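To illustrate how the two time-based fields relate to the raw timestamp column, the sketch below derives them in base R. It assumes timestamps in a `YYYY-MM-DD HH:MM:SS` layout interpreted as UTC. The `toy` data frame and its values are hypothetical, and the dataset's own derivation may differ (for example, in how time zones are handled).

```r
# Toy timestamps (hypothetical values, not taken from the dataset)
toy <- data.frame(
  timestamp = c("2024-09-30 08:15:00",   # a Monday
                "2024-10-05 23:40:00",   # a Saturday
                "2024-10-06 00:05:00"),  # a Sunday
  stringsAsFactors = FALSE
)

# Parse the strings; the format and the UTC time zone are assumptions
ts <- as.POSIXlt(toy$timestamp, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Hour of day as an integer in 0..23
toy$transaction_hour <- as.integer(ts$hour)

# POSIXlt stores the weekday as 0 (Sunday) through 6 (Saturday)
toy$weekend_transaction <- ts$wday %in% c(0, 6)

print(toy)
```

Reading `hour` and `wday` from a `POSIXlt` object avoids locale-dependent weekday names, so the weekend flag does not change with the system language.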
Together, these variables describe the context of each transaction alongside a fraud label, which makes it possible to explore the patterns and behaviors associated with fraudulent transactions.
This project has three goals: perform exploratory data analysis, apply statistical procedures to identify factors significantly related to fraud, and develop insights that can aid fraud detection.
library(dplyr)
library(ggplot2)
data <- read.csv("/Users/solenne/Desktop/synthetic_fraud_data.csv")
colnames(data)
##  [1] "transaction_id"      "customer_id"         "card_number"
##  [4] "timestamp"           "merchant_category"   "merchant_type"
##  [7] "merchant"            "amount"              "currency"
## [10] "country"             "city"                "city_size"
## [13] "card_type"           "card_present"        "device"
## [16] "channel"             "device_fingerprint"  "ip_address"
## [19] "distance_from_home"  "high_risk_merchant"  "transaction_hour"
## [22] "weekend_transaction" "velocity_last_hour"  "is_fraud"
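With the columns confirmed, one natural first exploratory step is to compare the fraud rate across groups. The sketch below does this for `weekend_transaction` with dplyr. It runs on a small hypothetical data frame (`toy`) rather than the real file, and it assumes `is_fraud` and `weekend_transaction` are stored as logical values; if they were read in as text, they would need converting first.

```r
library(dplyr)

# Toy stand-in for the loaded data (hypothetical values, not from the CSV)
toy <- data.frame(
  weekend_transaction = c(TRUE, TRUE, FALSE, FALSE, FALSE, FALSE),
  is_fraud            = c(TRUE, FALSE, FALSE, FALSE, TRUE, FALSE)
)

# Transaction count and share of fraudulent transactions per group;
# mean() of a logical vector gives the proportion of TRUE values
fraud_by_weekend <- toy %>%
  group_by(weekend_transaction) %>%
  summarise(
    n_transactions = n(),
    fraud_rate     = mean(is_fraud),
    .groups        = "drop"
  )

print(fraud_by_weekend)
```

The same pattern could be applied to the real data by replacing `toy` with `data`, and to other grouping columns such as `transaction_hour`.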