setwd("~/R TRAINING")
Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
Attaching package: 'mice'
The following object is masked from 'package:stats':
filter
The following objects are masked from 'package:base':
cbind, rbind
corrplot 0.95 loaded
'data.frame': 300 obs. of 14 variables:
$ TransactionID : int 1 2 3 4 5 6 7 8 9 10 ...
$ CustomerID : int 207 253 110 256 274 52 191 165 18 169 ...
$ ProductID : int 14 15 19 1 2 15 10 5 5 17 ...
$ Quantity : int 5 1 5 3 5 2 1 5 4 2 ...
$ PaymentMethod : chr "UPI" "Cash on Delivery" "Cash on Delivery" "Debit Card" ...
$ TransactionDate : chr "12/28/2023" "4/17/2023" "1/17/2023" "10/23/2023" ...
$ ProductCategory : chr "Clothing" "Electronics" "Clothing" "Home Appliances" ...
$ Price : num 33.4 389 145.9 215 325.5 ...
$ Rating : num 4.2 2.2 1.7 2.7 3.4 2.2 3.7 1.4 1.4 1.9 ...
$ TotalAmount : num 167 389 729 645 1627 ...
$ Age : int 50 19 37 50 24 50 40 37 25 53 ...
$ Gender : chr "Male" "Female" "Female" "Male" ...
$ Location : chr "Houston" "Chicago" "Houston" "Los Angeles" ...
$ MembershipStatus: chr "Basic" "Premium" "Basic" "Premium" ...
 TransactionID      CustomerID      ProductID        Quantity
 Min.   :  1.00   Min.   :  4.0   Min.   : 1.00   Min.   :1.000
 1st Qu.: 75.75   1st Qu.: 77.0   1st Qu.: 6.00   1st Qu.:2.000
 Median :150.50   Median :156.5   Median :11.00   Median :3.000
 Mean   :150.50   Mean   :154.8   Mean   :10.66   Mean   :3.127
 3rd Qu.:225.25   3rd Qu.:229.2   3rd Qu.:15.00   3rd Qu.:5.000
 Max.   :300.00   Max.   :299.0   Max.   :20.00   Max.   :5.000
 PaymentMethod      TransactionDate    ProductCategory        Price
 Length:300         Length:300         Length:300         Min.   : 33.36
 Class :character   Class :character   Class :character   1st Qu.:145.89
 Mode  :character   Mode  :character   Mode  :character   Median :215.02
                                                          Mean   :253.52
                                                          3rd Qu.:350.44
                                                          Max.   :466.49
     Rating       TotalAmount           Age           Gender
 Min.   :1.400   Min.   :  33.36   Min.   :18.00   Length:300
 1st Qu.:2.100   1st Qu.: 350.44   1st Qu.:30.00   Class :character
 Median :2.600   Median : 662.55   Median :42.00   Mode  :character
 Mean   :2.928   Mean   : 803.12   Mean   :41.79
 3rd Qu.:3.700   3rd Qu.:1134.50   3rd Qu.:52.25
 Max.   :5.000   Max.   :2332.45   Max.   :65.00
                                   NA's   :8
   Location         MembershipStatus
 Length:300         Length:300
 Class :character   Class :character
 Mode  :character   Mode  :character
   TransactionID       CustomerID        ProductID         Quantity
               0                0                0                0
   PaymentMethod  TransactionDate  ProductCategory            Price
               0                0                0                0
          Rating      TotalAmount              Age           Gender
               0                0                8                0
        Location MembershipStatus
               0                0
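Per-column counts of this shape are what `colSums(is.na(...))` returns. A minimal sketch on a made-up data frame follows; the `toy` object and its values are hypothetical and are not the transaction data summarised above.

```r
# Hypothetical toy data; NOT the 300-row transaction file summarised above.
toy <- data.frame(
  Quantity = c(5L, 1L, 3L, 2L),
  Age      = c(50L, NA, 37L, NA),
  Gender   = c("Male", "Female", "Female", "Male")
)

# Count missing values per column (is.na() on a data frame gives a logical matrix).
na_counts <- colSums(is.na(toy))
print(na_counts)

# Drop incomplete rows; the indices of the dropped rows are stored
# in the "na.action" attribute of the result.
toy_complete <- na.omit(toy)
print(attr(toy_complete, "na.action"))
```

Whether to drop or impute the missing `Age` values is a separate decision; this sketch only shows how such counts can be obtained.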
Call:
lm(formula = TotalAmount ~ Age + Price + Quantity + Rating, data = data)
Residuals:
    Min      1Q  Median      3Q     Max
-493.37  -98.03    8.40   99.71  447.75
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -725.30851   52.37741 -13.848   <2e-16 ***
Age            0.08560    0.74111   0.116    0.908
Price          3.14353    0.08184  38.412   <2e-16 ***
Quantity     238.95715    6.71914  35.564   <2e-16 ***
Rating        -8.46592    9.18348  -0.922    0.357
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 166.5 on 291 degrees of freedom
Multiple R-squared: 0.9076, Adjusted R-squared: 0.9063
F-statistic: 714.4 on 4 and 291 DF, p-value: < 2.2e-16
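The 291 residual degrees of freedom imply that 296 rows entered this fit (291 plus the 5 estimated coefficients). As a rough sketch of how such a model can be fitted and its coefficient table extracted, the example below uses simulated data. The `sim` data frame, its value ranges, and the rule `TotalAmount = Price * Quantity` are assumptions made for illustration (the first rows of the `str` output are consistent with that rule, e.g. 33.4 × 5 ≈ 167, but the source does not state it).

```r
set.seed(1)  # simulated data only; these values are not from the source
n <- 200
sim <- data.frame(
  Price    = runif(n, 30, 470),
  Quantity = sample(1:5, n, replace = TRUE),
  Age      = sample(18:65, n, replace = TRUE),
  Rating   = round(runif(n, 1, 5), 1)
)
# Assumption for illustration: total is price times quantity.
sim$TotalAmount <- sim$Price * sim$Quantity

# Same formula as in the Call shown above, fitted to the simulated data.
fit <- lm(TotalAmount ~ Age + Price + Quantity + Rating, data = sim)

# Coefficient matrix with columns Estimate, Std. Error, t value, Pr(>|t|).
coefs <- coef(summary(fit))
print(round(coefs, 4))

# Residual degrees of freedom: rows used minus number of coefficients.
print(df.residual(fit))
```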
'data.frame': 283 obs. of 14 variables:
$ TransactionID : int 1 2 3 4 5 6 7 8 9 10 ...
$ CustomerID : int 207 253 110 256 274 52 191 165 18 169 ...
$ ProductID : int 14 15 19 1 2 15 10 5 5 17 ...
$ Quantity : int 5 1 5 3 5 2 1 5 4 2 ...
$ PaymentMethod : chr "UPI" "Cash on Delivery" "Cash on Delivery" "Debit Card" ...
$ TransactionDate : chr "12/28/2023" "4/17/2023" "1/17/2023" "10/23/2023" ...
$ ProductCategory : chr "Clothing" "Electronics" "Clothing" "Home Appliances" ...
$ Price : num 33.4 389 145.9 215 325.5 ...
$ Rating : num 4.2 2.2 1.7 2.7 3.4 2.2 3.7 1.4 1.4 1.9 ...
$ TotalAmount : num 167 389 729 645 1627 ...
$ Age : num 50 19 37 50 24 50 40 37 25 53 ...
$ Gender : chr "Male" "Female" "Female" "Male" ...
$ Location : chr "Houston" "Chicago" "Houston" "Los Angeles" ...
$ MembershipStatus: chr "Basic" "Premium" "Basic" "Premium" ...
- attr(*, "na.action")= 'omit' Named int [1:4] 90 220 244 255
..- attr(*, "names")= chr [1:4] "90" "220" "244" "255"
Two Sample t-test
data: TotalAmount by Gender
t = -2.448, df = 281, p-value = 0.01498
alternative hypothesis: true difference in means between group Female and group Male is not equal to 0
95 percent confidence interval:
-286.44204 -31.10043
sample estimates:
mean in group Female   mean in group Male
            696.0371             854.8084
Welch Two Sample t-test
data: TotalAmount by Gender
t = -2.4726, df = 274.83, p-value = 0.01402
alternative hypothesis: true difference in means between group Female and group Male is not equal to 0
95 percent confidence interval:
-285.1822 -32.3603
sample estimates:
mean in group Female   mean in group Male
            696.0371             854.8084
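The two results above differ only in the variance assumption: the first pools the group variances (df = 281, which implies 283 rows), the second is the Welch version with a non-integer df. A minimal sketch of how both variants can be requested from `t.test`, using simulated data; the `sim` data frame, group sizes, means, and standard deviations are hypothetical.

```r
set.seed(2)  # simulated data only; values are not from the source
sim <- data.frame(
  Gender      = rep(c("Female", "Male"), times = c(40, 45)),
  TotalAmount = c(rnorm(40, mean = 700, sd = 500),
                  rnorm(45, mean = 850, sd = 550))
)

# Pooled-variance (Student) test: assumes equal variances in both groups.
pooled <- t.test(TotalAmount ~ Gender, data = sim, var.equal = TRUE)

# Welch test: R's default, does not assume equal variances.
welch <- t.test(TotalAmount ~ Gender, data = sim)

print(pooled$parameter)  # df = n1 + n2 - 2
print(welch$parameter)   # Welch-Satterthwaite df, generally non-integer
print(c(pooled = pooled$p.value, welch = welch$p.value))
```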
Pearson's Chi-squared test
data: table(data$ProductCategory, data$PaymentMethod)
X-squared = 2.2394, df = 6, p-value = 0.8964
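For a test of independence on a two-way table, the degrees of freedom are (rows - 1) × (columns - 1); df = 6 is consistent with, for example, a 3 × 4 or 4 × 3 table. The sketch below illustrates this on simulated data. The variable names and the placeholder level labels (`A` to `C`, `P1` to `P4`) are hypothetical, chosen so as not to imply the actual category levels.

```r
set.seed(3)  # simulated data only; labels are placeholders
cat_var <- sample(c("A", "B", "C"), 300, replace = TRUE)
pay_var <- sample(c("P1", "P2", "P3", "P4"), 300, replace = TRUE)

# Cross-tabulate the two categorical variables.
tab <- table(cat_var, pay_var)
print(tab)

# Pearson's chi-squared test of independence.
res <- chisq.test(tab)
print(res$parameter)  # (3 - 1) * (4 - 1) = 6 degrees of freedom
print(res$expected)   # expected counts under independence
```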
                  Df   Sum Sq Mean Sq F value Pr(>F)
MembershipStatus   2   451973  225986   0.762  0.467
Residuals        293 86840140  296383
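In this table the F value is the ratio of the two mean squares (225986 / 296383 ≈ 0.762), and Df = 2 for `MembershipStatus` implies three levels. A minimal sketch of producing and checking such a table with `aov`, using simulated data; the `sim` data frame and the third level name `"Other"` are hypothetical, since only "Basic" and "Premium" appear in the output shown.

```r
set.seed(4)  # simulated data only; values are not from the source
sim <- data.frame(
  # "Other" is a placeholder for the third, unseen membership level.
  MembershipStatus = factor(sample(c("Basic", "Premium", "Other"),
                                   120, replace = TRUE)),
  TotalAmount      = rnorm(120, mean = 800, sd = 540)
)

# One-way ANOVA of TotalAmount across membership levels.
fit <- aov(TotalAmount ~ MembershipStatus, data = sim)
tab <- summary(fit)[[1]]  # data frame: Df, Sum Sq, Mean Sq, F value, Pr(>F)
print(tab)

# The F statistic is the between-group mean square over the residual mean square.
f_manual <- tab[["Mean Sq"]][1] / tab[["Mean Sq"]][2]
print(f_manual)
```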