Direct-Mail Fundraising Final Project

Project Objective

We are developing a predictive model for the American Legion to improve cost efficiency in their direct-mail marketing campaigns. The ultimate goal is to predict which of the donors should be sent a campaign letter in the mail.

Data Source

The American Legion provided the data consisting of 3,000 records. About 50% of those are donors and the other 50% are non-donors. Weighted sampling was used to avoid imbalance created by different distributions.

Content Descriptions: Variables

The variables in the provided data are listed below with their descriptions. (This information was provided and pulled from the assignment guidelines.)

  • zip: Zip Code Group
  • homeowner: Yes = homeowner, No = not a homeowner
  • num_child: Number of children
  • income: Household income
  • female: Yes = Female, No = Male
  • wealth: Wealth rating uses median family income and population statistics from each area to index relative wealth within each state. The segments are denoted 0 to 9, with 9 being the highest-wealth group and zero the lowest. Each rating has a different meaning within each state.
  • home_value: Average home value in potential donor’s neighborhood in hundreds of dollars.
  • med_fam_inc: Median family income in potential donor’s neighborhood in hundreds of dollars.
  • avg_fam_inc: Average family income in potential donor’s neighborhood in hundreds.
  • pct_lt15k: Percent earning less than $15K in potential donor’s neighborhood.
  • num_prom: Lifetime number of promotions received to date.
  • lifetime_gifts: Dollar amount of lifetime gifts to date.
  • largest_gift: Dollar amount of largest gift to date.
  • last_gift: Dollar amount of most recent gift.
  • months_since_donate: Number of months from last donation to July 2018.
  • time_lag: Number of months between first and second gift.
  • avg_gift: Average dollar amount of gifts to date.
  • target: Outcome variable: binary indicator for response Yes = donor, No = non-donor.

The provided data sets were loaded into the model using the following code:

fundraising <- read_rds("C:/Users/lacyb/Documents/2022 Spring/STA 6543 - Algorithms II/Final Project/fundraising.rds")
future.fundraising <- read_rds("C:/Users/lacyb/Documents/2022 Spring/STA 6543 - Algorithms II/Final Project/future_fundraising.rds")

Variable Transformation

The variables in the data needed to be converted to numeric values. All predictors were converted using the following code:

fund.num <- fundraising
fund.num <- as.data.frame(sapply(fund.num[, c(1:21)], as.numeric))

Exploratory Analysis

Several methods of exploring the data have been used. First, we create a correlation matrix. Then, we create pairwise plots. Finally we check variance and collinearity.

Correlation Matrix

The following was created as a correlation matrix:

round(cor(fund.num), 4)
##                     zipconvert2 zipconvert3 zipconvert4 zipconvert5 homeowner
## zipconvert2              1.0000      0.2490     -0.2742     -0.4150   -0.0224
## zipconvert3              0.2490      1.0000      0.2477      0.3750   -0.0812
## zipconvert4             -0.2742      0.2477      1.0000     -0.4130    0.0416
## zipconvert5             -0.4150      0.3750     -0.4130      1.0000   -0.0825
## homeowner               -0.0224     -0.0812      0.0416     -0.0825    1.0000
## num_child                0.0307      0.0080      0.0269     -0.0417   -0.0452
## income                  -0.0022      0.0942     -0.0203      0.0960   -0.3205
## female                   0.0108     -0.0235     -0.0226     -0.0108    0.0015
## wealth                   0.0425      0.0825     -0.0142      0.0434   -0.0658
## home_value              -0.1339      0.2377     -0.1831      0.4546   -0.1187
## med_fam_inc             -0.0894      0.0966     -0.0370      0.1835   -0.1383
## avg_fam_inc             -0.0792      0.1118     -0.0464      0.1950   -0.1335
## pct_lt15k                0.0495     -0.0320      0.0648     -0.1209    0.1375
## num_prom                -0.0337     -0.0440      0.0293     -0.0303   -0.0033
## lifetime_gifts          -0.0313      0.0104      0.0312      0.0091    0.0282
## largest_gift            -0.0184     -0.0018     -0.0158      0.0281    0.0333
## last_gift               -0.0087      0.0507     -0.0309      0.0745    0.0009
## months_since_donate      0.0158      0.0198      0.0253     -0.0179   -0.0214
## time_lag                -0.0185     -0.0439     -0.0148     -0.0084   -0.0263
## avg_gift                -0.0087      0.0622     -0.0277      0.0811    0.0083
## target                   0.0061     -0.0109      0.0053     -0.0211    0.0266
##                     num_child  income  female  wealth home_value med_fam_inc
## zipconvert2            0.0307 -0.0022  0.0108  0.0425    -0.1339     -0.0894
## zipconvert3            0.0080  0.0942 -0.0235  0.0825     0.2377      0.0966
## zipconvert4            0.0269 -0.0203 -0.0226 -0.0142    -0.1831     -0.0370
## zipconvert5           -0.0417  0.0960 -0.0108  0.0434     0.4546      0.1835
## homeowner             -0.0452 -0.3205  0.0015 -0.0658    -0.1187     -0.1383
## num_child              1.0000  0.0919  0.0296  0.0602    -0.0120      0.0470
## income                 0.0919  1.0000  0.0438  0.2090     0.2920      0.3675
## female                 0.0296  0.0438  1.0000  0.0294     0.0210      0.0235
## wealth                 0.0602  0.2090  0.0294  1.0000     0.2612      0.3778
## home_value            -0.0120  0.2920  0.0210  0.2612     1.0000      0.7382
## med_fam_inc            0.0470  0.3675  0.0235  0.3778     0.7382      1.0000
## avg_fam_inc            0.0473  0.3786  0.0252  0.3859     0.7526      0.9723
## pct_lt15k             -0.0317 -0.2832 -0.0552 -0.3751    -0.3991     -0.6654
## num_prom              -0.0864 -0.0690 -0.0382 -0.4121    -0.0645     -0.0508
## lifetime_gifts        -0.0510 -0.0196 -0.0371 -0.2255    -0.0241     -0.0352
## largest_gift          -0.0176  0.0332 -0.0014 -0.0253     0.0565      0.0470
## last_gift             -0.0129  0.1096  0.0464  0.0526     0.1589      0.1360
## months_since_donate   -0.0056  0.0772  0.0451  0.0337     0.0234      0.0323
## time_lag              -0.0061 -0.0015  0.0080 -0.0664     0.0007      0.0152
## avg_gift              -0.0197  0.1241  0.0745  0.0911     0.1688      0.1372
## target                 0.0423 -0.0360  0.0248 -0.0031    -0.0216     -0.0080
##                     avg_fam_inc pct_lt15k num_prom lifetime_gifts largest_gift
## zipconvert2             -0.0792    0.0495  -0.0337        -0.0313      -0.0184
## zipconvert3              0.1118   -0.0320  -0.0440         0.0104      -0.0018
## zipconvert4             -0.0464    0.0648   0.0293         0.0312      -0.0158
## zipconvert5              0.1950   -0.1209  -0.0303         0.0091       0.0281
## homeowner               -0.1335    0.1375  -0.0033         0.0282       0.0333
## num_child                0.0473   -0.0317  -0.0864        -0.0510      -0.0176
## income                   0.3786   -0.2832  -0.0690        -0.0196       0.0332
## female                   0.0252   -0.0552  -0.0382        -0.0371      -0.0014
## wealth                   0.3859   -0.3751  -0.4121        -0.2255      -0.0253
## home_value               0.7526   -0.3991  -0.0645        -0.0241       0.0565
## med_fam_inc              0.9723   -0.6654  -0.0508        -0.0352       0.0470
## avg_fam_inc              1.0000   -0.6803  -0.0573        -0.0403       0.0431
## pct_lt15k               -0.6803    1.0000   0.0378         0.0596      -0.0079
## num_prom                -0.0573    0.0378   1.0000         0.5386       0.1138
## lifetime_gifts          -0.0403    0.0596   0.5386         1.0000       0.5073
## largest_gift             0.0431   -0.0079   0.1138         0.5073       1.0000
## last_gift                0.1314   -0.0618  -0.0559         0.2021       0.4472
## months_since_donate      0.0313   -0.0090  -0.2823        -0.1446       0.0198
## time_lag                 0.0243   -0.0199   0.1196         0.0385       0.0400
## avg_gift                 0.1318   -0.0625  -0.1473         0.1823       0.4748
## target                  -0.0032    0.0008  -0.0684        -0.0196       0.0178
##                     last_gift months_since_donate time_lag avg_gift  target
## zipconvert2           -0.0087              0.0158  -0.0185  -0.0087  0.0061
## zipconvert3            0.0507              0.0198  -0.0439   0.0622 -0.0109
## zipconvert4           -0.0309              0.0253  -0.0148  -0.0277  0.0053
## zipconvert5            0.0745             -0.0179  -0.0084   0.0811 -0.0211
## homeowner              0.0009             -0.0214  -0.0263   0.0083  0.0266
## num_child             -0.0129             -0.0056  -0.0061  -0.0197  0.0423
## income                 0.1096              0.0772  -0.0015   0.1241 -0.0360
## female                 0.0464              0.0451   0.0080   0.0745  0.0248
## wealth                 0.0526              0.0337  -0.0664   0.0911 -0.0031
## home_value             0.1589              0.0234   0.0007   0.1688 -0.0216
## med_fam_inc            0.1360              0.0323   0.0152   0.1372 -0.0080
## avg_fam_inc            0.1314              0.0313   0.0243   0.1318 -0.0032
## pct_lt15k             -0.0618             -0.0090  -0.0199  -0.0625  0.0008
## num_prom              -0.0559             -0.2823   0.1196  -0.1473 -0.0684
## lifetime_gifts         0.2021             -0.1446   0.0385   0.1823 -0.0196
## largest_gift           0.4472              0.0198   0.0400   0.4748  0.0178
## last_gift              1.0000              0.1867   0.0751   0.8664  0.0777
## months_since_donate    0.1867              1.0000   0.0155   0.1891  0.1338
## time_lag               0.0751              0.0155   1.0000   0.0701 -0.0097
## avg_gift               0.8664              0.1891   0.0701   1.0000  0.0757
## target                 0.0777              0.1338  -0.0097   0.0757  1.0000

Pairwise Plots

The following plot was created:

ggpairs(fundraising[, c(10, 11, 12, 17, 20)], ggplot2::aes(colour = fundraising$target))

From the Correlation Matrix and Pairwise Plots, we can see that home_value, med_fam_inc, avg_fam_inc, last_gift, and avg_gift are all very highly correlated.

Variance

We used the following code to calculate variance:

var <- apply(fund.num, 2, var)
round(var, 4)
##         zipconvert2         zipconvert3         zipconvert4         zipconvert5 
##              0.1694              0.1500              0.1685              0.2368 
##           homeowner           num_child              income              female 
##              0.1768              0.1192              2.6877              0.2379 
##              wealth          home_value         med_fam_inc         avg_fam_inc 
##              6.4859         906581.4713          30183.1011          28528.3742 
##           pct_lt15k            num_prom      lifetime_gifts        largest_gift 
##            146.6433            518.9120          22314.4274            507.0452 
##           last_gift months_since_donate            time_lag            avg_gift 
##            109.7315             16.7713             31.3798             55.5123 
##              target 
##              0.2501

We then plotted the variance to produce the following graph:

op <- par(no.readonly = TRUE)
plot(var, type = "b", col = rainbow_hcl(10), xlim = c(1, 21), pch = 16, xlab = NA, xaxt='n',ylab="Variance")
axis(side = 1, at=1:21, las = 2, labels = names(fund.num))
abline(h=50, col = "darkgray")

Collinearity

We used the following code to check collinearity:

fund.nocat <- as.data.frame(fundraising[, c(6,7,9:20)])
vif <-  as.data.frame(usdm:: vif(fund.nocat))
vif
##              Variables       VIF
## 1            num_child  1.025019
## 2               income  1.194773
## 3               wealth  1.508819
## 4           home_value  2.493018
## 5          med_fam_inc 18.423616
## 6          avg_fam_inc 20.688945
## 7            pct_lt15k  2.040761
## 8             num_prom  1.962585
## 9       lifetime_gifts  1.994202
## 10        largest_gift  1.715238
## 11           last_gift  4.153071
## 12 months_since_donate  1.145515
## 13            time_lag  1.032467
## 14            avg_gift  4.469569

We then plotted the collinearity to produce the following graph:

op <- par(no.readonly = TRUE)
plot(vif$VIF, type = "b", col = rainbow_hcl(21), xlim = c(1,14), pch = 16, ylab = "Vif",  xlab = NA, xaxt='n')
axis(side = 1, at=1:14, las = 2, labels = names(fund.nocat))
abline(h=5, col = "blue")
abline(h=1, col = "red")

The variables med_fam_inc and avg_fam_inc have variance inflation of over 10. This means colinearity exists and that these predictors are highly unlikely to be significant. Based on collinearity values that are low, the following predictors are significant: * med_fam_inc * avg_fam_inc * last_gift * avg_gift Based on VIF values close to 1, the following predictors are possibly significant: * num_child * income * wealth * largest_gift * month_since_donate * time_lag

Partitioning the Data

We partition the original data into two sets. The training set has 80% of the records while the testing set has the remaining 20%. The following code was used to partition the data:

set.seed(12345)

inTrain <- caret::createDataPartition(fundraising$target, p = 0.8, list = FALSE)
inTrain <- sample(3000, 2400)
train <- fundraising[inTrain, ]
test <- fundraising[-inTrain, ]

Classification Models

The classification models used in this analysis are Logistic Regression, K-Nearest Neighbors, and Support Vector Machine with Radial as the Kernel.

Logistic Regression

Using the following chunks of code, we fit a logistic regression model to the data.

glm.fit = glm(target ~ homeowner + num_child + female + income + wealth + med_fam_inc + avg_fam_inc + pct_lt15k + lifetime_gifts + num_prom + largest_gift + last_gift + avg_gift + months_since_donate + time_lag + home_value, data = train, family = "binomial")
summary(glm.fit)
## 
## Call:
## glm(formula = target ~ homeowner + num_child + female + income + 
##     wealth + med_fam_inc + avg_fam_inc + pct_lt15k + lifetime_gifts + 
##     num_prom + largest_gift + last_gift + avg_gift + months_since_donate + 
##     time_lag + home_value, family = "binomial", data = train)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.7783  -1.1505   0.4798   1.1614   1.7596  
## 
## Coefficients:
##                       Estimate Std. Error z value Pr(>|z|)    
## (Intercept)         -1.613e+00  5.066e-01  -3.185  0.00145 ** 
## homeownerNo          8.207e-04  1.054e-01   0.008  0.99379    
## num_child            2.499e-01  1.290e-01   1.937  0.05275 .  
## femaleNo             2.126e-02  8.586e-02   0.248  0.80444    
## income              -6.796e-02  2.915e-02  -2.331  0.01973 *  
## wealth              -2.365e-02  1.995e-02  -1.185  0.23588    
## med_fam_inc         -1.143e-03  1.027e-03  -1.112  0.26601    
## avg_fam_inc          1.257e-03  1.126e-03   1.116  0.26440    
## pct_lt15k           -1.532e-03  4.876e-03  -0.314  0.75340    
## lifetime_gifts       1.283e-04  3.808e-04   0.337  0.73612    
## num_prom            -3.802e-03  2.564e-03  -1.483  0.13807    
## largest_gift        -1.618e-03  2.991e-03  -0.541  0.58854    
## last_gift            1.302e-02  8.651e-03   1.504  0.13246    
## avg_gift             1.322e-02  1.282e-02   1.031  0.30243    
## months_since_donate  5.508e-02  1.136e-02   4.851 1.23e-06 ***
## time_lag            -3.917e-03  7.499e-03  -0.522  0.60143    
## home_value          -1.120e-04  7.003e-05  -1.599  0.10991    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 3327.1  on 2399  degrees of freedom
## Residual deviance: 3247.8  on 2383  degrees of freedom
## AIC: 3281.8
## 
## Number of Fisher Scoring iterations: 4
glm.prob <- predict(glm.fit, test, type = "response")
glm.pred <- as.factor(ifelse(glm.prob > 0.5, "Donor", "No Donor"))

car::vif(glm.fit)
##           homeowner           num_child              female              income 
##            1.134218            1.030344            1.017859            1.317792 
##              wealth         med_fam_inc         avg_fam_inc           pct_lt15k 
##            1.506162           18.084577           20.461406            2.020462 
##      lifetime_gifts            num_prom        largest_gift           last_gift 
##            1.995230            1.872537            1.842279            3.607303 
##            avg_gift months_since_donate            time_lag          home_value 
##            3.918775            1.123660            1.031667            2.495922

To check the accuracy of the model, we used the following code:

mean(glm.pred == test$target)
## [1] 0.43

This model can accurately predict only 43% of the data.

Next, we fit a model with the significant predictors to improve the accuracy. This becomes our Final Model. The following code was used:

glm.fi <- glm(target ~ num_child + income + months_since_donate, data =train, family = "binomial")
summary(glm.fit)
## 
## Call:
## glm(formula = target ~ homeowner + num_child + female + income + 
##     wealth + med_fam_inc + avg_fam_inc + pct_lt15k + lifetime_gifts + 
##     num_prom + largest_gift + last_gift + avg_gift + months_since_donate + 
##     time_lag + home_value, family = "binomial", data = train)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.7783  -1.1505   0.4798   1.1614   1.7596  
## 
## Coefficients:
##                       Estimate Std. Error z value Pr(>|z|)    
## (Intercept)         -1.613e+00  5.066e-01  -3.185  0.00145 ** 
## homeownerNo          8.207e-04  1.054e-01   0.008  0.99379    
## num_child            2.499e-01  1.290e-01   1.937  0.05275 .  
## femaleNo             2.126e-02  8.586e-02   0.248  0.80444    
## income              -6.796e-02  2.915e-02  -2.331  0.01973 *  
## wealth              -2.365e-02  1.995e-02  -1.185  0.23588    
## med_fam_inc         -1.143e-03  1.027e-03  -1.112  0.26601    
## avg_fam_inc          1.257e-03  1.126e-03   1.116  0.26440    
## pct_lt15k           -1.532e-03  4.876e-03  -0.314  0.75340    
## lifetime_gifts       1.283e-04  3.808e-04   0.337  0.73612    
## num_prom            -3.802e-03  2.564e-03  -1.483  0.13807    
## largest_gift        -1.618e-03  2.991e-03  -0.541  0.58854    
## last_gift            1.302e-02  8.651e-03   1.504  0.13246    
## avg_gift             1.322e-02  1.282e-02   1.031  0.30243    
## months_since_donate  5.508e-02  1.136e-02   4.851 1.23e-06 ***
## time_lag            -3.917e-03  7.499e-03  -0.522  0.60143    
## home_value          -1.120e-04  7.003e-05  -1.599  0.10991    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 3327.1  on 2399  degrees of freedom
## Residual deviance: 3247.8  on 2383  degrees of freedom
## AIC: 3281.8
## 
## Number of Fisher Scoring iterations: 4
glm.prob <- predict(glm.fit, test, type = "response")
glm.pred <- as.factor(ifelse(glm.prob > 0.5, "Donor", "No Donor"))

car::vif(glm.fit)
##           homeowner           num_child              female              income 
##            1.134218            1.030344            1.017859            1.317792 
##              wealth         med_fam_inc         avg_fam_inc           pct_lt15k 
##            1.506162           18.084577           20.461406            2.020462 
##      lifetime_gifts            num_prom        largest_gift           last_gift 
##            1.995230            1.872537            1.842279            3.607303 
##            avg_gift months_since_donate            time_lag          home_value 
##            3.918775            1.123660            1.031667            2.495922

To check the accuracy of the new model, we used the following code:

mean(glm.pred == test$target)
## [1] 0.43

This new model can accurately predict only 43.67% of the data. This is slightly better than the original model we created.

K-Nearest Neighbors

Using the following chunks of code, we fit a KNN model to the data.

# training matrix
train.x <- as.matrix(train[, c(6, 7, 9:20)])
# testing matrix
test.x <- as.matrix(test[, c(6, 7, 9:20)])

set.seed(12345)

knn.pred <- knn(train.x, test.x, train$target , k = 10)

# check accuracy 
mean(knn.pred == test$target)
## [1] 0.5033333

This produced an accuracy rate of 50.33%. Now we will fit a new model with the significant predictors to become our Final Model. The following code was used:

# training matrix
train.x <- as.matrix(train[, c(6, 7, 18)])
# testing matrix
test.x <- as.matrix(test[, c(6, 7, 18)])

set.seed(12345)

knn.pred <- knn(train.x, test.x, train$target , k = 10)

mean(knn.pred == test$target)
## [1] 0.5366667

This produced an accuracy rate of 53.66% which is higher than the original KNN model.

Support Vector Machine with Radial as the Kernel

Using the following chunks of code, we fit a SVM model to the data using Radial as the Kernel.

# cross validation
tune.rad <- tune(svm, target ~ homeowner + num_child + female + income + wealth + med_fam_inc + avg_fam_inc + pct_lt15k + lifetime_gifts + num_prom + largest_gift + last_gift + avg_gift + months_since_donate + time_lag + home_value, data = train, kernel = "radial", ranges = list(cost=seq(0.01, 5, 0.333)))
op <- tune.rad$best.parameters$cost

# svm radial
svm.rad <- svm(target ~ homeowner + num_child + female + income + wealth + med_fam_inc + avg_fam_inc + pct_lt15k + lifetime_gifts + num_prom + largest_gift + last_gift + avg_gift + months_since_donate + time_lag + home_value, kernel = "radial", data = train, cost = op)
summary(svm.rad)
## 
## Call:
## svm(formula = target ~ homeowner + num_child + female + income + 
##     wealth + med_fam_inc + avg_fam_inc + pct_lt15k + lifetime_gifts + 
##     num_prom + largest_gift + last_gift + avg_gift + months_since_donate + 
##     time_lag + home_value, data = train, kernel = "radial", cost = op)
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  radial 
##        cost:  1.342 
## 
## Number of Support Vectors:  2197
## 
##  ( 1100 1097 )
## 
## 
## Number of Classes:  2 
## 
## Levels: 
##  Donor No Donor
svm.pred <- predict(svm.rad, test)

mean(svm.pred == test$target)
## [1] 0.56

This model produced a 56% accuracy rate. We will now fit a new model with the significant predictors that will become our Final Model. The following code was used:

# cross validation
tune.rad <- tune(svm, target ~ num_child + income + months_since_donate, data = train, kernel = "radial", ranges = list(cost = seq(0.01, 5, 0.333)))
op = tune.rad$best.parameters$cost

# svm radial
svm.rad = svm(target ~ num_child + income + months_since_donate, kernel = "radial", data = train, cost = op)
summary(svm.rad)
## 
## Call:
## svm(formula = target ~ num_child + income + months_since_donate, 
##     data = train, kernel = "radial", cost = op)
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  radial 
##        cost:  0.676 
## 
## Number of Support Vectors:  2205
## 
##  ( 1105 1100 )
## 
## 
## Number of Classes:  2 
## 
## Levels: 
##  Donor No Donor
svm.pred <- predict(svm.rad, test)

mean(svm.pred == test$target)
## [1] 0.57

This new model produced a 57% accuracy rate; 1% better than the first.

Types of Analysis Performed

We have performed Logistic Regression, K-Nearest Neighbors, and Support Vector Machine (with Radial as the Kernel) models on this data. These were the three most suitable for the data.

Exclusions

Through analysis, we identified variables that were insignificant (they made no difference in accuracy) and variables that had high VIF values. These two types of variables were excluded from the models for accuracy.

Model Performance

SVM with Radial Kernel had the best performance of the three models demonstrated. It produced a 57% accuracy rate. KNN had the second best accuracy rate at 53%. Logistic Regression performed the worst with only a 43% accuracy rate.

Best Model

Our best model was SVM where we used the .tune() function for cross-validation. The lowest cost parameter determined was 0.34.

Testing

The following code has been used to test the model:

future.value = predict(svm.rad, future.fundraising)

Final CSV File

To create a final CSV file, the following code was used:

write.table(future.value, file="Future Value Predictions.csv", col.names = c("value"),sep = ",", row.names = F, quote = F)