I. Introduction

According to the National Automobile Dealers Association (NADA) Car Price Guide, in the United States, “the number of used car transactions is approximately three times as large as the number of new car transactions. Furthermore, the dispersion of used car prices is approximately five times as large as the dispersion of new car prices, suggesting that secondary market play an important role in broadening the spectrum of goods available to consumers.” (Gavazza, 2012) Given the huge amount and wide-spreaded influence of used car market, I would like to figure out these questions: What factors are close related to the prices of used cars? How significant these factors will influence the used car prices? Can we find out a formula to predict the prices of certain used cars? By answering these questions, I hope to provide guidance for those who would like to buy a used car for their daily life.

II. Data and Methods

Data Sourse

The main data source for this project is from Kaggle – Used Car database.

Key variables include but not limited to: name (name of the car), seller (private or dealer), price (the price of the ad to sell the car), vehicle type, year or registration (at which year the car is first registered), gearbox, power PS (power of the car in the PS), model, kilometer (how many kilometers the car has driven), month of registration (at which month the car was first registered), fuel type, brand, not repaired damage (if the car has a damage which is not repaired yet), nrOfPictures (number of pictures in the ad), postal code, dateCrawled (when the ad was first crawled), and lastSeenOnline (when the crawler saw the ad last online). The dataset consists of 21 variables with more than 370,000 used car items.

The dataset consists of 21 variables with more than 370,000 used car items.

Data Analysis Methods:

To find out the answer to the research questions, I use different ways to explore the trend and correlation in the U.S. used cars market, including mapping, line plots, linear regression, etc. But before this, the first step is cleaning the data. After the data visualization, I do the linear regression to get the correlation among variables.

III. Data Analysis and Visualization

Data Cleaning

Import data

## Classes 'data.table' and 'data.frame':   371824 obs. of  20 variables:
##  $ dateCrawled        : Factor w/ 34 levels "3/10/2016","3/11/2016",..: 15 15 5 8 22 31 28 12 31 8 ...
##  $ name               : Factor w/ 233698 levels "!!!!!!!!!!!!!!!!!!!!!!!!__GOLF_3_CABRIO___!!!!!!!!!!!!!!!!!!!!!!!",..: 79297 4148 90664 76682 170655 27397 145438 188524 64280 190572 ...
##  $ seller             : Factor w/ 2 levels "commercial","private": 2 2 2 2 2 2 2 2 2 2 ...
##  $ offerType          : Factor w/ 2 levels "offer","petition": 1 1 1 1 1 1 1 1 1 1 ...
##  $ price              : int  480 18300 9800 1500 3600 650 2200 0 14500 999 ...
##  $ abtest             : Factor w/ 2 levels "control","test": 2 2 2 2 2 2 2 2 1 2 ...
##  $ vehicleType        : Factor w/ 9 levels "","bus","cabrio",..: 1 4 9 8 8 6 3 6 2 8 ...
##  $ yearOfRegistration : int  1993 2011 2004 2001 2008 1995 2004 1980 2014 1998 ...
##  $ gearbox            : Factor w/ 3 levels "","automatic",..: 3 3 2 3 3 3 3 3 3 3 ...
##  $ powerPS            : int  0 190 163 75 69 102 109 50 125 101 ...
##  $ model              : Factor w/ 252 levels "","100","145",..: 120 1 121 120 105 13 10 42 63 120 ...
##  $ kilometer          : int  150000 125000 125000 150000 90000 150000 150000 40000 30000 150000 ...
##  $ monthOfRegistration: int  0 5 8 6 7 10 8 7 8 0 ...
##  $ fuelType           : Factor w/ 8 levels "","cng","diesel",..: 8 3 3 8 3 8 8 8 8 1 ...
##  $ brand              : Factor w/ 40 levels "alfa_romeo","audi",..: 39 2 15 39 32 3 26 39 11 39 ...
##  $ notRepairedDamage  : Factor w/ 3 levels "","No","Yes": 1 3 1 2 2 3 2 2 1 1 ...
##  $ dateCreated        : Factor w/ 114 levels "1/10/2016","1/13/2016",..: 87 87 75 78 95 105 102 84 105 78 ...
##  $ nrOfPictures       : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ postalCode         : int  70435 66954 90480 91074 60437 33775 67112 19348 94505 27472 ...
##  $ lastSeen           : Factor w/ 34 levels "3/10/2016","3/11/2016",..: 34 34 32 8 33 33 32 16 31 22 ...
##  - attr(*, ".internal.selfref")=<externalptr>

Cleaning Data Process:

1. Cleaning Data and Convert Date Time format

autos$offerType <- NULL
autos$name <- NULL
autos$seller <- NULL
autos$nrOfPictures <- NULL
autos$abtest <- NULL

autos$dateCrawled <- ymd_hms(autos$dateCrawled)
autos$dateCreated <- ymd_hms(autos$dateCreated)
autos$lastSeen <- ymd_hms(autos$lastSeen)

2. Cleaning on Variable “Price”

I observed that in “Price” variable, it contents $0, $99999999, $12345678, $11111111, $1000000, which might be some random numbers key in by the seller and not an authentic used car price. I consider those rows as noise, and decide to remove those from the dataset.

##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## 0.000e+00 1.150e+03 2.950e+03 1.729e+04 7.200e+03 2.147e+09
##     95% 
## 19788.5
##   99% 
## 36000
## 1% 
##  0
##  5% 
## 200

From the density chart and boxchart above, the price variable become more normalize after removing the top/bottom 5%. So I decide to use only the 5% - 95% range of the price prevent the noise.

auto <- auto[(auto$price > quantile(auto$price,0.05)) & (auto$price < quantile(auto$price,0.95)),]

3. Cleaning on Variable “vehicleType”

The vehicle Type is a very important indicator for used car price, so I decide to remove data with no vehicle type.

summary(auto$vehicleType)
##                 bus    cabrio     coupe     kombi limousine     other 
##     30857     28054     19567     15214     62749     87923      2981 
## small car       suv 
##     74835     11582
ggplot(aes(x=auto$vehicleType, y=auto$price), data = auto) + 
geom_boxplot()

Remove vehicleType is missing.

summary(auto$vehicleType)
##                 bus    cabrio     coupe     kombi limousine     other 
##     30857     28054     19567     15214     62749     87923      2981 
## small car       suv 
##     74835     11582
auto <- auto[auto$vehicleType != "",]

4. Cleaning on Variable “yearOfRegistration”

By viewing the boxplot and density chart, I decide to use only 1975-2016 as the range of yearOfRegistration.

summary(auto$yearOfRegistration)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1910    1999    2003    2002    2007    2018
p1 <- ggplot(aes(x="yearOfRegistration", y=yearOfRegistration), data = auto) + 
      geom_boxplot()

p2 <- ggplot(aes(x="yearOfRegistration 1975-2016", y=yearOfRegistration), data = auto) + 
      geom_boxplot() +
      ylim(1975, 2016)

grid.arrange(p1, p2, ncol = 2)

d1 <- ggplot(auto, aes(x=auto$yearOfRegistration)) + geom_density()

d2 <- ggplot(auto, aes(x=auto$yearOfRegistration)) + geom_density() +
      xlim(1975, 2016)
grid.arrange(d1, d2, ncol = 2)

I removed yearOfRegistration before 1975 and after 2016.

auto <- auto[which((auto$yearOfRegistration >= 1975) & 
                       (auto$yearOfRegistration <= 2016)),]

5. Cleaning missing data in other variables.

I decide to replace missing data with “NA” within below variables. By replace with NA but not removing those missing data, can help us remain sufficeient data for future analsis.

summary(auto)
##   dateCrawled         price          vehicleType    yearOfRegistration
##  Min.   :NA       Min.   :  202   limousine:87235   Min.   :1975      
##  1st Qu.:NA       1st Qu.: 1350   small car:74625   1st Qu.:1999      
##  Median :NA       Median : 3100   kombi    :62684   Median :2003      
##  Mean   :NA       Mean   : 4762   bus      :27991   Mean   :2003      
##  3rd Qu.:NA       3rd Qu.: 6900   cabrio   :19334   3rd Qu.:2007      
##  Max.   :NA       Max.   :19780   coupe    :14886   Max.   :2016      
##  NA's   :301108                   (Other)  :14353                     
##       gearbox          powerPS            model          kilometer     
##           :  8490   Min.   :    0.0   golf   : 24158   Min.   :  5000  
##  automatic: 59736   1st Qu.:   75.0   andere : 20712   1st Qu.:125000  
##  manually :232882   Median :  109.0   3er    : 17779   Median :150000  
##                     Mean   :  115.9          : 10933   Mean   :128657  
##                     3rd Qu.:  145.0   polo   : 10664   3rd Qu.:150000  
##                     Max.   :20000.0   corsa  : 10102   Max.   :150000  
##                                       (Other):206760                   
##  monthOfRegistration    fuelType                brand      
##  Min.   : 0.000      petrol :190518   volkswagen   :64060  
##  1st Qu.: 3.000      diesel : 91474   bmw          :32825  
##  Median : 6.000             : 13382   opel         :32634  
##  Mean   : 6.005      lpg    :  4830   mercedes_benz:28466  
##  3rd Qu.: 9.000      cng    :   503   audi         :25988  
##  Max.   :12.000      hybrid :   213   ford         :20841  
##                      (Other):   188   (Other)      :96294  
##  notRepairedDamage  dateCreated       postalCode       lastSeen     
##     : 45225        Min.   :NA       Min.   : 1067   Min.   :NA      
##  No :227677        1st Qu.:NA       1st Qu.:30827   1st Qu.:NA      
##  Yes: 28206        Median :NA       Median :49824   Median :NA      
##                    Mean   :NA       Mean   :50998   Mean   :NA      
##                    3rd Qu.:NA       3rd Qu.:71579   3rd Qu.:NA      
##                    Max.   :NA       Max.   :99998   Max.   :NA      
##                    NA's   :301108                   NA's   :301108
auto$gearbox[auto$gearbox == ""] <- NA
auto$model[auto$model == ""] <- NA
auto$monthOfRegistration[auto$monthOfRegistration == "0"] <- NA
auto$fuelType[auto$fuelType == ""] <- NA
auto$brand[auto$brand == ""] <- NA
auto$notRepairedDamage[auto$notRepairedDamage == ""] <- NA
summary(auto)
##   dateCrawled         price          vehicleType    yearOfRegistration
##  Min.   :NA       Min.   :  202   limousine:87235   Min.   :1975      
##  1st Qu.:NA       1st Qu.: 1350   small car:74625   1st Qu.:1999      
##  Median :NA       Median : 3100   kombi    :62684   Median :2003      
##  Mean   :NA       Mean   : 4762   bus      :27991   Mean   :2003      
##  3rd Qu.:NA       3rd Qu.: 6900   cabrio   :19334   3rd Qu.:2007      
##  Max.   :NA       Max.   :19780   coupe    :14886   Max.   :2016      
##  NA's   :301108                   (Other)  :14353                     
##       gearbox          powerPS            model          kilometer     
##           :     0   Min.   :    0.0   golf   : 24158   Min.   :  5000  
##  automatic: 59736   1st Qu.:   75.0   andere : 20712   1st Qu.:125000  
##  manually :232882   Median :  109.0   3er    : 17779   Median :150000  
##  NA's     :  8490   Mean   :  115.9   polo   : 10664   Mean   :128657  
##                     3rd Qu.:  145.0   corsa  : 10102   3rd Qu.:150000  
##                     Max.   :20000.0   (Other):206760   Max.   :150000  
##                                       NA's   : 10933                   
##  monthOfRegistration    fuelType                brand      
##  Min.   : 1.0        petrol :190518   volkswagen   :64060  
##  1st Qu.: 4.0        diesel : 91474   bmw          :32825  
##  Median : 6.0        lpg    :  4830   opel         :32634  
##  Mean   : 6.4        cng    :   503   mercedes_benz:28466  
##  3rd Qu.: 9.0        hybrid :   213   audi         :25988  
##  Max.   :12.0        (Other):   188   ford         :20841  
##  NA's   :18590       NA's   : 13382   (Other)      :96294  
##  notRepairedDamage  dateCreated       postalCode       lastSeen     
##      :     0       Min.   :NA       Min.   : 1067   Min.   :NA      
##  No  :227677       1st Qu.:NA       1st Qu.:30827   1st Qu.:NA      
##  Yes : 28206       Median :NA       Median :49824   Median :NA      
##  NA's: 45225       Mean   :NA       Mean   :50998   Mean   :NA      
##                    3rd Qu.:NA       3rd Qu.:71579   3rd Qu.:NA      
##                    Max.   :NA       Max.   :99998   Max.   :NA      
##                    NA's   :301108                   NA's   :301108

IV.Univariate Analysis

1. Vehicle Type Frequency Chart

I can tell from the graph that Limousine, small car and kombi are the most popular type in the dataset.

ggplot(auto, aes(x=vehicleType)) + 
  geom_bar(fill= I('#6edeea'), color='black') +
  labs(x= 'Vehicle Type', y= 'number of cars',title='Vehicle Type Frequency Distribuation') 

2. Kilometer Histogram

I can tell from the chart that the Kilometer is concentrated at 150000. This value might be the upper limit of the used car dataset. So all value above will be consider as 150000.

ggplot(auto, aes(auto$kilometer)) +
  geom_histogram(color='black', fill= I('#a67cb7'),binwidth=20000) +
  labs(x= 'Kilometer', y='Number of Cars', title='Kilometer Histogram')

3. Fuel Type Frequency Chart

Most of used car are using petrol or diesel in this dataset.

ggplot(aes(x=fuelType),data=auto) +
  geom_bar(aes(fill=fuelType), color='black') +
  labs(x= 'Fuel Type', y='Number of Cars', title= 'Fuel Type Frequency Chart')

4. Top 10 Brand Frequency Chart

There are too many Brand in the dataset. But just the top 10 brands contribute 80% to the dataset. In order to see meaningful chart on the Brand, our team deicde only shown the top 10 brands frequency distribution.

top10 <- auto %>%
  count(brand) %>%
  top_n(10) %>%
  arrange(n, brand) %>%
  mutate(brand = factor(brand, levels = unique(brand)))
sumtop10 <-sum(top10$n)
top10
## # A tibble: 10 x 2
##            brand     n
##           <fctr> <int>
##  1          seat  5805
##  2          fiat  7747
##  3       peugeot  9691
##  4       renault 14626
##  5          ford 20841
##  6          audi 25988
##  7 mercedes_benz 28466
##  8          opel 32634
##  9           bmw 32825
## 10    volkswagen 64060

Most of used car is volkswagen in the dataset.

auto_subbrand <- subset (auto, brand %in% c("seat", "fiat","peugeot" ,"renault", "ford", "audi", "mercedes_benz" , "bmw", "opel", "volkswagen"))
ggplot(auto_subbrand, aes(x = reorder(brand, -table(brand)[brand]))) +
  geom_bar(fill='darkblue', color='black') +
  labs(x= 'Top 10 Brands', y= 'Number of Cars', title= 'Top 10 Brands Frequency Diagram')

5. Vehicle Price

V. Bivariate Analysis

1. Price vs. Vehicle Type

I find that the most expensive vehicle type is SUV and the cheapest is kleinwagen. However, kleinwagen have many outliers.

2. Engine Power vs. Price

There is a linear correlation between PowerPS and price.

3. Price vs. Age by Vehicle Type

I add a new variable here: age, which means the years between cars first registered and the years they sold. Price keeps decreasing between 0-20 years (age) but starts to increase after 20 years (between 20-30 years).

4. Price vs. Gearbox by Fuel Type

Fuel type seems do not to have significant influence to the correlation between price and gearbox.

5. Price vs. Fuel Type by Vehichle Type

VI. Correlation

VII. Linear Regression Model for Price

I have divided the dataset into 2 parts: training data and validation data. I would like to run regression model on the training data, and use the model results to predict validation data afterwards.

##  [1] "dateCrawled"         "price"               "vehicleType"        
##  [4] "yearOfRegistration"  "gearbox"             "powerPS"            
##  [7] "model"               "kilometer"           "monthOfRegistration"
## [10] "fuelType"            "brand"               "notRepairedDamage"  
## [13] "dateCreated"         "postalCode"          "lastSeen"           
## [16] "age"                 "bus"                 "cabrio"             
## [19] "coupe"               "kleinwagen"          "kombi"              
## [22] "limousine"           "suv"                 "automatik"          
## [25] "manuell"             "benzin"              "diesel"             
## [28] "lpg"
##  num [1:210775] 15 14 15 15 5 15 13 17 6 3 ...
## 
## Call:
## lm(formula = price ~ kilometer + age_num + powerPS + bus + cabrio + 
##     coupe + kleinwagen + kombi + limousine + suv + automatik + 
##     manuell + benzin + diesel + lpg, data = training_data)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -44514  -1877   -516   1181  23151 
## 
## Coefficients: (4 not defined because of singularities)
##               Estimate Std. Error  t value Pr(>|t|)    
## (Intercept)  1.150e+04  2.872e+01  400.326  < 2e-16 ***
## kilometer   -3.095e-02  2.187e-04 -141.519  < 2e-16 ***
## age_num     -3.667e+02  1.421e+00 -258.147  < 2e-16 ***
## powerPS      2.221e+00  3.989e-02   55.681  < 2e-16 ***
## bus          1.697e+03  2.809e+01   60.404  < 2e-16 ***
## cabrio       4.245e+03  3.075e+01  138.039  < 2e-16 ***
## coupe        4.139e+03  3.445e+01  120.138  < 2e-16 ***
## kleinwagen          NA         NA       NA       NA    
## kombi        1.712e+03  2.223e+01   77.043  < 2e-16 ***
## limousine    2.252e+03  1.949e+01  115.583  < 2e-16 ***
## suv          4.731e+03  3.892e+01  121.547  < 2e-16 ***
## automatik           NA         NA       NA       NA    
## manuell             NA         NA       NA       NA    
## benzin              NA         NA       NA       NA    
## diesel       1.534e+03  1.781e+01   86.114  < 2e-16 ***
## lpg          1.483e+02  5.443e+01    2.725  0.00644 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3089 on 196588 degrees of freedom
##   (14175 observations deleted due to missingness)
## Multiple R-squared:  0.5291, Adjusted R-squared:  0.5291 
## F-statistic: 2.008e+04 on 11 and 196588 DF,  p-value: < 2.2e-16
## 
## Call:
## lm(formula = price ~ kilometer + age_num + powerPS + bus + cabrio + 
##     coupe + +kombi + limousine + suv + diesel + lpg, data = training_data)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -46583  -1872   -518   1171  23118 
## 
## Coefficients:
##               Estimate Std. Error  t value Pr(>|t|)    
## (Intercept)  1.143e+04  2.837e+01  402.886  < 2e-16 ***
## kilometer   -3.081e-02  2.151e-04 -143.264  < 2e-16 ***
## age_num     -3.643e+02  1.399e+00 -260.415  < 2e-16 ***
## powerPS      2.326e+00  3.968e-02   58.625  < 2e-16 ***
## bus          1.721e+03  2.770e+01   62.126  < 2e-16 ***
## cabrio       4.212e+03  3.032e+01  138.892  < 2e-16 ***
## coupe        4.107e+03  3.408e+01  120.528  < 2e-16 ***
## kombi        1.701e+03  2.189e+01   77.714  < 2e-16 ***
## limousine    2.228e+03  1.920e+01  116.026  < 2e-16 ***
## suv          4.721e+03  3.843e+01  122.833  < 2e-16 ***
## diesel       1.487e+03  1.754e+01   84.754  < 2e-16 ***
## lpg          1.456e+02  5.390e+01    2.701  0.00692 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3085 on 201464 degrees of freedom
##   (9299 observations deleted due to missingness)
## Multiple R-squared:  0.5258, Adjusted R-squared:  0.5258 
## F-statistic: 2.031e+04 on 11 and 201464 DF,  p-value: < 2.2e-16
## Response variable: price 
## Total response variance: 20073465 
## Analysis based on 201476 observations 
## 
## 11 Regressors: 
## kilometer age_num powerPS bus cabrio coupe kombi limousine suv diesel lpg 
## Proportion of variance explained by model: 52.58%
## Metrics are normalized to sum to 100% (rela=TRUE). 
## 
## Relative importance metrics: 
## 
##                    lmg
## kilometer 0.1903063625
## age_num   0.4940801484
## powerPS   0.0337493254
## bus       0.0079067324
## cabrio    0.0504084411
## coupe     0.0301465268
## kombi     0.0102715681
## limousine 0.0201188747
## suv       0.0594323878
## diesel    0.1031269834
## lpg       0.0004526496
## 
## Average coefficients for different model sizes: 
## 
##                      1X           2Xs           3Xs           4Xs
## kilometer   -0.04664638   -0.04501501   -0.04336278   -0.04170257
## age_num   -465.95991639 -458.29559983 -449.83679638 -440.63000609
## powerPS      4.13883719    4.00587169    3.86855812    3.72597454
## bus        818.57117134  732.47004908  665.11708790  620.80212931
## cabrio    2192.44574709 2311.96801943 2438.58949334 2574.77654106
## coupe     1333.48267538 1541.86570739 1753.54173354 1970.72389326
## kombi      317.53163609  305.33249256  311.07933650  338.11959426
## limousine   19.08155833  171.51638365  328.76234629  492.63634391
## suv       4437.00303211 4289.75350565 4161.25871715 4055.74725392
## diesel    2721.41732736 2645.94710162 2568.87490528 2488.51718860
## lpg       -616.85645476 -457.20375020 -307.16862102 -168.39438942
##                     5Xs           6Xs           7Xs           8Xs
## kilometer   -0.04004788   -0.03841292   -0.03681191   -0.03525693
## age_num   -430.72884023 -420.20113733 -409.14199177 -397.69712795
## powerPS      3.57675095    3.41884960    3.24924076    3.06344710
## bus        605.30371435  626.25076064  693.50287722  819.53578316
## cabrio    2723.58846520 2888.95001751 3075.98247040 3291.42295870
## coupe     2196.48167206 2435.05292201 2692.22106960 2975.78924981
## kombi      390.88058307  475.13091089  598.26728700  769.61977883
## limousine  665.83804639  852.17765739 1056.83484933 1286.65475383
## suv       3978.65475071 3937.18402506 3940.94053438 4002.60921085
## diesel    2402.52902784 2307.68964445 2199.62518223 2072.46621363
## lpg        -42.98651731   66.19297946  155.09591529  217.88311141
##                     9Xs          10Xs          11Xs
## kilometer   -0.03375289   -0.03228762   -0.03081419
## age_num   -386.10354086 -374.75741913 -364.32289198
## powerPS      2.85489051    2.61391759    2.32627561
## bus       1019.79424167 1312.95095473 1720.98898808
## cabrio    3544.15045970 3845.83187361 4211.70874102
## coupe     3296.18035689 3667.20135284 4107.03815583
## kombi     1000.74934283 1305.69193828 1701.08699894
## limousine 1550.47853416 1859.50046504 2227.64372298
## suv       4138.60512797 4369.60051767 4720.80995921
## diesel    1918.45096837 1727.50208107 1486.82568697
## lpg        246.10483770  227.65618536  145.56399356

VIII. Discussion

First, I identify from the data graph that Limousine, small car and kombi are the most popular car type in the dataset. And the most expensive car type is SUV, the most cheapest car type is small car. Second, the most common car brands in the dataset are Volkswagen, Opel, BMW, Mercedes, Audi, Ford, Renault, Peugeot, Fiat and Seat. The top 10 brands contribute to 80% of the data. Also, according to the regression model, and car type, lpg and the age are the most important factors to predict used car price. In the mean time, I find that most of kilometer in the data set is 150,000. This might be the upper limit kilometer of data source platform. The kilometer above 150,000 will be considered 150,000. With the help of visualization, I read a story that most of the second-hand car transactions are made within 35 days, with a high selling ration for the first 10 days including the same day sale for day 0. It implies that Ebay-Kleinanzeigen does a very good job at marketing, especially searching and satisfying the requirement of customers. Also, which is beyond our preconception is that second-hand car market is very fluid. Specially, With the age of the cars increasing, the price continues to drop down and stops at 20 year, however, what is considered a little bit weird is that the price would increase between 20-30. The possible explanation for this maybe that antique emotional attachment could be found in second-hand car aged more than 20 years.

Reference

  1. Gavazza, A., Lizzeri, A., & Roketskiy, N. (2012). A Quantitative Analysis of the Used-Car Market. SSRN Electronic Journal. doi:10.2139/ssrn.2046363

  2. Dallegro, J. A. (2014, September 03). Just What Factors Into The Value Of Your Used Car? Retrieved from https://www.investopedia.com/articles/investing/090314/just-what-factors-value-your-used-car.asp

  3. Abigail, B. (n.d.). The Three Key Things that Affect the Price of a Used Car. Retrieved from https://instamotor.com/sell-car/car-value/3-key-things-that-affect-the-price-of-a-used-car